What Stands-in for a Missing Tool?
A Prototypical Grounded Knowledge-based
Approach to Tool Substitution
Madhura Thosar1, Christian A. Mueller2, Sebastian Zug1
1 Faculty of Computer Science, Otto von Guericke University Magdeburg, Germany
2 Robotics Group, Computer Science & Electrical Engineering Department, Jacobs University Bremen, Germany
Abstract
When a robot is operating in a dynamic environment, it cannot be assumed that a tool required to solve a given task will always be available. In case of a missing tool, an ideal response would be to find a substitute to complete the task. In this paper, we present a proof of concept of a grounded knowledge-based approach to tool substitution. In order to validate the suitability of a substitute, we conducted experiments involving 22 substitution scenarios. The substitutes computed by the proposed approach were validated on the basis of the experts’ choices for each scenario. Our evaluation showed that, in 20 out of 22 scenarios (91%), the approach identified the same substitutes as the experts.
1 Introduction
The sophistication pertaining to tool use in humans involves not just the dexterity in manipulating a tool, but also the diversity in tool exploitation. The ability to exploit tools has enabled humans to adapt and thus exert control over an uncertain environment, especially when they are faced with unfavorable situations. For example, if we cannot find a hammer to drive a nail into a wall, we will use the heel of a shoe or a rock; if a tray is unavailable for serving drinks, we will use a plate instead. In situations like these, humans seem to know, either from past experience, from observation, or from a “necessity is the mother of improvisation (invention)” type of approach, what kind of object is needed as a substitute.

Copyright © by the paper’s authors. Copying permitted for private and academic purposes.
In: G. Steinbauer, A. Ferrein (eds.): Proceedings of the 11th International Workshop on Cognitive Robotics, Tempe, AZ, USA, 27-Oct-2018, published at http://ceur-ws.org
In contrast, consider a robot performing a task that involves tool use. When a robot is operating in a dynamic environment, it cannot be assumed that a tool required in the task will always be available. In situations like these, an effective response for a robot would be to find an alternative as humans do, for example, to use an eating plate for serving rather than wait until a tray becomes available. This skill is significant when operating in a dynamic, uncertain environment because it allows a robot to adapt to unforeseen situations to a degree. The question is how a robot can determine which object in the environment is a viable candidate for a substitute. A possible approach would be to interact with an object in the manner in which the missing tool is maneuvered. However, it would be time consuming for a robot to interact with every single object in the environment to determine its viability, which makes this approach less practical.
In this prototypical work, we propose a non-invasive approach that identifies viable candidates from the existing objects in the environment. This paper makes the following contributions: 1) an approach to create grounded knowledge about objects expressed in terms of their properties (Sec. 5.2), and 2) an approach to identify the relevant properties of a missing tool and determine a substitute on the basis of them (Sec. 5.3).
2 Related Work
Typically, a substitute for a missing tool is determined by means of a knowledge base that provides knowledge about objects and similarity measures to determine the similarity between a missing tool and a potential substitute. In the following, in addition to the approaches to determine a substitute, we also report the literature related to existing knowledge bases developed for robotic applications.
Knowledge Base
We reviewed in [24] nine existing knowledge bases, namely: KNOWROB [23], MLN-KB [25], NMKB [19], OMICS [10], OMRKF [21], ORO [14], OUR-K [15], PEIS-KB [8], and RoboBrain [20]. The objective was to determine whether these existing knowledge bases 1) contain ontological knowledge about the properties of objects, 2) ground such knowledge in the robot’s perception, and 3) model intra-class variability in a property instead of expressing the property in a binary form. We gained primarily the following insights, which form the basis for our work.
We noted that the majority of the knowledge bases relied on external human-centric commonsense knowledge bases such as WordNet, Cyc, and OpenCyc, while some relied either on hand-coded knowledge or on knowledge acquired by human-robot interaction. The main issue, we believe, is that the depth and breadth of a human-centric knowledge base is not observable by a robot in its entirety due to its limited sensing capabilities. This causes a disconnect between human-centric knowledge and robot-centric perception. To address this issue, we aim to acquire robot-centric perceptual data for different properties of objects. Such property data can then be used to generate grounded knowledge about objects (see Sec. 5.1).
Substitution Computation
One of the closest areas that study the usability of an object is affordances of tools, where the primary focus is to examine various functional abilities of an object by exploring what actions can be performed on the object and observing its responses. As such, using a substitute in place of a missing tool can also be seen as transferring an affordance of the missing tool to the substitute after determining the similarity between them.
In [3], a substitute for a missing tool is inferred on the basis of inheritance and equivalence relations. The work discussed in [2] retrieves the knowledge about objects from the ROAR [22] relational database and determines a substitute that shares similar affordances. However, in ROAR, the knowledge is either acquired using machine learning techniques requiring training examples, inferred, or hand-coded. The work in [7] uses ConceptNet, where potential candidates are extracted from the knowledge base if they share the same parent with a missing tool for the predetermined relations has-property, capable-of and used-for. After eliminating irrelevant candidates, a substitute is determined on the basis of similarity metrics. The approach proposed in [1] uses a part-based 3D model and the weight of an object to determine the orientation and manipulation of a substitute to be used as a missing tool. In the cases where a supervised machine learning technique is used, providing a bulk of labeled examples beforehand would not be realistic for a substitution problem scenario. On the other hand, the approaches which rely on existing external knowledge bases are built around the knowledge available in those knowledge bases, which imposes some constraints. We circumvent this issue by first identifying what knowledge is generally required to determine a substitute and then building an approach to acquire the required knowledge and compute a substitute on the basis of it.
3 Challenges
How to characterize similarity between a missing tool and a potential substitute: A candidate for a substitute is expected to be similar to a missing tool to some degree to ensure substitutability. The notion of similarity can be understood in various forms, for instance, as a distance between two objects denoted by two points in a multi-dimensional space, as two objects belonging to the same cluster, or as aspects of the objects that are identified as shared. In this work, the question is addressed in a broader sense: it is not merely about identifying a similar object by deploying some similarity measure; instead, it is about gaining access to the aspects that were found to be shared between the similar objects.
What kind of knowledge is required to determine the similarity: It has been demonstrated in the literature on tool use in humans and animals alike that in order to use an object in tasks one needs to have knowledge about objects [4]. Baber in [5] also noted that conceptual knowledge about objects is especially desired in tool use where systematic deliberation is called for. For a robot, the story will not be much different if it is expected to perform in the real world alongside humans. As a consequence, a robot needs conceptual knowledge about an object, where the object is not only a physical entity that is merely to be perceived, but also a concept which consists of distinct characteristics and relations that set objects apart from each other and also make them similar to each other.
How to acquire the necessary knowledge: The acquisition of such conceptual knowledge is not without challenge. From a robot’s standpoint, it is a trade-off between what needs to be known and what can be known. The trade-off is a direct consequence of the limited perception capabilities of a robot, which often lead to a partial understanding of the environment. While deploying multi-modal perception to extract the required knowledge about objects would be an ideal solution, it carries its own set of complexities such as noisy sensors, the dynamicity of the environment, and the complexity of the composition of an object. For this prototypical work, the necessary knowledge is acquired using human-centric as well as machine-centric methods.
How to maneuver a substitute as a missing tool: Once the substitute has been identified, a robot is expected to use it in place of a missing tool and achieve the same result as the missing tool in the task. The challenge of estimating the maneuver as well as the grasping of a substitute is twofold: to determine whether the maneuver and grasping knowledge of a missing tool can be transferred to and utilized on a substitute, or else to estimate the maneuver and grasping for a substitute such that it can be used as a missing tool in the task.
For this work, we have focused on the first three challenges and have developed a prototypical system called ERSATZ (German word for a substitute or alternative), where the focus is to identify the knowledge required to determine a substitute and to develop a system that computes a substitute for a missing tool.
4 Approach
The proposed approach distinguishes a tool from a substitute, where a tool is defined as an artifact that is designed, manufactured and maneuvered in accordance with its designated purpose in tasks, such as a hammer for hammering or a tray for serving, while a substitute is seen as an extension of a missing tool. Within the context of a designated purpose, the relationship between a tool and a substitute is symmetric; for instance, for hammering, a hammer can be replaced by a heeled shoe and vice versa. However, this may not always be the case outside that context; for instance, a hammer cannot replace a heeled shoe. Our research work, therefore, focuses on searching for a substitute for a conventional tool required in the ongoing task, as opposed to determining a substitute for itself.
Consider a scenario in which a robot has to choose between a plate and a mouse pad as an alternative for a tray. A tray can be defined as a rigid, rectangular, flat, wooden, brown colored object, while a plate can be defined as a rigid, circular, semi-flat, white colored object and a mouse pad as a soft, rectangular, flat, leather-based object. Bear in mind, however, that some properties are more relevant than others with respect to the designated purpose of the tool. For a tray, whose designated purpose is to carry, rigid and flat are more relevant than the material or the color of the tray. Consequently, to find the most appropriate substitute, the relevant properties of the unavailable tool need to correspond to as large a degree as possible to the properties of the possible choices for a substitute.
The proposed approach performs conceptual knowledge-driven computation to identify the relevant properties of the missing tool and determines the most similar substitute on the basis of those properties. Besides identifying the most similar object as a substitute for a missing tool, the proposed approach grants explicit access to the relevant properties of the missing tool, which carries a twofold advantage: firstly, knowing which properties are primarily required in the potential substitute narrows down the search space and, secondly, in the case of an unknown object instance, only the relevant properties have to be learned to determine a substitute.
The conceptual knowledge considered in this work primarily involves properties of the objects. The properties considered are divided into physical and functional properties, where the physical properties describe the physicality of the objects, such as rigidity, weight and hollowness, while the functional properties ascribe (functional) abilities or affordances to the objects, such as containment, blockage and support. The functional properties in the proposed approach play a primary role in identifying the relevant properties of a missing tool (see Sec. 5.3).
The functional properties considered in this work are derived from the theory of image schemas [9], which has its roots in cognitive linguistics. According to [12], image schemas are patterns abstracted from spatio-temporal experiences. Essentially, image schemas capture recurrent patterns that emerge from our perceptual and bodily interactions with the environment. Since some of these patterns are posited on the operational abilities of objects, Kuhn postulated in [12] that affordances, a.k.a. functional properties [9], for spatio-temporal processes can be derived from image schemas. For example, the containment schema suggests an object’s ability to contain something, the support schema indicates an object’s ability to hold up something, and the blockage schema refers to the ability of an object to block or obstruct the movement of another object. Currently, the proposed system is restricted to three functional properties based on image schemas: containment, support and blockage. While the functional properties as well as the designated purpose can both be identified as affordances, the proposed approach is built on the hypothesis that the functional properties are building blocks upon which the designated purposes of tools rest.

Figure 1: Illustration of the object shape conceptualization approach [18]. Concepts are randomly colored.
5 Methodology
5.1 Knowledge Acquisition
Our ultimate objective is to acquire machine-centric data from which property-specific data can be extracted. Such property data will then be used to generate grounded knowledge about objects. As a first step, our initial property acquisition combines a machine-centric and a human-centric method. Geometrical properties are acquired with the machine-centric method using a non-invasive vision-based technique, while non-geometric properties are acquired by sampling data from an expert-generated intuitive model of the properties.
Machine Generated Properties: In this paper, we introduce a state-of-the-art data-driven approach that conceptualizes shape in an unsupervised manner according to commonalities within object point clouds; it is discussed in detail in our work [18]. The process generates a set of shape concepts, whose concept responses for an unknown object are used in the knowledge base as machine-generated geometric object properties.
In our previous work on shape concept learning [18], raw sensor information in the form of point clouds is abstracted to a symbolic level in which point cloud segments [17] may represent meaningful shape components in a symbolic space [16]. Therein we introduce a hierarchical learning procedure that leads to symbols which are gradually organized to reflect generic-to-specific facets of shape components and can subsequently be used as building blocks that constitute objects (see A in Fig. 1).
An object shape representation is introduced that gradually encodes the symbol compositions of observed objects (see B in Fig. 1): from local components to component groups that may represent object parts or objects as a whole. The proposed shape representation incorporates aspects of exemplar and prototype theory, since we believe that the richness of a prototype provides an unaltered perspective on the characteristics of object instances. Based on the proposed symbolic shape representation, we analyze topology and structure within the encoded symbol compositions in order to discover persistent patterns that may represent shape concepts.
We introduce an iterative filtering process [18] to associate instances with groups which may represent shape concepts (see C in Fig. 1). Given the set of learned concepts, concept responses are retrieved for an unknown object (see D in Fig. 1) and exploited as machine-generated geometric object property values in our tool-substitution scenario. Note that in our tool-substitution scenario, concepts are learned from unlabeled object instances of the Object Discovery Dataset (ODD) [17]; the ODD provides a variety of objects, from teddy bears and flashlights to shoes, which facilitates an expressive concept generation.
Human Generated Properties: The geometric properties alone offer a very limited scope of the physicality as well as the functionality of an object. Therefore, to close this gap, we also considered non-geometrical properties: weight, rigidity and hollowness as physical properties, and support, blockage and containment as functional properties. Note that, in general, these properties are challenging and cumbersome to extract solely with non-invasive visuoperceptual approaches. Extracting such properties therefore requires multi-modal or manipulation capabilities, which is beyond the scope of this paper. In the generation process, a set of labeled prototype objects selected from the Washington dataset (see Table 1) was taken into account. The distribution of each property for particular object labels (cf. Table 1) was approximated by an expert to resemble the scope of the variations in the values of the property in general. Consequently, given an object and its label, a sample value was drawn from the a priori generated property distribution.
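As a rough illustration of the sampling step, the sketch below draws a property value for a labeled object from a per-label distribution. The Gaussian form, the (mean, standard deviation) entries of `EXPERT_MODEL`, the [0, 1] value range and the function name `sample_property` are all assumptions made for this example; the paper does not state the distribution family or its parameters.

```python
import random
from typing import Dict, Tuple

# Hypothetical expert model: (class label, property) -> (mean, std. deviation).
# The numbers are invented for illustration and are not taken from the paper.
EXPERT_MODEL: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("plate", "rigidity"): (0.80, 0.05),
    ("sponge", "rigidity"): (0.20, 0.10),
}


def sample_property(label: str, prop: str, rng: random.Random) -> float:
    """Draw one property value for an object of the given class label."""
    mean, std = EXPERT_MODEL[(label, prop)]
    # Clamp to the assumed [0, 1] range of normalised property values.
    return min(1.0, max(0.0, rng.gauss(mean, std)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
plate_values = [sample_property("plate", "rigidity", rng) for _ in range(200)]
sponge_values = [sample_property("sponge", "rigidity", rng) for _ in range(200)]
```

Sampling per instance rather than storing one value per class is what lets the later steps observe intra-class variation.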
5.2 Knowledge about Objects
Knowledge about objects is spread across three levels: the first level consists of the data about the machine-generated as well as human-generated properties, the second level consists of qualitative knowledge about individual object instances, and the third level consists of the aggregated qualitative fuzzy knowledge about the respective classes of object instances. The fuzzy formalism is used to model the intra-class variations in the objects. In the following, we discuss the formal description of the methodology deployed to create grounded knowledge about objects.
Consider $\mathcal{O}$ as a given set of object class labels where (by abuse of notation) each object class is identified with its label. Let each object class $O \in \mathcal{O}$ be a given set of its instances. Let $\bigcup\mathcal{O}$ be the union of all object classes such that $|\bigcup\mathcal{O}| = n$. Let $\mathcal{P}$ and $\mathcal{F}$ be the given sets of physical property labels and functional property labels respectively. By abuse of notation, each physical and functional property is identified with its label. For each physical property $P \in \mathcal{P}$ as well as for each functional property $F \in \mathcal{F}$, sensory data is acquired from each object instance $o \in \bigcup\mathcal{O}$. Let $P_n$ and $F_n$ represent the sets of $n$ sensory values extracted from the $n$ object instances for a physical property $P \in \mathcal{P}$ and a functional property $F \in \mathcal{F}$ respectively.
Sub-categorization - From Continuous to Discrete
The sub-categorization process is performed to form (more intuitive) qualitative measures that represent the degree to which a property is reflected by an object instance. It is the first step in creating symbolic knowledge about object classes, where the symbols representing the qualitative measures of a physical or a functional property reflected in an object instance are generated in an unsupervised manner by a clustering mechanism. A qualitative measure of a physical property is referred to as a physical quality and that of a functional property as a functional quality.
In this process, the sets $P_n$ and $F_n$, representing measurements of a physical property $P \in \mathcal{P}$ and a functional property $F \in \mathcal{F}$ respectively extracted from $n$ object instances, are categorized into a given number $\eta$ of discrete clusters using a clustering algorithm. Let $\nabla P$ and $\nabla F$ be the partitions of the sets $P_n$ and $F_n$ obtained by clustering. Let $P_\eta$ and $F_\eta$ be the sets of labels, expressing physical qualities and functional qualities, generated for a physical property $P \in \mathcal{P}$ and a functional property $F \in \mathcal{F}$ respectively. The quality labels are generated by combining the property label with a cluster label (created by the clustering algorithm). For instance, the quality labels for a property size are represented as {size 1, size 2, size 3, size 4}. At the end of the sub-categorization process, the clusters are mapped to the generated symbolic labels for qualitative measures.
Note that the number of clusters essentially describes the granularity with which each property can be represented qualitatively. A higher number of clusters means that an object is described in finer detail, which may obstruct the selection of a substitute, since it may not be possible to find a substitute which is similar to a missing tool down to the finer details. For example, in size = {small, medium, big, bigger}, size is a physical property and small, medium, big, bigger are its physical qualities. The semantic terms given above are meant to help readers understand the qualitative measures of the properties.
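The sub-categorization step can be sketched as follows. The paper only says that "a clustering algorithm" is used, so the plain one-dimensional k-means below is a stand-in chosen for illustration; the function names, the underscore in the generated labels and the size readings are likewise assumptions.

```python
from typing import Dict, List


def kmeans_1d(values: List[float], k: int, iterations: int = 50) -> List[int]:
    """Cluster scalar values into k groups; returns one cluster index per value."""
    ordered = sorted(values)
    # Initialise centroids at evenly spaced positions of the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assignment = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every value.
        assignment = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: mean of each non-empty cluster.
        updated = []
        for c in range(k):
            members = [v for v, a in zip(values, assignment) if a == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assignment


def sub_categorize(prop: str, readings: Dict[str, float], eta: int) -> Dict[str, str]:
    """Map each instance to a quality label built from property and cluster index."""
    names = list(readings)
    clusters = kmeans_1d([readings[name] for name in names], eta)
    return {name: f"{prop}_{c + 1}" for name, c in zip(names, clusters)}


# Hypothetical size readings (arbitrary units) for eight object instances.
size_readings = {
    "plate1": 9.0, "mug1": 1.0, "tray1": 13.0, "bowl1": 5.0,
    "plate2": 9.1, "mug2": 1.1, "tray2": 13.3, "bowl2": 5.2,
}
size_qualities = sub_categorize("size", size_readings, eta=4)
```

With four clusters, the two mugs share the label `size_1` and the two trays share `size_4`; raising `eta` would split these groups further, which is the granularity trade-off noted above.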
Attribution - Object Instance Knowledge
The attribution process generates knowledge about each object instance by aggregating all the physical and functional qualities assigned to the object instance by the sub-categorization step. In other terms, the knowledge about an instance consists of the physical as well as functional qualities reflected in the instance. Let $\mathcal{P}_\eta$ and $\mathcal{F}_\eta$ be the families of sets containing the physical quality labels $P_\eta$ and the functional quality labels $F_\eta$ for each physical property $P \in \mathcal{P}$ and functional property $F \in \mathcal{F}$ respectively. Thus, each object instance $o \in \bigcup\mathcal{O}$ is represented as a set of all the physical as well as functional qualities attributed to it, which are expressed by a symbol $\mathit{holds}$ as: $\mathit{holds} \subset \bigcup\mathcal{O} \times (\mathcal{P}_\eta \cup \mathcal{F}_\eta)$. For example, knowledge about the instance plate1 of a plate class can be given as holds(plate1, medium), holds(plate1, harder), holds(plate1, can support), where medium is a physical quality of the size property, harder is a physical quality of the rigidity property and can support is a functional quality of the support property.
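A minimal sketch of attribution, assuming the output of sub-categorization is available as one instance-to-quality mapping per property. The dictionary layout, the helper names and the second plate instance are hypothetical; only the plate1 facts mirror the example in the text.

```python
from typing import Dict, Set, Tuple

# Hypothetical result of sub-categorization: property -> {instance -> quality}.
qualities_by_property: Dict[str, Dict[str, str]] = {
    "size": {"plate1": "medium", "plate2": "big"},
    "rigidity": {"plate1": "harder", "plate2": "harder"},
    "support": {"plate1": "can_support", "plate2": "can_support"},
}


def attribute(per_property: Dict[str, Dict[str, str]]) -> Set[Tuple[str, str]]:
    """Build the holds relation as a set of (instance, quality) pairs."""
    return {
        (instance, quality)
        for assignment in per_property.values()
        for instance, quality in assignment.items()
    }


def knowledge_of(instance: str, holds: Set[Tuple[str, str]]) -> Set[str]:
    """All qualities attributed to one instance."""
    return {quality for obj, quality in holds if obj == instance}


holds = attribute(qualities_by_property)
plate1_knowledge = knowledge_of("plate1", holds)
```

Storing the relation as plain pairs keeps physical and functional qualities in one structure, so later steps can count co-occurrences without knowing which property a quality came from.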
Conceptualization - Knowledge about Objects
The conceptualization process aggregates the knowledge about all the instances of an object class. The aggregated knowledge is regarded as conceptual knowledge about the object class.
Let $O_{KB}$ be a knowledge base about object classes where each object class $O \in \mathcal{O}$. Given the knowledge about all the instances of an object class $O$, in the conceptualization process, the knowledge about the object class $O_K \in O_{KB}$ is expressed as a set of tuples consisting of a physical or a functional quality and its proportion (membership) value in the object class. A tuple is expressed as $\langle O, t, m \rangle$ where $t \in \mathcal{P}_\eta \cup \mathcal{F}_\eta$ and the proportion value $m$ is calculated using the following membership function: $m = P(\mathit{holds}(o, t) \mid o \in O)$. The proportion value makes it possible to model the intra-class variations in the objects.
For example, knowledge about the object class plate can be expressed as: {〈plate, harder, 0.6〉, 〈plate, light weight, 0.75〉, 〈plate, less hollow, 0.67〉, 〈plate, hollow, 0.33〉, 〈plate, more support, 0.71〉}, where the numbers indicate that, for instance, the physical quality harder was observed in 60% of the instances of the object class plate. At the end of the conceptualization process, conceptual knowledge about an object class is created, which is represented in a symbolic fuzzy form and grounded in the human-generated or machine-generated data about the properties of objects. The knowledge about objects is then used to determine a substitute from the existing objects in the environment.

Figure 2: A typical process flow to determine a substitute for a missing tool from the available objects.
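The membership value can be read as a relative frequency over the instances of a class. The sketch below computes it that way; the function name `conceptualize` and the five plate instances are invented for illustration and are not the data behind the numbers quoted in the text.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def conceptualize(
    label: str, instances: Dict[str, Set[str]]
) -> List[Tuple[str, str, float]]:
    """Aggregate instance knowledge into (class, quality, membership) tuples.

    The membership is the fraction of instances of the class that hold the quality.
    """
    counts = Counter(q for qualities in instances.values() for q in qualities)
    total = len(instances)
    return [(label, quality, counts[quality] / total) for quality in sorted(counts)]


# Hypothetical instance knowledge for five plates.
plate_instances: Dict[str, Set[str]] = {
    "plate1": {"harder", "light_weight"},
    "plate2": {"harder", "light_weight"},
    "plate3": {"harder", "light_weight"},
    "plate4": {"softer", "light_weight"},
    "plate5": {"softer", "heavy_weight"},
}
plate_knowledge = conceptualize("plate", plate_instances)
```

Here harder holds for three of the five instances, so it receives a membership of 0.6, while the competing quality softer receives 0.4; both remain in the class knowledge, which is how intra-class variation is kept.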
Conceptualization - Knowledge about Functional Properties
In addition to conceptual knowledge about objects, the conceptualization process also creates knowledge about each functional quality, termed a function model, by associating the occurrence of physical qualities in an object instance with the occurrence of a functional quality in the instance and aggregating the result of such concurrent occurrences. The role of a function model is discussed later in Sec. 5.3. Given the knowledge about the object instances, a function model $f_d$ of a functional quality $f \in \mathcal{F}_\eta$ is expressed as a set of tuples $\langle f, p, d \rangle$ containing the functional quality $f \in \mathcal{F}_\eta$, a physical quality $p \in \mathcal{P}_\eta$ and a proportion value $d$, computed as $d = P(\mathit{holds}(o, p) \mid \mathit{holds}(o, f))$. For example, a function model for the functional quality more support is given as {〈more support, harder, 0.8〉, 〈more support, softer, 0.2〉}, where the numbers indicate that, for instance, the physical quality harder was observed in 80% of the object instances that exhibit the functional quality more support.
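A function model can be sketched as a conditional relative frequency: among the instances that hold the functional quality, count how often each physical quality also holds. The function name and the six instances below are hypothetical; they are arranged so that the result reproduces the 0.8 and 0.2 of the example.

```python
from typing import Dict, List, Set, Tuple


def function_model(
    f: str, instances: Dict[str, Set[str]], physical_qualities: Set[str]
) -> List[Tuple[str, str, float]]:
    """Tuples (f, p, d), with d the share of instances holding f that also hold p."""
    having_f = [qualities for qualities in instances.values() if f in qualities]
    model = []
    for p in sorted(physical_qualities):
        joint = sum(1 for qualities in having_f if p in qualities)
        if joint:  # skip physical qualities that never co-occur with f
            model.append((f, p, joint / len(having_f)))
    return model


# Hypothetical instance knowledge; o6 does not hold more_support and is ignored.
instance_knowledge: Dict[str, Set[str]] = {
    "o1": {"more_support", "harder"},
    "o2": {"more_support", "harder"},
    "o3": {"more_support", "harder"},
    "o4": {"more_support", "harder"},
    "o5": {"more_support", "softer"},
    "o6": {"less_support", "softer"},
}
support_model = function_model(
    "more_support", instance_knowledge, {"harder", "softer"}
)
```

Note that the model is built across all object classes, so it describes the functional quality itself and not any single class.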
5.3 Reasoner
Fig. 2 illustrates a process flow consisting of the primary operations involved in determining a substitute. The flow offers an approximate bird's-eye view of the prototypical model of ERSATZ. When ERSATZ is queried to find a substitute for a missing tool $x$ from the set of available objects $Y$, the system checks whether the substitution model for $x$ exists in the knowledge base. If the substitution model does not exist, the reasoner computes the relevant functional and physical properties of the queried tool.
Representative Models
A representative physical model and a representative functional model of an object class consist of the physical or functional qualities, respectively, that are regarded as representative qualities of the object class, while the qualities which do not fall under representative qualities are regarded as exceptional or uncommon qualities.
Let $O \in \mathcal{O}$ be the object class of a missing tool and let $\theta$ be a representative model threshold which qualifies a physical or a functional quality as stereotypical or representative of the object class $O$. $O_{rp}$ is called the representative physical model of an object class $O$ such that $O_{rp} = \{p : \mathit{implies}(O, p) \geq \theta,\ p \in \mathcal{P}_\eta\}$, and $O_{rf}$ is called the representative functional model of an object class $O$ such that $O_{rf} = \{f : \mathit{implies}(O, f) \geq \theta,\ f \in \mathcal{F}_\eta\}$. Similarly, let $f_d$ be the function model of a functional quality $f$; then $f_{rp}$ is called the representative physical model of the functional quality $f$ such that $f_{rp} = \{p : \mathit{implies}(f, p) \geq \theta,\ p \in \mathcal{P}_\eta\}$.
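The thresholding can be sketched as a simple filter. The text does not define implies explicitly; the sketch assumes it returns the proportion value stored in the class knowledge, which is an interpretation and not something the paper states. The tuples are the plate example from Sec. 5.2, the threshold 0.35 is the value reported in Sec. 6, and the split into physical and functional label sets is written out by hand for the example.

```python
from typing import List, Set, Tuple

Knowledge = List[Tuple[str, str, float]]  # (class, quality, proportion value)


def representative_model(
    knowledge: Knowledge, qualities: Set[str], theta: float
) -> Set[str]:
    """Qualities of the given kind whose proportion value reaches the threshold."""
    return {q for _, q, m in knowledge if q in qualities and m >= theta}


plate_knowledge: Knowledge = [
    ("plate", "harder", 0.6),
    ("plate", "light_weight", 0.75),
    ("plate", "less_hollow", 0.67),
    ("plate", "hollow", 0.33),
    ("plate", "more_support", 0.71),
]
PHYSICAL = {"harder", "light_weight", "less_hollow", "hollow"}
FUNCTIONAL = {"more_support"}
THETA = 0.35

plate_rp = representative_model(plate_knowledge, PHYSICAL, THETA)
plate_rf = representative_model(plate_knowledge, FUNCTIONAL, THETA)
```

With this threshold, hollow (0.33) falls just below the cut and is treated as an uncommon quality of plates, while the other three physical qualities form the representative physical model.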
Relevant Qualities
Due to the abstract nature of an image schema and, by extension, of the corresponding functional property, a functional property can subsume various purposes of objects. For example, the functional property support can subsume the purposes place on, sit on and serve on of a table, a chair and a tray respectively. It is suggested in [6] that a certain assemblage of physical properties is an essential prerequisite to enable a functional property. Thus, it can be assumed that knowing the relevance of one functional property can help identify the relevant physical properties of different objects which are used for different purposes.
The relevance of a representative functional quality is decided by examining whether the physical characterization of the function model of the representative functional quality of a tool is in close proximity to the physical characterization of the representative physical model of the tool. The proximity between a functional quality and the object class of the tool is determined using the Jaccard index. The Jaccard index measures the similarity (and thereby the dissimilarity) between two sets A and B by dividing the magnitude of the intersection of A and B by the magnitude of the union of A and B.
Let $O_{rp}$ and $f_{rp}$ be the representative physical models of an object class $O$ of the missing tool and of a function model $f_d$ of a representative functional quality $f \in \mathcal{F}_\eta$ of the object class $O$ respectively. Let $\phi$ be a Minimum Similarity Tolerance threshold for similarity. Then, the Jaccard index of $O_{rp}$ and $f_{rp}$ is computed as:

$$J(O_{rp}, f_{rp}) = \frac{|O_{rp} \cap f_{rp}|}{|O_{rp} \cup f_{rp}|}$$

A representative functional quality $f$ of an object class $O$ is regarded as relevant if $J(O_{rp}, f_{rp}) > \phi$. Let $O_{F'}$ be the set of all relevant functional qualities of an object class $O$. Let $f_{rp}$ be the representative physical model of the function model $f_d$ of a relevant functional quality $f \in O_{F'}$, and let $O_{rp}$ be the representative physical model of $O$. Then the relevant physical qualities of the object class $O$ are expressed by the set $O_{P'} = (O_{rp} \cap f_{rp})$.
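The relevance test can be sketched with the Jaccard index as follows. All quality names and set contents are invented for illustration. Where several functional qualities turn out to be relevant, the sketch takes the union of the individual intersections; the text defines the intersection for one relevant functional quality only, so that combination rule is an assumption.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevant_qualities(
    o_rp: Set[str], function_rps: Dict[str, Set[str]], phi: float
) -> Tuple[Set[str], Set[str]]:
    """Return (relevant functional qualities, relevant physical qualities)."""
    relevant_f = {f for f, f_rp in function_rps.items() if jaccard(o_rp, f_rp) > phi}
    relevant_p: Set[str] = set()
    for f in relevant_f:
        # Assumption: combine several relevant functional qualities by union.
        relevant_p |= o_rp & function_rps[f]
    return relevant_f, relevant_p


# Hypothetical representative physical models of a tray and of two function models.
tray_rp = {"harder", "flat", "light_weight"}
function_rps = {
    "more_support": {"harder", "flat"},
    "more_containment": {"hollow", "harder", "deep", "closed"},
}
tray_relevant_f, tray_relevant_p = relevant_qualities(tray_rp, function_rps, phi=0.35)
```

In this example the support model shares two of three qualities with the tray (index 2/3) and passes the 0.35 tolerance, whereas the containment model shares one of six (index 1/6) and is discarded, leaving harder and flat as the relevant physical qualities.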
Reasoning about a Substitute
Let $O^{\mu} \in \mathcal{O}$ be the object class of a missing tool and let $O^{\beta} \in \mathcal{O}$ be the object class of a possible candidate for a substitute. Let $O^{\mu}_{P'}$ be the set of relevant physical qualities of $O^{\mu}$ and let $O^{\beta}_{rp}$ be the representative physical model of $O^{\beta}$. Let $\phi$ be a Minimum Similarity Tolerance threshold for similarity. The substitutability of a candidate is determined by measuring the similarity between $O^{\mu}_{P'}$ and $O^{\beta}_{rp}$ using the Jaccard index. $O^{\beta}$ is termed a substitute, expressed as $O^{\beta+}$, if $J(O^{\mu}_{P'}, O^{\beta}_{rp}) > \phi$; otherwise it is regarded as not a substitute and expressed as $O^{\beta-}$. Given the set of relevant physical qualities $O^{\mu}_{P'}$, the set of relevant functional qualities $O^{\mu}_{F'}$, a positive substitute $O^{\beta+}$ and a negative substitute $O^{\beta-}$, a substitution model of $O^{\mu}$ is expressed as a tuple $\langle O^{\mu}_{P'}, O^{\mu}_{F'}, O^{\beta+}, O^{\beta-} \rangle$. The knowledge about the object class $O^{\mu} \in \mathcal{O}$ is then extended in $O_{KB}$ to accommodate its substitution model.
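The final decision step can be sketched by scoring each candidate against the relevant physical qualities of the missing tool. The candidate models below are made up for the tray scenario of Sec. 4, and collecting positive and negative substitutes into sets is a simplification chosen for this example.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def substitution_model(
    relevant_p: Set[str],
    relevant_f: Set[str],
    candidates: Dict[str, Set[str]],
    phi: float,
) -> Tuple[Set[str], Set[str], Set[str], Set[str]]:
    """Tuple of relevant physical/functional qualities and positive/negative substitutes."""
    positive = {
        name for name, c_rp in candidates.items() if jaccard(relevant_p, c_rp) > phi
    }
    negative = set(candidates) - positive
    return relevant_p, relevant_f, positive, negative


# Hypothetical query: a tray is missing; three objects are available.
tray_relevant_p = {"harder", "flat"}
tray_relevant_f = {"more_support"}
available = {
    "plate": {"harder", "flat", "light_weight"},      # index 2/3
    "mouse_pad": {"softer", "flat", "light_weight"},  # index 1/4
    "mug": {"harder", "hollow"},                      # index 1/3
}
model = substitution_model(tray_relevant_p, tray_relevant_f, available, phi=0.35)
positive_substitutes, negative_substitutes = model[2], model[3]
```

Because only the relevant qualities of the tray enter the comparison, the plate is accepted even though it differs from a tray in qualities such as shape or color that were never part of the relevant set.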
6 Experimental Evaluation
The objective of the experimental evaluation of ERSATZ is to validate the suitability of the substitutes computed by ERSATZ by comparing the results with those of human experts. For the experimental evaluation, we used the images from the Washington Dataset [13] to generate human-based and machine-based properties. A total of 22 object categories were selected, and for each category we selected random images from all the given instances of the category, leading to a total of 692 images. Table 1 lists the number of images selected from each category. For the experiment, we generated 22 queries based on the 22 object categories. Each query consisted of a missing tool and 5 randomly selected objects from which a substitute was to be selected. We gave the 22 queries to 14 human experts and asked them to select a substitute in each query. The distribution of the human selections for each scenario is illustrated in Fig. 3(a). Similarly, the queries were run on ERSATZ with the following (heuristically determined) optimal values of the target parameters: i) the number of machine-generated properties is set to 4 (Sec. 5.1), ii) the number of clusters to 4 (Sec. 5.2), and iii) the representative threshold (Sec. 5.3) and the Minimum Similarity Tolerance (Sec. 5.3) to 0.35.
The results of both experiments were plotted as aheat map where the y-axis shows missing tools and x-axis shows the available objects illustrated in Fig. 3.The grayed cells mean the corresponding object cate-gories were not available in the respective query. Thecells that are marked with represents substitutes se-lected by experts and ERSATZ. Out of 22 scenarios,
Table 1: Number of scans (#) per category (Σ# = 692) of the Washington RGBD dataset [13].

Category label   Inst.   Scans per Inst.   #
ball             1-7      5                35
binder           1-3     10                30
bowl             1-6      5                30
cap              1-4      8                32
cereal box       1-5      6                30
coffee mug       1-8      4                32
flashlight       1-5      6                30
food bag         1-8      4                32
food box         1-12     3                36
food can         1-14     2                28
food cup         1-5      6                30
food jar         1-6      5                30
hand towel       1-5      6                30
keyboard         1-5      6                30
kleenex          1-5      6                30
notebook         1-5      6                30
pitcher          1-3     10                30
plate            1-7      5                35
shampoo          1-6      5                30
soda can         1-6      5                30
sponge           1-12     3                36
water bottle     1-9      4                36
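As a quick consistency check on Table 1, the per-category scan count should equal the number of instances times the scans per instance, and the counts should sum to the 692 images stated in the text. The following sketch recomputes both; the variable names are illustrative only.

```python
# (category, number of instances, scans per instance), as listed in Table 1.
TABLE_1 = [
    ("ball", 7, 5), ("binder", 3, 10), ("bowl", 6, 5), ("cap", 4, 8),
    ("cereal box", 5, 6), ("coffee mug", 8, 4), ("flashlight", 5, 6),
    ("food bag", 8, 4), ("food box", 12, 3), ("food can", 14, 2),
    ("food cup", 5, 6), ("food jar", 6, 5), ("hand towel", 5, 6),
    ("keyboard", 5, 6), ("kleenex", 5, 6), ("notebook", 5, 6),
    ("pitcher", 3, 10), ("plate", 7, 5), ("shampoo", 6, 5),
    ("soda can", 6, 5), ("sponge", 12, 3), ("water bottle", 9, 4),
]

# Scans per category = instances x scans per instance.
scans_per_category = {name: inst * per for name, inst, per in TABLE_1}

# Total number of images across all categories.
total_scans = sum(scans_per_category.values())
print(total_scans)  # 692
```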
[Figure 3 (heat maps): (a) Human expert selection distributions; (b) ERSATZ selections with similarity to the missing tool. In both panels the y-axis lists the missing tools and the x-axis lists the substitutes, each over the 22 object categories (ball, binder, bowl, cap, cereal box, coffee mug, flashlight, food bag, food box, food can, food cup, food jar, hand towel, keyboard, kleenex, notebook, pitcher, plate, shampoo, soda can, sponge, water bottle). The color scale (Distribution/Similarity) ranges from 0.0 to 1.0.]
Figure 3: Substitution results w.r.t. human expert selection distribution and ERSATZ similarity responses. Note that gray cells correspond to object categories which are not available in the respective query; marked cells represent substitutes selected by experts and ERSATZ.
ERSATZ and the experts identified the same substitutes in 20 scenarios (91%).
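The agreement figure can be reproduced with a short sketch. It assumes that the experts' choice for a scenario is the substitute selected most often among the 14 responses, which this section does not spell out, and the function names and the two toy scenarios are hypothetical, not the study's data. Only the final line uses the reported numbers (20 matches out of 22).

```python
from collections import Counter


def expert_choice(votes):
    """Return the substitute selected most often by the experts."""
    # Assumption: ties are broken by first occurrence (Counter ordering).
    return Counter(votes).most_common(1)[0][0]


def agreement(expert_votes, system_picks):
    """Count scenarios where the system pick equals the experts' top choice."""
    matches = sum(
        expert_choice(expert_votes[tool]) == pick
        for tool, pick in system_picks.items()
    )
    return matches, len(system_picks)


# Toy data (hypothetical, not the study's responses): 14 votes per scenario.
toy_votes = {
    "plate": ["food box"] * 9 + ["notebook"] * 5,
    "ball": ["sponge"] * 8 + ["cap"] * 6,
}
toy_picks = {"plate": "food box", "ball": "cap"}
toy_matches, toy_total = agreement(toy_votes, toy_picks)

# Reported result: 20 matching scenarios out of 22.
reported_percent = round(100 * 20 / 22)
```

In the toy data the system agrees with the experts on one of the two scenarios; for the reported result, 20/22 rounds to the 91% stated above.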
7 Future Work
The paper presents a prototypical system to determine a substitute for a missing tool using grounded knowledge about objects. The approach has drawn inspiration from symbol grounding, the theory of affordances and the theory of image schemas to represent the grounded knowledge and to determine a substitute. This is ongoing research with a focus on the following aspects.
Our immediate goal is the fuzzification of the clustering method and the reasoning method to combat the migration of data points within clusters. Moreover, we have derived three functional properties, namely contain, support and block, from the image schemas Containment, Support and Blockage, respectively. However, further investigation is needed to formalize the identification of additional functional properties to be derived from the existing image schemas. For robot-centric property acquisition, we are currently developing a framework that allows a robot to extract properties of individual objects and build a knowledge base in a bottom-up manner, such that the knowledge about properties of objects is constructed on the basis of what is sensed (see Fig. 4). We have proposed the preliminary framework in [11].
[Figure 4 (diagram), labels: Sensory Data - Object Instances; Physical Property Data - Object Instances; Fuzzy Conceptual Knowledge - Object Instances; Fuzzy Conceptual Knowledge - Object Classes; Property Extraction Methods; Clustering Method; Bivariate Joint Frequency Distributions; Functional Property Data - Object Instances; Aggregation.]
Figure 4: Multi-layered dataset to build a robot-centric grounded knowledge about objects.
References
[1] P. Abelha, F. Guerin, and M. Schoeler. A model-based approach to finding substitute tools in 3D vision data. Proceedings - IEEE International Conference on Robotics and Automation, 2016-June, 2016.
[2] A. Agostini, M. J. Aein, S. Szedmak, E. E. Aksoy, J. Piater, and F. Worgotter. Using structural bootstrapping for object substitution in robotic executions of human-like manipulation tasks. IEEE International Conference on Intelligent Robots and Systems, 2015-Decem:6479–6486, 2015.
[3] I. Awaad, G. K. Kraetzschmar, and J. Hertzberg. Challenges in Finding Ways to Get the Job Done. Planning and Robotics (PlanRob) Workshop at 24th International Conference on Automated Planning and Scheduling, 2014.
[4] C. Baber. Cognition and Tool Use. Taylor and Francis, 2003.
[5] C. Baber. Introduction. In Cognition and Tool Use, chapter 1, pages 1–15. Taylor and Francis, 2003.
[6] C. Baber. Working With Tools. In Cognition and Tool Use, pages 51–68. 2003.
[7] A. Boteanu, A. St. Clair, A. Mohseni-Kabir, C. Saldanha, and S. Chernova. Leveraging Large-Scale Semantic Networks for Adaptive Robot Task Learning and Execution. Big Data, 4(4):217–235, 2016.
[8] M. Daoutis, S. Coradeschi, and A. Loutfi. Grounding commonsense knowledge in intelligent systems. Journal of Ambient Intelligence and Smart Environments, 1(4):311–321, 2009.
[9] P. Gardenfors. Cognitive semantics and image schemas with embodied forces. Image (Rochester, N.Y.), pages 1–16, 1987.
[10] R. Gupta and M. J. Kochenderfer. Common Sense Data Acquisition for Indoor Mobile Robots. In Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, pages 605–610, San Jose, California, USA, 2004.
[11] G. Jager, C. A. Mueller, M. Thosar, S. Zug, and A. Birk. Towards Robot-Centric Conceptual Knowledge Acquisition. In Robots that learn and reason in IEEE/RSJ International Conference on Intelligent Robots and Systems, Madrid, 2018.
[12] W. Kuhn. An Image-Schematic Account of Spatial Categories. Spatial Information Theory, pages 152–168, 2007.
[13] K. Lai, L. Bo, X. Ren, and D. Fox. A Large-Scale Hierarchical Multi-View RGB-D Object Dataset. In IEEE International Conference on Robotics and Automation (ICRA), pages 1817–1824, Shanghai, China, 2011.
[14] S. Lemaignan, R. Ros, L. Mosenlechner, R. Alami, and M. Beetz. ORO, a knowledge management platform for cognitive architectures in robotics. IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings, (April):3548–3553, 2010.
[15] G. H. Lim, I. H. Suh, and H. Suh. Ontology-based unified robot knowledge for service robots in indoor environments. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, 41(3):492–509, 2011.
[16] C. Mueller, K. Pathak, and A. Birk. Object shape categorization in rgbd images using hierarchical graph constellation models based on unsupervisedly learned shape parts described by a set of shape specificity levels. In International Conference on Intelligent Robots and Systems, 2014.
[17] C. A. Mueller and A. Birk. Hierarchical Graph-Based Discovery of Non-Primitive-Shaped Objects in Unstructured Environments. In International Conference on Robotics and Automation, 2016.
[18] C. A. Mueller and A. Birk. Conceptualization of Object Compositions Using Persistent Homology. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, 2018.
[19] L. A. Pineda, A. Rodríguez, G. Fuentes, C. Rascon, and I. Meza. A light non-monotonic knowledge-base for service robots. Intelligent Service Robotics, 10(3):159–171, 2017.
[20] A. Saxena, A. Jain, O. Sener, A. Jami, D. K. Misra, and H. S. Koppula. RoboBrain: Large-Scale Knowledge Engine for Robots. arXiv, pages 1–11, 2014.
[21] I. H. Suh, G. H. Lim, W. Hwang, H. Suh, J. H. Choi, and Y. T. Park. Ontology-based multi-layered robot knowledge framework (OMRKF) for robot intelligence. IEEE International Conference on Intelligent Robots and Systems, (October):429–436, 2007.
[22] S. Szedmak, E. Ugur, and J. Piater. Knowledge Propagation and Relation Learning for Predicting Action Effects. In IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, 9 2014.
[23] M. Tenorth and M. Beetz. KNOWROB - Knowledge Processing for Autonomous Personal Robots. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 4261–4266, 2009.
[24] M. Thosar, S. Zug, A. M. Skaria, and A. Jain. A Review of Knowledge Bases for Service Robots in Household Environments. In 6th International Workshop on Artificial Intelligence and Cognition, 2018.
[25] Y. Zhu, A. Fathi, and L. Fei-Fei. Reasoning About Object Affordance in a Knowledge Based Representation. European Conference on Computer Vision, (3):408–424, 2014.