Semantic Parsing in Spoken Language Understanding using Abstract Meaning Representation
Master’s Thesis
Presented to
The Faculty of the Graduate School of Arts and Sciences
Brandeis University
Department of Computer Science
Dr. Nianwen Xue, Advisor
In Partial Fulfillment of the Requirements for the Degree
Master of Science in
Computational Linguistics
by
Hongyuan Shen
August 2018
Copyright by
Hongyuan Shen
© 2018
Acknowledgements
For Seyoon Kang
First, and most of all, I would like to thank my advisor, Dr. Nianwen Xue, for his enthusiastic support and guidance from the moment I approached him about this topic; without his help this thesis would not have been possible. I would like to thank my committee members, Dr. James Pustejovsky and Dr. Lotus Goldberg, for their assistance and encouragement. I would like to give special thanks to the Semantic Analytics Group at the University of Rome for sharing the related dataset and transcription, and I would also like to extend my sincere gratitude to the entire Computational Linguistics Department at Brandeis University for allowing me to participate in such a wonderful discipline. Last but not least, I would like to thank my friends and everyone else who contributed to this project, and my best friends Becky, Jiajie, Ju and KaMan. Thank you for keeping me company along the road.
ABSTRACT
Semantic Parsing in Spoken Language Understanding using Abstract Meaning Representation
A thesis presented to the Department of Computer Science
Graduate School of Arts and Sciences
Brandeis University
Waltham, Massachusetts
By Hongyuan Shen
With the increasing interest in Spoken Language Understanding (SLU) related applications such as voice assistants and automated answering systems, we aim to find a better system solution for such tasks, concentrating in particular on the semantic parsing procedure that transforms spoken language into a Meaning Representation Language (MRL) in which only the semantic meaning and logical relations in the word span are preserved. The viability of the SLU stage known as "grounding", between our proposed MRL, the Abstract Meaning Representation (AMR) language, and computer-interpretable commands, is also analyzed in this thesis. To assist with the planned experiments, the first AMR corpus populated with spoken language transcription is created by manual annotation, serving as a reference for future studies on SLU-oriented AMR parsing tasks.
Keywords: SLU, AMR, annotation, semantic parsing, grounding, NLP
Contents

Acknowledgements iii
Abstract iv
Contents vi
List of Figures viii
List of Tables ix
List of Abbreviations x
1 Introduction 1
2 Background 3
2.1 Spoken Language Understanding . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Abstract Meaning Representation . . . . . . . . . . . . . . . . . . . . . . . . . 4
3 Related Work 7
3.1 SLU Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 AMR Parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4 Data Preparation 10
4.1 Annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4.2 AMR Parsing in JAMR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
5 Experiments 14
5.1 Experiment 1: Semantic Parsing in SLU . . . . . . . . . . . . . . . . . . . . . 14
5.1.1 Using Pre-trained AMR Model . . . . . . . . . . . . . . . . . . . . . . 14
5.1.2 Using Retrained AMR Model . . . . . . . . . . . . . . . . . . . . . . . 15
5.1.3 Experiment 1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 18
5.2 Experiment 2: Grounding in SLU . . . . . . . . . . . . . . . . . . . . . . . . . 19
5.2.1 Pseudo Function Designing . . . . . . . . . . . . . . . . . . . . . . . . 20
5.2.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2.3 Experiment 2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 24
6 Conclusion 26
Bibliography 28
A Annotated Sentences (Partial) 31
List of Figures

2.1 Sample SLU work flow . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 AMR Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Logical triples in Smatch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.1 PropBank frame example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4.2 Relations in AMR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
5.1 Subject issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
5.2 Pronoun issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
5.3 AMR and Robotic Command . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
List of Tables

5.1 JAMR performance using pre-trained AMR models . . . . . . . . . . . . 14
5.2 JAMR performance on HuRIC corpus using different models . . . . . . . . . 16
5.3 Performance of AC and PC classifiers . . . . . . . . . . . . . . . . . . . . . . 24
List of Abbreviations

AC Action Classification
AMR Abstract Meaning Representation
AMRL Alexa Meaning Representation Language
ASR Automatic Speech Recognition
CFG Context-Free Grammar
CI Concept Identification
CL Computational Linguistics
CRF Conditional Random Field
DAG Directed Acyclic Graph
GSL Grammar Specification Language
HuRIC Human-Robot Interaction Corpus
LSTM Long Short-Term Memory
ML Machine Learning
MRL Meaning Representation Language
MSCG Maximum Spanning Connected Subgraph
NLP Natural Language Processing
NLU Natural Language Understanding
NN Neural Network
PC Parameter Classification
RI Relation Identification
SLU Spoken Language Understanding
SVM Support Vector Machine
1 Introduction

Machine reading comprehension has long been one of the most discussed topics in the fields of Computational Linguistics (CL) and Natural Language Processing (NLP). In the mid-1960s, the computer program STUDENT (Bobrow, 1964) first demonstrated the possibility of using a machine to process natural language input. Two years later, another program that enables humans and machines to hold simple conversations (Weizenbaum, 1966) was introduced. As more related research emerged, Natural Language Understanding (NLU) became the scientific term describing such tasks, since the machine appears to be attempting to "understand" the language during the parsing process. Later, advances in Automatic Speech Recognition (ASR) technology made a subordinate branch of NLU, Spoken Language Understanding (SLU), possible. Slightly different from NLU tasks, an SLU system generally requires an additional step before language parsing: transcribing a piece of audio speech material using ASR approaches. Moreover, the transcription in SLU usually contains more ungrammatical constructions and disfluencies than the text in NLU (Ward and Issar, 1994). Due to these differences, it is worth noting that some methodologies that succeed in NLU may not be as effective when applied to an SLU task.
Abstract Meaning Representation (AMR) is one of the recently developed tools that has been applied to many NLU tasks such as Entity Linking (Pan et al., 2015) and Question Answering (Mitra and Baral, 2016). It is known for its ability to capture the semantic meaning of sentences using simple Directed Acyclic Graph (DAG) structures. AMR focuses on the intrinsic meaning rather than the syntactic structure of a sentence; as a result, sentences that share the same semantic meaning are mapped to the same AMR graph. Its universality is another favorable feature compared to other Meaning Representation Languages (MRLs): theoretically, every English sentence can be parsed into its AMR form following the annotation guideline.
However, because AMR is still a comparatively new formalism, it has scarcely been applied to SLU-oriented tasks, and annotations of SLU-related transcripts are likewise absent from the existing AMR corpus. Therefore, in this thesis, we aim to examine the effectiveness of applying AMR in an SLU project and, in the meantime, by adding new annotations of transcriptions of human speech, to expand the size and domain coverage of the current AMR corpus.
Finally, since semantic parsing is only an intermediate stage in most SLU applications, the last step of transforming the meaning representation into tangible computer-operational commands is likewise pivotal in the workflow. We therefore investigate the feasibility of extracting information from the AMR structure for command-processing purposes using Machine Learning (ML) tools, and the overall robustness of the AMR language is thus examined in an SLU-based scenario.
2 Background
2.1 Spoken Language Understanding
For most if not all SLU tasks, the ultimate objective is to extract all meaningful components from the utterance and then generate a semantic structure, built with some predefined grammar, that represents the extracted elements. One of the most commonly applied semantic representations is the frame-based structure, where semantic meanings are mapped into scene-based "frames", that is, the collective pragmatic knowledge related to such meanings. The theory of frame semantics was first proposed by Fillmore (1976). In his work, besides the frames themselves, a "slot" is created for each lexical item in a sentence that has a semantic relation with the frame. Thus, in semantic frame-based SLU, the transcription of an utterance can be separated into frames followed by their corresponding slots. For instance, when the sentence "open the door" is processed with frame semantics rules, the verb "open" is linked to a specific frame explaining the pragmatic meaning of "open", while the lexical word "door" fills a slot connected to that frame. After the transformation, the system only needs to handle a finite set of frames and slots using either knowledge-based or data-driven methods. Figure 2.1 demonstrates an example workflow of a semantic frame-based SLU system.
In the workflow, after the original utterance is transcribed into the sentence "bring the can on the dining table", the following step, "Action Detection", maps the verb "bring" to its frame "bringing", and the subsequent step maps the remaining lexical items to their corresponding slots, such as "Theme" and "Goal". Finally, the frame and slots are transformed into computer-interpretable commands through a grounding procedure (Harnad, 1990), which is further discussed in section 5.2.

FIGURE 2.1: Sample workflow of a semantic frame-based SLU system (Bastianelli et al., 2016)
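As an illustration, the output of such a pipeline can be represented as a simple frame/slot structure before grounding. The following sketch is a hypothetical Python rendering of the Figure 2.1 example; the slot assignment is illustrative and not taken from any particular toolkit.

# Hypothetical frame/slot parse for "bring the can on the dining table".
# Frame and slot names mirror the Figure 2.1 example; the exact slot
# assignment is illustrative only.
parse = {
    "frame": "bringing",                  # evoked by the verb "bring"
    "slots": {
        "Theme": "the can",               # the object being moved
        "Goal": "the dining table",       # where it should end up
    },
}

def realize(parse):
    """Flatten a frame/slot parse into a command-like string for grounding."""
    slots = ", ".join(f"{role}={value!r}" for role, value in parse["slots"].items())
    return f"{parse['frame']}({slots})"

print(realize(parse))  # bringing(Theme='the can', Goal='the dining table')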
2.2 Abstract Meaning Representation
AMR, short for Abstract Meaning Representation, is one form of Meaning Representation Language (MRL). MRLs derive from the idea of programming languages: both are constructed with formal logic methods, human-designed grammars, and computer-oriented semantic structures (Tur and De Mori, 2011). The concept of AMR was first introduced by Langkilde and Knight (1998), and it has since become one of the most popular MRLs used in research. The primary goal of AMR, according to Banarescu et al. (2013), is to "spur new work in statistical natural language understanding and generation."
AMRs are designed as single-rooted, labeled, directed graphs that follow the neo-Davidsonian style (Davidson, 1969). In addition to the graph format, to make reading and annotation smoother, the authors created another input format based on PENMAN notation (Matthiessen and Bateman, 1991), known as the AMR format. The third format of AMR is the conjunction of logical triples, named the logic format. An example of these three formats representing the sentence "the boy wants to go" can be seen in Figure 2.2.
FIGURE 2.2: Three equivalent AMR formats representing the sentence "the
boy wants to go" (Banarescu et al., 2013)
On the semantic side, AMR uses PropBank framesets (Kingsbury and Palmer, 2002; Palmer, Kingsbury, and Gildea, 2005) to construct its semantic elements. In PropBank, every verb is assigned one or more frames according to the number of its senses. For instance, when the verb "follow" is used in the sense of "be subsequent", it is linked to the frame "follow-01"; but when it means "adhere to", it is linked to another frame, "follow-02". As in other frame semantics corpora, each frame in PropBank has its own set of slots, called "arguments".
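A hedged sketch of this sense inventory as a lookup table follows; the glosses mirror the "follow" example above, while the argument descriptions are illustrative and much simpler than the actual PropBank frame files.

# Toy PropBank-style sense inventory for "follow". Glosses mirror the running
# example; the argument descriptions are illustrative, not the official role sets.
FRAMES = {
    "follow": {
        "follow-01": {"gloss": "be subsequent",
                      "args": {"ARG1": "entity that follows", "ARG2": "entity followed"}},
        "follow-02": {"gloss": "adhere to",
                      "args": {"ARG0": "follower", "ARG1": "rule or plan followed"}},
    },
}

def candidate_senses(verb):
    """Frames an annotator chooses among when labeling a verb."""
    return sorted(FRAMES.get(verb, {}))

print(candidate_senses("follow"))  # ['follow-01', 'follow-02']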
Each AMR consists of two types of elements: concepts and relations. In the graph format of AMR, concepts are represented as graph nodes and relations as edges. In the AMR format, each concept is introduced by an instance variable, and relations always start with a colon. Relations link concepts, as edges link nodes in a tree. This purely rule-based grammar abstracts away from syntactic idiosyncrasies and focuses on representing the logical features of natural language sentences.
To facilitate the annotation process for annotators and to ensure annotation consistency, an online AMR annotation tool¹ is provided with functions such as dictionary look-up, common relation shortcuts, and last-command history.
For evaluation purposes, an algorithm called Smatch (semantic match) (Cai and Knight, 2013) was proposed to calculate the parsing accuracy of generated AMR structures. During evaluation, an AMR graph is broken down into a conjunction of logical propositions, or triples. Each triple consists of one relation together with either a variable and a concept or two variables. Figure 2.3 demonstrates the triples of the sentence "the boy wants to go."

FIGURE 2.3: Logical triples of the sentence "the boy wants to go" (Cai and Knight, 2013)
¹ https://www.isi.edu/~ulf/amr/AMR-editor.html
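A minimal sketch of the Smatch computation over such triples is given below, assuming the variable mapping between the two graphs is already fixed; the actual Smatch tool additionally searches over candidate mappings with a hill-climbing procedure. The example triples follow the logic format of Figure 2.2, with a hypothetical parser output that misses one relation.

# Minimal Smatch-style scoring sketch. Real Smatch searches over variable
# mappings; here the mapping from test variables to gold variables is fixed.
def smatch_f1(gold_triples, test_triples, mapping):
    """Triples are (relation, arg1, arg2); mapping renames test variables."""
    rename = lambda t: (t[0], mapping.get(t[1], t[1]), mapping.get(t[2], t[2]))
    gold = set(gold_triples)
    test = {rename(t) for t in test_triples}
    matched = len(gold & test)
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    return 0.0 if matched == 0 else 2 * precision * recall / (precision + recall)

gold = [("instance", "w", "want-01"), ("instance", "b", "boy"),
        ("instance", "g", "go-01"), ("arg0", "w", "b"),
        ("arg1", "w", "g"), ("arg0", "g", "b")]
test = [("instance", "x1", "want-01"), ("instance", "x2", "boy"),
        ("instance", "x3", "go-01"), ("arg0", "x1", "x2"),
        ("arg1", "x1", "x3")]  # hypothetical parse missing arg0(g, b)
print(round(smatch_f1(gold, test, {"x1": "w", "x2": "b", "x3": "g"}), 3))  # 0.909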
3 Related Work
3.1 SLU Approaches
As mentioned in section 2.1, Spoken Language Understanding is the process of receiving audio speech, extracting its information, and reacting to that information based on pre-programmed commands. Two mainstream methodologies are presently adopted for tackling the SLU problem: knowledge-based and data-driven.
Knowledge-based solutions depend heavily on general linguistic knowledge and usually use a Context-Free Grammar (CFG) as the framework for organizing the input words. The first advantage of knowledge-based approaches is that, unlike data-driven methods, a grammar-based system requires very little data, if any, to be functional. Also, in a grammar-based system, the CFG designed for an SLU model can be used in both the parsing and ASR procedures to improve system accuracy. An example can be found in the research by Bos and Oka (2007), where they created an interactive robot named Gobot. In the SLU part of the task, they used a CFG model called the Grammar Specification Language (GSL) for both the semantic parsing and ASR processes, producing the transcription together with the robot's final decision.
Data-driven methods, on the other hand, do require significantly more labeled data for training. However, without the laborious grammar-designing stage, annotations for statistical SLU systems are usually much easier to create. Moreover, while adapting a knowledge-based system to new data is often a time-consuming manual effort, a data-driven approach can for the most part finish the adaptation automatically, without intervention from human specialists. Chen and Mooney (2011) used this methodology in their research, building an SLU system that parses natural-language navigation instructions into executable routes on a map. It is noteworthy that even though a data-driven procedure may save the designer's time in updating grammar rules, achieving fair accuracy for the SLU system still requires an efficient classifier together with careful manual feature engineering during parsing. Classifiers such as the SVM (Support Vector Machine) and the LSTM (Long Short-Term Memory) network are among the most popular choices in recent studies (Moschitti, Pighin, and Basili, 2008; Foland and Martin, 2016). We follow the data-driven strategy in this project: owing to the ungrammatical and fragmented nature of SLU transcription, grammar rules adopted from NLU are expected to be less effective in SLU tasks, making the knowledge-based methodology a less favorable choice.
To create a statistical classifier for SLU, besides the manual feature engineering mentioned above, the representation of outcomes must also be determined. Models designed for such representation are called Meaning Representation Languages (MRLs). In the field of CL, two of the most well-constructed and broadly used MRLs today are AMR and FrameNet (Baker, Fillmore, and Lowe, 1998), as both are designed to be general-purpose MRLs and can be applied to a wide range of semantic parsing tasks. AMR, as introduced in section 2.2, is a graph-based MRL of lexical concepts and typed relations. FrameNet is built upon frame semantics theory, and its roles and relations are scene-specific. However, in the FrameNet corpus most annotations are lexicographic, whereas in AMR almost all annotations are done at the sentence level. The lack of full-text annotation makes FrameNet less suitable for SLU tasks compared to AMR.
Within the realm of statistical SLU, several alternative MRLs have been designed. However, compared with AMR or FrameNet, the domain of those corpora is often limited, serving mainly a specific SLU task. Kollar et al. (2010) created an MRL with a sequence of spatial description clauses for route-direction parsing. Perera et al. (2018) presented an MRL intended for Alexa, a virtual assistant built for voice interaction, called the Alexa Meaning Representation Language (AMRL). The domain of AMRL is likewise restricted to the few intents that Alexa supports, such as "playback", "search" and "repeat". The downside of such MRLs is that they are isolated: they cannot benefit from the much broader range of annotated data and associated algorithms contributed by others outside the domain of SLU. In this work, we embrace AMR as the parsing method for its universality and its large amount of annotated data.
3.2 AMR Parsers
In the domain of AMR, several automatic parsing approaches have been proposed. JAMR (Flanigan et al., 2014) is the first system that parses English sentences into their AMR forms. It uses a two-step algorithm that first extracts concepts with a semi-Markov model and then identifies relations between the concepts with a Maximum Spanning Connected Subgraph (MSCG) algorithm. Another comparable graph-based parser is realized in the work of Werling, Angeli, and Manning (2015). An alternative strategy is applied in CAMR (Wang et al., 2016), a transition-based parser that generates AMR graphs from the dependency tree structure of sentences. The idea of using dependency relations for AMR parsing can likewise be found in the research by Sawai (2015) and Zhou et al. (2016). In addition, Neural Network (NN) approaches (Barzdins and Gosko, 2016; Peng et al., 2017), among the most popular computing methods in the general field of computer science, also prevail in AMR parsing. In this thesis, we adopt JAMR as the semantic parser, since graph-based parsers are less likely to be affected by syntactic errors such as the incorrect grammar found in spoken language transcripts.
4 Data Preparation

The data used in this work is prepared in two steps. First, the audio transcriptions of utterances are extracted from the Human-Robot Interaction Corpus (HuRIC) (Bastianelli et al., 2014), contributed by the Semantic Analytics Group at the University of Rome; next, the AMR graphs of the transcriptions are gathered through a separate annotation procedure, with the AMR guideline provided as a reference. For the final semantic parsing task, a total of 302 human-robot interaction sentences and their AMRs were selected. The train/dev./test sets are split into 202/50/50 sentences, respectively.
4.1 Annotation
This section describes the AMR annotation procedure for the HuRIC corpus as follows:
(1) To begin the annotation work, the annotator first becomes familiar with the AMR guideline (Banarescu, Bonial, and Cai, 2012) and the usage of PropBank frame sets.

(2) For each sentence in HuRIC, by matching its verb or verb-particle constructions to the corresponding PropBank role sets, the annotator picks the most suitable verb sense from PropBank. For instance, the main verb in the sentence "the boy wants to go" is mapped to the PropBank frame "want-01" (Figure 4.1), which stands for "desire".

FIGURE 4.1: Documentation of PropBank frame "want-01" (https://verbs.colorado.edu/propbank/framesets-english/want-v.html)

(3) By identifying the syntactic structure and semantic meaning of the sentence, the annotator decides which types of relations to use in the annotation. There are approximately 100 types of relations in AMR (Figure 4.2), so the selection process requires a certain level of proficiency from the annotator. The concepts attached to such relations are decided subsequently. In the previous example, the relations "arg0" and "arg1" are chosen for the main verb "want" in view of the roles specified in its role set "want-01" (Figure 4.1).

(4) With the relations settled, the annotator finally fills the concepts into the child-node positions of the AMR graph. If a concept has children of its own, steps 3 and 4 are repeated. The annotation stops when every concept in the span is connected to at least one relation and every child node is connected to its parent concept; the sketch after Figure 4.2 illustrates this stopping condition. Figure 2.2 demonstrates the finished AMR annotation of the sentence "the boy wants to go."
FIGURE 4.2: A selection of relations in AMR (Banarescu et al., 2013)
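The stopping condition in step (4) amounts to checking that every concept is reachable from the root through parent-child relations. A minimal sketch, assuming the AMR is stored as an edge list of (parent, relation, child) triples:

from collections import defaultdict

# Sketch of the step (4) stopping condition: every concept must be reachable
# from the root. An illustration, not actual annotator tooling.
def is_complete(root, concepts, edges):
    children = defaultdict(list)
    for parent, _rel, child in edges:
        children[parent].append(child)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen == set(concepts)

# "the boy wants to go": want-01 (w) is the root; b and g attach below it.
edges = [("w", "arg0", "b"), ("w", "arg1", "g"), ("g", "arg0", "b")]
print(is_complete("w", {"w", "b", "g"}, edges))  # True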
4.2 AMR Parsing in JAMR
AMR parsing in JAMR comprises two stages: Concept Identification (CI) and Relation Identification (RI). In the first stage, AMR concepts are extracted from the sentence using a semi-Markov sequence labeling algorithm. In the second stage, relations are generated over the selected concepts using a maximum spanning connected subgraph (MSCG) algorithm. During data preparation, an automatic aligner maps each word in the sentence to its concept fragment. The word span is then processed with the Illinois Named Entity Tagger (Ratinov and Roth, 2009) and the Stanford Parser (Klein and Manning, 2003; De Marneffe, MacCartney, and Manning, 2006) to obtain named entities, part-of-speech tags, and dependency structures. For training in the CI stage, the NEs, POS tags, and basic dependencies are used as the input, with the sentence labeled with concept subgraph fragments as the output. In the RI stage, the same input set as in the CI stage, plus the labeled concepts, is used as the input, and the AMR graph is used as the output.
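To give a feel for the RI stage, the following sketch greedily connects concept nodes with the highest-scoring candidate edges, a Kruskal-style simplification; the actual MSCG algorithm in JAMR scores edges with trained weights and enforces additional constraints during the search.

# Simplified illustration of relation identification as a maximum spanning
# connected subgraph: connect all concepts using the best-scoring edges,
# then keep any remaining positive-scoring edges. Scores here are made up.
def connect_concepts(concepts, scored_edges):
    parent = {c: c for c in concepts}

    def find(x):                          # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    graph = []
    for score, src, rel, dst in sorted(scored_edges, reverse=True):
        if find(src) != find(dst):        # edge joins two components
            parent[find(src)] = find(dst)
            graph.append((src, rel, dst))
        elif score > 0:                   # extra positive edge inside a component
            graph.append((src, rel, dst))
    return graph

edges = [(2.1, "bring-01", "ARG1", "mug"), (1.4, "bring-01", "ARG2", "kitchen"),
         (-0.3, "mug", "mod", "kitchen")]
print(connect_concepts({"bring-01", "mug", "kitchen"}, edges))
# [('bring-01', 'ARG1', 'mug'), ('bring-01', 'ARG2', 'kitchen')]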
5 Experiments
5.1 Experiment 1: Semantic Parsing in SLU
As addressed previously, the effectiveness of using AMR for semantic parsing in SLU tasks is one of the primary focuses of this thesis. In this section, we first demonstrate the results of AMR parsing on an SLU task using pre-trained JAMR models, then compare the outcome with a model retrained on HuRIC, an SLU-oriented corpus. A summary is provided at the end of the section.
5.1.1 Using Pre-trained AMR Model
Several models trained on large annotated datasets have been released for JAMR since the parser was created. The most up-to-date ones are the two models trained in 2016 on the sembanks (semantic treebanks) LDC2014T12 and LDC2015E86 (Flanigan et al., 2016). Since both models cover a wide range of data and vocabulary, it is hypothesized that when a much smaller test set, the HuRIC corpus, is applied, the parser can give equivalent or better results. We therefore first apply the two pre-trained models to SLU parsing on the annotated HuRIC corpus. Table 5.1 shows the JAMR performance on the HuRIC test set alongside the published results on the original datasets.
TABLE 5.1: JAMR performance using pre-trained AMR models

                     P(CI+RI)  R(CI+RI)  F1(CI+RI)  F1(CI only)  F1(RI only)
LDC2014T12           0.679     0.643     0.660      -            -
LDC2014T12 (HuRIC)   0.700     0.715     0.707      0.817        0.818
LDC2015E86           0.697     0.645     0.670      0.769        0.784
LDC2015E86 (HuRIC)   0.714     0.718     0.716      0.847        0.837
In Table 5.1, "CI+RI" refers to running both the Concept Identification and Relation Identification steps in JAMR, while "CI only" means only the CI step is performed. In "RI only", gold CI is provided as the input.
The results show that the HuRIC test set obtains moderately higher F1 scores than the original datasets. This outcome is expected, since the HuRIC corpus contains considerably less vocabulary than the other two corpora. The syntactic structure of HuRIC sentences is also simpler than in the other two datasets, which use newswire as their primary source of input (Knight et al., 2014). Nonetheless, the strategy of using pre-trained AMR models proves to be a viable solution for semantic parsing in SLU tasks.
5.1.2 Using Retrained AMR Model
As mentioned in subsection 5.1.1, the majority of the annotation behind the pre-trained JAMR models comes from newswire and other formal articles. Such written texts differ from the transcription used in common SLU tasks in the following respects:

• Human utterances are likely to be ungrammatical and disfluent, since they cannot be proofread, whereas newspaper articles are reviewed multiple times before publishing and are, most of the time, error-free. SLU parsing can be even harder when there is no gold transcription and only text generated by an ASR system is provided as the input.

• In SLU tasks such as human-machine interaction, many transcribed sentences are in the imperative mood (e.g., "bring me the book in the dining room"). The subject of an imperative sentence is usually omitted, and the syntax of such sentences likewise differs from the subject-verb-object (SVO) structure most frequently used in written articles.
Because of these inconsistencies between SLU transcription and large-scale annotated datasets, a model pre-trained only on general-purpose corpora is unlikely to let the semantic parser achieve satisfactory accuracy in SLU systems. Therefore, in this work, using JAMR's built-in training functions, a model is retrained on both the sembank data and the training data from the HuRIC corpus, so that it covers spoken-language features such as grammatical errors and different syntactic structures. The results of SLU parsing with this retrained model can be found in Table 5.2.
TABLE 5.2: JAMR performance on the HuRIC corpus using different models

                  P(CI+RI)  R(CI+RI)  F1(CI+RI)  F1(CI only)  F1(RI only)
LDC2014T12        0.700     0.715     0.707      0.817        0.818
LDC2015E86        0.714     0.718     0.716      0.847        0.837
Retrained Model   0.818     0.826     0.821      0.861        0.908
The outcome of modifying the training model is encouraging: with the retrained model, CI+RI parsing gains roughly 10 points of F1 over the best pre-trained model. The F1 of RI parsing with gold CI also picks up a 7-point improvement, reaching over 90%. This significant advance in parsing performance confirms the effectiveness of model retraining. A few parsed AMRs are given below to demonstrate the parser's improvements.
• Subject issue

Figure 5.1 demonstrates the improvement of the retrained model in handling imperative sentences. The left AMR is the human-annotated graph for the sentence "take the mug to the coffee table in the living room", whereas the other two graphs show the JAMR outputs using the self-retrained model and the pre-trained model LDC2015E86. The parsed result from model LDC2015E86 clearly deviates from the original meaning of the sentence: rendered back into natural language, the parse reads roughly as "the living room takes a mug to the coffee table." Evidently, when using a model pre-trained on formal articles, the parser tends to assign a subject to the sentence even when none exists. By contrast, the retrained model effectively fixes this issue: since AMR allows relations originating from the same concept node to appear in any order, the AMRs from the human annotation and the retrained model can be considered identical.
FIGURE 5.1: AMRs for sentence "take the mug to the coffee table in the living
room" from gold annotation and two different models.
• Pronoun issue

Coreference phenomena also appear frequently in SLU transcription. For example, in the sentence "search (for) the book and bring it to me", the pronoun "it" refers to the concept "book", and this coreference should be captured by the parser. As Figure 5.2 shows, the pre-trained model does not succeed in identifying the pronoun, while the retrained model matches the gold AMR annotation.
FIGURE 5.2: AMRs for sentence "Search (for) the book and bring it to me"
from gold annotation and two different models.
5.1.3 Experiment 1 Summary
To better understand the high performance in SLU sentence parsing, we should first look at how JAMR works. As presented in section 4.2, semantic parsing in JAMR involves two stages: Concept Identification and Relation Identification. Unlike many other AMR parsers, such as the transition-based CAMR, JAMR does not treat the dependency structure as deterministic information for either CI or RI. Rather, it serves merely as one of many features used during CI and RI classification. This characteristic underlies JAMR's robustness on SLU-related tasks, which often contain ungrammatical and disfluent sentences that harm dependency parsing accuracy. This advantage ensures a solid baseline for the parser in SLU sentence parsing.
The retrained model, on the other hand, further enhances the parser's performance. Since this experiment is a data-driven task, all results are bound by the quantity and quality of the data. The largest annotated AMR corpus contains around 40,000 sentences, which, in terms of quantity, makes the models pre-trained on such large-scale corpora capable of classifying relatively small test sets such as the one used in this thesis. However, because the corpora used in the pre-trained models have somewhat different grammatical and syntactic structures from the SLU corpus, the parser may not fully capture the features embedded in the SLU transcription when relying on the previous models alone. The retrained model makes up for this shortcoming: it familiarizes the parser with spoken language examples, so the framework is more robust in dealing with SLU-specific structures.
5.2 Experiment 2: Grounding in SLU
A well-designed MRL system for SLU should represent sentences in a suitably structured form that supports robust grounding techniques (Harnad, 1990), the procedure required to transform the semantic interpretation within the MRL into intrinsic, machine-interpretable commands. Within the realm of SLU there exists a variety of applications and scenarios, such as voice assistants, chatbots, automatic calling systems, and interactive games; in different scenarios, the grounding procedure may vary depending on the front-end design. For instance, the grounding for a music application is likely to consist of links between the NEs of songs or singers and actual audio track files, whereas in a role-playing game the grounding is more likely to consist of pointers between targeted semantic vocabulary and the choices a non-player character makes.

If the MRL is the source code of the information and the final non-symbolic representation (music files, game choices) of the system is the object code, then grounding can be considered the decoding process between the two. With the object code fixed, the structure of the source code is decisive for the performance of grounding algorithms. In this section, we explore the robustness of AMR by applying it to a human-robot interaction scenario based on the HuRIC corpus.
5.2.1 Pseudo Function Designing
In this work, our objective is to generate robotic commands from AMR graphs using Information Extraction (IE) methods. The rationale is that if a high level of accuracy is achieved on this IE task, the AMR structure can be considered valid for grounding in similar SLU systems. The robustness of AMR is then examined through the practicality of extracting features from its semantic structure. Due to the size and time constraints of this experiment, the commands are designed to mimic robotic arguments and are not executable by existing robot control systems. However, all necessary inputs that can be extracted from the transcript are included in the design for simulation purposes. The pseudo-robotic command is made up of the following three parts:

• Functions. A function is a pre-programmed set of moves that the robot performs.

• Parameters. The parameters determine in detail how the set of moves in a function is performed.

• Return types. The return type of a function can be either none or a valid value.
Based on the contents of the HuRIC corpus, we have summarized the following seven actions along with their corresponding pseudo-robotic functions:

• Define_Action ("there is a radio next to the bed")
  set_info(C1): C2; return[None]

• Relocate_Action ("bring me my towel that is in the bathroom")
  set_location(C1): C2; return[None]

• Search_Action ("find the fruit")
  get_location(Loc): C1; return[Loc]

• New_State_Action ("turn on the tv")
  set_state(C1): C2; return[None]

• Check_Action ("check main door status")
  get_state(St): C1; return[St]

• Follow_Action ("follow the person in front of you")
  self_follow(): C1; return[None]

• Grab_Action ("take the phone near the table")
  self_grab(): C1; return[None]
In the above pseudo-function representation, C1 and C2 stand for the concepts taken as inputs. The mapping between AMR graphs and the pseudo functions is explained through Figure 5.3, where both the AMR and the robotic-command form of the sentence "there is a radio next to the bed" are displayed.
FIGURE 5.3: AMR and Robotic Command for sentence "there is a radio next
to the bed."
In the AMR format, the character "#" introduces a comment line. Any text after this symbol is visible only to human readers and is ignored by the AMR parser. The note "Define_Action" in the first comment line indicates that the function type of this AMR is "set_info", and the note "parameters" in the second line indicates that the parameters of this function are the instances x4 and x5, representing the concepts "radio" and "next-to" respectively. In the pseudo command, the function can be read as "set the information of ITEM to INFO." Based on the parameter list specified in the AMR, the values of ITEM and INFO are then determined as x4 and x5. The "op1" relation linked to concept x8 is also included because it is a necessary component for concept x5 to be functional. The grounding procedure is considered finished once both the action type and the parameters of the pseudo command have been selected. Detailed classification strategies are given in the next subsection.
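A hedged sketch of this grounding step over blocks formatted as in Appendix A is shown below. The block mirrors the Figure 5.3 example; its relation names and the regex-based concept lookup are simplifications for illustration, not the extraction code actually used.

import re

# Ground an annotated AMR block (Appendix A style) into a pseudo command.
# Assumes "# ::action" and "# ::parameters" metadata lines are present;
# the instance lookup is a simplified regex, not a full AMR parser.
AMR_FUNCTIONS = {"Define_Action": "set_info", "Relocate_Action": "set_location",
                 "Search_Action": "get_location", "New_State_Action": "set_state",
                 "Check_Action": "get_state", "Follow_Action": "self_follow",
                 "Grab_Action": "self_grab"}

def ground(block):
    action = re.search(r"# ::action (\S+)", block).group(1)
    params = re.search(r"# ::parameters (.+)", block).group(1).split()
    concepts = dict(re.findall(r"\((\S+) / ([\w-]+)", block))  # variable -> concept
    args = ", ".join(concepts.get(p, p) for p in params)
    return f"{AMR_FUNCTIONS[action]}({args})"

block = """# ::snt there is a radio next to the bed
# ::parameters x4 x5
# ::action Define_Action
(x4 / radio
    :location (x5 / next-to
        :op1 (x8 / bed)))"""
print(ground(block))  # set_info(radio, next-to)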
5.2.2 Classification
The classification task is split into two steps: Action Classification (AC) and Parameter Classification (PC). In the AC step, an SVM classifier built on the Python module Scikit-learn (Pedregosa et al., 2012) is used to identify the action type. In the PC step, the parameters are identified using a scikit-learn-compatible Conditional Random Field (CRF) classifier. The data used in this task comprises 266 HuRIC sentences with their annotated AMR graphs, action tags, and parameters. The train/dev./test sets are split into 206/30/30 sentences, respectively. The features designed for both classifiers are listed below, followed by a sketch of the AC pipeline.
Features used in the Action Classifier:

First_word: the first word in the sentence
First_pos: the first POS tag in the sentence
First_is_verb: True if the first POS is VB; otherwise False
Arg_count: number of direct arguments in the AMR graph (arguments that directly connect to the top node)
1st_arg_word: the top node of the first argument (NONE if no first argument)
2nd_arg_word: the top node of the second argument (NONE if no second argument)
3rd_arg_word: the top node of the third argument (NONE if no third argument)
Features used in the Parameter Classifier:

Word: the token itself
Pos: the POS tag of the token
Action: the action tag of the token's sentence
Is_arg: True if the token is an argument's top node
-1_word: the token before the current token
-1_pos: the POS tag of the token before the current token
BOS: True if the token is at the beginning of the sentence
+1_word: the token after the current token
+1_pos: the POS tag of the token after the current token
EOS: True if the token is at the end of the sentence
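As a sketch of the AC step, the features above can be encoded as Python dictionaries and fed to Scikit-learn through a DictVectorizer followed by a linear SVM. The examples structure below is a hypothetical pre-processed view of the corpus, with tokens, POS tags, and direct AMR arguments already extracted; it illustrates the pipeline rather than the exact experimental setup.

from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def ac_features(ex):
    """Build the Action Classifier feature dictionary listed above."""
    args = ex["direct_args"]              # top concepts directly under the root
    return {
        "first_word": ex["tokens"][0],
        "first_pos": ex["pos"][0],
        "first_is_verb": ex["pos"][0] == "VB",
        "arg_count": len(args),
        "1st_arg_word": args[0] if len(args) > 0 else "NONE",
        "2nd_arg_word": args[1] if len(args) > 1 else "NONE",
        "3rd_arg_word": args[2] if len(args) > 2 else "NONE",
    }

# Hypothetical pre-processed examples (the real task uses 206 training sentences).
examples = [
    {"tokens": ["follow", "me"], "pos": ["VB", "PRP"],
     "direct_args": ["i"], "action": "Follow_Action"},
    {"tokens": ["find", "the", "lamp"], "pos": ["VB", "DT", "NN"],
     "direct_args": ["lamp"], "action": "Search_Action"},
]
clf = make_pipeline(DictVectorizer(), LinearSVC())
clf.fit([ac_features(e) for e in examples], [e["action"] for e in examples])
print(clf.predict([ac_features(examples[0])]))  # ['Follow_Action']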
Table 5.3 shows the performance of both classifiers. In the PC step, the tag "(I)" marks scores computed only over the concepts selected as parameters, while "(All)" marks scores computed over all concepts. "AC+PC" stands for the complete grounding process: the "action_tag" feature in PC is generated directly by AC rather than extracted from the gold annotation.
TABLE 5.3: Performance of AC and PC classifiers

            Precision  Recall  F1
AC only     0.967      0.929   0.948
PC (All)    0.947      0.947   0.947
PC (I)      0.943      0.825   0.880
AC+PC (I)   0.886      0.775   0.827

It is expected that the AC step achieves such high accuracy, because only seven action tags are considered in this classification task. Besides, there exist several strong connections between certain actions and main verbs (e.g. "Search_Action"/"find", "Check_Action"/"check"), which may serve as a "shortcut" for the classifier in making correct choices. In addition, because the concepts outside the parameter list usually outnumber the tagged concepts, it is reasonable that PC over all concepts obtains a higher F1. The importance of the action tag feature in the PC step is obvious: with gold action tags, the classifier generates 88% of the parameters correctly, whereas the F1 drops to 82% when only the action tags from the AC step are provided.
5.2.3 Experiment 2 Summary
Unlike natural languages, AMR aims to minimize the influence of syntax and promotes a logic-based framework. The syntax-free grammar makes concept extraction more accurate and provides a solid foundation for grounding. The AMR relation bank, which covers a wide range of cases that occur in SLU applications, further supports the grounding procedure. AMR is not designed for any specific domain or dataset, yet at the same time it is compatible with most semantic parsing tasks. This universality underlies its robustness and flexibility in dealing with new domains for which little prior knowledge exists. Due to the size constraints of the corpus, the classification accuracy for AMR grounding is not optimal in this task. However, with more annotated training data and further feature engineering, better classifier performance should be achievable in the future.
6 Conclusion

Through the experiments in chapter 5, we have demonstrated the potential of utilizing the AMR language for SLU-oriented semantic parsing tasks. By annotating a corpus populated with spoken language transcription, we have created an SLU dataset containing features seldom seen in other AMR datasets, such as grammatical errors and the imperative mood. To examine the compatibility of the newly annotated dataset, we test it on the semantic parser JAMR using models trained on previous AMR corpora. The outcome shows that the SLU annotation can be parsed by the same process as other AMR datasets, and it even achieves a higher F-measure as a result of its limited vocabulary and sentence length. To investigate whether the performance of semantic parsing in SLU can be further enhanced by reinforcing the parser with SLU-specific features, we then retrain a model using the created annotation and rerun the parser on the same test set. The high F1 score of 82% proves the advantage of the retraining strategy and provides a direction for future semantic parsing tasks in SLU. Last but not least, because the parsed semantic structure typically needs to be converted into some intrinsic computer representation for an SLU system to be functional, the robustness of the structure in assisting the grounding procedure is also tested in our work. The results indicate that while AMR offers great simplicity at the level of syntax and provides a clear logical representation of concepts and relations, the grounding between AMR and computer-interpretable commands is still not trivial, especially when very little training data is provided.
In this work, we offer a possible direction for future SLU-oriented projects using AMR as the semantic parsing tool, and we provide an annotated corpus as a reference and an initial step for such tasks. As all data-driven systems rely on the quantity of data, it is believed that further expanding the existing annotated corpus will yield a further advance in the performance of such systems.
Bibliography

Baker, Collin F., Charles J. Fillmore, and John B. Lowe (1998). "The Berkeley FrameNet Project". In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics.

Banarescu, L., C. Bonial, and S. Cai (2012). "Abstract Meaning Representation (AMR) 1.0 specification".

Banarescu, Laura et al. (2013). "Abstract Meaning Representation for sembanking". In: Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pp. 178-186.

Barzdins, Guntis and Didzis Gosko (2016). "RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy". In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1143-1147.

Bastianelli, Emanuele et al. (2014). "HuRIC: a Human Robot Interaction Corpus". In: LREC, pp. 4519-4526.

Bastianelli, Emanuele et al. (2016). "A discriminative approach to grounded spoken language understanding in interactive robotics". In: IJCAI International Joint Conference on Artificial Intelligence, pp. 2747-2753.

Bobrow, Daniel G. (1964). Natural Language Input for a Computer Problem Solving System.

Bos, Johan and Tetsushi Oka (2007). "A spoken language interface with a mobile robot". In: Artificial Life and Robotics 11.1, pp. 42-47.

Cai, Shu and Kevin Knight (2013). "Smatch: an Evaluation Metric for Semantic Feature Structures". In: Annual Meeting of the Association for Computational Linguistics (ACL).

Chen, David L. and Raymond J. Mooney (2011). "Learning to Interpret Natural Language Navigation Instructions from Observations". In: AAAI Conference on Artificial Intelligence, pp. 859-865.

Davidson, Donald (1969). "The Individuation of Events". In: Essays on Actions and Events, pp. 163-180.

De Marneffe, Marie-Catherine, Bill MacCartney, and Christopher D. Manning (2006). "Generating typed dependency parses from phrase structure parses". In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006).

Fillmore, Charles J. (1976). "Frame Semantics and the Nature of Language". In: Annals of the New York Academy of Sciences.

Flanigan, Jeffrey et al. (2014). "A Discriminative Graph-Based Parser for the Abstract Meaning Representation". In: ACL.

Flanigan, Jeffrey et al. (2016). "CMU at SemEval-2016 Task 8: Graph-based AMR Parsing with Infinite Ramp Loss". In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1202-1206.

Foland, William R. and James H. Martin (2016). "CU-NLP at SemEval-2016 Task 8: AMR Parsing using LSTM-based Recurrent Neural Networks". In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1197-1201.

Harnad, Stevan (1990). "The Symbol Grounding Problem". In: Physica D.

Kingsbury, Paul and Martha Palmer (2002). "From TreeBank to PropBank". In: Proceedings of the International Conference on Language Resources and Evaluation (LREC).

Klein, Dan and Christopher D. Manning (2003). "Accurate unlexicalized parsing". In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL '03).

Knight, Kevin et al. (2014). "Abstract Meaning Representation (AMR) Annotation Release 1.0 LDC2014T12". Philadelphia: Linguistic Data Consortium.

Kollar, Thomas et al. (2010). "Toward understanding natural language directions". In: Proceedings of the 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI '10).

Langkilde, Irene and Kevin Knight (1998). "Generation that exploits corpus-based statistical knowledge". In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics, p. 704.

Matthiessen, Christian M. I. M. and John A. Bateman (1991). "Text Generation and Systemic-Functional Linguistics: Experiences from English and Japanese". In: Computational Linguistics, pp. 201-203.

Mitra, Arindam and Chitta Baral (2016). "Addressing a Question Answering Challenge by Combining Statistical Methods with Inductive Rule Learning and Reasoning". In: Proceedings of the 30th Conference on Artificial Intelligence (AAAI 2016), pp. 2779-2785.

Moschitti, Alessandro, Daniele Pighin, and Roberto Basili (2008). "Tree kernels for semantic role labeling". In: Computational Linguistics.

Palmer, Martha, Paul Kingsbury, and Daniel Gildea (2005). "The proposition bank: An annotated corpus of semantic roles". In: Computational Linguistics.

Pan, Xiaoman et al. (2015). "Unsupervised Entity Linking with Abstract Meaning Representation". In: NAACL 2015, pp. 1130-1139.

Pedregosa, Fabian et al. (2012). "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research. arXiv: 1201.0490.

Peng, Xiaochang et al. (2017). "Addressing the Data Sparsity Issue in Neural AMR Parsing". arXiv: 1702.05053.

Perera, Vittorio et al. (2018). "Multi-Task Learning for Parsing the Alexa Meaning Representation Language". In: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pp. 5390-5397.

Ratinov, Lev and Dan Roth (2009). "Design challenges and misconceptions in named entity recognition". In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL '09).

Sawai, Yuichiro (2015). "Semantic Structure Analysis of Noun Phrases using Abstract Meaning Representation". In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 851-856.

Tur, Gokhan and Renato De Mori (2011). Spoken Language Understanding: Systems for Extracting Semantic Information from Speech.

Wang, Chuan et al. (2016). "CAMR at SemEval-2016 Task 8: An Extended Transition-based AMR Parser". In: Proceedings of SemEval, pp. 1064-1069.

Ward, Wayne and Sunil Issar (1994). "Recent improvements in the CMU spoken language understanding system". In: Proceedings of the Workshop on Human Language Technology, pp. 213-216.

Weizenbaum, Joseph (1966). "ELIZA — A Computer Program For the Study of Natural Language Communication Between Man And Machine". In: Communications of the ACM 9.1, pp. 36-45.

Werling, Keenon, Gabor Angeli, and Christopher Manning (2015). "Robust Subgraph Generation Improves Abstract Meaning Representation Parsing". arXiv: 1506.03139.

Zhou, Junsheng et al. (2016). "AMR Parsing with an Incremental Joint Model". In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 680-689.
A Annotated Sentences (Partial)

# ::id 1
# ::snt follow this guy
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x3 / guy
:mod (x2 / this)))
# ::id 2
# ::snt this is a table with a glass deck
# ::parameters x4 x8
# ::action Define_Action
(x4 / table
:domain (x1 / this)
:ARG2-of (x8 / deck
:mod (x7 / glass)))
# ::id 3
# ::snt search for the coffee cups
# ::parameters x5
# ::action Search_Action
(x1 / search-01
:ARG2 (x5 / cup
:mod (x4 / coffee)))
# ::id 4
# ::snt carry the book to my nightstand
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / carry-01
:ARG1 (x3 / book)
:destination (x6 / nightstand
:poss (x5 / i)))
# ::id 5
# ::snt go to the kitchen
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / kitchen))
# ::id 6
# ::snt carry the mug to the bathroom
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / carry-01
:ARG1 (x3 / mug)
:destination (x6 / bathroom))
# ::id 7
# ::snt find the lamp
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / lamp))
# ::id 8
# ::snt bring the mobile phone to the livingroom
# ::parameters x4 x7
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x4 / phone
:mod (x3 / mobile-02))
:ARG2 (x7 / livingroom))
# ::id 9
# ::snt find the refrigerator
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / refrigerator))
# ::id 10
# ::snt follow the person with the blonde hair and the black pants fast
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG2 (x3 / person
:ARG1-of (x8 / and
:op1 (x7 / hair
:ARG1-of (x6 / blonde))
:op2 (x11 / pants
:ARG1-of (x10 / black-05))))
:manner (x12 / fast))
# ::id 11
# ::snt move near the right lamp
# ::parameters x0 x2
# ::action Relocate_Action
(x1 / move-01
:destination (x2 / near-01
:op1 (x5 / lamp
:mod (x4 / right))))
# ::id 12
# ::snt bring the mug to the couch in the living room
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / mug)
:ARG2 (x6 / couch
:location (x10 / room
:mod (x9 / live-01))))
# ::id 13
# ::snt there are two sinks in the kitchen
# ::parameters x4 x7
# ::action Define_Action
(x4 / sink-01
:quant 2
:location (x7 / kitchen))
# ::id 14
# ::snt bring the mug to the kitchen
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / mug)
:ARG2 (x6 / kitchen))
# ::id 15
# ::snt follow the person behind you
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x3 / person
:location (x4 / behind
:op1 (x5 / you))))
# ::id 16
# ::snt drive to the fridge
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / drive-01
:destination (x4 / fridge))
# ::id 17
# ::snt search for the lamp
# ::parameters x4
# ::action Search_Action
(x1 / search-01
:ARG2 (x4 / lamp))
# ::id 18
# ::snt follow me carefully
# ::parameters x0 x2
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x2 / i)
:manner (x3 / careful))
# ::id 19
# ::snt bring mug to bedroom
# ::parameters x2 x4
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x2 / mug)
:ARG2 (x4 / bedroom))
# ::id 20
# ::snt search for towel
# ::parameters x3
# ::action Search_Action
(x1 / search-01
:ARG2 (x3 / towel))
# ::id 21
# ::snt the fridge is on your right side
# ::parameters x2 x7
# ::action Define_Action
(x2 / fridge
:location (x7 / side
:mod (x6 / right)
:op1 (x5 / you)))
# ::id 22
# ::snt drive to kitchen
# ::parameters x0 x3
# ::action Relocate_Action
(x1 / drive-01
:destination (x3 / kitchen))
# ::id 23
# ::snt follow that person
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x3 / person
:mod (x2 / that)))
# ::id 24
# ::snt put the mug in the living room
# ::parameters x3 x10
# ::action Relocate_Action
(x1 / put-01
:ARG1 (x3 / mug)
:ARG2 (x10 / room
:mod (x9 / live-01)))
# ::id 25
# ::snt find the wine in the dining room
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / wine)
:location (x7 / room
:mod (x6 / dine-01)))
# ::id 26
# ::snt go to the bathroom
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / bathroom))
# ::id 27
# ::snt there are a lot of couches in the living room
# ::parameters x6 x10
# ::action Define_Action
(x6 / couch
:quant (x4 / lot)
:location (x10 / room
:mod (x9 / live-01)))
# ::id 28
# ::snt go to the living room and find a drink
(x6 / and
:op1 (x1 / go-01
:ARG4 (x5 / room
:mod (x4 / live-01)))
:op2 (x7 / find-01
:ARG1 (x9 / drink)))
# ::id 29
# ::snt there is some bread on the desk
# ::parameters x4 x7
# ::action Define_Action
(x4 / bread
:quant (x3 / some)
:location (x7 / desk))
# ::id 30
# ::snt carry the mug to the dining room
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / carry-01
:ARG1 (x3 / mug)
:destination (x6 / room
:mod (x4 / dine-01)))
# ::id 31
# ::snt go to the living room
# ::parameters x0 x5
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x5 / room
:mod (x4 / live-01)))
# ::id 32
# ::snt you are in the bedroom and the bed is between two lamps
(x6 / and
:op1 (x1 / you
:location (x5 / bedroom))
:op2 (x8 / bed
:location (x10 / between
:op1 (x12 / lamp
:quant 2))))
# ::id 33
# ::snt bring my phone to the bathroom
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / phone)
:ARG2 (x6 / bathroom))
# ::id 34
# ::snt follow the guy with the blue jacket
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x3 / guy)
:ARG1-of (x7 / jacket
:ARG1-of (x6 / blue)))
# ::id 35
# ::snt bring me the coke from the fridge
# ::parameters x4 x2
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x4 / coke)
:ARG2 (x2 / i)
:ARG4 (x7 / fridge))
# ::id 36
# ::snt this is a bed room
# ::parameters x5 x0
# ::action Define_Action
(x5 / room
:domain (x1 / this)
:mod (x4 / bed))
# ::id 37
# ::snt follow the person in front of you
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG2 (x3 / person
:location (x5 / front
:op1 (x7 / you))))
# ::id 38
# ::snt move to the living room
# ::parameters x0 x5
# ::action Relocate_Action
(x1 / move-01
:ARG2 (x5 / room
:mod (x4 / live-01)))
# ::id 39
# ::snt bring me the pillow from the bed
# ::parameters x4 x2
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x4 / pillow)
:ARG2 (x2 / i)
:ARG4 (x7 / bed))
# ::id 40
# ::snt bring the phone to the dinner table
# ::parameters x3 x7
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / phone)
:ARG2 (x7 / table
:mod (x6 / dinner)))
# ::id 41
# ::snt follow the person in front of you
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG2 (x3 / person
:location (x5 / front
:op1 (x7 / you))))
# ::id 42
# ::snt go to the sofa and search for the pillow
(x5 / and
:op1 (x1 / go-01
:ARG4 (x4 / sofa))
:op2 (x6 / search-01
:ARG2 (x9 / pillow)))
# ::id 43
# ::snt go quickly to the corner and follow the skinny person
(x6 / and
:op1 (x1 / go-01
:ARG4 (x5 / corner)
:manner (x2 / quick-02))
:op2 (x7 / follow-01
:ARG1 (x10 / person
:ARG1-of (x9 / skin))))
# ::id 44
# ::snt move to the lamp on the right side of the bed
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / move-01
:ARG2 (x4 / lamp
:location (x8 / side
:mod (x7 / right)
:ARG1-of (x11 / bed))))
# ::id 45
# ::snt place the mug on the sink nearest to the refrigerator
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / place
:ARG1 (x3 / mug)
:ARG2 (x6 / sink-01
:location (x7 / near
:op1 (x10 / refrigerator)
:degree (xap0 / most))))
# ::id 46
# ::snt there is a bed with two lamps
# ::parameters x4 x7
# ::action Define_Action
(x4 / bed
:ARG1-of (x7 / lamp
:quant 2))
# ::id 47
# ::snt go to the mirror
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / mirror))
# ::id 48
# ::snt find me a cushion
# ::parameters x4 x2
# ::action Relocate_Action
(x1 / find-01
:ARG1 (x4 / cushion)
:ARG2 (x2 / i))
# ::id 49
# ::snt go along with them
# ::parameters x4
# ::action Follow_Action
(x1 / go-01
:accompanier (x4 / them))
# ::id 50
# ::snt bring the mug to the bedroom
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / mug)
:ARG2 (x6 / bedroom))
# ::id 51
# ::snt go to the table
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / table))
# ::id 52
# ::snt the living room is very light and bright
(x7 / and
:op1 (x6 / light
:degree (x5 / very))
:op2 (x8 / bright
:degree x5)
:domain (x3 / room
:mod (x2 / live-01)))
# ::id 53
# ::snt slowly follow my father
# ::parameters xap0
# ::action Follow_Action
(x2 / follow-01
:ARG1 (xap0 / person
:ARG0-of (x4 / have-rel-role-91
:ARG2 (f / father))
:poss (x3 / i))
:manner (x1 / slow))
# ::id 54
# ::snt bring the phone to the living room
# ::parameters x3 x7
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / phone)
:ARG2 (x7 / room
:mod (x6 / live-01)))
# ::id 55
# ::snt get the blue slippers from the bathroom
# ::parameters x4 x0
# ::action Relocate_Action
(x1 / get-01
:ARG1 (x4 / slipper
:ARG1-of (x3 / blue))
:ARG2 (x7 / bathroom))
# ::id 56
# ::snt bring the book near the lamp
# ::parameters x3 x0
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x3 / book)
:location (x4 / near
:op1 (x6 / lamp)))
# ::id 57
# ::snt the bath-tub is on the left side
# ::parameters x2 x7
# ::action Define_Action
(x2 / bath-tub
:location (x7 / side
:mod (x6 / left)))
# ::id 58
# ::snt find the flowers
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / flower))
# ::id 59
# ::snt follow me slowly
# ::parameters x2
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x2 / i)
:manner (x3 / slow))
# ::id 60
# ::snt go to the door
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / door))
# ::id 61
# ::snt follow me
# ::parameters x2
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x2 / i))
# ::id 62
# ::snt bring me my telephone from the couch
# ::parameters x4 x2
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x4 / telephone-01
:poss (x3 / i))
:ARG2 (x2 / i)
:ARG4 (x7 / couch))
# ::id 63
# ::snt this is a very wide and bright livingroom
# ::parameters x8 x6
# ::action Define_Action
(x8 / livingroom
:ARG1 (x6 / and
:op1 (x5 / wide-02
:degree (x4 / very))
:op2 (x7 / bright-02
:degree x4))
:domain (x1 / this))
# ::id 64
# ::snt bring me the towel from the bathroom
# ::parameters x4 x2
# ::action Relocate_Action
(x1 / bring-01
:ARG1 (x4 / towel)
:ARG2 (x2 / i)
:ARG4 (x7 / bathroom))
# ::id 65
# ::snt follow me
# ::parameters x2
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x2 / i))
# ::id 66
# ::snt find a glass in the dining room
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / glass
:location (x7 / room
:mod (x6 / dine-01))))
# ::id 67
# ::snt move to the fridge
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / move-01
:ARG2 (x4 / fridge))
# ::id 68
# ::snt go to the dining table
# ::parameters x0 x5
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x5 / table
:mod (x4 / dine-01)))
# ::id 69
# ::snt go get a book
(xap0 / and
:op1 (x1 / go-01)
:op2 (x2 / get-01
:ARG1 (x4 / book)))
# ::id 70
# ::snt this is a bathroom with a shower bath and double sink
# ::parameters x4 x9
# ::action Define_Action
(x4 / bathroom
:domain (x1 / this)
:ARG1-of (x9 / and
:op1 (x8 / bath
:mod (x7 / shower-01))
:op2 (x11 / sink-01
:quant 2)))
# ::id 71
# ::snt take the mug to the coffee table in the living room
# ::parameters x3 x7
# ::action Relocate_Action
(x1 / take-01
:ARG1 (x3 / mug)
:ARG3 (x7 / table
:mod (x6 / coffee)
:location (x11 / room
:mod (x10 / live-01))))
# ::id 72
# ::snt go to the bathroom
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / bathroom))
# ::id 73
# ::snt go follow my sister around the house
(xap0 / and
:op1 (x1 / go-01)
:op2 (x2 / follow-01
:location (x5 / around
:op1 (x7 / house))
:ARG1 (xap1 / person
:ARG0-of (x4 / have-rel-role-91
:ARG2 (s / sister))
:poss (x3 / i))))
# ::id 74
# ::snt grab my wine glass from the dining room
# ::parameters x4 x0
# ::action Relocate_Action
(x1 / grab-01
:ARG1 (x4 / glass
:mod (x3 / wine)
:poss (x2 / i))
:location (x8 / room
:mod (x7 / dine-01)))
# ::id 75
# ::snt go to the kitchen and bring me some bread from the pantry
(x5 / and
:op1 (x1 / go-01
:ARG4 (x4 / kitchen))
:op2 (x6 / bring-01
:ARG1 (x9 / bread
:mod (x8 / some))
:ARG2 (x2 / i)
:ARG4 (x12 / pantry)))
# ::id 76
# ::snt go to the fridge inside the kitchen
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / fridge)
:location (x5 / inside
:op1 (x7 / kitchen)))
# ::id 77
# ::snt follow that guy over there
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x3 / guy
:mod (x2 / that))
:location (x4 / there))
# ::id 78
# ::snt take my phone and place it on the bench in the kitchen
(x4 / and
:op1 (x1 / take-01
:ARG1 (x3 / phone
:poss (x2 / i)))
:op2 (x5 / place
:ARG1 (x3
:location (x9 / bench
:location (x12 / kitchen)))))
# ::id 79
# ::snt the sink is in the kitchen
# ::parameters x2 x6
# ::action Define_Action
(x2 / sink-01
:location (x6 / kitchen))
# ::id 80
# ::snt get my coat from the closet
# ::parameters x3 x0
# ::action Relocate_Action
(x1 / get-01
:ARG1 (x3 / coat
:poss (x2 / i)
:ARG2 (x6 / closet))
# ::id 81
# ::snt come with me
# ::parameters x2
# ::action Follow_Action
(x1 / come-01
:prep-with (x2 / i))
# ::id 82
# ::snt take the mug to the bedroom
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / take-01
:ARG1 (x3 / mug)
:ARG2 (x6 / bedroom))
# ::id 83
# ::snt find a plate
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / plate))
# ::id 84
# ::snt carry this mug to the bedstand
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / carry-01
:ARG1 (x3 / mug
:mod (x2 / this))
:destination (x6 / bedstand))
# ::id 85
# ::snt go to the bedroom
# ::parameters x0 x4
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x4 / bedroom))
# ::id 86
# ::snt follow the person in front of me
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG2 (x3 / person
:location (x5 / front
:op1 (x7 / i))))
# ::id 87
# ::snt this is a bathroom where the door is on the right
# ::parameters x4 x7
# ::action Define_Action
(x4 / bathroom
:domain (x1 / this)
:ARG1-of (x7 / door
:location (x11 / right-05)))
# ::id 88
# ::snt take my cellphone to the bedroom
# ::parameters x3 x6
# ::action Relocate_Action
(x1 / take-01
:ARG1 (x3 / cellphone
:poss (x2 / i))
:ARG3 (x6 / bedroom))
# ::id 89
# ::snt find the refrigerator
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / refrigerator))
# ::id 90
# ::snt this is a living room with white furniture
# ::parameters x5 x8
# ::action Define_Action
(x5 / room
:domain (x1 / this)
:mod (x4 / live-01)
:ARG1-of (x8 / furniture
:ARG1-of (x7 / white)))
# ::id 91
# ::snt put the cell phone on the dining room table
# ::parameters x4 x9
# ::action Relocate_Action
(x1 / put-01
:ARG1 (x4 / phone
:mod (x3 / cell))
:ARG2 (x9 / table
:mod (x8 / room
:mod (x7 / dine-01))))
# ::id 92
# ::snt go to the dining room
# ::parameters x0 x5
# ::action Relocate_Action
(x1 / go-01
:ARG4 (x5 / room
:mod (x4 / dine-01)))
# ::id 93
# ::snt find the lamp in the living room
# ::parameters x3
# ::action Search_Action
(x1 / find-01
:ARG1 (x3 / lamp)
:location (x7 / room
:mod (x6 / live-01)))
# ::id 94
# ::snt follow the man closely
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
:ARG1 (x3 / man)
:manner (x4 / close))
# ::id 95
# ::snt fetch me the tissue box
# ::parameters x5 x2
# ::action Relocate_Action
(x1 / fetch
:ARG1 (x5 / box
:mod (x4 / tissue))
:ARG2 (x2 / i))
# ::id 96
# ::snt find me the butcher knife
# ::parameters x5 x2
# ::action Search_Action
(x1 / find-01
:ARG1 (x5 / knife
:mod (x4 / butcher))
:ARG2 (x2 / i))
# ::id 97
# ::snt go over to the sofa
# ::parameters x0 x5
# ::action Relocate_Action
(x1 / go-01
:direction (x2 / over)
:ARG4 (x5 / sofa))
# ::id 98
# ::snt place the mug to the head of the table
# ::parameters x3 x9
# ::action Relocate_Action
(x1 / place
:ARG1 (x3 / mug)
:ARG2 (x9 / table
:ARG1-of (x6 / head)))
# ::id 99
# ::snt follow my mother to the garden
# ::parameters xap0
# ::action Follow_Action
(x1 / follow-01
:destination (x6 / garden)
:ARG2 (xap0 / person
:ARG0-of (x3 / have-rel-role-91
:ARG2 (m / mother))
:poss (x2 / i)))
# ::id 100
# ::snt follow my friend into the living room
# ::parameters xap0
# ::action Follow_Action
(x1 / follow-01
:ARG2 (xap0 / person
:ARG0-of (x3 / have-rel-role-91
:ARG2 (f / friend))
:poss (x2 / i))
:destination (x7 / room
:mod (x6 / live-01)))