Semantic Parsing in Spoken Language Understanding using Abstract Meaning Representation

Master’s Thesis

Presented to

The Faculty of the Graduate School of Arts and Sciences
Brandeis University

Department of Computer Science
Dr. Nianwen Xue, Advisor

In Partial Fulfillment of the Requirements for the Degree

Master of Science in Computational Linguistics

by
Hongyuan Shen

August 2018


Copyright by

Hongyuan Shen

© 2018


Acknowledgements

For Seyoon Kang

First, and most of all, I would like to thank my advisor, Dr. Nianwen Xue, for his enthusiastic support and guidance from the moment I approached him about this topic. Without your help this thesis would not have been possible. I would like to thank my committee members, Dr. James Pustejovsky and Dr. Lotus Goldberg, for their assistance and encouragement. I would like to give special thanks to the Semantic Analytics Group at the University of Rome for sharing the related dataset and transcription, and I would also like to extend my sincere gratitude to the entire Computational Linguistics Department at Brandeis University for allowing me to participate in such a wonderful discipline. Last but not least, I would like to thank my friends and everyone else who helped contribute to this project, and to my best friends Becky, Jiajie, Ju and KaMan. Thank you for keeping me company along the road.


ABSTRACT

Semantic Parsing in Spoken Language Understanding using Abstract Meaning Representation

A thesis presented to the Department of Computer Science

Graduate School of Arts and Sciences
Brandeis University

Waltham, Massachusetts

By Hongyuan Shen

With the increasing interest in Spoken Language Understanding (SLU) related applications such as voice assistants and automated answering systems, we aim to find a better system solution for such tasks, concentrating in particular on the semantic parsing procedure that transforms spoken language into a Meaning Representation Language (MRL) in which only the semantic meaning and logical relations in the word span are preserved. The viability of the stage in SLU known as "grounding", between our proposed MRL, the Abstract Meaning Representation (AMR) language, and computer-interpretable commands, is also analyzed in this thesis. To assist with the planned experiments, the first AMR corpus populated with spoken language transcription is created by manual annotation, providing a reference for future studies on SLU-oriented AMR parsing tasks.

Keywords: SLU, AMR, annotation, semantic parsing, grounding, NLP


Contents

Acknowledgements
Abstract
Contents
List of Figures
List of Tables
List of Abbreviations
1 Introduction
2 Background
  2.1 Spoken Language Understanding
  2.2 Abstract Meaning Representation
3 Related Work
  3.1 SLU Approaches
  3.2 AMR Parsers
4 Data Preparation
  4.1 Annotation
  4.2 AMR Parsing in JAMR
5 Experiments
  5.1 Experiment 1: Semantic Parsing in SLU
    5.1.1 Using Pre-trained AMR Model
    5.1.2 Using Retrained AMR Model
    5.1.3 Experiment 1 Summary
  5.2 Experiment 2: Grounding in SLU
    5.2.1 Pseudo Function Designing
    5.2.2 Classification
    5.2.3 Experiment 2 Summary
6 Conclusion
Bibliography
A Annotated Sentences (Partial)


List of Figures

2.1 Sample SLU work flow
2.2 AMR Formats
2.3 Logical triples in Smatch
4.1 PropBank frame example
4.2 Relations in AMR
5.1 Subject issue
5.2 Pronoun issue
5.3 AMR and Robotic Command


List of Tables

5.1 JAMR performance using pre-trained AMR models
5.2 JAMR performance on HuRIC corpus using different models
5.3 Performance of AC and PC classifiers


List of Abbreviations

AC Action Classification

AMR Abstract Meaning Representation

AMRL Alexa Meaning Representation Language

ASR Automatic Speech Recognition

CFG Context-Free Grammar

CI Concept Identification

CL Computational Linguistics

CRF Conditional Random Field

DAG Directed Acyclic Graph

GSL Grammar Specification Language

HuRIC Human-Robot Interaction Corpus

LSTM Long Short-Term Memory

ML Machine Learning

MRL Meaning Representation Language

MSCG Maximum Spanning Connected Subgraph

NLP Natural Language Processing

NLU Natural Language Understanding

NN Neural Network

PC Parameter Classification

RI Relation Identification

SLU Spoken Language Understanding

SVM Support Vector Machine


1 Introduction

Machine reading comprehension has always been one of the hottest topics for discussion in the fields of Computational Linguistics (CL) and Natural Language Processing (NLP). In the mid 1960s, the computer program STUDENT (Bobrow, 1964) first showed the feasibility of using a machine to process natural language input. Two years later, another program that enabled humans and the machine to have simple conversations (Weizenbaum, 1966) was introduced. As more related research emerged, Natural Language Understanding (NLU) became the term used to describe such tasks, since the machine appears to be attempting to "understand" the language during the parsing process. Later on, advances in Automatic Speech Recognition (ASR) technology made a subordinate branch of NLU possible: Spoken Language Understanding (SLU). Slightly different from tasks in NLU, an SLU system generally requires an additional step before language parsing, namely transcribing a piece of audio speech material using ASR approaches. Moreover, the transcription in SLU usually contains more ungrammatical constructions and disfluencies than the text in NLU (Ward and Issar, 1994). Due to these differences, it is worth noting that some methodologies that succeed in NLU may not be as effective when carried over to an SLU task.

The Abstract Meaning Representation (AMR) language is one of the recently developed tools that has been utilized in many NLU tasks such as Entity Linking (Pan et al., 2015) and Question Answering (Mitra and Baral, 2016). It is known for its capability of capturing the semantic meaning of sentences using simple Directed Acyclic Graph (DAG) structures. AMR focuses on the intrinsic meaning rather than the syntactic structure of the sentence, and as a result sentences that share comparable semantic meaning are mapped to the same AMR graph. Its universality is another favorable feature compared to other Meaning Representation Languages (MRLs): theoretically, all English sentences can be parsed into their AMR form following the annotation guideline.

However, because AMR is still a comparatively new concept, this method has scarcely been applied to SLU-oriented tasks, and annotation of SLU-related transcripts is also absent from the corpus. Therefore, in this thesis, we aim to examine the effectiveness of applying AMR in an SLU project, and in the meantime, by adding new annotations of transcriptions from human speech, we plan to expand the size and domain coverage of the current AMR corpus.

Finally, since semantic parsing is only an intermediate stage in most SLU applications, the last step of transforming the meaning representation into tangible computer-operational commands is likewise pivotal in the work flow. We therefore investigate the feasibility of extracting information from the AMR structure for command-processing purposes using Machine Learning (ML) tools, and the overall robustness of the AMR language is thus examined in an SLU-based scenario.


2 Background

2.1 Spoken Language Understanding

For most if not all SLU tasks, the ultimate objective is to extract all meaningful components from the utterance, and then generate a semantic structure built with some pre-defined grammar that represents the extracted elements. One of the most commonly applied semantic representations is the frame-based structure, where the semantic meanings are mapped into scene-based "frames", that is, the collective pragmatic knowledge related to such meanings. The theory of frame semantics was first proposed by Fillmore (1976). In his work, besides the frames, for each lexical item that has semantic relations with the frame in a sentence, a "slot" is created to fit it in with the frame. Thus, in semantic frame-based SLU, the transcription of an utterance can be separated into frames followed by their corresponding slots. For instance, when the sentence "open the door" is processed using frame semantics rules, the verb "open" is linked to a specific frame explaining the pragmatic meaning of "open", while the lexical word "door" is filled into a slot that connects to the frame. After the transformation, the system only needs to handle a finite set of frames and slots using either knowledge-based or data-driven methods. Figure 2.1 demonstrates an example work flow of a semantic frame-based SLU system.

FIGURE 2.1: Sample work flow of a semantic frame-based SLU system (Bastianelli et al., 2016)

In this work flow, after the original utterance is transcribed to the sentence "bring the can on the dining table", the following step "Action Detection" maps the verb "bring" to its frame "bringing", and the step after that maps the rest of the lexical items to their corresponding slots, such as "Theme" and "Goal". Finally, the frame and slots are transformed into computer-interpretable commands through a grounding procedure (Harnad, 1990), which will be further discussed in section 5.2.
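As a rough illustration of the intermediate frame-based structure described above, the detected frame and its filled slots can be thought of as a simple mapping. The sketch below uses plain Python data structures with hypothetical slot values inferred from the example sentence; it is not the actual HuRIC representation.

# A rough sketch (plain Python, not the actual HuRIC format) of the
# frame/slot structure produced for "bring the can on the dining table".
frame_interpretation = {
    "frame": "bringing",                 # selected by the Action Detection step
    "slots": {                           # hypothetical slot filling
        "Theme": "the can",              # object to be moved
        "Goal": "on the dining table",   # destination of the action
    },
}

# Grounding then maps this finite frame/slot inventory onto concrete,
# computer-interpretable commands (see section 5.2).
print(frame_interpretation["frame"], frame_interpretation["slots"])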

2.2 Abstract Meaning Representation

AMR, an abbreviation for Abstract Meaning Representation, is one form of Meaning Representation Language (MRL). The latter derives from the idea of programming languages, in that both are constructed using formal logic methods with a human-designed grammar and a computer-oriented semantic structure (Tur and De Mori, 2011). The concept of AMR was first introduced by Langkilde and Knight (1998), and it has turned out to be one of the most popular MRLs used in research since then. The primary goal of AMR, according to Banarescu et al. (2013), is to "spur new work in statistical natural language understanding and generation."

AMRs are designed as single-rooted, labeled and directed graphs that follow the neo-Davidsonian style (Davidson, 1969). In addition to the graph format, to make the reading and annotation process smoother, the authors have additionally created another input format based on PENMAN notation (Matthiessen and Bateman, 1991), which is known as the AMR format. The third format of AMR is designed as a conjunction of logical triples, which is named the logic format. An example of these three formats representing the sentence "the boy wants to go" can be seen in Figure 2.2.

FIGURE 2.2: Three equivalent AMR formats representing the sentence "the boy wants to go" (Banarescu et al., 2013)
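For readers without access to the figure, the AMR (PENMAN) format of this sentence, as given in Banarescu et al. (2013), looks approximately as follows; the relation names come from the PropBank frameset "want-01", where :ARG0 is the wanter and :ARG1 the thing wanted.

(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-01
      :ARG0 b))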

When it comes to the semantic part, AMR uses PropBank framesets (Kingsbury and Palmer, 2002; Palmer, Kingsbury, and Gildea, 2005) for its semantic element construction. In PropBank, every verb is assigned to one or more frames based on the number of its senses. For instance, when the verb "follow" carries the sense of "be subsequent", it is linked to the frame "follow-01"; but when it has the meaning of "adhere to" in a sentence, it is linked to another frame, "follow-02". Like other frame semantics corpora, PropBank gives each frame its own set of slots called "arguments".

Each AMR consists of two types of elements: concepts and relations. In the graph format of AMR, concepts are represented as graph nodes and relations as edges. In the AMR format, concepts are always introduced by an instance variable, and relations always start with a colon. Relations link concepts, as edges link nodes in a tree. This purely rule-based grammar abstracts away from syntactic idiosyncrasies and focuses on representing the logical features of natural language sentences.

To facilitate the annotation process for annotators and to ensure annotation consistency, an online AMR annotation tool (https://www.isi.edu/~ulf/amr/AMR-editor.html) is provided with functions such as dictionary look-up, common relation shortcuts, and last-command history.

For evaluation purposes, an algorithm called Smatch (semantic match) (Cai and Knight, 2013) was proposed to calculate the parsing accuracy of generated AMR structures. During the evaluation procedure, an AMR graph is broken down into a conjunction of logical propositions, or triples. Each triple consists of one relation and either a variable plus a concept or two variables. Figure 2.3 demonstrates the triples for the sentence "the boy wants to go."

FIGURE 2.3: Logical triples of the sentence "the boy wants to go." (Cai and Knight, 2013)
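Smatch then scores a parsed AMR against the gold AMR by searching for the variable mapping that maximizes the number of matching triples and computing precision, recall, and F1 over them. The minimal Python sketch below illustrates only the scoring step with hand-written triples; the "parsed" set is a hypothetical parser output that misses one edge, and the mapping search itself is described in Cai and Knight (2013).

# Triples of the gold AMR for "the boy wants to go" (after Cai and Knight, 2013)
gold = {
    ("instance", "w", "want-01"), ("instance", "b", "boy"),
    ("instance", "g", "go-01"),
    ("ARG0", "w", "b"), ("ARG1", "w", "g"), ("ARG0", "g", "b"),
}

# A hypothetical parser output that misses the reentrant ARG0(g, b) edge
parsed = {
    ("instance", "w", "want-01"), ("instance", "b", "boy"),
    ("instance", "g", "go-01"),
    ("ARG0", "w", "b"), ("ARG1", "w", "g"),
}

matched = len(gold & parsed)
precision = matched / len(parsed)            # 5 / 5
recall = matched / len(gold)                 # 5 / 6
f1 = 2 * precision * recall / (precision + recall)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")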


3 Related Work

3.1 SLU Approaches

As mentioned in section 2.1, Spoken Language Understanding is the process of receiving audio speech, extracting the information, and reacting to that information based on pre-programmed commands. There are presently two mainstream methodologies adopted for tackling the SLU problem: knowledge-based and data-driven.

Knowledge-based solutions depend heavily on general linguistic knowledge and usually use a Context-Free Grammar (CFG) as the framework into which the input words are fitted. The first advantage of knowledge-based approaches is that, unlike data-driven methods, a grammar-based system requires very little data, if any, to be functional. Also, in a grammar-based system, the CFG designed for an SLU model can be used in both the parsing and ASR procedures to improve system accuracy. An example can be found in the research by Bos and Oka (2007), where they created an interactive robot named Gobot. In the SLU part of the task, they used a CFG model called the Grammar Specification Language (GSL) for both semantic parsing and the ASR process, and produced the transcription together with the final decision of the robot.

Data-driven methods, on the other hand, do require significantly more labeled data for training purposes. However, without the laborious grammar-design stage, the annotations for statistical SLU systems are usually much easier to create. Moreover, while knowledge-based systems often make the adaptation to new data a time-consuming manual effort, a data-driven approach can for the most part finish the adaptation process automatically, without intervention from human specialists. Chen and Mooney (2011) used this methodology in their research, where they built an SLU system that parses natural-language navigation instructions into executable routes on a map. It is noteworthy that even though a data-driven learning procedure may save the program designer's time in updating grammar rules, an efficient classifier, accompanied by manual feature engineering during parsing, is still necessary to achieve fair accuracy for the SLU system. Classifiers such as the SVM (Support Vector Machine) and the LSTM (Long Short-Term Memory) are some of the most popular choices applied in recent studies (Moschitti, Pighin, and Basili, 2008; Foland and Martin, 2016). We follow the data-driven strategy in this project: due to the ungrammatical and fragmented nature of SLU transcription, grammar rules adopted in NLU are expected to be less effective in SLU tasks, which makes the knowledge-based methodology a less favorable choice.

To create a statistical classifier for SLU, other than the manual engineering of features mentioned above, the representation of the output must also be determined. Models designed for such representation are called Meaning Representation Languages (MRLs). In the field of CL, two of the most well-constructed and broadly used MRLs nowadays are AMR and FrameNet (Baker, Fillmore, and Lowe, 1998), as they are both designed to be general-purpose MRLs and can both be applied to a wide range of semantic parsing tasks. AMR, as introduced in section 2.2, is a graph-based MRL of lexical concepts and typed relations. FrameNet is built upon frame semantics theory, and its roles and relations are scene-specific. However, in the FrameNet corpus most annotations are lexicographic, whereas in AMR almost all annotations are done at the sentence level. The lack of full-text annotation makes FrameNet less suitable for SLU tasks compared to AMR.

Within the realm of statistical SLU, several alternative MRLs have been designed. However, compared to AMR or FrameNet, the domain of those corpora is often limited, serving mainly a specific task in SLU. Kollar et al. (2010) created an MRL with a sequence of spatial description clauses for route-direction parsing. Perera et al. (2018) presented an MRL intended for Alexa, a virtual assistant built for voice interaction, called the Alexa Meaning Representation Language (AMRL). The domain of AMRL is also restricted to the few intents that Alexa supports, such as "playback", "search" and "repeat". The downside of such MRLs is that they are isolated on their own: they cannot benefit from the much broader range of annotated data and associated algorithms contributed by others outside the domain of SLU. In this work, we embrace AMR as the parsing method for its universality and the large amount of available annotated data.

3.2 AMR Parsers

In the domain of AMR, several automatic parsing approaches have been proposed. JAMR (Flanigan et al., 2014) is the first system that parses English sentences into their AMR format. It utilizes a two-step algorithm that first extracts concepts using a semi-Markov model, and then identifies relations between the concepts using a Maximum Spanning Connected Subgraph (MSCG) algorithm. Another comparable graph-based parser is realized in the work of Werling, Angeli, and Manning (2015). An alternative strategy is applied in CAMR (Wang et al., 2016), a transition-based parser that generates AMR graphs using the dependency tree structure of sentences. The idea of using dependency relations for AMR parsing can likewise be found in the research by Sawai (2015) and Zhou et al. (2016). In addition, as one of the most popular computing methods in the general field of computer science, Neural Network (NN) approaches (Barzdins and Gosko, 2016; Peng et al., 2017) also prevail in the field of AMR parsing. In this thesis, we adopt JAMR as the semantic parser, since graph-based parsers are less likely to be affected by syntactic errors such as the incorrect grammar that exists in spoken language transcripts.


4 Data Preparation

The data used in this work is prepared through the following two steps. First, the audio transcriptions of utterances are extracted from the Human-Robot Interaction Corpus (HuRIC) (Bastianelli et al., 2014) contributed by the Semantic Analytics Group at the University of Rome; next, the AMR graphs of the transcriptions are gathered through a separate annotation procedure with the AMR guideline provided as a reference. For the final semantic parsing task, a total of 302 human-robot interaction sentences and their AMRs were selected. The train/dev./test sets are split into 202/50/50 sentences, respectively.

4.1 Annotation

This section describes the AMR annotation procedure for the HuRIC corpus as follows:

(1) To begin the annotation work, the annotator first becomes familiar with the AMR guideline (Banarescu, Bonial, and Cai, 2012) and the usage of the PropBank framesets.

(2) For each sentence in HuRIC, by matching its verb or verb-particle constructions to their corresponding PropBank role sets, the annotator picks the most suitable verb sense from PropBank. For instance, the main verb in the sentence "the boy wants to go" is mapped to the PropBank frame "want-01" (Figure 4.1), which stands for "desire".

FIGURE 4.1: Documentation of PropBank frame "want-01" (https://verbs.colorado.edu/propbank/framesets-english/want-v.html)

(3) By identifying the syntactic structure and semantic meaning of the sentence, the annotator decides on the types of relations to use in the annotation. There exist approximately 100 types of relations in AMR (Figure 4.2); therefore the selection process requires a certain level of proficiency from the annotator. The concepts related to such relations are decided subsequently. In the previous example, the relations "arg0" and "arg1" are chosen for the main verb "want" based on the roles specified in its role set "want-01" (Figure 4.1).

(4) With the relations settled, the annotator finally fills the concepts into the child-node positions of the AMR graph. If a concept has child nodes of its own, steps 3 to 4 are repeated. The annotation stops when all concepts in the span are connected to at least one relation and all child nodes are connected to their parent concept. Figure 2.2 demonstrates the finished AMR annotation of the sentence "the boy wants to go."


FIGURE 4.2: A selection of relations in AMR (Banarescu et al., 2013)

4.2 AMR Parsing in JAMR

The AMR parsing in JAMR comprises two stages: Concept Identification (CI) and Relation Identification (RI). In the first stage, concepts of the AMR are extracted from the sentence using a semi-Markov sequence labeling algorithm. In the second stage, relations are generated over the selected concepts using a maximum spanning connected subgraph (MSCG) algorithm. During the data preparation process, an automatic aligner maps each word in the sentence to its concept fragment. The word span is then processed with the Illinois Named Entity Tagger (Ratinov and Roth, 2009) and the Stanford Parser (Klein and Manning, 2003; De Marneffe, MacCartney, and Manning, 2006) to obtain its named entities, part-of-speech tags, and dependency structures. For the training procedure in the CI stage, NEs, POS tags, and basic dependencies are used as the input, with the sentence labeled with concept subgraph fragments as the output. In the RI stage, the same input set as in the CI stage is used together with the labeled concepts, and the AMR graph is used as the output.


5 Experiments

5.1 Experiment 1: Semantic Parsing in SLU

As addressed previously, the effectiveness of using AMR for semantic parsing in SLU tasks is one of the primary focuses of this thesis. In this section, we first demonstrate the result of AMR parsing in an SLU task using pre-trained models processed in JAMR, then compare the outcome with a model that has been retrained on HuRIC, an SLU-oriented corpus. A summary is provided at the end of the section.

5.1.1 Using Pre-trained AMR Model

Several models trained on large annotated datasets have been released for JAMR since the parser was created. The most up-to-date ones include the two models trained in 2016 using the sembanks (semantic treebanks) LDC2014T12 and LDC2015E86 (Flanigan et al., 2016). Since the two models both cover a wide range of data and vocabulary, it is hypothesized that when a much smaller test set - the HuRIC corpus - is applied, the parser will be able to give equivalent or better results. We therefore first adopt the two pre-trained models for SLU parsing on the annotated HuRIC corpus. Table 5.1 shows the JAMR performance on the HuRIC test set alongside the results reported on the original datasets.

TABLE 5.1: JAMR performance using pre-trained AMR models

Model                 P(CI+RI)  R(CI+RI)  F1(CI+RI)  F1(CI only)  F1(RI only)
LDC2014T12            0.679     0.643     0.660      -            -
LDC2014T12 (HuRIC)    0.700     0.715     0.707      0.817        0.818
LDC2015E86            0.697     0.645     0.670      0.769        0.784
LDC2015E86 (HuRIC)    0.714     0.718     0.716      0.847        0.837


In Table 5.1, "CI + RI" refers to running both the Concept Identification and Relation Identification steps in JAMR, whereas "CI only" means only the CI step is performed. In "RI only", gold CI is provided as the input.

From the results, it can be seen that the HuRIC test set obtains moderately higher F1 scores than the original datasets. This outcome is expected, since the HuRIC corpus contains a considerably smaller vocabulary than the other two tested corpora. The syntactic structure of sentences in HuRIC is also simpler than in the other two datasets, which use newswire as their primary source of input (Knight et al., 2014). Nonetheless, the strategy of using pre-trained AMR models proves to be a viable solution for semantic parsing in SLU tasks.

5.1.2 Using Retrained AMR Model

As mentioned in subsection 5.1.1, in the pre-trained models of JAMR, the majority of the annotation comes from newswire and other formal articles. However, such written texts differ from the transcriptions used in common SLU tasks in the following aspects:

• Human utterances are likely to be ungrammatical and lacking in fluency due to the absence of proofreading, whereas newspaper articles are reviewed multiple times before publishing and are, most of the time, error-free. The situation for SLU parsing may be even worse when there is no gold transcription and only text generated by an ASR system is provided as the input.

• In SLU tasks such as human-machine interaction, many transcribed sentences are in the imperative mood (e.g., "bring me the book in the dining room"). The subject in an imperative sentence is usually omitted, and the syntax of such sentences likewise differs from the most frequently used subject-verb-object (SVO) structure in written articles.

Because of these inconsistencies between SLU transcription and large-scale annotated datasets, a pre-trained model built only from general-purpose corpora is unlikely to be sufficient for the semantic parser to achieve satisfactory accuracy in SLU systems. Therefore, in this work, with the help of built-in functions in JAMR, a model is retrained using both the sembank data and the training portion of the HuRIC corpus, so that it covers spoken-language features such as grammatical errors and different syntactic structures. The result of SLU parsing using this retrained model can be found in Table 5.2.

TABLE 5.2: JAMR performance on the HuRIC corpus using different models

Model             P(CI+RI)  R(CI+RI)  F1(CI+RI)  F1(CI only)  F1(RI only)
LDC2014T12        0.700     0.715     0.707      0.817        0.818
LDC2015E86        0.714     0.718     0.716      0.847        0.837
Retrained Model   0.818     0.826     0.821      0.861        0.908

The outcome of modifying the training model is encouraging: with the retrained model, the CI+RI F1 score leaps by roughly 10 points compared with the best F1 obtained from the pre-trained models (0.821 vs. 0.716). The F1 of RI parsing with gold CI also picks up a 7-point improvement and exceeds 90%. This significant advance in parsing performance confirms the effectiveness of model retraining. A few examples of parsed AMRs are given below to help demonstrate the improvements of the parser.

• Subject issue

Figure 5.1 demonstrates the improvement of the retrained model in handling imperative sentences. The left AMR is the human-annotated graph for the sentence "take the mug to the coffee table in the living room", whereas the other two graphs represent the parsed results from JAMR using the self-retrained model and the pre-trained model LDC2015E86. Through observation, we can see that the parsed result from model LDC2015E86 is clearly different from the original meaning of the sentence. Translated back into natural language, the parse roughly reads as "the living room takes a mug to the coffee table." Apparently, when using the model pre-trained on formal articles, the parser often tends to assign a subject to the sentence, even when one does not exist. By contrast, the retrained model effectively fixes this issue: since AMR allows relations originating from the same concept node to appear in any order, the two AMRs from the human annotation and the retrained model can be considered identical.

FIGURE 5.1: AMRs for the sentence "take the mug to the coffee table in the living room" from the gold annotation and two different models.

• Pronoun issue

Coreference phenomena also appear frequently in SLU transcription. For example, in the sentence "search (for) the book and bring it to me", the pronoun "it" refers to the concept "book", and this coreference should be captured by the parser. However, the pre-trained model does not succeed in identifying the pronoun. The retrained model, on the other hand, matches the gold AMR annotation.

FIGURE 5.2: AMRs for the sentence "search (for) the book and bring it to me" from the gold annotation and two different models.

5.1.3 Experiment 1 Summary

To better understand the high performance in SLU sentence parsing, we should first look at the way JAMR works. As presented in section 4.2, semantic parsing in JAMR involves two stages: Concept Identification and Relation Identification. Unlike many other AMR parsers, such as the transition-based CAMR, JAMR does not treat the dependency structure as deterministic information for either CI or RI. Rather, it serves merely as one of many features used during CI and RI classification. This characteristic of JAMR accounts for its robustness in handling SLU-related tasks, since those tasks often contain ungrammatical and disfluent sentences that harm dependency parsing accuracy. This advantage ensures a solid baseline for the parser during SLU sentence parsing.

The retrained model, on the other hand, further enhances the performance of the parser. Since this experiment is a data-driven task, all results are bound to the quantity and quality of the data. The largest annotated AMR corpus contains around 40,000 sentences, which, in terms of quantity, makes pre-trained models built from such large-scale corpora capable of classifying relatively small test sets such as the one used in this thesis. However, because the corpora used in the pre-trained models have somewhat different grammatical and syntactic structures from the SLU corpus, the parser may not be able to capture the features embedded in the SLU transcription based entirely on the previous models. The retrained model makes up for this shortcoming of the system: it familiarizes the parser with spoken language examples, so the framework is more robust in dealing with SLU-specific structures.

5.2 Experiment 2: Grounding in SLU

A well-designed MRL system for SLU should represent sentences in a suitably structured form that supports robust grounding techniques (Harnad, 1990), the procedure required to transform the semantic interpretation within the MRL into intrinsic, machine-interpretable commands. Within the realm of SLU there exists a variety of applications and scenarios, such as voice assistants, chatbots, automatic calling systems and interactive gaming; in different scenarios the grounding procedure may vary depending on the front-end design. For instance, the grounding for a music application is likely to consist of links between NEs of songs/singers and actual audio track files, whereas in a role-playing game the grounding is more inclined to be pointers between targeted semantic vocabulary and the choices the non-player character makes.

If the MRL is the source code of the information and the final non-symbolic representation of the system (music files, game choices) is the object code, then grounding can be considered the decoding process between the two. With the object code fixed, the structure of the source code is decisive for the performance of grounding algorithms. In this section, we explore the robustness of AMR by applying it to a human-robot interaction scenario involving the HuRIC corpus.


5.2.1 Pseudo Function Designing

In this work, our objective is to generate robotic commands based on AMR graphs using Information Extraction (IE) methods. The idea is that if a high level of accuracy is achieved in this IE task, the structure of AMR can be considered valid for grounding in similar SLU systems. The robustness of AMR is thus examined by assessing how practicable it is to extract features from the semantic structure. Due to the size and time constraints of this experiment, the commands are designed to mimic the form of robotic arguments and are not executable by existing robot control systems. However, all necessary inputs that can be extracted from the transcript are included in the design for better simulation purposes. The pseudo-robotic command is made up of the following three parts:

• Functions. A function is a pre-programmed set of moves that the robot performs.

• Parameters. The parameters determine how the set of moves in a function is performed in detail.

• Return types. The return type of a function can either be none or a valid value.

Based on the contents of the HuRIC corpus, we have summarized the following seven actions along with their corresponding pseudo-robotic functions:

• Define_Action ("there is a radio next to the bed")
  set_info(C1): C2; return[None]

• Relocate_Action ("bring me my towel that is in the bathroom")
  set_location(C1): C2; return[None]

• Search_Action ("find the fruit")
  get_location(Loc): C1; return[Loc]

• New_State_Action ("turn on the tv")
  set_state(C1): C2; return[None]

• Check_Action ("check main door status")
  get_state(St): C1; return[St]

• Follow_Action ("follow the person in front of you")
  self_follow(): C1; return[None]

• Grab_Action ("take the phone near the table")
  self_grab(): C1; return[None]

In the above pseudo-function representation, C1 and C2 stand for the concepts taken as inputs. The mapping between AMR graphs and the pseudo functions is explained through Figure 5.3, where both the AMR and the robotic-command form of the sentence "there is a radio next to the bed" are displayed.

FIGURE 5.3: AMR and robotic command for the sentence "there is a radio next to the bed."

In the AMR format, the character "#" marks a comment line. Any text after this symbol is visible only to human readers and is ignored by the AMR parser. The note "Define_Action" in the first comment line indicates that the function type of this AMR is "set_info", and the note "parameters" in the second line indicates that the parameters of this function are the instances x4 and x5, representing the concept items "radio" and "next-to" respectively. In the pseudo command, the function can be read as "set the information of ITEM to INFO." Based on the parameter list specified in the AMR, the values of ITEM and INFO are then determined as x4 and x5. The "op1" relation linked with concept x8 is also included because it is a necessary component for concept x5 to be functional. The grounding procedure is considered finished once both the action type and the parameters have been selected for the pseudo command. Detailed classification strategies are given in the next subsection.
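To make the intended mapping concrete, the sketch below assembles a pseudo command from an annotated entry in the corpus format shown in Appendix A. It is a minimal illustration under the assumptions of this section: the helper name, the simple comment-line parsing, and the output string format are our own, not part of JAMR or HuRIC, and the action-to-function table simply hard-codes the list given above.

# Minimal sketch: turn an annotated AMR entry (Appendix A format) into a
# pseudo-robotic command. Helper names and parsing are illustrative only.
ACTION_TO_FUNCTION = {
    "Define_Action": "set_info",
    "Relocate_Action": "set_location",
    "Search_Action": "get_location",
    "New_State_Action": "set_state",
    "Check_Action": "get_state",
    "Follow_Action": "self_follow",
    "Grab_Action": "self_grab",
}

def ground(entry: str) -> str:
    """Read the ::action and ::parameters comment lines and emit a command."""
    action, parameters = None, []
    for line in entry.splitlines():
        if line.startswith("# ::action"):
            action = line.split()[-1]
        elif line.startswith("# ::parameters"):
            parameters = line.split()[2:]
    return f"{ACTION_TO_FUNCTION[action]}({', '.join(parameters)})"

entry = """\
# ::id 1
# ::snt follow this guy
# ::parameters x3
# ::action Follow_Action
(x1 / follow-01
    :ARG1 (x3 / guy
        :mod (x2 / this)))"""

print(ground(entry))   # self_follow(x3)

In the experiment itself, the action type and the parameter list are not read from gold comments as in this sketch, but predicted by the AC and PC classifiers described in the next subsection.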

5.2.2 Classification

The classification task is split into two steps: Action Classification (AC) and Parameter Classification (PC). In the AC step, an SVM classifier built with the Python module scikit-learn (Pedregosa et al., 2012) is used to identify the action type. In the PC step, the parameters are identified using a Conditional Random Field (CRF) classifier. The data used in this task comprises 266 HuRIC sentences with their annotated AMR graphs, action tags, and parameters. The train/dev./test sets are split into 206/30/30 sentences, respectively. The features designed for both classifiers are listed below, followed by a sketch of how such a pipeline can be put together.

Features used in Action Classifier:


First_word: the first word in the sentence

First_pos: the first POS tag in the sentence

First_is_verb: True if the first POS is VB; otherwise False

Arg_count: number of direct arguments in the AMR graph

(arguments that directly connect to the top node)

1st_arg_word: the top node in the first argument (NONE if no first argument)

2nd_arg_word: the top node in the second argument (NONE if no second argument)

3rd_arg_word: the top node in the third argument (NONE if no third argument)

Features used in Parameter Classifier:

Word: the token itself

Pos: the POS tag of the token

Action: the action tag that belongs to its sentence

Is_arg: True if the token is an argument’s top node

-1_word: the token before the current token

-1_pos: the pos tag of the token before the current token

BOS: True if the token is the beginning of the sentence

+1_word: the token after the current token

+1_pos: the pos tag of the token after the current token

EOS: True if the token is at the end of the sentence
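The following sketch shows how such a two-step pipeline can be assembled, using scikit-learn for the SVM and, as an assumption on our part, sklearn-crfsuite for the CRF (a separate package; the exact CRF implementation used in the thesis is not reproduced here). The feature values and the PARAM/O tag scheme are toy examples following the feature lists above, not the actual training data.

import sklearn_crfsuite                          # assumption: CRF via sklearn-crfsuite
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

# --- Action Classification (AC): one label per sentence ---------------------
# Toy feature dicts following the AC feature list above (values are examples).
ac_features = [
    {"First_word": "follow", "First_pos": "VB", "First_is_verb": True,
     "Arg_count": 1, "1st_arg_word": "guy", "2nd_arg_word": "NONE",
     "3rd_arg_word": "NONE"},
    {"First_word": "there", "First_pos": "EX", "First_is_verb": False,
     "Arg_count": 2, "1st_arg_word": "radio", "2nd_arg_word": "next-to",
     "3rd_arg_word": "NONE"},
]
ac_labels = ["Follow_Action", "Define_Action"]

vectorizer = DictVectorizer()
ac_clf = SVC(kernel="linear")
ac_clf.fit(vectorizer.fit_transform(ac_features), ac_labels)

# --- Parameter Classification (PC): one label per token ---------------------
# Each sentence is a list of per-token feature dicts (PC feature list above).
pc_features = [[
    {"Word": "follow", "Pos": "VB", "Action": "Follow_Action", "Is_arg": False,
     "BOS": True, "EOS": False, "+1_word": "this", "+1_pos": "DT"},
    {"Word": "this", "Pos": "DT", "Action": "Follow_Action", "Is_arg": False,
     "-1_word": "follow", "-1_pos": "VB", "BOS": False, "EOS": False,
     "+1_word": "guy", "+1_pos": "NN"},
    {"Word": "guy", "Pos": "NN", "Action": "Follow_Action", "Is_arg": True,
     "-1_word": "this", "-1_pos": "DT", "BOS": False, "EOS": True},
]]
pc_labels = [["O", "O", "PARAM"]]    # hypothetical tag scheme marking parameter tokens

pc_clf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
pc_clf.fit(pc_features, pc_labels)

print(ac_clf.predict(vectorizer.transform([ac_features[0]])))
print(pc_clf.predict(pc_features))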

Table 5.3 shows the performance of both classifiers. In the PC step, the tag "(I)" marks the setting where only the concepts selected as parameters are evaluated, while the tag "(All)" marks the setting where all concepts are taken into account. "AC+PC" stands for the complete grounding process: the "action_tag" feature in PC is generated directly by AC rather than extracted from the gold annotation.

TABLE 5.3: Performance of the AC and PC classifiers

            Precision  Recall  F1
AC only     0.967      0.929   0.948
PC (All)    0.947      0.947   0.947
PC (I)      0.943      0.825   0.880
AC+PC (I)   0.886      0.775   0.827

It is expected that the AC step achieves such high accuracy, because only seven action tags are taken into account in this classification task. Besides, there exist several strong connections between some actions and main verbs (e.g. "Search_Action"/"find", "Check_Action"/"check"), which may be a "shortcut" for the classifier in making correct choices. In addition, because the concepts outside the parameter list usually outnumber the concepts being tagged, it is reasonable that PC over all concepts obtains a higher F1. The importance of the action-tag feature in the PC step is obvious: with gold action tags, the classifier is able to generate 88% of the parameters correctly, whereas the F1 drops to 82% when only the action tags predicted by the AC step are provided.

5.2.3 Experiment 2 Summary

Unlike natural languages, AMR aims to minimize the influence of syntax and promotes a logic-based framework. This syntax-free grammar makes the process of concept extraction more accurate and provides a solid foundation for grounding. Also, the AMR relation bank, which covers a wide range of cases that occur in SLU applications, further supports the grounding procedure. AMR is not designed for any specific domain or dataset; at the same time, however, it is compatible with most semantic parsing tasks. This universality determines its robustness and flexibility in dealing with new domains that contain little prior knowledge. Due to the size constraints of the corpus, the classification accuracy for AMR grounding is not optimal in this task. However, with more annotated training data and further feature engineering, better classifier performance should be achievable in the future.


6 Conclusion

Through the experiments in chapter 5, we have demonstrated the potential of utilizing the AMR language for SLU-oriented semantic parsing tasks. By annotating a corpus populated with spoken language transcription, we have created an SLU dataset that contains features not often seen in other AMR datasets, such as grammatical errors and the imperative mood. To examine the compatibility of the newly annotated dataset, we test it on the semantic parser JAMR using models trained on previous AMR releases. The outcome shows that the SLU annotation can be parsed by the same process as other AMR datasets, and it even achieves a somewhat higher F-measure as a result of its limited vocabulary and sentence length. To investigate whether the performance of semantic parsing in SLU can be further enhanced by reinforcing the parser with SLU-specific features, we then retrain a model using the created annotation and rerun the parser on the same test set. The high score of 82% F1 proves the advantage of the retraining strategy and provides a direction for future semantic parsing tasks in SLU.

Last but not least, because the parsed semantic structure typically needs to be converted into some intrinsic computer-language representation for the system to be functional in an SLU task, the robustness of this structure in assisting the grounding procedure is also tested in our work. The result indicates that while AMR offers great simplicity at the level of syntax and provides a clear and readable logical representation of concepts and relations, the grounding between AMR and computer-interpretable commands is still not trivial, especially when very little training data is provided.

In this work, we offer a possible direction for future SLU-oriented projects of using AMR as the semantic parsing tool, and we provide an annotated corpus as a reference and an initial step for such tasks. As all data-driven systems rely on the quantity of data, it is believed that by further expanding the existing annotated corpus, a further advance in the performance of such systems will be realized.


Bibliography

Baker, Collin F., Charles J. Fillmore, and John B. Lowe (1998). "The Berkeley FrameNet Project". In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics.

Banarescu, L., C. Bonial, and S. Cai (2012). "Abstract Meaning Representation (AMR) 1.0 specification".

Banarescu, Laura et al. (2013). "Abstract Meaning Representation for sembanking". In: Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pp. 178–186.

Barzdins, Guntis and Didzis Gosko (2016). "RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy". In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1143–1147.

Bastianelli, Emanuele et al. (2014). "HuRIC: a Human Robot Interaction Corpus". In: Proceedings of LREC, pp. 4519–4526.

Bastianelli, Emanuele et al. (2016). "A discriminative approach to grounded spoken language understanding in interactive robotics". In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 2747–2753.

Bobrow, Daniel G. (1964). Natural Language Input for a Computer Problem Solving System.

Bos, Johan and Tetsushi Oka (2007). "A spoken language interface with a mobile robot". In: Artificial Life and Robotics 11.1, pp. 42–47.

Cai, Shu and Kevin Knight (2013). "Smatch: an Evaluation Metric for Semantic Feature Structures". In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL).

Chen, David L. and Raymond J. Mooney (2011). "Learning to Interpret Natural Language Navigation Instructions from Observations". In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 859–865.

Davidson, Donald (1969). "The Individuation of Events". In: Essays on Actions and Events, pp. 163–180.

De Marneffe, Marie-Catherine, Bill MacCartney, and Christopher D. Manning (2006). "Generating typed dependency parses from phrase structure parses". In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006).

Fillmore, Charles J. (1976). "Frame semantics and the nature of language". In: Annals of the New York Academy of Sciences.

Flanigan, Jeffrey et al. (2014). "A Discriminative Graph-Based Parser for the Abstract Meaning Representation". In: Proceedings of ACL.

Flanigan, Jeffrey et al. (2016). "CMU at SemEval-2016 Task 8: Graph-based AMR Parsing with Infinite Ramp Loss". In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1202–1206.

Foland, William R. and James H. Martin (2016). "CU-NLP at SemEval-2016 Task 8: AMR Parsing using LSTM-based Recurrent Neural Networks". In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1197–1201.

Harnad, Stevan (1990). "The Symbol Grounding Problem". In: Physica D.

Kingsbury, Paul and Martha Palmer (2002). "From TreeBank to PropBank". In: Proceedings of the International Conference on Language Resources and Evaluation (LREC).

Klein, Dan and Christopher D. Manning (2003). "Accurate unlexicalized parsing". In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL '03).

Knight, Kevin et al. (2014). "Abstract Meaning Representation (AMR) Annotation Release 1.0 LDC2014T12". Philadelphia: Linguistic Data Consortium.

Kollar, Thomas et al. (2010). "Toward understanding natural language directions". In: Proceedings of the 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI '10).

Langkilde, Irene and Kevin Knight (1998). "Generation that exploits corpus-based statistical knowledge". In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics, p. 704.

Matthiessen, Christian M. I. M. and John A. Bateman (1991). "Text Generation and Systemic-Functional Linguistics: Experiences from English and Japanese". In: Computational Linguistics, pp. 201–203.

Mitra, Arindam and Chitta Baral (2016). "Addressing a Question Answering Challenge by Combining Statistical Methods with Inductive Rule Learning and Reasoning". In: Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI 2016), pp. 2779–2785.

Moschitti, Alessandro, Daniele Pighin, and Roberto Basili (2008). "Tree kernels for semantic role labeling". In: Computational Linguistics.

Palmer, Martha, Paul Kingsbury, and Daniel Gildea (2005). "The Proposition Bank: An annotated corpus of semantic roles". In: Computational Linguistics.

Pan, Xiaoman et al. (2015). "Unsupervised Entity Linking with Abstract Meaning Representation". In: Proceedings of NAACL 2015, pp. 1130–1139.

Pedregosa, Fabian et al. (2012). "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research. arXiv: 1201.0490.

Peng, Xiaochang et al. (2017). "Addressing the Data Sparsity Issue in Neural AMR Parsing". arXiv: 1702.05053.

Perera, Vittorio et al. (2018). "Multi-Task Learning for Parsing the Alexa Meaning Representation Language". In: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pp. 5390–5397.

Ratinov, Lev and Dan Roth (2009). "Design challenges and misconceptions in named entity recognition". In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL '09).

Sawai, Yuichiro (2015). "Semantic Structure Analysis of Noun Phrases using Abstract Meaning Representation". In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 851–856.

Tur, Gokhan and Renato De Mori (2011). Spoken Language Understanding: Systems for Extracting Semantic Information from Speech.

Wang, Chuan et al. (2016). "CAMR at SemEval-2016 Task 8: An Extended Transition-based AMR Parser". In: Proceedings of SemEval, pp. 1064–1069.

Ward, Wayne and Sunil Issar (1994). "Recent improvements in the CMU spoken language understanding system". In: Proceedings of the Workshop on Human Language Technology, pp. 213–216.

Weizenbaum, Joseph (1966). "ELIZA — A Computer Program For the Study of Natural Language Communication Between Man And Machine". In: Communications of the ACM 9.1, pp. 36–45.

Werling, Keenon, Gabor Angeli, and Christopher Manning (2015). "Robust Subgraph Generation Improves Abstract Meaning Representation Parsing". arXiv: 1506.03139.

Zhou, Junsheng et al. (2016). "AMR Parsing with an Incremental Joint Model". In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 680–689.


A Annotated Sentences (Partial)

# ::id 1

# ::snt follow this guy

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x3 / guy

:mod (x2 / this)))

# ::id 2

# ::snt this is a table with a glass deck

# ::parameters x4 x8

# ::action Define_Action

(x4 / table

:domain (x1 / this)

:ARG2-of (x8 / deck

:mod (x7 / glass)))

# ::id 3

# ::snt search for the coffee cups

# ::parameters x5

# ::action Search_Action

(x1 / search-01

:ARG2 (x5 / cup

:mod (x4 / coffee)))

# ::id 4

# ::snt carry the book to my nightstand

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / carry-01

:ARG1 (x3 / book)

:destination (x6 / nightstand

:poss (x5 / i)))

# ::id 5

# ::snt go to the kitchen

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01


:ARG4 (x4 / kitchen))

# ::id 6

# ::snt carry the mug to the bathroom

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / carry-01

:ARG1 (x3 / mug)

:destination (x6 / bathroom))

# ::id 7

# ::snt find the lamp

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / lamp))

# ::id 8

# ::snt bring the mobile phone to the livingroom

# ::parameters x4 x7

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x4 / phone

:mod (x3 / mobile-02))

:ARG2 (x7 / livingroom))

# ::id 9

# ::snt find the refrigerator

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / refrigerator))

# ::id 10

# ::snt follow the person with the blonde hair and the black pants fast

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG2 (x3 / person

:ARG1-of (x8 / and

:op1 (x7 / hair

:ARG1-of (x6 / blonde))


:op2 (x11 / pants

:ARG1-of (x10 / black-05))))

:manner (x12 / fast))

# ::id 11

# ::snt move near the right lamp

# ::parameters x0 x2

# ::action Relocate_Action

(x1 / move-01

:destination (x2 / near-01

:op1 (x5 / lamp

:mod (x4 / right))))

# ::id 12

# ::snt bring the mug to the couch in the living room

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / mug)

:ARG2 (x6 / couch

:location (x10 / room

:mod (x9 / live-01))))

# ::id 13

# ::snt there are two sinks in the kitchen

# ::parameters x4 x7

# ::action Define_Action

(x4 / sink-01

:quant 2

:location (x7 / kitchen))

# ::id 14

# ::snt bring the mug to the kitchen

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / mug)

:ARG2 (x6 / kitchen))

# ::id 15

# ::snt follow the person behind you


# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x3 / person

:location (x4 / behind

:op1 (x5 / you))))

# ::id 16

# ::snt drive to the fridge

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / drive-01

:destination (x4 / fridge))

# ::id 17

# ::snt search for the lamp

# ::parameters x4

# ::action Search_Action

(x1 / search-01

:ARG2 (x4 / lamp))

# ::id 18

# ::snt follow me carefully

# ::parameters x0 x2

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x2 / i)

:manner (x3 / careful))

# ::id 19

# ::snt bring mug to bedroom

# ::parameters x2 x4

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x2 / mug)

:ARG2 (x4 / bedroom))

# ::id 20

# ::snt search for towel

# ::parameters x3

# ::action Search_Action

(x1 / search-01

:ARG2 (x3 / towel))

# ::id 21

# ::snt the fridge is on your right side

# ::parameters x2 x7

# ::action Define_Action

(x2 / fridge

:location (x7 / side

:mod (x6 / right)

:op1 (x5 / you)))

# ::id 22

# ::snt drive to kitchen

# ::parameters x0 x3

# ::action Relocate_Action

(x1 / drive-01

:destination (x3 / kitchen))

# ::id 23

# ::snt follow that person

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x3 / person

:mod (x2 / that)))

# ::id 24

# ::snt put the mug in the living room

# ::parameters x3 x10

# ::action Relocate_Action

(x1 / put-01

:ARG1 (x3 / mug)

:ARG2 (x10 / room

:mod (x9 / live-01)))

# ::id 25

# ::snt find the wine in the dining room

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / wine)

:location (x7 / room

:mod (x6 / dine-01)))

# ::id 26

# ::snt go to the bathroom

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / bathroom))

# ::id 27

# ::snt there are a lot of couches in the living room

# ::parameters x6 x10

# ::action Define_Action

(x6 / couch

:quant (x4 / lot)

:location (x10 / room

:mod (x9 / live-01)))

# ::id 28

# ::snt go to the living room and find a drink

(x6 / and

:op1 (x1 / go-01

:ARG4 (x5 / room

:mod (x4 / live-01)))

:op2 (x7 / find-01

:ARG1 (x9 / drink)))

# ::id 29

# ::snt there is some bread on the desk

# ::parameters x4 x7

# ::action Define_Action

(x4 / bread

:quant (x3 / some)

:location (x7 / desk))

# ::id 30

# ::snt carry the mug to the dining room

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / carry-01

:ARG1 (x3 / mug)

:destination (x6 / room

:mod (x4 / dine-01)))

# ::id 31

# ::snt go to the living room

# ::parameters x0 x5

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x5 / room

:mod (x4 / live-01)))

# ::id 32

# ::snt you are in the bedroom and the bed is between two lamps

(x6 / and

:op1 (x1 / you

:location (x5 / bedroom))

:op2 (x8 / bed

:location (x10 / between

:op1 (x12 / lamp

:quant 2))))

# ::id 33

# ::snt bring my phone to the bathroom

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / phone)

:ARG2 (x6 / bathroom))

# ::id 34

# ::snt follow the guy with the blue jacket

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x3 / guy)

:ARG1-of (x7 / jacket

:ARG1-of (x6 / blue)))

# ::id 35

# ::snt bring me the coke from the fridge

# ::parameters x4 x2

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x4 / coke)

:ARG2 (x2 / i)

:ARG4 (x7 / fridge))

# ::id 36

# ::snt this is a bed room

# ::parameters x5 x0

# ::action Define_Action

(x5 / room

:domain (x1 / this)

:mod (x4 / bed))

# ::id 37

# ::snt follow the person in front of you

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG2 (x3 / person

:location (x5 / front

:op1 (x7 / you))))

# ::id 38

# ::snt move to the living room

# ::parameters x0 x5

# ::action Relocate_Action

(x1 / move-01

:ARG2 (x5 / room

:mod (x4 / live-01)))

# ::id 39

# ::snt bring me the pillow from the bed

# ::parameters x4 x2

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x4 / pillow)

:ARG2 (x2 / i)

:ARG4 (x7 / bed))

# ::id 40

# ::snt bring the phone to the dinner table

# ::parameters x3 x7

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / phone)

:ARG2 (x7 / table

:mod (x6 / dinner)))

# ::id 41

# ::snt follow the person in front of you

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG2 (x3 / person

:location (x5 / front

:op1 (x7 / you))))

# ::id 42

# ::snt go to the sofa and search for the pillow

(x5 / and

:op1 (x1 / go-01

:ARG4 (x4 / sofa))

:op2 (x6 / search-01

:ARG2 (x9 / pillow)))

# ::id 43

# ::snt go quickly to the corner and follow the skinny person

(x6 / and

:op1 (x1 / go-01

:ARG4 (x5 / corner)

:manner (x2 / quick-02))

:op2 (x7 / follow-01

:ARG0 (x10 / person

:ARG1-of (x9 / skin))))

# ::id 44

# ::snt move to the lamp on the right side of the bed

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / move-01

:ARG2 (x4 / lamp

:location (x8 / side

:mod (x7 / right)

:ARG1-of (x11 / bed))))

# ::id 45

# ::snt place the mug on the sink nearest to the refrigerator

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / place

:ARG1 (x3 / mug)

:ARG2 (x6 / sink-01

:location (x7 / near

:op1 (x10 / refrigerator)

:degree (xap0 / most))))

# ::id 46

# ::snt there is a bed with two lamps

# ::parameters x4 x7

# ::action Define_Action

(x4 / bed

:ARG1-of (x7 / lamp

:quant 2))

# ::id 47

# ::snt go to the mirror

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / mirror))

# ::id 48

# ::snt find me a cushion

# ::parameters x4 x2

# ::action Relocate_Action

(x1 / find-01

:ARG1 (x4 / cushion)

:ARG2 (x2 / i))

# ::id 49

# ::snt go along with them

# ::parameters x4

# ::action Follow_Action

(x1 / go-01

:accompanier (x4 / them))

# ::id 50

# ::snt bring the mug to the bedroom

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / mug)

:ARG2 (x6 / bedroom))

# ::id 51

# ::snt go to the table

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / table))

# ::id 52

# ::snt the living room is very light and bright

(x7 / and

:op1 (x6 / light

:degree (x5 / very))

:op2 (x8 / bright

:degree x5)

:domain (x3 / room

:mod (x2 / live-01)))

# ::id 53

# ::snt slowly follow my father

# ::parameters xap0

# ::action Follow_Action

(x2 / follow-01

:ARG1 (xap0 / person

:ARG0-of (x4 / have-rel-role-91

:ARG2 (f / father))

:poss (x3 / i))

:manner (x1 / slow))

# ::id 54

# ::snt bring the phone to the living room

# ::parameters x3 x7

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / phone)

:ARG2 (x7 / room

:mod (x6 / live-01)))

# ::id 55

# ::snt get the blue slippers from the bathroom

# ::parameters x4 x0

# ::action Relocate_Action

(x1 / get-01

:ARG1 (x4 / slipper

:ARG1-of (x3 / blue))

:ARG2 (x7 / bathroom))

# ::id 56

# ::snt bring the book near the lamp

# ::parameters x3 x0

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x3 / book)

:location (x4 / near

:op1 (x6 / lamp)))

# ::id 57

# ::snt the bath-tub is on the left side

# ::parameters x2 x7

# ::action Define_Action

(x2 / bath-tub

:location (x7 / side

:mod (x6 / left)))

# ::id 58

# ::snt find the flowers

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / flower))

# ::id 59

# ::snt follow me slowly

# ::parameters x2

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x2 / i)

:manner (x3 / slow))

# ::id 60

# ::snt go to the door

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / door))

# ::id 61

# ::snt follow me

# ::parameters x2

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x2 / i))

# ::id 62

# ::snt bring me my telephone from the couch

# ::parameters x4 x2

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x4 / telephone-01

:poss (x3 / i))

:ARG2 (x2 / i)

:ARG4 (x7 / couch))

# ::id 63

# ::snt this is a very wide and bright livingroom

# ::parameters x8 x6

# ::action Define_Action

(x8 / livingroom

:ARG1 (x6 / and

:op1 (x5 / wide-02

:degree (x4 / very))

:op2 (x7 / bright-02

:degree x4))

:domain (x1 / this))

# ::id 64

# ::snt bring me the towel from the bathroom

# ::parameters x4 x2

# ::action Relocate_Action

(x1 / bring-01

:ARG1 (x4 / towel)

:ARG2 (x2 / i)

:ARG4 (x7 / bathroom))

# ::id 65

# ::snt follow me

# ::parameters x2

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x2 / i))

# ::id 66

# ::snt find a glass in the dining room

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / glass

:location (x7 / room

:mod (x6 / dine-01))))

# ::id 67

# ::snt move to the fridge

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / move-01

:ARG2 (x4 / fridge))

# ::id 68

# ::snt go to the dining table

# ::parameters x0 x5

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x5 / table

:mod (x4 / dine-01)))

# ::id 69

# ::snt go get a book

(xap0 / and

:op1 (x1 / go-01)

:op2 (x2 / get-01

:ARG1 (x4 / book)))

# ::id 70

# ::snt this is a bathroom with a shower bath and double sink

# ::parameters x4 x9

# ::action Define_Action

(x4 / bathroom

:domain (x1 / this)

:ARG1-of (x9 / and

:op1 (x8 / bath

:mod (x7 / shower-01))

:op2 (x11 / sink-01

:quant 2)))

# ::id 71

# ::snt take the mug to the coffee table in the living room

# ::parameters x3 x7

# ::action Relocate_Action

(x1 / take-01

:ARG1 (x3 / mug)

:ARG3 (x7 / table

:mod (x6 / coffee)

:location (x11 / room

:mod (x10 / live-01))))

# ::id 72

# ::snt go to the bathroom

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / bathroom))

# ::id 73

# ::snt go follow my sister around the house

(xap0 / and

:op1 (x1 / go-01)

:op2 (x2 / follow-01

:location (x5 / around

:op1 (x7 / house))

:ARG1 (xap1 / person

:ARG0-of (x4 / have-rel-role-91

:ARG2 (s / sister))

:poss (x3 / i))))

# ::id 74

# ::snt grab my wine glass from the dining room

# ::parameters x4 x0

# ::action Relocate_Action

(x1 / grab-01

:ARG1 (x4 / glass

:mod (x3 / wine)

:poss (x2 / i))

:location (x8 / room

:mod (x7 / dine-01)))

# ::id 75

# ::snt go to the kitchen and bring me some bread from the pantry

(x5 / and

:op1 (x1 / go-01

:ARG4 (x4 / kitchen))

:op2 (x6 / bring-01

:ARG1 (x9 / bread

:mod (x8 / some)

:ARG2 (x2 / i)

:ARG4 (x12 / pantry))))

# ::id 76

# ::snt go to the fridge inside the kitchen

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / fridge)

:location (x5 / inside

:op1 (x7 / kitchen)))

# ::id 77

# ::snt follow that guy over there

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x3 / guy

:mod (x2 / that))

:location (x4 / there))

# ::id 78

# ::snt take my phone and place it on the bench in the kitchen

(x4 / and

:op1 (x1 / take-01

:ARG1 (x3 / phone

:poss (x2 / i)))

:op2 (x5 / place

:ARG1 (x3

:location (x9 / bench

:location (x12 / kitchen)))))

# ::id 79

# ::snt the sink is in the kitchen

# ::parameters x2 x6

# ::action Define_Action

(x2 / sink-01

:location (x6 / kitchen))

# ::id 80

# ::snt get my coat from the closet

# ::parameters x3 x0

# ::action Relocate_Action

(x1 / get-01

:ARG1 (x3 / coat

:poss (x5 / i))

:ARG2 (x6 / closet))

# ::id 81

# ::snt come with me

# ::parameters x2

# ::action Follow_Action

(x1 / come-01

:prep-with (x2 / i))

# ::id 82

# ::snt take the mug to the bedroom

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / take-01

:ARG1 (x3 / mug)

:ARG2 (x6 / bedroom))

# ::id 83

# ::snt find a plate

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / plate))

# ::id 84

# ::snt carry this mug to the bedstand

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / carry-01

:ARG1 (x3 / mug

:mod (x2 / this))

:destination (x6 / bedstand))

# ::id 85

# ::snt go to the bedroom

# ::parameters x0 x4

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x4 / bedroom))

# ::id 86

# ::snt follow the person in front of me

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG2 (x3 / person

:location (x5 / front

:op1 (x7 / i))))

# ::id 87

# ::snt this is a bathroom where the door is on the right

# ::parameters x4 x7

# ::action Define_Action

(x4 / bathroom

:domain (x1 / this)

:ARG1-of (x7 / door

:location (x11 / right-05)))

# ::id 88

# ::snt take my cellphone to the bedroom

# ::parameters x3 x6

# ::action Relocate_Action

(x1 / take-01

:ARG1 (x3 / cellphone

:poss (x2 / i))

:ARG3 (x6 / bedroom))

# ::id 89

# ::snt find the refrigerator

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / refrigerator))

# ::id 90

# ::snt this is a living room with white furniture

# ::parameters x5 x8

# ::action Define_Action

(x5 / room

:domain (x1 / this)

:mod (x4 / live-01)

:ARG1-of (x8 / furniture

:ARG1-of (x7 / white)))

# ::id 91

# ::snt put the cell phone on the dining room table

# ::parameters x4 x9

# ::action Relocate_Action

(x1 / put-01

:ARG1 (x4 / phone

:mod (x3 / cell))

:ARG2 (x9 / table

:mod (x8 / room

:mod (x7 / dine-01))))

# ::id 92

# ::snt go to the dining room

# ::parameters x0 x5

# ::action Relocate_Action

(x1 / go-01

:ARG4 (x5 / room

:mod (x4 / dine-01)))

# ::id 93

# ::snt find the lamp in the living room

# ::parameters x3

# ::action Search_Action

(x1 / find-01

:ARG1 (x3 / lamp)

:location (x7 / room

:mod (x6 / live-01)))

# ::id 94

# ::snt follow the man closely

# ::parameters x3

# ::action Follow_Action

(x1 / follow-01

:ARG1 (x3 / man)

:manner (x4 / close))

# ::id 95

# ::snt fetch me the tissue box

# ::parameters x5 x2

# ::action Relocate_Action

(x1 / fetch

:ARG1 (x5 / box

:mod (x4 / tissue))

:ARG2 (x2 / i))

# ::id 96

# ::snt find me the butcher knife

# ::parameters x5 x2

# ::action Search_Action

(x1 / find-01

:ARG1 (x5 / knife

:mod (x4 / butcher))

:ARG2 (x2 / i))

# ::id 97

# ::snt go over to the sofa

# ::parameters x0 x5

# ::action Relocate_Action

(x1 / go-01

:direction (x2 / over)

:ARG4 (x5 / sofa))

# ::id 98

# ::snt place the mug to the head of the table

# ::parameters x3 x9

# ::action Relocate_Action

(x1 / place

:ARG1 (x3 / mug)

:ARG2 (x9 / table

:ARG1-of (x6 / head)))

# ::id 99

# ::snt follow my mother to the garden

# ::parameters xap0

# ::action Follow_Action

(x1 / follow-01

:destination (x6 / garden)

:ARG2 (xap0 / person

:ARG0-of (x3 / have-rel-role-91

:ARG2 (m / mother))

:poss (x2 / i)))

# ::id 100

# ::snt follow my friend into the living room

# ::parameters xap0

# ::action Follow_Action

(x1 / follow-01

:ARG2 (xap0 / have-rel-role-91

:ARG2 (f / friend

:poss (x2 / i)))

:destination (x7 / room

:mod (x6 / live-01)))
