Formal Issues in Natural Language Generation

Kees van Deemter
Matthew Stone

Lecture 1: Overview + Reiter & Dale (1997)
---
We are Kees van Deemter…

• Principal Research Fellow, ITRI, University of Brighton, UK
• PhD, University of Amsterdam, 1991
• Research interests:
  - Formal semantics of natural language (underspecification, anaphora, prosody)
  - Generation of text, documents, speech
---
…and Matthew Stone

• Assistant Professor, Comp. Sci. & Cog. Sci., Rutgers University, USA
• PhD, University of Pennsylvania, 1998
• Research interests:
  - Knowledge representation (action, knowledge, planning, inference)
  - Task-oriented spoken language
---
Our Question This Week
What are the possible ways of using knowledge (of the world and of language) in formulating an utterance?
---
Knowledge in utterances

Knowledge of the world:
Utterance says something useful and reliable.

Knowledge of language:
Utterance is natural and concise; in other words, it fits hearer and context.
---
A Concrete Example

Our partner is working with equipment that looks like this:

[diagram: handle with OPEN and LOCKED positions]

The instruction that we'd like to give them is:

Turn handle to locked position.
---
Knowledge in this utterance

Knowledge of the world:
Utterance says something useful and reliable.

[diagram: handle turning from OPEN to LOCKED]

This is what has to happen next.
---
Knowledge in this utterance

Knowledge of language:
Utterance is natural and concise. Consider the alternatives…

[diagram: handle with OPEN and LOCKED positions]

Move the thing around.
---
Knowledge in this utterance

Knowledge of language:
Utterance is natural and concise. Consider the alternatives…

[diagram: handle with OPEN and LOCKED positions]

You ought to readjust the fuel-line access panel handle by pulling clockwise 48 degrees until the latch catches.
---
Our Question This Week
What are the possible ways of using knowledge (of the world and of language) in formulating an utterance?
This is a formal question; the answers will depend on the logics behind grammatical information and real-world inference.
---
The NLG problem depends on the input to the system

If the input looked like this:

INPUT: turn'(handle', locked')

Deriving the output would be easy:

OUTPUT: Turn handle to locked position.
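If the input really were a linguistically shaped logical form like this, generation could be little more than a lookup. Here is a minimal sketch; the template and lexicon tables are invented for illustration:

```python
# Hypothetical sketch: with a logical-form input, each predicate maps
# directly onto a surface template, and each argument symbol onto a word.

TEMPLATES = {
    # predicate -> template with slots for its arguments
    "turn": "Turn {0} to {1} position.",
}

LEXICON = {
    # atomic input symbols -> surface words
    "handle'": "handle",
    "locked'": "locked",
}

def realize(predicate, *args):
    """Fill the predicate's template with the lexicalized arguments."""
    words = [LEXICON[a] for a in args]
    return TEMPLATES[predicate].format(*words)

print(realize("turn", "handle'", "locked'"))
# -> Turn handle to locked position.
```

The point of the next slide is that real inputs are nothing like this.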
---
Real conceptual input is richer and organized differently

[diagram: handle with OPEN and LOCKED positions]

Must support correct, useful domain reasoning:
e.g., characterizing the evident function of equipment
e.g., simulating/animating the intended action
---
Difference in Content
Input: New info complete, separate from old
Output: New info cut down, mixed with old
Turn handle to locked position.
---
Difference in Organization

Input: deictic representation for objects, through atomic symbols that index a flat database

handle(object388).
number(object388, "16A46164-1").
composedOf(object388, steel).
color(object388, black).
goal(object388, activity116).
partOf(object388, object486).

Output: descriptions for objects, through complex, hierarchical structures

NP – DET – the – N' – N – handle
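To make the contrast concrete, here is a sketch (all names hypothetical) of the two representations side by side: a flat list of facts about an object symbol, versus a tree whose surface string is read off its leaves:

```python
# Input side: a flat database of atomic facts indexed by object symbols.
facts = [
    ("handle", "object388"),
    ("number", "object388", "16A46164-1"),
    ("composedOf", "object388", "steel"),
    ("color", "object388", "black"),
    ("partOf", "object388", "object486"),
]

# Output side: a hierarchical syntactic structure for a description.
# NP -> DET "the", N' -> N "handle"
np = ("NP", ("DET", "the"), ("N'", ("N", "handle")))

def leaves(tree):
    """Read the surface string off the fringe of the tree."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:     # tree[0] is the category label
        words.extend(leaves(child))
    return words

print(" ".join(leaves(np)))    # the handle
```

Nothing in the flat facts dictates the shape of the tree; bridging the two is exactly the formal problem of the next slide.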
---
Formal problem
NLG means applying input domain knowledge that looks quite different from output language!
---
Formal problem

How can we characterize these different sources of information in a common framework, as part of a coherent model of language use?

For example: how can we represent linguistic distinctions that make choices in NLG good or bad?
---
Our Question This Week
What are the possible ways of using knowledge (of the world and of language) in formulating an utterance?
This is not just a mathematical question –
This is a computational question; possible ways of using knowledge will be algorithms.
---
No Simple Strategy to Resolve Differences

Lots of variability in natural instructions:

Lift assembly at hinge.
Disconnect cable from receptacle.
Rotate assembly downward.
Slide sleeve onto tube.
Push in on poppet.
---
Strategy Must Decide When to Say More

Using utterance interpretation as a whole:

Turn handle by hand-grip from current open position for handle 48 degrees clockwise to locked position for handle.

[diagram: handle with OPEN and LOCKED positions]
---
In particular, hearer matches shared initial state

So describe objects and places succinctly:

Turn handle by hand-grip from current open position for handle 48 degrees clockwise to locked position for handle.

[diagram: handle with OPEN and LOCKED positions]
---
In particular, hearer applies knowledge of the domain

So omit inevitable features of action:

Turn handle by hand-grip from current open position for handle 48 degrees clockwise to locked position for handle.

[diagram: handle with OPEN and LOCKED positions]
---
Computational problem

Because this process is so complex, it takes special effort to specify the process, to make it effective, and to ensure that it works appropriately.

For example: How much search is necessary? How can we control search when search is required? What will such a system be able to say?
---
OK, so generation is hard

Is this worth doing? Why does it matter?

NLG systems also have practical difficulties:
• Actually getting domain information.
• Successful software engineering.
• Knowing when you've completed them.

These are not to be underestimated: they've been the focus of most NLG research.
---
Why does this matter?
Formally precise work in NLG is starting, with motivations in computer science, cognitive science and linguistics.
Particularly challenging because generation has received much less attention than understanding.
---
Computer Science Motivations

Dialogue systems are coming.

Spoken language interfaces are standard for simple phone applications; they will soon be the norm for customer service, information, etc.

Generation is an important bottleneck. After speech recognizer performance, the biggest impediment to user satisfaction is the lack of concise, natural, context-dependent responses. (More Wednesday.)
---
Computer Science Motivations

• On-line help can be enhanced when help messages are generated. (Assumption: there are too many to be stored separately.)
• Multilingual document authoring can be an efficient alternative to MT. (Different texts can be generated from the same input.)

Slightly less obvious:

• Sophisticated text summarization requires NLG.
• Some approaches to MT require NLG.
---
Computer Science Motivations

Better formal understanding of NLG promises:
• More flexible architectures
• More robust and natural behavior
• Less labor-intensive programming methods

In short, NLG systems that work better and are easier to build.
---
Motivations in Cognitive Science

Want to know how language structure supports language use.
– Different representations make different processing problems.
---
Motivations in Cognitive Science

Different representations make different processing problems.

E.g., incremental interpretation: how do you understand an incomplete sentence?

John cooked…
---
Motivations in Cognitive Science

E.g., incremental interpretation: how do you understand an incomplete sentence?

John cooked…

CFG: "compilation", metalevel reasoning that abstracts the meaning common to many derivations.
CCG: just use the grammar itself; John cooked parses as S/NP.
---
Motivations in Cognitive Science

Want to know how language structure supports language use.
– Different representations make different processing problems.

What do we learn when we think of speaking as well as understanding?
---
Linguistic Motivations
Generation brings an exacting standard for theoretical precision.
Grice says “be brief.” For NLG we must say precisely how. (More Tuesday.)
More generally, is a model of the meaning of forms enough to explain when people use them?
If not, we need to say more. (Or maybe this whole meaning stuff is wrongheaded!)
---
Our Ulterior Motive

• Explain why NLG is theoretically interesting (formally, computationally, and linguistically).
• Get more people working on NLG.
---
Our strategy this week:
• Zoom in on Generation of Referring Expressions (henceforth: GRE)
• Suggest that rest of NLG is equally interesting
---
What is GRE? (very briefly)

• Given: an object to refer to (the target object).
• Given: properties of all objects.
• (Given: a context.)
• Task: identify the target object.
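As a preview, one well-known baseline for this task, roughly in the spirit of Dale & Reiter's (1995) Incremental Algorithm taken up later this week, can be sketched as follows. The domain and the attribute preference order are invented for illustration:

```python
# Hypothetical domain: object symbols with attribute-value properties.
domain = {
    "d1": {"type": "train", "line": "express", "time": "7:38"},
    "d2": {"type": "train", "line": "local",   "time": "7:38"},
    "d3": {"type": "bus",   "line": "express", "time": "8:00"},
}
preference = ["type", "line", "time"]   # try preferred attributes first

def refer(target, domain, preference):
    """Incrementally add properties that rule out at least one distractor."""
    distractors = {x for x in domain if x != target}
    description = {}
    for attr in preference:
        value = domain[target][attr]
        ruled_out = {x for x in distractors if domain[x].get(attr) != value}
        if ruled_out:                    # the property does useful work
            description[attr] = value
            distractors -= ruled_out
        if not distractors:
            return description           # distinguishing description found
    return None                          # target cannot be distinguished

print(refer("d1", domain, preference))
# -> {'type': 'train', 'line': 'express'}  ("the express train")
```

Note what the sketch leaves open: how the preference order is chosen, and how the resulting property set becomes an actual noun phrase; both are topics for the coming lectures.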
---
Our Plan

Today: Overview of generation.
Tuesday: Case study in generation: GRE.
Wednesday: Pragmatics in GRE.
Thursday: Semantics in GRE.
Friday: Syntax in GRE.
(NOTE: relatively complementary lectures.)

Of course it's backwards: it's generation!
---
Rest of Today
A discussion of how generation systems usually work
(after Reiter & Dale, 1997)
---
In practice, NLG systems work the way we can build them.

They solve a specific, carefully delineated task.
They can verbalize only specific knowledge.
They can verbalize it only in specific, often quite stereotyped ways.
---
In practice, NLG systems work the way we can build them.

That means starting with the available input and the desired output, and putting together something that maps from one to the other.

Any linguistics is a bonus.
Any formal analysis of computation is a bonus.
---
Input can come from …

• An existing database (e.g., tables); format facilitates update, etc.
• An interface that allows a user to specify it (e.g., by selecting from menus)
• Language interpretation
---
For Example

Input: rail schedule database; current train status.

User query: When is the next train to Glasgow?

Output: There are 20 trains each day from Aberdeen to Glasgow. The next train is the Caledonian Express; it leaves Aberdeen at 10am. It is due to arrive in Glasgow at 1pm, but arrival may be slightly delayed.
---
To get from input to output means selecting and organizing information
The selection and organization typically happen in a cascade of processes that use special data structures or representations.
Each makes explicit a degree of selection and organization that the system is committed to.Indirectly, each indicates the degree of selection and organization the system has still to create.
---
The NLG Pipeline

Goals
  ↓  Text Planning
Text Plans
  ↓  Sentence Planning
Sentence Plans
  ↓  Linguistic Realization
Surface Text
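The cascade above can be sketched as three composed functions. Every stage here is a stub with placeholder data structures (not a real system); the point is only the shape of the architecture, where each stage commits to more organization than the last:

```python
def text_planning(goals):
    """Goals -> text plan: choose and organize messages (stubbed)."""
    return {"relation": "ELABORATION",
            "messages": [("IDENTITY", "NEXT-TRAIN", "CALEDONIAN-EXPRESS")]}

def sentence_planning(text_plan):
    """Text plan -> sentence plans: aggregation, GRE, lexical choice (stubbed)."""
    return [("S/be", "the next train", "the Caledonian Express")]

def linguistic_realization(sentence_plans):
    """Sentence plans -> surface text via the grammar (here, a template)."""
    return " ".join(f"{subj.capitalize()} is {obj}."
                    for _, subj, obj in sentence_plans)

def generate(goals):
    return linguistic_realization(sentence_planning(text_planning(goals)))

print(generate({"answer-query": "next-train-to-Glasgow"}))
# -> The next train is the Caledonian Express.
```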
---
Overview of Processes and Representations, 1

Goals → Messages → Text Plans

Text Planning comprises:
  Content Planning (produces Messages)
  Discourse Planning (produces Text Plans)
---
Message
A message represents a piece of information that the text should convey, in domain terms.
---
Example Messages

message-id: msg01
relation:   IDENTITY
arguments:  arg1: NEXT-TRAIN
            arg2: CALEDONIAN-EXPRESS

The next train is the Caledonian Express.
---
Example Messages

message-id: msg02
relation:   DEPARTURE
arguments:  entity:   CALEDONIAN-EXPRESS
            location: ABERDEEN
            time:     1000

The Caledonian Express leaves Aberdeen at 10am.
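As a sketch of how such a message might be verbalized, a per-relation template can map the domain-level record to a draft sentence; the lexicalization table and the time formatting below are invented for illustration:

```python
# The message from the slide, as plain data (domain terms, no linguistics).
msg02 = {"relation": "DEPARTURE",
         "entity": "CALEDONIAN-EXPRESS",
         "location": "ABERDEEN",
         "time": "1000"}

# Hypothetical lexicalization table: domain symbols -> surface names.
NAMES = {"CALEDONIAN-EXPRESS": "the Caledonian Express",
         "ABERDEEN": "Aberdeen"}

def clock(hhmm):
    """Render a 24-hour 'hhmm' string as e.g. '10am' or '1:15pm'."""
    h, m = int(hhmm[:2]), hhmm[2:]
    suffix = "am" if h < 12 else "pm"
    h12 = h % 12 or 12
    return f"{h12}{suffix}" if m == "00" else f"{h12}:{m}{suffix}"

def verbalize_departure(msg):
    entity = NAMES[msg["entity"]]
    return (entity[0].upper() + entity[1:] + " leaves "
            + NAMES[msg["location"]] + " at " + clock(msg["time"]) + ".")

print(verbalize_departure(msg02))
# -> The Caledonian Express leaves Aberdeen at 10am.
```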
---
A close variant

• Q: When is the next train to New Brunswick?
• A: It's the 7:38 Trenton express.

I know something about the domain in this case, and can highlight how nonlinguistic the domain representation will be.
---
Variant message

message-id: msg03
relation:   NEXT-SERVICE
arguments:  station-stop: STATION-144
            train:        TRAIN-3821

The next train to New Brunswick is the Trenton Local.
---
Closer to home

message-id: msg04
relation:   DEPARTURE
arguments:  origin: STATION-000
            train:  TRAIN-3821
            time:   0738

It leaves Penn Station at 7:38.
---
How I got domain knowledge

NY Penn Station really is NJT Station 000; New Brunswick really is Station 144.
(You have to key this into ticket machines!)

This really is train #3821.
(It's listed with this number on the schedule!)
---
Text Plan
A text plan represents the argument that the text should convey; it is a hierarchical structure of interrelated messages.
---
Example Text Plan

     NextTrainInformation
         ELABORATION
   [IDENTITY]   [DEPARTURE]
---
Overview of Processes and Representations, 2

Text Plans → Sentence Plans

Sentence Planning comprises:
  Lexical Choice
  Aggregation
  Referring Expression Generation
  ?
---
Sentence Plans
A sentence plan makes explicit the lexical elements and relations that have to be realized in a sentence of the output text.
---
Example Sentence Plan

(S1/be
   :subject (NEXT-SERVICE/it)
   :object  (TRAIN-3821/express
               :modifier Trenton
               :modifier 7:38
               :status definite))

It's the 7:38 Trenton express.
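A sketch of realizing a plan of this shape follows; the dictionary encoding, the treatment of `:status definite` as a determiner, and the contraction rule are simplifications invented for illustration:

```python
# The sentence plan above, re-encoded as plain data (hypothetical format).
plan = {
    "head": "be",
    "subject": {"head": "it"},
    "object": {"head": "express",
               "modifiers": ["7:38", "Trenton"],
               "status": "definite"},
}

def realize_np(np):
    """Emit determiner (if definite), then modifiers, then the head noun."""
    words = []
    if np.get("status") == "definite":
        words.append("the")
    words += np.get("modifiers", [])
    words.append(np["head"])
    return " ".join(words)

def realize_clause(plan):
    subj = realize_np(plan["subject"])
    obj = realize_np(plan["object"])
    # Crude morphology: contract "it is", matching the slide's output.
    if subj == "it" and plan["head"] == "be":
        return f"It's {obj}."
    return f"{subj} {plan['head']}s {obj}."

print(realize_clause(plan))
# -> It's the 7:38 Trenton express.
```

Note how much linguistic knowledge even this toy realizer hides in its word order and contraction decisions; the next slides unpack what has already been decided by the time the plan exists.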
---
We know what's happened

Aggregation: we have constructed a single sentence that realizes two messages.

Once we have the first message: It's the Trenton express.
We just add 7:38 to realize the second message: It's the 7:38 Trenton express.
---
We know what's happened

Referring expression generation: we have figured out how to realize the next service as it, and how to identify the train by its destination and frequency of stops.
---
We know what's happened

Lexical (and grammatical) choice: to use the verb be, with it as the subject and a reference to the train second; to say express rather than express train; to say Trenton rather than Northeast Corridor.
---
But there's no consensus method for how to do it.

• Reiter (1994, survey of 5 NLG systems): most practical systems follow a pipeline, even though this makes some things difficult to do. Example: avoidance of ambiguity.
• Cahill et al. (1999, survey of 18 NLG systems): tasks like Aggregation and GRE can happen almost anywhere in the system, e.g.,
  - as early as Content Planning
  - as late as Sentence Realization
---
But there's no consensus method for how to do it.

And we'll see that formal and computational questions raise important difficulties for:
- what representations you can have
- what processes and algorithms you can use
- how you bring knowledge of language into the loop
---
Overview of Processes and Representations, 3

Sentence Plans → Surface Text

via Linguistic Realization
---
This is easier to think about

We all know what a surface text looks like!

And we all know you have to have a grammar (of some kind or other) to get one!
---
Concluding remarks

• Our overview has followed the 'standard model' of Reiter & Dale (1997).
• Even though the paper has an applied motivation, the formal problems we described earlier really come up:
  - Need linguistic representations that correspond to the domain and enable choices.
  - Need good algorithms to put the correspondence to work.
---
Next Time

GRE in particular:
• A microcosm of NLG, requiring choices of content and form.
• A proving ground for formal questions: how to formalize knowledge of language, characterize good communication, and design effective algorithms.
---
What to do

If you've read Reiter & Dale 1997, great! This lecture probably made more sense to you.

But we won't touch general issues till Friday, so otherwise there's no hurry to do Reiter & Dale 1997.

Next reading is Dale & Reiter 1995. We'll be covering the whole paper rather closely.