LFG Winter school July 2004
PARC
F-Structures, Information Structure, and Discourse Structure
Tracy H. King, Annie Zaenen
PARC
Talk Outline
• Information Structure: Syntax of discourse functions
• Applications: Anaphora resolution
• Discourse structure
• Applications: Summarization and Sentence Condensation
• Conclusions
Information Structure: Syntax of discourse functions
• Basic discourse functions
• Typology of encoding
• LFG approaches
Basic discourse functions
• DFs encode and divide up the information structure of the sentence.
• DFs are notoriously difficult to define
  – Topic/Theme/Given
  – Focus/Rheme/New
  – Contrastiveness
• What to do with non-DF information, e.g. background information?
Example: Clefts
• It is the [box]Focus [that]Topic I opened.
• Construction encodes focus of the clefted constituent.
• The referent of that constituent is the topic of the subordinate clause.
• The ‘relative clause’ material is ‘presupposed’.
• Question-answer pairs are often used to determine DFs.
  – What did you open? It was the box that I opened.
Basic discourse functions
• Here focus on:
  – how to encode these
  – what they can be used for
• Choice of relevant DFs depends on what they are needed for.
Typology of encoding
• Structural position
  – initial
  – preverbal
• Discourse markers/particles
• Intonation
• Combinations of these
Structural encoding
• Position indicates discourse function.
• Language specific
  – Topics are initial
  – Foci are pre/post-verbal
  – Background information is postverbal
  – Constructions: clefts
• Subject as default topic
• LFG: designated c-structure position
Initial topics
• Object marker on the verb
  – Anaphoric agreement
  – The OM is the object
• Chichewa (Bresnan & Mchombo 1987)
  Alenje zi-ná-wá-lu-ma njuchi.
  hunters SM-past-OM-bite-indic bees
  `The bees bit them, the hunters.'
Preverbal focus
• Turkish (Enc 1991)
  bu kitab-i Hasan ban-a ver-dir
  this book-acc Hasan I-dat give
  `This book Hasan gave to ME.'
DF markers
• Morphemes can mark DF
  – Japanese wa
  – Hindi (Sharma 2003)
    • hI exclusive contrastive focus (only)
    • bhI inclusive contrastive focus (also)
    • tO contrastive topic
Hindi example
Exclusive focus:
  rAdha=ne=hI baccho=kO kahAnI sunAyI
  Radha=erg=Foc children=ACC story hear
  `It was (only) Radha who told the children a story.'
Contrastive topic:
  mOmbattI=tO milI, lEkin abh mAchis gum gayE
  candle=Top found but now match lost go
  `The candle was found but now the matches are lost.'
Intonation
• Most DFs have a specific intonation associated with them
• Intonation alone can signal a DF
  Did you see Mary or John? I saw JOHN.
  It was a RED hat that I wore.
Combinations
• Most positionally and marker-signaled DFs also have intonation marking.
• Can combine position and marker
  – ay inversion in Tagalog (Kroeger 1993)
    ay marker as head of I
    SpecIP is Topic=Subj or Focus=non-Subj
  – Ni lapis ay hindi nagdala si=Rosa
    even pencil AY not bring nom=Rosa
    `Even a pencil Rosa didn't bring.'
LFG approaches
• Syntax-DF interactions
• F-structure vs. I-structure
• OT-LFG
Syntax-DF interactions
• Subcategorized DFs
  – Predicates can subcategorize for DFs.
• C-structure annotations
  – C-structure nodes can be associated with DFs, similar to GF assignment in configurational languages.
Subcategorized DFs
• Malay Topic (Alsagoff 1992)
  – verb affix identifies Topic and equates it with a GF
    • meng-  (↑ TOP)=(↑ SUBJ)
    • di-    (i) (↑ TOP)=(↑ SUBJ)
             (ii) < (↑ SUBJ) (↑ OBL) >
                    log obj   log subj
    • 0-     (↑ TOP)=(↑ OBJ)
Malay example
Miriam MENG-cubit doktor itu
Miriam MENG-pinch doctor the
`Miriam pinched the doctor.'

MENG-cubit (↑ PRED)='pinch< (↑ SUBJ), (↑ OBJ)> (↑ TOP)'

PRED  'pinch< (↑ SUBJ), (↑ OBJ)> (↑ TOP)'
SUBJ  [ PRED 'Miriam' ]
TOP   [ ]
OBJ   [ PRED 'doctor' ]
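The equation (↑ TOP)=(↑ GF) says that TOP and the grammatical function are one and the same f-structure. A minimal sketch of that idea, assuming f-structures are modelled as plain Python dicts; the table `TOPIC_GF` and the function `build_fstructure` are hypothetical names for illustration, and the passive prefix di- is left out because it also changes the argument mapping:

```python
# Hypothetical table: the GF that each verbal prefix equates with TOP.
TOPIC_GF = {"meng-": "SUBJ", "0-": "OBJ"}


def build_fstructure(prefix, pred, subj, obj):
    """Build a toy f-structure in which TOP is shared with a GF."""
    fs = {"PRED": pred, "SUBJ": {"PRED": subj}, "OBJ": {"PRED": obj}}
    # (↑ TOP)=(↑ GF): structure sharing, modelled here as object identity.
    fs["TOP"] = fs[TOPIC_GF[prefix]]
    return fs


fs = build_fstructure("meng-", "pinch<SUBJ,OBJ>", "Miriam", "doctor")
print(fs["TOP"] is fs["SUBJ"])  # True
```

Using object identity rather than a copy mirrors the fact that any information added to the SUBJ is automatically information about the TOP.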
Chichewa and Tagalog topic
Chichewa (Bresnan & Mchombo 1987):
  [S [NP (↑ TOPIC)=↓] VP]
  TOPIC [ … ]
  SUBJ  [ PRED 'pro' ]
  PRED  'X<SUBJ,…>'
  anaphoric binding
Tagalog (Kroeger 1993):
  [CP [NP (↑ TOPIC)=↓] C']
Urdu preverbal focus
Urdu (Butt & King 1996):
  [VP [XP (↑ FOCUS)=↓] V']
C- to F-structure Mapping proposal
• Clause-Prominence of DFs: DF adjuncts (i.e., in adjoined positions) must be clause-prominent, occurring either at an edge of the clause or adjacent to the head of the clause. (Bresnan 2001:192)
[Tree diagram: an adjunction structure of nested XP nodes, with YP labeled DF and ZP labeled Adjunct]
Mapping proposal
• Specifiers of functional categories are the grammatical discourse functions (Topic, Focus, Subj). (Bresnan 2001:102)
  [FP [SpecFP DF] F']
Intonation
• Much work is done on this association
  – Steedman (2000) on Categorial Grammar
• Less in LFG
  – Bengali and the syntax-prosody mapping (Butt and King 1998)
  – Russian clause-final focus (King 1995)
  – Integration of prosody into the LFG projection architecture needs more exploration.
Discourse markers
• Constructive case/morphology approach (Sharma 2003)
  – hI: (FOC ↑)
  [X(P) X(P) [Cl-disc hI (FOC ↑)]]
F-structure vs. I-structure
• DFs are often represented in the f-structure.
  – Malay subcategorizes for Topics
  – Chichewa incorporated pronouns
• Scope of DFs may conflict with that of GFs.
  – project DFs into an I(nformation)-structure
DF-GF mismatches
VP focus: Mary [F ate the cake].
F-structure
PRED  'eat<SUBJ,OBJ>'
SUBJ  [ PRED 'Mary' ]
OBJ   [ PRED 'cake' ]
TNS   past

How can the focus be represented?
Form I-structure constituents.
OT-LFG approaches
• OT constraints for encoding of DFs (Choi 1999)
  – [New]-X: Place [+New] in a salient position X
  – [Prom]-X: Place [+Prom] in a salient position X
• Languages
  – rank these constraints
  – define possible instantiations of X
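How a constraint ranking picks a winner can be sketched with standard OT evaluation: candidates are compared on the highest-ranked constraint first, and lower-ranked constraints only break ties. The sketch below assumes that violations are given as simple counts; the candidate names and the violation counts are invented toy values, not data from Choi (1999).

```python
def optimal(candidates, ranking):
    """Return the candidate with the best violation profile under the ranking."""
    def profile(name):
        # Violations ordered from highest- to lowest-ranked constraint;
        # tuples compare lexicographically, which is strict domination.
        return tuple(candidates[name][constraint] for constraint in ranking)
    return min(candidates, key=profile)


# Toy candidates (hypothetical): each violates one of the two constraints once.
candidates = {
    "new_in_X": {"NEW": 0, "PROM": 1},
    "prom_in_X": {"NEW": 1, "PROM": 0},
}

print(optimal(candidates, ["NEW", "PROM"]))   # new_in_X
print(optimal(candidates, ["PROM", "NEW"]))   # prom_in_X
```

Reversing the ranking flips the winner, which is the sense in which languages differ by how they rank the same constraints.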
Summary: Syntax of DFs
• DFs can be encoded by:
  – structural position
  – morphological markers
  – intonation
• Linguistic theories need a way to capture these interactions
  – Much LFG work on structural position and morphological markers
  – Are F and T the only elements worth distinguishing?
  – Need more work on integrating generalizations about intonation
  – Need more work on how syntactic distinctions relate to semantic and pragmatic concepts
Form and function relation
• A radical proposal:
  – Prince: the relation between syntax and pragmatics is as arbitrary as that between sound and word meaning
• Cross-language variation:
  – e.g. functions of Left-Dislocation in Yiddish and English are different (Prince)
  – Functions of clefting and topicalization are different across Germanic languages
  – Functions of Left-Dislocations (or Contrastive topicalization) and Right-Dislocations in Romance languages and in Germanic are different (see e.g. Lambrecht 1981 on Spoken French).
• Not a one-to-one correspondence between form and function
Talk Outline
• Information Structure: Syntax of discourse functions
• Applications: Anaphora resolution
• Discourse structure
• Applications: Summarization and Sentence Condensation
• Conclusions
Applications for Discourse Functions
Anaphora resolution
  – DFs determine saliency
  – Saliency partially determines resolution
Anaphora Resolution
• Have a sentence with pronouns or referring NPs (the president)
• Want to know what they refer to
  – some restrictions are purely syntactic: (most) reflexives refer to Subjects
  – others are heuristic:
    prefer closer referents
    prefer high saliency referents
Role of Discourse Functions
• Topic, and topic shift, are relevant for anaphora
• Centering theory and its variants
  – have an ordered list of salient elements
  – have a referring expression
  – first salient element to match features is the antecedent
  – update the list based on this
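The four steps above can be sketched as a small procedure. This is a deliberately simplified illustration, not an implementation of centering theory: it assumes that agreement features are just gender and number, that the update step simply moves the antecedent to the front, and that the AR (a car) counts as neuter. `Entity`, `resolve` and `update` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    name: str
    gender: str  # "fem", "masc" or "neut"
    number: str  # "sg" or "pl"


def resolve(gender: str, number: str, salient: List[Entity]) -> Optional[Entity]:
    """Return the first (most salient) entity whose features match."""
    for entity in salient:
        if entity.gender == gender and entity.number == number:
            return entity
    return None


def update(salient: List[Entity], antecedent: Entity) -> List[Entity]:
    """Move the chosen antecedent to the front of the salience list."""
    return [antecedent] + [e for e in salient if e != antecedent]


brennan = Entity("Brennan", "fem", "sg")
car = Entity("AR", "neut", "sg")

salient = [brennan, car]               # after "Brennan drives an AR."
she = resolve("fem", "sg", salient)    # "She drives too fast."
salient = update(salient, she)
print(she.name)  # Brennan
```

The ordering of `salient` is where discourse functions come in: a topic would be placed ahead of non-topics before resolution starts.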
Anaphora resolution example
Brennan drives an AR.             Brennan=Old, AR=New
She drives too fast.              She=Brennan=Old
Friedman races her on weekends.   Friedman=Old, Brennan=Old, Her=Brennan=Old
She drives to Laguna Seca.        She=Friedman=Old
She often beats her.              She=Friedman=Old, Her=Brennan=Old

Discourse functions determine correct anaphora resolution.
Pro-Drop and Anaphora Resolution
• Pro-drop is (partly) licensed by DFs
  – Already established topics are more likely to be pro-dropped
• Centering theory:
  – Continue and Smooth-shift transitions favor null subjects
  – Chinese (Song 2003)
  – Yiddish (Prince 1998)
Summary: Anaphora resolution
• DFs are essential for determining anaphora resolution
• Pro-drop is licensed in part by IS
• But a lot remains to be worked out.
Talk Outline
• Information Structure: Syntax of discourse functions
• Applications: Anaphora Resolution
• Discourse structure
• Applications: Summarization and Sentence Condensation
• Conclusions
Discourse Structure
• A simple model
• Its relation to syntax
A too simple idea
  [D S S S S S]
Progression and elaboration
• Joan got up early. She showered. Then she made some tea. …
• Mary is a model professor. Last year she wrote ten papers. She also advised 20 doctoral students and she was a member of the Committee on Women in Science.
A still very simple idea
  [D S S [D S S] S S]
Discourse progresses sentence by sentence
or
Subparts elaborate on previous parts
One type of discourse tree (Linguistic Discourse Model)
[John fell.]a [Bill pushed him.]b
  [S a b]
[Bill pushed John.]a [He fell.]b
  [C a b]
(S = subordination, C = coordination)
a and b are BDUs (Basic Discourse Units).
A BDU basically corresponds to a segment with an event variable in its semantics.
BDU Relations
• Not all types of relations can be classified as belonging to the subordinating or the coordinating type.– We will ignore the rest here.
• Some elements in a sentence can explicitly indicate what type of relation we have, e.g. ‘because’ is a subordination relation.– They will be called “operator segments.”
How do discourse trees relate to sentence syntax trees?
• Some textual elements guide the discourse tree construction.
• A BDU is not necessarily a complete sentence or vice versa.
Sentence does NOT equal BDU
[The man dove into the pool.]a [It was warm and soothing]b and [he decided to remain for a little longer than usual.]c
  [C [S a b] c]
ADJUNCT clauses
[Joan left]a because [she was tired.]b
Three segments: Two BDUs and 1 operator
  [S a b]
Textual elements that guide the construction of discourse trees
• Hypothesis 1: Subordinating conjunctions indicate discourse subordination.
  – Needs checking: it is often true but is it always true?
Textual elements cont.
• Hypothesis 2: tense and aspect
  – John dove into the pool. The water was warm and soothing.
  – John Smith was wearing a long coat. It looked brand new.
• Stative predicates do not push the discourse forward and often indicate subordination.
• English is not very rich in this type of indicator.
  – perfective/imperfective distinctions are more explicit in other languages, e.g. French (see e.g. Asher and Lascarides 2003).
Textual elements cont.
• Hypothesis 3: pronominalization
  – John Smith was wearing a long coat. It looked brand new.
  Often the ‘promotion’ of (the referent of) an OBJ or an OBL to a SUBJ in the following sentence reflects a discourse subordination. (Polanyi et al. 2004)
• But
  – John hit Bill. He fell.
  The tense and aspect information takes precedence.
What is the role of Information Structure in the construction of Discourse trees?
• [John Smith]T1 was wearing [a long coat]F1. [It]T2 looked brand new.
  Focus-1 --> Topic-2
• [John]T1 likes [sweets]F1. [He]T2 eats [three dishes of ice cream]F2 and [five chocolate bars]F2 every day.
  Topic-1 --> Topic-2 (cf. centering theory ‘shifts’)
In Discourse Structure both are subordinations.
Are Information Structure and Discourse structure independent?
• Information structure: what the sentence/discourse is about
• Discourse structure: how we talk about what we are talking about:
  – narratives
  – explanations
  – …
Where to look for a link?
‘The first Christian mission to New Zealand,…., was launched by Samuel Marsden on behalf of the Church of England’s Church Missionary Society (CMS) in 1814. …Marsden, a bluff Yorkshireman with ‘heavy shoulders and a face of a petulant ox’, was both chaplain to the New South Wales penal settlement and a magistrate. He was severe in dealing with convicts… But he went out of his way to meet and greet Maori in Sydney, and often… He had even, in 1809, rescued the Maori sailor Ruatara, who was stranded in London, and taken him back with him to Sydney. It was this association in particular that led Marsden to set up the first CMS mission at Rangihoua in the Bay of Islands in 1814, on land that he would buy from Ruatara.’
• The cleft seems to indicate a ‘pop’ from the subordinated material to the resumption of main narrative.
• Note also that the material in the that-clause might be presupposed in the logical sense but it is not old information (see Collins for ample examples)
• No claim that this is the only discourse structure function of it-clefts.
• No one-to-one relation, multifactorial analysis necessary.
Summary: Discourse Structure
• Discourse structure looks at how clauses and sentences are related to one another.
• Textual elements provide information on how to build up the structure but they do not completely determine it.
Talk Outline
• Information Structure: Syntax of discourse functions
• Applications: Anaphora resolution
• Discourse structure
• Applications: Summarization and Sentence Condensation
• Conclusions
Summarization and sentence condensation
• Summarization
• Condensation
Sentence Condensation and Summarization
• Have a long text
• Want a short "condensed" version
  – retain most salient features
  – maintain grammaticality
• Choose salient sentences via Discourse structure
• Condense those sentences
Example of Discourse-driven summarization
• Our group is developing new techniques for helping manage information for enhanced collaboration. We explore solutions for seamlessly connecting people to their personal and shared resources. Our solutions include services for contextual and proactive information access, personalized and collaborative office applications, collaborative annotation and symbolic, statistical and hybrid processing of natural language. Our team includes researchers with diverse backgrounds including: ubiquitous computing, computer-supported collaboration, HCI, IR, and NLP.
Discourse Tree
A coordination of two subtrees that have subordinated elements with again coordinated or subordinated elements
  [C [S [S a b] [S c d]] [S e [S f g]]]
Discourse Structure
• [Our group is developing new techniques]a for [helping manage information for enhanced collaboration.]b [We explore solutions for seamlessly connecting people to their personal and shared resources.]c [Our solutions include services for contextual and proactive information access, personalized and collaborative office applications, collaborative annotation and symbolic, statistical and hybrid processing of natural language.]d [Our team includes researchers with diverse backgrounds]e [including:]f [ubiquitous computing, computer-supported collaboration, HCI, IR, and NLP.]g
Possible condensations
• [Our group is developing new techniques]a for [helping manage information for enhanced collaboration.]b [We explore solutions for seamlessly connecting people to their personal and shared resources.]c [Our team includes researchers with diverse backgrounds]e [including:]f [ubiquitous computing, computer-supported collaboration, HCI, IR, and NLP.]g
Possible condensations
• [Our group is developing new techniques]a for [helping manage information for enhanced collaboration.]b [Our team includes researchers with diverse backgrounds]e
Possible condensations
• [Our group is developing new techniques]a [Our team includes researchers with diverse backgrounds]e
Summarization and Discourse structure
• Discourse structure allows only ‘big chunks’ to be deleted.
• We need also finer-grained structure for sentence condensation.
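Pruning a discourse tree can be sketched as follows. The sketch assumes that in a subordination (S) node the first child dominates and every later child sits one level deeper, and that a summary keeps the units up to a chosen subordination depth. The tree given for the example text (units a to g) is one possible reading of the diagram, and `BDU`, `Node` and `kept_units` are hypothetical names, not part of any published system.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BDU:
    label: str


@dataclass
class Node:
    relation: str  # "C" (coordination) or "S" (subordination)
    children: List[Union["Node", BDU]]


def kept_units(tree, max_depth, depth=0):
    """List the BDU labels whose subordination depth is at most max_depth."""
    if isinstance(tree, BDU):
        return [tree.label] if depth <= max_depth else []
    kept = []
    for i, child in enumerate(tree.children):
        # Assumption: non-initial children of an S node are subordinated.
        extra = 1 if tree.relation == "S" and i > 0 else 0
        kept.extend(kept_units(child, max_depth, depth + extra))
    return kept


a, b, c, d, e, f, g = (BDU(x) for x in "abcdefg")
tree = Node("C", [
    Node("S", [Node("S", [a, b]), Node("S", [c, d])]),
    Node("S", [e, Node("S", [f, g])]),
])

print(kept_units(tree, 0))  # ['a', 'e']
print(kept_units(tree, 1))  # ['a', 'b', 'c', 'e', 'f']
```

Depth 0 yields units a and e, the shortest of the condensations listed for this text. A single depth threshold does not reproduce the intermediate condensations, so a real system would need a finer selection criterion.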
Role of information structure in sentence condensation
• Salient information will be discourse prominent
  – retain Focus
  – retain Topic, possibly in reduced form (e.g. pronoun)
  – delete background, non-prominent
• Salient information tends to correspond to the heads of arguments in main clauses but there are a lot of special cases
  – ADJUNCTs can be deleted but one should keep negations
Sentence condensation
• [Our group is developing new techniques]a for [helping manage information for enhanced collaboration.]b [We explore solutions for seamlessly connecting people to their personal and shared resources.]c
• Transfer rules dictate which f-structure parts can be deleted
  X ∈ (Y ADJUNCT) & (X ADJ-TYPE) ≠ neg: X → 0.
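The effect of such a deletion rule can be sketched on a toy f-structure. This is an illustration only: it assumes f-structures are nested Python dicts with ADJUNCT as a list of dicts, it is not XLE transfer-rule syntax, and the function name `condense` and the ADJ-TYPE value `"temporal"` are hypothetical.

```python
def condense(fs):
    """Drop every ADJUNCT member except negations, recursively."""
    out = {}
    for attr, value in fs.items():
        if attr == "ADJUNCT":
            # Keep only adjuncts whose ADJ-TYPE is neg.
            kept = [condense(adj) for adj in value if adj.get("ADJ-TYPE") == "neg"]
            if kept:
                out[attr] = kept
        elif isinstance(value, dict):
            out[attr] = condense(value)
        else:
            out[attr] = value
    return out


fs = {
    "PRED": "leave<SUBJ>",
    "SUBJ": {"PRED": "Joan"},
    "ADJUNCT": [
        {"PRED": "not", "ADJ-TYPE": "neg"},
        {"PRED": "yesterday", "ADJ-TYPE": "temporal"},
    ],
}

print(condense(fs)["ADJUNCT"])  # [{'PRED': 'not', 'ADJ-TYPE': 'neg'}]
```

Keeping the negation matters because deleting it would reverse the meaning of the condensed sentence, whereas dropping the other adjunct only loses detail.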
Basic Sent. Cond. system
[System diagram: Source → XLE Parsing (ParGram English) → Packed F-structures → Transfer (Condensation rules) → Packed Condensed F-structures → Stochastic Selection (Log-linear model) → n best → XLE Generation → Target]
Summary: Summarization and Condensation
• Discourse structure can guide summarization
• F-structures are easily manipulated for condensation
  – F-structure distinctions give broad guidance (ADJUNCT and MOD vs. GFs and DFs)
  – But there are distinctions that are important for condensation that are very minor in the f-structure, e.g. the difference between negative and other ADJUNCTs.
  – Is it possible to be more systematic or is this just a reflection of the way things are?
Talk Outline
• Information Structure: Syntax of discourse functions
• Applications: Anaphora resolution
• Discourse structure
• Applications: Summarization and Sentence Condensation
• Conclusions
Summary
• Information-structure: sentence internal partition of the information according to discourse functions
• Discourse-structure: inter-clausal relations between successive utterances
• Both are crucial in certain applications
  – anaphora resolution
  – summarization/condensation
Conclusions
• For applications it is necessary to get all the modules worked out
• This crucially involves many aspects of linguistic theory
• The projection architecture of LFG should be helpful but a lot of work remains to be done.
References
• Alsagoff, L. 1992. Topic in Malay: the Other Subject. PhD thesis, Stanford University.
• Asher, N. and A. Lascarides. 2003. Logics of Conversation. Cambridge University Press.
• Bresnan, J. 2001. Lexical-Functional Syntax. Blackwell.
• Bresnan, J. and S. Mchombo. 1987. Topic, pronoun, and agreement in Chichewa. Linguistic Inquiry.
• Butt, M. and T.H. King. 1998. Interfacing phonology with LFG. In LFG98 Proceedings. CSLI On-line Publications.
• Choi, H.-W. 1999. Optimizing Structure in Context: Scrambling and Information Structure. CSLI Publications.
References cont.
• Collins, P. 1991. Cleft and pseudo-cleft constructions in English. London and New York: Routledge.
• Enc, M. 1991. The semantics of specificity. Linguistic Inquiry.
• King, T.H. 1995. Configuring Topic and Focus in Russian. CSLI Publications.
• Kroeger, P. 1993. Phrase Structure and Grammatical Relations in Tagalog. CSLI Publications.
• Lambrecht, K. 1981. Topic, antitopic and verb agreement in non-standard French. Benjamins.
• Polanyi, L. et al. 2004. A Rule Based Approach to Discourse Parsing. ACL Workshop.
• Polanyi, L. et al. 2004. Sentential Structure and Discourse Parsing. ACL Workshop.
References cont.
• Prince, E. 1998. On the limits of syntax, with reference to Left-Dislocation and Topicalization. In P. Culicover and L. McNally (eds.) The limits of syntax. NY: Academic Press.
• Prince, E. 1998. Subject-Prodrop in Yiddish. In Focus: Linguistics, cognitive, and computational perspectives.
• Sharma, D. 2003. Discourse clitics and constructive morphology in Hindi. In M. Butt and T.H. King (eds.) Nominals: Inside and Out. CSLI Publications.
• Song, Z. 2003. A Comparative Study of Subject Pro-drop in Old Chinese and Modern Chinese. NWAVE 32.
• Steedman, M. 2000. Information Structure and the Syntax-Phonology Interface. Linguistic Inquiry.
Example: Sentential Subjects
• [That John is an idiot]top [e-subj is obvious]sentence