
Page 1: Encoding UNL Expressions: Some Problems and Proposals Igor Boguslavsky UNL Russia bogus@iitp.ru

Encoding UNL Expressions: Some Problems and Proposals

Igor Boguslavsky

UNL Russia

bogus@iitp.ru

General issues

UNL: an interlingua or not?

No ambiguity

UNL vs. natural language

– “at least as powerful as any NL”?
– semantics vs. KB

Correct UNL vs. adequate UNL

Adequacy conditions

An adequate UNL expression should:

– preserve the meaning of the source text;
– be convenient for prospective applications, including deconversion.

Possibility of inverse generation?

Necessary but insufficient:

invitation of the president

mod(invitation, president)?

– the president invites somebody
– somebody invites the president

Russian: He received the shower (= took the shower)

Tentative procedure

How to develop a definite and common view on what UNL expressions are adequate?

1. UNL from LCs to UNLC

2. Comments from LCs to UNLC

3. Feedback from UNLC to LCs

4. Update of UNL by LCs

Universal Words

Headwords

Restrictions

Attributes

Headwords

Multi-word UWs

Support verbs

Multi-word UWs

They should be avoided if their meaning is representable as a combination of the meanings of the words they are composed of:

UW to be avoided: «Ministery of foreign affairs»

UW to be preferred:

mod(ministery.@entry, affair.@pl)

mod(affair.@pl,foreign)

Why so?

If any free word combination can be made a UW, one can never hope that other partners will have matching UWs in their dictionaries.

Appropriate multi-word UWs

Non-compositional phrases:

– «look for(agt>thing,obj>thing)»
– «look like(aoj>thing,obj>thing)»

A convenient compromise

To account for the fact that a phrase is considered as denoting a single concept, the UNL expression can be enclosed in a scope:

mod:01(ministery.@entry, affair.@pl)

mod:01(affair.@pl,foreign)
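A minimal sketch of how such a scoped encoding could be produced programmatically. The `Relation` class and the `in_scope` helper are hypothetical names introduced only for illustration; they do not belong to any UNL toolkit. The only assumption about the notation is the one visible in the two lines above: the scope identifier follows the relation label after a colon.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relation:
    """One binary UNL relation, optionally inside a scope (hypothetical helper)."""
    label: str                   # relation label, e.g. "mod"
    source: str                  # first UW, with any attributes attached
    target: str                  # second UW
    scope: Optional[str] = None  # scope identifier such as "01", or None

    def to_unl(self) -> str:
        # A scope is written as ":ID" right after the relation label.
        head = self.label if self.scope is None else f"{self.label}:{self.scope}"
        return f"{head}({self.source},{self.target})"


def in_scope(relations: List[Relation], scope_id: str) -> List[Relation]:
    """Return copies of the relations, all tagged with the same scope id."""
    return [Relation(r.label, r.source, r.target, scope_id) for r in relations]


# Headword spelling follows the slide.
phrase = [
    Relation("mod", "ministery.@entry", "affair.@pl"),
    Relation("mod", "affair.@pl", "foreign"),
]
scoped = [r.to_unl() for r in in_scope(phrase, "01")]
print("\n".join(scoped))
```

Keeping the scope as a field of each relation, rather than as a wrapper object, mirrors the flat list-of-relations form in which UNL expressions are written.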

Another possibility (Ch. Boitet)

Postulate one UW having the internal structure:

«mod(ministery,affair.@pl)_mod(affair.@pl,foreign)»

Restrictions in UW/KB

(1) Semantic function

(2) Knowledge Base function

(3) Argument frame function

(1) Semantic function

Restricting the meaning - needed, in particular, to ensure

– disambiguation of the head word
– selection of the translation equivalent

(2) KB function

Locating the UW in the KB - needed, in particular, to ensure

– choice of the nearest UW when the direct equivalent is absent from the UW dictionary (replacement ability)
– semantic inference
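The replacement ability can be pictured as a walk up the `icl` chain until a UW known to the target dictionary is reached. The sketch below is only an illustration under that assumption: the three-link hierarchy and the function name `nearest_known_uw` are invented here, and a real KB would be a much larger graph rather than a single chain.

```python
from typing import Dict, Optional, Set

# Toy fragment of a KB: each UW points to its immediate "icl" parent.
# The hierarchy below is invented for illustration only.
ICL_PARENT: Dict[str, str] = {
    "titmouse": "bird",
    "bird": "animal",
    "animal": "thing",
}


def nearest_known_uw(uw: str, dictionary: Set[str]) -> Optional[str]:
    """Walk up the icl chain until a UW present in `dictionary` is found."""
    current: Optional[str] = uw
    while current is not None:
        if current in dictionary:
            return current
        current = ICL_PARENT.get(current)  # None once the top is passed
    return None


# A target dictionary that lacks "titmouse" but knows "bird".
target_dictionary = {"bird", "thing"}
print(nearest_known_uw("titmouse", target_dictionary))
```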

(3) Argument frame function

Presenting the argument frame.

Correlation between the semantic and the KB functions

Semantically- and KB-oriented restrictions do not necessarily coincide:

– semantic restriction: book(icl>thing)

– KB restriction: titmouse{(icl>bird)}

How to select semantic restrictions

September{(icl>month>date)}

– answer(icl>do) (for cases like answer questions)
– answer(icl>be) (for cases like answer expectations)
– answer(icl>thing) (for cases like know the answer)

Ru: zhenit’sja – marry(agt>male),

vyxodit’ zamuzh – marry(agt>female).

VERY IMPORTANT!

Semantic restrictions should effectively distinguish the meaning we restrict from all other relevant meanings of the same English headword.

They should NOT be equally applicable to more than one meaning.

They should be easily understandable.

Example

operator - all the meanings denote a thing

WRONG (in the sense ‘inadequate’):

– operator(icl>thing)

CORRECT

– operator(icl>human)
– operator(icl>abstract thing)
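The requirement that a restriction must not apply equally to more than one meaning can be checked mechanically once the senses of a headword are placed in a hierarchy. The following sketch assumes an invented two-branch hierarchy and hypothetical helper names; it only illustrates the test, not the content of any actual KB.

```python
from typing import Dict, List

# Invented mini-hierarchy: every class lists itself and its ancestors.
ANCESTORS: Dict[str, List[str]] = {
    "human": ["human", "thing"],
    "abstract thing": ["abstract thing", "thing"],
}

# Two senses of the headword "operator", identified by their class.
OPERATOR_SENSES = ["human", "abstract thing"]


def senses_matched(restriction: str, senses: List[str]) -> List[str]:
    """Return the senses to which the restriction icl>`restriction` applies."""
    return [s for s in senses if restriction in ANCESTORS[s]]


def is_discriminating(restriction: str, senses: List[str]) -> bool:
    """A restriction is adequate only if it selects exactly one sense."""
    return len(senses_matched(restriction, senses)) == 1


for r in ("thing", "human", "abstract thing"):
    print(f"operator(icl>{r}):", is_discriminating(r, OPERATOR_SENSES))
```

Under these assumptions `icl>thing` matches both senses and is rejected, while `icl>human` and `icl>abstract thing` each select one.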

Relations useful for disambiguation

icl

iof

equ

ant (disappeared from the specs?)

– poor(icl>bad): poor quality
– ??? poor(icl>having little money)
  • “having little money” is a bad UW
– poor(ant>rich): poor people

Needs to be emphasized again

UNL News1: build global knowledge

– build(agt>thing,obj>thing)

Does not differentiate between different meanings of the headword:

– build a railway (a house): build(agt>thing,obj>concrete thing)
– build plans (knowledge): build(agt>thing,obj>abstract thing)

KB function

UW: September

MD: September{(icl>month>date)}

What remains unclear-1:

KB semantics. Links between related concepts of different semantic categories are missing. There is no way to express the relationship between “dance” (as a verbal concept) and “dance” (as a nominal concept), “government” and “governmental”, etc.

– dance({icl>do(}agt>person{)})
– dance(icl>action{>event})

What remains unclear-2:

Status of UWs within the restrictions:

propose(agt>thing,gol>thing,obj>thing)

– They proposed to the president that a special committee should be set up

«set up» does not fall under «thing». But where does it belong, then?

What is an argument-1?

A is an argument of L --> A is integral to the meaning of L.

What is an argument-2?

A is semantically obligatory: L cannot be semantically defined without A being mentioned.

A is not always syntactically obligatory: it can remain unmentioned in a sentence.

Example: buy

buy has 4 arguments: a buyer, an object, a seller, the money paid.

All of them are semantically obligatory: “buying” cannot exist without any of them.

None of them is syntactically obligatory:

– I bought a book (the seller and the money are not mentioned).
– To buy is more pleasant than to sell (no arguments are mentioned).

Semantic roles vs. predicate-argument relations

UNL does not mark predicate-argument relations in a systematic way.

Assumption: arguments can be reliably identified based on their semantic role.

It does not work. Why?

Too many «difficult» cases. Only a part of semantic relations between the words can be reliably interpreted in terms of semantic roles.

Too many mismatches. Assignment of semantic roles cannot be done in a consistent way (especially in the UNL multi-lingual and multi-cultural environment).

The reason:

Numerous mismatches in the representation of the same or similar phenomena are rooted in the fundamental impossibility of a consistent interpretation of ALL argument relations in terms of a fixed SMALL number of semantic roles.

Examples

Nothing (obj) prevents the members (ben) from discussing (gol) this problem

– why beneficiary (ben)?
– why final state (gol)?

protect nature from pollution (?)

familiarize students with India (?)

Difference between arguments and non-arguments-1

Any nominal concept can have a purpose, e.g.

– a stone for driving nails

Therefore {pur>uw} is assigned to UW «thing» and is inherited by all UWs lying below.

Difference between arguments and non-arguments-2

Purpose is NOT an argument of “stone”: a stone has no obligatory conceptual link with the purpose.

Purpose IS an argument of “method”: a method cannot exist without a purpose.

– a method for calculating taxes

Another example: borrow

– X borrows Y from Z for W =
  • Z owns Y,
  • X makes Z give him Y,
  • X promises Z to give Y back after W expires

– borrow cannot exist without 4 participants: agent, object, source, duration

Difference between arguments and non-arguments-3

Each action has a certain duration.

Therefore {dur>time} is assigned to UW «do» and is inherited by all UWs lying below.

Besides this, borrow has a semantic argument with the role ‘duration’

Argument vs. non-argument

(1) John borrowed money for 3 years

– Argument W: term of the loan. John promised to return money after 3 years

(2) John has been borrowing money for 3 years

– Non-argument: the situation ‘John is borrowing money’ lasted for three years (the term of each loan is not specified)

Why important? - For semantic processing

(1) can answer the question about the term of the loan; (2) cannot.

mod(invitation, president)

– the president invites somebody (arg. 1)
– somebody invites the president (arg. 2)
– the invitation has an unspecified connection with the president (non-arg.)

NB: the specs do not allow this distinction to be drawn!

Why important? - For deconversion

The arguments and non-arguments are very often encoded differently:

– dur: Ru «borrow on 3 years» vs. during 3 years
– rsn: afraid of darkness, tremble with fear; not: *afraid because of darkness
– scn: In [scn] this country the relations between the nations are based on [scn-arg] mutual respect

How to differentiate?

The distinction between the arguments and the non-arguments should be drawn both in the UWs and in UNL expressions.

Proposal for UWs

Restrictions corresponding to arguments should be systematically and exhaustively represented in KB.

They can either be included into the UW, or be inherited from upper concepts.

Proposal for UWs

They should be formally opposed to non-argument restrictions.

One of the ways: capitalization.

– thing{(and>thing,…,pur>uw,…)}
– method(icl>abstract thing,Pur>uw)
– do{(and>do,…,dur>period,…)}
– borrow(icl>do,…Dur>period)

Another possibility:

– dur vs. dur.@A
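A short sketch of how a program could read the capitalization convention. It assumes exactly what is proposed above, namely that a relation label beginning with an upper-case letter marks an argument restriction; the function name `split_restrictions` is hypothetical, and the input is a plain comma-separated restriction list with the surrounding brackets already removed.

```python
from typing import List, Tuple


def split_restrictions(restrictions: str) -> Tuple[List[str], List[str]]:
    """Split a comma-separated restriction list into (arguments, others).

    Assumes the capitalization convention: a relation label starting
    with an upper-case letter marks an argument restriction.
    """
    arguments: List[str] = []
    others: List[str] = []
    for item in restrictions.split(","):
        item = item.strip()
        if not item:
            continue
        label = item.split(">", 1)[0]  # text before the first ">"
        if label[:1].isupper():
            arguments.append(item)
        else:
            others.append(item)
    return arguments, others


print(split_restrictions("icl>abstract thing,Pur>uw"))
print(split_restrictions("and>thing,pur>uw"))
```

With this convention `Pur>uw` in the entry for method is reported as an argument, while the inherited `pur>uw` of thing is not.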

Proposal for UNL expressions

Mark argument relations (Ch. Boitet):

– @A, @B, @C…

relation for (1):

– dur.@A(borrow, year)

relation for (2):

– dur(borrow, year)
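The practical effect of the marker can be sketched as follows: a consumer asking for the term of the loan looks only at argument-marked `dur` relations. Both helpers, `encode_relation` and `argument_filler`, are hypothetical, and the regular expression covers only the simple `label[.@X](head, dependent)` shape used on this slide, not the full UNL syntax.

```python
import re
from typing import Iterable, Optional

# Matches e.g. "dur.@A(borrow, year)" or "dur(borrow, year)".
REL = re.compile(
    r"^(?P<label>[a-z]+)(?:\.@(?P<slot>[A-Z]))?"
    r"\((?P<head>[^,]+),\s*(?P<dep>[^)]+)\)$"
)


def encode_relation(label: str, head: str, dependent: str,
                    arg_slot: Optional[str] = None) -> str:
    """Render a relation; `arg_slot` ("A", "B", ...) marks an argument link."""
    marked = label if arg_slot is None else f"{label}.@{arg_slot}"
    return f"{marked}({head}, {dependent})"


def argument_filler(expressions: Iterable[str], label: str,
                    head: str) -> Optional[str]:
    """Return the dependent of an argument-marked relation, if there is one."""
    for expression in expressions:
        m = REL.match(expression.strip())
        if m and m["label"] == label and m["head"] == head and m["slot"]:
            return m["dep"]
    return None


sentence_1 = [encode_relation("dur", "borrow", "year", arg_slot="A")]
sentence_2 = [encode_relation("dur", "borrow", "year")]
print(argument_filler(sentence_1, "dur", "borrow"))  # term of the loan is known
print(argument_filler(sentence_2, "dur", "borrow"))  # plain duration only
```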

Sample UW dictionary entry

Current UW: responsible(aoj>thing,obj>thing)

It is proposed to introduce a comment:

responsible(Aoj>thing,Obj>thing,Gol>*)

;he (aoj) is responsible to me (gol) for his actions (obj) (example)

;IB_Ru, 29/11/02 (author and date)
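If comments of this kind were admitted, a dictionary reader would have to separate them from the UW itself. The sketch below assumes, purely for illustration, that every comment is introduced by a semicolon and that no semicolon occurs inside the UW; `parse_entry` is a hypothetical helper, and the actual comment syntax would be fixed by the specs.

```python
from typing import List, Tuple


def parse_entry(text: str) -> Tuple[str, List[str]]:
    """Split an entry into the UW and its ';'-introduced comments."""
    uw, *comments = text.split(";")
    # Collapse internal whitespace so that wrapped comments become one line.
    cleaned = [" ".join(c.split()) for c in comments if c.strip()]
    return uw.strip(), cleaned


entry = """responsible(Aoj>thing,Obj>thing,Gol>*)
;he (aoj) is responsible to me (gol) for his actions (obj)
;IB_Ru, 29/11/02"""

uw, comments = parse_entry(entry)
print(uw)
for comment in comments:
    print("comment:", comment)
```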

Comments in MD

Not only for illustrating argument frames, but also for clarifying concepts.

Specs modification is needed.

Attributes

Dictionary of attributes (explanation of the attribute, examples)

A procedure for introducing new attributes should be set up.

Issues concerning KB

Adjectival concepts

Adjectival concepts: mod vs. aoj

Two major classes of adjectives:

– (aoj>thing) vs. (mod<thing)

Specs: “For an adjectival concept, (aoj>thing) or (mod<thing) should be attached to the Basic UW. (aoj>thing) is for expressing a predicative concept, whereas (mod<thing) is for expressing restrictive concept”.

We should distinguish between:

(a) a syntactic property: whether the adjective is used predicatively (Greeks are wise) or attributively (the wise Greeks);

(b) a semantic property: what the adjective means when used attributively:

– restriction;
– qualification.

Only (b) should interest us.

Restrictive vs. non-restrictive

Wise Greeks diluted wine with water

– restrictive: Those Greeks who were wise diluted wine with water. Silly ones didn’t.
– non-restrictive (qualificative): Greeks were wise. They diluted wine with water.

Non-attributive (predicative) adjective does not restrict the noun:

– Greeks are wise.

Adjectives

Some adjectives can only be restrictive:

– Many dogs have curly hair

Some adjectives can only be non-restrictive:

– Get those damned dogs out of the room!
– Dear colleagues!

Most of the adjectives can be both restrictive and non-restrictive.

Non-adjectives

The old people in the street were very tired.

– Those who were in the street were tired; others were not tired.
– The old people were very tired. They were in the street.

There is no UW to which a restriction can be assigned!

What is more…

In some languages restrictive vs. non-restrictive interpretations of relative clauses are marked by punctuation (English, Spanish).

Restrictive

No commas are allowed:

– The old people who came a long way were tired.

– Los viejos que habían venido de muy lejos estaban cansados.

Non-restrictive

Commas are needed:

– The old people, who came a long way, were tired.

– Los viejos, que habían venido de muy lejos, estaban cansados.

Proposal

Abandon the division of adjectives into (aoj>*) and (mod<*).

In order to account for this difference, introduce two attributes (@restr, @non-restr) which can be added to any modifier (an adjective, a prepositional phrase, a clause), if the author wishes to mark the restrictive or non-restrictive interpretation.
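A minimal sketch of the optional marking. The attribute names `@restr` and `@non-restr` are the ones proposed here; the helper `mark_modifier` and its behaviour of leaving the UW untouched when no reading is given are assumptions made for illustration.

```python
from typing import Optional

# The two attribute names proposed on this slide.
ALLOWED_READINGS = {"restr", "non-restr"}


def mark_modifier(uw: str, reading: Optional[str] = None) -> str:
    """Attach @restr or @non-restr to a modifier UW; None leaves it unmarked."""
    if reading is None:
        return uw  # the author chose not to specify the interpretation
    if reading not in ALLOWED_READINGS:
        raise ValueError(f"unknown reading: {reading!r}")
    return f"{uw}.@{reading}"


print(mark_modifier("wise", "restr"))      # Those Greeks who were wise ...
print(mark_modifier("wise", "non-restr"))  # Greeks were wise. They ...
print(mark_modifier("wise"))               # interpretation left open
```

Because the function takes any UW string, the same marking applies to the entry node of a prepositional phrase or a clause, which is the case a dictionary-level restriction cannot cover.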

Why are attributes better than restrictions?

Attributes reflect the point of view of the speaker in the current situation and not the permanent property of the word.

Attributes are optional and may be left unassigned if the author does not wish to specify his point of view.

The strongest argument

The restrictive vs. non-restrictive opposition is relevant not only for adjectives but also for other types of noun modifiers.

These modifiers cannot be assigned restrictions but can easily take an attribute.