Extending the SpanishWordNet

Clara Soler
Universitat Ramon Llull

Extending the SpanishWordNet

● Aims:
– Proposal about: a new classification and a new way of representing Adjectives and their polysemy in the lexical database
● By means of:
– Using the MikroKosmos Ontology (Nirenburg et al. 1994) to modify, if necessary, the EuroWordNet Top Concept Ontology

Index

● Classification of Adjectives in WordNet
● How Polysemy of Adjectives is represented
● Classification of Adjectives in EuroWordNet – SpanishWordNet
● The Ontological Criteria
● MikroKosmos Ontology / Treatment of Adjective Polysemy
● EuroWordNet Top Ontology
● Conclusions

Classification of Adjectives in WordNet

● WordNet 1.5 contains 16,428 adjectival synsets (including nouns, participles and prepositional phrases that function as adjectives)

● They are kept in three files: Descriptives, Relationals and Participles.

● How are they organized? Each class expresses typical syntactic and semantic features.

● Antonymy is the lexical relation that divides adjectives: Descriptive adjectives have antonyms and Relational adjectives do not.

Classification of Adjectives in WN

● Descriptive adjectives are organized into non-hierarchical synsets formed by one or more pairs of antonyms. These synsets are clusters that represent the values, from one extreme to the other, of some attribute.

● Relational adjectives are represented with pointers to the noun or verb from which they derive (a pointer expresses a lexical relation).

● Participial adjectives, ending in -ing and -ed, are considered a kind of Descriptive adjective. They do not fit well into cluster structures and are kept in a separate file.

Classification of Adjectives in WN

● Descriptive adjectives: attribute Warmth
torrid – hot – warm – tepid – cool – cold – frigid
● Relational adjectives
2 senses of cosmologic
Sense 1
cosmologic, cosmological, cosmogonic, cosmogonical, cosmogenic, cosmogenical -- (pertaining to the branch of astronomy dealing with the origin and history and structure and dynamics of the universe; "cosmologic science"; "cosmological redshift"; "cosmogonic theories of the origin of the universe")
Pertains to noun cosmology (Sense 2)
=> cosmology, cosmogony, cosmogeny -- (the branch of astrophysics that studies the origin and evolution and structure of the universe)

Classification of Adjectives in WN

● Relational adjectives
Sense 2
cosmologic, cosmological -- (pertaining to the branch of philosophy dealing with the elements and laws and especially the characteristics of the universe such as space and time and causality; "cosmologic philosophy"; "a cosmological argument is an argument that the universe demands the admission of an adequate external cause which is God")
Pertains to noun cosmology (Sense 1)
=> cosmology -- (the metaphysical study of the origin and nature of the universe)
=> metaphysics -- (the philosophical study of being and knowing)
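The two representations described in these slides (value clusters for Descriptive adjectives, "pertains to" pointers for Relational ones) can be sketched as simple data structures. This is only an illustration, not WordNet's actual file format: the class names `DescriptiveCluster` and `RelationalAdjective` and the field names are hypothetical, and the lemma lists are abbreviated.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DescriptiveCluster:
    """Values of one attribute, ordered from one extreme to the other."""
    attribute: str
    values: Tuple[str, ...]

    def extremes(self) -> Tuple[str, str]:
        # First and last value of the ordered cluster.
        return self.values[0], self.values[-1]


@dataclass(frozen=True)
class RelationalAdjective:
    """A relational sense with a pointer to the noun it derives from."""
    lemmas: Tuple[str, ...]
    pertains_to: str  # target of the "pertains to" pointer


warmth = DescriptiveCluster(
    "warmth", ("torrid", "hot", "warm", "tepid", "cool", "cold", "frigid")
)
cosmologic_1 = RelationalAdjective(
    ("cosmologic", "cosmological", "cosmogonic"), "cosmology (sense 2)"
)
cosmologic_2 = RelationalAdjective(
    ("cosmologic", "cosmological"), "cosmology (sense 1)"
)

print(warmth.extremes())
print(cosmologic_1.pertains_to, "|", cosmologic_2.pertains_to)
```

Note that the relational adjective already shows two senses here, each pointing to a different sense of the noun.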

       

How Polysemy of Adjectives is represented

● WordNet treats polysemy as correlated with frequency of occurrence (Zipf 1945): on average, the more frequently a word is used, the more different meanings it has.

● Therefore new, which is highly frequent, is considered highly polysemous.

How Polysemy of Adjectives is represented

1. new (vs. old) -- (not of long duration; having just (or relatively recently) come into being or been made or acquired or discovered; "a new law"; "new cars"; "a new comet"; "a new friend"; "a new year"; "the New World")
2. new(prenominal) -- (other than the former one(s); different; "they now have a new leaders"; "my new car is four years old but has only 15,000 miles on it"; "ready to take a new direction")
3. new, unexampled -- (having no previous example or precedent or parallel; "a time of unexampled prosperity")

How Polysemy of Adjectives is represented

4. fresh, new, novel -- (of a kind not seen before; "the computer produced a completely novel proof of a well-known theorem")
5. raw, new, wet behind the ears(predicate) -- (lacking training or experience; "the new men were eager to fight"; "raw recruits"; "he was still wet behind the ears when he shipped as a hand on a merchant vessel")
6. newfangled, new -- (of a new (often outrageous) kind or fashion)
7. new, new to(predicate) -- ((often followed by `to') unfamiliar; "new experiences"; "experiences new to him"; "errors of someone new to the job")

How Polysemy of Adjectives is represented

8. new, young -- ((of crops) harvested at an early stage of development; before complete maturity; "new potatoes"; "young corn")
9. new -- (unaffected by use or exposure; "it looks like new")
10. New -- (in use after Medieval times; "New Eqyptian was the language of the 18th to 21st dynasties")
11. Modern, New -- (used of a living language; being the current stage in its development; "Modern English"; "New Hebrew is Israeli Hebrew")

Classification of Adjectives in EuroWordNet

● Adjectives were not included in EuroWordNet, because:
– Being modifiers, the information they convey is less vital
– They were considered a highly polysemous category, so representing them in an enumerative lexicon is difficult

Classification of Adjectives in the SpanishWordNet

● Spanish and Catalan adjectives are translations of the English adjectives included in WordNet 1.5.

● They are classified and represented in the same way

Classification of Adjectives in the SWN

● 01256444a all lock 0 new_1 0 nuevo_5 0 nou_1 not of long duration; having just come into being or been made or acquired or discovered: "a new law"; "new cars"; "a new comet"; "a new friend"

● 01290120a all lock 0 fresh_2 novel_1 new_2 0 fresco_8 nuevo_8 novel_1 0 fresc_2 novell_1 nou_2 of a kind not seen before: "the computer produced a completely novel proof of a well-known theorem"

Classification of Adjectives in the SWN

● 00104365a all lock 0 new_4 unexampled_1 lock 0 inaudito_1 nuevo_2 lock 0 inaudit_2 nou_4 "an unprecedented expansion in population and industry"

● 00803732a all lock 0 hot_12 new_5 0 fresco_4 nolex 0 "hot off the press"; "fresh ideas"

Classification of Adjectives in the SWN

● all lock 0 new_8 newfangled_1 lock 0 novedoso_1 nolex 0 of a new (often outrageous) kind or fashion

● all lock 0 modern_5 new_9 lock 0 moderno_2 contemporáneo_6 lock 0 modern_4 contemporani_4 used of a living language; being the current stage in its development: "Modern English"; "New Hebrew is Israeli Hebrew" Perteneciente o relativo al tiempo actual Pertanyent o relatiu al temps actual
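The SWN records shown in these slides can be split into their parts with a small parser. The sketch below rests on assumptions inferred only from the examples, not from any documented format: an optional leading synset offset such as `01256444a`, flag tokens `all`, `lock` and `nolex`, a `0` token opening each language's variant list (apparently English, Spanish, Catalan, in that order), variants written as `lemma_N`, and the gloss as everything that follows. The function name `parse_record` is hypothetical.

```python
import re

VARIANT = re.compile(r"^\S+_\d+$")   # e.g. "nuevo_5" (assumed lemma_sense form)
OFFSET = re.compile(r"\d{8}[a-z]")   # e.g. "01256444a"
FLAGS = {"all", "lock", "nolex"}     # treated as flags, meaning not interpreted


def parse_record(line):
    """Split one record into offset, per-language variant groups and gloss."""
    tokens = line.split()
    offset = None
    if tokens and OFFSET.fullmatch(tokens[0]):
        offset = tokens.pop(0)
    groups = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "0":
            groups.append([])          # a "0" opens a new variant group
        elif VARIANT.match(tok) and groups:
            groups[-1].append(tok)
        elif tok not in FLAGS:
            break                      # first gloss token reached
        i += 1
    return {"offset": offset, "variants": groups, "gloss": " ".join(tokens[i:])}


record = parse_record(
    '00803732a all lock 0 hot_12 new_5 0 fresco_4 nolex 0 "hot off the press"'
)
print(record["offset"], record["variants"])
```

Under this reading, an empty third group (as in the `nolex` record above) would mean that no Catalan variant is recorded for the synset.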

Extending the SpanishWordNet

● At issue:
– Are Relationals not polysemous?
– Is it possible to represent the polysemy of Adjectives without enumerating all the possible shades of meaning they may take?
– Is it possible to relate polysemy to parameters other than frequency, compatibility with a greater or smaller number of nominals, noun disambiguation?
– Are Adjectives that polysemous?

Extending the SpanishWordNet

● This proposal extends the SpanishWordNet by incorporating Adjectives, on the following premises:
– Adjectives will still be classified into Descriptives and Relationals, but on the basis of ontological criteria.
– Their polysemy will also be accounted for by the same ontological taxonomy.
– They can also be classified according to the Base Concepts of the Top Ontology.

The Ontological Criteria

● These criteria come from the framework of the MikroKosmos Ontology (Nirenburg et al. 1994).

● In this model the lexicon mediates between a language of Text Meaning Representation and an ontology.

● Adjective meaning is explained according to this conceptual ontology.

● Lexical entries are instances of ontological types.

The MikroKosmos Ontology

● Classification of Adjectives:
– Scalars: based on Property ontological concepts
● Numerical scale: Warmth
torrid – hot – warm – tepid – cool – cold – frigid
● Literal scale: colour
red – yellow – black – white – blue – orange
● Evaluative
Terrific – Great – Good – Regular – Bad – Awful – Horrible

The MikroKosmos Ontology

– Non-Scalars (Relationals):
● Denominals: Object ontological concepts
National – nation
● Deverbals: Event ontological concepts
Readable – read

● An Adjectival Class: an ontological type
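The classification above amounts to a mapping from the ontological type an adjective is based on to its adjectival class. A minimal sketch, in which the names `OntologicalType` and `adjective_class` are illustrative and not part of MikroKosmos:

```python
from enum import Enum


class OntologicalType(Enum):
    PROPERTY = "property"
    OBJECT = "object"
    EVENT = "event"


def adjective_class(anchor: OntologicalType) -> str:
    """Map the ontological type an adjective is based on to its class."""
    if anchor is OntologicalType.PROPERTY:
        return "scalar"
    if anchor is OntologicalType.OBJECT:
        return "non-scalar (denominal)"
    return "non-scalar (deverbal)"


examples = {
    "hot": OntologicalType.PROPERTY,     # scale: Warmth
    "national": OntologicalType.OBJECT,  # derived from "nation"
    "readable": OntologicalType.EVENT,   # derived from "read"
}
for lemma, anchor in examples.items():
    print(lemma, "->", adjective_class(anchor))
```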

The MikroKosmos Ontology

● Numerical scale properties:
– Evaluation
– Price
– Size
– Speed
– Mass
– Ease
– Girth
– Age
– Cost

● Literal scale properties:
– Orientation
– Direction
– Side
– Color
– Shape

Treatment of Adjectival Polysemy

● In this proposal:
– Polysemy of a lexical item implies a change of meaning
● A romantic style / a romantic adventure
● In Spanish: negocio familiar / una atmósfera familiar
● In English: a family business / a familiar atmosphere
● Good son / Good deal: always positive
– Polysemy has to do with relatedness of senses arising through extension of meaning (Aarts & Calbert, 1979)
– This relatedness of senses can be explained by the ontological model

Treatment of Adjectival Polysemy

● A change of Ontological Type: from an Ontology of Entities or Events to an Ontology of Properties
– Victor Hugo is a French romantic author who lived during the XIX century
– Venice is a very romantic city for lovers

● A change within the same Ontological Type:
– High altitude: Scale Height
– High ideals: Scale Evaluative
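The two kinds of sense shift can be told apart mechanically once each sense is tagged with its ontological type and concept. The sketch below is an illustration under that assumption; the `Sense` class, its fields, the string labels and the `describe_shift` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    example: str
    onto_type: str  # "object", "event" or "property"
    concept: str    # ontological concept or scale the sense is based on


def describe_shift(a: Sense, b: Sense) -> str:
    """Name the relation between two senses of the same adjective."""
    if a.onto_type != b.onto_type:
        return "change of ontological type"
    if a.concept != b.concept:
        return "change within the same ontological type"
    return "no change"


romantic_author = Sense("a French romantic author", "object", "Romanticism")
romantic_city = Sense("a very romantic city", "property", "evaluative")
high_altitude = Sense("high altitude", "property", "height")
high_ideals = Sense("high ideals", "property", "evaluative")

print(describe_shift(romantic_author, romantic_city))
print(describe_shift(high_altitude, high_ideals))
```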

EuroWordNet Top Concept Ontology

● Top Concept Ontology classifies the Base Concepts

● Base Concepts are the most important prevailing meanings in the local wordnets and constitute a shared set of 1059 concepts. They represent the shared cores of the different wordnets.

● The Top Concept Ontology consists of 63 fundamental semantic distinctions, organized by subtype and opposition relations, which organize the Base Concepts

EuroWordNet Top Concept Ontology

● It is linked to the Inter-Lingual-Index (ILI) and provides some language-independent structuring of it.

● The 63 semantic distinctions or Top Concepts are classified according to three types of entities: 1st-Order-Entities, 2nd-Order-Entities and 3rd-Order-Entities.

EuroWordNet Top Concept Ontology

● 1st-Order-Entities: expressed by concrete nouns, concrete entities. One euro, a college, a hobbit

● 3rd-Order-Entities: expressed by abstract nouns: any unobservable proposition. Platonic ideas

EuroWordNet Top Concept Ontology

● 2nd-Order-Entities:

– denoted by any part of speech (nouns, verbs, adjectives and adverbs).

– represent any static or dynamic situation that cannot be grasped, seen, felt, or experienced as an independent physical thing.

– located in time, they happen, rather than exist.

EuroWordNet Top Concept Ontology

● 2nd-Order-Entities: (two classification schemes)
– Situation Type: the event-structure in terms of which a situation can be characterized as a conceptual unit over time
● Dynamic: transition
– BoundedEvent
– UnboundedEvent
● Static
– Property: colour, speed, age, length, size, shape, weight (single entity)
– Relation: relation, kinship, distance, space (two entities)

EuroWordNet Top Concept Ontology

● Situation Components (BC as conceptually coherent clusters; SC reflect the most salient semantic components):
– Usage
– Time
– Social
– Quantity
– Purpose
– Possession
– Physical

EuroWordNet Top Concept Ontology

– Modal
– Mental
– Manner
– Location
– Experience
– Existence
– Condition
– Communication

EuroWordNet Top Concept Ontology

– Cause
– Agentive
– Stimulating
– Phenomenal

Proposal: to reuse Top Concepts as new Property, Object and Event ontological concepts, and vice versa.

Extending the SpanishWordNet

● The idea is to use the ontological structure of MikroKosmos (events, objects and properties) to introduce and classify the adjectives in the SpanishWordNet

● In the Top Ontology: to verify whether the Top Concepts of 2nd-Order-Entities cover all possible adjective situations

Extending the SpanishWordNet

● Joining both ontologies
– Incorporation of the adjectival synsets into the Inter-Lingual-Index
– Incorporation of the new Top Concepts that represent the adjectival scales into the class of 2nd-Order-Entities.
– Scalar adjectives will be defined according to the Base Concepts of 2nd-Order-Entities
– Non-Scalar adjectives (Denominals and Deverbals) will be defined in relation to the Base Concepts of all Entities.

Extending the SpanishWordNet

● Classification(s) of adjectives in the SWN:
– Following the MikroKosmos ontological taxonomy, adjectives can be classified according to the three types of entities:
● Scalars: subsumed under 2nd-Order-Entities
● Non-Scalars: subsumed under 1st-, 2nd- and 3rd-Order-Entities (Denominals) and under 2nd-Order-Entities (Deverbals)
– 2nd-Order-Entities are denoted by adjectives. Therefore all of them can be classified according to the Base Concepts of these Entities
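The subsumption stated in this slide can be written as a small lookup table from adjective class to the orders of entity it may fall under. This is only a restatement of the bullets; the names `ENTITY_ORDERS` and `entity_orders` are hypothetical.

```python
# Orders of EuroWordNet entity under which each adjective class is subsumed.
ENTITY_ORDERS = {
    "scalar": frozenset({2}),
    "denominal": frozenset({1, 2, 3}),
    "deverbal": frozenset({2}),
}


def entity_orders(adjective_class: str) -> frozenset:
    """Return the entity orders allowed for an adjective class."""
    return ENTITY_ORDERS[adjective_class]


for cls in ("scalar", "denominal", "deverbal"):
    print(cls, sorted(entity_orders(cls)))
```

Every class includes 2nd-Order-Entities, which is what lets all adjectives be classified by the Base Concepts of that order.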

Extending the SpanishWordNet

● Proposal of an adjectival synset which represents polysemy: denominal and scalar
● The adjective "romantic" has 3 senses in WordNet.
1. romantic, romanticist, romanticistic -- (belonging to or characteristic of romanticism or the Romantic movement in the arts; "romantic poetry")
2. amatory, amorous, romantic -- (expressive of or exciting sexual love or romance; "her amatory affairs"; "amorous glances"; "a romantic adventure"; "a romantic moonlight ride")
3. quixotic, romantic, wild-eyed -- (not sensible about practical matters; unrealistic; "as quixotic as a restoration of medieval knighthood"; "a romantic disregard for money"; "a wild-eyed dream of a world state")

Extending the SpanishWordNet

● Proposal of an adjectival synset which represents polysemy: denominal and scalar

The adjective romantic also has 3 senses in the SWN.
1. romantic, romanticist, romanticistic
– Based on an Object ontological concept (Romanticism): denominal
2. romantic, amorous
– Based on a Property ontological concept (scale: evaluative (sentimentalism))
3. romantic, wild-eyed
– Based on a Property ontological concept (scale: evaluative (idealism))

Extending the SpanishWordNet

● Proposal of an adjectival synset which represents polysemy: scalar

The adjective new has only 4 senses in the SWN

1. new (vs. old) -- (not of long duration; having just (or relatively recently)
Based on a Property ontological concept (scale: temporal (duration))

2. new(prenominal) -- (other than the former one(s); different;)
3. new, unexampled -- (having no previous example or precedent or parallel;)
4. fresh, new, novel -- (of a kind not seen before; )

Extending the SpanishWordNet

● Proposal of an adjectival synset which represents polysemy: scalar

5. raw, new, wet behind the ears(predicate) -- (lacking training or experience; )
6. newfangled, new -- (of a new (often outrageous) kind or fashion)
7. new, new to(predicate) -- ((often followed by `to') unfamiliar; )
9. new -- (unaffected by use or exposure; "it looks like new")
Based on a Property ontological concept (scale: newness)

Extending the SpanishWordNet

● Proposal of an adjectival synset which represents polysemy: scalar

8. new, young -- ((of crops) harvested at an early stage of development; before complete maturity;)
Based on a Property ontological concept (scale: maturity)

11. Modern, New -- (used of a living language; being the current stage in its development; )
Based on a Property ontological concept (scale: temporary nature)
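The reduction of the WordNet senses of new to four SWN senses can be reproduced by grouping sense numbers by the scale they are based on. The sketch below follows one reading of the slides, which is an assumption: senses 2 to 7 and 9 are all taken to fall under the newness scale, and sense 10, which the slides do not list, is left out. The names `SENSE_SCALE` and `group_by_scale` are hypothetical.

```python
from collections import defaultdict

# WordNet sense number of "new" -> scale it is based on (one reading of the slides).
SENSE_SCALE = {
    1: "temporal (duration)",
    2: "newness",
    3: "newness",
    4: "newness",
    5: "newness",
    6: "newness",
    7: "newness",
    9: "newness",
    8: "maturity",
    11: "temporary nature",
}


def group_by_scale(sense_scale):
    """Collect the sense numbers that share a scale into one group."""
    groups = defaultdict(list)
    for sense, scale in sorted(sense_scale.items()):
        groups[scale].append(sense)
    return dict(groups)


groups = group_by_scale(SENSE_SCALE)
for scale, senses in groups.items():
    print(scale, senses)
```

Each resulting group corresponds to one proposed SWN sense, so the enumerated senses collapse to one entry per scale.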

Conclusions

● This paper proposes a new classification of Adjectives, with a view to extending the SpanishWordNet, as well as a new way of representing polysemy.

● The following has been shown:
– Relationals are also polysemous
– Adjective polysemy can be reduced and represented
– Ontological criteria can reflect the relationship between different senses of a lexical item

Conclusions

● Future work includes:
– To focus on the syntagmatic level of Adjective-Noun combinations, and to study the different degrees of co-occurrence (of compositional, semi-compositional and non-compositional character).
– To treat, if possible, the so-called adverbial adjectives using the same ontological structure