framester and wfd

Word Frame Disambiguation: Evaluating Linguistic Linked Data on Frame Detection
Mehwish Alam (1), Aldo Gangemi (1,2), Valentina Presutti (2)
(1) LIPN, Paris Nord University, CNRS UMR7030, France
(2) Semantic Technology Lab, ISTC-CNR, Rome, Italy

Upload: aldo-gangemi

Post on 16-Jan-2017



Page 1: Framester and WFD

Word Frame Disambiguation: Evaluating Linguistic Linked Data on Frame Detection

Mehwish Alam (1), Aldo Gangemi (1,2), Valentina Presutti (2)

(1) LIPN, Paris Nord University, CNRS UMR7030, France

(2) Semantic Technology Lab, ISTC-CNR, Rome, Italy

Page 2: Framester and WFD

Frames as eventuality schemas

• Prepare_coffee(x,y)

• events as relations with fixed arity

• Prepare_coffee(x,y,…)

• … adding multigrade arity (coffee mix, machine, time, recipe, …)

• Prepare_coffee(e,x,y,…)

• … adding reified eventualities [a.k.a. Neo-Davidsonian events]

• Prepare_coffee(e,x,y,…) ∧ agent(e,x) ∧ theme(e,y) ∧ …

• … adding semantic roles (agent, theme, time, location, …)

• Prepare_coffee(e,x,y,…) ∧ agent(e,x) ∧ theme(e,y) ∧ … ∧ Person(x) ∧ Beverage(y) ∧ …

• … adding semantic types (Person, Beverage, Coffee mix, Machine type, …)
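The progression above, from a bare predicate to a reified eventuality with roles and types, can be sketched in a few lines of Python. This is only an illustration: the class name `Eventuality` and its fields are hypothetical, and the variables x, y stand for the arguments used on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Eventuality:
    """A reified frame occurrence: the variable e in Prepare_coffee(e, x, y, ...)."""
    frame: str
    roles: dict = field(default_factory=dict)  # semantic role -> argument variable
    types: dict = field(default_factory=dict)  # argument variable -> semantic type

    def to_formula(self, event_var: str = "e") -> str:
        # Frame predicate over the event variable and all role fillers
        args = ", ".join([event_var, *self.roles.values()])
        conjuncts = [f"{self.frame}({args})"]
        # One binary predicate per semantic role, anchored on the event variable
        conjuncts += [f"{role}({event_var}, {arg})" for role, arg in self.roles.items()]
        # One unary predicate per semantic type
        conjuncts += [f"{typ}({arg})" for arg, typ in self.types.items()]
        return " ∧ ".join(conjuncts)


e = Eventuality(
    "Prepare_coffee",
    roles={"agent": "x", "theme": "y"},
    types={"x": "Person", "y": "Beverage"},
)
print(e.to_formula())
# Prepare_coffee(e, x, y) ∧ agent(e, x) ∧ theme(e, y) ∧ Person(x) ∧ Beverage(y)
```

Adding a further role (time, location, ...) only adds a dictionary entry, which is the practical benefit of the multigrade, reified form.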

Page 3: Framester and WFD

How to detect frames?

• From:

• Linguistic structures (Valentina prepared a barley coffee)

• Relational tables, RDF datasets, OWL classes (:BarleyCoffeePreparation :hasCook :Valentina ; :hasMaterial :myOrganicBarley .)

• XML stylesheets, templates, Web pages

• JSON microdata, infoboxes

• Requirements

• Using:

• Words (evocation: Valentina, prepare, barley, coffee)

• Word Senses, Synsets, Classes, Properties (predicates as unary or binary projections of frames: Person, Activity, Cereal, Drink, agent, theme, ingredient)

• Entities (individuals: occurrences of unary projections: Valentina)

• Facts (assertions: occurrences of binary projections: prepares(Valentina, barley coffee))

Page 4: Framester and WFD

%%%  _______________________   ____________
%%% |x0                     | |x1 x2 x3    |
%%% |.......................| |............|
%%% (|named(x0,valentina,per)|A|prepare(x3) |)
%%% |_______________________| |barley(x2)  |
%%%                           |nn(x2,x1)   |
%%%                           |coffee(x1)  |
%%%                           |Agent(x3,x0)|
%%%                           |Theme(x3,x1)|
%%%                           |____________|

FRED+VerbNet+NER

FRED+FrameNet+NER

FRED-FrameNet+NER+UKB/WordNet

BoxerARK+Semafor

Page 5: Framester and WFD

%%%  _______________________   ____________
%%% |x0                     | |x1 x2 x3    |
%%% |.......................| |............|
%%% (|named(x0,valentina,per)|A|prepare(x3) |)
%%% |_______________________| |barley(x2)  |
%%%                           |nn(x2,x1)   |
%%%                           |coffee(x1)  |
%%%                           |Agent(x3,x0)|
%%%                           |Theme(x3,x1)|
%%%                           |____________|

framester:Food

fschema:unaryProjectionOf

frole:agent

frole:product

fschema:subsumedUnder

fschema:subsumedUnder

fschema:subsumedUnder

Page 6: Framester and WFD

Framester: A semiotic hub for knowledge graph interoperability

?

Page 7: Framester and WFD

Originally, not many RDF datasets linked in the word-lexicon-data space

Dataset nodes:

• blue: role-oriented lexical resources

• purple: emotion-oriented lexical resources

• red: fact-oriented data

• green: wordnet-like lexical resources

• yellow: ontology schemas

• grey: topic models

• dotted line: existing RDF data

• continuous line: newly created RDF data

Arrows:

• orange: Framester links

• black dotted: previous links

Page 8: Framester and WFD

We added more RDF datasets linked in the word-lexicon-data space

Dataset nodes:

• blue: role-oriented lexical resources

• purple: emotion-oriented lexical resources

• red: fact-oriented data

• green: wordnet-like lexical resources

• yellow: ontology schemas

• grey: topic models

• dotted line: existing RDF data

• continuous line: newly created RDF data

Arrows:

• orange: Framester links

• black dotted: previous links

Page 9: Framester and WFD

We added many new links, thus creating a new formal resource in the word-lexicon-data space

Dataset nodes:

• blue: role-oriented lexical resources

• purple: emotion-oriented lexical resources

• orange: fact-oriented data

• green: wordnet-like lexical resources

• yellow: ontology schemas

• grey: topic models

• dotted line: existing RDF data

• continuous line: newly created RDF data

Arrows:

• orange: Framester links

• black dotted: previous links

Page 10: Framester and WFD

Sample triples

• (wn30) wn30instances:synset-anti-G_suit-noun-1 wn30schema:containsWordSense wn30instances:wordsense-anti-G_suit-noun-1 , wn30instances:wordsense-G_suit-noun-1 ; wn30schema:gloss "worn by fliers and astronauts to counteract the forces of gravity and acceleration" .

• (own) wn30instances:synset-anti-G_suit-noun-1 own2dul:proxhyp wn30instances:synset-pressure_suit-noun-1 ; own2dul:hyp wn30instances:synset-consumer_goods-noun-1 ; own2dul:d0 dul:PhysicalObject .

• (framester) wn30instances:synset-anti-G_suit-noun-1 a fschema:SynsetFrame ; fschema:unaryProjectionOf frame:Clothing , frame:Artifact , frame:Wearing , frame:Dressing .
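To make the sample triples concrete, here is a minimal sketch that holds some of them as plain Python tuples and answers one pattern query: which frames is the anti-G suit synset a unary projection of? The `match` helper is hypothetical and stands in for a real triple store or SPARQL engine; prefixed names are kept as plain strings rather than expanded to full IRIs.

```python
SYNSET = "wn30instances:synset-anti-G_suit-noun-1"

# A subset of the slide's triples, written as (subject, predicate, object) tuples
TRIPLES = [
    (SYNSET, "wn30schema:containsWordSense", "wn30instances:wordsense-anti-G_suit-noun-1"),
    (SYNSET, "wn30schema:containsWordSense", "wn30instances:wordsense-G_suit-noun-1"),
    (SYNSET, "own2dul:proxhyp", "wn30instances:synset-pressure_suit-noun-1"),
    (SYNSET, "own2dul:d0", "dul:PhysicalObject"),
    (SYNSET, "rdf:type", "fschema:SynsetFrame"),
    (SYNSET, "fschema:unaryProjectionOf", "frame:Clothing"),
    (SYNSET, "fschema:unaryProjectionOf", "frame:Artifact"),
    (SYNSET, "fschema:unaryProjectionOf", "frame:Wearing"),
    (SYNSET, "fschema:unaryProjectionOf", "frame:Dressing"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


frames = [o for _, _, o in match(TRIPLES, s=SYNSET, p="fschema:unaryProjectionOf")]
print(frames)
# ['frame:Clothing', 'frame:Artifact', 'frame:Wearing', 'frame:Dressing']
```

The same pattern, with the subject left open and a frame as the object, would list every synset that evokes a given frame.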

Page 11: Framester and WFD

Extending WN-FN mappings

Page 12: Framester and WFD

BabelNet2Framester

• bn:s00004603n lemon:isReferenceOf bn:G_suit_EN/s00004603n .

• bn:G_suit_EN/s00004603n owl:sameAs wn30instances:wordsense-G_suit-noun-1 .

• bn:s00004603n fschema:isUnaryProjectionOf frame:Clothing , … , … .

Page 13: Framester and WFD

DeepKnowNet to Framester

Page 14: Framester and WFD

DBpedia to Framester

• dbr:John_Holmes_(actor) a wn30instances:synset-actor-noun-1 .

• dbr:John_Holmes_(actor) fschema:hasRoleIn frame:Performers .

• dbr:John_Holmes_(rugby_league) a wn30instances:synset-player-noun-1 .

• dbr:John_Holmes_(rugby_league) fschema:hasRoleIn frame:Competition , frame:Participation .
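One way to read the triples above is as the result of a simple rule: if an entity is typed by a synset, and that synset is a unary projection of a frame, then the entity has a role in that frame. The sketch below applies this rule to the two entities on the slide. The synset-to-frame table is an assumption read off the slide's `fschema:hasRoleIn` results, not a mapping quoted from Framester, and the function name `infer_roles` is hypothetical.

```python
# Entity typing, as given on the slide
TYPE_OF = {
    "dbr:John_Holmes_(actor)": "wn30instances:synset-actor-noun-1",
    "dbr:John_Holmes_(rugby_league)": "wn30instances:synset-player-noun-1",
}

# ASSUMPTION: synset -> frame links, inferred from the slide's hasRoleIn triples
UNARY_PROJECTION_OF = {
    "wn30instances:synset-actor-noun-1": ["frame:Performers"],
    "wn30instances:synset-player-noun-1": ["frame:Competition", "frame:Participation"],
}


def infer_roles(type_of, projection_of):
    """Derive (entity, fschema:hasRoleIn, frame) triples from entity types."""
    return [
        (entity, "fschema:hasRoleIn", frame)
        for entity, synset in type_of.items()
        for frame in projection_of.get(synset, [])
    ]


for triple in infer_roles(TYPE_OF, UNARY_PROJECTION_OF):
    print(*triple)
```

Entities whose type has no known frame projection simply yield no triples, so the rule only ever adds annotations.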

Page 15: Framester and WFD

Emo to Framester: SWN

• wn30instances:synset-anti-G_suit-noun-1 swn:negScore "0" ; swn:posScore "0" .

• wn30instances:synset-coffee_fungus-noun-1 swn:negScore "0.375" ; swn:posScore "0" .
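The scores above are stored as string literals, so a consumer has to convert them before use. The sketch below does that and also derives an objectivity score as 1 - (posScore + negScore), which is the SentiWordNet convention and is not stated on the slide. The function name `objectivity` and the dictionary layout are hypothetical.

```python
# Scores copied from the slide, kept as strings as in the RDF literals
SCORES = {
    "wn30instances:synset-anti-G_suit-noun-1": {"pos": "0", "neg": "0"},
    "wn30instances:synset-coffee_fungus-noun-1": {"pos": "0", "neg": "0.375"},
}


def objectivity(pos_score: str, neg_score: str) -> float:
    """Objectivity following the SentiWordNet convention: 1 - (pos + neg)."""
    return 1.0 - (float(pos_score) + float(neg_score))


for synset, score in SCORES.items():
    print(synset, objectivity(score["pos"], score["neg"]))
```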

Page 16: Framester and WFD

Framester semantics 1/3

• A frame is defined as a multigrade predicate 𝜙(e,x1, ..., xn), where 𝜙 is a first-order relation, e is a (neo-Davidsonian) variable for any eventuality or state of affairs described by the frame, and xi is a variable for any argument place. Interpretation of predicates is made on a domain Δ^I of

• D&S-style Punning

• 𝜙^I ⊆ dands:Situation^I

• 𝜙 ∈ fschema:Frame^I (⊆ dands:Description^I)

• Actual frame occurrences

• s ∈ fschema:Situation^I , 𝜙^I

Page 17: Framester and WFD

Framester semantics 2/3

• Projections

• A semantic role is an internal binary projection rol(e,xi) of a frame 𝜙, so that rol(e,xi) → 𝜙(e,x1, …,xn), with 1 ≤ i ≤ n

• A co-participation relation is an external binary projection cop(xj,xk) of a frame 𝜙, so that cop(xj,xk) → 𝜙(e,x1, …,xn), with 1 ≤ j ≤ n and 1 ≤ k ≤ n

• A selectional restriction or semantic type is a unary projection typ(xm) of a frame 𝜙, so that typ(xm) → 𝜙(e,x1, …,xn), with 1 ≤ m ≤ n
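The three kinds of projection can be illustrated by computing them from a single frame occurrence. The sketch below is a toy reading of the definitions above, not Framester code: the function `projections` is hypothetical, the generic label `cop` stands for any co-participation relation, and pairs of arguments are treated as unordered, which is a simplifying assumption.

```python
from itertools import combinations


def projections(event, roles, types):
    """Split a frame occurrence into its internal, external and unary projections.

    roles maps a role name to an argument; types maps an argument to a type.
    """
    # Internal binary projections: rol(e, x_i), always anchored on the event
    internal = [(role, event, arg) for role, arg in roles.items()]
    # External binary projections: cop(x_j, x_k), relating two participants directly
    external = [("cop", a, b) for a, b in combinations(roles.values(), 2)]
    # Unary projections: typ(x_m), the semantic type of a participant
    unary = [(types[arg], arg) for arg in roles.values() if arg in types]
    return internal, external, unary


internal, external, unary = projections(
    "e",
    roles={"agent": "x", "theme": "y"},
    types={"x": "Person", "y": "Beverage"},
)
print(internal)  # [('agent', 'e', 'x'), ('theme', 'e', 'y')]
print(external)  # [('cop', 'x', 'y')]
print(unary)     # [('Person', 'x'), ('Beverage', 'y')]
```

Internal projections keep the event variable, while external ones drop it; this is why an ordinary linked data property between two entities can be read as an external projection of some frame.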

Page 18: Framester and WFD

Framester semantics 3/3

• Individuals and words

• A (non-situational) individual entity ent has a role in a possible occurrence of a frame 𝜙 when ent ∈ typ^I, i.e. when it is an instance of a type compatible (or coerced) as a unary projection of 𝜙

• An individual tuple is a possible occurrence of a frame 𝜙 when <x,y> ∈ rol^I, or <x,y> ∈ cop^I, i.e. when it is an instance of a property compatible (or coerced) as a binary projection of 𝜙

• A word is an evocation of a frame 𝜙 when it can be disambiguated to a frame or one of its projections

Page 19: Framester and WFD

Consequences

• WordNet synsets are unary projections of frames (synset-based frames)

• WordNet word senses are unary projections of lexical units (sense-based frames)

• WordNet “tropes” are binary projections of implicit synset-based frames

• VerbNet verb (sub-)classes are frames

• VerbNet verb class members are sense-based frames

• LD properties are binary projections of frames (either internal or external)

• LD classes are either (candidate) frames or unary projections of frames

• LD regular individuals are instances of unary projections of frames (role players in an external data frame)

• LD qua-individuals (e.g. DBpedia career stations) are instances of unary projections of a specific frame

• LD assertions are instances of binary projections of (?external) frames

Page 20: Framester and WFD

Achievements

• more than 40 million triples, including new LOD versions of many linguistic and factual resources, links among them, and links to Framester

• formal schema interoperability across datasets

• full revision of WordNet-FrameNet mappings

• large extension of frame coverage

• frame annotations for any kind of entity

• full mapping of local (frame-dependent), and global roles from multiple resources

• new semantic role taxonomy from localised roles way up to abstract roles and dependencies

• alignment of frames, roles and types to foundational ontologies

• new frame relations discovered based on mappings and inferences

• Word Frame Disambiguation service

Page 21: Framester and WFD

Consequent issues

• Many wrong mappings e.g. in FrameBase-WordNet

• Many inaccurate subsumptions and cycles in FrameNet frame elements because of heterogeneous inheritance/causal semantics

• Other mixed errors in FrameNet, e.g. when composing formal assumptions from frame/role taxonomies

• Errors in stand-off WordNet files (especially in the teleological and derivational morpho-semantic datasets)

• …

Page 22: Framester and WFD

framester:Clothing

frole:agent, frole:manner, frole:material,…

wn:synset-anti-G_suit-noun-1

fschema:unaryProjectionOf

fschema:binaryProjectionOf

dul:PhysicalObject

rdfs:subClassOf

framestersyn:anti-G_suit.n.1

fe:Wearer, fe:Style, fe:Material …

fschema:subsumedUnder

fschema:subsumedUnder

dbr:G-suit

owl:sameAs

bn:s00004603n

owl:sameAs

Page 23: Framester and WFD

Links

• Framester GitHub page

• https://github.com/framester/Framester/wiki/Framester-Documentation

• Endpoint

• http://etna.istc.cnr.it/framester/sparql

• WFD

• http://lipn.univ-paris13.fr/framester/en/wfd/

Page 24: Framester and WFD
Page 25: Framester and WFD

R&D

• Word Frame Disambiguation

• Frame vectors and frame topic models (frame2vec for deep learning)

• OKE extensions (cf. FRED)

• Frame clustering and complex frame discovery

• Sentence frame fingerprinting (valence patterns)

• Automated matching between semantic roles

• Automated matching between roles and LOD properties

• Overlap matching between frames and LOD classes

• Assisted eXtreme Design (ODP semantic search)

• …

(✔)(✔)(✔)

Page 26: Framester and WFD

Conclusions

• A new large resource in LOD, linking linguistic and factual knowledge with a frame-oriented semantics, expressible in OWL

• Evaluation on frame detection shows increased recall and state-of-the-art precision

• Many research themes open up by applying links and shared semantics: valence patterns, clustering, embeddings, interoperability

Page 27: Framester and WFD

Related publications: Framester and frame semantics

• A. Gangemi, M. Alam, L. Asprino, V. Presutti, D.R. Recupero. 2016. Framester: A Wide Coverage Linguistic Linked Data Hub. EKAW

• Aldo Gangemi, 2010. What’s in a Schema?, Ontology and the Lexicon, Cambridge University Press

• Charles J Fillmore. 1976. Frame semantics and the nature of language. Annals of the New York Academy of Sciences

Page 28: Framester and WFD

Related publications: Linguistic resources

• Maddalen Lopez de Lacalle, Egoitz Laparra, and German Rigau. 2014. Predicate Matrix: extending SemLink through WordNet mappings. LREC

• Antoine Zimmermann, Christophe Gravier, Julien Subercaze, and Quentin Cruzille. 2013. Nell2RDF: Read the Web, and Turn it into RDF. KNOW@LOD, CEUR

• Montse Cuadros, Lluís Padró, German Rigau. 2012. Highlighting relevant concepts from topic signatures. LREC

• Roberto Navigli and Simone Paolo Ponzetto. 2012. BabelNet: The Automatic Construction, Evaluation and Application of a Wide-Coverage Multi-lingual Semantic Network. Artificial Intelligence

• Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R Hruschka Jr, and Tom M Mitchell. 2010. Toward an architecture for never-ending language learning. AAAI

• Martha Palmer. 2009. Semlink: Linking PropBank, VerbNet and FrameNet. GenLex-09

• Karin Kipper Schuler. 2005. Verbnet: A Broad-coverage, Comprehensive Verb Lexicon. Ph.D. thesis

• Christiane Fellbaum, editor. 1998. WordNet: an electronic lexical database, MIT Press

• Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet Project. COLING

Page 29: Framester and WFD

Related publications: Linked data resources

• Linguistic linked data resources

• Antoine Zimmermann, Christophe Gravier, Julien Subercaze, and Quentin Cruzille. 2013. Nell2RDF: Read the Web, and Turn it into RDF. KNOW@LOD

• Andrea Giovanni Nuzzolese, Aldo Gangemi, and Valentina Presutti. 2011. Gathering lexical linked data and knowledge patterns from FrameNet. KCAP

• Mark Van Assem, Aldo Gangemi, and Guus Schreiber. 2006. Conversion of WordNet to a standard RDF/OWL representation. LREC

• Aldo Gangemi, Roberto Navigli, and Paola Velardi. 2003. The OntoWordNet project: Extension and axiomatization of conceptual relations in Wordnet. ODBASE

• Factual linked data resources

• Jens Lehmann, Chris Bizer, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, and Sebastian Hellmann. 2009. DBpedia - A Crystallization Point for the Web of Data. Journal of Web Semantics

• Johannes Hoffart, Fabian M Suchanek, Klaus Berberich, and Gerhard Weikum. 2013. Yago2: A spatially and temporally enhanced knowledge base from wikipedia. Artificial Intelligence

Page 30: Framester and WFD

Related publications: Tools

• Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recupero, Andrea Giovanni Nuzzolese, Francesco Draicchio, and Misael Mongiovi. 2016. Semantic Web Machine Reading with FRED. Semantic Web

• Dipanjan Das, Desai Chen, André F. T. Martins, Nathan Schneider, and Noah A. Smith. 2014. Frame-semantic parsing. Computational Linguistics