![Page 1: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/1.jpg)
FINET: Context-Aware Fine-Grained Named Entity Typing
Luciano Del Corro*, Abdalghani Abujabal*,
Rainer Gemulla†, and Gerhard Weikum*
Max-Planck-Institute for Informatics*
University of Mannheim†
![Page 2: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/2.jpg)
Named Entity Typing
The task of detecting type(s) of named
entities in a given context with respect
to a type system (e.g., WordNet)
“Page plays his guitar on the stage” → guitarist
![Page 3: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/3.jpg)
FINET: a system
• for detecting fine-grained types
• in short inputs (e.g., sentences or
tweets)
• in a given context
• with respect to WordNet
![Page 9: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/9.jpg)
Context-Aware Typing
• explicit: “Steinmeier, the German Foreign Minister, ..” → foreign minister
• almost explicit: “Messi plays soccer” → soccer player
• implicit: “Pavano never even made it to the mound” → baseball player
![Page 12: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/12.jpg)
Applications
• KB Construction
• find types for existing entities
• Named Entity Disambiguation
• “Page played amazingly on the stage”: Businessman or Musician?
![Page 13: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/13.jpg)
Applications
• KB Construction
• find types for existing entities
• Named Entity Disambiguation
• “Page played amazingly on the stage”
• Semantic Search
• “Give me all documents that talk about musicians”
![Page 14: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/14.jpg)
Supervised Approaches
• Manually labeled data is scarce
• thousands of types, need sufficient
training data for every type
![Page 18: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/18.jpg)
Distantly Supervised Approaches
• Idea: automatically generated data via KB (e.g., Wikipedia)
• Problem: types are context-oblivious
Types: mayor, politician, boxer
“Klitschko is the mayor of Kiev”
“Klitschko is known for his powerful punches”
![Page 21: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/21.jpg)
FINET
• Unsupervised
• Most extractors are unsupervised
• Context-aware
• “Klitschko is the mayor of Kiev” → politician, mayor
• Super fine-grained
• WordNet as typing system (16K types; per, loc, org)
![Page 22: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/22.jpg)
FINET Overview
1. Preprocessing
2. Candidate Generation
   1. Pattern-based extractor [very explicit]
   2. Mention-based extractor [explicit]
   3. Verb-based extractor [almost explicit]
   4. Corpus-based extractor [implicit]
3. Type Selection (via WSD)
![Page 23: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/23.jpg)
Flow: Extractor → Stopping condition met?
• Yes → Type Selection
• No → Subsequent Extractor
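The flow above can be read as a cascade over extractors. The following is a minimal sketch under stated assumptions: each extractor is modeled as a function returning candidate types, each stopping condition as a predicate, and candidates are accumulated across stages. The names `run_cascade`, `toy_pattern_extractor`, `toy_verb_extractor`, and `at_least_one_type` are hypothetical and not taken from FINET itself.

```python
from typing import Callable, List, Tuple

Extractor = Callable[[str, str], List[str]]        # (sentence, mention) -> candidate types
StopCondition = Callable[[str, List[str]], bool]   # (mention, candidates) -> stop?


def run_cascade(sentence: str, mention: str,
                stages: List[Tuple[Extractor, StopCondition]]) -> List[str]:
    """Run extractors from most to least explicit; stop once a condition is met."""
    candidates: List[str] = []
    for extract, should_stop in stages:
        for candidate in extract(sentence, mention):
            if candidate not in candidates:
                candidates.append(candidate)
        if should_stop(mention, candidates):
            break
    return candidates  # these would be handed to type selection


# Toy stand-ins (hypothetical), far simpler than real extractors.
def toy_pattern_extractor(sentence: str, mention: str) -> List[str]:
    marker = mention + ", the "
    if marker not in sentence:
        return []
    words = sentence.split(marker, 1)[1].split()
    return [words[0].strip(",.")] if words else []


def toy_verb_extractor(sentence: str, mention: str) -> List[str]:
    return ["player"] if mention + " plays" in sentence else []


def at_least_one_type(mention: str, candidates: List[str]) -> bool:
    return len(candidates) > 0


stages = [(toy_pattern_extractor, at_least_one_type),
          (toy_verb_extractor, at_least_one_type)]
print(run_cascade("Messi plays soccer", "Messi", stages))
```

Keeping the stopping condition separate from the extractor lets each stage use its own rule, for example "at least one type" for one stage and a KB lookup for another.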
![Page 24: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/24.jpg)
Preprocessing
“Albert Einstein discovered the law of the photoelectric effect and he won the Nobel Prize in 1921”
• Identify clauses
• Some extractors operate on clause level (clauses capture local context)
• Identify coarse-grained types [Stanford NER]
• FINET restricts its candidates to hyponyms of the coarse-grained type
• Well studied task: high prec. and recall
• “Albert Einstein”: PER
• Coreference resolution
• (“Albert Einstein”, “he”)
![Page 31: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/31.jpg)
Pattern-based Extractor [final patterns]
targets very explicit types
Pattern: NAMED ENTITY , (modifier) NOUN (modifier), where the NOUN is an apposition (appos) of the named entity and the modifiers (mod) attach to the NOUN
• “Barack Obama, the president of […]”
• [“Barack Obama”; president-1, president-2, ..]
Stopping Condition: produce at least one type
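As a rough illustration of the apposition pattern, the sketch below swaps the dependency-based pattern for a plain regular expression and expands the matched noun into numbered senses. The regex, the helper names, and the sense counts in `TOY_SENSE_COUNTS` are all assumptions made for this example; a real system would rely on a parser and on WordNet.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical sense inventory: noun phrase -> number of senses.
TOY_SENSE_COUNTS: Dict[str, int] = {"president": 2, "foreign minister": 1}

# Regex stand-in for: NAMED ENTITY , (modifier) NOUN (modifier)
APPOSITION = re.compile(
    r"(?P<entity>[A-Z][\w.]*(?: [A-Z][\w.]*)*), "
    r"(?:the |a |an )?(?P<np>[A-Za-z ]+?)(?= of\b|,|\.|$)"
)


def longest_known_suffix(noun_phrase: str) -> Optional[str]:
    """Drop leading modifiers until the remaining phrase is a known noun."""
    words = noun_phrase.lower().split()
    for start in range(len(words)):
        candidate = " ".join(words[start:])
        if candidate in TOY_SENSE_COUNTS:
            return candidate
    return None


def pattern_candidates(sentence: str) -> List[Tuple[str, List[str]]]:
    """Return (entity, candidate senses) pairs for each apposition found."""
    results = []
    for match in APPOSITION.finditer(sentence):
        noun = longest_known_suffix(match.group("np"))
        if noun is not None:
            count = TOY_SENSE_COUNTS[noun]
            senses = [f"{noun}-{i}" for i in range(1, count + 1)]
            results.append((match.group("entity"), senses))
    return results


print(pattern_candidates("Barack Obama, the president of the US"))
```

All senses of the noun are kept as candidates at this point; choosing among them is left to the later type selection step.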
![Page 33: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/33.jpg)
Pattern-based Extractor [non-final patterns]
• “Shakespeare’s productions”
• Poss. + transf.: production →(DER) produce →(DER) producer
• [“Shakespeare”; producer-1, producer-2, ..]
Stopping Condition: KB lookup
• Shakespeare, writer-1
• Shakespeare, producer-2
![Page 36: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/36.jpg)
Mention-based Extractor
Stopping Condition: KB lookup
• “Imperial College London”
• [“Imperial College London”; college-1,
college-2, ..]
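A small sketch of the idea on this slide: a type noun that appears inside the mention itself is expanded into candidate senses. The inventory `TOY_NOUN_SENSES` and its sense counts are hypothetical; the KB lookup used as stopping condition is not modeled here.

```python
from typing import Dict, List

# Hypothetical inventory: type noun -> number of senses.
TOY_NOUN_SENSES: Dict[str, int] = {"college": 2, "university": 3, "bank": 2}


def mention_candidates(mention: str) -> List[str]:
    """Collect all senses of every known type noun found in the mention."""
    candidates: List[str] = []
    for token in mention.lower().split():
        for sense_index in range(1, TOY_NOUN_SENSES.get(token, 0) + 1):
            candidates.append(f"{token}-{sense_index}")
    return candidates


print(mention_candidates("Imperial College London"))
```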
![Page 38: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/38.jpg)
Verb-based Extractor
• Nominalization
• “play” (verb) → “player” (deverbal noun)
• Verb-argument semantic concordance
![Page 43: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/43.jpg)
Example 1: Suffixes
• “Messi plays in Barcelona”
• play → player (suffix “-er”)
• Verb senses play-1, play-2, play-3, … are linked via DER to player-1 (player), player-2 (musician), player-3 (actor), player-4 (participant)
• [“Messi”; player, musician, actor, ..]
Stopping Condition: KB lookup
![Page 44: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/44.jpg)
Example 2: Synonyms
• “John committed a crime”
• commit →(syn) perpetrate →(DER) perpetrator
• [“John”; perpetrator-1]
Stopping Condition: KB lookup
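Both examples (suffixes and synonyms) follow the same lookup shape: start from the verb, optionally hop to a synonym, then follow derivational links to deverbal nouns. Below is a minimal sketch with hand-written tables; `TOY_SYNONYMS` and `TOY_DERIVED_NOUNS` are hypothetical stand-ins for the synonym and DER links of a lexical resource such as WordNet.

```python
from typing import Dict, List

# Hypothetical stand-ins for synonym and derivational (DER) links.
TOY_SYNONYMS: Dict[str, List[str]] = {"commit": ["perpetrate"]}
TOY_DERIVED_NOUNS: Dict[str, List[str]] = {
    "play": ["player-1", "player-2", "player-3", "player-4"],
    "perpetrate": ["perpetrator-1"],
}


def verb_candidates(verb_lemma: str) -> List[str]:
    """Deverbal-noun senses reachable from the verb or one of its synonyms."""
    lemmas = [verb_lemma] + TOY_SYNONYMS.get(verb_lemma, [])
    candidates: List[str] = []
    for lemma in lemmas:
        for sense in TOY_DERIVED_NOUNS.get(lemma, []):
            if sense not in candidates:
                candidates.append(sense)
    return candidates


print(verb_candidates("play"))
print(verb_candidates("commit"))
```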
![Page 46: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/46.jpg)
Corpus-based Extractor
• “Messi” and “Cristiano Ronaldo” occur in similar contexts: sport (soccer)
• Key idea: Collect types of similar entities
via KB
Distributional hypothesis:
similar entities tend to occur in similar context
![Page 47: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/47.jpg)
Word2Vec
• Word vectors represent semantic contexts for a given phrase
• Given a set of phrases, return the k most similar phrases with respect to context
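The "k most similar phrases" query can be sketched with cosine similarity over phrase vectors. This example makes several assumptions: the three-dimensional vectors in `TOY_VECTORS` are invented for illustration (they are not trained Word2Vec embeddings), and the query set is combined by averaging its vectors, which is one common choice rather than a confirmed detail of FINET.

```python
import math
from typing import Dict, List, Tuple

# Invented toy vectors; real embeddings would come from a trained model.
TOY_VECTORS: Dict[str, List[float]] = {
    "Maradona": [0.9, 0.1, 0.3],
    "Parreira": [0.8, 0.2, 0.4],
    "Dunga": [0.85, 0.15, 0.35],
    "South Africa": [0.3, 0.9, 0.2],
    "guitar": [0.0, 0.1, 0.9],
}


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(query: List[str], k: int,
                 vectors: Dict[str, List[float]] = TOY_VECTORS) -> List[Tuple[str, float]]:
    """Return the k phrases closest to the averaged query vector."""
    dims = len(next(iter(vectors.values())))
    centroid = [sum(vectors[q][d] for q in query) / len(query) for d in range(dims)]
    scored = [(phrase, cosine(centroid, vec))
              for phrase, vec in vectors.items() if phrase not in query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


for phrase, score in most_similar(["Maradona", "South Africa"], k=2):
    print(phrase, round(score, 3))
```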
![Page 49: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/49.jpg)
“Maradona expects to win in South Africa”
query: {“Maradona”, “South Africa”}
Related sentences:
“Parreira coached Brazil in South Africa”
“Dunga replaced Parreira after South Africa”

| Mention | Type |
|---|---|
| “Diego Maradona” | <coach-1>, .. |
| “Parreira” | <coach-1>, .. |
| “Carlos Alberto Parreira” | <coach-1>, .. |
| “Dunga” | <coach-1>, .. |
| … | |

Stopping Condition: sufficient evidence for types
![Page 51: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/51.jpg)
Type Selection via
Word Sense Disambiguation
• Given an entity and a set of candidate
types
• [“Maradona”; soccer_player-1,
football_player-1, coach-1, …]
• Select the best types according to
context
![Page 52: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/52.jpg)
Entity Context for WSD
• Entity-oblivious context
• all words in an input sentence
• Entity-specific context via lexical
expansions
• entity-related words from word vectors
![Page 53: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/53.jpg)
Type Selection via WSD
Naive Bayes trained with word features on WN glosses
and labeled data (if available) [ExtendedLesk].
“Maradona expects to win in South Africa”
Entity-oblivious context:
“expects”, “win”, “South Africa”
Entity-specific context:
“coach”, “cup”, “striker”, “mid-fielder”, and “captain”
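To make the selection step concrete, here is a simplified Naive Bayes sketch with add-one smoothing over word features. It is not the system's actual classifier: the training texts in `TOY_TRAINING` are invented glosses and example sentences (not WordNet's), the prior is uniform, and the context is simply the union of the entity-oblivious and entity-specific words listed above.

```python
import math
from collections import Counter
from typing import Dict, List

# Invented training texts per sense: a gloss plus one labeled example each.
TOY_TRAINING: Dict[str, List[str]] = {
    "coach-1": ["a person in charge of training an athlete or a team",
                "the coach picked the captain and the striker for the cup"],
    "coach-2": ["a carriage pulled by horses with a driver",
                "the coach and horses waited at the inn"],
}


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def select_type(candidates: List[str], context: List[str],
                training: Dict[str, List[str]] = TOY_TRAINING) -> str:
    """Pick the candidate sense with the highest smoothed log-likelihood."""
    vocab = {w for texts in training.values() for t in texts for w in tokenize(t)}
    vocab.update(context)
    best_sense, best_score = candidates[0], float("-inf")
    for sense in candidates:
        counts = Counter(w for t in training.get(sense, []) for w in tokenize(t))
        total = sum(counts.values())
        score = -math.log(len(candidates))  # uniform prior over candidates
        for word in context:
            score += math.log((counts[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


context = ["expects", "win", "south", "africa",            # entity-oblivious
           "coach", "cup", "striker", "mid-fielder", "captain"]  # entity-specific
print(select_type(["coach-1", "coach-2"], context))
```

In this toy setup the entity-specific words carry most of the signal, which mirrors the motivation for lexical expansion: the sentence alone ("expects", "win") says little about which sense of a type applies.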
![Page 54: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/54.jpg)
Experiments
• Datasets
• 500 random sentences from NYT year 2007
• 500 random sentences from CoNLL
• 100 random tweets
![Page 55: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/55.jpg)
Type Granularity
• CG (coarse-grained): artifact, event, person, location, organization
• FG (fine-grained): ~200 prominent WN types
• SFG (super fine-grained): all remaining WN types
![Page 56: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/56.jpg)
| System | Type System | Total Types | Top Categories |
|---|---|---|---|
| FINET | WN | 16K+ | pers, org, loc |
| HYENA | WN | 505 | all |
![Page 57: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/57.jpg)
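Results on NYT dataset

| System | CG: P | CG: Correct Types | FG: P | FG: Correct Types | SFG: P | SFG: Correct Types |
|---|---|---|---|---|---|---|
| FINET | 87.90 | 872 | 72.42 | 457 | 70.82 | 233 |
| FINET (w/o l.) | 87.90 | 872 | 71.13 | 436 | 67.11 | 204 |
| HYENA | 72.40 | 779 | 28.26 | 522 | 20.65 | 160 |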
![Page 60: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/60.jpg)
Conclusion
• FINET
• A system for detecting types of named entities
• Context-aware
• Unsupervised (mostly)
• Very fine-grained typing system
![Page 61: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/61.jpg)
Mapping CG types to WN
• persons: all descendants of person-1, imaginary being-1, characterization-3, and operator-2 (10584 in total);
• locations: all descendants of location-1, way-1, and landmass-1 (3681 in total);
• organizations: all descendants of organization-1 and social group-1 (1968 in total).
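The mapping above amounts to a descendant check in the hypernym hierarchy. The sketch below uses the root synsets from the slide but a hand-written, single-parent hierarchy fragment (`TOY_HYPERNYM`) that is purely hypothetical; real WordNet synsets can have several hypernyms.

```python
from typing import Dict, List, Set

# Root synsets per coarse-grained type, as listed on the slide.
# "imaginary being-1" is read as a single synset here.
CG_ROOTS: Dict[str, Set[str]] = {
    "PER": {"person-1", "imaginary being-1", "characterization-3", "operator-2"},
    "LOC": {"location-1", "way-1", "landmass-1"},
    "ORG": {"organization-1", "social group-1"},
}

# Hypothetical hierarchy fragment (child -> parent), single parent for brevity.
TOY_HYPERNYM: Dict[str, str] = {
    "coach-1": "trainer-1",
    "trainer-1": "leader-1",
    "leader-1": "person-1",
    "coach-2": "carriage-1",
    "carriage-1": "vehicle-1",
    "college-1": "educational institution-1",
    "educational institution-1": "institution-1",
    "institution-1": "organization-1",
}


def is_compatible(sense: str, cg_type: str) -> bool:
    """True if the sense descends from one of the roots of the CG type."""
    seen: Set[str] = set()
    current = TOY_HYPERNYM.get(sense)
    while current is not None and current not in seen:
        if current in CG_ROOTS[cg_type]:
            return True
        seen.add(current)
        current = TOY_HYPERNYM.get(current)
    return False


def filter_candidates(candidates: List[str], cg_type: str) -> List[str]:
    return [sense for sense in candidates if is_compatible(sense, cg_type)]


print(filter_candidates(["coach-1", "coach-2"], "PER"))
```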
![Page 62: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/62.jpg)
Verb-based Extractor
• “Messi plays soccer”
• “Messi” is a subject
• “soccer” is direct object
• Add “soccer” as a noun modifier to
the deverbal noun
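A tiny sketch of the modifier step: the direct object is prepended to the deverbal noun, and the compound is kept only if it is a known type. The type set `TOY_TYPES` and the fallback to the bare noun are assumptions for illustration.

```python
from typing import Set

# Hypothetical set of known type names.
TOY_TYPES: Set[str] = {"player", "soccer player", "guitar player"}


def with_object_modifier(deverbal_noun: str, direct_object: str,
                         known_types: Set[str] = TOY_TYPES) -> str:
    """Prefer '<object> <noun>' when it is a known type, else the bare noun."""
    compound = f"{direct_object} {deverbal_noun}"
    return compound if compound in known_types else deverbal_noun


print(with_object_modifier("player", "soccer"))
```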
![Page 63: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/63.jpg)
Verb-based Extractor
• Utilize a corpus of frequent (verb,
type) pairs
• “Messi was treated in the hospital”
• [“Messi”; patient-1]
![Page 67: FINET - uni-mannheim.de · FINET Context-Aware Fine-Grained Named Entity Typing Luciano Del Corro*, Abdalghani Abujabal*, Rainer Gemulla†, and Gerhard Weikum* Max-Planck-Institute](https://reader035.vdocument.in/reader035/viewer/2022071020/5fd4c9acbc09402f3a08894d/html5/thumbnails/67.jpg)
Corpus-based Extractor
• Retrieve the 100 most related phrases along with similarity scores
• Filter out non-entity phrases and entities not compatible with the CG type
• Traverse the result list until we collect 50% of the total score
• If no more than 10 different types were added, add the types as candidates
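The steps above can be sketched as a single function. Several details are assumptions: the "total score" is computed after filtering, the related phrases arrive sorted by descending similarity, and the KB is a plain dictionary from entity to types. The names `corpus_candidates`, `kb_types`, and `cg_ok` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def corpus_candidates(related: List[Tuple[str, float]],
                      kb_types: Dict[str, List[str]],
                      cg_ok: Callable[[str], bool],
                      top_n: int = 100,
                      score_fraction: float = 0.5,
                      max_types: int = 10) -> List[str]:
    """Collect KB types of similar entities until enough score mass is covered."""
    # Steps 1 and 2: keep the top phrases that are KB entities with a compatible CG type.
    kept = [(phrase, score) for phrase, score in related[:top_n]
            if phrase in kb_types and cg_ok(phrase)]
    total = sum(score for _, score in kept)

    # Step 3: traverse until the collected score reaches the chosen fraction.
    collected = 0.0
    types: List[str] = []
    for phrase, score in kept:
        if collected >= score_fraction * total:
            break
        for type_name in kb_types[phrase]:
            if type_name not in types:
                types.append(type_name)
        collected += score

    # Step 4: too many distinct types counts as insufficient evidence.
    return types if 0 < len(types) <= max_types else []


related = [("Parreira", 0.9), ("Dunga", 0.8), ("win", 0.7), ("Brazil", 0.6), ("Pele", 0.2)]
kb_types = {"Parreira": ["coach-1"], "Dunga": ["coach-1", "player-1"],
            "Brazil": ["country-1"], "Pele": ["player-1"]}
print(corpus_candidates(related, kb_types, cg_ok=lambda phrase: phrase != "Brazil"))
```

The cap on distinct types acts as a consistency check: when the nearest entities disagree widely about their types, the extractor abstains rather than adding noisy candidates.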