![Page 1: Extended Named Entity Ontology with Attribute Information Satoshi Sekine New York University LREC 2008 May 28, 2008](https://reader035.vdocument.in/reader035/viewer/2022081816/56649e455503460f94b396a6/html5/thumbnails/1.jpg)
Extended Named Entity Ontology with Attribute Information
Satoshi Sekine
New York University
LREC 2008
May 28, 2008
Named Entity
• Named Entity is the most important information unit in many Information Access applications (such as IE, Q&A, Summarization, IR, MT)
• History
  – MUC6: first defined Named Entity
    • Person, Location, Organization, Date, Time, Money, Percent
  – IREX
    • MUC6 + Artifact
  – ACE (20 kinds), TIMEX (Standardized Time Expression)
• Problem: Are 7 to 20 categories enough? What is the meaning of names?
Extended Named Entities
• Extended to 200 categories (LREC 02, 04)
  – Finer categories
    • Location
      → GPE (Country, Province, City …)
      → Geographical region (Landform, Water form …)
      → Region (Domestic region, Continental region …)
      → Astral body (Star, Planet …)
  – New categories
    • Line (Railroad, Road, Waterway, Tunnel, Bridge …)
    • Product (Vehicle, Food, Cloth, Weapon, Award …)
    • Event (Games, Conference, Natural Phenomena, War …)
    • Disease, Currency, God …
    • Era, Age, Color, Unit
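
A fragment of this hierarchy can be sketched as a nested mapping. This is only an illustration in Python, not the official ENE definition: it contains just the categories named on the slide, and the names `ENE_FRAGMENT` and `path_to` are hypothetical.

```python
# Illustrative fragment of the ENE hierarchy, limited to categories named on the slide.
# ENE_FRAGMENT and path_to are hypothetical names, not part of any ENE distribution.
ENE_FRAGMENT = {
    "Location": {
        "GPE": {"Country": {}, "Province": {}, "City": {}},
        "Geographical region": {"Landform": {}, "Water form": {}},
        "Region": {"Domestic region": {}, "Continental region": {}},
        "Astral body": {"Star": {}, "Planet": {}},
    },
    "Event": {"Games": {}, "Conference": {}, "Natural Phenomena": {}, "War": {}},
}


def path_to(category, tree=ENE_FRAGMENT, prefix=()):
    """Return the path from a top-level category down to `category`, or None."""
    for name, children in tree.items():
        here = prefix + (name,)
        if name == category:
            return here
        found = path_to(category, children, here)
        if found is not None:
            return found
    return None


print(path_to("City"))  # ('Location', 'GPE', 'City')
print(path_to("War"))   # ('Event', 'War')
```

A finer category such as City keeps its coarser ancestors (GPE, Location), so a tagger built on such a tree could presumably back off to a coarser label when the fine one is uncertain.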
Development of ENE
• Long, steady development over several years
  – Capitalized words in English newspapers (~2000)
  – Q&A and IE examples
  – Referred to encyclopedias, WordNet, …
  – Referred to related work and related systems
  – 100 → 140 → 200 → 210 categories
• Used in IE and Q&A systems, which helped refine the definition
• http://nlp.cs.nyu.edu/ene
What is Named Entity?
• A name is only a label
• Properties and attributes are the essential meaning
• “Hudson River” is still “Hudson River” even if people call it “Muh-he-kun-ne-tuk”
• The meaning of the entity can be discerned from
  – “The river is in New York State”
  – “It is 507 km in length”
  – “It runs from the Adirondack Mountains to Upper New York Bay”
• A name is only a label which can be used to refer to the river
Attributes
• “River” has attributes such as “source location”, “outflow”, “length” and so on
• “People” has attributes such as “occupation”, “birth date”, “nationality” and so on
• Designing those attributes and constructing the corresponding knowledge will be very useful for applications of NLP technologies
  – Q&A, IE, IR, Dialogue, co-reference …
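
The idea that a name is a label while attributes carry the meaning can be sketched as a small record type. This is a minimal sketch under stated assumptions: the `Entity` class and `refers_to` function are hypothetical, the values come from the Hudson River statements earlier in the talk, and mapping "runs from ... to ..." onto “source location” and “outflow” is an interpretation rather than something the slides spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A named entity: names are labels, attributes carry the meaning."""
    names: list      # labels that can refer to the entity
    category: str    # ENE category, e.g. "River"
    attributes: dict = field(default_factory=dict)


# Values are taken from the Hudson River statements in the talk.
# The "location" label is an assumption; the others follow the "River" examples.
hudson = Entity(
    names=["Hudson River", "Muh-he-kun-ne-tuk"],
    category="River",
    attributes={
        "location": "New York State",
        "length": "507 km",
        "source location": "Adirondack Mountains",
        "outflow": "Upper New York Bay",
    },
)


def refers_to(entity, name):
    """Any of the listed names resolves to the same entity."""
    return name in entity.names


print(refers_to(hudson, "Muh-he-kun-ne-tuk"))  # True
print(hudson.attributes["length"])             # 507 km
```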
Design of the attributes
• We use an encyclopedia
  – An encyclopedia is the knowledge archive of named entities (as a dictionary is for common words)
  – Its descriptions must contain many attributes
• We will extract attributes from the descriptions of named entities (samples) and compile general attributes for each category
Procedure
1. Extract (up to 50) sample named entity instances for each category. We use a famous Japanese encyclopedia, “Nippon Daihyakka (Nipponica)”, published by Shogakukan Inc.
2. Annotators extract possible attribute values from the descriptions of the samples and name the attribute labels (attribute values must be a noun phrase or equivalent)
3. Unify the attribute labels and identify the important (essential and mandatory) attributes for each category
4. Redesign the ENE categories
5. Construct a set of attributes
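
Steps 2 and 3 can be sketched as a small counting routine. Everything here is an assumption for illustration: the sample annotations and the label unification table are invented, and the frequency threshold is a stand-in criterion, since the slides do not say how "important" attributes were actually chosen.

```python
from collections import Counter

# Hypothetical annotations: one {attribute label: value} dict per sample entity.
annotations = [
    {"Vocation": "painter", "Birthplace": "Paris"},
    {"Occupation": "professor", "Hometown": "Manchester"},
    {"Vocation": "baseball player", "Award": "MVP"},
]

# Hypothetical unification table for step 3: variant label -> unified label.
UNIFY = {"Occupation": "Vocation", "Birthplace": "Hometown"}


def attribute_frequencies(samples, unify):
    """Count, per unified label, how many samples have a value for it."""
    counts = Counter()
    for sample in samples:
        labels = {unify.get(label, label) for label in sample}
        counts.update(labels)
    return counts


def important(counts, n_samples, min_share=0.5):
    """Assumed criterion: keep labels present in at least min_share of samples."""
    return sorted(l for l, c in counts.items() if c / n_samples >= min_share)


counts = attribute_frequencies(annotations, UNIFY)
print(counts["Vocation"], counts["Hometown"], counts["Award"])  # 3 2 1
print(important(counts, len(annotations)))  # ['Hometown', 'Vocation']
```

Counting each unified label once per sample gives the kind of "number of samples with a value" figure that a frequency column can report.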
Attributes for Person (20)

| Attribute | Example of value | Freq. (%) | ENE |
|---|---|---|---|
| Vocation | Professional baseball player | 46 (100) | Vocation |
| Nationality | American, Chinese, Japanese | 29 (63) | Country |
| Career | Professor at Yale University | 26 (57) | Vocation |
| Masterpiece | Guernica, Mona Lisa | 25 (54) | Product, Facility |
| Graduate | M.A. in German at Cambridge | 20 (44) | School |
| Hometown | Paris, Manchester, Shanghai | 19 (41) | City |
| Native Province | State of Illinois, Sichuan | 18 (39) | Province |
| Previous stay | England, New York | 12 (26) | Location |
| Mentor | Andrea del Verrocchio | 10 (22) | Person |
| Death date | 04/23/1704, unknown | 10 (22) | Date |
| Era | The 11th Century | 8 (17) | Era |
| Award | Academy Award, MVP, Nobel Prize | 8 (17) | Award |
| Real name | Saint Nicholas | 8 (17) | Person |
| Another name | Santa, Father Christmas | 8 (17) | Person |
| Title | Knight, an honorary degree at Yale | 6 (13) | Title |
| Competition | World Series, 1955 piano competition in Paris | 6 (13) | Game |
| Place of death | New York, Birmingham | 5 (11) | Location |
| Father | John B. Kelly, Sr. | 5 (11) | Person |
| Cause of death | Car accident, Guillotine | 5 (11) | |
Attributes for International Organization (17)

| Attribute | Example of value | Freq. (%) | ENE |
|---|---|---|---|
| Another name | CARICOM, EMU, CCDN | 30 (75) | Inter. Org. |
| Year founded | 1/10/1920, 2004 | 26 (65) | Date |
| Purpose of foundation | Encouragement of the African economy | 23 (58) | |
| Number of signatories | 170 countries, 190 | 20 (50) | N_Country |
| Type | League of Nations, International Labor Organization | 16 (40) | |
| Headquarters | New York, Prague | 13 (33) | City |
| Agreement, Proposal | Covenant of the League of Nations | 12 (30) | Rule |
| Top Organization | EU (the European Union) | 11 (28) | Inter. Org. |
| Member | China, Senegal, Norway | 10 (25) | Country |
| Predecessor | Organization of African Unity (OAU), Caribbean Free Trade Association | 9 (23) | Inter. Org. |
| Subsidiary Organization | International Amateur Athletics Federation | 8 (20) | Organization |
| Rank | Board of directors, Special UN Organization | 7 (18) | |
| Headquarters (country) | Japan, Czech, Ethiopia | 7 (18) | Country |
| Year of dissolution | 1974, 06/20/1977 | 6 (15) | Date |
| Proposer Country | USA, England, Luxembourg | 5 (13) | Country |
| Successor Organization | United Nations Economic and Social Commission for Asia and the Pacific | 5 (13) | Inter. Org. |
| Proposer (Person) | Eisenhower, Colonel Qadhafi, Pierre Wellner | 4 (10) | Person |
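
The Freq. column pairs a count with a percentage. For the International Organization table, the percentages are consistent with dividing each count by 40 samples and rounding halves up. The sample size of 40 is an inference from the first row (30 corresponds to 75%), not a figure stated on the slide, and `percent_half_up` is a hypothetical helper.

```python
def percent_half_up(count, n_samples):
    """Percentage of samples, rounded to the nearest integer with halves rounded up."""
    # floor(100 * count / n + 0.5), computed in exact integer arithmetic
    return (200 * count + n_samples) // (2 * n_samples)


N_SAMPLES = 40  # assumption: inferred from 30 -> 75%, not stated on the slide

# (count, percentage) pairs copied from the International Organization table.
rows = [(30, 75), (26, 65), (23, 58), (20, 50), (16, 40), (13, 33), (12, 30),
        (11, 28), (10, 25), (9, 23), (8, 20), (7, 18), (6, 15), (5, 13), (4, 10)]

mismatches = [(c, p) for c, p in rows if percent_half_up(c, N_SAMPLES) != p]
print(mismatches)  # []
```

Integer arithmetic is used instead of Python's built-in `round`, which rounds halves to the nearest even number and would turn 32.5 into 32 rather than the 33 shown in the table.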
Problems we encountered and/or haven’t solved yet
1. Entity-dependent attributes
   ex) Song/Poem of a river: “Loreley” for the “Rhine River”
2. Fineness of attributes
   ex) Bird’s “color of head” or “color of body”
3. Span of value expression
   Longer than a noun phrase, ex) definition
4. Structure in value
   ex) A museum’s exhibit has its own attributes (author, year)
5. ENE category definition
   Attributes are useful to define categories, but not always
6. Distinction between mandatory and optional
   Distinction between property and attribute
Inter-annotator Agreement
• Two annotators worked on Person, Landform, International Organization and Academy
• They agree more often on attributes which frequently have values
• They disagree on the span of values

| Percentage of having values | ~60% | ~40% | ~10% |
|---|---|---|---|
| Agree | 13 | 37 | 61 |
| Disagree | 2 | 3 | 16 |
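
The counts can be turned into a simple observed agreement rate per column. This is a sketch only: the slide does not say which agreement measure, if any, the author computed, and this plain proportion is not corrected for chance. The column labels are copied as shown on the slide.

```python
# Counts copied from the slide; keys are the column labels as shown there.
agree = {"~60%": 13, "~40%": 37, "~10%": 61}
disagree = {"~60%": 2, "~40%": 3, "~10%": 16}


def agreement_rate(a, d):
    """Observed agreement: agreed cases divided by all cases."""
    return a / (a + d)


for bucket in agree:
    rate = agreement_rate(agree[bucket], disagree[bucket])
    print(f"{bucket}: {rate:.1%}")
# ~60%: 86.7%
# ~40%: 92.5%
# ~10%: 79.2%
```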
Summary
• Designed attributes for Extended Named Entities
  – Attributes are important in applications
  – Created based on encyclopedia descriptions
  – Document available (in Japanese; English in progress)
  – Dictionary / tagger in development
• http://nlp.cs.nyu.edu/ene
Application
• Q&A/IR
  – What is the 15th highest mountain in the world?
  – How many mountains are higher than 6000 m?
  – Tell me the major league players from New York
  – I met Satoshi Sekine from New York
• Document understanding
  – “Yankees came back home!!”
  – “I visited Marrakech’s main sightseeing places”
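
The two mountain questions show how an attribute such as elevation supports ranking and counting queries. The sketch below assumes a tiny hand-made knowledge base of five well-known mountains. The function names are hypothetical, and a real system would need complete coverage before it could answer "15th highest".

```python
# Small illustrative knowledge base: (name, elevation in metres).
MOUNTAINS = [
    ("Mount Fuji", 3776),
    ("K2", 8611),
    ("Denali", 6190),
    ("Mount Everest", 8849),
    ("Kangchenjunga", 8586),
]


def nth_highest(mountains, n):
    """Name of the n-th highest mountain (1-based) by the elevation attribute."""
    ranked = sorted(mountains, key=lambda m: m[1], reverse=True)
    return ranked[n - 1][0]


def count_higher_than(mountains, metres):
    """Number of mountains whose elevation exceeds the threshold."""
    return sum(1 for _, height in mountains if height > metres)


print(nth_highest(MOUNTAINS, 2))           # K2
print(count_higher_than(MOUNTAINS, 6000))  # 4
```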