NoTube: Metadata Enrichment
![Page 1: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/1.jpg)
WP4: TV Data Text Enrichment
Pavel Mihaylov (OT) and partners
![Page 2: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/2.jpg)
Contents
Ontotext and its role in the project
WP4: text, audio and video
Goals and achievements
Demo
Conclusions
26-27 March 2012 NoTube 3rd review
![Page 3: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/3.jpg)
• Semantic technology developer est. in 2000
– Staff: 65 employees and multiple contractors
• Global leader in semantic technologies
– Semantic Databases: high performance RDF DBMS, scalable reasoning
– Semantic Search: text-mining (IE), Information Retrieval (IR)
– Web Mining: focused crawling, screen scraping, data fusion
• Role in NoTube
– WP4 leader
– Semantic Enrichment
– Experience from multiple European projects
![Page 4: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/4.jpg)
WP4: Content Enrichment
Content
• Text: EPGs, programme descriptions
• Audio
• Video
Enrichment
• Adding metadata
• Content about content
![Page 5: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/5.jpg)
Goal: Text enrichment
Semantic annotation component
Recognising items of interest in text
Assigning links to Linked Open Data
• Analyses short or free-text segments
• Extends them with further world knowledge
![Page 6: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/6.jpg)
Goal: Text enrichment (2)
Live at the Apollo 2/6 Not Going Out star Lee Mack presents sets from American comic Rich Hall and Scotland’s very own Danny Bhoy.
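The annotation step on this example can be sketched as a dictionary lookup. This is only a minimal illustration, not Lupedia's implementation: the gazetteer, the `annotate` function and the class labels are assumptions made for this sketch, and the DBpedia URIs are illustrative and have not been verified.

```python
import re

# Hypothetical gazetteer: surface form -> (illustrative DBpedia URI, class).
# The URIs are examples only and have not been checked against DBpedia.
GAZETTEER = {
    "Live at the Apollo": ("http://dbpedia.org/resource/Live_at_the_Apollo", "TelevisionShow"),
    "Not Going Out": ("http://dbpedia.org/resource/Not_Going_Out", "TelevisionShow"),
    "Lee Mack": ("http://dbpedia.org/resource/Lee_Mack", "Person"),
    "Rich Hall": ("http://dbpedia.org/resource/Rich_Hall", "Person"),
    "Danny Bhoy": ("http://dbpedia.org/resource/Danny_Bhoy", "Person"),
    "Scotland": ("http://dbpedia.org/resource/Scotland", "Place"),
}


def annotate(text, gazetteer=GAZETTEER):
    """Return non-overlapping gazetteer matches, preferring longer surface forms."""
    annotations = []
    taken = []  # (start, end) spans already claimed by a longer match
    for surface in sorted(gazetteer, key=len, reverse=True):
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text):
            if any(match.start() < end and start < match.end() for start, end in taken):
                continue  # overlaps an existing annotation
            taken.append(match.span())
            uri, cls = gazetteer[surface]
            annotations.append({
                "start": match.start(),
                "end": match.end(),
                "text": surface,
                "uri": uri,
                "class": cls,
            })
    return sorted(annotations, key=lambda a: a["start"])


TEXT = ("Live at the Apollo 2/6 Not Going Out star Lee Mack presents sets from "
        "American comic Rich Hall and Scotland's very own Danny Bhoy.")

if __name__ == "__main__":
    for a in annotate(TEXT):
        print(a["start"], a["end"], a["text"], a["class"], a["uri"])
```

A real service would also need disambiguation and confidence scoring, which this lookup leaves out.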
![Page 7: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/7.jpg)
Goal: Multilingual
TV world
English
German
Italian
Dutch
Arabic
Bulgarian
French
Korean
Turkish
![Page 8: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/8.jpg)
Goal: Graph enrichment
Build upon basic enrichment
• Entities extracted from text
Exploit relations in Semantic Repository
• Follow a chain of LOD predicates
Enrich the basic enrichment
• A richer set of entities
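Following a chain of predicates from the extracted entities can be sketched as below. This is a toy illustration under stated assumptions: the triples, the `ex:` names and the functions `build_index`, `follow_chain` and `enrich` are all hypothetical, and an in-memory index stands in for the semantic repository.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object); every name here is made up.
TRIPLES = [
    ("ex:ShowA", "ex:starring", "ex:ActorX"),
    ("ex:ShowB", "ex:starring", "ex:ActorX"),
    ("ex:ShowB", "ex:genre", "ex:Comedy"),
    ("ex:ActorX", "ex:birthPlace", "ex:CityY"),
    ("ex:CityY", "ex:country", "ex:CountryZ"),
]


def build_index(triples):
    """Index objects by (subject, predicate) for fast forward traversal."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[(s, p)].add(o)
    return index


def follow_chain(index, seeds, chain):
    """Follow the predicates in `chain` in order; return the entities reached."""
    frontier = set(seeds)
    for predicate in chain:
        frontier = {o for s in frontier for o in index.get((s, predicate), ())}
    return frontier


def enrich(index, seeds, chains):
    """Add every entity reachable from the seeds through any configured chain."""
    enriched = set(seeds)
    for chain in chains:
        enriched |= follow_chain(index, seeds, chain)
    return enriched


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    chains = [["ex:starring"], ["ex:starring", "ex:birthPlace"]]
    print(sorted(enrich(idx, {"ex:ShowA"}, chains)))
```

Which chains to configure per class is a design decision; the sketch simply takes them as input.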
![Page 9: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/9.jpg)
Goal: Graph enrichment (2)
![Page 10: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/10.jpg)
Goal: Graph enrichment (3)
Classes to enrich
• Film
• TelevisionShow
• Work
• Band/MusicalArtist
• Actor
• Place
![Page 11: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/11.jpg)
Film enrichment
• Film class
• At least one common indirect relation
![Page 12: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/12.jpg)
TelevisionShow enrichment
• TelevisionShow class
• At least two common indirect relations
![Page 13: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/13.jpg)
Work enrichment
• Work except Film and TelevisionShow
• At least one common indirect relation
![Page 14: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/14.jpg)
Band/MusicalArtist enrichment
• Band and MusicalArtist
• At least one direct relation
![Page 15: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/15.jpg)
Actor enrichment
• Actor class
• Starring relation from at least two common Works
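One possible reading of the Actor rule is sketched below: a candidate actor is added when they star alongside the recognised actor in at least two works. This interpretation, the toy `STARRING` data and the `related_actors` function are assumptions for illustration; the slide itself does not spell out the exact condition.

```python
from collections import Counter

# Toy "starring" data: work -> set of actors. All names are hypothetical.
STARRING = {
    "ex:Work1": {"ex:ActorA", "ex:ActorB", "ex:ActorC"},
    "ex:Work2": {"ex:ActorA", "ex:ActorB"},
    "ex:Work3": {"ex:ActorA", "ex:ActorC"},
    "ex:Work4": {"ex:ActorB", "ex:ActorD"},
}


def related_actors(seed, starring, min_common_works=2):
    """Return actors sharing at least `min_common_works` works with `seed`.

    The result maps each qualifying actor to the number of shared works.
    """
    counts = Counter()
    for actors in starring.values():
        if seed in actors:
            counts.update(actors - {seed})  # count each co-star once per work
    return {actor: n for actor, n in counts.items() if n >= min_common_works}


if __name__ == "__main__":
    print(related_actors("ex:ActorA", STARRING))
```

The threshold is a parameter, so the same counting approach could express the "at least one" rules used for other classes.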
![Page 16: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/16.jpg)
Place enrichment
• Place class
• At least one direct relation
![Page 17: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/17.jpg)
Lupedia
Text enrichment service
• Input: plain text, e.g. programme descriptions
• Output: Linked Open Data enrichment
– XML, JSON, RDFa
• Features:
– Multilingual
– Graph enrichment
– Multiple vocabularies
– Configurable
– Fast
![Page 18: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/18.jpg)
Lupedia over time
Better service
Multilingualism
New matching options and filters
Heuristics
Predicate, heuristics and class weights
Disambiguation
Most specific class in output
Multiple vocabularies
Selectable vocabulary
Graph enrichment
![Page 19: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/19.jpg)
Evaluation summary
Lupedia compared to OpenCalais and AlchemyAPI
• Only two other similar services
• Much better coverage than either of them
• Comparable precision
Lupedia is a unique service
• Custom vocabularies & filters
• Tuned to TV domain
![Page 20: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/20.jpg)
Links to other WPs
WP1
• Entity URIs point to WP1 models
WP3
• Lupedia in NLP based profiling and enrichment
WP5
• Lupedia in SmartLink and Watch’n’Buy
WP6
• Integration, enrichment in demo apps
WP7
• 7a news enrichment
• 7c programme description enrichment
![Page 21: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/21.jpg)
Lupedia demo
http://lupedia.ontotext.com/
![Page 22: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/22.jpg)
Emerging competition

| | Lupedia | Yahoo | WikiMachine | EntityPedia |
|---|---|---|---|---|
| LOD output | DBpedia & LinkedMDB | | DBpedia | ? |
| Multilingual | ar, bg, nl, en, fr, de, it, ko, tr | en, zh | en, pt, it | ? |
| Confidence | yes | yes | yes | ? |
| Graph enrichment | yes | yes* | no | ? |
| Remark | Tuned to TV domain, one of the pioneers | No direct access to LOD, graph enrichment too abstract | Too generic, precision seems lower | Not yet released |
![Page 23: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/23.jpg)
Lessons & Impact
Lessons learnt:
• Emerging similar services clearly show the need for such services
• Coverage and language support are important
Lupedia recognised as one of the major players and included in NERD:
• Aggregating named entity services and comparing their performance
• http://nerd.eurecom.fr
Various partners willing to use Lupedia in other projects
![Page 24: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/24.jpg)
Life after NoTube
Will be kept alive as a demo service
Closed source
Possibly an OpenCalais-like service in future
![Page 25: NoTube: Metadata Enrichment](https://reader033.vdocument.in/reader033/viewer/2022051513/5456afb0af79590b088b4e50/html5/thumbnails/25.jpg)
Questions?