Live Topic Generation from Event Streams

Posted on 18-Dec-2014

Category: Technology

DESCRIPTION

"Live Topic Generation from Event Streams", talk given at the Demo session of the 22nd World Wide Web Conference (WWW), Rio de Janeiro, Brazil

TRANSCRIPT

Live Topic Generation from Event Streams

Vuk Milicic, José Luis Redondo Garcia, Giuseppe Rizzo, Raphaël Troncy, Thomas Steiner

raphael.troncy@eurecom.fr / @rtroncy

Media Finder (www2013)

15/05/2013 22nd World Wide Web Conference (WWW) - Rio de Janeiro - 2

Media Finder (zooming on media items)

Media Finder (timeline view)

Media Server

Composition of media item extractors (12 social networks)

Relies on search APIs plus a fixed 30 s timeout window to provide results

Falls back on screen scraping when necessary (Twitter ecosystem)

Implemented as a Node.js server

Serializes results in a common schema (JSON)
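
The fan-out with a fixed timeout window can be sketched as follows. This is a minimal illustration in Python, not the Media Server code itself (which runs on Node.js). The extractor callables, their names, and the shape of the returned items are all assumptions made for the example.

```python
import asyncio
import json


async def search_all(term, extractors, timeout=30.0):
    """Query every extractor concurrently and keep what finished in time.

    `extractors` maps a (hypothetical) network name to an async callable
    that takes the search term and returns a list of dicts.
    """
    tasks = {asyncio.create_task(fn(term)): name
             for name, fn in extractors.items()}
    done, pending = await asyncio.wait(list(tasks), timeout=timeout)

    # Extractors that missed the window are cancelled, not waited for.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    results = {}
    for task in done:
        # A failing extractor is skipped so it cannot break the others.
        if task.exception() is None:
            results[tasks[task]] = task.result()
    return results


def to_json(results):
    """Serialize the merged results as one JSON document."""
    return json.dumps(results, sort_keys=True)
```

Calling `asyncio.run(search_all("www2013", extractors))` returns one entry per network that answered within the window; slow or failing networks are simply absent from the result.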

Deep link Permalink

Clean text for NLP processing

Aggregate view of ALL social interactions

12 Social Networks
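
A common schema carrying these fields could look like the sketch below. The field names, the text-cleaning rules, and the interaction keys are hypothetical; the slide only names the concepts (deep link, permalink, clean text, aggregated interactions). Reading "deep link" as the direct URL of the media file is also an assumption.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MediaItem:
    media_url: str   # deep link: assumed to be the direct URL of the media file
    permalink: str   # page on the social network where the item was posted
    text: str        # original caption or post text
    network: str     # name of the social network the item came from
    interactions: dict = field(default_factory=dict)  # e.g. {"likes": 3}

    @property
    def clean_text(self):
        """Text prepared for NLP: URLs removed, '@' and '#' markers stripped."""
        no_urls = re.sub(r"https?://\S+", " ", self.text)
        no_marks = re.sub(r"[@#](\w+)", r"\1", no_urls)
        return " ".join(no_marks.split())


def total_interactions(items):
    """Sum interaction counters over items from all networks."""
    total = Counter()
    for item in items:
        total.update(item.interactions)
    return dict(total)
```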

Media Finder Architecture

Media items harvesting using the Media Server

http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}

https://github.com/vuknje/media-server (@tomayac fork)
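
Building a request URL from the search template might look like this. The sketch assumes that the first placeholder selects the service and that the literal value `combined` queries all networks at once; the helper name `search_url` is made up for the example.

```python
from urllib.parse import quote

BASE = "http://eventmedia.eurecom.fr/media-server/search"


def search_url(term, service="combined"):
    """Fill the /search/{service}/{term} template, escaping both parts."""
    # safe="" also escapes '/' and '#', which matters for hashtag queries.
    return f"{BASE}/{quote(service, safe='')}/{quote(term, safe='')}"
```

Escaping matters for hashtags: an unescaped `#` would be read as a URL fragment and never reach the server.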

Image near-deduplication

DCT signature computed on images and video frames

Hamming distance between image pairs
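
One common way to combine a DCT signature with Hamming distance is a perceptual-hash style fingerprint, sketched below in plain Python. This is not necessarily the exact signature used in Media Finder: the 8x8 input size, the 4x4 low-frequency block, and the median threshold are assumptions. The input is taken to be a small square grayscale matrix that has already been downscaled.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square matrix given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # Precompute the cosine table cos((2x + 1) * u * pi / (2n)).
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [[alpha(u) * alpha(v) * sum(block[x][y] * cos[u][x] * cos[v][y]
                                       for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]


def dct_signature(gray, keep=4):
    """Bit signature from the low-frequency DCT coefficients.

    The DC term is skipped, so a uniform brightness shift leaves the
    signature unchanged. Each remaining coefficient contributes one bit:
    1 if it is above the median of the kept coefficients, else 0.
    """
    coeffs = dct_2d(gray)
    low = [coeffs[u][v] for u in range(keep) for v in range(keep)][1:]
    median = sorted(low)[len(low) // 2]
    bits = 0
    for c in low:
        bits = (bits << 1) | int(c > median)
    return bits


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")
```

Two items would then be treated as near-duplicates when the Hamming distance between their signatures falls under a chosen threshold; the threshold is a tuning parameter not given on the slide.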

Clustering and disambiguation

Named entity extraction using NERD

Topic generation using LDA

Density-based clustering using OPTICS
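
For the density-based step, a compact version of OPTICS is sketched below: it computes the cluster ordering with reachability distances, then cuts the ordering at a threshold to obtain flat clusters. What Media Finder actually feeds into OPTICS is not stated on the slide, so the points here are generic coordinate tuples, and `eps`, `min_pts`, and `eps_cut` are illustrative parameters.

```python
import math


def optics(points, eps, min_pts):
    """Return (order, reach, core) for a list of coordinate tuples.

    min_pts counts the point itself and must be at least 2.
    """
    n = len(points)
    reach = [math.inf] * n      # reachability distance of each point
    core = [math.inf] * n       # core distance (inf if not a core point)
    processed = [False] * n
    order = []

    def dist(a, b):
        return math.dist(points[a], points[b])

    def visit(p, seeds):
        processed[p] = True
        order.append(p)
        nbrs = [q for q in range(n) if q != p and dist(p, q) <= eps]
        if len(nbrs) >= min_pts - 1:            # p is a core point
            core[p] = sorted(dist(p, q) for q in nbrs)[min_pts - 2]
            for q in nbrs:
                if not processed[q]:
                    new_reach = max(core[p], dist(p, q))
                    reach[q] = min(reach[q], new_reach)
                    seeds[q] = reach[q]

    for start in range(n):
        if processed[start]:
            continue
        seeds = {}
        visit(start, seeds)
        while seeds:
            # Always expand the seed that is currently easiest to reach.
            nxt = min(seeds, key=seeds.get)
            del seeds[nxt]
            visit(nxt, seeds)
    return order, reach, core


def extract_clusters(order, reach, core, eps_cut):
    """Flat clusters from the ordering; label -1 marks noise."""
    labels = [-1] * len(order)
    cluster = -1
    for p in order:
        if reach[p] > eps_cut:
            if core[p] <= eps_cut:   # dense enough to open a new cluster
                cluster += 1
                labels[p] = cluster
        else:
            labels[p] = cluster
    return labels
```

The neighbour search here is quadratic, which is fine for a sketch. For larger collections a maintained implementation such as scikit-learn's `sklearn.cluster.OPTICS` is the more practical choice.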

Named Entities are Pivotal

http://nerd.eurecom.fr/

REST API, Ontology, Dashboard UI

Media Finder (named entities clustering)

Media Finder (zooming in a cluster)

Summary

Pick an event identified with a hashtag

Use MediaServer to get media items aggregated over multiple social networks

Use NERD to get entities aggregated over multiple extractors

Cluster and identify meaningful topics (aka entities), each with a meaningful label that is often disambiguated with a DBpedia URI, giving access to more encyclopedic knowledge
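
The last step, turning clusters into labelled topics, can be illustrated with a simple frequency vote. This sketch assumes each cluster already holds the entities extracted from its media items as `(label, uri)` pairs, with `uri` set to `None` when no DBpedia link was found. The function name and the output keys are made up for the example.

```python
from collections import Counter


def label_clusters(clusters):
    """Pick the most frequent entity of each cluster as its topic.

    `clusters` maps a cluster id to a list of (label, uri) pairs,
    one pair per entity mention found in the cluster's media items.
    """
    topics = {}
    for cluster_id, entities in clusters.items():
        counts = Counter(entities)
        if not counts:
            continue  # a cluster without entities gets no topic label
        (label, uri), mentions = counts.most_common(1)[0]
        topics[cluster_id] = {"label": label, "uri": uri, "mentions": mentions}
    return topics
```

A majority vote is the simplest possible labelling rule; the slides do not say how Media Finder ranks competing entities within a cluster.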

Live Topic Generation from Event Streams

Meet us at the WWW 2013 Demo Session, Booth 14

http://www.youtube.com/watch?v=8iRiwz7cDYY

http://www.slideshare.net/troncy
