Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams
DESCRIPTION
Talk by Ke Tao (from Web Information Systems, TU Delft) at the 23rd ACM Conference on Hypertext and Social Media, June 28, 2012, Milwaukee, WI, USA
TRANSCRIPT
![Page 1: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/1.jpg)
Delft University of Technology
Semantics + Filtering + Search = Twitcident
Exploring Information in Social Web Streams
Hypertext 2012, Milwaukee, WI – June 28
Fabian Abel, Claudia Hauff, Geert-Jan Houben, Richard Stronkman, Ke Tao
Web Information Systems, TU Delft, the Netherlands
![Page 2: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/2.jpg)
200,000,000
number of tweets published per day
![Page 3: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/3.jpg)
Pukkelpop 2011
People tweet about everything,
everywhere :-)
![Page 4: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/4.jpg)
Pukkelpop 2011
81,000 tweets in four hours
became a tragedy
Filtering
200,000,000
Search & Analytics
Useful tweets?
![Page 5: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/5.jpg)
Case Nijmegen: Train accident
![Page 6: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/6.jpg)
First tweet…
And then your train blasts off full of the anvils. #Nijmegen #veolia
![Page 7: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/7.jpg)
First picture…
Astonishing! My train rams the platform at Nijmegen!
http://pic.twitter.com/QVVfJHyd
![Page 8: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/8.jpg)
Traditional news media
A train rammed the anvils at Nijmegen.
![Page 9: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/9.jpg)
Research Challenges
1. (Automatic) Filtering: Given an incident, how can one automatically identify those tweets that are relevant to the incident?
2. Search & Analytics: How can one improve search and analytical capabilities so that users can explore information in the streams of tweets?
Twitter streams → Filtering (topic) → Search & Analytics (information need)
![Page 10: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/10.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 11: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/11.jpg)
Twitcident system
![Page 12: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/12.jpg)
![Page 13: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/13.jpg)
Incident detection
• Twitcident relies on Emergency Broadcasting Services for detecting incidents.
• In the Netherlands: P2000 communication network
![Page 14: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/14.jpg)
Incident Profiling
• For an incident i:
• The profile of an incident is described as a set of tuples.
• Each tuple includes a facet-value pair (f, v) and its weight to the incident i.
• (Location, Netherlands): 0.4
• (Incident, Train accident): 0.5
• (Location, Nijmegen): 0.8
• (Organization, Veolia): 0.6
• (Incident, Crash): 1.0
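The profile described on this slide can be sketched as a small data structure. The following is a minimal illustration, not Twitcident's actual implementation: the names `FacetValue`, `nijmegen_profile`, and `top_pairs` are hypothetical, and only the facet-value pairs and weights come from the slide.

```python
from typing import Dict, List, NamedTuple, Tuple


class FacetValue(NamedTuple):
    """A facet-value pair (f, v), e.g. ("Location", "Nijmegen")."""
    facet: str
    value: str


# Incident profile P(i): each facet-value pair mapped to its weight for incident i.
# Pairs and weights are the ones shown on the slide; the dict layout is an assumption.
nijmegen_profile: Dict[FacetValue, float] = {
    FacetValue("Location", "Netherlands"): 0.4,
    FacetValue("Incident", "Train accident"): 0.5,
    FacetValue("Location", "Nijmegen"): 0.8,
    FacetValue("Organization", "Veolia"): 0.6,
    FacetValue("Incident", "Crash"): 1.0,
}


def top_pairs(profile: Dict[FacetValue, float], k: int) -> List[Tuple[FacetValue, float]]:
    """Return the k facet-value pairs with the highest weight (hypothetical helper)."""
    return sorted(profile.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    for pair, weight in top_pairs(nijmegen_profile, 3):
        print(pair.facet, pair.value, weight)
```

Keying the mapping by the pair itself makes it cheap to look up the weight of any facet-value pair extracted from a tweet.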
![Page 15: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/15.jpg)
![Page 16: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/16.jpg)
Social Media Aggregation
• Collecting Twitter messages, pictures, and videos from social media platforms, e.g. Twitter, Photobucket, Vimeo
![Page 17: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/17.jpg)
![Page 18: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/18.jpg)
Semantic Enrichment
• Named Entity Recognition
• Classification: Casualties, Damages, Risks…
• Linkage: External Resources
• Metadata extraction
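To make the enrichment steps more concrete, here is a toy sketch of entity recognition, classification, and metadata extraction on a single tweet. It is an assumption-laden illustration: the gazetteer lookup, the keyword lists, the function `enrich`, and the sample tweet are all invented for this example, and a real system would rely on proper NER and classification services. Linkage to external resources is not covered.

```python
import re

# Hypothetical gazetteer standing in for a real named entity recognizer.
GAZETTEER = {
    "nijmegen": ("Location", "Nijmegen"),
    "veolia": ("Organization", "Veolia"),
}

# Hypothetical keyword lists standing in for a trained classifier.
CLASS_KEYWORDS = {
    "Casualties": {"injured", "dead"},
    "Damages": {"damaged", "destroyed"},
    "Risks": {"danger", "fire"},
}


def enrich(text):
    """Attach entities, report classes, and simple metadata to a tweet text."""
    tokens = re.findall(r"\w+", text.lower())
    token_set = set(tokens)
    return {
        "text": text,
        # Named entity recognition: dictionary lookup per token.
        "entities": [GAZETTEER[t] for t in tokens if t in GAZETTEER],
        # Classification: a class applies if any of its keywords occurs.
        "classes": sorted(c for c, kws in CLASS_KEYWORDS.items() if kws & token_set),
        # Metadata extraction: hashtags and URLs.
        "hashtags": re.findall(r"#(\w+)", text),
        "urls": re.findall(r"https?://\S+", text),
    }


if __name__ == "__main__":
    # Made-up tweet, used only to exercise the sketch.
    print(enrich("Two people injured at the station #Nijmegen #veolia http://example.org/x"))
```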
![Page 19: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/19.jpg)
![Page 20: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/20.jpg)
Filtering
• Which tweets are relevant to the incidents?
• Preprocessing: Language detection
• Semantic Filtering: Compare tweet with the incident profile P(i)
• Semantic Filtering with News Context
• P’(i): P(i) complemented with facet-value pairs from news
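The two semantic filtering variants can be sketched as follows. The slides do not say how a tweet is compared with the profile, so this sketch assumes a simple scheme: sum the profile weights of the facet-value pairs found in the tweet and accept the tweet if the sum reaches a threshold. The function names, the threshold, the default weight for news pairs, and the example pair ("Incident", "Derailment") are all hypothetical.

```python
def relevance(tweet_pairs, profile):
    """Sum the profile weights of the distinct facet-value pairs in the tweet."""
    return sum(profile.get(pair, 0.0) for pair in set(tweet_pairs))


def is_relevant(tweet_pairs, profile, threshold=1.0):
    """Semantic filtering: keep the tweet if its score reaches the threshold (assumed rule)."""
    return relevance(tweet_pairs, profile) >= threshold


def extend_with_news(profile, news_pairs, weight=0.3):
    """Build P'(i): P(i) plus facet-value pairs from news that P(i) lacks.

    The fixed weight for news-derived pairs is an assumption of this sketch.
    """
    extended = dict(profile)
    for pair in news_pairs:
        extended.setdefault(pair, weight)
    return extended


if __name__ == "__main__":
    profile = {
        ("Location", "Nijmegen"): 0.8,
        ("Organization", "Veolia"): 0.6,
        ("Incident", "Crash"): 1.0,
    }
    tweet = [("Location", "Nijmegen"), ("Incident", "Derailment")]
    print(is_relevant(tweet, profile))  # pair from news not yet in the profile
    news_profile = extend_with_news(profile, [("Incident", "Derailment")])
    print(is_relevant(tweet, news_profile))
```

Under these assumptions, news context helps with tweets that describe the incident in terms the original profile does not contain.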
![Page 21: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/21.jpg)
![Page 22: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/22.jpg)
Faceted Search
• Strategies (ranking)
• Frequency-based
• Time-sensitive based
• Personalized
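Two of the ranking strategies listed above can be illustrated with a short sketch. The slides only name the strategies, so the details here are assumptions: the frequency-based ranking counts how often each facet-value pair occurs in the filtered tweets, and the time-sensitive ranking weights each occurrence with an exponential decay whose half-life is a made-up parameter. The tweet layout (`time`, `pairs`) is hypothetical, and the personalized strategy is not covered.

```python
from collections import Counter, defaultdict


def rank_by_frequency(tweets):
    """Rank facet-value pairs by how many tweets mention them."""
    counts = Counter(pair for tweet in tweets for pair in tweet["pairs"])
    return [pair for pair, _ in counts.most_common()]


def rank_time_sensitive(tweets, now, half_life=3600.0):
    """Rank facet-value pairs by recency-weighted frequency.

    Each mention contributes 0.5 ** (age / half_life), so a mention that is
    one half-life old counts half as much as a fresh one (assumed decay model).
    """
    scores = defaultdict(float)
    for tweet in tweets:
        age = max(0.0, now - tweet["time"])
        decay = 0.5 ** (age / half_life)
        for pair in tweet["pairs"]:
            scores[pair] += decay
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    crash = ("Incident", "Crash")
    veolia = ("Organization", "Veolia")
    delay = ("Incident", "Delay")
    tweets = [
        {"time": 0.0, "pairs": [crash, veolia]},
        {"time": 0.0, "pairs": [crash]},
        {"time": 0.0, "pairs": [crash, veolia]},
        {"time": 7200.0, "pairs": [delay]},
    ]
    print(rank_by_frequency(tweets))
    print(rank_time_sensitive(tweets, now=7200.0))
```

With this sample data, the frequency-based ranking favors the pairs mentioned most often overall, while the time-sensitive ranking moves the most recently mentioned pair to the top.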
![Page 23: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/23.jpg)
Real-time analytics
What type of things are mentioned in the tweets?
What aspects are mentioned over time? What do people report about over time?
Impact Area
![Page 24: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/24.jpg)
Evaluation - Dataset
• Twitter corpus (TREC Microblog Track 2011)
• 16 million tweets (Jan. 24th – Feb. 8th, 2011)
• 4,766,901 tweets classified as English
• 6.2 million entity-extractions
• News (same time period)
• 62 RSS news feeds
• 13,959 news articles
• 357,559 entity-extractions
![Page 25: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/25.jpg)
Evaluation for Tweet Filtering (1/2)
Semantic strategies outperform the keyword-based filtering regarding all metrics.
![Page 26: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/26.jpg)
Evaluation for Tweet Filtering (2/2)
The semantic strategy is more robust and achieves higher precision for complex topics.
![Page 27: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/27.jpg)
Evaluation for Faceted Search (1/2)
The semantic faceted search strategy improves the search performance by 34.8% and 22.4%.
![Page 28: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/28.jpg)
Evaluation for Faceted Search (2/2)
The strategies with semantic enrichment outperform the strategy without semantic enrichment in predicting the appropriate facet-values.
![Page 29: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/29.jpg)
Conclusions
• What we have done:
• Twitcident, a framework for filtering, searching, and analyzing information about incidents that people publish in their Social Web Streams
• What we have achieved:
• Better filtering of Twitter messages for a given incident.
• Better search for relevant information about an incident within the filtered messages.
![Page 30: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.vdocument.in/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/30.jpg)
Thank you!
Ke Tao @taubau
@wisdelft
http://twitcident.org