Twitter Data
Gilad Mishne - @gilad
Twitter Search
Info 290 - Analyzing Big Data With Twitter
UC Berkeley Information School
August 2012
About Twitter
• The fastest, simplest way to communicate
• More than 140M active users
• Majority (also) mobile; 60% outside the U.S.
• More than 400M twitter.com visitors
• More than 400M tweets/day (peak: 25K/sec)
• 1,000 employees (majority in San Francisco)
• 50% engineers
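As a rough back-of-the-envelope check (not part of the original slides), the daily volume and the peak rate can be compared directly. The sketch below assumes the slide's figures are taken at face value, although both are stated as "more than", and the function names are illustrative only.

```python
# Figures quoted on the slide; treated here as exact for a rough estimate.
TWEETS_PER_DAY = 400_000_000
PEAK_TWEETS_PER_SEC = 25_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(tweets_per_day: int) -> float:
    """Average tweets per second implied by a daily total."""
    return tweets_per_day / SECONDS_PER_DAY


def peak_to_average(peak_per_sec: int, tweets_per_day: int) -> float:
    """How many times higher the peak rate is than the daily average."""
    return peak_per_sec / average_rate(tweets_per_day)


if __name__ == "__main__":
    avg = average_rate(TWEETS_PER_DAY)
    ratio = peak_to_average(PEAK_TWEETS_PER_SEC, TWEETS_PER_DAY)
    print(f"average: {avg:,.0f} tweets/sec")
    print(f"peak is {ratio:.1f}x the average")
```

Under these assumptions the average works out to roughly 4,600 tweets per second, so the quoted peak is about five times the average load.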
Twitter data: text
Twitter data: social graph
Credit: @isaach
Twitter data: time series
Twitter data: interest graph
Credit: @psychemedia
Combined: the pulse of the world
What we do with large data
Scale
Search
Recommendations
Ads
Anti-Spam
The Speakers
(Plus, some Twitter features)
Twitter Overview
Embedded links
Expanded Tweets
Hadoop/Pig
Replies/conversations
Trends/Streaming
Protected accounts
Support for 30+ languages
Search
Retweets, Favorites
Graphs, Recommendations, Relevance
Retweets (in the timeline)
Security, anomaly detection
Photos
Scalding
Geotagging, hashtags
Goals
• Work with real data, on real problems
• Learn what it is like to work in a place like Twitter
• Build something useful
• Have a good time!
Questions?
Follow me: @gilad