social pixels acm_mm
DESCRIPTION
Using social media to understand the situations occurring in the real world.
TRANSCRIPT
SOCIAL PIXELS: GENESIS & EVALUATION
Vivek Singh, Mingyan Gao, and Ramesh Jain
University of California, Irvine
Outline
• Concept
• Approach
• Applications
• Challenges
Motivation
People are sharing massive amounts of information on the web (Twitter, Flickr, Facebook, …)
• How to do effective data consumption, not just data creation
• Geo-spatial situation awareness
• Real-time updates of the world state
• From data to actionable knowledge
Concept
Understanding evolving world situations by combining spatio-temporal-thematic data coming from social media (e.g. Twitter/Flickr).
‘iPhone’ social image for mainland USA, Jun 11, 2009
Social Pixels
Traditional pixels: photons aggregating at locations on a CCD
Social pixels: user interest aggregating at geo-locations
Create social image, social video, …
Image/media processing operators
Situation detection operators (e.g. convolution, filtering, background subtraction)
Design principles
• Humans as sensors
• Social pixel approach
• Visualization
• Intuitive query and mental model
• Common spatio-temporal data representation
• Data analysis using media processing
• Combining media processing with declarative query algebra
Overall Approach
1. Micro-event detection
2. Spatio-temporal aggregation using social pixel approach
3. Media processing engine
4. Query engine
Micro-event detection
Simple bag-of-words approach for detecting which event the user is talking about, e.g. ‘Sore throat’, ‘Flu’, ‘H1N1’, …
Tweet: ‘caught sore-throat today…arrrgh !’
Micro-event detected for user X.
Spatial, Temporal, Thematic
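A minimal sketch of the bag-of-words step described above, in Python. The keyword table, the `detect_micro_events` helper, the record layout and the coordinates are illustrative assumptions, not the authors' implementation; the slide only names the example terms and the example tweet.

```python
import re

# Hypothetical keyword table: theme -> phrases that signal it.
THEME_KEYWORDS = {
    "flu": {"flu", "h1n1", "sore throat"},
}

def normalize(text):
    """Lowercase and turn every non-alphanumeric run into one space,
    so that 'sore-throat' matches the phrase 'sore throat'."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())

def detect_micro_events(tweet_text, user, lat, lon, timestamp):
    """Return one spatial/temporal/thematic record per matching theme."""
    padded = " " + normalize(tweet_text) + " "
    events = []
    for theme, keywords in THEME_KEYWORDS.items():
        # Whole-phrase match: the phrase must be delimited by spaces.
        if any(" " + kw + " " in padded for kw in keywords):
            events.append({"user": user, "lat": lat, "lon": lon,
                           "time": timestamp, "theme": theme})
    return events

# The location and date below are made-up values for illustration.
events = detect_micro_events("caught sore-throat today…arrrgh !",
                             "X", 33.68, -117.83, "2009-06-11")
```

Each returned record carries the three attributes named on the slide: where (lat, lon), when (time) and what (theme).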
Spatio-temporal aggregation using social pixels
Higher level abstractions have trade-offs with lower level details
Percolate up what is necessary for the application
Can be:
• Count of tweets with the term
• Average green channel value of images
• Mean audio energy
• Average monthly income, rainfall, population, etc.
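The aggregation step can be sketched as binning geo-located micro-events into a regular grid, where each cell (a social pixel) holds the count of events that fall inside it. The bounding box, the one-degree cell size and the `make_emage` name are assumptions chosen for illustration; any of the other listed measures could be stored per cell instead of a count.

```python
import math

def make_emage(points, lat_min, lat_max, lon_min, lon_max, cell_deg=1.0):
    """Count (lat, lon) points per grid cell. Row 0 is the northern edge."""
    rows = math.ceil((lat_max - lat_min) / cell_deg)
    cols = math.ceil((lon_max - lon_min) / cell_deg)
    grid = [[0] * cols for _ in range(rows)]
    for lat, lon in points:
        # Skip points outside the bounding box.
        if not (lat_min <= lat < lat_max and lon_min <= lon < lon_max):
            continue
        r = rows - 1 - int((lat - lat_min) / cell_deg)
        c = int((lon - lon_min) / cell_deg)
        grid[r][c] += 1
    return grid

# Rough, hypothetical bounding box around California.
points = [(33.7, -117.8), (33.2, -117.1), (37.8, -122.4), (50.0, -100.0)]
emage = make_emage(points, 32.0, 42.0, -125.0, -114.0)
```

The last point lies outside the box and is ignored; the first two fall into the same cell, so that cell holds a count of 2.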
Data Model
Spatio-temporal element: stel = [s-t-coord, theme(s), value(s), pointer(s)]
E-mage: g = {(x, {(tm, v(x))}) | x ∈ X = R², tm ∈ θ, and v(x) ∈ V = N}
Temporal E-mage Set: TES = {(t1, g1), ..., (tn, gn)}
Temporal Pixel Set: TPS = {(t1, p1), ..., (tn, pn)}
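One possible way to mirror this data model in code is sketched below. The class names, field layouts and the use of nested lists for the grid are assumptions for illustration; the slides define only the abstract tuples.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Stel:
    """Spatio-temporal element: [s-t-coord, theme(s), value(s), pointer(s)]."""
    coord: Tuple[float, float, str]   # (row/lat, col/lon, timestamp), assumed layout
    themes: List[str]
    values: List[float]
    pointers: List[str] = field(default_factory=list)  # e.g. ids of source items

@dataclass
class Emage:
    """One grid of natural-number values per theme, indexed [row][col]."""
    layers: Dict[str, List[List[int]]]

    def value(self, theme, row, col):
        return self.layers[theme][row][col]

# Temporal E-mage Set and Temporal Pixel Set as time-ordered lists of pairs.
TES = List[Tuple[str, Emage]]
TPS = List[Tuple[str, Stel]]

g1 = Emage({"iphone": [[0, 2], [1, 5]]})
tes: TES = [("2009-06-11", g1)]
peak = Stel(coord=(1, 1, "2009-06-11"), themes=["iphone"], values=[5.0])
tps: TPS = [("2009-06-11", peak)]
```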
Operations
1. Selection Operation
2. Arithmetic and Logical Operation
3. Aggregation Operation α
4. Grouping Operation
5. Characterization Operation (Spatial, Temporal)
6. Pattern Matching Operation (Spatial, Temporal)
1. Selection Operation
Select part of an E-mage based on predicate P
Input: Temporal E-mage Set TES = {(t1, g1), …, (tn, gn)}
Output: Temporal E-mage Set TES’
Spatial or value predicate Pi on E-mage:
σ_Pi(TES) = {(t1, σ_Pi(g1)), …, (tn, σ_Pi(gn))}, where σ_Pi(g) = {(x, y) | y = g(x) if Pi(x, y) is true; y = 0 otherwise}
Boolean predicate Pt on time:
σ_Pt(TES) = {(t1’, g1’), …, (tm’, gm’)}, where Pt(ti’) is true, e.g. date = ‘2010-03-10’
Selection Examples
Show last one week’s E-mages of California for topic ‘Obama’:
σ_{R=cal ∧ t<=1wk ∧ theme=Obama}(TES)
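The selection operation can be sketched as two filters over a list of (time, grid) pairs: a time predicate that drops whole E-mages, and a spatial or value predicate that zeroes out the cells it rejects. The function names, the list-of-lists grid and the toy region mask are assumptions for illustration.

```python
from datetime import date, timedelta

def select_spatial(emage, predicate):
    """Keep g(x) where predicate(row, col, value) is true; 0 elsewhere."""
    return [[v if predicate(r, c, v) else 0 for c, v in enumerate(row)]
            for r, row in enumerate(emage)]

def select_tes(tes, spatial_pred=None, time_pred=None):
    """Apply the time predicate to timestamps and the spatial one to grids."""
    out = []
    for t, g in tes:
        if time_pred is not None and not time_pred(t):
            continue
        out.append((t, select_spatial(g, spatial_pred) if spatial_pred else g))
    return out

tes = [
    (date(2010, 3, 1), [[1, 2], [3, 4]]),
    (date(2010, 3, 9), [[5, 6], [7, 8]]),
    (date(2010, 3, 10), [[9, 1], [0, 2]]),
]
today = date(2010, 3, 10)
last_week = lambda t: t >= today - timedelta(days=7)
in_region = lambda r, c, v: c == 0   # hypothetical region mask: left column only

selected = select_tes(tes, spatial_pred=in_region, time_pred=last_week)
```

A theme predicate would be handled the same way once each E-mage carries its theme; it is left out here to keep the sketch short.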
2. Arithmetic Operation
Binary operations between two (or more) E-mage Sets
⊕_f(g1, g2) = g3(x, f(v1(x), v2(x))), where f ∈ {+, -, *, /, max, min, convolution}, and g1 and g2 are the same size.
Example:
TES1 = Temporal E-mage Set for ‘Unemployment rate’
TES2 = Temporal E-mage Set for ‘normalized Gas prices’
TES3 = ⊕(TES1, TES2)
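A cell-wise binary operation between two same-size E-mages can be sketched as follows. The slide does not say which function combines the two example sets, so the use of addition and `max` here is an assumption, as are the function names and the toy values; convolution is not cell-wise and is left out.

```python
import operator

def combine(g1, g2, op):
    """Apply a binary function cell by cell to two same-size grids."""
    if len(g1) != len(g2) or any(len(a) != len(b) for a, b in zip(g1, g2)):
        raise ValueError("E-mages must be the same size")
    return [[op(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(g1, g2)]

def combine_tes(tes1, tes2, op):
    """Pair E-mages that share a timestamp and combine each pair."""
    by_time = dict(tes2)
    return [(t, combine(g, by_time[t], op)) for t, g in tes1 if t in by_time]

tes1 = [("t1", [[1, 4], [2, 0]]), ("t2", [[3, 3], [1, 1]])]   # e.g. unemployment
tes2 = [("t1", [[2, 1], [2, 5]]), ("t2", [[0, 6], [1, 0]])]   # e.g. gas prices

tes_sum = combine_tes(tes1, tes2, operator.add)
tes_max = combine_tes(tes1, tes2, max)
```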
3. Aggregation Operation α
Aggregates multiple E-mages in TES based on function f.
α_f(g1, g2) = g3(x, f(v1(x), v2(x))), where f ∈ {+, *, mean, max, min}, and g1 and g2 are the same size.
Example: Show the average E-mage of last one week’s E-mages from California for Obama.
α_mean(σ_{R=cal ∧ t<=1wk ∧ theme=Obama}(TES))
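Aggregation collapses a whole Temporal E-mage Set into a single E-mage by applying a function to the values each cell takes over time. The sketch below assumes same-size list-of-lists grids; the `aggregate` and `mean` names are illustrative.

```python
def aggregate(tes, func):
    """Reduce all E-mages in a TES to one grid, cell by cell."""
    grids = [g for _, g in tes]
    rows, cols = len(grids[0]), len(grids[0][0])
    return [[func([g[r][c] for g in grids]) for c in range(cols)]
            for r in range(rows)]

def mean(values):
    return sum(values) / len(values)

week = [("mon", [[2, 4], [0, 6]]), ("tue", [[4, 8], [0, 0]])]
average_emage = aggregate(week, mean)
peak_emage = aggregate(week, max)
total_emage = aggregate(week, sum)
```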
4. Grouping Operation
Group stels in an E-mage g based on a certain function f
Input: Temporal E-mage Set TES = {(t1, g1), …, (tn, gn)}
Output: Temporal E-mage Set TES’
Function f essentially splits g into multiple sub-E-mages.
γ_f(TES) = γ_f((t1, g1)) ∪ … ∪ γ_f((tn, gn)), where γ_f((ti, gi)) = {(ti, gi1’), …, (ti, gik’)}, and each gij’ is a sub-E-mage of gi based on f
f ∈ {segmentation, clustering, blob-detection, etc.}
Grouping Example
Identify 3 clusters for each E-mage in the TES set having last one week’s E-mages of California.
γ_{clustering, n=3}(σ_{R=cal ∧ t<=1wk}(TES))
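As a sketch of grouping, the code below splits each E-mage into sub-E-mages using blob detection (4-connected components of non-zero cells), one of the grouping functions listed above. It is swapped in for the clustering used in the example query because it needs no extra libraries; the function names and threshold parameter are assumptions.

```python
def blobs(emage, threshold=0):
    """Split a grid into one same-size sub-grid per connected blob."""
    rows, cols = len(emage), len(emage[0])
    seen = [[False] * cols for _ in range(rows)]
    subs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or emage[r0][c0] <= threshold:
                continue
            sub = [[0] * cols for _ in range(rows)]
            stack = [(r0, c0)]
            seen[r0][c0] = True
            while stack:                      # flood fill from the seed cell
                r, c = stack.pop()
                sub[r][c] = emage[r][c]
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and emage[nr][nc] > threshold):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            subs.append(sub)
    return subs

def group_tes(tes, f):
    """Each (t, g) becomes several (t, sub-E-mage) pairs."""
    return [(t, sub) for t, g in tes for sub in f(g)]

g = [[5, 0, 0],
     [3, 0, 2],
     [0, 0, 4]]
grouped = group_tes([("t1", g)], blobs)
```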
5a. Characterization Op. (Spatial)
Represent each E-mage g based on a characteristic C, and store the result as a stel.
Input: Temporal E-mage Set TES = {(t1, g1), …, (tn, gn)}
Output: Temporal Pixel Set TPS = {(t1, p1), …, (tn, pn)}
@_C(TES) = {(t1, @_C(g1)), …, (tn, @_C(gn))}, where @_C(gi) is a pixel characterizing gi
C ∈ {count, max, min, sum, average, coverage, epicenter, density, shape, growth_rate, periodicity}
Characterization Examples (Spatial)
Find the epicenter of each cluster E-mage in the last one week’s E-mages of USA from TES:
@_epicenter(γ_{clustering, n=3}(σ_{R=USA ∧ t<=1wk ∧ theme=Obama}(TES)))
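Spatial characterization reduces each E-mage to a single value or pixel. The sketch below covers three of the listed characteristics. The slides do not define "epicenter"; treating it as the value-weighted centroid of the grid is an assumption made here, as are the function names.

```python
def characterize(emage, characteristic):
    """Summarize one grid as a single value (or a (row, col) position)."""
    cells = [(r, c, v) for r, row in enumerate(emage) for c, v in enumerate(row)]
    total = sum(v for _, _, v in cells)
    if characteristic == "sum":
        return total
    if characteristic == "max":
        return max(v for _, _, v in cells)
    if characteristic == "epicenter":
        if total == 0:
            return None                        # no mass, no epicenter
        # Assumed definition: centroid weighted by cell value.
        return (sum(r * v for r, _, v in cells) / total,
                sum(c * v for _, c, v in cells) / total)
    raise ValueError("unsupported characteristic: " + characteristic)

def characterize_tes(tes, characteristic):
    """Temporal E-mage Set in, Temporal Pixel Set out."""
    return [(t, characterize(g, characteristic)) for t, g in tes]

tes = [("t1", [[0, 0, 0], [0, 4, 0], [0, 0, 0]]),
       ("t2", [[2, 0], [0, 2]])]
epicenters = characterize_tes(tes, "epicenter")
sums = characterize_tes(tes, "sum")
```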
5b. Characterization Op. (Temporal)
Characterize a temporal pixel set, which is the result of E-mage characterization
Input: Temporal Pixel Set TPS = {(t1, p1), …, (tn, pn)}
Output: Temporal Pixel Set TPS’
@_τ(TPS) = {(tk, τ((t1, p1), …, (tk, pk))) | k ∈ [2, n]}, where τ ∈ {displacement, distance, velocity, speed, acceleration, linear extrapolation, exponential growth, exponential decay, etc.}
Temporal Characterization Examples
Find the velocity of the epicenter of each cluster E-mage over the last one week’s E-mages of California from TES for theme Katrina:
@_velocity(@_epicenter(γ_{clustering, n=3}(σ_{R=Cal ∧ t<=1wk ∧ theme=Katrina}(TES))))
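Temporal characterization works on a Temporal Pixel Set, for instance a series of epicenter positions. The sketch below computes velocity and speed from consecutive pairs, producing one output per time step from the second onward, which matches the index range k in [2, n] given above. Numeric timestamps and the finite-difference definition are assumptions.

```python
import math

def velocity(tps):
    """Finite-difference velocity between consecutive (t, position) pairs."""
    out = []
    for (t_prev, p_prev), (t_cur, p_cur) in zip(tps, tps[1:]):
        dt = t_cur - t_prev
        out.append((t_cur, tuple((b - a) / dt for a, b in zip(p_prev, p_cur))))
    return out

def speed(tps):
    """Magnitude of the velocity vector at each time step."""
    return [(t, math.hypot(*v)) for t, v in velocity(tps)]

# Epicenter positions (row, col) at times 0, 1 and 3 (toy values).
epicenter_track = [(0, (0.0, 0.0)), (1, (3.0, 4.0)), (3, (3.0, 8.0))]
v = velocity(epicenter_track)
s = speed(epicenter_track)
```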
6. Pattern Matching
Pattern Matching (Spatial)
Compare the similarity between each E-mage and a given pattern P
Input: Temporal E-mage Set TES = {(t1, g1), …, (tn, gn)}, and pattern P
Output: Temporal Pixel Set TPS
ψ_P(TES) = {(t1, p1), …, (tn, pn)}, where each value in pi represents the similarity between the E-mage and the given pattern
Patterns (i.e. kernels) can be loaded from a library or be historical data samples.
Pattern Matching
Temporal pattern matching: compare the similarity of the temporal value changes with a given pattern, e.g. ‘increasing’, ‘decreasing’, or ‘Enron’s stock in 1999’, …
Input: Temporal Pixel Set TPS = {(t1, p1), …, (tn, pn)}, and a pattern P
Output: Temporal Pixel Set TPS’
ψ_P(TPS) = {(tn, p)}, where v(x) in p is the similarity value
Pattern Matching Examples
Compare the similarity between each E-mage in the last one week’s E-mages of California from TES with radial decay:
ψ_{radial_decay}(σ_{R=cal ∧ t<=1wk ∧ theme=Obama}(TES))
How close is the similarity above to the pattern of “Enron’s stock price in 1999”?
ψ_{Enron’s stock}(ψ_{radial_decay}(σ_{R=cal ∧ t<=1wk}(TES)))
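Both forms of pattern matching can be sketched with simple similarity measures. The slides do not name the measure, so cosine similarity (spatial) and Pearson correlation (temporal) are assumptions here, as is the particular radial decay kernel 1 / (1 + distance from center).

```python
import math

def radial_decay_kernel(size):
    """Hypothetical pattern: value falls off with distance from the center."""
    mid = (size - 1) / 2
    return [[1.0 / (1.0 + math.hypot(r - mid, c - mid)) for c in range(size)]
            for r in range(size)]

def cosine_similarity(g, pattern):
    a = [v for row in g for v in row]
    b = [v for row in pattern for v in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_spatial(tes, pattern):
    """TES in, one similarity value per time step out."""
    return [(t, cosine_similarity(g, pattern)) for t, g in tes]

def match_temporal(tps, pattern):
    """Pearson correlation of the value series with an equal-length pattern,
    reported as a single pixel stamped with the last timestamp."""
    xs = [v for _, v in tps]
    mx, my = sum(xs) / len(xs), sum(pattern) / len(pattern)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, pattern))
    var = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in pattern))
    return [(tps[-1][0], cov / math.sqrt(var) if var else 0.0)]

kernel = radial_decay_kernel(3)
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
spatial = match_spatial([("t1", kernel), ("t2", flat)], kernel)
rising = match_temporal([(1, 1.0), (2, 2.0), (3, 3.0)], [0, 1, 2])
falling = match_temporal([(1, 3.0), (2, 2.0), (3, 1.0)], [0, 1, 2])
```

Chaining the two, as in the second query above, would mean feeding the output of `match_spatial` into `match_temporal` with a historical series as the pattern.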
Situation detection operators
S. No | Operator | Input | Output
1 | Selection | Temporal E-mage Set | Temporal E-mage Set
2 | Arithmetic & Logical | K * Temporal E-mage Set | Temporal E-mage Set
3 | Aggregation α | Temporal E-mage Set | Temporal E-mage Set
4 | Grouping | Temporal E-mage Set | Temporal E-mage Set
5 | Characterization: Spatial | Temporal E-mage Set | Temporal Pixel Set
5 | Characterization: Temporal | Temporal Pixel Set | Temporal Pixel Set
6 | Pattern Matching: Spatial | Temporal E-mage Set | Temporal Pixel Set
6 | Pattern Matching: Temporal | Temporal Pixel Set | Temporal Pixel Set
Media processing engine
Implementation and results
Twitter feeds
• Geo-coding user home location
• Loops of location-based queries for different terms
• Over 100 million tweets using the ‘Spritzer’ stream (since Jun 2009), and the higher-rate ‘Gardenhose’ stream since Nov 2009
Flickr feeds
• API
• Tags, RGB values from >800K images
Correlation with real world events
Applications
• Business decision making
• Political event analytics
• Seasonal characteristics analysis
Situation awareness: iPhone launch
Spatio temporal variation: Visualization
Business intelligence: Queries
[Figure: operator pipeline for the business intelligence query]
• iPhone theme-based E-mages, Jun 2 to Jun 11 → (+ aggregation) → aggregate interest
• AT&T retail locations → (convolution * store catchment area) → AT&T total catchment area
• Aggregate interest and AT&T total catchment area → (- difference) → under-served interest areas
• Under-served interest areas → (convolution * store catchment area) → MAXIMA → decision
Decision: Best location is at geocode [39, -122], just north of Bay Area, CA
<geoname><name>College City</name><lat>39.0057303</lat><lng>-122.0094129</lng><geonameId>5338600</geonameId><countryCode>US</countryCode><countryName>United States</countryName><fcl>P</fcl><fcode>PPL</fcode><fclName>city, village,...</fclName><fcodeName>populated place</fcodeName><population/><distance>1.0332</distance></geoname>
Combination of operators
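A toy end-to-end sketch of how such operators might be combined for the store-location query: aggregate daily interest, convolve store locations with a catchment kernel, subtract, convolve the remainder, and take the maximum. The grid values, the 3x3 all-ones catchment kernel, clipping the difference at zero and all function names are assumptions for illustration, not the authors' data or code.

```python
def convolve(grid, kernel):
    """Same-size 2-D sliding-window sum with zero padding. The kernel is not
    flipped, which equals convolution for the symmetric kernel used here."""
    rows, cols, k = len(grid), len(grid[0]), len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[r][c] += grid[rr][cc] * kernel[dr + k][dc + k]
    return out

def argmax(grid):
    """(row, col) of the largest cell."""
    return max(((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
               key=lambda rc: grid[rc[0]][rc[1]])

# Two daily theme E-mages (toy values) and one existing store at cell (1, 1).
day1 = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 3], [0, 0, 2, 3]]
day2 = [[0, 0, 0, 0], [0, 2, 0, 0], [0, 0, 0, 2], [0, 0, 3, 2]]
stores = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
catchment_kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # hypothetical catchment

# Aggregation (+) over the days gives the aggregate interest.
interest = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(day1, day2)]
# Store locations convolved with the catchment kernel give the served area.
catchment = convolve(stores, catchment_kernel)
# Difference, clipped at zero, gives the under-served interest.
underserved = [[max(i - s, 0) for i, s in zip(ri, rs)]
               for ri, rs in zip(interest, catchment)]
# Convolve again to score candidate sites by the interest they would capture.
score = convolve(underserved, catchment_kernel)
best_cell = argmax(score)
```

Mapping `best_cell` back to a latitude and longitude, and then to a place name, would be a separate reverse-geocoding step.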
Political event analytics: Queries
Snapshot: http://socialemage.appspot.com
Flickr social E-mages, Jan – Dec 2009
Seasonal characteristics analysis
Year average: peak of green at [35, -84], at the junction of Chattahoochee National Forest, Nantahala National Forest, Cherokee National Forest and Great Smoky Mountains National Park
Variations throughout the year
[Plot: Total energy, Jan to Dec]
[Plot: Fall colors of New England, [R-G] channel data, Jan to Dec]
Conclusions
Combining spatio-temporal event data for visualization and analytics.
An e-mage representation of spatio-temporal thematic data coming in real-time.
Defined operators for real-time situation analysis
Applications in multiple domains
Challenges: Future work
• Defining a (visual) query language using operators
• Scalability
• Real-time data management for all possible topics which a user might be interested in
• Automatic tweets from sensors
• A reverse-911-like control/recommendation mechanism
• Creating an event web by connecting all event-related data