Clarifying Sensor Anomalies using Social Network feeds

Prasanna Giridhar*, Tanvir Amin*, Lance Kaplan+, Jemin George+, Raghu Ganti++, Tarek Abdelzaher*

* University of Illinois at Urbana Champaign
+ U.S. Army Research Lab
++ IBM Research, USA




INTRODUCTION

Explosive growth in the deployment of physical sensors.

Activities recorded by these sensors often deviate from the norm:

• Closure of a freeway due to a forest fire.
• Change in building occupancy due to a shutdown.

Unusual behavior tends to attract human attention and gets reported socially as well.


MOTIVATION

Several past research works detect events in the physical as well as the social domain. Can we use social media as a tool for explaining the underlying cause of anomalies?

A system for identifying the discriminative social feeds that can be correlated with sensor anomalies.

The more unusual the event, the higher the probability that it is reported socially.

Evaluation performed on real-time traffic data.


System Work-flow

STEP 1: Initialization of the system

Continuous stream of tweets, filtered using the parameters:

• Keywords
• Location

Continuous stream of data from physical sensors
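A minimal sketch of the tweet filter in STEP 1. The slides only name the two parameters (keywords and location), so the `Tweet` type, the function names, and the choice to require both a keyword match and a location match are assumptions made for illustration. The radius test uses the haversine great-circle distance.

```python
import math
import re
from dataclasses import dataclass

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Tweet:
    """Hypothetical record for a geotagged tweet."""
    text: str
    lat: float
    lon: float


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def matches_stream_filter(tweet, keywords, center, radius_miles):
    """True if the tweet contains any keyword AND lies within the radius.

    Combining the two parameters with AND is an assumption.
    """
    tokens = set(re.findall(r"[a-z0-9]+", tweet.text.lower()))
    has_keyword = any(k.lower() in tokens for k in keywords)
    in_range = haversine_miles(tweet.lat, tweet.lon, *center) <= radius_miles
    return has_keyword and in_range
```

For example, a tweet posted in downtown Los Angeles that contains "traffic" passes a filter centered on Los Angeles with a 30-mile radius, while the same text posted from San Francisco does not.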


Detecting events in Sensors

STEP 2: Identification of sensor anomalies

Run a black-box anomaly detection algorithm on the sensor data.

Store the attributes of the sensors that the algorithm classifies as anomalous.

Cluster the sensors that provide redundant data.
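A minimal sketch of STEP 2. The slides treat the detector as a black box, so a simple z-score threshold is swapped in here purely as a stand-in. The `Anomaly` attributes and the redundancy rule (same highway, overlapping time windows) are also assumptions, not the authors' method.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Anomaly:
    """Attributes stored for a sensor flagged as anomalous (hypothetical)."""
    sensor_id: str
    highway: str  # e.g. "I-15 N"
    start: int    # index of the first anomalous reading
    end: int      # index of the last anomalous reading


def detect(sensor_id, highway, history, current, z_threshold=3.0):
    """Stand-in for the black box: flag readings far from the historical mean."""
    mu, sigma = mean(history), stdev(history)
    flagged = [i for i, v in enumerate(current)
               if sigma > 0 and abs(v - mu) / sigma > z_threshold]
    if not flagged:
        return None
    return Anomaly(sensor_id, highway, flagged[0], flagged[-1])


def cluster(anomalies):
    """Group anomalies on the same highway whose time windows overlap."""
    clusters = []
    for a in sorted(anomalies, key=lambda x: (x.highway, x.start)):
        last = clusters[-1] if clusters else None
        if (last and last[-1].highway == a.highway
                and a.start <= max(x.end for x in last)):
            last.append(a)       # redundant with the current cluster
        else:
            clusters.append([a])  # start a new cluster
    return clusters
```

Each resulting cluster can then be treated as a single sensor event in the later steps.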



Discriminative Social Feeds

STEP 3: Identification of discriminative social feeds

Social feeds often contain keywords that describe an event.

Example keywords: malaysian, airlines, 370


Keyword Signatures

Single Keyword?

Airlines


Keyword Signatures

Keyword pair?

Malaysian, Airlines


Keyword Signatures

Keyword triplet?

Malaysia, Airlines, 370

Malaysia, Airlines, Satellite
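A minimal sketch of how keyword signatures of a given size (single keyword, pair, triplet) could be enumerated from a tweet. The tokenizer and the small stopword list are assumptions; the slides do not describe the preprocessing.

```python
import re
from itertools import combinations

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "to", "and"}


def signatures(text, size):
    """Return all keyword combinations of the given size found in the text.

    Tokens are lowercased, de-duplicated, and sorted so that the same
    set of keywords always yields the same signature tuple.
    """
    tokens = sorted(set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS)
    return list(combinations(tokens, size))
```

For the text "Malaysian Airlines 370", `signatures(text, 1)` yields three single keywords, `signatures(text, 2)` yields three pairs, and `signatures(text, 3)` yields one triplet.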


Keyword Signatures

Signature         Events per Signature   Signatures per Event
Single keyword    3.621                  1.1579
Keyword pair      1.1416                 1.2725
Keyword triplet   1.0628                 0.4393

Table: Signature profile on the Twitter data collected

Keyword pairs come closest to the ideal 1-to-1 mapping between signatures and events.
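A sketch of how the two profile columns could be computed, under an assumed reading of the column names: "events per signature" is the average number of distinct events a signature appears in, and "signatures per event" is the average number of signatures attached to an event (events with no signature count as zero). The slides do not give the exact definitions, and the input mapping is hypothetical.

```python
def signature_profile(sig_to_events, all_events):
    """Return (events per signature, signatures per event).

    sig_to_events: dict mapping a signature tuple to the set of event
                   labels it occurs in (hypothetical ground-truth labels).
    all_events:    collection of every event label, including events
                   that no signature covers.
    """
    total_links = sum(len(events) for events in sig_to_events.values())
    events_per_signature = total_links / len(sig_to_events)
    signatures_per_event = total_links / len(all_events)
    return events_per_signature, signatures_per_event
```

Under this reading, both values equal 1 exactly when every signature identifies one event and every event has one signature.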


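Possible Approaches

Problem: Given a list of keyword pairs for the current and past windows, how do we find the most discriminating subset?

Difference in rate of occurrences:

• (traffic, jam): 50 times today compared to a past average of 35.
• (drunk, kills): 12 times today compared to a past average of 0.

The absolute difference favors the common pair (traffic, jam), at +15, over the more telling pair (drunk, kills), at +12.

Increase in percentage:

• (traffic, jam): 1 time today compared to a past average of 0.
• (drunk, kills): 12 times today compared to a past average of 2.

The relative increase is unbounded for a pair with a past average of 0, so a single occurrence of (traffic, jam) outranks the sixfold rise of (drunk, kills).

Overcome the disadvantages of both approaches using Information Gain Theory.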
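The two baseline scores on this slide can be sketched as follows, using the slide's own numbers. The function names are illustrative, and treating a past average of 0 as an infinite increase is an assumption about how the percentage baseline handles that case.

```python
def rate_difference(today, past_avg):
    """Baseline 1: absolute difference in the rate of occurrences."""
    return today - past_avg


def percentage_increase(today, past_avg):
    """Baseline 2: relative increase over the past average, in percent."""
    if past_avg == 0:
        # Assumption: any occurrence after a zero average is unbounded.
        return float("inf") if today > 0 else 0.0
    return 100.0 * (today - past_avg) / past_avg


# Difference in rate: the common pair wins (15 > 12).
print(rate_difference(50, 35), rate_difference(12, 0))
# Increase in percentage: one chance occurrence wins (inf > 500.0).
print(percentage_increase(1, 0), percentage_increase(12, 2))
```

In both cases the pair that ranks first is (traffic, jam), even though (drunk, kills) is the more unusual signal.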


Information Gain Theory and Entropy

Entropy measures the randomness (uncertainty) introduced by a variable.

Using the conditional entropy, determine the information gain about an event provided by the keyword pair. This can be formulated as:

Information Gain = H(Y) − H(Y|X)

Y: variable associated with the event; y=0 (normal) and y=1 (anomalous)

X: variable associated with the keyword pair; x=0 (absent) and x=1 (present)
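A minimal sketch of the information gain computation for the binary variables X and Y defined on this slide. The input is a table of joint counts over observations; what counts as one observation (a tweet, a time window) is not stated in the slides and is left to the caller.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(counts):
    """Return H(Y) - H(Y|X) for binary X and Y.

    counts[(x, y)] is the number of observations with keyword-pair
    state x (0 absent, 1 present) and event state y (0 normal,
    1 anomalous). Missing keys are treated as zero.
    """
    total = sum(counts.values())
    p_y = [sum(c for (x, y), c in counts.items() if y == v) / total
           for v in (0, 1)]
    h_y = entropy(p_y)

    h_y_given_x = 0.0
    for xv in (0, 1):
        n_x = sum(c for (x, y), c in counts.items() if x == xv)
        if n_x == 0:
            continue  # this state of X never occurs
        cond = [counts.get((xv, yv), 0) / n_x for yv in (0, 1)]
        h_y_given_x += (n_x / total) * entropy(cond)
    return h_y - h_y_given_x
```

A pair that appears only in anomalous observations drives H(Y|X) to 0 and the gain to H(Y); a pair that is independent of the event leaves the gain at 0.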


Rank the unusual events

STEP 4: Ranking discriminative events

Identify the tweets that contain the discriminative pairs.

Score each pair using its conditional entropy H(Y|X).

The lower the entropy value, the higher the discriminating power.
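A minimal sketch of the ranking in STEP 4, assuming the conditional entropy of each keyword pair has already been computed and that pairs are simply sorted in ascending order of that value. The dictionaries and their contents are hypothetical.

```python
def rank_pairs(cond_entropy, tweets_by_pair):
    """Order keyword pairs from most to least discriminating.

    cond_entropy:   dict mapping a keyword pair to its H(Y|X) value.
    tweets_by_pair: dict mapping a keyword pair to the tweets containing it.
    Returns a list of (pair, entropy, tweets), lowest entropy first.
    """
    ranked = sorted(cond_entropy, key=cond_entropy.get)
    return [(pair, cond_entropy[pair], tweets_by_pair.get(pair, []))
            for pair in ranked]
```

The tweets attached to the first few entries are the candidate explanations passed on to the matching step.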


Mapping both events

STEP 5: Matching tweets with sensor anomalies

We align both data sources based on the spatiotemporal properties associated with the event.

For example:

• Sensor anomaly: Sensor ID40456 on I-15 Northbound with unusual activity.
• Unusual tweet: “SFvSD game tonight, stuck @15N traffic!!!”
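A minimal sketch of the spatiotemporal alignment in STEP 5. The slides do not say how location is matched, so this example assumes the highway is recognized from a token in the tweet text (such as "15N") and that a tweet may fall up to a fixed slack outside the anomaly window. The record types, the slack value, and the timestamps in the usage note are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SensorAnomaly:
    sensor_id: str
    highway: str    # highway number, e.g. "15"
    direction: str  # "N", "S", "E", or "W"
    start: datetime
    end: datetime


@dataclass
class RankedTweet:
    text: str
    posted: datetime


def mentions_road(text, highway, direction):
    """True if the text has a token such as '15' or '15n' (assumed rule)."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return highway.lower() in tokens or (highway + direction).lower() in tokens


def match(anomaly, tweets, slack=timedelta(hours=2)):
    """Tweets posted near the anomaly window that mention the same road."""
    return [t for t in tweets
            if anomaly.start - slack <= t.posted <= anomaly.end + slack
            and mentions_road(t.text, anomaly.highway, anomaly.direction)]
```

With an anomaly on highway "15", direction "N", the tweet from the slide matches because its text contains the token "15N", provided its timestamp falls within the window plus slack.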


Output Explanations

STEP 6: Output the matched explanations

The final step is to provide the explanations.

A user interface enables users to track unusual events on a per-day basis.


EXPERIMENTAL RESULTS

Twitter feeds collected for a period of 2 weeks, from August 19 to September 1, 2013, with a radius of 30 miles.

Three cities in CA:

• Los Angeles
• San Francisco
• San Diego

Physical sensor data retrieved from PeMS (Caltrans Performance Measurement System, http://pems.dot.ca.gov/): 5-minute reports for flow, speed, occupancy, and delay.


EXPERIMENTAL RESULTS

Performance is measured using precision and mean average rank, comparing our information gain approach against the baseline approaches.

Table: Precision using different methods

B1 corresponds to Difference in rate of occurrences and B2 to Increase in percentage.

Table: Average position of tweets from the top
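A sketch of the two evaluation measures, under assumed definitions since the slides do not spell them out: precision as the fraction of reported explanations judged correct, and average position as the mean 1-based rank of the explaining tweet in each ranked list. The inputs in the usage lines are made-up illustrations, not results from the study.

```python
def precision(is_correct):
    """Fraction of reported explanations that were judged correct."""
    return sum(is_correct) / len(is_correct)


def mean_rank(ranks):
    """Average 1-based position of the explaining tweet from the top."""
    return sum(ranks) / len(ranks)


# Illustrative inputs only.
print(precision([True, True, False, True]))
print(mean_rank([1, 2, 1, 3]))
```

A lower mean rank means the explaining tweet tends to appear closer to the top of the list.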


INTERESTING EVENTS

Sensor anomaly detected

Highway: I-80 Eastbound in SF

Landmarks: Bay Bridge

Duration: 4 days



INTERESTING EVENTS

US101 blockage due to bomb squad in LA


INTERESTING EVENTS

Traffic on 15N due to game in SD


CONCLUSION

Abnormal behavior is recorded in social media, which can be used as a tool to explain the abnormalities.

Major activities explained with high precision.

Explanations ranked among the top two tweets.


Future Work

• Scalability issues
• Credibility of social feeds
• Geo-localization of tweets


THANK YOU

Q+A