Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07


Page 1

Towards situational awareness systems for disaster response

Naveen Ashish

Calit2@UC-Irvine

Bell Labs India, Bangalore,

04/23/07

Page 2

Organization

Introduction to SAMI

Selected research areas

Technology transition

Discussion

Page 3

RESCUE

NSF funded “large-ITR” project: advance information technologies for disaster response

5 year project: Oct 2003 to Oct 2008

Institutions: 6 universities (UCI, UCSD, UIUC, BYU, U-Colorado, U-Maryland) and 1 company (ImageCat)

Active and formal community partners: City of LA, OCFA, Irvine Police, ….

People: Director Sharad Mehrotra; ~25 researchers and staff, ~40 students

Web: http://www.itr-rescue.org

The SAMI TEAM

Students: Stella Chen, Chaitanya Desai, Vibhav Gogate, Jon Hutchinson, Ram Hariharan, Shengyue Ji, Yiming Ma, Rabia Nuray-Turan, Dawit Seid, Shankar Shivappa

Staff: Jay Lickfett, Chris Davison

Collaborators: Charles Huyck, Ron Eguchi, Shubharoop Ghosh

Faculty, Scientists and Post-docs: Dmitri Kalashnikov, Rajesh Hegde, Sharad Mehrotra, Sangho Park

Slide Aggregator (aka Project Leader): Naveen Ashish

Page 4

RESCUE Mission

The mission of RESCUE is to enhance the ability of emergency response organizations and the public to mitigate crises, save lives, and prevent secondary and indirect human and economic loss by radically transforming ways in which these organizations gather, process, manage, use and disseminate information during man-made and natural catastrophes.

Page 5

Motivation: Transform the Ability of First Responders to Mitigate Crisis

Observation: Right Information to the Right Person at the Right Time can result in dramatically better response

Response Effectiveness
• lives & property saved
• damage prevented
• cascades avoided

Quality & Timeliness of Information

Situational Awareness
• incidences
• resources
• victims
• needs

Quality of Decisions
• first responders
• consequence planners
• public

Page 11

RESCUE Objectives

Develop technologies to dramatically improve situational awareness of first-responders, response organizations, and the public by providing them with timely access to accurate, reliable and actionable information about the disaster.

Develop technologies that enable seamless information sharing and collective decision making across highly dynamic virtual organizations consisting of diverse entities (government, private sector, NGOs, individuals).

Develop robust communication systems that continue to operate in crisis situations despite partial/total failure of infrastructure and increased communication demands.

Develop technologies that can be used for timely and customized dissemination of crisis information that inform the public at large thus enhancing the abilities of the affected populations to take appropriate self-protective actions.

Explore the privacy challenges that emerge as a result of infusing technology to improve information flow in crisis response networks and the public.

Promote interdisciplinary education at all levels (graduate, undergraduate, K-12) and across diverse student groups to expose the future community of citizens to issues in emergency management and homeland security – an area of global and national importance.

Page 12

RESCUE Research Projects

SAMI: Situational Awareness from Multi-Modal Input (Project Lead: N. Ashish, UCI)

PISA: Policy-driven Information Sharing Architecture (Project Lead: M. Winslett, UIUC)

Customized Dissemination in the Large (Project Leads: K. Tierney, UC-B & N. Venkatasubramanian, UCI)

Privacy Implications of Technology Adoption (Project Lead: S. Mehrotra, UCI)

Robust Networking and Information Collection (Project Lead: BS Manoj, UCSD)

Page 13

A Situational Awareness Application

Reports Responders News Weather Traffic

Damage Assessment

Evacuation Planning

Situational Dashboard

Simulations Reconnaissance System

Information

Applications

Page 14

Architecture

Situational data management

Analysis

Extraction and synthesis

Events as fundamental abstraction units

Page 15

Areas

Situational awareness systems

Extraction and synthesis

Data management

Analysis

semantic extraction from text

audio-visual extraction

E event model

SAT-ware

graph analysis

geospatial

predictive modeling

damage assessment

spatial indexing

Page 16

Extraction and Synthesis

Semantic extraction from text

Audio event extraction

Visual event extraction

Page 17

Why do we need “Data Cleaning”?

An actual excerpt from a person’s CV, sanitized for privacy:

“... In June 2004, I was listed as the 1000th most cited author in computer science (of 100,000 authors) by CiteSeer, available at http://citeseer.nj.nec.com/allcited.html. ...”

• such statements are quite common in CVs, etc.
• this particular person argues he is good because his work is well-cited
• but there is a problem with using the CiteSeer ranking: in general, it is not valid (in CVs)
• let’s see why...

Page 18

Suspicious entries

Let us go to the DBLP website, which stores bibliographic entries of many CS authors

Let us check who “A. Gupta” and “L. Zhang” are

What is the problem in the example?

[Screenshots: CiteSeer, the top-k most cited authors; DBLP entries]

Page 19

Comparing raw and cleaned CiteSeer

Rank            Author               Location        # citations
1 (100.00%)     douglas schmidt      cs@wustl        5608
2 (100.00%)     rakesh agrawal       almaden@ibm     4209
3 (100.00%)     hector garciamolina  @               4167
4 (100.00%)     sally floyd          @aciri          3902
5 (100.00%)     jennifer widom       @stanford       3835
6 (100.00%)     david culler         cs@berkeley     3619
6 (100.00%)     thomas henzinger     eecs@berkeley   3752
7 (100.00%)     rajeev motwani       @stanford       3570
8 (100.00%)     willy zwaenepoel     cs@rice         3624
9 (100.00%)     van jacobson         lbl@gov         3468
10 (100.00%)    rajeev alur          cis@upenn       3577
11 (100.00%)    john ousterhout      @pacbell        3290
12 (100.00%)    joseph halpern       cs@cornell      3364
13 (100.00%)    andrew kahng         @ucsd           3288
14 (100.00%)    peter stadler        tbi@univie      3187
15 (100.00%)    serge abiteboul      @inria          3060

[Panels: CiteSeer top-k; Cleaned CiteSeer top-k]

Page 20

What is the lesson?

• data should be cleaned first, e.g., determine the (unique) real authors of publications
• solving such challenges is not always “easy”; that explains a large body of work on data cleaning
• note: CiteSeer is aware of the problem with its ranking
• there are more issues with CiteSeer, many not related to data cleaning

“Garbage in, garbage out” principle: making decisions based on bad data can lead to wrong results.

Page 21

High-level view of the problem

[Figure: the problem. A raw dataset contains ambiguous references such as "J. Smith", which may denote John Smith (Intel) or Jane Smith (MIT). Extraction (uncertainty, duplicates, ...) has to produce a normalized dataset, to which data analysis techniques can then be applied. The normalized dataset (rows such as "John Smith, Intel" and "Jane Smith, MIT") is equivalent to an Attributed Relational Graph (ARG); nodes and edges can have labels, and the graph can describe any objects, not only people.]

Page 22

Traditional Domain-Independent DC Methods

[Figure: three panels.
• Feature-based similarity (FBS): object X and object Y are compared feature by feature (feature1, feature2, feature3).
• Context: a new feature (feature4) is derived from context and compared along with f1, f2, f3.
• RelDC = traditional FBS + relationship analysis (enhance the core): X and Y are additionally compared through the paths that connect them in the ARG (nodes A to F).]

Traditional techniques (FBS-based)

Page 23

What is “Reference Disambiguation”?

Author table (clean)

A1, ‘Dave White’, ‘Intel’
A2, ‘Don White’, ‘CMU’
A3, ‘Susan Grey’, ‘MIT’
A4, ‘John Black’, ‘MIT’
A5, ‘Joe Brown’, unknown
A6, ‘Liz Pink’, unknown

Publication table (to be cleaned)

P1, ‘Databases . . . ’, ‘John Black’, ‘Don White’
P2, ‘Multimedia . . . ’, ‘Sue Grey’, ‘D. White’
P3, ‘Title3 . . .’, ‘Dave White’
P4, ‘Title5 . . .’, ‘Don White’, ‘Joe Brown’
P5, ‘Title6 . . .’, ‘Joe Brown’, ‘Liz Pink’
P6, ‘Title7 . . . ’, ‘Liz Pink’, ‘D. White’

Analysis (‘D. White’ in P2, our approach):

1. ‘Don White’ has a paper with ‘John Black’ @ MIT
2. ‘Dave White’ is not connected to MIT in any way
3. ‘Sue Grey’ is coauthor of P2 too, and @ MIT

Thus: ‘D. White’ in P2 is probably Don (since we know he collaborates with MIT ppl.)

Analysis (‘D. White’ in P6, our approach):

1. ‘Don White’ has a paper (P4) with Joe Brown; Joe has a paper (P5) with Liz Pink; Liz Pink is a coauthor of P6.
2. ‘Dave White’ does not have papers with Joe or Liz

Thus: ‘D. White’ in P6 is probably Don (since co-author networks often form clusters)
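The two analyses above can be replayed mechanically. The following is a minimal sketch, not the RelDC implementation: it loads the example tables into an undirected graph (leaving out the ambiguous ‘D. White’ edges), builds the candidate set for ‘D. White’ by a naive initial-plus-surname match, and checks with breadth-first search whether each candidate is reachable from the paper. The function names and the matching rule are assumptions made for illustration, and ‘Sue Grey’ is assumed to be already resolved to ‘Susan Grey’.

```python
from collections import deque

# Author table (clean): name -> affiliation (None when unknown).
AUTHORS = {
    "Dave White": "Intel", "Don White": "CMU", "Susan Grey": "MIT",
    "John Black": "MIT", "Joe Brown": None, "Liz Pink": None,
}

# Publication table with only the already-resolved author references;
# the ambiguous 'D. White' entries of P2 and P6 are deliberately omitted.
PAPERS = {
    "P1": ["John Black", "Don White"],
    "P2": ["Susan Grey"],
    "P3": ["Dave White"],
    "P4": ["Don White", "Joe Brown"],
    "P5": ["Joe Brown", "Liz Pink"],
    "P6": ["Liz Pink"],
}

def build_graph():
    """Undirected graph: author-organization and paper-author edges."""
    graph = {}
    def link(a, b):
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    for name, org in AUTHORS.items():
        graph.setdefault(name, set())
        if org is not None:
            link(name, org)
    for paper, names in PAPERS.items():
        for name in names:
            link(paper, name)
    return graph

def choice_set(reference):
    """Authors whose surname and first initial match a reference like 'D. White'."""
    initial, last = reference.replace(".", "").split()
    return sorted(n for n in AUTHORS
                  if n.split()[-1] == last and n.startswith(initial))

def hops(graph, source, target):
    """Length of the shortest path from source to target, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

graph = build_graph()
for paper in ("P2", "P6"):
    for candidate in choice_set("D. White"):
        print(paper, candidate, hops(graph, paper, candidate))
```

Don White is reachable from both P2 and P6, while Dave White is not reachable from either, which matches the conclusions drawn on the slide. Reachability is only the crudest possible signal; the later slides replace it with a weighted connection strength.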

Page 24

Attributed Relational Graph (ARG)

View dataset as a graph
• nodes for entities: papers, authors, organizations (e.g., P2, Susan, MIT)
• edges for relationships: “writes”, “affiliated with” (e.g., Susan → P2, “writes”)
• “choice” nodes for uncertain relationships: mutual exclusion (“1” and “2” in the figure)

Analysis can be viewed as application of the “Context AP” to this graph (defined next...)

[Figure: ARG for the example, with papers P1 to P6, authors Dave White, Don White, Susan Grey, John Black, Joe Brown and Liz Pink, and organizations Intel, CMU and MIT. Choice nodes 1 and 2 stand for the two uncertain ‘D. White’ references; their option-edge weights (w1 = ?, w3 = ?, ...) are unknown.]

Q: How come domain-independent?

Page 25

Context Attraction Principle (CAP)

In designing the RelDC approach, our goal was to use CAP as an axiom, then solve the problem formally, without heuristics.

if reference r, made in the context of entity x, refers to an entity yj, but the description provided by r matches multiple entities y1, …, yj, …, yN,

then x and yj are likely to be more strongly connected to each other via chains of relationships than x and yk (k = 1, 2, …, N; k ≠ j).

[Figure: the reference “J. Smith” in publication P1 matches several entities, among them John E. Smith (SSN = 123), Joe A. Smith and Jane Smith.]

Page 26

Analyzing paths: linking entities and contexts

‘D. White’ is a reference in the context of P2, P6
• can link P2, P6 to Don
• cannot link P2, P6 to Dave
• more complex paths in general

[Figure: ARG for the running example (papers P1 to P6, their authors and organizations, choice nodes 1 and 2, unknown option-weights w1, w3, ...).]

Analysis (‘D. White’ in P2): path P2→Don

1. ‘Don White’ has a paper with ‘John Black’ @ MIT
2. ‘Dave White’ is not connected to MIT in any way
3. ‘Sue Grey’ is coauthor of P2 too, and @ MIT

Thus: ‘D. White’ is probably Don White

Analysis (‘D. White’ in P6): path P6→Don

1. ‘Don White’ has a paper (P4) with Joe Brown; Joe has a paper (P5) with Liz Pink; Liz Pink is a coauthor of P6.
2. ‘Dave White’ does not have papers with Joe or Liz

Thus: ‘D. White’ is probably Don White

Page 27

Questions to answer

1. Does the CAP principle hold over real datasets? That is, if we disambiguate references based on it, will the references be correctly disambiguated?

2. Can we design a generic solution to exploiting relationships for disambiguation?

Page 28

Problem formalization

Notation                 Meaning
X = {x1, x2, ..., xN}    the set of all entities in the database
xi.rk                    the k-th reference of entity xi (e.g., the name of the k-th author of paper xi, such as ‘J. Smith’)
a reference              a description of an object, multiple attributes
d[xi.rk]                 the “answer” for xi.rk: the real entity xi.rk refers to (unknown, the goal is to find it); e.g., the true k-th author of paper xi
CS[xi.rk]                the “choice set” for xi.rk: the set of all entities matching the description provided by xi.rk; e.g., ‘John A. Smith’, ‘Jane B. Smith’, ...
y1, y2, ..., yN          the “options” for xi.rk: elements in CS[xi.rk]
v[xi]                    the node in the graph for entity xi

Page 29

Entity-Relationship Graph

RelDC views dataset as a graph
• undirected
• nodes for entities: don’t have weights
• edges for relationships: have weights, a real number in [0,1], the confidence that the relationship exists

Handling References: Linking (references correspond to relationships)

if |CS[xi.rk]| = 1 then
• we know the answer d[xi.rk]
• link xi and d[xi.rk] directly, w = 1
else
• the answer is uncertain for xi.rk
• create a “choice” node, link it
• “option-weights”, w1 + ... + wN = 1
• option-weights are variables

[Figure: v[xi] is connected by edge e0 (w0 = 1) to the choice node cho[xi.rk], which is connected to the N nodes v[y1], v[y2], ..., v[yN] for the entities in CS[xi.rk]. Example: “J. Smith” in P1, with options “Jane Smith” and “John Smith”. The ARG of the running example is shown alongside.]
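The linking rule can be sketched in a few lines. This is an illustration only: the edge store (a dictionary keyed by unordered node pairs), the function name `link_reference`, and the uniform starting value 1/N for the option-weights are assumptions; the slides only say that the option-weights are variables that sum to 1.

```python
def link_reference(edges, entity, ref_id, choice_set):
    """Add the edges for one reference of `entity` to the graph.

    edges: dict mapping frozenset({u, v}) -> weight in [0, 1].
    Returns the name of the choice node, or None when the reference
    is unambiguous and was linked directly.
    """
    if len(choice_set) == 1:
        # |CS| = 1: the answer is known, link directly with w = 1.
        edges[frozenset((entity, choice_set[0]))] = 1.0
        return None

    # |CS| > 1: create a choice node, connected to the entity by e0 (w0 = 1)
    # and to each option by an option-edge.
    choice_node = f"cho[{entity}.{ref_id}]"
    edges[frozenset((entity, choice_node))] = 1.0
    n = len(choice_set)
    for option in choice_set:
        # Assumed starting point: uniform weights, so that w1 + ... + wN = 1.
        edges[frozenset((choice_node, option))] = 1.0 / n
    return choice_node


edges = {}
link_reference(edges, "P1", "r2", ["Don White"])
node = link_reference(edges, "P2", "r2", ["Dave White", "Don White"])
print(node, len(edges))
```

Keeping the choice node separate from the entity node means the mutual exclusion between options stays local to one node, which is what later lets each reference contribute its own normalization constraint.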

Page 30

Objective of Reference Disambiguation

Definition: To resolve a reference xi.rk means to pick one yj from CS[xi.rk] as d[xi.rk].

Graph interpretation: among w1, w2, ..., wN, assign wj = 1 to one wj; this means yj is chosen as the answer d[xi.rk].

Definition: Reference xi.rk is resolved correctly if the chosen yj = d[xi.rk].

Definition: Reference xi.rk is unresolved or uncertain if not yet resolved...

Goal: Resolve all uncertain references as correctly as possible.

[Figure: v[xi] linked through the choice node cho[xi.rk] to v[y1], v[y2], ..., v[yN].]

Page 31

Formalizing the CAP

CAP is based on “connection strength”
• c(u,v) for entities u and v measures how strongly u and v are connected to each other via relationships
• e.g. c(u,v) > c(u,z) in the figure
• will formalize c(u,v) later

[Figure: example graph with nodes u, v, z and A to H, in which u is connected to v by more paths than to z; and v[xi] linked through cho[xi.rk] to v[y1], v[y2], ..., v[yN].]

Context Attraction Principle (CAP): if c(xi, yj) ≥ c(xi, yk) then wj ≥ wk (most of the time)

We use proportionality: c(xi, yj) ∙ wk = c(xi, yk) ∙ wj
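Combined with the constraint from the linking step that the option-weights of one reference sum to 1, the proportionality fixes each weight once the connection strengths are treated as known. A short derivation, assuming at least one c(xi, yk) is nonzero:

```latex
\begin{align*}
c(x_i, y_j)\, w_k &= c(x_i, y_k)\, w_j
  && \text{for all } j, k \\
\sum_{k=1}^{N} c(x_i, y_j)\, w_k &= \sum_{k=1}^{N} c(x_i, y_k)\, w_j
  && \text{sum both sides over } k \\
c(x_i, y_j) &= w_j \sum_{k=1}^{N} c(x_i, y_k)
  && \text{since } w_1 + \dots + w_N = 1 \\
w_j &= \frac{c(x_i, y_j)}{\sum_{k=1}^{N} c(x_i, y_k)}
\end{align*}
```

Because each connection strength can itself depend on the option-weights of other references, this is one equation of a coupled system rather than a closed-form answer.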

Page 32

RelDC approach

Input: the ARG for the dataset

1. Computing connection strengths
− for each unresolved reference xi.rk, determine equations for all (i.e., N) c(xi, yj)’s
− c(xi, yj) = gij(w), a function of other option-weights

2. Determining equations for option-weights
− use CAP to relate all wj’s and connection strengths
− since c(xi, yj) = gij(w), hence wij = fij(w)

3. Computing option-weights
− solve the system of equations from Step 2

4. Resolving references
− use the interpretation procedure to resolve weights

Page 33

Computing connection strength (Step 1)

Computation of c(u,v) consists of two phases

Phase 1: Discover connections
• all L-short simple paths between u and v
• bottleneck optimizations, not in SDM05

Phase 2: Measure the strength in the discovered connections
• many c(u,v) models exist
• we use a random-walks-in-graphs model

[Figure: graph with nodes v[xi], v[y1], v[y2], ..., v[yN] and the paths between two nodes u and v.]
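Both phases can be illustrated on a small weighted graph. The sketch below enumerates all simple paths of at most L edges and scores each one by the probability that a random walk starting at u follows exactly that path, choosing among not-yet-visited neighbors in proportion to edge weight. This is a simplified stand-in for the random-walk model named on the slide, not its exact formula; the function names and the graph encoding (node -> {neighbor: weight}) are assumptions.

```python
def simple_paths(graph, u, v, max_len):
    """Phase 1: yield every simple path from u to v with at most max_len edges."""
    stack = [(u, [u])]
    while stack:
        node, path = stack.pop()
        if node == v:
            yield path
            continue
        if len(path) - 1 == max_len:
            continue  # path-length limit L reached
        for nxt in graph[node]:
            if nxt not in path:  # simple path: never revisit a node
                stack.append((nxt, path + [nxt]))


def connection_strength(graph, u, v, max_len):
    """Phase 2: sum, over the discovered paths, of the probability that a
    random walk from u follows the path (simplified model)."""
    total = 0.0
    for path in simple_paths(graph, u, v, max_len):
        prob = 1.0
        for i in range(len(path) - 1):
            node, nxt = path[i], path[i + 1]
            visited = set(path[: i + 1])
            # Weight mass of the edges the walk may still take from this node.
            available = sum(w for n, w in graph[node].items() if n not in visited)
            prob *= graph[node][nxt] / available if available > 0 else 0.0
        total += prob
    return total


# Toy graph: u reaches v via a (2 edges) and via b, c (3 edges).
toy = {
    "u": {"a": 1.0, "b": 1.0},
    "a": {"u": 1.0, "v": 1.0},
    "b": {"u": 1.0, "c": 1.0},
    "c": {"b": 1.0, "v": 1.0},
    "v": {"a": 1.0, "c": 1.0},
}
print(connection_strength(toy, "u", "v", max_len=2))
print(connection_strength(toy, "u", "v", max_len=3))
```

Raising the limit from 2 to 3 admits the longer path and increases the measured strength, which is the effect the later "longer paths do help" slide refers to. When some edges are option-edges, their weights are unknowns, so the same computation yields an expression in w rather than a number.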

Page 34

Measuring connection strength

[Figure: a path from Source to Destination through nodes v1, v2, ..., vk. Labels include edge E1,0, weights w1,0, w2,0, ..., wk-1,0 and w1,i, w2,i, ..., wk,i, and counts n1, n2, ..., nk. Alongside, the example graph with nodes u, v, z and A to H.]

Note:

– c(u,v) returns an equation

– because paths can go via various option-edges

– cuv = c(u,v) = guv(w)

Page 35

Equations for option-weights (Step 2)

CAP (proportionality):

System (over-constrained):

Add slack:

Page 36

Solving the system (Steps 3 and 4)

Step 3: Solve the system of equations
1. use a math solver, or
2. iterative method (approx. solution), or
3. bounding-interval-based method (tech. report).

Step 4: Interpret option-weights to determine the answer for each reference
• pick yj with the largest weight as the answer
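A rough sketch of the iterative option for Step 3, followed by Step 4. It assumes each reference comes with a function that returns its connection strengths given the current weights of all references, and it repeatedly renormalizes those strengths into weights (the proportionality rule). The two strength functions in the example are invented to show the coupling between references; they are not taken from the talk, and the fixed iteration count is an arbitrary choice.

```python
def solve_option_weights(strength_fns, sizes, iterations=50):
    """Fixed-point iteration: w <- normalize(g(w)) for every reference.

    strength_fns[r](weights) returns the list of connection strengths
    c(x, y_1..y_N) for reference r, given the current weights of all references.
    sizes[r] is the number of options N of reference r.
    """
    weights = [[1.0 / n] * n for n in sizes]  # start from uniform weights
    for _ in range(iterations):
        updated = []
        for g in strength_fns:
            strengths = g(weights)
            total = sum(strengths)
            if total > 0:
                updated.append([c / total for c in strengths])
            else:  # no evidence at all: fall back to uniform
                updated.append([1.0 / len(strengths)] * len(strengths))
        weights = updated
    return weights


def resolve(option_weights, options):
    """Step 4: pick the option with the largest weight."""
    best = max(range(len(options)), key=lambda j: option_weights[j])
    return options[best]


# Hypothetical coupled references: each one's first option gets stronger
# as the other reference leans towards its own first option.
g_p2 = lambda w: [0.1 + 0.8 * w[1][0], 0.1]
g_p6 = lambda w: [0.1 + 0.8 * w[0][0], 0.2]

w = solve_option_weights([g_p2, g_p6], sizes=[2, 2])
print(resolve(w[0], ["Don White", "Dave White"]))
print(resolve(w[1], ["Don White", "Dave White"]))
```

The iteration gives an approximate solution only, as the slide notes, and it is not guaranteed to converge for arbitrary strength functions; a math solver or the bounding-interval method are the alternatives listed.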

Page 37

Experimental Setup

Parameters
• when looking for L-short simple paths, L = 7
• L is the path-length limit

RealPub dataset: CiteSeer + HPSearch
• publications (255K)
• authors (176K)
• organizations (13K)
• departments (25K)
• ground truth is not known
• accuracy...

SynPub datasets: many ds of two types
• emulation of RealPub
• publications (5K)
• authors (1K)
• organizations (25K)
• departments (125K)
• ground truth is known

RealMov:
• movies (12K)
• people (22K): actors, directors, producers
• studios (1K): producing, distributing

Page 38

Sample Publication Data

CiteSeer: publication records

HPSearch: author records

Page 39

Efficiency and Long paths

Non-exponential cost

Longer paths do help

Page 40

Web Disambiguation

Music Composer

Football Player

UCSD Professor

Comedian

Botany Professor @ Idaho

Page 41

Web Disambiguation

Page 42

Web Disambiguation

Extract key information such as mentions of entities (persons, names, locations) and other information such as hyperlinks and email addresses from Web pages

Cast as a relationship analysis problem

Prototype at: http://opteron.calit2.uci.edu:1977/Diamond/people_search.jsp

Page 43

Extraction and Synthesis

Semantic extraction from text

Audio event extraction

Visual event extraction

Information extraction from text
• many systems and techniques
• may benefit from semantics
• limitations: all or nothing extraction
• towards probabilistic extraction systems

Page 44

Leads

Disambiguation and data cleaning: Dmitri Kalashnikov, Stella Chen, Rabia Nuray-Turan

Information extraction: Naveen Ashish, Sharad Mehrotra

Page 45

Extraction and Synthesis

Semantic extraction from text

Audio event extraction

Visual event extraction

Multi-microphone speech processing
• speaker identification
• noise reduction

Audio-visual speech recognition
• combine visual features (visemes) with audio

Speech recognition on light-weight devices

Team: Rajesh Hegde, Bhaskar Rao, Shankar Shivappa (UCSD)

Page 46

Extraction and Synthesis

Semantic extraction from text

Audio event extraction

Visual event extraction

Combine views from multiple cameras
• homomorphic transformations

Multi-perspective “view-binding”

Team: Sangho Park, Mohan Trivedi (UCSD)

Page 47

Situational Data Management

Spatial Indexing

Event data model

SAT-Ware

Page 48

Outline

Overall Goal

Use examples to illustrate:
• different approaches in modeling and querying
• advantage of our approach

Extracting spatial expression

Building model for spatial expression

Experiments

Conclusion

Page 49

Overall Goal

Goal: Situation Awareness from Textual Sources

Info about events that constitute a crisis is often available as text.

Textual data during crisis
• transcribed 911 calls
• first responder communications

Textual data after crisis
• first responders’ reports
• Internet sources for post factum analysis

[Figure: reports ... feeding a database]

Information and Computer Science
Information about events that constitute a crisis is often available in the form of text. For example, during a crisis textual data can be available in the form of transcribed 911 calls and first responder communications. After a crisis, such information is useful for post factum analysis and is often available in the form of reports filed by the first responders and from various Internet sources. [...] The goal of our project is to build SA from textual sources.
Page 50

Motivating Examples

Two reports filed by first responders after the 9/11 attack:

“…the PAPD Mobile Command Post was located on West St. north of WTC …”

“…a PAPD Command Truck parked on the west side of Broadway St. and north of Vesey St….”

Query: Retrieve Events around WTC

Goal: Both events should be retrieved with high scores attached.

Page 51

Approach 1: Using IR approach

Direct keyword retrieval
• only one report mentions the keyword “WTC”

Query expansion based on nearby spatial objects
• e.g., nearby streets and buildings…
• ad hoc, and the set of objects might not be bounded
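The limitation is easy to reproduce on the two report excerpts from the previous slide. In this minimal sketch the tokenizer, the function name and the hand-picked expansion list are assumptions made for illustration; choosing that list is exactly the ad hoc step the slide criticizes.

```python
import re

REPORTS = {
    "report1": "the PAPD Mobile Command Post was located on West St. north of WTC",
    "report2": "a PAPD Command Truck parked on the west side of Broadway St. "
               "and north of Vesey St.",
}

def keyword_search(reports, terms):
    """Return the ids of reports that contain at least one of the terms."""
    wanted = {term.lower() for term in terms}
    hits = []
    for report_id, text in reports.items():
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        if wanted & tokens:
            hits.append(report_id)
    return hits

# Direct keyword retrieval: only the report that literally says "WTC".
print(keyword_search(REPORTS, ["WTC"]))

# Query expansion with a hand-picked list of nearby streets (hypothetical list).
print(keyword_search(REPORTS, ["WTC", "Vesey", "Broadway"]))
```

Expansion recovers the second report here, but only because the right streets were listed by hand, and the result carries no score that reflects how close each event is to the WTC.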

Page 52

Approach 2: Mapping Using Uncertain Region

Query: Near WTC

Report 1: West St. north of WTC

Report 2: west side of Broadway St. and north of Vesey St.

Rank based on the ratio of intersection

Problem: the rank score is not accurate, because it rests on uniformity assumptions
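A sketch of the ratio-of-intersection score, assuming both the query and each report's uncertain region are axis-aligned rectangles given as (xmin, ymin, xmax, ymax). The rectangle encoding, the function names and the coordinates are made up for illustration.

```python
def area(rect):
    """Area of an axis-aligned rectangle; 0 for an empty (inverted) one."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)

def intersection(a, b):
    """Overlap of two rectangles (may be inverted, i.e. empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def overlap_score(region, query):
    """Fraction of the uncertain region that falls inside the query region."""
    region_area = area(region)
    if region_area == 0:
        return 0.0
    return area(intersection(region, query)) / region_area

query = (0.0, 0.0, 4.0, 4.0)      # hypothetical "near WTC" box
report1 = (2.0, 2.0, 6.0, 4.0)    # hypothetical uncertain region of report 1
report2 = (5.0, 5.0, 6.0, 6.0)    # hypothetical uncertain region of report 2
print(overlap_score(report1, query), overlap_score(report2, query))
```

The score treats every point of the uncertain region as equally likely, which is the uniformity assumption named as the problem: an event that is most probably right next to the query region scores the same as one spread evenly over its rectangle.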

Page 53

Our Approach

Step 1: Converting Text to Spatial Expression

S-expression: has a well-defined functional form

• Near WTC → Near(WTC)
• West St. north of WTC → On(West St.) North(WTC)
• west side of Broadway St. and north of Vesey St → West(Broadway St.) North(Vesey St.)

Page 54

Our Approach

Step 2: Mapping S-expression to probability density function (PDF)

Near(A)

On(West St.) North(WTC)

Page 55

Answering Range Query

Given a query region, retrieve objects based on the degree of belonging

On(West St.) North(WTC)

West(Broadway St.) North(Vesey St.)

Consider location as a random variable

Information and Computer Science
To enable this and similar types of spatial analysis on textual data, we should provide support for all standard spatial queries. Let us consider some examples of those queries. For a range query, given a range, the goal is to identify all objects that are inside this range. For a similarity query, given an event, the goal is to identify all the events that are within a certain distance epsilon from that event. For a Nearest Neighbor query, given a point inside the domain, the goal is to identify the closest object to this point. For a Spatial Join, given a parameter epsilon, the goal is to return the pairs of all objects that are within epsilon distance from each other. There is a large body of work in this area, so what is the challenge here?
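One way to read "degree of belonging" is as the probability mass of an event's location PDF that falls inside the query region. The sketch below discretizes the domain into a grid and uses a Gaussian-shaped bump around a landmark as a stand-in for Near(A); the grid size, the shape of the density, the value of sigma and all function names are assumptions, not the models used in the talk.

```python
import math

def make_grid_pdf(density, size):
    """Evaluate an unnormalized density on a size x size grid and normalize it."""
    cells = {(x, y): density(x, y) for x in range(size) for y in range(size)}
    total = sum(cells.values())
    return {cell: value / total for cell, value in cells.items()}

def near(landmark, sigma):
    """Assumed shape for Near(landmark): density decays with squared distance."""
    lx, ly = landmark
    def density(x, y):
        return math.exp(-((x - lx) ** 2 + (y - ly) ** 2) / (2 * sigma ** 2))
    return density

def range_query_score(pdf, region):
    """Probability that the event lies in region = (xmin, ymin, xmax, ymax),
    with inclusive cell bounds."""
    xmin, ymin, xmax, ymax = region
    return sum(p for (x, y), p in pdf.items()
               if xmin <= x <= xmax and ymin <= y <= ymax)

event_pdf = make_grid_pdf(near((5, 5), sigma=1.5), size=10)
close_score = range_query_score(event_pdf, (4, 4, 6, 6))
far_score = range_query_score(event_pdf, (0, 0, 2, 2))
print(close_score, far_score)
```

Events can then be ranked by this score for a given query region, so a report whose location is only probably inside the region is still retrieved, with a lower score than one that is almost certainly inside.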
Page 56

Advantages of Our Approach

More explicit spatial mapping removes the need for keyword expansion (IR approach)

Probabilistic representation is more formal and accurate than the uncertain region (UR) approach

Decoupled extraction and modeling modules
• better extraction and modeling modules can easily be plugged in

Page 57

Extracting Spatial Expression

Step 1: Discovering landmarks
• buildings, roads, intersections

Step 2: Generating s-descriptors
• use spatial relations to connect the landmarks
• spatial relations: near, behind, between
• in the format D(L1, L2, ..., Ln)

Step 3: Generating s-expressions
• compositions of s-descriptors: near(A) near(B)

Page 58

Step1: Discovering landmarks

Mark up the text with the landmarks
  Use gazetteers (incorporated into the information extractor, GATE)
  Note: mark up not only the “name”; features are also attached

Examples of Landmark

Page 59: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Step2: Generating s-descriptors

Discover spatial relations around the landmarks
  Dictionary approach (convert spatial relations to potential words)
  Machine learning techniques can also be used

Examples of s-descriptors

Page 60: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Modeling S-expressions

Goal: generate a reasonable probabilistic representation for an s-expression

Step1: Modeling S-descriptors

Step2: Combining s-descriptors

Page 61: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Modeling S-descriptors

Modeling templates, e.g. uniform or normal distribution

Using parameter learning techniques

Page 62: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Generating s-expression

In an s-expression, we assume the s-descriptors are conditionally independent.
If an s-expression has 2 descriptors, S1, S2

It can be generalized to n descriptors, S1…Sn
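The combination rule itself is not preserved in this transcript. One common reading of conditional independence, assumed here for illustration, is that with a uniform prior over locations the combined density is proportional to the product of the individual descriptor densities. The sketch below applies that assumption on a 1-D grid; the Gaussian templates, the grid, and the names `combine`, `near_a` and `near_b` are hypothetical.

```python
import math


def gaussian(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def combine(pdfs, grid):
    """Multiply the descriptor densities point by point and renormalize,
    assuming conditional independence and a uniform prior over the grid."""
    weights = [math.prod(pdf(x) for pdf in pdfs) for x in grid]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 1-D street, positions 0.0 .. 6.0 km in 0.1 km steps.
grid = [i * 0.1 for i in range(61)]


def near_a(x):
    return gaussian(x, 2.0, 1.0)   # near(A), landmark A at 2.0 km


def near_b(x):
    return gaussian(x, 4.0, 1.0)   # near(B), landmark B at 4.0 km


combined = combine([near_a, near_b], grid)
most_likely = grid[max(range(len(grid)), key=combined.__getitem__)]
```

With two equally wide descriptors the combined distribution peaks midway between the landmarks; passing more densities to `combine` covers the n-descriptor case.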

Page 63: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Generating s-expression

Near(A)

Outdoor()

Outdoor() Near(WTC)

Page 64: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Experimental Setup

Domain
  Real geographic dataset: Manhattan, NY, near the WTC
  Buildings, streets, roads
  4 × 4 km²

Data
  Based on 164 reports by police officers, participants of 9/11
  s-expressions: near(A), on(A), outdoor
  Intersections, buildings, streets
  2359 pdfs constructed

Queries
  50 range queries

Page 65: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Simulate the Errors

Extraction errors:
  With human supervision, the error is small.

Modeling errors:
  Even with supervision, model parameters can still be far from the ideal settings.
  E.g., the mean and variance settings for the Gaussian model.

We simulate two types of modeling errors for the analysts:
  Overly confident: the estimated model is too “tight” (by reducing the variance of the “ideal” Gaussian model)
  Not confident: the estimated model is too “loose” (by increasing the variance of the “ideal” Gaussian model)

Page 66: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Results

Even with large errors, probabilistic models are still better than bounding region methods

Page 67: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Conclusions

Ongoing work
  database aspects of the problem
  more types of queries

Future work
  spatio-temporal aspects
  better modeling (text to PDF)

Novel in this work
  approach for mapping text to PDF
  query requirements for SA apps
  query design issues
  representation of PDFs

[Figure: Spatial Awareness from Textual Sources (reports, database)]

Page 68: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Lead Spatial awareness

Yiming Ma

Page 69: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Situational Data Management

Spatial Indexing Event data model SAT-Ware

Page 71: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Analysis

Analysis and Visualization

Graph analysis GIS Predictive modeling Damage assessment

Page 72: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Graph Analysis

[Figure labels. Semantic metadata: entity-relationship schemas, taxonomies (“reference data”), ontologies (“semantic models”). Described data: relations, document repositories. Representation: semantic graphs (attributed graphs), DBMS. Operations: graph pattern-based querying; ranked graph pattern matching; multi-dimensional analysis (for documents); relationship summarization/exploration (for relations).]

Page 73: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Graph Data Model (Entity-Attribute-Value Model)

Graph (edge sets, aka triple sets):
  E.g. (&dawit ns:studentAt &UCI) (&UCI ns:type &university) (ns:university ns:subClassOf ns:organization)

Two kinds of nodes: object-ids, literals (e.g. integer, string, etc.)
  Blank nodes (e.g. (&dawit :studentAt _))

Directed edges (aka predicates or properties)
  There exists only one edge with a given label between a pair of nodes

Symmetric representation of metadata + data
  Nodes: object classes or link classes
  Links: predicates on classes:
    (:studentAt :domain :person) (:studentAt :range :organization) (:university :subclassOf :organization)

Object identity + relationship identity
  Objects and relationships have unique ids (called URIs)

[Figure: &dawit, edge ns:studentAt, &UCI]
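The triple-set model can be mirrored directly in code. The sketch below is an illustration only: it stores the slide's example triples as Python tuples (writing the class node as `ns:university` so that it links to the subClassOf triple) and adds a hypothetical helper, `is_instance_of`, that follows `ns:type` and then `ns:subClassOf` edges, showing how schema and data live in the same graph.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Data and schema triples stored side by side in one edge set.
graph = {
    Triple("&dawit", "ns:studentAt", "&UCI"),
    Triple("&UCI", "ns:type", "ns:university"),
    Triple("ns:university", "ns:subClassOf", "ns:organization"),
    Triple("ns:studentAt", "ns:domain", "ns:person"),
    Triple("ns:studentAt", "ns:range", "ns:organization"),
}


def objects(graph, subject, predicate):
    """All objects reachable from subject over edges labeled predicate."""
    return {t.object for t in graph
            if t.subject == subject and t.predicate == predicate}


def is_instance_of(graph, node, cls):
    """True if node has type cls, directly or through subClassOf edges."""
    frontier = objects(graph, node, "ns:type")
    seen = set()
    while frontier:
        current = frontier.pop()
        if current == cls:
            return True
        if current not in seen:
            seen.add(current)
            frontier |= objects(graph, current, "ns:subClassOf")
    return False


uci_is_org = is_instance_of(graph, "&UCI", "ns:organization")
```

Because the set holds at most one copy of each tuple, it also enforces the "only one edge with a given label between a pair of nodes" property.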

Page 74: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Graphs for actual data storage - beyond data modeling

Graphs are normally used for conceptual data modeling: the entity-relationship (ER) model

What is different?
  Using graphs for actual (minimally structured) data representation.

Why?
  Store/represent and query data without a schema
  Symmetrically store/query both schema (ontology) and data
  Graph traversal based query + reasoning (inference)
  Multi-schema queries on the same graph
  Query unstructured data annotated with taxonomies/ontologies using traditional (structured) query operators

Page 75: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

[Figure: example data graph used on the following slides, with a MODEL layer and an INSTANCE layer and three parts labeled (a), (b) and (c). Model: classes organization, researcher, author, editor, publication, book, proceeding, article and chapter; properties affiliates, produces, writesBook, writesArticle, editsProc, editsBook, inProceeding, refersTo, name, org_name, title, year, pages, price, list_price and rating; and a topic ontology (Comp. Sc., Info. Sys., Data Struct., Interfaces, DB, IR, Encrypt., Online services, D. Lib., Systems, Languages, Distributed DB, Multimedia DB). Instance: organizations &o1 and &o2 (org_name values IBM and UCI), researchers &r1 to &r4 (names John, Alex, Sara), books &b1 to &b3 (prices 90, 110, 100; years 2003, 1998, 1998), article &a1 and proceeding &p1, connected by affiliates, writesBook, writesArticle and inProceeding edges. Legend: rdf:type; subClassOf/subPropertyOf; &o organization, &r researcher, &b book, &p proceeding, &a article.]

Page 76: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Graph Pattern based Querying

SELECT *
WHERE { ?org :affiliates ?aut .
        ?aut :produces ?b .
        ?b :type :book .
        ?b :price ?p .
        ?b ?pred ?x . }

Annotations on the query:
  ?org, ?aut, ?b, ... : variables
  ?org :affiliates ?aut : a triple pattern
  :produces : super-property of writesBook
  queries schema (a); uses schema (b)
  ?pred : variable on predicates, matches all applicable predicates
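Matching a single triple pattern can be sketched as below. This is a simplified illustration, not the system described in the talk: the instance triples are taken from the example used on the surrounding slides, the function name `match_pattern` is hypothetical, and no schema inference is done, so the pattern uses `:writesBook` directly rather than its super-property `:produces`.

```python
# Instance triples from the running example (prices and years as integers).
GRAPH = [
    ("&o1", ":affiliates", "&r1"),
    ("&o1", ":affiliates", "&r2"),
    ("&o1", ":affiliates", "&r3"),
    ("&o2", ":affiliates", "&r2"),
    ("&r1", ":writesBook", "&b1"),
    ("&r2", ":writesBook", "&b1"),
    ("&r2", ":writesBook", "&b2"),
    ("&r3", ":writesBook", "&b3"),
    ("&b1", ":price", 90), ("&b1", ":year", 2003),
    ("&b2", ":price", 110), ("&b2", ":year", 1998),
    ("&b3", ":price", 100), ("&b3", ":year", 1998),
]


def is_var(term):
    """Pattern terms starting with '?' are variables."""
    return term.startswith("?")


def match_pattern(pattern, triples):
    """Return one binding (dict: variable -> value) per matching triple."""
    results = []
    for triple in triples:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must bind to the same value.
                if binding.get(p_term, t_term) != t_term:
                    break
                binding[p_term] = t_term
            elif p_term != t_term:
                break
        else:
            results.append(binding)
    return results


authorship = match_pattern(("?aut", ":writesBook", "?b"), GRAPH)
# A variable in predicate position matches every predicate of &b1.
about_b1 = match_pattern(("&b1", "?pred", "?x"), GRAPH)
```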

Page 77: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Graph Pattern based Querying

Enumerative semantics (the result is a relation):

SELECT *
WHERE { ?org :affiliates ?aut .
        ?aut :produces ?b .
        ?b :type :book .
        ?b :price ?p .
        ?b ?pred ?x . }

org   aut   book  price  year
&o1   &r1   &b1   90     2003
&o1   &r2   &b1   90     2003
&o1   &r2   &b2   110    1998
&o1   &r3   &b3   100    1998
&o2   &r2   &b1   90     2003
&o2   &r2   &b2   110    1998

Extractive semantics (the result is a graph or a set of graphs):

CONSTRUCT *
WHERE { ?org :affiliates ?aut .
        ?aut :produces ?b .
        ?b :type :book .
        ?b :price ?p .
        ?b ?pred ?x . }

&o1 :affiliates &r1
&r1 :produces &b1
&b1 :price 90
&b1 :year 2003
&o1 :affiliates &r2
...

Page 78: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Graph Pattern based Querying

[Figure: the same SELECT (enumerative semantics, relation) and CONSTRUCT (extractive semantics, graph) queries and the same six-row result relation as on the previous slide, with the extractive result drawn as a single graph over &o1, &o2, &r1 to &r3 and &b1 to &b3 with their price and year values.]

Page 79: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Enumerative Algebra

Enumerative algebra - algebra over sets of variable bindings

Triple patterns: ?aut :produces ?b, ?org :affiliates ?aut, …

Bindings (per triple pattern):

?aut :produces ?b
aut   b
&r1   &b1
&r2   &b1
&r2   &b2

?org :affiliates ?aut
org   aut
&o1   &r1
&o1   &r2
&o1   &r3

Joinable bindings: same variable, same value.

Page 80: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Enumerative Algebra (ctd.)

Given two sets of bindings T1 and T2, and r denoting a binding:

T1 ∪ T2 = {r | r ∈ T1 or r ∈ T2}

T1 ⋈ T2 = {r1 ∪ r2 | r1 ∈ T1 and r2 ∈ T2 and r1 and r2 are joinable}

Example tables shown on the slide:

?aut  ?b    ?org
&r1   &b1   &o1
&r2   &b1   &o1
&r2   &b2   &o1
&r3         &o1

?aut  ?b    ?org
&r1   &b1   &o1
&r2   &b1   &o1
&r2   &b2   &o1
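The union and join of binding sets can be sketched in a few lines. In this illustration a binding is a Python dict from variable name to value, and two bindings are treated as joinable when every variable they share has the same value; treating bindings with no shared variable as joinable is an assumption of this sketch. The input tables are the two per-pattern binding sets from the preceding slide.

```python
def joinable(r1, r2):
    """Two bindings are joinable if shared variables have equal values."""
    return all(r2[var] == value for var, value in r1.items() if var in r2)


def union(t1, t2):
    """All bindings that occur in t1 or in t2, without duplicates."""
    result = list(t1)
    result.extend(r for r in t2 if r not in result)
    return result


def join(t1, t2):
    """Merge every joinable pair of bindings from t1 and t2."""
    return [{**r1, **r2} for r1 in t1 for r2 in t2 if joinable(r1, r2)]


# Bindings for ?org :affiliates ?aut and ?aut :produces ?b.
affiliates = [
    {"?org": "&o1", "?aut": "&r1"},
    {"?org": "&o1", "?aut": "&r2"},
    {"?org": "&o1", "?aut": "&r3"},
]
produces = [
    {"?aut": "&r1", "?b": "&b1"},
    {"?aut": "&r2", "?b": "&b1"},
    {"?aut": "&r2", "?b": "&b2"},
]

joined = join(affiliates, produces)
combined = union(affiliates, produces)
```

The join yields the three-row table above: &r3 has no book, so its binding finds no joinable partner and drops out.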

Page 81: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Enumerative Algebra (ctd.)

match[P](G) matches the graph pattern P to graph G

Given P = {p1, p2, …, pm}:

match[P](G) = match[p1](G) ⋈ match[p2](G) ⋈ … ⋈ match[pm](G)

The result is a set of sets (tuples) of bindings.

Page 82: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Enumerative Algebra (ctd.)

Other operators:

Difference: T1 \ T2 = {r ∈ T1 | for all r’ ∈ T2, r and r’ are not joinable}

Filter: σ_φ(T) evaluates the Boolean condition φ on T. An example of φ is ?p > 100.

Outer join: T1 ⟕ T2 = (T1 ⋈ T2) ∪ (T1 \ T2)
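The remaining operators follow the same pattern. The sketch below, with hypothetical function names and bindings represented as Python dicts, implements difference, filter and outer join exactly as defined above; the sample data reuses the running example.

```python
def joinable(r1, r2):
    """Two bindings are joinable if shared variables have equal values."""
    return all(r2[var] == value for var, value in r1.items() if var in r2)


def join(t1, t2):
    return [{**r1, **r2} for r1 in t1 for r2 in t2 if joinable(r1, r2)]


def difference(t1, t2):
    """Bindings of t1 that are joinable with no binding of t2."""
    return [r for r in t1 if not any(joinable(r, other) for other in t2)]


def select(condition, t):
    """Filter: keep the bindings for which the Boolean condition holds."""
    return [r for r in t if condition(r)]


def outer_join(t1, t2):
    """(T1 join T2) union (T1 minus T2): unmatched T1 bindings survive."""
    return join(t1, t2) + difference(t1, t2)


affiliates = [
    {"?org": "&o1", "?aut": "&r1"},
    {"?org": "&o1", "?aut": "&r2"},
    {"?org": "&o1", "?aut": "&r3"},
]
produces = [
    {"?aut": "&r1", "?b": "&b1"},
    {"?aut": "&r2", "?b": "&b1"},
    {"?aut": "&r2", "?b": "&b2"},
]
prices = [
    {"?b": "&b1", "?p": 90},
    {"?b": "&b2", "?p": 110},
    {"?b": "&b3", "?p": 100},
]

unmatched = difference(affiliates, produces)
expensive = select(lambda r: r["?p"] > 100, prices)
with_optional_books = outer_join(affiliates, produces)
```

In the outer join, the binding for &r3 is kept without a value for ?b, which is how an optional pattern would behave.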

Page 83: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Extractive Algebra

Given two graphs G1 and G2, and t denoting a triple:

G1 ∪ G2 = {t | t ∈ G1 or t ∈ G2}

[Figure: the triples matching ?aut :produces ?b (&r1 :prod &b1, &r2 :prod &b1, &r2 :prod &b2) and the triples matching ?org :affiliates ?aut (&o1 :aff &r1, &o1 :aff &r2, &o1 :aff &r3), shown as two graphs and as their union.]

• Matching retains structure

• More compact representation during implementation

Page 84: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Extractive Algebra (ctd.)

G1 ⋈˄p G2 = {G1 ∪ G2 | conditions 1 and 2 hold}, where p = (p1, p2), i.e. a pair of triple patterns, and ⋈˄ (a join symbol with a hat) is the join of the extractive algebra.

1. For all t1 ∈ G1, either there exists t2 ∈ G2 such that t1 and t2 are joinable by p, or t1 does not match p1 ∈ p.

2. For all t2 ∈ G2, either there exists t1 ∈ G1 such that t2 and t1 are joinable by p, or t2 does not match p2 ∈ p.

[Figure: example for ⋈˄((?org :affiliates ?aut),(?aut :produces ?b)) over the triples &o1 :aff &r1, &o1 :aff &r2, &o1 :aff &r3 and &r1 :prod &b1, &r2 :prod &b1, &r2 :prod &b2.]

Page 85: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Extractive Algebra (ctd.)

Patterns: ?org :affiliates ?aut . ?aut :produces ?b . ?b :price ?p . ?b ?pred ?x

Step shown: ⋈˄((?aut :produces ?b),(?b :price ?p))

Before:
  &o1 :aff &r1, &o1 :aff &r2
  &r1 :prod &b1, &r2 :prod &b1, &r2 :prod &b2
  &b1 :price 90, &b3 :price 110
  &b1 :year 2003, &b3 :year 1998

After:
  &o1 :aff &r1, &o1 :aff &r2
  &r1 :prod &b1, &r2 :prod &b1
  &b1 :price 90
  &b1 :year 2003, &b3 :year 1998

Page 86: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Extractive Algebra (ctd.)

Patterns: ?org :affiliates ?aut . ?aut :produces ?b . ?b :price ?p . ?b ?pred ?x

Step shown: ⋈˄((?aut :produces ?b),(?b ?pred ?x))

Before:
  &o1 :aff &r1, &o1 :aff &r2
  &r1 :prod &b1, &r2 :prod &b1, &r2 :prod &b2
  &b1 :price 90, &b3 :price 110
  &b1 :year 2003, &b3 :year 1998

After:
  &o1 :aff &r1, &o1 :aff &r2
  &r1 :prod &b1, &r2 :prod &b1
  &b1 :price 90
  &b1 :year 2003

Page 87: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Extractive Algebra (ctd.)

extract[P](G) matches the graph pattern P to graph G; the result is a graph

Given P = {p1, p2, …, pm}:

extract[P](G) = match[p1](G) ⋈˄ match[p2](G) ⋈˄ … ⋈˄ match[pm](G)

Page 88: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Extractive Algebra (ctd.)

Other operations:

Difference: G1 \ G2 = {t | t ∈ G1 and t ∉ G2}

Filter: σ_φ(G) = G \ {t | φ(t) ≠ true}

Page 89: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Implementing Extract – Naïve/Join-split

As a post-process of enumerative matching:
  Do enumerative matching
    Produces a joined relation
  Vertically split the join result into triples

IO cost, for a pair of triple sets:
  2 reads of triple sets
  + 1 write of joined result
  + 2 reads of join result (one for each split/projection)
  + 2 writes of projected result
  + 2 reads of the projected triple sets
  + 1 write of unioned result
  Total: 6 reads and 4 writes (4 reads and 3 writes if no union).

Page 90: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Implementing Extract – 2-way semi-joins

Use 2-way semi-joins
  Given two joinable triple sets A and B

[Figure: A and B reduced to A’ and B’]

IO cost:
  2 reads of triple sets (first semi-join)
  1 write of result to union (writes smaller table)
  2 reads to perform next semi-join (1 read is on the smaller table)
  1 write of result to union
  Total: 4 reads and 2 writes.
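The two-way semi-join idea can be sketched in memory as follows. This is an illustration under stated assumptions, not the implementation from the talk: triples are Python tuples, the join variable is identified by its position in each triple (0 = subject, 2 = object), and the names `semi_join` and `extractive_join` are hypothetical. The second semi-join runs against the already reduced first input, which corresponds to the "1 read is on the smaller table" remark.

```python
def semi_join(left, right, left_pos, right_pos):
    """Keep the triples of left whose value at left_pos occurs at
    right_pos in at least one triple of right."""
    keys = {t[right_pos] for t in right}
    return [t for t in left if t[left_pos] in keys]


def extractive_join(a, b, a_pos, b_pos):
    """Union of the two inputs after reducing each by the other."""
    a_reduced = semi_join(a, b, a_pos, b_pos)
    # The second semi-join probes the (smaller) reduced set A'.
    b_reduced = semi_join(b, a_reduced, b_pos, a_pos)
    return a_reduced + b_reduced


# Triples matching ?org :affiliates ?aut and ?aut :produces ?b.
A = [("&o1", ":affiliates", "&r1"),
     ("&o1", ":affiliates", "&r2"),
     ("&o1", ":affiliates", "&r3")]
B = [("&r1", ":produces", "&b1"),
     ("&r2", ":produces", "&b1"),
     ("&r2", ":produces", "&b2")]

# ?aut is the object (position 2) in A and the subject (position 0) in B.
result = extractive_join(A, B, a_pos=2, b_pos=0)
```

No joined relation is materialized: each input is only filtered, so the output is already a set of triples.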

Page 91: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Implementing Extract – 2-stream operator

Scan each input and produce the triples that have at least one match in the other

A high-level operator that can be implemented via hashing or sort-merge

[Figure: A ⋈˄ B produces A’ and B’]

Page 92: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Grouping and Aggregation : Flatten-and-Aggregate Approach

SELECT ?org, sum(?p) as totalPrice
WHERE { ?org :affiliates ?aut .
        ?aut :writesBook ?b .
        ?b :price ?p }
GROUP BY ?org

[Figure: example graph with organizations &o1 and &o2, affiliates edges to researchers &r1, &r2, &r3, writesBook edges to books &b1, &b2, &b3, and prices 90, 110, 100.]

Group and aggregate the enumerative match results:

org   aut   book  price  year
&o1   &r1   &b1   90     2003
&o1   &r2   &b1   90     2003
&o1   &r2   &b2   110    1998
&o1   &r3   &b3   100    1998

Result: 390. WRONG!

This is how Oracle supports aggregation over graph data! Also [Hung, Deng, and Subrahmanian, ICDE 2005].
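The over-count is easy to reproduce. The sketch below, with hypothetical function names, flattens the example graph into rows the way an enumerative match would and then sums the price column per organization; book &b1 appears in two rows for &o1 because it has two authors there, so its price is added twice.

```python
from collections import defaultdict

# Edges of the example graph.
affiliates = [("&o1", "&r1"), ("&o1", "&r2"), ("&o1", "&r3"), ("&o2", "&r2")]
writes_book = [("&r1", "&b1"), ("&r2", "&b1"), ("&r2", "&b2"), ("&r3", "&b3")]
price = {"&b1": 90, "&b2": 110, "&b3": 100}


def enumerative_rows():
    """One row per (org, author, book) match, as in the flattened relation."""
    return [(org, aut, book, price[book])
            for org, aut in affiliates
            for author, book in writes_book
            if aut == author]


def flatten_and_aggregate(rows):
    """Sum the price column per organization over the flattened rows."""
    totals = defaultdict(int)
    for org, _aut, _book, p in rows:
        totals[org] += p
    return dict(totals)


rows = enumerative_rows()
totals = flatten_and_aggregate(rows)
# &b1 is counted once per author affiliated with &o1.
b1_rows_for_o1 = sum(1 for org, _a, book, _p in rows
                     if org == "&o1" and book == "&b1")
```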

Page 93: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Group By

Should be based on extractive matching (graphs).

What should group by mean on graphs?
  Collapse a set of triples into a single triple.
  Use Bag nodes.

CONSTRUCT *
WHERE { ?org :affiliates ?aut .
        ?aut :writesBook ?b .
        ?b :price ?p }
GROUP BY ?aut ON :writesBook

Annotations on the GROUP BY clause: grouping target, grouping basis.

[Figure: the example graph after grouping. &o1 and &o2 keep their affiliates edges to &r1, &r2, &r3; each writesBook edge leads to a node of type Bag whose members (:1, :2) are the books &b1, &b2, &b3.]

Page 94: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Aggregation

Two types (modes) of aggregation on graphs:
  Branch-wise: aggregate a set of values adjacent to a node type
  Path-wise: aggregate over a path in the graph (not discussed here)

Branch-wise example:

SELECT ?b, branch sum(:price) as totalPrice
WHERE { ?org :affiliates ?aut .
        ?aut :writesBook ?b .
        ?b :price ?p }

Annotations: ?b is the anchor, branch is the mode, :price is the aggregation basis label.

[Figure: books &b1, &b2, &b3, each with a price edge and a year edge.]

Page 95: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Aggregation – revisit example

SELECT ?org, branch sum(:price) as totalPrice
WHERE { ?org :affiliates ?aut .
        ?aut :writesBook ?b .
        ?b :price ?p }
GROUP BY ?org    (optional)

Annotations: ?org is the anchor, branch is the mode, :price is the aggregation basis label.

Anchor and aggregation basis are not adjacent!

[Figure: example graph with &o1 and &o2, affiliates edges to &r1, &r2, &r3, writesBook edges to &b1, &b2, &b3, and price edges to 90, 110, 100.]

Page 96: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Aggregation - solution

[Figure: the example graph with the nodes between the organizations and the prices collapsed into Bag nodes: affiliates edges from &o1 and &o2 lead to bags of researchers (&r1, &r2, &r3), and writesBook edges lead to bags of books (&b1, &b2, &b3) with prices 90, 110, 100.]

RULE: All nodes between anchor and aggregation basis should be bags!
  If anchor and aggregation basis are adjacent, push aggregation into group by.
  Otherwise, iteratively perform graph grouping with edge-propagation, making each intermediary node an aggregation target.

Result: &o1, 300; &o2, 200
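The effect of the rule can be imitated with plain sets. The sketch below is one reading of the slide, not the authors' implementation: each bag node is modeled as a Python set of distinct node ids, grouping is applied first to the writesBook edges and then propagated up the affiliates edges, and the prices are summed only once the books sit directly under the anchor. The function names are hypothetical.

```python
from collections import defaultdict

affiliates = [("&o1", "&r1"), ("&o1", "&r2"), ("&o1", "&r3"), ("&o2", "&r2")]
writes_book = [("&r1", "&b1"), ("&r2", "&b1"), ("&r2", "&b2"), ("&r3", "&b3")]
price = {"&b1": 90, "&b2": 110, "&b3": 100}


def group_into_bags(edges):
    """Collapse all edges leaving the same source into one bag of targets."""
    bags = defaultdict(set)
    for source, target in edges:
        bags[source].add(target)
    return bags


def propagate(outer_bags, inner_bags):
    """Replace each member of an outer bag by the members of its own bag,
    so that the inner targets become adjacent to the outer source."""
    return {source: set().union(*(inner_bags.get(m, set()) for m in members))
            for source, members in outer_bags.items()}


books_per_author = group_into_bags(writes_book)
authors_per_org = group_into_bags(affiliates)
books_per_org = propagate(authors_per_org, books_per_author)

# Branch-wise sum over :price, now adjacent to the anchor ?org.
totals = {org: sum(price[b] for b in books)
          for org, books in books_per_org.items()}
```

Because &b1 occurs once in the bag for &o1, its price is counted once, giving the totals stated on the slide.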

Page 97: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Lead Dawit Yimam Seid

Page 98: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Analysis and Visualization

Graph analysis GIS Predictive modeling Damage assessment

Ram Hariharan (with Sharad Mehrotra and Chen Li) Searching (open source) GIS data and datasets

Metadata Compression

Page 99: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Analysis and Visualization

Graph analysis GIS Predictive modeling Damage assessment

Vibhav Gogate and Jon Hutchinson (with Padhraic Smyth)

Activity monitoring and prediction Anomalous event detection

Page 100: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Analysis and Visualization

Graph analysis GIS Predictive modeling Damage assessment

ImageCat Inc (Ron Eguchi, Charles Huyck) INLET, MetaSIM

Page 101: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Artifacts

Page 102: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Disaster Portal

The Disaster Portal is a suite of web applications for disseminating information and providing situational awareness to the general public during a disaster.

Many Communities – Many Disaster Portals
  Contents of sites are administered by the respective city emergency management.
  Easily customized to meet the needs of different communities.
  Regional summarization capabilities built in (e.g. county/state level summary view).

Objectives of the Disaster Portal project are to provide:
  An integrated platform for RESCUE team members to develop, test, and demonstrate their research projects in real-life scenarios.
  Next-generation capabilities to first responders and the public.

Key development partner: City of Ontario

Page 103: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Community Deployment of Disaster Portal

Applications selected from Disaster Portal suite.

Portal framework providing situation summary page, custom look-and-feel

Page 104: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

http://www.disasterportal.org:8380/Ontario/

Page 105: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Applications Available in Disaster Portal Suite, with related Research Topics

Crisis Alerts: Key contacts at companies / organizations can sign up for customized information updates via web or phone.
  Research topic: Scalable rapid dissemination

Donation Management: Individuals and organizations post needs and donations; helps coordinate the matching process.
  Research topic: Complex publish-subscribe systems

Family Reunification: Search for contact info of a displaced family member.
  Research topic: Information extraction & data cleaning

Shelter Information: Announcements and status information for open emergency shelters.

Travel Planning: Current and predicted traffic conditions.
  Research topic: Activity modeling algorithms

Disaster-Oriented Web Search: Find information not already included in the site.
  Research topic: Multidimensional analysis algorithms

Legend: Included in Ontario Pilot Disaster Portal

Page 106: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

SAMI: Situational awareness systems

Extraction and synthesis: semantic extraction from text, audio-visual extraction

Data management: event model, SAT-ware, spatial indexing

Analysis: graph analysis, geospatial, predictive modeling, damage assessment

Page 107: Towards situational awareness systems for disaster response Naveen Ashish Calit2@UC-Irvine Bell Labs India, Bangalore, 04/23/07

Conclusions
  Situational data management
  Semantics
  Synergies
  Integrated demonstration

Thank you !

[email protected]