presented by: shahab helmi spring 2016cse.ucdenver.edu/~bdlab/seminar/2016/6.pdf · the authors...

Presented by: Shahab Helmi

Spring 2016

Authors:

Publication: ICDE 2015

Type: Research Paper

This paper presents Searchlight, a Continuous Query Processing Framework (CQPF) enabling context-aware Continuous Query Processing (CQP) of both actual and predicted movement of objects in symbolic spaces.

The heart of Searchlight is a novel graph-based symbolic space model, the Searchlight Graph (SLG), capable of modeling object movement in both indoor and outdoor symbolic space, and including the novel concept of object and location keywords for capturing rich contextual information and enabling context-aware (predictive) querying.

The SLG is a vertex-labeled directed graph represented as the tuple (V, E, K, Lids, O, Oids, S, fp, T).

The vertices V represent symbolic locations in indoor or outdoor space. A vertex v = (lid, Kl) consists of a location ID and a set of location keywords, e.g. Parking, Canteen and …

The edges E are directed and represent possible movements between vertices. An edge e = (Vi, Vj, Cmap) is composed of the source vertex Vi, the target vertex Vj, and an edge weight function Cmap that maps a set of keywords and a time instance to a weight denoting the travel time in seconds.

The moving objects in the graph are O. A moving object o = (Oid, KO, H, P) consists of an object ID, a set of object keywords such as “wheelchair-user”, a sequence of location history H, and the anticipated relative future movement P.
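The vertex, edge, and moving-object definitions above can be sketched as plain data types. This is a minimal illustration, not the authors' implementation: the class names, the `door_cost` function, and its travel times are assumptions made for the example, and the location IDs CT2 and E2 are borrowed from the example figure.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Tuple

# Assumed signature for Cmap: (keywords, time instance) -> travel time in seconds.
CostMap = Callable[[FrozenSet[str], float], float]


@dataclass(frozen=True)
class Vertex:
    lid: str                   # location ID
    keywords: FrozenSet[str]   # location keywords Kl, e.g. {"Canteen"}


@dataclass
class Edge:
    source: str                # location ID of Vi
    target: str                # location ID of Vj
    cmap: CostMap              # edge weight function Cmap


@dataclass
class MovingObject:
    oid: str                   # object ID
    keywords: FrozenSet[str]   # object keywords KO
    # H: visited (location ID, time) pairs; P: anticipated (location ID, relative seconds).
    history: List[Tuple[str, float]] = field(default_factory=list)
    predicted: List[Tuple[str, float]] = field(default_factory=list)


def door_cost(keywords: FrozenSet[str], t: float) -> float:
    """Hypothetical weights: wheelchair users need longer on this edge."""
    return 45.0 if "wheelchair-user" in keywords else 20.0


vertices = {
    "CT2": Vertex("CT2", frozenset({"Canteen"})),
    "E2": Vertex("E2", frozenset({"Entrance"})),
}
# A single directed edge: movement is possible from CT2 to E2 only.
edges = [Edge("CT2", "E2", door_cost)]

walker = MovingObject("o1", frozenset({"student"}))
wheeler = MovingObject("o2", frozenset({"wheelchair-user"}))

for obj in (walker, wheeler):
    print(obj.oid, edges[0].cmap(obj.keywords, 0.0))
```

Making the edge weight a function rather than a number is what lets the same edge yield different travel times for different objects and times of day.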

Different departments are shown with different shades of gray.

Dashed vertices are outdoor locations.

The prefix letters of the location ids show the entity type, e.g., A is an auditorium and CT is a canteen.

Edges are directed, to model, e.g., that the entrances E2 and E3 can only be accessed through unidirectional automatic doors from CT2 and P2, respectively.

Queries can be expressed using the declarative Searchlight Query Language (SLQL).

Range Query: An example range query is to continuously monitor which security guards have been visiting the Math Department at Aalborg University during the past four hours.

Aggregate Query: An example aggregate query is to continuously monitor how many people are predicted to be located in the Biology Department at Aalborg University during the next 60 minutes.

Position Query: An example position query is to continuously monitor the current location of each disabled person inside the Department of Computer Science at Aalborg University.
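The keyword matching behind a position query can be illustrated in Python. This sketch is not SLQL and not Searchlight's evaluation strategy: the in-memory dictionaries, the location IDs, and the `position_query` function are hypothetical, and a real continuous query would be re-evaluated incrementally rather than recomputed from scratch.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical state: location keywords per location ID, and
# (object keywords, current location ID) per object ID.
location_keywords: Dict[str, FrozenSet[str]] = {
    "O1": frozenset({"Office", "Computer Science"}),
    "A1": frozenset({"Auditorium", "Computer Science"}),
    "CT1": frozenset({"Canteen"}),
}
objects: Dict[str, Tuple[FrozenSet[str], str]] = {
    "o1": (frozenset({"disabled"}), "A1"),
    "o2": (frozenset({"student"}), "O1"),
    "o3": (frozenset({"disabled"}), "CT1"),
}


def position_query(object_kw: str, location_kw: str) -> List[Tuple[str, str]]:
    """Return (object ID, location ID) for matching objects at matching locations."""
    return sorted(
        (oid, lid)
        for oid, (kws, lid) in objects.items()
        if object_kw in kws and location_kw in location_keywords[lid]
    )


print(position_query("disabled", "Computer Science"))

# Simulate a position update; a continuous query reports the changed result.
objects["o3"] = (frozenset({"disabled"}), "O1")
print(position_query("disabled", "Computer Science"))
```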

Position Query: report the location of disabled students in the CS department every 5 seconds:

Presented by: Dardan Xhymshiti

Spring 2016

Authors:

Publication: ICDE 2015

Type: Research Paper

Nowadays location-based social networks are becoming abundant sources of geo-related information.

Geographical location is a new way of connecting people.

Tweets can be used as a source for retrieving this geo-related information.

The authors defined an approach that uses tweets to find the top-k local users who are familiar with relevant issues queried in a certain spatial region. People in need can directly communicate with those recommended local users on the Twitter platform.

Traditional IR techniques are used to retrieve information from long textual documents rich in keywords. They are not suitable for searching short-sized social media data that are characterized by few keywords.

Twitter offers a search service for searching the top ranked tweets, but this search does not handle spatial aspects.

An earlier search engine, Aardvark, retrieved tweets based on query keywords, but this approach returns too many results.

The authors defined the top-k local user search query (TkLUS).

General example: Given a location q, a distance r, and a set of words W, the TkLUS query finds the top-k users who have posted tweets relevant to the desired keywords in W at a place within the distance r from q.
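A brute-force version of this query can be sketched as follows. It is only meant to make the definition concrete: the scoring formula (keyword overlap weighted by proximity) is an assumption made here and is not one of the authors' ranking methods, and the tweet data, function names, and coordinates are invented for illustration.

```python
import heapq
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Tweet = Tuple[str, float, float, str]  # (user, latitude, longitude, text)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def tklus(tweets: Iterable[Tweet], q: Tuple[float, float], r_km: float,
          words: Set[str], k: int) -> List[Tuple[str, float]]:
    """Scan all tweets, keep those within r_km of q, and rank users by score."""
    wanted = {w.lower() for w in words}
    scores: Dict[str, float] = defaultdict(float)
    for user, lat, lon, text in tweets:
        d = haversine_km(q[0], q[1], lat, lon)
        if d > r_km:
            continue  # outside the query region
        matched = len(wanted & set(text.lower().split()))
        if matched == 0:
            continue  # no query keyword in this tweet
        # Assumed scoring: fraction of keywords matched, weighted by proximity.
        scores[user] += (matched / len(wanted)) * (1.0 - d / r_km)
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])


tweets = [
    ("alice", 39.7400, -104.9900, "best pizza downtown"),
    ("bob", 39.7500, -104.9800, "pizza and pasta place"),
    ("carol", 40.7128, -74.0060, "new york pizza"),
    ("alice", 39.7380, -104.9910, "pasta night"),
]
result = tklus(tweets, q=(39.7392, -104.9903), r_km=5.0, words={"pizza", "pasta"}, k=2)
print([user for user, _ in result])
```

A linear scan like this does not scale to a high volume of geo-tagged tweets, which is the motivation for an index that is aware of both keywords and locations.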

The authors propose two local user ranking methods that integrate text relevance and location proximity in a TkLUS query.

They construct a hybrid index under a scalable framework, which is aware of keywords as well as locations, to organize a high volume of geo-tagged tweets.

The authors devise two algorithms for processing TkLUS queries.

The authors conducted experiments on real Twitter data sets.

Presented by: Elban Avdylaj

Summer 2016

Authors:

Publication: ICDE 2015

Type: Research Paper

Internet users are shifting from searching on traditional media to social network platforms (SNPs) to retrieve up-to-date and valuable information.

SNPs have two unique characteristics: frequent content update and the small-world phenomenon.

A social network exhibits the small-world phenomenon if any two individuals in the network are likely to be connected through a short sequence of intermediate acquaintances.

Existing works are not able to support these two features simultaneously.

To address this problem, the authors develop a general framework to enable real-time personalized top-k queries.

This framework incorporates time freshness, social relevance and textual similarity.
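One way to picture how the three dimensions could be combined into a single ranking score is a weighted sum. Everything in this sketch is an assumption made for illustration: the exponential decay for freshness, the 1/(1 + hops) social relevance, the Jaccard text similarity, and the equal weights are not taken from the paper.

```python
from typing import Set, Tuple


def freshness(age_seconds: float, half_life: float = 3600.0) -> float:
    """Assumed decay: the score halves every half_life seconds."""
    return 0.5 ** (age_seconds / half_life)


def social_relevance(hops: int) -> float:
    """Assumed form: closer users in the social graph are more relevant."""
    return 1.0 / (1.0 + hops)


def text_similarity(query: Set[str], doc: Set[str]) -> float:
    """Jaccard similarity between query terms and document terms."""
    union = query | doc
    return len(query & doc) / len(union) if union else 0.0


def score(age_seconds: float, hops: int, query: Set[str], doc: Set[str],
          weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted sum of the three dimensions (weights are hypothetical)."""
    a, b, c = weights
    return (a * freshness(age_seconds)
            + b * social_relevance(hops)
            + c * text_similarity(query, doc))


query = {"icde", "deadline"}
recent_friend_post = score(600, 1, query, {"icde", "deadline", "extended"})
old_stranger_post = score(86400, 5, query, {"icde", "deadline", "extended"})
print(recent_friend_post > old_stranger_post)
```

With identical text, the recent post by a nearby user outranks the day-old post by a distant one, which is the personalization effect the framework targets.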

Search on Social Media Platform:

Facebook developed Unicorn to handle large-scale query processing. While it supports socially related search, the feature is only available for predefined entities rather than for arbitrary documents.

Twitter’s real-time query engine, Earlybird, has also been reported to offer high-throughput query evaluation for a fast rate of incoming tweets. Unfortunately, it fails to consider social relationships.

Search on Social Network:

Several research works have been proposed for real time search indices over SNPs.

However, none of them offers customized search for the query user.

To ensure efficient update and query processing, there are two key challenges.

1. Design an index structure that is update-friendly while supporting instant query processing.

2. Efficiently compute the social relevance in a complex graph.

To address these challenges, the authors first design a novel 3D cube inverted index to support efficient pruning on the three dimensions simultaneously. Then they devise a cube-based threshold algorithm to retrieve the top-k results, and propose several pruning techniques to optimize the social distance computation, whose cost dominates the query processing. Furthermore, they optimize the 3D index via a hierarchical partition method to enhance their pruning on the social dimension.
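To see why social distance computation is costly and why bounding it helps, consider a plain breadth-first search with a hop limit. This is a generic technique shown under assumptions, not one of the authors' pruning methods; the graph, the function name, and the `max_hops` parameter are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def social_distance(graph: Dict[str, List[str]], source: str, target: str,
                    max_hops: int) -> Optional[int]:
    """Hop count from source to target, or None if it exceeds max_hops."""
    if source == target:
        return 0
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue  # prune: do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return None


# A small undirected friendship chain a - b - c - d.
friends = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}
print(social_distance(friends, "a", "d", max_hops=3))
print(social_distance(friends, "a", "d", max_hops=2))
```

Because of the small-world property, an unbounded search from one user reaches a large share of the network within a few hops, so cutting the search off early avoids most of the work for users who cannot rank in the top-k anyway.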

Twitter - 17M users, 476M tweets

Memetracker - 9M media and 96M records.
