Project Lachesis: Parsing and Modeling Location Histories Daniel Keeney CS 4440

Post on 13-Dec-2015


Introduction

• A location history is a record of an entity’s location in geographical space over time

• Archaeologists and historians look at migrations and census data to reconstruct location histories

• New technologies such as GPS greatly improve the accuracy and resolution of location histories

Resolution

• Old temporal resolutions ranged from a decade to a century

• Old spatial resolutions ranged from tens to hundreds of kilometers

• GPS accuracy opens up a completely different type of analysis

Goal

• By tracking locations in real time, new types of analysis can be performed

• Goal: condense, understand, and predict the movements of an object over a period of time

Stays and Destinations

• A stay is a single instance of an object spending some time in one place

• A destination is any place where one or more objects have experienced a stay

• A trip occurs between two adjacent stays made by the same object

• A path is a representation that describes a set of trips between destinations

Calculating Stays

• The roaming distance, $l_{roam}$, is how far an object can stray while still being counted as a stay

• The stay duration, $t_{dur}$, is how long an object must remain within the roaming distance to count as a stay

• The medoid is the data point nearest to the “center” of the set

Calculating Stays

• Worst case: $O(n^2)$ for $n$ data points, because the medoid and diameter computations examine all pairs of points

• In practice, the clusters that require computation are far smaller than $n$, effectively yielding $O(n)$
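The stay calculation above can be sketched as a sliding window over a time-sorted track. This is only an illustration under stated assumptions, not necessarily the exact algorithm from the slides: it assumes planar (x, y) coordinates with Euclidean distance, a track given as (t, x, y) tuples, and the function names are hypothetical.

```python
import math

def dist(p, q):
    """Euclidean distance between two (x, y) points (planar coordinates assumed)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def diameter(points):
    """Largest pairwise distance in a set of points; examines all pairs."""
    return max((dist(p, q) for p in points for q in points), default=0.0)

def medoid(points):
    """The point with the smallest total distance to all other points."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def extract_stays(track, roam, dur):
    """track: list of (t, x, y) sorted by t. Returns [(medoid, t_start, t_end), ...]."""
    stays = []
    i, n = 0, len(track)
    while i < n:
        # First index whose timestamp is at least `dur` after point i.
        j = next((k for k in range(i, n) if track[k][0] >= track[i][0] + dur), None)
        if j is None:
            break  # not enough remaining time for another stay
        pts = [(x, y) for _, x, y in track[i:j + 1]]
        if diameter(pts) > roam:
            i += 1  # the object strayed too far; move the window start forward
            continue
        # Extend the window while each new point stays within `roam` of all others.
        while j + 1 < n and all(dist(track[j + 1][1:], q) <= roam for q in pts):
            j += 1
            pts.append(track[j][1:])
        stays.append((medoid(pts), track[i][0], track[j][0]))
        i = j + 1
    return stays
```

The all-pairs work in `diameter` and `medoid` is confined to each candidate window, which is why the cost stays close to linear when windows are small relative to the whole track.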

Calculating Destinations

• The geographic scale, $l_{dest}$, determines how close two stays can be and still be considered the same destination

• Destinations are represented by a location as well as the scale used:

$d_j = (l_j, l_{dest}^{j})$
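One way to group stays into destinations at a given geographic scale is agglomerative clustering. The sketch below is an assumption for illustration: the slides do not state the clustering method, and the centroid-based merge rule and the function name `cluster_destinations` are hypothetical. Each destination is returned as a (location, scale) pair, matching the representation above.

```python
import math

def cluster_destinations(stay_locations, scale):
    """Greedy agglomerative clustering of (x, y) stay locations.

    Repeatedly merges the two clusters whose centroids are closest and
    stops once the closest pair is farther apart than `scale`.
    Returns a list of (centroid, scale) destinations.
    """
    clusters = [[p] for p in stay_locations]

    def centroid(c):
        return (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))

    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > scale:
            break  # no remaining pair is close enough to be one destination
        clusters[a].extend(clusters.pop(b))
    return [(centroid(c), scale) for c in clusters]
```

Running the same stays through a larger `scale` merges more of them, so one location history can yield city-level or building-level destinations depending on the scale chosen.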

Calculating Destinations

Example

Creating Probabilistic Models

Assumptions:

• At the beginning of a given time interval, an object is at exactly one destination

• During any given time interval, an object makes exactly one transition between destinations

• Self-transitions are allowed

Creating Probabilistic Models

• Models are similar to Hidden Markov Models

• The critical difference from an HMM is the incorporation of time dependence: transition probabilities are conditioned on recurring time intervals
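To make "recurring time intervals" concrete, a timestamp can be mapped to an interval index that repeats on a fixed cycle. The sketch below assumes one-hour intervals recurring weekly; the slides do not state the interval length or period, so both are assumptions, and `interval_index` is a hypothetical name.

```python
from datetime import datetime

def interval_index(timestamp: datetime) -> int:
    """Map a timestamp to a recurring hour-of-week interval k in 0..167.

    Assumption: intervals are one hour long and recur weekly, with k = 0
    at Monday 00:00. Other interval lengths or periods work the same way.
    """
    return timestamp.weekday() * 24 + timestamp.hour
```

With this mapping, every Monday between 09:00 and 10:00 falls in the same interval, so observations from many weeks contribute to the same row of the model.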

Creating Probabilistic Models

• Model consists of three probability matrices

• Probability of the object starting time interval $k$ at destination $d_i$ is $\Pi = \{\pi(d_i, k)\}$

• Probability of transition from $d_i$ to $d_j$ during interval $k$ is $A = \{a(d_i, d_j, k)\}$

• Observation probability of observing the object at $d_j$ when it is actually at $d_i$ is $B = \{b(d_i, d_j)\}$

Calculating $\Pi = \{\pi(d_i, k)\}$

Calculating $A = \{a(d_i, d_j, k)\}$

Calculating $B = \{b(d_i, d_j)\}$
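A minimal sketch of how $\Pi$ and $A$ could be estimated by relative-frequency counting is given below. It is an assumption for illustration rather than the formulas from the slides: each training history is taken to be one pass through the recurring intervals, with `seq[k]` the destination at the start of interval $k$, and the observation matrix $B$ is left out.

```python
from collections import Counter, defaultdict

def estimate_pi_and_a(histories):
    """Estimate pi(d_i, k) and a(d_i, d_j, k) by counting.

    histories: list of sequences; seq[k] is the destination occupied at
    the start of recurring interval k (hypothetical input layout).
    Returns two dicts keyed by (d_i, k) and (d_i, d_j, k).
    """
    start_counts = defaultdict(Counter)  # k -> counts of destinations
    trans_counts = defaultdict(Counter)  # (k, d_i) -> counts of next d_j
    for seq in histories:
        for k, d in enumerate(seq):
            start_counts[k][d] += 1
            if k + 1 < len(seq):
                trans_counts[(k, d)][seq[k + 1]] += 1
    # Normalize counts into probabilities per interval / per source state.
    pi = {(d, k): c / sum(cnt.values())
          for k, cnt in start_counts.items() for d, c in cnt.items()}
    a = {(di, dj, k): c / sum(cnt.values())
         for (k, di), cnt in trans_counts.items() for dj, c in cnt.items()}
    return pi, a
```

Self-transitions need no special handling here: a history that stays at the same destination across an interval simply increments the count for $a(d_i, d_i, k)$.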

Calculating Probabilistic Models

• Together, these tables represent a probabilistic model $\lambda = (\Pi, A, B)$

• This model can be used to solve problems such as finding the most likely destination occupied at a particular time, determining the relative likelihood of a location history sequence, or generating a location history sequence

Calculating Probabilistic Models

• Using λ we estimate the relative likelihood of a new location history

• This is done using a Non-Markovian Solution and a Markovian Solution

Non-Markovian Solution

Markovian Solution
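The two solutions can be contrasted with a short sketch. It assumes the usual forms: the non-Markovian score treats each interval independently using only $\pi$, while the Markovian score chains transition probabilities from $A$. Both ignore the observation matrix $B$, and the `floor` value used for unseen events is an assumption, so this is an illustration rather than the exact formulas from the slides.

```python
import math

def log_likelihood_non_markov(seq, pi, floor=1e-6):
    """Sum over k of log pi(d_k, k); each interval is scored independently."""
    return sum(math.log(pi.get((d, k), floor)) for k, d in enumerate(seq))

def log_likelihood_markov(seq, pi, a, floor=1e-6):
    """log pi(d_0, 0) plus the sum over k of log a(d_k, d_{k+1}, k)."""
    total = math.log(pi.get((seq[0], 0), floor))
    for k in range(len(seq) - 1):
        total += math.log(a.get((seq[k], seq[k + 1], k), floor))
    return total

# Tiny hand-written tables, keyed as (d_i, k) and (d_i, d_j, k).
pi = {("home", 0): 1.0, ("work", 1): 0.5, ("home", 1): 0.5}
a = {("home", "work", 0): 0.5, ("home", "home", 0): 0.5}

typical = ["home", "work"]
atypical = ["home", "gym"]  # "gym" never appears in the tables
print(log_likelihood_non_markov(typical, pi), log_likelihood_non_markov(atypical, pi))
print(log_likelihood_markov(typical, pi, a), log_likelihood_markov(atypical, pi, a))
```

Working in log space avoids numerical underflow when a history spans many intervals, and only differences between scores matter when comparing relative likelihoods.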

Experiment Results

[Figure: number of stays (0 to 2000) versus stay duration in minutes (0 to 200)]

Experiment Results

[Figure: number of hours (0 to 9) by day of week (S M T W T F S)]

“I always felt more productive on Tuesdays.” - Subject A

Experiment Results

[Figure: average number of destinations (0 to 12) by month of year]

Experiment Results

[Figure: destination ID (50 to 250) by day of week for Week 16 (left) and Week 31 (right) from the training data]

A typical (left) and an atypical (right) week from Subject A.

Experiment Results

[Figure: destination ID (50 to 250) by day of week for a stochastically generated week using the Non-Markov model (left) and the Markov model (right)]

Plots of synthesized weeks, using Non-Markov (left) and Markov (right) models
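Synthesized histories like these could be produced by sampling from the model tables. The sketch below is an assumption about how such sampling might look, not the procedure used in the experiments: table layouts and function names are hypothetical, and the tables are assumed complete, meaning every interval has at least one entry in $\pi$ and every reachable destination has at least one outgoing transition in $A$.

```python
import random

def sample(weights_by_dest, rng):
    """Draw one destination in proportion to its weight."""
    dests = list(weights_by_dest)
    return rng.choices(dests, weights=[weights_by_dest[d] for d in dests])[0]

def generate_non_markov(pi, num_intervals, rng):
    """Draw each interval's destination independently from pi(., k)."""
    return [sample({d: p for (d, kk), p in pi.items() if kk == k}, rng)
            for k in range(num_intervals)]

def generate_markov(pi, a, num_intervals, rng):
    """Draw the first destination from pi(., 0), then follow a(d_i, ., k)."""
    seq = [sample({d: p for (d, k), p in pi.items() if k == 0}, rng)]
    for k in range(num_intervals - 1):
        row = {dj: p for (di, dj, kk), p in a.items()
               if di == seq[-1] and kk == k}
        seq.append(sample(row, rng))
    return seq
```

Passing an explicit `random.Random(seed)` as `rng` keeps generated weeks reproducible. The Markov generator conditions each step on the previous destination, so its output follows transitions seen in training, whereas the independent draws can jump between destinations arbitrarily from one interval to the next.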

Markov vs. Non-Markov

• The Markovian model assigned an unexpectedly high probability to an atypical week

• This could be mitigated by “training” on larger data sets, but the Non-Markovian model is generally sufficient

Conclusions

• Proposed rigorous definitions for location histories, stays, and destinations, as well as accompanying algorithms

• The Non-Markovian model is better suited for evaluating the likelihood of a location history

• The Markovian model is better for stochastically generating a history

• Future papers will examine trips and paths