Project Lachesis: Parsing and Modeling Location Histories
Daniel Keeney, CS 4440
Introduction
• Location History is a record of an entity’s location in geographical space over time
• Archaeologists and historians look at migrations and census data to reconstruct location histories
• New technologies such as GPS greatly improve the accuracy and resolution of these records
Resolution
• Old temporal resolutions ranged from a decade to a century
• Old spatial resolutions ranged from tens to hundreds of kilometers
• GPS accuracy opens up a completely different type of analysis
Goal
• By tracking locations in real time, new types of analysis can be performed
• Goal: condense, understand, and predict the movements of an object over a period of time
Stays and Destinations
• Stay is a single instance of an object spending some time in one place
• Destination is any place where one or more objects have experienced a stay
• Trip occurs between two adjacent stays made by the same object
• Path is a representation of the description of a set of trips between destinations
Calculating Stays
• The roaming distance, $l_{roam}$, is how far an object can stray while still being counted as in a stay
• The stay duration, $t_{dur}$, is how long an object must remain within the roaming distance to count as a stay
• The medoid is the data point nearest to the “center” of the set
Calculating Stays
• Worst case: $O(n^2)$ for $n$ data points, because the medoid and diameter computations work on all pairs
• In practice, the clusters that require computation are far smaller than $n$, effectively yielding $O(n)$
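The stay definitions above can be sketched in code. This is a minimal illustration, not the project's actual implementation: the `Fix` record, the planar coordinates in metres, the function names, and the choice of "smallest total distance to the other points" as the medoid criterion are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Fix:
    """One location sample (hypothetical record type for this sketch)."""
    t: float  # timestamp in seconds
    x: float  # planar coordinates in metres (assumed already projected)
    y: float


def dist(a, b):
    return hypot(a.x - b.x, a.y - b.y)


def diameter(points):
    """Largest pairwise distance; quadratic in len(points)."""
    return max(
        (dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]),
        default=0.0,
    )


def medoid(points):
    """Data point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def extract_stays(fixes, l_roam, t_dur):
    """Return (medoid, t_start, t_end) tuples from time-ordered fixes."""
    stays = []
    i, n = 0, len(fixes)
    while i < n:
        # Find the first index j at least t_dur after fix i.
        j = i
        while j < n and fixes[j].t - fixes[i].t < t_dur:
            j += 1
        if j == n:
            break  # the remaining data is too short to form a stay
        if diameter(fixes[i:j + 1]) > l_roam:
            i += 1  # object strayed too far; try the next starting fix
            continue
        # Extend the window while it stays within the roaming distance.
        while j + 1 < n and diameter(fixes[i:j + 2]) <= l_roam:
            j += 1
        stays.append((medoid(fixes[i:j + 1]), fixes[i].t, fixes[j].t))
        i = j + 1
    return stays
```

The quadratic cost appears only inside `diameter` and `medoid`, which run on the current window rather than on all n points. That is consistent with the observation that real clusters are much smaller than n.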
Calculating Destinations
• Geographic scale, $l_{dest}$, determines how close two stays can be and still be considered the same destination
• Destinations are represented by a location as well as the scale used: $d_j = (l_j, l^{j}_{dest})$
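One way to picture how the geographic scale groups stays into destinations is the greedy sketch below. The slides do not say which clustering procedure is used, so this is a deliberately simple stand-in: the function name, the planar `(x, y)` stay locations, and the first-match assignment rule are all assumptions.

```python
from math import hypot


def assign_destinations(stay_locations, l_dest):
    """Greedy sketch: map each stay location (x, y) to a destination index.

    A stay joins the first existing destination within l_dest of it;
    otherwise it founds a new destination at its own location.
    """
    locations = []  # representative (x, y) of each destination
    labels = []     # destination index for each stay, in input order
    for x, y in stay_locations:
        for idx, (dx, dy) in enumerate(locations):
            if hypot(x - dx, y - dy) <= l_dest:
                labels.append(idx)
                break
        else:
            locations.append((x, y))
            labels.append(len(locations) - 1)
    # Each destination is a pair: its location and the scale used.
    destinations = [(loc, l_dest) for loc in locations]
    return destinations, labels
```

A larger `l_dest` merges nearby stays into fewer, coarser destinations. A smaller one separates them, for example into individual buildings rather than a whole campus.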
Creating Probabilistic Models
Assumptions:
• At the beginning of a given time interval, an object is at exactly one destination
• During any given time interval, an object makes exactly one transition between destinations
• Self-transitions are allowed
Creating Probabilistic Models
• Models are similar to Hidden Markov Models
• Critical difference from HMM is the incorporation of time-dependence, where transition probabilities are conditioned on recurring time intervals
Creating Probabilistic Models
• Model consists of three probability matrices
• Probability of the object starting time interval $k$ at destination $d_i$ is $\Pi = \{\pi(d_i, k)\}$
• Probability of transition from $d_i$ to $d_j$ during interval $k$ is $A = \{a(d_i, d_j, k)\}$
• Observation probability of observing the object at $d_j$ when it is actually at $d_i$ is $B = \{b(d_i, d_j)\}$
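As a rough illustration of how the first two tables could be filled in, the sketch below estimates them by counting over repeated periods (for example, weeks divided into intervals). The function name, the nested-dictionary layout, and the plain relative-frequency estimate are assumptions. The observation table B is left out because estimating it needs ground-truth locations that this toy input does not have.

```python
from collections import Counter, defaultdict


def estimate_pi_and_a(periods):
    """Estimate start and transition tables by relative frequency.

    periods[w][k] is the destination id occupied at the start of
    interval k in period w; all periods must have the same length.
    Returns (pi, a), where pi[k][d] approximates pi(d, k) and
    a[k][d_i][d_j] approximates a(d_i, d_j, k).
    """
    num_intervals = len(periods[0])
    start_counts = [Counter() for _ in range(num_intervals)]
    trans_counts = [defaultdict(Counter) for _ in range(num_intervals - 1)]
    for seq in periods:
        for k, d in enumerate(seq):
            start_counts[k][d] += 1
            if k + 1 < num_intervals:
                # Transition made during interval k: from d to the next one.
                trans_counts[k][d][seq[k + 1]] += 1
    pi = [
        {d: c / len(periods) for d, c in counts.items()}
        for counts in start_counts
    ]
    a = [
        {
            d_i: {d_j: c / sum(row.values()) for d_j, c in row.items()}
            for d_i, row in table.items()
        }
        for table in trans_counts
    ]
    return pi, a
```

Because the tables are indexed by the interval k, the same pair of destinations can have different transition probabilities at different times of the week. This is the time-dependence that separates the model from a standard HMM.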
Calculating Probabilistic Models
• Together, as $\lambda = (\Pi, A, B)$, these tables represent a probabilistic model
• This model can be used to solve problems such as finding the most likely destination occupied at a particular time, determining the relative likelihood of a location history sequence, or generating a location history sequence
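Two of these uses can be sketched with small hand-written tables. The dictionary layout, the function names, and the example probabilities are illustrative assumptions and are not taken from the experiments. The generator walks the time-dependent transition table and ignores observation noise.

```python
import random


def most_likely_destination(pi, k):
    """Destination with the highest probability at the start of interval k."""
    return max(pi[k], key=pi[k].get)


def draw(distribution, rng):
    """Sample one key from a {destination: probability} dictionary."""
    names = list(distribution)
    weights = [distribution[name] for name in names]
    return rng.choices(names, weights=weights)[0]


def generate_history(pi, a, rng):
    """Sample a sequence: start from pi[0], then follow a[k] per interval.

    Assumes every reachable destination has a row in each a[k].
    """
    seq = [draw(pi[0], rng)]
    for table in a:
        seq.append(draw(table[seq[-1]], rng))
    return seq


# Toy tables for three intervals and two destinations (made-up numbers).
pi = [
    {"home": 1.0},
    {"home": 0.25, "work": 0.75},
    {"home": 0.75, "work": 0.25},
]
a = [
    {"home": {"home": 0.25, "work": 0.75}},
    {"home": {"home": 1.0}, "work": {"home": 2 / 3, "work": 1 / 3}},
]

print(most_likely_destination(pi, 1))
print(generate_history(pi, a, random.Random(0)))
```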
Calculating Probabilistic Models
• Using λ we estimate the relative likelihood of a new location history
• This is done using a Non-Markovian Solution and a Markovian Solution
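The slides do not spell out the two solutions, so the sketch below rests on an assumed reading. The non-Markovian score treats each interval independently and multiplies the occupancy probabilities from Π. The Markovian score starts from Π at the first interval and then multiplies the time-dependent transition probabilities from A. The function names and table layout are hypothetical, and log-probabilities are used to avoid underflow on long histories.

```python
from math import inf, log


def log_likelihood_non_markov(seq, pi):
    """Sum of log pi(d_k, k): each interval scored independently."""
    total = 0.0
    for k, d in enumerate(seq):
        p = pi[k].get(d, 0.0)
        if p == 0.0:
            return -inf  # destination never seen at this interval
        total += log(p)
    return total


def log_likelihood_markov(seq, pi, a):
    """log pi(d_0, 0) plus the sum of log a(d_k, d_{k+1}, k)."""
    p = pi[0].get(seq[0], 0.0)
    if p == 0.0:
        return -inf
    total = log(p)
    for k in range(len(seq) - 1):
        p = a[k].get(seq[k], {}).get(seq[k + 1], 0.0)
        if p == 0.0:
            return -inf  # transition never seen during this interval
        total += log(p)
    return total


# Toy tables for three intervals and two destinations (made-up numbers).
pi = [
    {"home": 1.0},
    {"home": 0.25, "work": 0.75},
    {"home": 0.75, "work": 0.25},
]
a = [
    {"home": {"home": 0.25, "work": 0.75}},
    {"home": {"home": 1.0}, "work": {"home": 2 / 3, "work": 1 / 3}},
]

typical = ["home", "work", "home"]
print(log_likelihood_non_markov(typical, pi))
print(log_likelihood_markov(typical, pi, a))
```

Only relative values are meaningful here. The scores are useful for ranking one candidate history against another under the same model, not as absolute probabilities.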
Experiment Results
[Figure: number of hours by day of week (S M T W T F S)]
“I always felt more productive on Tuesdays.” - Subject A
Experiment Results
[Figure: average number of destinations by month of year]
Experiment Results
[Figure: two panels plotting destination ID against day of week (S M T W T F S). Left: Week 16 from Training Data. Right: Week 31 from Training Data.]
A typical (left) and an atypical (right) week from Subject A.
Experimental Results
[Figure: two panels plotting destination ID against day of week (S M T W T F S). Left: A Stochastically Generated Week using Non-Markov Model. Right: A Stochastically Generated Week using Markov Model.]
Plots of synthesized weeks, using Non-Markov (left) and Markov (right) models
Markov vs. Non-Markov
• The Markovian model assigned an unexpectedly high probability to an atypical week
• This could be mitigated by “training” on larger data sets, but generally the Non-Markovian model is sufficient
Conclusions
• Proposed rigorous definitions for location histories, stays, and destinations, as well as accompanying algorithms
• The Non-Markovian model is better suited for evaluating the likelihood of a location history
• The Markovian model is better for stochastically generating a history
• Future papers will examine trips and paths