Learning Significant Locations and Predicting
User Movement with GPS
Daniel Ashbrook and Thad Starner
Contextual Computing Group
http://www.cc.gatech.edu/ccg
College of Computing, GVU Center
Georgia Institute of Technology
Atlanta, GA USA
Daniel Ashbrook and Thad Starner, Georgia Tech
Motivation
• Location is a very common form of context
  – easy to collect
  – can be used to infer other pieces of context
• Most applications rely only on user’s current location
Motivation
• How can we improve location context?
• Look for patterns of movement and learn user’s daily schedule
  – predict where user is going based on where user has been
• Goal: computer can act as agent
  – offer suggestions at appropriate times
  – enable collaboration between colleagues
Applications
• Potential applications for location prediction
• Single-user applications
  – system only knows about one user’s movements
• Multi-user applications
  – system combines predictions for several people
Applications
• Single user: Pre-emptive reminders
  – remind user at an appropriate time
  – example: library book
    • try to determine if user will pass library today
    • only then remind user to take book before leaving home
Applications
• Single user: Wireless caching
  – wireless networks often unavailable
    • lack of infrastructure
    • radio shadows (buildings, subway)
  – hide lack of connectivity by caching
  – predict when caching will be insufficient
    • warn user
    • suggest alternative routes
Applications
• Single user: Wireless caching
  – cache even when network is available
    • transmission power can increase with the 4th power of distance in complex environments (e.g., a city)
    • cost can vary with network used and time of day
  – prediction can allow savings
    • of battery power
    • of money
Applications
• Multi-user: Enabling collaboration
  – “Will I see Bob today?”
    • compare the user’s and Bob’s schedules
    • give yes or no answer
  – scheduling many-person meetings
    • find when most people are free and suggest a time
    • also discover most convenient place to meet
Applications
• Multi-user: Favor exchange
  – remotely coordinate favor trading
  – example: FedEx/UPS package trading
Related Work
• Bhattacharya — cell phone prediction
• Davis — prediction with ad–hoc networks
• Kortuem — Walid
• Marmasse — comMotion
• Liu — predictively caching network architecture
• Orwant — Doppelgänger
• Sparacino — Museum Wearable
• Wolf — travel diaries
Hardware
• Garmin GPS model 35-LVS
• GeoStats data logger
  – 1 MPH recording limit
Hardware
• Preliminary data collected in Atlanta Sep-Dec 2001
• Data currently being collected from multiple users in Zürich, Switzerland
[Figure: preliminary data, Atlanta, GA]
Software
• Preliminary implementation
  – finds points of possible significance
  – creates probabilistic model of user’s movements
    • Markov model
  – using model, simple queries are possible:
    • “The user is at home. Where will she go next?”
    • “How likely is the user to visit the grocery store today?”
Software
• Markov model
  – collection of nodes
  – transitions between nodes
  – each transition has a probability of occurring
  – can also have self-transitions
Software
• Our Markov model
  – nodes are significant locations
  – transitions are trips between those locations
Software
• Significance
  – how do we determine if a particular GPS coordinate might have some meaning to the user?
Software
• Places
  – logged GPS coordinates with more than time t of “resting time”
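A minimal sketch of the place-finding step. It assumes the log is a list of (time in seconds, latitude, longitude) fixes, that the logger stops recording while the user is at rest, and that the last fix before a gap longer than t marks the place. The record layout, the function name `find_places`, and the sample coordinates are illustrative, not taken from the slides.

```python
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (time_s, lat, lon); hypothetical record layout


def find_places(fixes: List[Fix], t_seconds: float = 600.0) -> List[Tuple[float, float]]:
    """Return the coordinates that are followed by a recording gap longer than t_seconds."""
    places = []
    for prev, curr in zip(fixes, fixes[1:]):
        if curr[0] - prev[0] > t_seconds:
            # Assumption: the last fix before the gap is where the user rested.
            places.append((prev[1], prev[2]))
    return places


# Made-up log: the third fix arrives 24 minutes after the second.
log = [
    (0.0, 33.7770, -84.3960),
    (60.0, 33.7775, -84.3965),
    (1500.0, 33.7776, -84.3966),
    (1560.0, 33.7800, -84.3900),
]
print(find_places(log))
```

With the default t of 10 minutes, only the fix before the 24-minute gap is reported as a place.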
Software
• How to pick t?
  – try lots of values
  – graph number of places found for each value
  – but relationship is nearly linear!
  – so we pick an arbitrary value: t = 10 minutes
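The sweep over t can be sketched as below, assuming a place is counted whenever the gap between consecutive fixes exceeds t. The timestamps are invented for illustration and `count_places` is a hypothetical helper; the resulting pairs are what would be graphed.

```python
def count_places(timestamps, t_seconds):
    """Count gaps between consecutive fixes that exceed t_seconds."""
    return sum(1 for a, b in zip(timestamps, timestamps[1:]) if b - a > t_seconds)


# Hypothetical fix times in seconds, with gaps of 2, 5, 12, and 30 minutes embedded.
timestamps = [0, 60, 180, 240, 540, 600, 1320, 1380, 3180, 3240]

# (t in minutes, number of places found) pairs to plot.
sweep = [(minutes, count_places(timestamps, minutes * 60)) for minutes in (1, 5, 10, 20, 40)]
for minutes, n_places in sweep:
    print(minutes, n_places)
```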
Software
• Locations
  – problem: too many places
    • GPS inaccuracy
    • different exit points from buildings
  – solution: cluster places to form locations
    • all places within a radius r of a particular place form a single location
Software
[Figure: three map panels: all data; only places, with t = 10 minutes; only locations]
Software
• How to pick radius r?
  – too large value
    • too few clusters
    • unrelated places together
  – too small value
    • too many clusters
• Solution:
  – try various values for r
  – find knee in graph
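The slides do not say how the knee is located. One common heuristic, shown here purely as an assumption, picks the point on the curve farthest from the straight line joining its two endpoints. The radii and cluster counts are made-up values and `find_knee` is a hypothetical helper.

```python
import math


def find_knee(xs, ys):
    """Index of the point farthest from the chord joining the first and last points."""
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    chord = math.hypot(x1 - x0, y1 - y0)
    best_index, best_distance = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Perpendicular distance from (x, y) to the chord.
        distance = abs((y1 - y0) * (x - x0) - (x1 - x0) * (y - y0)) / chord
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index


radii = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # hypothetical values of r
counts = [60, 35, 22, 18, 16, 15, 14, 13]         # hypothetical number of locations per r
knee = find_knee(radii, counts)
print(radii[knee], counts[knee])
```

Other knee detectors would work equally well here; the point is only that r is read off the graph rather than fixed in advance.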
Software
• Clustering places into locations
  – pick one place
  – find all places within radius r
  – find the mean of those places (x)
  – repeat with x as the new center
  – continue until the mean stops changing
  – start again with another place
  – repeat until no more places
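The steps above can be sketched as follows. This is an illustration under assumptions, not the authors' code: points are treated as flat (x, y) coordinates with Euclidean distance, clustered places are removed before the next seed is picked, and a `max_iter` cap is added as a safety guard. Real GPS data would need a geographic distance function.

```python
import math


def cluster_places(places, radius, max_iter=100):
    """Cluster (x, y) places into location centers by repeatedly re-centering on the mean."""
    remaining = list(places)
    centers = []
    while remaining:
        center = remaining[0]          # pick one place
        members = [remaining[0]]
        for _ in range(max_iter):
            nearby = [p for p in remaining if math.dist(p, center) <= radius]
            if not nearby:
                break
            members = nearby
            mean = (sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members))
            if mean == center:         # the mean has stopped changing
                break
            center = mean              # repeat with the mean as the new center
        centers.append(center)
        # Assumption: places absorbed into a location are not reused as seeds.
        remaining = [p for p in remaining if p not in members]
    return centers


places = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
locations = cluster_places(places, radius=0.5)
print(len(locations), locations)
```

The five sample places collapse into two locations, one near the origin and one near (5, 5).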
Software
• Sublocations
  – problem: large clusters subsume smaller-scale paths
  – solution: create sublocations within larger clusters
Software
• How to determine if sublocations exist?
  – use the same graph-and-knee algorithm on each location
  – if no knee exists, there are not enough points to form sublocations
Software
• Sublocations can have multiple scales
  – Country level
  – State level
  – City level
  – Campus level
Software
• Prediction
  – each location gets a unique ID
    • user may provide a unique name for each location, such as “home” or “work”
  – replace each place in original list with ID
    • result: list of locations that were visited, in the order that they were visited
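A small sketch of the ID substitution, assuming each place is assigned to its nearest location center in flat (x, y) coordinates. The nearest-center rule, the sample centers, and the names are assumptions made for illustration.

```python
import math


def to_location_ids(places, centers):
    """Replace each place with the ID (list index) of the nearest location center."""
    return [
        min(range(len(centers)), key=lambda i: math.dist(place, centers[i]))
        for place in places
    ]


centers = [(0.0, 0.0), (5.0, 5.0)]   # hypothetical location centers; the index is the ID
names = {0: "home", 1: "work"}       # optional user-supplied names
places = [(0.1, 0.0), (5.1, 4.9), (4.9, 5.0), (0.0, 0.2)]

ids = to_location_ids(places, centers)
print(ids)
print([names[i] for i in ids])
```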
Software
• For each location
  – count number of visits to each other location
  – count total number of visits to other locations
  – divide to get probability of transition
  – result: Markov model for each location
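The counting and dividing described above can be sketched like this, assuming the input is the ordered list of visited location names. The visit sequence is invented and `transition_probabilities` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def transition_probabilities(visits):
    """Estimate P(next location | current location) from an ordered visit list."""
    counts = defaultdict(Counter)
    for src, dst in zip(visits, visits[1:]):
        counts[src][dst] += 1                      # visits from src to each other location
    model = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())             # total visits leaving src
        model[src] = {dst: n / total for dst, n in outgoing.items()}
    return model


visits = ["home", "CRB", "home", "CRB", "grocery", "home", "CRB", "home"]
model = transition_probabilities(visits)
print(model["CRB"])
```

In this made-up sequence, two of the three trips leaving CRB go home, so the estimate for CRB → home is 2/3.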
Software
• People don’t move randomly!
  – 23 locations total, so if movement were random the chance of any given transition A → ? would be 1/22 = 4.5%
  – measured ratio CRB → Home = 16/77 = 21%
Software
• Orders of Markov model
  – 1st order: A → ?
    • a given state’s transition probabilities only depend on that state
  – 2nd order: B → A → ?
    • a given state’s transition probabilities depend on that state and the previous state
  – and so on…
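A sketch of how the order changes the model, assuming the history is simply the last `order` locations visited. The visit sequence is invented and `nth_order_model` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def nth_order_model(visits, order):
    """Estimate P(next location | previous `order` locations) from an ordered visit list."""
    counts = defaultdict(Counter)
    for i in range(len(visits) - order):
        history = tuple(visits[i:i + order])       # e.g. ("B", "A") for 2nd order
        counts[history][visits[i + order]] += 1
    return {
        history: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for history, c in counts.items()
    }


visits = ["B", "A", "C", "B", "A", "D", "C", "A", "C"]
first = nth_order_model(visits, 1)
second = nth_order_model(visits, 2)
print(first[("A",)])        # depends only on being at A
print(second[("B", "A")])   # depends on having come to A from B
print(second[("C", "A")])   # depends on having come to A from C
```

The same current location A gives different predictions depending on where the user came from, which is the extra information a second order model captures.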
Software
• First order predictions

Movement                  Probability   % Chance
CRB → Home                16/77         21%
CRB → Hardware store      10/77         13%
CRB → Jake’s Ice Cream    10/77         13%
CRB → Grocery store       8/77          10%
CRB → GA400               5/77          6%
CRB → 10th/14th St.       5/77          6%
CRB → Taco Bell           3/77          4%

Random chance: 1/22 = 4.5%
Software
• Second order predictions

Movement                           Probability   % Chance
Home → CRB                         14/20         70%
Home → CRB → Home                  3/14          21%
Home → CRB → Grocery store         3/14          21%
Home → CRB → Jake’s Ice Cream      2/14          14%
Home → CRB → GA400                 1/14          7%
Home → CRB → 10th/14th St.         1/14          7%
Home → CRB → Hardware store        0/14          0%

Random chance: 1/22 = 4.5%
Software
• How many orders to use?
  – sequence of 141 locations visited
  – 23 total unique locations

Order   Permutations             Approx. expected unique paths   Observed unique paths
1       23 * 22^1 = 506          140                             56
2       23 * 22^2 = 11,132       139                             73
3       23 * 22^3 = 244,904      138                             82
4       23 * 22^4 = 5,387,888    137                             86
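The arithmetic in this table can be reproduced with a few lines. The sketch assumes a path of order n is a start location followed by n moves, each to a different location, and that a sequence of L visits contains L - n such windows. The helper names and the short letter sequence at the end are illustrative only.

```python
def possible_paths(n_locations, order):
    """Any start location, then `order` moves, each to one of the other locations."""
    return n_locations * (n_locations - 1) ** order


def windows_in_sequence(sequence_length, order):
    """Number of consecutive runs of order + 1 locations in the visit sequence."""
    return sequence_length - order


def observed_unique_paths(visits, order):
    """Distinct runs of order + 1 consecutive locations actually seen."""
    return len({tuple(visits[i:i + order + 1]) for i in range(len(visits) - order)})


for order in range(1, 5):
    print(order, possible_paths(23, order), windows_in_sequence(141, order))

# Toy sequence: the pairs are AB, BA, AB, BC, of which three are distinct.
print(observed_unique_paths(list("ABABC"), 1))
```

The gap between possible paths and available windows grows by a factor of 22 with each order, while the data contributes one window fewer each time.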
Future Work
• Collect more data
  – Georgia Tech students in Zürich & Atlanta
• Investigate other sensors for smaller scales
  – RF/IR beacons
• Consider privacy policies
• Add time of day to Markov model
  – predict when a user will leave as well as where they’re going
Future Work
• Schedule “sharpness”
  – always on time = important?
  – example: work at 8 AM vs. grocery store
• Speed of model update vs. accuracy
  – new schedule for college students every term
  – weight new events more heavily?
    • how to avoid unduly weighting one-time trips?
    • use confidence intervals to determine schedule changes
Future Work
• Real-time update of models
  – currently, data is post-processed
  – need full wearable computers for real-time
• User interface
  – visualize location model
  – allow user to influence model
• Favor trading implementation