From research prototype to a data-driven product
Roman Prokofyev
linkedin.com/in/rprokofyev
6th Swiss Data Science Conference, Bern. 14.06.2019
Agenda
● Intro to FAIRTIQ
● Location data challenges
● Data annotation
● Quality assurance
Intro to FAIRTIQ
FAIRTIQ reduces the hurdles to public transport: A valid ticket with one click
• You get a valid ticket with one swipe
• With another click at the end of your journey, you are automatically charged the lowest possible fare for the route traveled
• Ticket is valid in the whole tariff community
• Journey is charged after check-out
Data collection
What is collected?
Accuracy
How it is collected
uses mostly GPS
Fuses multiple sensors: WiFi, Cell, seldom GPS
It’s unknown what sensor was used
Uniqueness of location data
Uniqueness of geo data
Precision of location data: outliers
Precision of location data: vehicles
[Figure panels: Train, Bus]
Integrity of location data
[Figure panels: Time gaps, Log]
Challenges
● Uniqueness: data is never the same on the same path
● Outliers: different vehicles; not only GPS is used
● Time gaps: because of underground sections or OS throttling
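The outlier and time-gap challenges can be illustrated with a minimal sketch. It is not FAIRTIQ's pipeline: the `Fix` record, the 60-second gap threshold, and the 70 m/s speed limit are all assumptions chosen for illustration, and flagging outliers by implied speed is just one common heuristic.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Fix:
    """One location sample (hypothetical record layout)."""
    t: float    # timestamp in seconds
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes in metres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def find_time_gaps(fixes, max_gap_s=60.0):
    """Return (start, end) timestamps of pauses longer than max_gap_s."""
    return [(p.t, q.t) for p, q in zip(fixes, fixes[1:]) if q.t - p.t > max_gap_s]


def find_speed_outliers(fixes, max_speed_mps=70.0):
    """Return indices of fixes that imply an implausible speed.

    Each fix is compared with the last accepted fix, so a single
    jump does not also condemn the good sample that follows it.
    """
    if not fixes:
        return []
    outliers = []
    last = fixes[0]
    for i, fix in enumerate(fixes[1:], start=1):
        dt = fix.t - last.t
        if dt <= 0 or haversine_m(last, fix) / dt > max_speed_mps:
            outliers.append(i)
        else:
            last = fix
    return outliers


track = [
    Fix(0, 46.9480, 7.4390),
    Fix(10, 46.9481, 7.4391),
    Fix(20, 47.0500, 7.4400),   # jump of roughly 11 km in 10 s
    Fix(30, 46.9482, 7.4392),
    Fix(200, 46.9483, 7.4393),  # 170 s without samples
]
print(find_speed_outliers(track))  # [2]
print(find_time_gaps(track))       # [(30, 200)]
```

Both thresholds would need tuning per vehicle type, since a plausible speed for a train is not a plausible speed for a bus.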
Data annotation
What to annotate?
Modes
What to annotate?
Stations
Trains
Time
FAIRTIQ Annotations
Annotate only stations, the rest is derived automatically
Semi-automatic annotations
Stations (annotated): Luzern; Bern; Bern, Bahnhof; Bern, Kursaal
+ Timetable data
Automatic annotation:
● Luzern 12:00 → Train → Bern 13:04
● Bern, Bahnhof 13:08 → Tram → Bern, Kursaal 13:18
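A minimal sketch of how annotated stations plus timetable data could yield the modes and times automatically. The `Connection` record, the `annotate` function, the `not_before` check-in time, and the extra timetable rows are hypothetical; only the Luzern/Bern example values come from the slide. The sketch assumes that a station pair with no direct connection is a transfer on foot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One timetable entry (hypothetical layout); times are zero-padded "HH:MM"."""
    mode: str
    origin: str
    departure: str
    destination: str
    arrival: str


def annotate(stations, timetable, not_before="00:00"):
    """Derive legs (mode and times) from an ordered list of annotated stations.

    For each consecutive station pair, pick the earliest direct connection
    that departs no earlier than the arrival of the previous leg. Pairs
    without a direct connection are skipped (assumed transfer on foot).
    Zero-padded same-day "HH:MM" strings compare correctly as text.
    """
    legs = []
    clock = not_before
    for origin, destination in zip(stations, stations[1:]):
        candidates = [
            c for c in timetable
            if c.origin == origin and c.destination == destination and c.departure >= clock
        ]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: c.departure)
        legs.append(best)
        clock = best.arrival
    return legs


timetable = [
    Connection("Train", "Luzern", "12:00", "Bern", "13:04"),
    Connection("Train", "Luzern", "13:00", "Bern", "14:04"),
    Connection("Tram", "Bern, Bahnhof", "12:58", "Bern, Kursaal", "13:08"),
    Connection("Tram", "Bern, Bahnhof", "13:08", "Bern, Kursaal", "13:18"),
]
stations = ["Luzern", "Bern", "Bern, Bahnhof", "Bern, Kursaal"]

for leg in annotate(stations, timetable, not_before="11:45"):
    print(f"{leg.origin} {leg.departure} -> {leg.mode} -> {leg.destination} {leg.arrival}")
# Luzern 12:00 -> Train -> Bern 13:04
# Bern, Bahnhof 13:08 -> Tram -> Bern, Kursaal 13:18
```

The annotator only supplies the four station names; the vehicle type and the departure and arrival times fall out of the timetable lookup.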
Quality assurance
Metrics
Most common metrics: Precision, Recall.
Ground truth: A B C D
● System output A B C D: P = 1.0, R = 1.0
● System output A B C: P = 1.0, R = 0.75
● System output A B C D E: P = 0.8, R = 1.0
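The precision and recall values on this slide can be reproduced with a short sketch that compares ground truth and system output as sets. The function name `precision_recall` and the ground truth A B C D (the only size consistent with the slide's numbers) are assumptions.

```python
def precision_recall(truth, output):
    """Set-based precision and recall of a system output against ground truth."""
    truth_set, output_set = set(truth), set(output)
    hits = len(truth_set & output_set)
    precision = hits / len(output_set) if output_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    return precision, recall


truth = ["A", "B", "C", "D"]
print(precision_recall(truth, ["A", "B", "C", "D"]))       # (1.0, 1.0)
print(precision_recall(truth, ["A", "B", "C"]))            # (1.0, 0.75)
print(precision_recall(truth, ["A", "B", "C", "D", "E"]))  # (0.8, 1.0)
```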
Precision/Recall drawbacks
The metrics treat elements as unordered sets
Stations A, B, C, D, E returned in any order: P = 1.0, R = 1.0
Sequence alignment
Edit Distance
Ground truth: A B C D E
● System output A B C D E: P_seq = 1.0, R_seq = 1.0
● System output A C B D E (1 insertion + 1 deletion): P_seq = 0.8, R_seq = 0.8
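The slide does not spell out how the sequence-aware precision and recall are computed. One definition that reproduces its numbers, offered here as an assumption, counts as matched only the elements on the longest common subsequence (LCS) of ground truth and system output. The function names below are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def sequence_precision_recall(truth, output):
    """Order-aware precision and recall: only elements aligned in order count."""
    matched = lcs_length(truth, output)
    precision = matched / len(output) if output else 0.0
    recall = matched / len(truth) if truth else 0.0
    return precision, recall


def indel_distance(truth, output):
    """Edit distance that allows insertions and deletions only."""
    return len(truth) + len(output) - 2 * lcs_length(truth, output)


truth = list("ABCDE")
print(sequence_precision_recall(truth, list("ABCDE")))  # (1.0, 1.0)
print(sequence_precision_recall(truth, list("ACBDE")))  # (0.8, 0.8)
print(indel_distance(truth, list("ACBDE")))             # 2
```

For A C B D E the LCS has length 4, so one element must be deleted and one inserted to recover the ground truth. A set-based comparison of the same two sequences would still report P = 1.0 and R = 1.0.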
Key takeaways
● Be aware of the location data challenges
● Know what to annotate and what to automate
● Know what to measure
Thank you for your attention
linkedin.com/in/rprokofyev