  • Addressing the Free-Rider Problem in Public Transport Systems

    Vaibhav Kulkarni, Bertil Chapuis, Benoît Garbinato
    UNIL-HEC Lausanne
    [email protected]

    Abhijit Mahalunkar
    School of Computing, DIT-Dublin
    [email protected]

    ABSTRACT

    Public transport networks constitute an indispensable part of a city by providing mobility services to the general public. To improve ease of access and reduce infrastructural investments, public transport authorities often adopt a proof-of-payment system. Such a system operates by eliminating ticket controls when boarding the vehicle and subjecting travelers to random ticket checks by affiliated personnel (controllers). Although cost-efficient, such a system encourages free-riders, who deliberately evade fares for the transport service. A recent survey by the Association for European Transport estimates hefty income losses due to fare evasion, highlighting that free-riding is a serious problem that needs immediate attention. To this end, we first identify the attack vectors that free-riders can exploit by analyzing crowdsourced data about control locations. Next, we propose a framework that uses generative adversarial networks (GANs) to generate randomized control-location traces in order to minimize the attack vectors. Finally, we propose metrics to evaluate such a system, quantified in terms of increased risk and a higher probability of being subjected to control checks across the city.

    CCS CONCEPTS

    • Information systems → Geographic information systems;

    KEYWORDS

    Public Transport Networks; Fare Evasion; Free-riding; GANs

    ACM Reference Format:
    Vaibhav Kulkarni, Bertil Chapuis, Benoît Garbinato and Abhijit Mahalunkar. 2018. Addressing the Free-Rider Problem in Public Transport Systems. In Proceedings of January 2018 (Technical Report). 2 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

    1 INTRODUCTION

    Public transport free-riding is raising alarming concerns due to the serious income losses it causes public transport service authorities (EUR 250 million in Switzerland, EUR 80 million in Paris and EUR 120 million in Berlin) [1]. Switzerland incurs a revenue loss of between 1% and 15% (mean of 5%), estimated to be one of the highest in the world [2]. The increasing fines have given rise to insurance organizations, which promote fare-dodging by covering the fines for free-riding [5]. The cost-benefit analyses conducted by several public transport authorities show that installing barrier-separated systems or turnstiles is unreasonable [2]. In addition to causing financial losses, free-riding penalizes honest passengers, who need to pay more for the service in subsequent years due to fare increases. Fare evasion therefore needs immediate attention, in the form of efficient countermeasures that anticipate the attack vectors available to free-riders.

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
    Technical Report, DopLab, UNIL
    © 2018 ACM. ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
    https://doi.org/10.1145/nnnnnnn.nnnnnnn

    To this end, we propose a framework for public transport control, with the objective of increasing the risk involved in traveling without a valid ticket. A project with a similar objective, called Project Eagle, was undertaken in New York City [6]. The Eagle team located hotspots where increased fare evasion activity was taking place and heightened the frequency of controls at these places by increasing the number of team members. However, this led to increased expenses due to the appointment of additional personnel and to elevated free-riding activity in other areas of the city. Therefore, a satisfactory tradeoff needs to be maintained between the expenses incurred and the control efficacy.

    As a first step towards addressing this problem, we study the current state of public transport controlling by analyzing crowdsourced data about the locations of ticket controllers collected in Lausanne, Switzerland. We highlight several attack vectors, resulting primarily from the distinct predictability and lack of randomness in the movement patterns of the controllers. To counter them, we propose a framework that generates control locations and the routes connecting them so as to maximize the informativeness about free-riding activity. The location set and the routes can be determined by analyzing the population density flow in the city, using ridership data collected from the public transport authorities. This data is used to train a GAN [4], which is used to formulate control-location traces satisfying certain cost and informativeness bounds. Finally, we suggest statistical metrics to evaluate the generated traces based on their distribution similarity and randomness.

    Problem Statement. Let $L$ be the set of all control locations in the city. Each control location $p \in L$ has an associated cost (transit time) $C(p)$, and each route $R$ between two control locations $p, q \in L$ has an associated cost $C(R)$. Let $T$ be a trace, expressed as a sequence of these routes, with cost $C(T)$. In addition, every location and route has an associated quality, i.e., informativeness (number of travelers controlled); the quality of a trace is denoted $Q(T)$. Our goal is to generate control-location traces such that $\max Q(T) \mid C(T) < B,\ L \sim D(\mu_r, \sigma)$, where the cost is bounded by $B$, $D$ is the distribution of the locations and routes based on the ridership, and the randomness is quantified in terms of $\mu_r$.
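    The budget constraint in the problem statement can be illustrated with a small sketch. It assumes, purely for illustration, that the cost of a trace is the sum of its route costs and that its quality is the sum of the quality of the visited locations; the report does not define either aggregation. The location names, numbers, and helper functions are hypothetical, and the sketch ignores the distributional constraint on the locations.

```python
from itertools import permutations

# Hypothetical per-location quality (travelers controlled) and
# per-route cost (transit time in minutes); values are illustrative only.
QUALITY = {"A": 30, "B": 50, "C": 20}
ROUTE_COST = {("A", "B"): 10, ("B", "C"): 15, ("A", "C"): 30}


def route_cost(p, q):
    """Cost C(R) of the route between p and q (assumed symmetric)."""
    return ROUTE_COST.get((p, q), ROUTE_COST.get((q, p)))


def trace_cost(trace):
    """C(T): assumed here to be the sum of the costs of consecutive routes."""
    return sum(route_cost(p, q) for p, q in zip(trace, trace[1:]))


def trace_quality(trace):
    """Q(T): assumed here to be the sum of the quality of visited locations."""
    return sum(QUALITY[p] for p in trace)


def best_trace(locations, budget):
    """Exhaustively pick the trace with maximal Q(T) subject to C(T) < budget."""
    candidates = [
        t
        for n in range(1, len(locations) + 1)
        for t in permutations(locations, n)
        if trace_cost(t) < budget
    ]
    return max(candidates, key=trace_quality)
```

    Exhaustive search is only workable for a handful of locations; it is used here to make the objective concrete, not as a proposal for how traces should be produced.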

    arXiv:1803.04389v1 [cs.CY] 12 Mar 2018

  • Technical Report, DopLab, UNIL. V. Kulkarni et al.

    [Figure 1, three panels: (a) Inspection probability according to zip-codes; (b) Stops vs. inspection probability (x-axis: busted percentage, 0.00 to 0.48; y-axis: stop); (c) Stops vs. hour vs. inspection occurrences (x-axis: hour of the day; y-axis: stop).]

    Figure 1: Analysis of the crowdsourced data regarding locations of ticket controllers for over four years.

    2 ATTACK VECTORS

    We determine the attack vectors by analyzing crowdsourced data gathered via an application¹ that allows users to mark a controller's current location. We collected a total of 14,500 location coordinates, spanning over four years, of controller movements observed in the city of Lausanne, Switzerland, contributed by more than 20,000 users. We compute (1) the probability of getting inspected at each station and on each line in the city, (2) correlations between location and time, and (3) controller movement predictability.

    As shown in Figures 1a and 1b, the control frequency is noticeably higher at a select few locations than at the rest. Furthermore, we observe strong correlations between certain locations and their inspection times (Figure 1c), and between locations and the control position (i.e., inside or outside the vehicle). In addition, we detect recurring patterns in the controlling schedules across weekday, weekend, and seasonal trends, as well as high movement-prediction confidence.
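    The kind of per-stop and per-hour analysis described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each crowdsourced report is a (stop, hour) pair, approximates inspection probability by the share of reports at a stop, and uses the share of a stop's reports in its busiest hour as a crude indicator of location-time correlation. The stop names appear in Figure 1, but the records themselves are made up.

```python
from collections import Counter

# Hypothetical crowdsourced reports: (stop, hour of day). Illustrative only.
reports = [
    ("Lausanne-Flon", 8), ("Lausanne-Flon", 8), ("Lausanne-Flon", 17),
    ("Renens-Gare", 8), ("EPFL", 12), ("Lausanne-Flon", 8),
]


def inspection_share(reports):
    """Fraction of all reported controls observed at each stop."""
    counts = Counter(stop for stop, _ in reports)
    total = sum(counts.values())
    return {stop: n / total for stop, n in counts.items()}


def peak_hour_share(reports, stop):
    """Share of a stop's reports that fall in its most frequent hour.

    Values near 1 suggest a strong location-time correlation.
    """
    hours = Counter(h for s, h in reports if s == stop)
    return max(hours.values()) / sum(hours.values())
```

    A share of reports is not the same as a rider's true probability of being inspected, which would also depend on ridership at each stop; the sketch only shows how skew and time regularity become visible in such data.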

    This information can be used to devise attacks such as selective ticket purchasing, i.e., purchasing a ticket only if the inspection probability of the ride ($P_{ins}$) times the fine amount ($F_{amt}$) is greater than the price of the ticket ($T_{cost}$), that is, only if $P_{ins} \times F_{amt} > T_{cost}$. It can also facilitate other attacks, such as avoiding stops where controlling is performed outside the vehicle, re-routing paths to minimize the inspection probability, and exploiting the trends to predict and avoid future control locations.
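    The selective ticket purchasing rule is a simple expected-cost comparison, sketched below. The function names and the example fine and ticket amounts are hypothetical; only the inequality itself comes from the text.

```python
def should_buy_ticket(p_ins, fine_amount, ticket_cost):
    """Return True when the expected fine exceeds the ticket price."""
    if not 0.0 <= p_ins <= 1.0:
        raise ValueError("p_ins must be a probability in [0, 1]")
    return p_ins * fine_amount > ticket_cost


def break_even_probability(fine_amount, ticket_cost):
    """Inspection probability above which buying a ticket pays off."""
    return ticket_cost / fine_amount


# Example with made-up amounts: a fine of 100 and a ticket price of 3.50.
print(should_buy_ticket(0.05, 100, 3.5))
print(break_even_probability(100, 3.5))
```

    Read from the operator's side, the break-even value is the inspection probability a free-rider must expect on every ride for evasion to stop being worthwhile, which is why predictable low-probability stops and hours are exploitable.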

    3 CONTROL-TRACE GENERATION

    In this work, we focus on minimizing the attack vectors by eliminating the non-uniform control probability distribution and the spatiotemporal correlations. However, unconditional random sampling from the available location set may weaken control at key locations and increase it at minimal-activity locations. Our solution for generating randomized control traces therefore lies in sampling from the true city-wide population density flows.
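    The difference between unconditional random sampling and sampling that follows the population density flow can be sketched with weighted draws. This is an illustration of the idea only, not the GAN-based method: the flow values are invented, the function name is hypothetical, and locations are drawn independently and with replacement.

```python
import random


def sample_control_locations(flow, k, rng):
    """Draw k control locations with probability proportional to flow."""
    stops = list(flow)
    weights = [flow[s] for s in stops]
    return rng.choices(stops, weights=weights, k=k)


# Hypothetical passenger flow per stop (riders per hour); illustrative only.
flow = {"Lausanne-Flon": 700, "EPFL": 300, "Foret": 0}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = sample_control_locations(flow, 10_000, rng)
```

    Under these weights a stop with no flow is never selected, whereas uniform sampling would send controllers there a third of the time.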

    As the controllers also need to travel between the locations, we first model the city's public transport network as a bipartite graph. Here, both the stations and the routes are represented by nodes [7], and an edge connects each route node to every station node it serves. Such a graph facilitates quantifying the path cost C and the quality Q by utilizing each node's connection degree (neighborhood) and the population density flow, resulting in a time-varying network [3].
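    A bipartite station-route graph of this kind can be represented with plain adjacency sets, as in the sketch below. The line names and the assignment of stations to lines are hypothetical, as are the helper functions; the time-varying weights from the population density flow are left out.

```python
from collections import defaultdict

# Hypothetical route -> stations served; line names are illustrative only.
ROUTES = {
    "line_1": ["Renens-Gare", "EPFL", "Lausanne-Flon"],
    "line_2": ["Lausanne-Gare", "Lausanne-Flon", "Croisettes"],
}


def build_bipartite(routes):
    """Return adjacency sets linking route nodes and station nodes."""
    adj = defaultdict(set)
    for route, stations in routes.items():
        for station in stations:
            adj[("route", route)].add(("station", station))
            adj[("station", station)].add(("route", route))
    return adj


def degree(adj, node):
    """Connection degree: number of neighbors of a node."""
    return len(adj[node])


def reachable_stations(adj, station):
    """Stations reachable from `station` without changing lines."""
    out = set()
    for route in adj[("station", station)]:
        out |= {name for _kind, name in adj[route]}
    out.discard(station)
    return out


adj = build_bipartite(ROUTES)
```

    Because every edge joins a route node to a station node, the degree of a station is the number of lines serving it, and interchange stops stand out as high-degree station nodes.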

    Next, we train a Sequence GAN [8] on the sequences derived from the time-varying network, ordered with respect to the quality and cost criteria, as depicted in Figure 2. We rely on GANs in this context because of their ability to accurately capture a parent data distribution, which can then be used to generate synthetic data. The generated control-location traces thus follow the distribution of the population density flow while being significantly different from each other. This accounts for the required randomness as well as adherence to the essential criteria of controlling.

    ¹ Busted App: busted-app.com

    [Figure 2 diagram: a bipartite graph weighted by population density flow and free-rider activity (node labels S1,T1,W1; S2,T2,W2; S3,T3,W3; edge label R, W) provides the real data (x) to the discriminator network; a generator network maps a noise vector to generated traces (s1,t1 s2,t2; s3,t1 s1,t2). D(x): distribution of the real data.]

    Figure 2: Model overview to generate randomized control spots

    4 CONCLUSION & FUTURE WORK

    In this work, we have highlighted some of the factors behind the failure of current public transport controlling strategies, which facilitates free-rider activity. We have proposed a framework to address this issue by generating control-location traces that are spatiotemporally randomized but adhere to some of the key requirements of controlling. In future work, we aim to train the model with larger data volumes to incorporate seasonal trends at a higher granularity. We will conduct a detailed evaluation of the generated traces with respect to distribution similarity, in terms of the Kolmogorov-Smirnov test, and randomness, in terms of the root mean square error. We also plan to validate our approach in practice by comparing free-rider activity before and after its adoption.
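    The two evaluation quantities named above can be computed as in the following sketch. It implements the standard two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) and the root mean square error in plain Python. How the report intends to map traces to samples, and how RMSE would be turned into a randomness score, is not specified, so the inputs here are arbitrary numeric sequences.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))

    def ecdf(sorted_sample, x):
        # Fraction of observations less than or equal to x.
        return sum(1 for v in sorted_sample if v <= x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def rmse(xs, ys):
    """Root mean square error between two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
```

    A statistic near 0 means the generated and reference samples have nearly the same empirical distribution, while a value of 1 means they do not overlap at all.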

    REFERENCES

    [1] 20min. 2014. Reasons as to why free riding is expensive (German). http://www.20min.ch/finance/news/story/23420047. (2014).

    [2] Elmar Furst. 2012. Free Riders and Ticket Fraud in Public Transport: A Delphi Analysis. In European Transport Conference.

    [3] Adriano Galati, Vladimir Vukadinovic, Maria Olivares, and Stefan Mangold. 2013. Analyzing temporal metrics of public transportation for designing scalable delay-tolerant networks. In PM2HW2N@MSWiM.

    [4] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In NIPS.

    [5] Planka. 2015. Fare evasion insurance (Swedish). http://www.planka.nu. (2015).

    [6] MTA Press Releases. 2013. MTA NYC Transit's Eagle Team Set to Patrol Local Buses. (2013). https://www.mta.info/press-release/nyc-transit/mta-nyc-transits-eagle-team-set-patrol-local-buses-cooperative-effort-nypd

    [7] Christian von Ferber, Taras Holovatch, and Vasyl Palchykov. 2008. Public transport networks: empirical analysis and modeling.

    [8] Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. 2017. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. In AAAI.
