comparing1 the characteristics of bicycle trips … · schirck-matthews, hochmair, strelnikova,...

16
COMPARING THE CHARACTERISTICS OF BICYCLE TRIPS BETWEEN 1 ENDOMONDO, GOOGLE MAPS, AND MAPQUEST 2 3 4 Angela Schirck-Matthews 5 University of Florida, Geomatics Program 6 3205 College Avenue 7 Ft. Lauderdale, FL-33314 8 Tel: 954-577-6392 Fax: 954-475-4125; E-mail: [email protected] 9 10 Hartwig H. Hochmair, Corresponding Author 11 University of Florida, Geomatics Program 12 3205 College Avenue 13 Ft. Lauderdale, FL-33314 14 Tel: 954-577-6392 Fax: 954-475-6317; E-mail: [email protected] 15 16 Dariia Strelnikova 17 Carinthia University of Applied Sciences 18 Department of Geoinformation and Environmental Technologies 19 Europastraße 4 20 A-9524 Villach 21 Tel: +43-5-90500-2007 Fax: +43-5-90500-2210; E-mail: 22 [email protected] 23 24 Levente Juhász 25 University of Florida, Geomatics Program 26 3205 College Avenue 27 Ft. Lauderdale, FL-33314 28 Tel: 954-577-6392 Fax: 954-475-4125; E-mail: [email protected] 29 30 31 32 Word count: 6198 words text + 2 tables x 250 words (each) = 6698 words 33 34 35 36 37 38 39 Revised version submitted Nov 1, 2018 40

Upload: others

Post on 18-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

COMPARING THE CHARACTERISTICS OF BICYCLE TRIPS BETWEEN 1

ENDOMONDO, GOOGLE MAPS, AND MAPQUEST 2

3

4

Angela Schirck-Matthews 5

University of Florida, Geomatics Program 6

3205 College Avenue 7

Ft. Lauderdale, FL-33314 8

Tel: 954-577-6392 Fax: 954-475-4125; E-mail: [email protected] 9

10

Hartwig H. Hochmair, Corresponding Author 11

University of Florida, Geomatics Program 12

3205 College Avenue 13

Ft. Lauderdale, FL-33314 14

Tel: 954-577-6392 Fax: 954-475-6317; E-mail: [email protected] 15

16

Dariia Strelnikova 17

Carinthia University of Applied Sciences 18

Department of Geoinformation and Environmental Technologies 19

Europastraße 4 20

A-9524 Villach 21

Tel: +43-5-90500-2007 Fax: +43-5-90500-2210; E-mail: 22

[email protected] 23

24

Levente Juhász 25

University of Florida, Geomatics Program 26

3205 College Avenue 27

Ft. Lauderdale, FL-33314 28

Tel: 954-577-6392 Fax: 954-475-4125; E-mail: [email protected] 29

30

31

32

Word count: 6198 words text + 2 tables x 250 words (each) = 6698 words 33

34

35

36

37

38

39

Revised version submitted Nov 1, 2018 40

Page 2: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 2

ABSTRACT 1

Numerous online trip planners, such as Google Maps and MapQuest, offer trip recommendations 2

for cycling among other transportation modes. Whereas these platforms are widely used, little is 3

known about the routing criteria embedded into their routing algorithms, and, more specifically, 4

how recommended trips differ from routes travelled by cyclists. This study uses GPS tracking 5

data from the Endomondo bicycle app to extract bicycle trips in Miami-Dade County, Florida. It 6

compares Endomondo trip characteristics to the characteristics of bicycle trips obtained from 7

Google Maps and MapQuest and the corresponding shortest path. Results highlight the attributes 8

relating to road usage along a trip (e.g. percent of residential roads), trip geometry (e.g. number 9

of turns) and surrounding land cover (e.g. percent of tree canopy) which significantly differ 10

between the analyzed trips. These attributes should be considered in the refinement of routing 11

algorithms to more closely resemble observed routes and hence to improve the usability of 12

suggested trips from a cyclist’s point of view. Furthermore, the study estimates a multinomial 13

logit model on observed Endomondo trips that provides insight into the routing behavior of 14

Endomondo commute and sport cyclists, respectively. 15

16

17

18

19

Keywords: Sports app, Endomondo, route choice, cycling, decision making 20

21

Page 3: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 3

INTRODUCTION 1

Selection of the optimal route between two network locations involves the evaluation of 2

conflicting criteria, such as travel time, route complexity, safety, or scenery, which is true also for 3

cycling trips (1, 2, 3, 4). Since a shortest and general single criterion optimal path is typically not 4

the ideal trip, the computation of the best route is inherently a multi-criteria problem (5). 5

Therefore, a thorough understanding of cyclist preferences with respect to the different criteria 6

involved in the decision making process is important for the development of online trip planning 7

systems for cyclists (6, 7, 8) and for the design of efficient cycling networks (9). Crowd-sourced 8

information, such as geo-tagged shared images (e.g. Flickr), or GPS tracking data from bicycle 9

apps (e.g. Strava, Endomondo) or route sharing Web sites (e.g. GPSies) provide big data for the 10

elicitation of route selection behavior from individual cyclists (10, 11). These crowd-based data 11

sources can supplement more traditional observation methods for cycling behavior such as phone 12

interviews, intercept surveys, preference surveys or customized GPS tracking solutions. 13

Compared to aggregate level studies (12, 13), disaggregate analysis frameworks have the 14

advantage of better capturing the behavioral relationship between cyclist route preference and its 15

determinants, such as on-street parking or roadway physical characteristics (14). 16

Although several commercial platforms provide routing directions for cyclists, most of 17

these platforms do not reveal their routing algorithms and applied routing criteria. To quantify 18

the match between observed cycling behavior and trips generated by such trip planners this study 19

compares the characteristics between selected trips in Miami-Dade County logged on the 20

Endomondo bicycle tracking app, and corresponding cycling trips suggested by Google and 21

MapQuest. Strava, founded in 2009, and Endomondo, founded in 2007, are currently the most 22

widely used bike tracking apps (15), with trip recordings on all five continents. As opposed to 23

Strava, Endomondo facilitates access to GPS tracking data at the point level through a 24

combination of HTTP requests (16). It is therefore particularly suited to extract observed route 25

trips from cyclists. Despite this advantage it has so far received relatively little attention in the 26

transportation community compared to other GPS tracking platforms, and not much is known 27

about routing behavior of Endomondo users. The presented study explores therefore the 28

following two research objectives: 29

30

• Compare the characteristics of observed Endomondo bicycle sport and commute trips 31

with cycling trips recommended by Google and MapQuest and shortest paths; 32

• Analyze the route choice behavior of Endomondo sport and commuter cyclists using a 33

multinomial logit (MNL) model. 34

35

36

RELATED WORK 37

Crowd-sourced GPS tracking data from bicycle apps (MapMyRide, MapMyFitness, Garmin 38

Connect, Strava, Endomondo) are nowadays a common source for travel behavioral analysis 39

studies. For example, the Strava Metro product matches uploaded GPS tracks to the underlying 40

road network, such as OpenStreetMap (OSM) or NAVTEQ NAVSTREETS, provides segment-41

based trip and athlete counts at different temporal resolutions, activity counts by generalized 42

origins and destinations, and the geometry of individual trips that are generalized to the centroids 43

of analysis zones (census blocks) along the trips. Analysis tasks based on Strava data include the 44

estimation of bicycle travel volume and exposure as well as the identification of factors associated 45

with increased or decreased ridership (17, 18, 19, 20). Related studies found that land use diversity, 46

presence of bike paths and bike lanes as well as bicycle parks, lower speed limits, and proximity 47

Page 4: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 4

to water bodies (lakes, ocean) are associated with higher cycling volumes, whereas on street 1

parking discourages cycling. 2

GPS data from bicycle apps have also been used for route choice modeling, showing 3

preference for routes with bike facilities and a higher number of signals per mile, but aversion to 4

routes with many turns (specifically left turns) and one-way facilities, based on users of the Grid 5

Bikeshare app (11). Analysis of user data from GPSies showed that cyclists prefer to take longer 6

routes with less tumultuous traffic and that serve as social spaces (21). A review of revealed 7

preference methods for bicycle route choice is presented in (22), which includes GPS devices 8

(e.g., GPS loggers), customized smartphone apps, crowdsourcing data (recreational/sport apps, 9

Amazon Mechanical Turk), participant recalled routes, and accompanied journeys. 10

The presence of alternative routing criteria (safe, scenic, simple) in the trips suggested by 11

the popular routing platforms Google and MapQuest in addition to travel time for car routing 12

was analyzed in (23), showing that these platforms likely incorporate simplicity as an 13

optimization criterion in addition to travel time, and that no neighborhoods seem to be excluded 14

to make routes safer. Although previous work has studied the variety of routing criteria offered in 15

online bicycle routing tools (8), only little research has been conducted that evaluates the 16

outcome of routing requests for cyclists from major search engines. One such study (24) 17

proposes a bicycle routing procedure that adopts a user perspective for leisure activities. The 18

authors then compare the characteristics of computed routes to those generated by RouteYou, 19

Strava, Google Maps, and Brouter. 20

Transportation researchers acknowledge the different types of biases that come with GPS 21

tracking data from sport apps, including user self-selection bias towards younger, male cyclists 22

(17), being in-line with the fact that the cycling community itself tends to be biased towards 23

younger, male riders (20). A comprehensive comparison of user samples collected by bicycle 24

smartphone applications in North America and samples of cyclists from traditional travel surveys 25

showed furthermore that undersampled population segments include some minority groups and 26

lower-income groups (25). Other systematic effects include geographic biases, e.g. higher usage 27

around university campuses (26), and the consequence of mass events, such as bicycle 28

competitions, on the distribution of trip counts on route segments (27). The accuracy of Strava 29

count data has been assessed through comparison with community survey data (28) and manual 30

cycling counts (18, 29). Strava data has also been used to measure the effect of bicycle 31

infrastructure improvements on short-term changes in ridership (30, 31). 32

Compared to Strava, Endomondo trips are accessible as raw GPS point data. Whereas 33

Strava provides trip counts per segment at a specific temporal resolution (for example, in one-34

minute intervals), map matching and trip counting procedures needs to be conducted in pre-35

processing steps when handling Endomondo data. That is, although working with GPS raw data 36

requires additional data preparation for further analysis, the data processing method is 37

completely in the hands of the data analyst and thus highly transparent. Furthermore, 38

Endomondo data sets contain metadata about individual cyclists, such as gender or age, whereas 39

Strava, provides such information for the entire purchased sample, if at all, and only for selected 40

products. In Endomondo, metadata about individual trips include workout type, distance, 41

duration, and average and minimum speed. Only few studies are published that analyze cycling 42

patterns emerging from Endomondo data. One study identified frequently traveled network 43

portions for recreational trips in three European cities (32). 44

45

46

Page 5: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 5

STUDY SETUP 1

Network and trip data 2

To achieve the two research objectives, four types of trips between selected origin-destination 3

pairs were required, namely an observed trip in Endomondo, bicycle trips recommended by the 4

Google and MapQuest applications, and the geometrically shortest route. To be able to obtain 5

consistent trip characteristics for the four trip types and to run a route choice model, all these 6

routes needed to be matched to the same network graph in ESRI’s ArcMap, which was based on 7

OSM in this study. OSM data was downloaded from geofabrik (http://download.geofabrik.de/) in 8

shapefile format, clipped to Miami-Dade County boundaries, planarized in ArcMap to allow 9

turns at each intersection, and then used to build a network dataset that considers one-way 10

restrictions and prohibits travel on motorways (with some exceptions). Footpaths were allowed 11

for the computation of cycling routes, since many observed routes, especially in Endomondo and 12

Google, use segments that are classified as footpaths in OSM. 13

Although automated map matching algorithms between GPS tracks and a routing graphs 14

exist (33, 21), a manual mapping approach was chosen in this study to ensure that trips from all 15

sources were correctly matched to the road network in ArcMap. It was also checked that the 16

chosen Endomondo origin and destination points were available for routing in Google Maps and 17

MapQuest, and that all segments of observed bicycle trips were present in the OSM network 18

graph. Sometimes, Endomondo trips were found to traverse along sidewalks or follow off-road 19

trails that were not part of the OSM network, which therefore needed to be digitized in the OSM 20

network. This process of creating additional links was also reported necessary for other studies 21

that analyzed bicycle travel behavior from smart phone applications (34), where cyclists went 22

through parks, parking lots, and open fields and on campus sidewalks. Overall, this manual 23

approach, together with the search for suitable Endomondo route trips resulted in a relatively 24

small sample of 31 sport and 31 commute trips that was used for further analysis. 25

26

Endomondo 27

As a first step worldwide Endomondo cycling trips from January to March 2016 were 28

downloaded and extracted, following the data retrieval procedure described in (16). The 29

approach requests all workout IDs for a specified user within a chosen time interval, followed by 30

GPS point downloads for these workouts. Returned trip points with their latitude and longitude 31

values were then written into a PostgreSQL/PostGIS database. As an initial filter only trips with 32

a mean speed between 2 km/h and 40 km/h, a length of over 500 m, and a duration of over 2 33

minutes were retained for further analysis. Endomondo facilitates tracking of about 70 activity 34

categories, including the four cycling activity types Indoor, Sport, Transport (similar to 35

commute) and Mountain Biking (off-road cycling). For this analysis 31 sport and 31 transport 36

(commute) trips in Miami-Dade County were used, where circuitous trips were excluded. Where 37

necessary, trips were trimmed to a shorter portion of the original route that more closely reflected 38

a directed travel from point A to B. Waypoints along these trips were then digitized in ArcMap, 39

and routes were regenerated using the ArcGIS Network Analyst extension, based on a network 40

dataset built from the planarized and extended OSM network. The 62 Endomondo trips were 41

from 47 different users. Out of these 47 users, 38 users provided gender information in their user 42

profiles; from these, 28 (73.7%) users were male, which is a lower proportion than for Strava 43

with values above 80% (26, 20), but it is also slightly lower than the 77.9% male among Miami-44

Dade residents who commute by bicycle. Furthermore, 24 Endomondo users provided a date of 45

birth in their profiles, which resulted in a median age of 46.5 years (mean = 45.8, min = 26.2, 46

max = 66.3) at the time of travel. 47

Page 6: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 6

Google, MapQuest, and shortest path 1

Google provides a bicycle routing option both on the Web based Google Maps interface as well 2

as through an API. To obtain a Google route, the geographic coordinates of trip origin and 3

destination of the corresponding Endomondo trip were entered in the online search fields of 4

Google Maps. The application then returns one suggested route and usually some alternatives. 5

The suggested route was then replicated in ArcMap through digitizing waypoints on the OSM 6

network layer and running the Network Analyst. Google’s Directions API allows to download 7

waypoints in JSON format which can be loaded into a GIS. These routes are, however, based on 8

the road network that Google uses, and not on OSM. Therefore, this approach would require map 9

matching as well. 10

MapQuest offers cycling directions only through an API1, with the option to choose a 11

default network or OSM as base data. For the presented study, the default network was chosen, 12

so that, like with Google Maps, MapQuest’s proprietary road network could be evaluated as part 13

of the chosen route. Using a HTML site that was customized with JavaScript code the MapQuest 14

API was called with trip end point coordinates as input. The returned suggested cycling route 15

was shown on a map and subsequently digitized on the OSM network layer. In addition, the 16

geometrically shortest path between the same origin and destination point was determined in the 17

Network Analyst under consideration of one-way restrictions. 18

FIGURE 1a shows for a chosen origin-destination pair on Miami Beach the trips based 19

on the four sources. The Endomondo route to the left has most turns per km, follows primarily 20

residential roads, and has a detour of 20.2% compared to the shortest path. The Google based 21

route to the right runs primarily on a footway (with bicycles allowed) along the beach and has 22

the largest detour among all routes with 35.7%. Compared to the first two routes the MapQuest 23

route is simpler and more direct with a detour of only 10.6% and follows mostly secondary 24

roads. The shortest route (dashed) follows also mostly secondary roads and has one more turn 25

than the MapQuest route. This example illustrates the different characteristics of bicycle trips 26

generated from different sources. FIGURE 1b shows the same set of routes on top a 2-m 27

resolution land cover grid which was used to measure the percentage of different land cover 28

categories within a 15-m buffer around the identified routes. 29

30

1 https://developer.mapquest.com/documentation/directions-api/

Page 7: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 7

a) b)

FIGURE 1 Four bicycle trips between a trip origin and destination point in Miami Beach 1

on top of OSM (a) and land cover (b) background layer. 2

GIS data layers 3

Determining the characteristics of routes required several GIS data layers. The percentage of 4

road categories along a route was based on OSM highway classes, some of which were grouped 5

to simplify road categories. For example, highways tagged “cycleway”, “path”, and “track” were 6

grouped to class cycleway, and highways tagged “footway” or “pedestrian” were grouped to a 7

class footway. Traffic signals were also taken from the OSM data set. During data preparation, 8

OSM traffic signals, which are point features located at the end nodes of OSM street segments 9

were spatially aggregated to avoid double counting of traffic signals at major intersections. 10

Traffic signals were then counted within a 15-m buffer around a trip polyline and reported as the 11

number of traffic signals per km. Turns were considered as such if the turn angle between two 12

connected route segments enclosed an angle of at least 40 degrees, and if the turn took place on a 13

network junction with at least three adjacent road segments. 14

The 2-m resolution land cover map stems from an earlier tree canopy assessment study 15

for Miami-Dade County (35) and was supplemented with GIS layers from the Miami-Dade 16

County GIS Hub (e.g. location of bay water). Part of the land cover map with its land cover 17

categories is visualized in FIGURE 1b. Furthermore, scenery related Flickr images were 18

extracted using the Flickr API using keyword-based filters. Their density was computed within 19

the 15-m buffer around each route. This was motivated by earlier studies showing that the 20

density of shared images is higher along scenic routes than fastest routes (36, 10). 21

22

Analysis methodology 23

For both research objectives attributes of trips and their surroundings with a buffered area needed 24

to be extracted from the different GIS layers, which was achieved through a customized C# 25

script within ESRI’s ArcObjects framework. Computed attributes that were considered in this 26

study were classified as belonging to the attribute groups road category, trip geometry, trip 27

feature, and land cover, as follows: 28

Page 8: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 8

• Road category: % primary road, % secondary road, % residential road, % cycleway, % 1

footway 2

• Trip geometry: % detour; number of turns left, turns right, and turns per trip km 3

• Trip feature: number of traffic signals per trip km, number of scenic Flickr images per 4

trip km 5

• Land cover: % tree canopy, % inland water (lakes, rivers), % bay or ocean, % roads and 6

railroads 7

8

On-street bicycle facilities (bike lanes, sharrows) were not considered for two reasons. First, on-9

street bicycle facilities tend to be sparsely mapped in OSM in the analyzed county (37). Second, 10

due to the small sample size of analyzed trips the statistical power of the regression model is 11

limited. It would be further reduced by additional predictor variables. Therefore, the study 12

focused on readily available OSM highway categories. 13

14

Median comparison 15

For objective 1, medians of the above listed trip attributes were compared statistically between 16

the set of Endomondo trips and trips from other sources (Google, MapQuest, shortest). For this 17

purpose, a Wilcoxon-Pratt Signed Rank test was applied, which tests matched-pairs data for a 18

common median. The analysis was conducted separately for sport and commute trips, based on 19

Endomondo trip metadata. This analysis gives insight into how trip characteristics differ between 20

the different sources, and how closely trips from trip planners resemble traveled trips in 21

Endomondo. While oftentimes a paired T-test is used to compare two population means, this 22

requires data to be normally distributed. However, this was not the case since the distributions of 23

trip attributes were skewed to the right (see FIGURE 2). 24

25

Discrete route choice model 26

Discrete route choice methods empirically model and analyze a decision maker’s preferences 27

from a set of available alternatives, i.e., the choice set (38). The multinomial logit (MNL) model 28

is a commonly used framework where the decision maker is given more than two alternatives to 29

choose from. Random utility choice models, like the MNL model, assume that the utility of an 30

alternative consists of a deterministic and stochastic (random) component. The MNL model is 31

derived from the assumption that the error terms of the utility functions are independent and 32

identically Gumbel distributed (39). The probability that a given individual n chooses alternative 33

i within the choice set Cn is given by 34

35

𝑃(𝑖|𝐶𝑛) =exp(𝑉𝑖𝑛)

∑ exp(𝑉𝑗𝑛)𝑗𝜖𝐶𝑛

(1) 36

37

The Independence from Irrelevant Alternatives (IIA) assumption that comes with the MNL 38

model suggests that route alternatives should not be overlapping. There are several approaches to 39

correct for the overlap in route alternatives, such as using the path-size logit (PSL) model, which 40

adds an additional path-size factor PS to the deterministic component of the route utility if it is 41

overlapping with other routes (39). We use the PS formulation provided in (40) which is 42

calculated as 43

44

PSr= ln∑la

Lra∈r

1

Na (2) 45

46

Page 9: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 9

where 𝑃𝑆𝑟 denotes the overlap factor for route r, 𝑎 is the index of a network link, 𝑙𝑎 is the length 1

of link 𝑎 that is a part of route 𝑟 and 𝐿𝑟 is the length of route 𝑟. 𝑁𝑎 is the number of alternatives 2

in the choice set which contain link 𝑎. In this study the choice set consists of four trips 3

(Endomondo, Google, MapQuest, shortest), and the Endomondo trip is the chosen alternative. 4

Selected predictor variables were then used to estimate unrestricted multinomial logit choice 5

models. The Null log-likelihood (L0) is the same in all estimated models. It describes the value of 6

the log-likelihood function when all parameters are zero, i.e., when the alternatives are assumed 7

to have equal probability to be chosen. L0 is computed as 8

9

𝐿0=∑ 𝑙𝑛𝑁𝑗=1

1

|𝑅𝑗| (3) 10

11

where |Rj| is the number of choice alternatives available to the individual decision maker in 12

decision situation j, and N is the total number of decision situations. The final log-likelihood 13

(L*) is the log-likelihood of the estimated model. The rho-square value is an informal goodness-14

of-fit index that measures the fraction of an initial log-likelihood value explained by the model 15

and is computed as 16

17

𝜌2=1-𝐿∗

𝐿0 (4) 18

19

To check for potential multicollinearity in the tested models bivariate correlations were 20

computed between all used predictor variables, which were low to moderate with a maximum 21

absolute Pearson’s r of 0.65 between two variables. The NLOGIT 6 software package was used 22

to estimate the MNL models. 23

24

25

RESULTS 26

Comparison of trip characteristics between different sources 27

With regards to objective 1, TABLE 1 shows the differences of median attribute values between 28

Endomondo trips and the other three trip types, separated into sport trips (left half) and commute 29

trips (right half). A positive value in a cell means that the median value for the corresponding 30

attribute is higher for the Endomondo trip than for the other trip, whereas a negative value means 31

the opposite. Numbers printed in boldface show significantly different medians as determined by 32

the Wilcoxon-Pratt Signed Rank test. 33

34

Page 10: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 10

TABLE 1 Differences of median attribute values between Endomondo (E) and Google (G), 1

MapQuest (M), and shortest (S) trips. 2

Median differences (Sport trips) Median differences (Commute trips)

E-G E-M E-S E-G E-M E-S

Road category

% Primary road 0.00 0.00 0.00 -0.40 0.00 -9.27 **

% Secondary road -9.33 ** 0.00 *(-) -14.15 ** -9.51 0.10 -6.37

% Residential road 9.99 * 9.63 *** 4.75 5.50 8.00 ** 5.60

% Cycleway 9.06 9.06 *** 9.06 * 0.00 0.00 0.00 **

% Footway 0.40 * 0.40 * -0.08 0.90 0.90 0.90 *

Trip geometry

% Detour 9.57 * 12.39 ** 15.85 *** 5.04 * 3.85 * 9.65 ***

Turns per km 0.29 * 0.52 *** 0.07 0.57 ** 0.62 * 0.38

Left turns per km 0.06 * 0.19 ** -0.06 0.34 * 0.44 ** 0.22

Right turns per km 0.28 * 0.40 ** 0.24 0.23 ** 0.10 0.26

Trip features Traffic signals per

km -0.18 -0.60 ** -0.29 ** -0.65 -0.34 -0.30 *

Scenic Flickr per km -0.56 -0.78 0.22 0.77 -0.27 1.77

Land cover

% Tree canopy -0.43 2.16 2.16 0.77 1.26 1.13

% Inland water 0.23 0.24 * 0.15 * 0.11 0.20 0.16

% Bay or ocean 0.00 0.00 0.00 0.00 *(+) 0.00 *(++) 0.00

% Roads or railroads -2.65 -3.33 -5.08 0.10 -0.66 -1.57 **

Sample size = 31 3 * p ≤ .05; ** p ≤ .01; *** p ≤ .001 (not adjusted for multiple testing) 4 (+) Mdn(E) = Mdn (G), but Mean(E) > Mean(G) 5 (++) Mdn(E) = Mdn (M), but Mean(E) > Mean(M) 6 (-) Mdn(E) = Mdn (M), but Mean(E) < Mean(M) 7

8

The positive difference values across all trip types and data sources for the % detour variable 9

indicate that Endomondo trips are significantly longer than Google and MapQuest suggested 10

routes, and, as is to be expected, also shortest paths. This means that Endomondo cyclists are not 11

focused on seeking short travel routes. The differences in median detour values are consistently 12

higher for sport trips than for the commute trips, which indicates that Endomondo commuters are 13

more concerned about trip distance than recreational cyclists. Routes chosen by Endomondo 14

users have a significantly higher turn frequency than Google and MapQuest suggested routes 15

(but not shortest paths), both for sport and commute trips. The probability density functions for 16

turns per km depicted in FIGURE 2 for sport (a) and commute (b) trips provide a graphical 17

depiction of this pattern, where peaks around smaller values are more pronounced for Google 18

and MapQuest than for Endomondo and shortest paths. This supports earlier findings that Google 19

and MapQuest tend to consider simplicity in their routing algorithms (23). 20

Another commonality shared between sport and commute trips is the higher share of 21

Page 11: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 11

residential (local) roads in Endomondo trips compared to Google and MapQuest trips (compare 1

FIGURE 2c for sport activities). This means that Endomondo users prefer local roads with little 2

traffic and accept more turns along the trip as a trade-off. 3

4

a) Turns (sport) b) Turns (commute)

c) % Residential road (sport) d) % Secondary road (sport)

FIGURE 2 Probability density function of median values for turns per km (a, b), % 5

residential road (c), and % secondary road (d) across different data sources. 6

For sport activities, the share of secondary roads (typically major urban streets with traffic lights, 7

similar to minor arterials or collector roads) on Endomondo routes is significantly smaller than 8

for all other sources (compare FIGURE 2d). Similarly, a higher percentage of cycleways and 9

footpaths is observed for Endomondo sport trips than for Google and MapQuest trips. This 10

shows that the two commercial routing applications do not match the cyclist’s travel behavior 11

entirely in these aspects of observed route preferences. For commute trips, no such discrepancy 12

is observed, meaning that suggested Google and MapQuest trips match observed cycling 13

Page 12: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 12

behavior along these characteristics. Footways identified on Endomondo trips often run on 1

sidewalks, especially on major roads where travel on the traffic lanes can be considered unsafe. 2

However, use of sidewalks is not explicitly suggested in Google and MapQuest trips, although 3

bicycle riding on sidewalks is permitted in the State of Florida per Section 316.2065, Florida 4

Statues, Bicycle Regulations2, provided that a cyclist yields the right-of-way to any pedestrian. 5

Although cyclists show a preference for low-traffic residential roads, major roads can often not 6

be avoided without excessive detour when there is a bottleneck in the transportation network 7

without cycling infrastructure (e.g. a bridge). In such a case, however, sidewalk information for 8

major roads in routing directions provided by the trip planning application could be useful for a 9

cyclist to judge the safety of a recommended route. 10

Endomondo sport trips feature fewer traffic lights and more inland water land cover along 11

trips than MapQuest trips, and Endomondo commute trips run more often near the bay or ocean 12

than Google and MapQuest trips. These facts together suggest that (a) current trip planner 13

applications could be improved by better integrating these land cover attributes in route search, 14

and that (b) offering different cyclist profiles (e.g. sport vs. commuter cyclist) with modified 15

weighting functions could lead to routes that better meet the needs of different cyclist groups. 16

The density of crowd-sourced imagery along the routes did not differ significantly between 17

Endomondo and other trip sources, suggesting that this source is not necessary to replicate 18

typical Endomondo trips. However, this kind of data source could be useful to compute scenic 19

and popular routes (10, 41). 20

21

Model results 22

For objective 2, various route choice models were manually built in a step-wise approach and 23

estimated, where the pathsize variable was included in all models since it accounts for the 24

correlation between the alternatives in the choice set. Variables that were found to be significant 25

in the median comparisons (TABLE 1) were first added to build more complex models. Using 26

this stepwise approach, one final model for sport trips and one final model for commute trips was 27

identified that maximized the goodness of fit value and that at the same time retained significant 28

parameters (p < 0.05), except for the PS overlap predictor (TABLE 2). With all tested 29

intermediate models (as well as the final models shown), the arithmetic signs of coefficients, if 30

statistically significant, were as expected. TABLE 2 also reports the best fitting models for sport 31

and commute trips with a significant PS overlap predictor. However, these models contain fewer 32

significant variables and have a lower goodness of fit value than the other final models. 33

Estimated models in TABLE 2 show that Endomondo sport cyclists prefer local roads, 34

and that they avoid roads with heavier traffic and traffic lights. Local roads combine the 35

advantage of low volume traffic areas and the absence of signalized intersections, which should 36

be more prominently considered in routing applications for sport cyclists. The commute-based 37

models show preferences of expected road types (residential, cycleway) as well as footways, 38

where the preference of footways can be partially attributed to the use of sidewalks along major 39

roads that are probably deemed unsafe for on-road travel. One of the models also shows a 40

preference for routes with tree canopy. Hence urban design should consider planting trees where 41

possible. Green and recreational space has been found to be positively associated with time spent 42

on cycling in earlier studies (42). 43

44

2 https://www.flsenate.gov/Laws/Statutes/2012/316.2065

Page 13: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 13

TABLE 2 Estimation results of unrestricted multinomial logit choice models for sport and 1

commute trips. 2

Reduced sport Final sport

Reduced

commute Final commute

% Secondary road Coeff. -0.036

t-statistic -2.40 *

% Residential road Coeff. 0.041 0.034

t-statistic 2.28 * 1.97 *

% Cycleway Coeff. 0.026 0.059

t-statistic 1.99 * 2.57 *

% Footway Coeff. 0.069

t-statistic 2.18 *

Traffic signals / km Coeff. -1.126

t-statistic -2.41 *

% Tree canopy Coeff. 0.300

t-statistic 2.35 *

% Roads/railroads Coeff. 0.173

t-statistic 2.30 *

Overlap factor PS Coeff. 2.026 1.404 1.789 1.714

t-statistic 2.19 * 1.43 2.49 * 1.71

Parameters 2 3 2 6

Observations 31 31 31 31

Final LL (L*) -36.19 -32.81 -37.67 -30.36

Null LL (L0) -42.98 -42.98 -42.98 -42.98

ρ2 0.16 0.24 0.12 0.29

* p ≤ .05 3

4

5

CONCLUSIONS 6

Through comparison of characteristics of Endomondo bicycle trips with bicycle trips suggested 7

on Google and MapQuest this study revealed which aspects of cyclist route selection behavior 8

are already well incorporated in those applications, and which ones could deserve closer 9

attention, such as preferred routing through residential areas or considering the proximity to 10

water bodies. Comparison of Google and MapQuest trips with Endomondo and shortest routes 11

also revealed some apparent criteria that are already considered in those commercial 12

applications, such as route complexity, and avoidance of overly long detours. This expands 13

previous research on the characteristics of trip suggestions for motorized traffic in Google and 14

MapQuest (23). Previous work has also identified which type of routing criteria users prefer to 15

be offered in the user interface of bicycle trip planners (43). The current study expands this 16

research by evaluating routes that were generated on two major search engines. For future work 17

we plan to apply this research to other routing services, such as the MapQuest Open Directions 18

API that is based on OSM. Replicating this study for other geographic areas, as well as 19

integrating on-road bicycle facilities, and increasing trip sample size could also identify 20

additional relevant criteria in bicycle travel behavior. 21

22

23

Page 14: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 14

AUTHOR CONTRIBUTIONS 1

The authors confirm contribution to the paper as follows: Study conception and design: HH and 2

AM; data collection: DS, AM, HH, and LJ; analysis and interpretation of results: AM and HH; 3

draft manuscript preparation: HH and AM. All authors reviewed the results and approved the 4

final version of the manuscript. 5

6

7

REFERENCES 8

1. Harvey, F., Krizek, K. J., and Collins, R. Using GPS Data to Assess Bicycle Commuter 9

Route Choice. Transportation Research Board - 87th Annual Meeting, Washington, D.C., 10

2008. 11

2. Menghini, G., Carrasco, N., Schüssler, N., and Axhausen, K. W. Route choice of cyclists 12

in Zurich. Transportation Research Part A, 2010. 44: 745-765. 13

3. Broach, J., Dill, J., and Gliebe, J. Where do cyclists ride? A route choice model 14

developed with revealed preference GPS data. Transportation Research Part A: Policy 15

and Practice, 2012. 46: 1730–1740. 16

4. Krenn, P. J., Oja, P., and Titze, S. Route choices of transport bicyclists: a comparison of 17

actually used and shortest routes. International Journal of Behavioral Nutrition and 18

Physical Activity, 2014. 11. 19

5. Hrncir, J., Zilecky, P., Song, Q., and Jakob, M. Speedups for Multi-Criteria Urban 20

Bicycle Routing. In G. F. Italiano and M. Schmidt (Eds.), 15th Workshop on Algorithmic 21

Approaches for Transportation Modelling, Optimization, and Systems (ATMOS’15), 22

Dagstuhl Publishing, Dagstuhl, Germany, 2015. 23

6. Hochmair, H. H. Effective User Interface Design in Route Planners for Cyclists and 24

Public Transportation Users: An Empirical Analysis of Route Selection Criteria. 25

Transportation Research Board - 87th Annual Meeting, Washington, D.C., 2008. 26

7. Su, J. G., Winters, M., Nunes, M., and Brauer, M. Designing a route planner to facilitate 27

and promote cycling in Metro Vancouver, Canada. Transportation Research Part A: 28

Policy and Practice, 2010. 44: 495–505. 29

8. Loidl, M. and Hochmair, H. H. Do online bicycle routing portals adequately address 30

prevalent safety concerns? Safety, 2018. 4: 9. 31

9. Hochmair, H. H. Assessment of Bicycle Service Areas around Transit Stations. 32

International Journal of Sustainable Transportation 2015. 9: 15-29. 33

10. Alivand, M., Hochmair, H. H., and Srinivasan, S. Analyzing how travelers choose scenic 34

routes using route choice models. Computers, Environment and Urban Systems, 2015. 35

50: 41-52. 36

11. Khatri, R., Cherry, C. R., Nambisan, S. S., and Han, L. D. Modeling Route Choice of 37

Utilitarian Bikeshare Users with GPS Data Transportation Research Record: Journal of 38

the Transportation Research Board, 2017. 2587: 141-149. 39

12. Krizek, K. J. and Johnson, P. J. Proximity to Trails and Retail: Effects on Urban Cycling 40

and Walking. Journal of the American Planning Association, 2006. 72: 33-42. 41

13. Noland, R. B., Dekaa, D., and Waliaa, R. A Statewide Analysis of Bicycling in New 42

Jersey. International Journal of Sustainable Transportation, 2011. 5: 251-269. 43

14. Sener, I. N., Eluru, N., and Bhat, C. R. An analysis of bicycle route choice preferences in 44

Texas, US. Transportation, 2009. 36: 511-539. 45

15. Romanillos, G., Austwick, M. Z., Ettema, D., and De Kruijf, J. Big Data and Cycling. 46

Transport Reviews, 2016. 36: 114-133. 47

Page 15: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 15

16. Strelnikova, D. Comparing the Suitability of Strava and Endomondo GPS Tracking Data 1

for Bicycle Travel Pattern Analysis (B.S. Thesis, Department of Geoinformation & 2

Environmental Technologies, Carinthia University of Applied Sciences Villach, Austria), 3

2017. 4

17. Griffin, G. P. and Jiao, J. Where does bicycling for health happen? Analysing volunteered 5

geographic information through place and plexus. Journal of Transport & Health, 2015. 6

2: 238–247. 7

18. Jestico, B., Nelson, T., and Winters, M. Mapping ridership using crowdsourced cycling 8

data. Journal of Transport Geography, 2016. 52: 90–97. 9

19. Hochmair, H. H., Bardin, E., and Ahmouda, A. Estimating Bicycle Trip Volume for 10

Miami-Dade County from Strava Tracking Data. Transportation Research Board - 96th 11

Annual Meeting, Washington, D.C., 2017. 12

20. Sanders, R. L., Frackelton, A., Gardner, S., Schneider, R., and Hintze, M. Ballpark 13

Method for Estimating Pedestrian & Bicyclist Exposure in Seattle, Washington: Potential 14

Option for Resource-Constrained Cities in an Age of Big Data. Transportation Research 15

Record: Journal of the Transportation Research Board, 2017. 2605: 32-44. 16

21. Sultan, J., Ben-Haim, G., Haunert, J.-H., and Dalyot, S. Extracting spatial patterns in 17

bicycle routes from crowdsourced data. Transactions in GIS, 2017. 21: 1321–1340. 18

22. Pritchard, R. Revealed Preference Methods for Studying Bicycle Route Choice—A 19

Systematic Review. Int. J. Environ. Res. Public Health, 2018. 15: 470. 20

23. Johnson, I. L., Henderson, J., Perry, C., Schöning, J., and Hecht, B. J. Beautiful…but at 21

What Cost? An Examination of Externalities in Geographic Vehicle Routing. 22

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 23

2017. 1: Article No. 15. 24

24. Baker, K., Ooms, K., Verstockt, S., Brackman, P., De Maeyer, P., and Van de Walle, R. 25

Crowdsourcing a cyclist perspective on suggested recreational paths in real-world 26

networks. Cartography and Geographic Information Science, 2017. 44: 422-435. 27

25. Blanc, B., Figliozzi, M., and Clifton, K. How Representative of Bicycling Populations 28

Are Smartphone Application Surveys of Travel Behavior? Transportation Research 29

Record: Journal of the Transportation Research Board, 2016. 2587: 78-89. 30

26. Watkins, K., Ammanamanchi, R., LaMondia, J., and Le Dantec, C. A. Comparison of 31

Smartphone-based Cyclist GPS Data Sources. Transportation Research Board - 95th 32

Annual Meeting, Washington, D.C., 2016. 33

27. Oksanen, J., Bergman, C., Sainio, J., and Westerholm, J. Methods for deriving and 34

calibrating privacy-preserving heat maps from mobile sports tracking application data. 35

Journal of Transport Geography, 2015. 48: 135–144. 36

28. Whitfield, G. P., Ussery, E. N., Riordan, B., and Wendel, A. M. Association Between 37

User-Generated Commuting Data and Population-Representative Active Commuting 38

Surveillance Data — Four Cities, 2014–2015. Morbidity and Mortality Weekly Report, 39

2016. 65: 959–962. 40

29. Conrow, L., Wentz, E., Nelson, T., and Pettit, C. Comparing spatial patterns of 41

crowdsourced and conventional bicycling datasets. Applied Geography, 2018. 92: 21-30. 42

30. Ferenchak, N. N. and Marshall, W. E. The relative (in) effectiveness of bicycle sharrows 43

on ridership and satefy outcomes. Transportation Research Board - 95th Annual Meeting, 44

Washington, D.C., 2016. 45

31. Heesch, K. C. and Langdon, M. The usefulness of GPS bicycle tracking data for 46

evaluating the impact of infrastructure change on cycling behaviour. Health Promotion 47

Page 16: COMPARING1 THE CHARACTERISTICS OF BICYCLE TRIPS … · Schirck-Matthews, Hochmair, Strelnikova, Juhász 5 1 STUDY SETUP 2 Network and trip data 3 To achieve the two research objectives,

Schirck-Matthews, Hochmair, Strelnikova, Juhász 16

Journal of Australia, 2016. 27: 222-229. 1

32. Sileryte, R., Nourian, P., and Spek, S. v. d. Modelling Spatial Patterns of Outdoor 2

Physical Activities Using Mobile Sports Tracking Application Data. In T. Sarjakoski, M. 3

Y. Santos and T. Sarjakoski (Eds.), Proceedings of the 19th AGILE Conference on 4

Geographic Information Science, Lecture Notes in Geoinformation and Cartography, 5

Springer, Berlin, 2016, pp. 179-197. 6

33. Schuessler, N. and Axhausen, K. W. Processing Raw Data from Global Positioning 7

Systems Without Additional Information. Transportation Research Record: Journal of 8

the Transportation Research Board, 2009. 2105: 28-36. 9

34. Hudson, J. G., Duthie, J. C., Rathod, Y. K., Larsen, K. A., and Meyer, J. L. Using 10

Smartphones to Collect Bicycle Travel Data in Texas - Report UTCM 11-35-69, 11

Washington, DC, Department of Transportation. Available from 12

http://utcm.tamu.edu/publications/final_reports/Hudson_11-35-69.pdf, 2012. 13

35. Hochmair, H. H., Gann, D., Benjamin, A., and Fu, Z. J. Miami-Dade County Urban Tree 14

Canopy Assessment, Fort Lauderdale, FL, University of Florida and Florida International 15

University. Available from http://digitalcommons.fiu.edu/gis/65/, 2016. 16

36. Hochmair, H. H. Spatial Association of Geotagged Photos with Scenic Locations. In A. 17

Car, G. Griesebner and J. Strobl (Eds.), Geospatial Crossroads@GI_Forum '10: 18

Proceedings of the Geoinformatics Forum Salzburg, Wichmann, Heidelberg, 2010, pp. 19

91-100. 20

37. Hochmair, H. H., Zielstra, D., and Neis, P. Assessing the Completeness of Bicycle Trail 21

and Designated Lane Features in OpenStreetMap for the United States. Transactions in 22

GIS, 2015. 19: 63-81. 23

38. Prato, C. G. Route choice modeling: past, present and future research directions. Journal 24

of Choice Modelling, 2009. 2: 65-100. 25

39. Ben-Akiva, M. and Bierlaire, M. Discrete choice methods and their applications to short-26

term travel decisions. In R. Hall (Ed.), Handbook of Transportation Science, Kluwer, 27

Dordrecht, 1999, pp. 5-34. 28

40. Daamen, W., Bovy, P. H. L., and Hoogendoorn, S. P. Influence of Changes in Level on 29

Passenger Route Choice in Railway Stations. Transportation Research Record: Journal 30

of the Transportation Research Board, 2005. 1930: 12-20. 31

41. Sun, Y., Fan, H., Bakillah, M., and Zipf, A. Road-based travel recommendation using 32

geo-tagged images. Computers, Environment and Urban Systems, 2015. 53: 110-122. 33

42. Wendel-Vos, G. C., Schuit, A. J., de Niet, R., Boshuizen, H. C., Saris, W. H., and 34

Kromhout, D. Factors of the Physical Environment Associated with Walking and 35

Bicycling. Medicine & Science in Sports & Exercise, 2004. 36: 725–730. 36

43. Hochmair, H. H. and Rinner, C. Investigating the Need for Eliminatory Constraints in the 37

User Interface of Bicycle Route Planners. In A. G. Cohn and D. M. Mark (Eds.), 38

Conference on Spatial Information Theory (COSIT'05), LNCS 3693, Springer, Berlin, 39

2005, pp. 49-66. 40