
Development and transferability of advanced econometric models of

bikesharing demand in urban settings

Frederic Reynaud

Department of Civil Engineering and Applied Mechanics

McGill University, Montreal

October 2015

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree

of Master of Engineering

© Frederic Reynaud 2015


Contributions of Authors

Several researchers have contributed to the work presented in chapters 2 and 3 of this

thesis, which are each based on a manuscript which has been/will be sent out for publication.

First, my supervisor, Professor Naveen Eluru, has provided me with guidance and training for all

the work I carried out over the last two years. The contributions of Seyed Ahmadreza Faghih-

Imani were also critical, specifically with respect to the data preparation for Montreal and New

York, as well as the development of the original arrivals-departures model for Montreal. Finally,

the work of Lesley Bland on the preparation of New York data is also acknowledged.


Acknowledgments

The work presented in this thesis would not have come to fruition without the help of

several people. First, I would like to thank my supervisor, Dr. Eluru, for his help, support and

mentoring over the last two years, with respect to school and research-related issues, as well as

more general life concerns. A great supervisor is critical for one to enjoy graduate studies, and I

consider myself very fortunate to have been his student.

I have also been lucky to work with a fantastic group of people over the last two years. I

have stopped counting the number of times Ahmad, Shamsunnahar and Sabreena have explained

obscure GAUSS, SPSS or ArcGIS functions to me; and whether it be for transportation-related

debates, ping-pong matches or Uno, they always had my back.

I would also like to thank the transportation group at McGill: Prof. Hatzopoulou, Prof.

Miranda-Moreno, Maryam, Ahsan, Josh, Ting, Miguel, Junshi, and – although he’s in structural

– Nathan. I have thoroughly enjoyed my interactions with all of them.

Anna, Sun Chee, Franca, Sandy, you have been lifesavers for the past several years.

Thank you for your patience!

Other important contributors to my work have been the Natural Sciences and

Engineering Research Council (NSERC), the Fonds Québécois de Recherche Nature et

Technologie (FQRNT), and McGill University, who provided me with funding over the course

of my Master’s.

Finally, to my friends and family, and my girlfriend Alexandra, thank you for putting up

with the econometric-talk and nodding in approval when I complained about how tricky

modelling can be.


Abstract

Bikesharing systems (BSS) are becoming increasingly popular in urban areas around the

world, as demonstrated by the rapid growth of both the number and the size of these systems in

recent years. Understanding and predicting BSS usage patterns is complex, especially because

these patterns are often tied to local factors. This thesis aims to contribute to the existing

literature on BSS in two ways.

First, an econometric model featuring bicycle availability at a station level as a direct

metric of analysis is developed. This behaviorally quantitative model accounts for the influence

of temporal, meteorological, bicycle infrastructure, built environment and land-use attributes on

bicycle availability. More specifically, an ordered regression model – a panel mixed generalized

ordered logit model – is estimated to accommodate the influence of exogenous variables and

station level unobserved factors. The model estimation is undertaken using BIXI-Montreal data

from the summer of 2012. The results show BIXI is used more in the afternoon than in the

morning, dense areas tend to be associated with lower availability levels, and interactions of time

of day with land use impact availability. The estimated model is validated using a hold-out

sample of data from the summer of 2013. The results clearly highlight the satisfactory

performance of the proposed framework. The model developed can be employed by BSS

operators to arrive at hourly system state predictions and used for rebalancing operations. To

illustrate its applicability, an availability prediction exercise is also undertaken. A review of the

existing BSS literature indicates that the framework presented in this thesis is the first to model

bicycle availability in BSS using detailed temporal and spatial scales. As such, this thesis

contributes to advancing the state-of-the-art toolkit available to BSS planners worldwide, and

especially in Montreal.


Second, a BSS model transferability exercise is conducted using a detailed arrivals and

departures framework developed for Montreal by Faghih-Imani et al. (2014) and applying it to

data from New York. This allows a direct comparison of the influence of temporal,

meteorological, bicycle infrastructure, built environment and land-use variables on BSS usage in

these two cities. Results show significant overlap in the influence of weather variables, bicycle

infrastructure, and several land-use attributes. However, temporal trends – especially weekend

usage patterns – differ considerably between the two cities. Overall, our results are promising for the

development of transferable models of bicycle flows in urban areas. It should be noted that this

research effort is the first to investigate BSS model transferability between two large cities using

a detailed arrivals and departures model that takes into account temporal, meteorological, bicycle

infrastructure, built environment and land-use variables.


Résumé

Bikesharing systems (Systèmes de Vélo en Libre-Service, SVLS) are increasingly popular in

urban areas around the world, as shown by the expansion of these systems over the last

decade, in terms of both the number of SVLS and their size. SVLS usage patterns are

complex and difficult to predict, particularly because these patterns are often tied to

local factors. This thesis aims to contribute to the academic literature on SVLS in two ways.

First, an econometric model featuring bicycle availability at BIXI stations as a direct

metric of analysis is developed. This quantitative behavioral model accounts for the

influence of temporal, meteorological, bicycle infrastructure, built environment and

land-use data on bicycle availability. More specifically, an ordered regression model – a

panel mixed generalized ordered logit model – is estimated to accommodate the influence of

exogenous data as well as unobserved factors at the station level. The model is estimated with

BIXI-Montreal data from the summer of 2012. The results show that BIXI is used more in the

afternoon than in the morning, that dense areas are generally associated with lower

availability levels, and that interactions between land use and time of day influence

bicycle availability. The model is validated with data from the summer of 2013. The results

clearly demonstrate the satisfactory performance of the proposed statistical framework. The

model developed can be employed by SVLS operators to arrive at availability predictions at a

high temporal resolution – hour by hour – and can be used to optimize rebalancing

operations. To illustrate these applications, an availability prediction exercise is

presented. A review of the publications on SVLS indicates that this thesis is the first to

propose a statistical model of bicycle availability that incorporates detailed spatial and

temporal resolutions. This thesis therefore contributes to advancing the state-of-the-art

toolkit available to SVLS operators around the world, and particularly in Montreal.

Second, an SVLS model transferability exercise is carried out. To this end, an econometric

model of bicycle arrival and departure flows at stations, developed by Faghih-Imani et al.

(2014) with Montreal data, is applied to New York data in order to compare the influence of

temporal, meteorological, bicycle infrastructure, built environment and land-use factors

on SVLS usage in these two cities. The results of this study show similarities in the

influence of weather variables, bicycle infrastructure, and certain land-use and built

environment variables. However, temporal trends – particularly weekend trends – are very

different in the two cities. Overall, our results are promising for the development of

transferable models of bicycle flows in urban settings. It is important to note that this

research effort is the first to investigate the transferability of statistical SVLS models

between two large cities using a detailed model of bicycle arrival and departure flows at

stations that takes into account temporal, meteorological, bicycle infrastructure, built

environment and land-use factors.


Table of Contents

Contributions of Authors

Acknowledgments

Abstract

Résumé

Table of Contents

List of Tables

List of Figures

List of Abbreviations

CHAPTER 1: INTRODUCTION

1.1 Background

1.2 Literature Review

1.3 Objectives

1.4 Thesis Structure

CHAPTER 2: MODELLING BICYCLE AVAILABILITY IN BICYCLE SHARING SYSTEMS: A CASE STUDY FROM MONTREAL

3.1 Context

3.2 Data Preparation and Modeling Exercise

3.2.1 Dependent Variable Definition

3.2.2 Visual Representation of Availability

3.2.3 Addressing Rebalancing

3.2.4 Econometric Model Framework

3.3 Estimation Results

3.3.1 Constant and Preference Heterogeneity

3.3.2 Weather and Temporal

3.3.3 Bicycle Infrastructure

3.3.4 Location, Land Use, and Built Environment

3.3.5 TAZ Level Variables

3.4 Validation and System-State Prediction

3.4.1 Model Validation

3.4.2 System State Prediction

3.5 Conclusions and Future Work

CHAPTER 3: TRANSFERABILITY OF ECONOMETRIC MODELS OF BICYCLE SHARING DEMAND IN URBAN SETTINGS: A CASE STUDY OF MONTREAL AND NEW YORK

3.1 Context

3.2 Data and Methodology

3.2.1 Data Preparation and Comparison

3.2.2 Methodology

3.3 Results

3.3.1 Model Fit Measures

3.3.2 Weather

3.3.3 Temporal

3.3.4 Bicycle Infrastructure

3.3.5 Land-use and Built Environment

3.4 Conclusions and Future Work

CHAPTER 4: CONCLUSION

5.1 Significant Contributions

5.2 Future Research

REFERENCES


List of Tables

Table 1 Estimation Results

Table 2 Aggregate Measures of Fit

Table 3 Descriptive Summary of sample characteristics: Montreal

Table 4 Descriptive Summary of sample characteristics: New York

Table 5 Model Estimation Results: Montreal

Table 6 Model Estimation Results: New York


List of Figures

Figure 1 Variation of availability during the day around BIXI stations (estimation sample)

Figure 2 Variation of availability during the day around BIXI stations (prediction based on validation sample)


List of Abbreviations

BIXI Bicycle-Taxi

BSS Bicycle Sharing Systems

CaBi Capital Bikeshare

CBD Central Business District

FQRNT Fonds Québécois de Recherche Nature et Technologie

GHG Greenhouse Gas

GIS Geographic Information System

IT Information Technology

LL Log-Likelihood

LLR Log-Likelihood of Restricted Model

LLUR Log-Likelihood of Unrestricted Model

MAPE Mean Absolute Percent Error

MGOL Mixed Generalized Ordered Logit

ML Maximum Likelihood

NHTS National Household Travel Survey

NSERC Natural Sciences and Engineering Research Council

PBSC Public Bike System Company

QMC Quasi-Monte Carlo

RMSE Root Mean Square Error

SCD Sub-City District

SVLS Système de Vélos en Libre-Service


CHAPTER 1: INTRODUCTION

1.1 Background

The first bikesharing systems (BSS) started operating in Europe in the 1960s, and have

since spread across the globe. According to DeMaio (2009) and Shaheen et al. (2010), the

history of bikesharing can be broken down into four generations of systems. In the first generation

of BSS, bikes were painted vivid colors and left unlocked around the city so that anyone could

use them. The second generation of BSS required coin deposits in order to unlock a bike. These

early attempts both failed due to user anonymity and a lack of temporal constraints on rentals.

The third generation of BSS was far more successful. These information technology (IT)-based

systems featured user-interface technology or operators at docking stations, required user

identification, and started to implement membership programs and time constraints on usage.

Finally, a fourth generation of systems has emerged in recent years, known as demand-

responsive multimodal systems. Fourth-generation systems feature bicycle redistribution

systems, and attempt to integrate bikesharing with public transit or car-sharing programs. This

thesis will feature studies of two such fourth-generation systems: BIXI in Montreal, and Citi

Bike in New York.

BIXI Montreal kicked off in May 2009, with a fleet of 3000 bicycles distributed among

300 stations. In August 2009, BIXI Montreal expanded to 411 stations and 5000 bikes. In 2010,

it recorded over 3.4 million trips over the course of the season (PBSC, 2010), which lasts from

mid-April to mid-November, due to weather constraints. Although the system is widely regarded

as being successful, it has faced significant financial issues since it was put in place. These issues

suggest that a more thorough understanding of bikesharing systems would be beneficial and


would help these systems thrive and expand, which in turn would allow urban populations to

decrease their environmental footprint while enjoying health benefits.

New York is the most populous city in the US, and a prominent tourist destination, with

millions of visitors each year. In 2013, cycling mode share was about 1%, whereas it was only

0.5% in 2007 (Kaufman et al., 2015). While 71.7% of trips in the New York metropolitan area

were carried out using private vehicles, bike trips accounted for 0.4% of the total according to NHTS

(2009). A closer look at the data shows that 49.7% of trips are shorter than two

miles, and within this category the share of private vehicles falls to 57.1% while the share of

biking rises to 0.7%. This small increase in bicycle mode share around dense urban cores offers

substantial benefits in terms of public health, well-being, and perhaps transportation-related greenhouse

gas (GHG) emissions. Coupled with the fact that 74% of Citi Bike stations are within a half

mile of subway stations, these facts show the potential of BSS to become an important addition

to mobility options for populations located in dense urban areas. New York’s Citi Bike system is

one of the more recent major public bicycle-sharing systems to have been successfully

implemented, and the largest in the United States. The system was launched in May 2013 with

330 stations and over 6000 bicycles in northwestern Brooklyn and the lower half of

Manhattan.

Bicycle sharing systems (BSS) have been receiving increasing amounts of attention in

recent years as complementary modes of transportation in urban areas around the world.

Currently, there are over one million public bicycles worldwide, and over 1,100 cities have

installed or are planning a BSS (Meddin and DeMaio, 2015). These systems present many

advantages, including flexibility, ease of access and use, physical activity and health-related

benefits. These systems also address the issue of bicycle theft for users, a common problem for


regular cyclists in urban environments (Bachand-Marleau et al., 2012; Van Lierop et al., 2013).

Additionally, BSS offer a potential solution to the “last mile” problem (Cervero et al., 2013;

Shaheen et al., 2010) and are in tune with current generational trends in transportation. Younger

generations are less willing to drive, more concerned about the environment, and more prone to

use public transit and shared transportation alternatives (Dutzik and Baxandall, 2013). Recent

work by Murphy and Usher (2015) suggests BSS can improve driver awareness of cyclists,

which can result in increased safety for all cyclists. Finally, a recent study conducted by

researchers in London, UK, showed that BSS can be beneficial to the public perception of

cycling, and help broaden the demographic of bicycle users (Goodman et al., 2014).

1.2 Literature Review

In recent years, studies have examined several facets of BSS in various cities of Europe

and North America. These studies can be segmented into four broad groups. The first group of

studies employs actual flow data obtained from the system under consideration to investigate the

factors affecting BSS flows. The second group consists of surveys of user behaviours and

perceptions, while the third is concerned with identifying “problematic” stations and optimizing

rebalancing efforts. Finally, the fourth group of studies is concerned with the transferability of

models of BSS demand and bicycle flows. We will provide a brief overview of research studies

along these four dimensions.

Investigating the Factors of BSS Flows

From our review, this group of studies appears to be the most developed. Several studies

set in individual cities have been published over the course of the last decade. For instance,

Krykewycz et al. (2010) investigated a planned system in Philadelphia, Pennsylvania, using a

raster based Geographic Information System (GIS) to identify possible locations for BSS while


using data from European cities to forecast expected demand. Wang et al. (2012) investigated

annual station trips in Minneapolis-St. Paul using three ordinary least squares regression models

and four types of variables: presence of businesses and jobs, socio-demographics, built

environment and transportation infrastructure. A common limitation of these studies is the lack

of detailed temporal resolution. Monthly or annual flow estimates fail to capture short-term

variation due to shifts in weather, as well as time of day and weekend variation. Recent work by

Hampshire et al. (2013) used aggregated hourly arrival and departure rates to study the influence

of bicycle infrastructure and land use on bicycle flows. Arrivals and departures were aggregated

at the Sub-City District (SCD) level in Barcelona and Seville, Spain. While the study considered

a detailed temporal resolution, an aggregated spatial resolution at the SCD level was a limitation.

Faghih-Imani et al. (2014) modelled station level arrival and departure rates using flow data from

Montreal’s BIXI system while allowing for fine temporal resolution (hour). The authors

developed linear mixed models to quantify the impact of meteorological, temporal and built

environment attributes on bicycle usage while accounting for station-specific unobserved

effects.

User Surveys

This second set of studies relies on survey data to elicit user experience perceptions.

Buck et al. (2013) analyzed the results of a survey conducted in 2007-2008 to establish the

profiles of short-term users and annual members of Capital Bikeshare (CaBi) in Washington,

DC. Fishman et al. (2014) used survey and trip data from Melbourne, Brisbane, Washington

D.C., London, and Minneapolis-St. Paul to investigate the extent to which BSS can help replace

some of the automobile mode share with bicycle share. The study also examined the influence of

rebalancing needs in order to determine the impact of BSS on vehicle-kilometers travelled.


Bachand-Marleau et al. (2011) and Bachand-Marleau et al. (2012) examined the results of a

survey conducted in 2010 with BIXI users, and sought to determine what factors affected system

usage and frequency of use. They found that proximity of home to a docking station had the

greatest impact. In the 2012 paper, the authors used the survey results to examine the relationship

between BIXI and public transit usage.

Identifying Problematic Stations

The third group of studies, and one very relevant to BSS operators, focuses on identifying

problematic stations – stations that are full or empty. Nair et al. (2013) examined system

characteristics, utilization patterns, public transit interaction, and flow imbalances between

stations over time for the Vélib’ system in Paris, France. The authors adopted a stochastic

optimization framework to generate redistribution plans for the Vélib’ system. Fricker and Gast

(2014) studied the effect of the randomness of user decisions on the number of problematic

stations. Kloimüllner et al. (2014) developed a dynamic framework to undertake rebalancing in

real time using historical data from Citybike Wien in Vienna, Austria. These studies provide

opportunities for BSS operators to go beyond the two usual options: setting rebalancing

schedules based on historical patterns, or rebalancing reactively when stations cross

certain thresholds of availability – too high or too low.
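
As an illustration only, the threshold-based option can be sketched in a few lines of Python. The function name, the 10% and 90% thresholds and the station figures are hypothetical. None of them comes from the studies cited above; they only show what "beyond a threshold of availability" could mean in practice.

```python
def classify_station(bikes: int, capacity: int,
                     low: float = 0.10, high: float = 0.90) -> str:
    """Label a station by its fill ratio (bikes docked / number of docks).

    `low` and `high` are illustrative thresholds, not values used by any
    operator or study.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    ratio = bikes / capacity
    if ratio <= low:
        return "needs bikes"   # nearly empty: users cannot pick up a bike
    if ratio >= high:
        return "needs docks"   # nearly full: users cannot drop off a bike
    return "ok"


# Hypothetical snapshot: station name -> (bikes docked, number of docks)
stations = {"A": (1, 20), "B": (19, 20), "C": (9, 20)}
flags = {name: classify_station(b, c) for name, (b, c) in stations.items()}
```

A rule of this kind only reacts once a station is already close to failing its users, which is the limitation the optimization and prediction frameworks reviewed here try to overcome.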

Model Transferability

Of the four groups of studies mentioned in this review, the one concerned with model

transferability is probably the most under-developed. Only a handful of publications were

concerned with determining the degree of transferability of models of BSS demand and flow

patterns. Sarkar et al. (2015) applied unsupervised clustering techniques to data from 10 cities

located all around the world, and gained some very interesting insights into how BSS in these


different urban areas compare. The main conclusions of this work were that the larger the

system, the greater the spread of station type and behaviour; and systems with fewer than 100

stations were relatively homogeneous. However, this paper relied solely on historical trends in

BSS data, and did not account for other types of variables. This is a rather severe limitation,

since it does not differentiate the impacts of the broad spectrum of variables. Other studies of

interest include the work of Conway (2014), who applied models developed in Washington, D.C.

to data from Minneapolis-St. Paul and the San Francisco Bay Area. The models did not perform

well when applied to these different urban settings. Once again, this model presents several

shortcomings, since it only accounts for a limited array of land-use variables, and does not

account for weather-related or temporal trends. Rixey (2013) used a regression analysis to assess

how bikesharing ridership levels were affected by demographics and built environment around

stations in Washington, Minneapolis-St. Paul, and Denver. While this study did well on the level

of spatial analysis, it was based on regressions on a small sample (n=265). Furthermore, the

dependent variable was the natural log of monthly rentals, meaning short-term considerations

were not captured, most notably weather and time-of-day trends. Finally, the intent of this study

was to develop a regression framework based on data from three cities, and not to compare these

three cities, which resulted in the authors emphasizing different aspects of their research effort.

1.3 Objectives

As is evident from the BSS literature review presented earlier, there are few studies

exploring the transferability of models of BSS flows and usage patterns, and perhaps fewer that

examine availability of bicycles at stations as a direct metric of analysis. Earlier work has

primarily focused on optimization approaches that rely on historical data of BSS usage, or on

modelling arrival and departure flows. While these studies provide useful insights based on


analytical approaches, most fail to consider the impact of a host of exogenous variables on

bicycle availability. Ignoring the impact of these variables would reduce the effectiveness of the

prediction platform for new stations or in locations with rapid land use changes. To elaborate, as

these approaches are mainly based on historical patterns, any change in the station structure and

usage patterns due to changes to land-use (or new developments) might be harder to replicate.

The objective of Chapter 2 is to address this research gap by developing a quantitative model of

station level bicycle availability for Montreal’s BIXI system. This behaviorally quantitative

model should accommodate the influence of temporal, meteorological, bicycle infrastructure,

built environment and land-use attributes on bicycle availability. Furthering current

understanding of the factors affecting bicycle availability will yield insights into the supply-and-

demand mechanisms of bikesharing systems, and allow the operators to better optimize their

rebalancing procedures.

The second objective of this thesis is to contribute to the fledgling literature on BSS

model transferability by comparing two large BSS from Montreal and New York. Specifically,

earlier research into model transferability has not featured detailed econometric models

incorporating the effects of several types of variables. Understanding how weather and temporal

patterns, as well as bicycle infrastructure and general land-use around stations affect BSS

demand in these two different contexts would provide very valuable insights into how to plan

and operate more successful BSS.

1.4 Thesis Structure

The thesis objectives are aligned with the chapters in the dissertation. Chapter 2 focuses on the

Montreal BSS by developing and estimating a quantitative model of bicycle availability. This

model presents detailed temporal and spatial resolutions, and offers significant insight into the

main drivers of BSS demand – temporal, meteorological, bicycle infrastructure, land-use and

built environment. Chapter 3 focusses on the potential to transfer models from different spatial

contexts in order to gain insights into existing systems, or optimize the planning of new ones.

Specifically, we apply a detailed arrivals and departures model developed for Montreal by

Faghih-Imani et al. (2014) to New York, and compare the results to determine how these two

systems respond to various categories of variables – temporal, meteorological, bicycle

infrastructure, land-use and built environment. Finally, chapter 4 provides some concluding

remarks and suggests future directions of study.

CHAPTER 2: MODELLING BICYCLE AVAILABILITY IN BICYCLE

SHARING SYSTEMS: A CASE STUDY FROM MONTREAL

3.1 Context

The observed bicycle flows (arrivals and departures) in a BSS are in response to

individuals’ need to travel. Hence, observed flows are significantly influenced by land use and

urban form, meteorological and temporal attributes. For example, Faghih-Imani et al. (2014)

observed clear commuting trends, i.e., in the morning period bicycles were likely to be picked up

farther from the Central Business District (CBD) and dropped off at stations in the CBD. Such

systematic movements of bicycles in a single direction are likely to create empty stations away

from the CBD and full stations around the CBD. This pattern can lead to lack of access to

bicycles or empty slots for customers, which is a concern for BSS operators because bicycle

availability is at the heart of BSS user-experience. A lack of available bicycles or space to drop

off a bike after usage discourages individuals from using the system. Hence, it is important for

system operators to ensure that bicycle availability (and empty slot availability) is maintained.

For a fixed station capacity, determining the number of bicycles at the station will automatically

determine the number of empty slots. Therefore by examining bicycle availability we also

observe the availability of empty slots.

In order to address flow imbalances, in most systems BSS operators transfer bikes from

full stations to empty stations to ensure bicycle (or slot) accessibility in the system, a process

referred to as rebalancing. In addition to the commuting trend, several spatial and temporal

relationships can result in asymmetry across the system, thus increasing rebalancing needs (Nair

et al., 2013). Moreover, from an environmental perspective, since rebalancing trucks are the only

source of air pollution related to BSS, it is important to minimize negative

environmental externalities. Despite the growth of BSS around the world in recent years and the

challenges highlighted above, there are very few studies examining the availability of bicycles or

empty slots at a station. To be sure, there have been studies on optimizing rebalancing operations

using historical data from a data mining based approach (Kloimüllner et al., 2014). However,

these approaches do not consider any behavioral relationships between BSS demand and factors

affecting demand such as socio-demographics and land use.

In this chapter, we estimate an ordered regression model – panel mixed generalized

ordered logit model – using data from BIXI-Montreal for the summer of 2012 to accommodate

exogenous variables and station level unobserved factors. The estimated model is validated

using a hold-out sample of data from the summer of 2013. Finally, to illustrate its applicability,

an availability prediction exercise is undertaken. The model developed can be employed by BSS

operators to arrive at hourly system state predictions and used for rebalancing operations.

3.2 Data Preparation and Modeling Exercise

The data used for this study was collected from BIXI Montreal’s website based on number

of bicycles available at each station on a minute-per-minute basis between April and August 2012,

for 410 BIXI stations throughout the island of Montreal. For the purposes of this study, data for

seven consecutive days for each station were extracted at random from the months of May to

August. April was excluded since not all stations start operating on the same day. The data was

aggregated to an hourly level and augmented with a host of variables, including weather, location,

bicycle infrastructure, land use and built environment, and TAZ level data. The dataset chosen

consists of 68,880 observations (7 days × 24 hours × 410 stations).

3.2.1 Dependent Variable Definition

An important part of the research exercise is to define station level availability. In our

work, we define bicycle availability as the ratio of bicycles docked at a station to station

capacity. Hence, availability of 0 would mean the station is completely empty, while availability

of 1 would imply a full station. Further, as BSS operate on a continuous time scale, the availability

measure could also be computed in continuous time. However, this would make the analysis

substantially more computationally intensive. Hence, in our approach, we average the

minute-by-minute availability across an hour to generate an hourly availability value for each station. Thus,

a single hourly measure that reflects the state of the system in that hour is computed as the

dependent variable in our analysis. The variable has a range from 0 to 1. The bounded nature of

the dependent variable precludes the consideration of linear regression models for analysis. To

facilitate a parsimonious analysis, we consider a discretization of the variable into five

categories: 0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, and 0.8-1.
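
The construction of the dependent variable described above can be sketched as follows. This is a minimal illustration, not the processing code used in the thesis: the function names, the example station and the treatment of values that fall exactly on a category boundary are assumptions.

```python
from statistics import mean

# Upper bounds of the five availability categories used in the text.
BIN_EDGES = [0.2, 0.4, 0.6, 0.8, 1.0]
LABELS = ["0-0.2", "0.2-0.4", "0.4-0.6", "0.6-0.8", "0.8-1"]


def hourly_availability(bikes_per_minute, capacity):
    """Average the minute-level ratio bikes/capacity over one hour."""
    return mean(b / capacity for b in bikes_per_minute)


def availability_category(avail):
    """Return the index (0..4) of the availability category.

    Assumption: a value exactly on a boundary (e.g. 0.2) is assigned to
    the upper category; the source does not state the convention.
    """
    if not 0.0 <= avail <= 1.0:
        raise ValueError("availability must lie in [0, 1]")
    for k, upper in enumerate(BIN_EDGES):
        if avail < upper:
            return k
    return len(BIN_EDGES) - 1  # avail == 1.0 belongs to the top category


# Hypothetical station with 20 docks: 3 bikes for 30 minutes, then 6 bikes.
minutes = [3] * 30 + [6] * 30
a = hourly_availability(minutes, capacity=20)
print(round(a, 3), LABELS[availability_category(a)])
```

Averaging before discretizing means that a station that is empty for only part of an hour is not necessarily classified in the lowest category, which is why the text later uses 0.05 and 0.95 as practical thresholds for empty and full stations.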

In the dataset used for this study, stations were completely empty 10.5% of the time, and

completely full 6.6% of the time (0.05 and 0.95 thresholds). It should be noted that computing

availability at an hourly level makes extreme values of 0 and 1 less likely to occur, hence 0.05

and 0.95 are assumed as thresholds to determine completely empty or full stations, respectively.

Furthermore, in 26.3% of cases, stations were less than 20% full, and in 18.3% of cases, they

were over 80% full. So, in total the stations were unusable 17.1% of the time and were close to

unusable 44.6% of the time. These numbers clearly highlight the potential inefficiency in the

BSS being studied. Finally, the spatial distribution of the inefficiency also varies

substantially across the system.

3.2.2 Visual Representation of Availability

In order to better understand bicycle availability and its intimate link with rebalancing

operations, a Geographic Information System (GIS) was used to represent bicycle availability at

all stations in Montreal’s BIXI system, at 8 AM, 12 PM, 5 PM, and 9 PM of a typical summer

day (see Figure 1). The availability values plotted represent the weekly mean of availability

values for each time period and station. Figure 1 highlights the typical bicycle movements

throughout the day. At 8 AM, stations located near downtown present low availability levels

(green), whereas stations located further away are more likely to be full (red). At 12 PM, the

trend is reversed. All the morning commutes downtown have filled the downtown stations and

emptied the stations located further out. The 5 PM period has less of a clear distinction, since

downtown stations are closer to empty. Finally, at 9 PM, the downtown stations are nearly

empty, while the stations on the outskirts are full. It is interesting to note that there are many

more balanced stations (yellow) in the morning than in the evening. This is likely due to the

rebalancing efforts of BIXI operators.

Figure 1 Variation of availability during the day around BIXI stations (estimation sample). Panels: 8 AM, 12 PM, 5 PM and 9 PM.

3.2.3 Addressing Rebalancing

In using station data compiled from BIXI’s website, it is not possible to differentiate

between user drop-offs and pick-ups versus rebalancing actions. Rebalancing operations

represent an outside attempt to ensure bicycle availability in the system. However, for our

analysis it is critical to account for the presence of artificial flows due to rebalancing. Faghih-

Imani et al. (2014) proposed a heuristic approach to separate rebalancing flows from true flows.

However, the same methodology could not be employed here because rebalancing operations are

likely to have a more prolonged impact on the dependent variable. To elaborate, accounting for

rebalancing on an hourly level would be inadequate, since the number of bikes docked at a

station at a specific time is dependent on how many bikes were there in the previous time frame,

and will affect several subsequent records. In other words, if a rebalancing operation occurs at

2pm, it will not only affect the 2-3pm record. It is much more likely to affect several subsequent

records. In order to address this issue, we created identifier variables to recognize rebalancing

operations and examined their impact over the next 2, 3, 4, 5, 6, and 12 hours. These identifiers

were then provided as input to the models being developed.
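
The lagged identifier variables described above can be sketched as follows. This is only an illustration under stated assumptions: the 0/1 event list is hypothetical, the way rebalancing operations are detected is outside the sketch, and the window is assumed to include the hour of the operation itself.

```python
def lagged_rebalancing_flags(events, window):
    """Flag every hourly record that falls within `window` hours of an event.

    events: list of 0/1 values, one per hourly record of a station, where 1
            marks an hour with a detected rebalancing operation.
    window: number of hours over which the operation is assumed to act,
            counting the hour of the operation itself (an assumption).
    """
    flags = [0] * len(events)
    for t, happened in enumerate(events):
        if happened:
            for h in range(t, min(t + window, len(events))):
                flags[h] = 1
    return flags


# Hypothetical day: one refill detected in the 2-3 pm record (index 14).
refill = [0] * 24
refill[14] = 1

# One identifier per candidate window examined in the text.
identifiers = {w: lagged_rebalancing_flags(refill, w) for w in (2, 3, 4, 5, 6, 12)}
print(sum(identifiers[6]))  # hourly records flagged by the 6-hour identifier
```

Each identifier can then enter the model as a separate candidate variable, so that the window with the best statistical fit is retained (Table 1 reports the 6-hour lag for both refill and removal operations).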

3.2.4 Econometric Model Framework

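For this study, a Panel Mixed Generalized Ordered Logit (MGOL) model was used to examine bicycle availability (see Eluru et al., 2008 and Eluru, 2013). Let the latent propensity for station availability be denoted by $y_{it}^*$, where $i$ represents the station ($i = 1, 2, \ldots, N$; $N = 410$ in our case), $t$ represents the hour under consideration ($t = 1, 2, \ldots, T$; $T = 168$ in our case), and $j$ ($j = 1, 2, \ldots, J$) denotes the station availability levels.

The equation system for the MGOL model can be expressed as (see Yasmin and Eluru, 2013):

$$y_{it}^* = (\boldsymbol{\beta} + \boldsymbol{\alpha}_i)\,\mathbf{X}_{it} + \varepsilon_{it} \qquad (1)$$

and

$$\tau_{it,j} = \tau_{it,j-1} + \exp\left[(\boldsymbol{\delta}_j + \boldsymbol{\gamma}_{ij})\,\mathbf{Z}_{it,j}\right] \qquad (2)$$

$\boldsymbol{\beta}$ and $\boldsymbol{\delta}_j$ are vectors of unknown parameters to be estimated.

$\tau_{it,j}$ represents the thresholds associated with the availability levels. In order to ensure well-defined intervals and the natural ordering of observed availability, the thresholds are assumed to be ascending in order, such that $\tau_{it,0} < \tau_{it,1} < \ldots < \tau_{it,J}$, where $\tau_{it,0} = -\infty$ and $\tau_{it,J} = +\infty$.

In equations 1 and 2, we assume that $\boldsymbol{\alpha}_i$ and $\boldsymbol{\gamma}_{ij}$ are independent realizations from normal distributions. Thus, conditional on $\boldsymbol{\alpha}_i$ and $\boldsymbol{\gamma}_{ij}$, the probability expressions for station $i$, hour $t$ and availability level $j$ in the MGOL model take the following form:

$$\pi_{itj} = \Pr(y_{it} = j \mid \boldsymbol{\alpha}_i, \boldsymbol{\gamma}_{ij}) = \Lambda\left[(\boldsymbol{\delta}_j + \boldsymbol{\gamma}_{ij})\,\mathbf{Z}_{it,j} - (\boldsymbol{\beta} + \boldsymbol{\alpha}_i)\,\mathbf{X}_{it}\right] - \Lambda\left[(\boldsymbol{\delta}_{j-1} + \boldsymbol{\gamma}_{i,j-1})\,\mathbf{Z}_{it,j-1} - (\boldsymbol{\beta} + \boldsymbol{\alpha}_i)\,\mathbf{X}_{it}\right] \qquad (3)$$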

where Λ(. ) represents the standard logistic cumulative distribution function.
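
To make the role of the thresholds concrete, the following sketch computes the five level probabilities for a single station-hour in the usual generalized ordered logit form, P(y = j) = Λ(τ_j − V) − Λ(τ_{j−1} − V), with thresholds built recursively as in equation 2. It is an illustration only: the numerical inputs are made up, the random terms are treated as already drawn, and fixing the first finite threshold at 0 is an assumed normalization that the source does not state.

```python
import math


def logistic_cdf(x):
    """Standard logistic CDF, written to stay stable for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def thresholds(threshold_indices, first=0.0):
    """Build ascending thresholds following the recursion in equation (2).

    threshold_indices: values of (delta_j + gamma_ij) * Z_itj for the second
        and later finite thresholds.
    first: value of the first finite threshold (assumed normalization).
    """
    taus = [-math.inf, first]
    for index in threshold_indices:
        taus.append(taus[-1] + math.exp(index))  # exp(.) > 0 preserves the order
    taus.append(math.inf)
    return taus


def level_probabilities(propensity, threshold_indices):
    """Return P(y = j) for every availability level j."""
    taus = thresholds(threshold_indices)
    cdf = [logistic_cdf(tau - propensity) for tau in taus]
    return [cdf[j] - cdf[j - 1] for j in range(1, len(taus))]


# Made-up inputs: propensity V = 0.3 and three threshold indices, which
# yields the five availability levels used in this chapter.
probs = level_probabilities(0.3, [-0.1, -0.2, 0.4])
print([round(p, 3) for p in probs], round(sum(probs), 6))
```

Because each increment exp(·) is strictly positive, the thresholds remain ordered whatever the parameter values, which is the purpose of the exponential form in equation 2.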

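The likelihood function conditional on $\boldsymbol{\alpha}_i$ and $\boldsymbol{\gamma}_{ij}$ can be written as

$$L \mid \boldsymbol{\alpha}_i, \boldsymbol{\gamma}_{ij} = \prod_{t=1}^{T} \prod_{j=1}^{J} (\pi_{itj})^{d_{itj}} \qquad (4)$$

where $d_{itj}$ takes the value of 1 if $j$ is the observed availability level at station $i$ for hour $t$, and 0 otherwise.

The unconditional likelihood can subsequently be obtained as:

$$L_i = \int_{\boldsymbol{\alpha}_i, \boldsymbol{\gamma}_{ij}} (L \mid \boldsymbol{\alpha}_i, \boldsymbol{\gamma}_{ij})\, dF(\boldsymbol{\alpha}_i, \boldsymbol{\gamma}_{ij}) \qquad (5)$$

The log-likelihood function is computed as:

$$\mathcal{L} = \sum_{i=1}^{N} \ln L_i \qquad (6)$$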

In this study, we use a Quasi-Monte Carlo (QMC) method proposed by Bhat (2001) to

draw realizations from the population multivariate distribution. Within the broad framework of QMC

sequences, we specifically use the Halton sequence (250 Halton draws) in the current analysis.
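
The simulation step can be illustrated with a short sketch that approximates the integral over a single normally distributed random effect using Halton draws. It is a simplified, one-dimensional illustration rather than the estimation routine used in the thesis: the function names and the toy conditional likelihood are hypothetical, and the same 250 draws in base 2 are used without the station-specific and dimension-specific sequences a full implementation would need.

```python
import math
from statistics import NormalDist


def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def simulated_likelihood(conditional_likelihood, sigma, n_draws=250):
    """Approximate the unconditional likelihood of one station.

    conditional_likelihood: function alpha -> likelihood given alpha.
    sigma: standard deviation of the normally distributed random effect.
    """
    inv_cdf = NormalDist().inv_cdf
    total = 0.0
    for r in range(1, n_draws + 1):
        # Map the uniform Halton point to a draw from N(0, sigma^2).
        alpha = sigma * inv_cdf(halton(r, base=2))
        total += conditional_likelihood(alpha)
    return total / n_draws


# Toy conditional likelihood (hypothetical): one record whose probability is
# a logit in alpha. sigma is the standard deviation reported in Table 1.
toy = lambda alpha: 1.0 / (1.0 + math.exp(-(0.5 + alpha)))
print(round(simulated_likelihood(toy, sigma=0.3357), 4))
```

The log-likelihood is then the sum over stations of the logarithm of these simulated values.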

3.3 Estimation Results

The model estimation process started with the estimation of a simple generalized ordered

logit model. Subsequently, the panel mixed generalized ordered logit model was estimated by

building on the results of the simpler models and is presented in Table 1. The model estimation

process was guided by statistical significance (at 90% level), parameter interpretability and

parsimony considerations. The results of the exogenous variable impacts are discussed by

variable category.

3.3.1 Constant and Preference Heterogeneity

The constant does not have any substantive interpretation in the model. However, the

presence of statistically significant standard deviation on the constant highlights the presence of

station specific unobserved effects that jointly influence the availability levels for all records for

the station. These joint effects have a standard deviation of 0.3357.

3.3.2 Weather and Temporal

The impact of temperature on latent propensity is negative, indicating that as temperature

increases, BIXI availability is likely to decrease. This is expected, as in Montreal, with higher

temperatures BIXI usage is expected to increase. The coefficient for the elevation variable in the

propensity is negative and the coefficient in the third threshold is positive, indicating that stations

with a greater elevation are less likely to be full than their counterparts located at lower

elevations. As it is easier to bicycle downhill compared to uphill, stations at an elevation are

more likely to experience asymmetry in travel to and from such stations.

The results for temporal variables follow expected trends. For instance, the AM

coefficient in the propensity function is positive, whereas PM is negative, implying the system is

used more in the afternoon than in the morning. These results are in line with the findings of

Faghih-Imani et al. (2014). It is noteworthy that the coefficients of AM (6-10am) and PM (3-

7pm) are both positive in the second threshold, indicating that stations are more likely to have

low availability than to be balanced during those time frames. Overall, since the AM and PM

periods are when the system is used most, and the flows are most imbalanced, a concentration of

availability around the extremes is expected to occur during those periods. The weekend variable

has a positive coefficient, indicating that the system is used more during the week, probably for

commuting purposes.

3.3.3 Bicycle Infrastructure

The number of BIXI stations in a 250 meter buffer offers interesting results. The presence

of multiple stations in the 250m buffer is likely to reduce the availability at the station of interest,

possibly indicating that these locations are trip generators. On the other hand, in the downtown

region, the impact on availability of neighboring stations is compensated by the interaction term,

thus indicating that availability is marginally influenced by neighboring stations in the downtown

region.

The variable interacted with the downtown variable also affects the second threshold,

with a negative sign indicating that stations located downtown are more likely to be balanced

than low. As expected, a refill rebalancing operation increases availability, while a removal

rebalancing operation decreases availability.

3.3.4 Location, Land Use, and Built Environment

The model indicates that stations located in the old port or downtown areas have lower

availability levels overall, which was expected since those are mostly departure areas.

Furthermore, these areas are likely to have higher job concentrations and are conducive to PM

travel - consistent with findings of Faghih-Imani et al. (2014). Street length around the station is

associated with a positive coefficient in the propensity, which is counterintuitive since one would

expect a denser road network in downtown areas. It is important to note that the downtown and

old port dummies interact with this variable. Street length is also associated with a negative

coefficient in the fourth threshold, indicating that areas with high street length values are more

likely to be associated with very high availability.

Walkscore in the vicinity of the station has a negative coefficient indicating highly

walkable neighborhoods are bicycle friendly as well. In addition to the negative mean effect, the

Walkscore variable also has a standard deviation indicating that the impact of walkability varies

across stations. Further, the propensity function indicates that restaurants affect availability based

on time of day. In the AM period presence of restaurants reduces availability while in the PM

period their presence increases availability. This is likely because people usually go to

restaurants more in the late afternoon than in the early morning. Other commercial sites (such as

stores, and libraries) exhibit the opposite effect, with a positive impact on propensity in the AM

period and negative impact on propensity in the PM period. This suggests BIXI users shop more

in the morning than the afternoon. The parameters in the threshold also support the hypotheses

for these variables.

3.3.5 TAZ Level Variables

TAZ with large industrial areas are associated with lower availability levels. This result

seems intuitive since industrial parts of town are less likely to be destinations for BIXI users, and

unlikely to be refilled. Further, the variable also has a significant standard deviation indicating

the impact of the variable varies across stations. The positive sign associated with the TAZ job density

variable suggests that areas with high job concentrations are mostly drop-off areas. In the second

threshold, TAZ Parks and Recreational Areas are associated with a positive sign, indicating that

stations are more likely to be empty than balanced when they are located in a TAZ with lots of

parks and recreational activities. Finally, in the fourth threshold, TAZ with large commercial

areas are more likely to have stations with high availability than stations with very high

availability. The reasons for this impact are not immediately apparent and warrant further

investigation.

Table 1 Estimation Results

| Variables | Propensity: Coef. | Propensity: t-stat | Threshold b/w Low and Balanced: Coef. | Threshold b/w Low and Balanced: t-stat | Threshold b/w Balanced and High: Coef. | Threshold b/w Balanced and High: t-stat | Threshold b/w High and Very High: Coef. | Threshold b/w High and Very High: t-stat |
|---|---|---|---|---|---|---|---|---|
| Latent propensity component | | | | | | | | |
| Constant | -2.5424 | -47.74 | -0.0728 | -11.97 | -0.2175 | -35.77 | 0.4049 | 25.28 |
| Standard Deviation | 0.3357 | 26.42 | - | - | - | - | - | - |
| Weather, Geography, Temporal | | | | | | | | |
| Temperature (ºC) | -0.0594 | -109.13 | - | - | - | - | - | - |
| Elevation (×10⁻¹; m) | -0.0810 | -16.91 | - | - | 0.0304 | 30.61 | - | - |
| AM period (6-10 am) | 0.1377 | 5.25 | 0.1448 | 7.56 | - | - | - | - |
| PM period (3-7 pm) | -0.1106 | -4.71 | 0.1275 | 7.69 | - | - | - | - |
| Weekend | 0.0837 | 16.05 | - | - | - | - | - | - |
| Bicycle Infrastructure | | | | | | | | |
| Number of BIXI stations in 250m buffer | -0.2328 | -20.45 | - | - | - | - | - | - |
| Number of BIXI stations in 250m buffer × Downtown | 0.1900 | 8.10 | -0.0450 | -19.39 | - | - | - | - |
| Refill (6hr lag) | 0.9303 | 18.48 | - | - | - | - | - | - |
| Removal (6hr lag) | -0.2265 | -4.41 | - | - | - | - | - | - |
| Location, Land use, Built environment | | | | | | | | |
| Old port | -1.5784 | -34.11 | - | - | - | - | - | - |
| Downtown | -1.5639 | -15.98 | - | - | - | - | - | - |
| Street length in 250m buffer (km) | 0.3233 | 24.13 | - | - | - | - | -0.1328 | -29.79 |
| Walkscore (1: low - 7: high; ×10⁻¹) | -0.0383 | -7.81 | - | - | - | - | - | - |
| Standard Deviation | 0.1463 | 63.94 | - | - | - | - | - | - |
| Restaurants in 250m buffer interacted with AM (×10⁻²) | -0.3939 | -4.74 | - | - | 0.4687 | 7.03 | - | - |
| Restaurants in 250m buffer interacted with PM (×10⁻²) | 0.6364 | 9.66 | - | - | - | - | - | - |
| Commercial venues in 250m interacted with AM (×10⁻³) | 0.5370 | 3.09 | - | - | -0.7349 | -5.95 | - | - |
| Commercial venues in 250m interacted with PM (×10⁻³) | -0.4940 | -5.04 | - | - | - | - | - | - |
| TAZ Level | | | | | | | | |
| TAZ Industrial and Resources (km²) | -3.7085 | -19.62 | - | - | - | - | - | - |
| Standard Deviation | 0.4927 | 2.48 | - | - | - | - | - | - |
| TAZ Job Density (Jobs per m²) | 0.1179 | 4.47 | - | - | - | - | - | - |
| TAZ Parks and Recreational Areas (km²) | - | - | 0.1312 | 4.33 | - | - | - | - |
| Standard Deviation | - | - | 0.3847 | 6.04 | - | - | - | - |
| TAZ Commerces (km²) | - | - | - | - | - | - | 8.9525 | 54.19 |

Log-likelihood at convergence: -103080
Number of observations: 68,880
- = Not applicable

3.4 Validation and System-State Prediction

3.4.1 Model Validation

To evaluate the performance of the MGOL model, we undertake a validation exercise on

a hold-out sample. The sample is obtained from 2013 (recall the estimation data is from 2012).

The same data processing approach is employed for the validation sample preparation. The

validation exercise is undertaken at disaggregate and aggregate level. At the disaggregate level,

the predictive log-likelihood of the proposed model is estimated. The predictive log-likelihood is

compared to the log-likelihood at 0 and the log-likelihood at sample shares. The model with 32

parameters shows substantial improvements relative to both benchmarks. Specifically, the

predictive log-likelihood of the MGOL model is -108,117, while the corresponding values for the

log-likelihood at 0 and at sample shares are -110,858 and -110,088, respectively. The

log-likelihood ratio test statistic, defined as 2 × (LLUR − LLR), is computed to evaluate the

improvement in model fit, where LLUR corresponds to the log-likelihood of the unrestricted model

(MGOL model) and LLR corresponds to the log-likelihood of the restricted model (model at 0 or

model with constants). The log-likelihood ratio test statistics for our model relative to the

model at 0 and the model with constants are 5,482 and 3,942, respectively. These values are much

larger than the corresponding critical value of the chi-square distribution at any conventional

level of significance. Thus, we clearly see that the

model predicts the station availability levels adequately.
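
As a check on the reported values, the log-likelihood ratio statistic can be recomputed from the three log-likelihoods quoted above, using the standard definition of the statistic as twice the difference between the unrestricted and restricted log-likelihoods. The function and variable names below are illustrative only.

```python
def likelihood_ratio_statistic(ll_unrestricted, ll_restricted):
    """Return 2 * (LL_UR - LL_R)."""
    return 2.0 * (ll_unrestricted - ll_restricted)


ll_mgol = -108_117    # predictive log-likelihood of the MGOL model
ll_zero = -110_858    # log-likelihood at 0
ll_shares = -110_088  # log-likelihood at sample shares

print(likelihood_ratio_statistic(ll_mgol, ll_zero))    # 5482.0
print(likelihood_ratio_statistic(ll_mgol, ll_shares))  # 3942.0
```

Both results agree with the statistics reported in the text.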

To undertake comparison at an aggregate level, we compare the predicted aggregate

shares with observed aggregate shares. Specifically, we compute the Mean Absolute Percent

Error (MAPE) value and Root Mean Square Error (RMSE) of the predicted shares relative to

observed shares. In addition to the full sample comparison, we also examine model performance

for two spatial categories: (1) Downtown and Old port and (2) > 5kms from Downtown. The

results for the comparison are presented in Table 2. Across all three categories, we observe that

the aggregate model performance is very reasonable with MAPE ranging from 12% to 18%. The

RMSE values range from 3.4 to 4.8. Overall, the results indicate high prediction accuracy around

downtown and slightly lower prediction accuracy further from downtown. Even at these further

distances, the errors are quite satisfactory. Further, we observe a slight overprediction in the

extreme alternatives based on our model results.
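
The aggregate measures can be sketched as follows, using the full-sample shares printed in Table 2. The exact formulas used in the thesis are not stated, so the definitions below are assumptions; in particular, the `ddof` argument is included because, with the rounded shares, dividing the sum of squared errors by the number of categories minus one appears to reproduce the reported RMSE, whereas dividing by the number of categories does not. Small discrepancies in MAPE can arise because the published shares are rounded.

```python
import math


def mape(actual, predicted):
    """Mean absolute percent error, in percent."""
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def rmse(actual, predicted, ddof=0):
    """Root mean square error; `ddof` is subtracted from the divisor."""
    squared = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / (len(actual) - ddof))


# Full-sample shares (% of records) as printed in Table 2.
actual = [25.8, 18.1, 20.4, 17.4, 18.3]
predicted = [30.7, 16.5, 16.9, 15.6, 20.3]

print(round(mape(actual, predicted), 2))
print(round(rmse(actual, predicted), 2), round(rmse(actual, predicted, ddof=1), 2))
```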

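Table 2 Aggregate Measures of Fit

| Availability levels / Measures of fit | Full sample: actual shares (% records) | Full sample: predicted shares (% records) | Old port and Downtown: actual shares (% records) | Old port and Downtown: predicted shares (% records) | >5 km from Downtown (not old port): actual shares (% records) | >5 km from Downtown (not old port): predicted shares (% records) |
|---|---|---|---|---|---|---|
| 0-0.2 | 25.8 | 30.7 | 36.8 | 43.6 | 19.5 | 24.7 |
| 0.2-0.4 | 18.1 | 16.5 | 17.3 | 15.0 | 17.5 | 17.3 |
| 0.4-0.6 | 20.4 | 16.9 | 16.2 | 15.0 | 24.9 | 18.2 |
| 0.6-0.8 | 17.4 | 15.6 | 13.4 | 11.7 | 20.6 | 18.2 |
| 0.8-1 | 18.3 | 20.3 | 16.4 | 14.6 | 17.6 | 21.6 |

| Measure | Full sample | Old port and Downtown | >5 km from Downtown (not old port) |
|---|---|---|---|
| MAPE | 13.2 | 12.4 | 17.8 |
| RMSE | 3.4 | 3.8 | 4.8 |
| Records | 68,880 | 15,120 | 16,632 |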

3.4.2 System State Prediction

The main strength of the model framework developed is the ability to predict the future

availability levels in the bike sharing system. To illustrate this we provide snapshots of BIXI

system availability at 4 instances of the day. To be sure, the model developed is a probabilistic

model and thus only provides the probability of an availability level. To obtain the actual

availability one has to employ random numbers to arrive at predictions i.e. each random number

realization might alter the prediction for the station state. A system state prediction based on one

set of random numbers is presented in Figure 2. The figure provides evidence of the model’s

applicability for system state prediction.
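
The random-number step described above can be sketched as follows: for each station, a uniform random number is compared with the cumulative predicted probabilities to select one availability level. The station names and probabilities are made up for illustration, and the helper names are hypothetical.

```python
import random

LEVELS = ["0-0.2", "0.2-0.4", "0.4-0.6", "0.6-0.8", "0.8-1"]


def draw_level(probabilities, u):
    """Map a uniform draw u in [0, 1) to an availability level index."""
    cumulative = 0.0
    for level, p in enumerate(probabilities):
        cumulative += p
        if u < cumulative:
            return level
    return len(probabilities) - 1  # guards against rounding in the sum


def predict_system_state(station_probs, seed):
    """Draw one availability level per station for a given random seed."""
    rng = random.Random(seed)
    return {s: LEVELS[draw_level(p, rng.random())] for s, p in station_probs.items()}


# Made-up predicted probabilities for two hypothetical stations.
station_probs = {
    "station_A": [0.50, 0.20, 0.15, 0.10, 0.05],
    "station_B": [0.05, 0.10, 0.20, 0.25, 0.40],
}
print(predict_system_state(station_probs, seed=1))
print(predict_system_state(station_probs, seed=2))  # another realization may differ
```

Repeating the exercise over many seeds would give a distribution of system states rather than a single snapshot, which may be more informative for rebalancing decisions.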

Figure 2 Variation of availability during the day around BIXI stations (prediction based on validation sample). Panels: 8 AM, 12 PM, 5 PM and 9 PM.

3.5 Conclusions and Future Work

Bicycle sharing systems (BSS) have been receiving increasing amounts of attention in

recent years as complementary modes of transportation in urban areas around the world. Earlier

research exploring BSS has mainly focused on arrivals and departures from a station. The current

study addresses this research gap by examining bicycle availability at a station as a direct metric

of analysis. Specifically, we estimate an ordered regression model - panel mixed generalized

ordered logit model - to accommodate exogenous variables and station level unobserved

factors. Data from Montreal’s BIXI system for the summer of 2012 is employed for model

estimation. The model estimation results are intuitive and along expected lines. Specifically, we

observe that BIXI is used more in the afternoon than in the morning, dense areas tend to be

associated with lower availability levels, and interactions of time of day with land use impact

availability. The estimated model is validated using a hold-out sample of data from the summer

of 2013. The model validation results clearly highlight the predictive capability of the proposed

model. Finally, to illustrate its applicability, we provide system state snapshots for the BIXI

system at 4 instances of the day. Such system state predictions serve as useful inputs for

undertaking rebalancing exercises.

Future work should investigate the level of data aggregation. The original data was

collected on a minute-per-minute basis. This is too fine a resolution for most practical purposes,

but whether the data should be aggregated at the 5-minute, 15-minute, half-hour or, as in this

study, hourly level is open to debate and should be investigated further. Another aspect of

interest is the influence of spatial spillover effects from neighboring stations in the system.

Finally, the predictive models need to be tied to optimization routines to improve routing

decisions for rebalancing trucks.

CHAPTER 3: TRANSFERABILITY OF ECONOMETRIC MODELS OF

BICYCLE SHARING DEMAND IN URBAN SETTINGS: A CASE STUDY

OF MONTREAL AND NEW YORK

3.1 Context

Bicycle sharing system usage patterns are influenced by complex interactions between

weather, temporal, bicycle infrastructure, land-use and built environment variables. Toward

enhancing our understanding of the impact of these factors on BSS usage, several statistical

frameworks have been developed. These frameworks are usually complex, requiring substantial

data processing and modeling proficiency at the BSS organization. Not all BSS operators

necessarily have the time, the resources, or the expertise necessary. As a result, BSS planners

often rely on historical trends to plan rebalancing operations. This approach has several limitations:

it ignores the influence of short-term volatile variables such as weather, of isolated

events (festivals, etc.), and of longer-term variables such as modifications in land-use.

Furthermore, in order to plan successful new systems, reliable prediction frameworks become

even more crucial, since no historical data is available. Given this context, developing models

and frameworks which can be transferred from one urban context to another would be a useful

contribution to the field of BSS planning.

As mentioned in the literature review presented in chapter 1, previous studies of BSS

demand model transferability have shown some promise, but no conclusive evidence of

successful model transfer. Most notably, Conway (2014) emphasized how models developed in

Washington did not perform well when applied to Minneapolis/St. Paul and the San Francisco

Bay Area. Sarkar et al. (2015) found some similarities between different urban contexts, but also

emphasized that the larger the system, the greater the range of station types and the greater the

spread of station behaviour. Quantifying the similarities and differences of BSS in Montreal and

New York – two large systems – is in line with these research efforts, and will allow us to shed

more light on these difficult questions.

In 2014, Faghih-Imani et al. presented detailed arrival and departure rates models for

BIXI-Montreal. These models feature detailed spatial and temporal resolutions, and account for

the possibility that flows occurring in successive time frames are more closely correlated than

flows occurring several hours apart. This chapter presents the results obtained when applying

these models to Montreal and New York, and emphasizes the similarities and differences

uncovered. The data used for Montreal is from the same dataset used in chapter 2. Data from

New York was obtained from Citi Bike’s website for September 2013, and augmented with

temporal, meteorological, bicycle infrastructure, land-use and built environment variables. The

data from Montreal and New York exhibit similarities related to several attributes (such as

temperature range, length of roads in 250m buffers around stations, number of other stations in

the buffer) while presenting some notable differences (such as more metro stations in New York,

more rain in Montreal, larger average station capacity in New York, fewer restaurants in station

buffers in Montreal). Section 3.2.1 presents these data in more detail.

3.2 Data and Methodology

3.2.1 Data Preparation and Comparison

Montreal

The data for Montreal was obtained from the same dataset as the one described in chapter

2, so it will not be presented again here in detail. Briefly, trip data was obtained from BIXI-

Montreal’s website for the summer of 2012, and augmented with temporal, meteorological,

bicycle infrastructure, land-use and built environment variables.

A few notable differences from the sample formation undertaken in chapter 2 include the

way rebalancing operations were accounted for and the sample size itself. Since rebalancing

operations were not differentiated from user demand in the raw data, spikes in flow rates above

the 99th percentile at the 5-minute aggregation level were considered rebalancing operations, and

the appropriate records were adjusted by setting them equal to the average arrival rates of the two

previous 5-minute records. Data was then further aggregated to an hourly level. Flow rates during

the night (1 AM to 6 AM) were very low, so these hours were aggregated into one

record. Two days were sampled at random for each station, resulting in a sample size of 16,400

records (20 hours × 2 days × 410 stations). The sample is distributed evenly across all four

months (22.4 to 26 percent per month), and across all seven days of the week (12.8 to 15.6

percent). The sample might seem small given the range of data available, but there are several

reasons for choosing a moderate sample size: first, run time of linear mixed models can be quite

significant for large samples; second, very large sample size can result in data over-fit and

inflated parameter significance.
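
The spike-replacement heuristic described in this paragraph can be sketched as follows. Several details are assumptions because the source does not state them: the percentile is computed with the nearest-rank method over the series passed in (whether the cutoff was station-specific or system-wide is not specified), already-cleaned values are used when averaging the two previous records, and the first two records of a series are left untouched. All names and data are hypothetical.

```python
import math


def nearest_rank_percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def smooth_spikes(flows, q=99):
    """Replace 5-minute flows above the q-th percentile.

    Each flagged record is set to the mean of the two preceding records.
    """
    cutoff = nearest_rank_percentile(flows, q)
    cleaned = list(flows)
    for t in range(2, len(cleaned)):
        if flows[t] > cutoff:
            cleaned[t] = (cleaned[t - 1] + cleaned[t - 2]) / 2
    return cleaned


# Made-up 5-minute arrival counts with one truck-sized spike at index 100.
flows = [1, 2, 3, 2] * 50
flows[100] = 40
cleaned = smooth_spikes(flows)
print(flows[100], "->", cleaned[100])
```

The cleaned 5-minute series would then be summed to the hourly level before the night-time hours are merged into a single record.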

New York

Data for New York was obtained from Citi Bike’s website

(https://www.citibikenyc.com/system-data), which provides data for every month of operation

since July 2013. The dataset included the origin and destination stations for each trip, as well as

the start and end time and the user type – member and non-member. The dataset also included

the coordinates of the 330 stations in New York’s BSS. The built environment variables such as

bicycle routes and subway stations were obtained from New York City open data

(https://nycopendata.socrata.com/); the weather information was obtained for Central Park from

the National Climatic Data Center; socio-demographic information was gathered from US 2010

census. The sample used for the analysis spanned the month of September 2013. As in the case

of Montreal, two days were sampled at random for each of the 330 stations. This resulted in a

final dataset of 15,840 records (24 hours × 2 days × 330 stations). The sample was well

distributed between weekdays and weekends (weekends account for 28 percent of the records, as expected).

Data Comparison

Since we are using data from two different cities, it is important to examine the two

datasets before presenting the results from the modelling exercise. Data for Montreal can be

found in Table 3, whereas data for New York can be found in Table 4.

As far as weather is concerned, Montreal and New York present similar values for

temperature and relative humidity. However, rainy weather is recorded roughly four times as

often in Montreal as in New York (9.7 versus 2.6 percent of records). In the case of temporal variables, data from

Montreal spans April-August 2012, whereas data for New York is from September 2013. The

proportions of weekdays and weekends are similar in both datasets. As for bicycle infrastructure,

stations in New York are considerably larger than in Montreal (mean capacity of 34.4 versus 19.5 bicycles) and

are surrounded by similar numbers of other stations. New York seems to have more bicycle

facilities – as measured by the length of facilities in the 250 meter buffer around the stations –

and similar lengths of streets in the station buffers. It should be noted that since streets in

Montreal are segmented between major and minor roads whereas New York only has one

measure of street length, it is difficult to compare these variables accurately. Finally, land-use

and built environment variables present notable differences. Stations in New York are more than

twice as likely as stations in Montreal to have a metro station in their 250 meter buffer (49.7

versus 21.7 percent). The job and population density variables are on a TAZ basis in Montreal,

which makes it difficult to compare to the open New York data. Similarly, the variables


accounting for the number of other commercial enterprises in the buffers also have different

units, making them difficult to compare. New York stations have more restaurants in their buffers on average (54.4 versus 24.0).
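The ratios quoted in this comparison can be checked directly from the sample means and percentages in Tables 3 and 4. The short Python sketch below does only that arithmetic; the dictionary keys are hypothetical labels chosen for this example.

```python
# Values copied from Table 3 (Montreal) and Table 4 (New York).
montreal = {"rainy_pct": 9.7, "capacity": 19.5, "metro_pct": 21.7, "restaurants": 24.0}
new_york = {"rainy_pct": 2.6, "capacity": 34.4, "metro_pct": 49.7, "restaurants": 54.4}

# Rain is more frequent in Montreal, so that ratio is Montreal over New York.
rain_ratio = montreal["rainy_pct"] / new_york["rainy_pct"]

# The other quantities are larger in New York: New York over Montreal.
ny_over_mtl = {
    key: new_york[key] / montreal[key]
    for key in ("capacity", "metro_pct", "restaurants")
}

print(round(rain_ratio, 2))                         # 3.73
print({k: round(v, 2) for k, v in ny_over_mtl.items()})
# {'capacity': 1.76, 'metro_pct': 2.29, 'restaurants': 2.27}
```

On these figures the mean station capacity in New York is about 76 percent higher than in Montreal, and the share of stations with a metro station in their buffer is about 2.3 times higher.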


Table 3 Descriptive Summary of sample characteristics: Montreal

Continuous Variables                                    Min       Max       Mean      Std. dev.
Temperature (°C)                                        5.9       33        20.9      5.2
Relative Humidity (%)                                   24        99        61.4      16.7
Elevation (m)                                           14.3      154.8     49.2      24.3
Station Capacity                                        7         65        19.5      8.0
Number of BIXI Stations in 250m Buffer                  1         8         2.2       1.5
Capacity of BIXI Stations in 250m Buffer                7         223       46.9      40.5
Length of Bicycle Facility in 250m Buffer (km)          0         2.5       0.7       0.51
Length of Minor Roads in 250m Buffer (km)               1.14      6.5       3.6       0.8
Length of Major Roads in 250m Buffer (km)               0         5.7       1.1       1.0
Length of Bus Lines in 250m Buffer (km)                 0         12.3      2.8       1.9
TAZ Job density (jobs/m² * 1000)                        0.07      4078.1    141.1     529
Number of Restaurants in 250m Buffer                    0         194       24.0      35.3
Number of Other Commercial Enterprises in 250m Buffer   0         1989      121.6     206.9
TAZ Population Density (people/m² * 1000)               1.01      187.8     59.4      31.6
Area of Parks in 250m Buffer (m²)                       0         194907    14551     26962
Walkscore                                               14        97        62.3      15.7

Categorical Variables                                   Percentage
Rainy Weather                                           9.7
Weekend                                                 26.5
Friday & Saturday Nights                                8.0
Metro Station in 250m Buffer                            21.7
Station in Downtown area                                17.1
Station in Oldport area                                 4.9
University in 250m Buffer                               17.1
School in 250m Buffer                                   40.7


Table 4 Descriptive Summary of sample characteristics: New York

Continuous Variables                                                                Min       Max       Mean      Std. dev.
Hourly Arrivals                                                                     0         63        4.4       5.9
Hourly Departures                                                                   0         63        4.3       5.8
Temperature (°C)                                                                    8.3       34.4      19.4      5.0
Relative Humidity (%)                                                               27        94.2      60.3      15.6
Station Capacity                                                                    3         67        34.4      10.8
Number of BIXI Stations in 250m Buffer                                              1         5         2.2       1.0
Capacity of BIXI Stations in 250m Buffer                                            10        203       78.3      43.0
Length of Bicycle Facility in 250m Buffer (km)                                      0         3.4       1.0       0.6
Street Length in 250m Buffer (km)                                                   1.3       8.4       4.4       1.1
Employment density (jobs/m² * 1000)                                                 0         432.5     55.8      53.8
Number of Restaurants in 250m Buffer                                                0         545       54.4      92.2
Density of Other Commercial Enterprises in 250m Buffer (establishments/m² * 1000)   0         10.1      2.7       1.9
Population Density (people/m² * 1000)                                               0.01      67.2      24.9      14.7

Categorical Variables                                                               Percentage
Rainy Weather                                                                       2.6
Weekend                                                                             28.0
Friday & Saturday Nights                                                            7.7
Metro Station in 250m Buffer                                                        49.7


3.2.2 Methodology

Since we are using panel data, simple linear regression techniques are not appropriate for

our analysis. Instead, we use a multilevel linear model that explicitly accounts for flows that

originate at the same station. It should be noted that in the absence of these station-specific

effects, which arise from the repeated observations for each station, the model collapses to a simple linear

regression framework.

The dependent variable under consideration (the arrival or departure rate normalized by

station capacity, with a separate model for each) is modelled using a linear regression framework which can

be expressed, in its most general form, in the following way:

$$\gamma_{qdt} = \beta X + \varepsilon$$

With:

q = 1, 2, 3, … : station index

d = 1, 2, 3, … : daily index

t = 1, 2, 3, … : hourly index

$\gamma_{qdt}$ is the normalized arrival or departure rate, $X$ an L×1 column vector of attributes, $\beta$ the

coefficients (L×1), and $\varepsilon$ the error term, assumed normally distributed across the dataset.

It should be noted the error term can be sub-divided into three unobserved factors: a

station component, a day component, and a time-of-day component. Given the sample size and

number of independent variables considered, it would be too computationally intensive to

estimate the combined influence of all three aspects simultaneously. Hence we consider station

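and time-of-day effects to be related. In this structure, each Station-Day combination contains 24

hourly records for New York (20 for Montreal, where the night hours are merged), resulting in a total of 660 Station-Day groups for New York (2 days × 330 stations) and 820 for Montreal (2 days × 410 stations). Estimating a full covariance matrix (24 × 24 or 20 × 20)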

would be burdensome, and would be unlikely to yield useful insights. Thus we parameterize the


covariance matrix (Ω). In order to estimate a parsimonious specification, we assume a first-order

autoregressive moving average correlation structure with three parameters:

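$$
\Omega = \sigma^2
\begin{pmatrix}
1 & \varphi\rho & \varphi\rho^2 & \cdots & \varphi\rho^{19} \\
\varphi\rho & 1 & \cdots & \cdots & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
\varphi\rho^{19} & \cdots & \cdots & \cdots & 1
\end{pmatrix}
$$

With:

σ² = error variance of ε

φ = common correlation factor across time periods

ρ = dampening parameter

As written, with ρ^19 as the highest power, the matrix corresponds to the 20 hourly records of a Montreal Station-Day; with 24 hourly records the same structure extends to ρ^23.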

If the three parameters listed above are significant, they highlight the impact of station-specific

effects on the dependent variables.
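To make the structure of Ω concrete, the following Python sketch builds the matrix exactly as it is written above: σ² on the diagonal and σ²·φ·ρ^|i−j| elsewhere. The function name is hypothetical. The numerical values are the Montreal arrival-rate estimates from Table 5, used only as an illustration, and treating the reported σ entry as the variance σ² is an assumption.

```python
def arma11_covariance(sigma2, phi, rho, n):
    """Return the n x n matrix with sigma2 on the diagonal and
    sigma2 * phi * rho**|i - j| off the diagonal."""
    return [
        [sigma2 if i == j else sigma2 * phi * rho ** abs(i - j) for j in range(n)]
        for i in range(n)
    ]


# Illustrative values only (Montreal arrival-rate model, Table 5).
# Assumption: the reported sigma entry is used here as sigma^2.
omega = arma11_covariance(sigma2=0.0256, phi=0.3546, rho=0.8928, n=20)

# Implied correlation between two records of the same station-day.
lag1_corr = omega[0][1] / omega[0][0]     # phi * rho
lag19_corr = omega[0][19] / omega[0][0]   # phi * rho**19

print(len(omega), len(omega[0]))
print(round(lag1_corr, 3), round(lag19_corr, 3))
```

Three parameters therefore describe all 190 distinct off-diagonal entries of a 20 × 20 matrix, which is the parsimony argument made in the text.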

Model estimation was carried out in SPSS using the Restricted Maximum Likelihood (REML)

approach, which differs slightly from the Maximum Likelihood (ML) approach in that

REML estimates the parameters by computing the likelihood function on a transformed dataset.

For additional details concerning model development, the reader is referred to Faghih-Imani et

al. (2014).
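For readers unfamiliar with REML, the criterion being maximized can be written in its standard textbook form as shown below. This is a general statement of the method, not a formula taken from the thesis or from SPSS. In this block $y$ stacks all observations, $X$ denotes the full N × L design matrix (not the single-record attribute vector used earlier), and $V(\theta)$ is assumed to be block-diagonal with one Ω per Station-Day, where $\theta = (\sigma^2, \varphi, \rho)$.

```latex
\ell_{R}(\theta) = -\tfrac{1}{2}\Big[
    \log\lvert V(\theta)\rvert
  + \log\lvert X^{\top} V(\theta)^{-1} X\rvert
  + (y - X\hat{\beta})^{\top} V(\theta)^{-1} (y - X\hat{\beta})
\Big] + \text{const},
\qquad
\hat{\beta} = \big(X^{\top} V(\theta)^{-1} X\big)^{-1} X^{\top} V(\theta)^{-1} y
```

Dropping the middle term, $\log\lvert X^{\top} V(\theta)^{-1} X\rvert$, gives the profiled ML log-likelihood up to a constant, which is the sense in which the two approaches differ only slightly.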

3.3 Results

Full results are available in Tables 5 and 6. The following sections provide brief

commentary on how different categories of variables compare across the Montreal and New York

BSS.

3.3.1 Model Fit Measures

When examining the Log-Likelihood (LL) values of the models, it appears that for

Montreal, the arrival and departure rate models perform similarly, with LL values of

-16623.1 and -16102.2, respectively. On the other hand, for New York, the departure rate model

far outperforms the arrival rate model, with LL values of -14826.7 and -17264.1,

respectively. This wide disparity is worth noting and should be investigated further.

3.3.2 Weather

The results for weather variables are consistent across arrival and departure rate models

for Montreal and New York. Higher temperature is associated with higher rates of arrival and

departure, whereas relative humidity and rainy weather are associated with lower utilization

rates. These results make intuitive sense insofar as people tend to prefer biking when the

weather is nice.
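Because the dependent variable is normalized by station capacity, the temperature coefficients can be translated back into bicycles per hour. The Python sketch below does this for a 10 °C increase, using the arrival-rate coefficients from Tables 5 and 6 and the mean station capacities from Tables 3 and 4. It assumes the normalized rate is simply hourly arrivals divided by capacity and evaluates the effect at an average-sized station; the function name is hypothetical.

```python
def bikes_per_hour(coefficient, delta, capacity):
    """Change in hourly flow implied by a coefficient on the
    capacity-normalized rate, for a change `delta` in the covariate."""
    return coefficient * delta * capacity


# Arrival-rate temperature coefficients and mean capacities from the tables.
montreal = bikes_per_hour(coefficient=0.0048, delta=10, capacity=19.5)
new_york = bikes_per_hour(coefficient=0.0028, delta=10, capacity=34.4)

print(round(montreal, 3), round(new_york, 3))  # 0.936 0.963
```

Under these assumptions, a 10 °C increase corresponds to roughly one additional arrival per hour at an average station in either city, even though the normalized coefficients differ.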

3.3.3 Temporal

Temporal variables highlight some interesting differences between BSS use in Montreal

and in New York. Whereas in Montreal the weekend sees a decrease in arrival and departure

rates, this trend is reversed in New York. This suggests that in Montreal the system is used

primarily for commuting purposes during the week, whereas this is not the case in New York.

This could be explained by the fact that New York is a prominent tourist destination, which

could result in increased use by tourists on weekends. This difference in weekend usage patterns

is also picked up by the Friday and Saturday night variable. In Montreal, this coefficient is

positive, signaling increased usage during those time periods. In New York, this variable is not

statistically significant, suggesting that Friday and Saturday evenings do not see a significant

variation in usage pattern when compared to the rest of the week. Finally, the AM, Midday and

PM variables show clear commuting patterns in Montreal, but not in New York.
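The significance statements above can be reproduced from the t-statistics reported for the Friday and Saturday night variable in Tables 5 and 6. The Python sketch below flags each estimate against a critical value of 1.96. That threshold is an assumption (the usual two-sided 5 percent level under a large-sample normal approximation); the thesis does not state the level it used.

```python
CRITICAL_T = 1.96  # assumed two-sided 5% threshold, not stated in the thesis

# t-statistics for "Friday & Saturday Nights" from Tables 5 and 6.
t_stats = {
    ("Montreal", "arrival"): 10.218,
    ("Montreal", "departure"): 12.215,
    ("New York", "arrival"): 1.353,
    ("New York", "departure"): 0.523,
}

significant = {key: abs(t) > CRITICAL_T for key, t in t_stats.items()}

for (city, model), flag in significant.items():
    print(f"{city:9s} {model:10s} significant: {flag}")
```

Both Montreal estimates clear the threshold and both New York estimates fall below it, which matches the contrast described in the text.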


3.3.4 Bicycle Infrastructure

The influence of surrounding stations is similar in both cities, with a high density of

stations associated with increased flows, whereas an increase in the capacity of neighbouring

stations is linked to lower flows. In Montreal, stations witness increased flows when surrounded

by streets with more bicycle lanes, whereas the variable capturing this behaviour is not

statistically significant when applied to New York. This is somewhat surprising since, as

mentioned in section 3.2.1, our data suggest there are more bicycle facilities in New York than in

Montreal. However, this lack of significance could be explained by differences in the physical

layout and perceived safety of bicycle lanes in both cities, and deserves future investigation. The

influence of surrounding street length is difficult to assess, since the data for Montreal is

segmented into minor and major roads, whereas New York data only features one variable. For

Montreal, increased length of minor roads is associated with increased utilization of BIXI, whereas

a greater length of major roads is associated with decreased BIXI utilization. For New York, an increase in overall

street length is associated with decreased bicycle flows.

3.3.5 Land-use and Built Environment

Results show several similarities in the influence of land-use around stations on bicycle

arrival and departure rates. Higher population density, the presence of a metro station and a larger number of

restaurants in the buffer are all associated with higher utilization of stations. In the case of

restaurants, this is especially true in the evening. In both cities, interacted variables of job density

with AM and PM periods show clear commuting trends for arrival rates – positive in the morning

and negative in the afternoon. However, these trends vanish when looking at departure rates. In

Montreal, job density is associated with higher departure rates both in the AM and the PM. In

New York, these variables are not statistically significant when applied to departure rates. These


trends may be explained by the fact that regular users are more likely to use the bikes for morning

and afternoon trips, whereas occasional users are less likely to use the bikes early in the morning,

and their increased presence in the afternoon masks the usage of commuters.

One can also notice some city-specific trends in the data, specifically with respect to the

presence of commercial establishments around stations. In Montreal, surrounding shops are

linked to decreased flows both in the afternoon and the evening. In New York, shops are strongly

linked to increased flows in the afternoon and decreased flows at night.


Table 5 Model Estimation Results: Montreal

                                                                Arrival Rate                Departure Rate
Parameter                                                       Coefficient   t-statistic   Coefficient   t-statistic
Intercept                                                       0.0784        3.066         0.0584        2.271
Meteorological
Temperature                                                     0.0048        8.829         0.0047        8.576
Relative Humidity                                               -0.0013       -8.556        -0.0012       -7.765
Rainy Weather                                                   -0.0035       -0.697        -0.0124       -2.457
Temporal
Weekend                                                         -0.0451       -7.031        -0.0506       -7.838
AM                                                              -0.0259       -5.982        0.0548        11.768
Midday                                                          -0.0186       -4.078        0.0065        1.418
PM                                                              0.0734        15.042        0.0526        10.824
Friday & Saturday Nights                                        0.0608        10.218        0.0735        12.215
Bicycle Infrastructure
Number of BIXI Stations in 250m Buffer                          0.0254        4.923         0.0241        4.662
Capacity of BIXI Stations in 250m Buffer                        -0.0011       -5.581        -0.0010       -5.206
Length of Bicycle Facility in 250m Buffer                       0.0342        5.911         0.0361        6.200
Length of Minor Roads in 250m Buffer                            0.0110        2.645         0.0112        2.668
Length of Major Roads in 250m Buffer                            -0.0173       -5.224        -0.0189       -5.659
Land Use and Built Environment
Metro Station in 250m Buffer                                    0.0202        2.762         0.0181        2.465
TAZ Job density * AM                                            0.0607        10.354        0.0142        2.036
TAZ Job density * PM                                            -0.0230       -3.338        0.0197        2.875
Number of Restaurants in 250m Buffer                            0.0004        3.691         0.0005        4.276
Number of Restaurants in 250m Buffer * AM                       --            --            -0.0007       -6.504
Number of Restaurants in 250m Buffer * PM                       0.0005        3.459         0.0006        5.844
Number of Other Commercial Enterprises in 250m Buffer * PM      -0.0001       -4.343        --            --
Number of Other Commercial Enterprises in 250m Buffer * Night   -0.0001       -5.246        -0.0001       -3.201
TAZ Population Density                                          0.1603        1.804         0.1613        1.805
University in 250m Buffer * AM                                  0.0228        2.780         -0.0352       -4.052
University in 250m Buffer * PM                                  -0.0367       -4.253        --            --
ARMA Correlation Parameters
σ                                                               0.0256        66.613        0.0262        67.282
ρ                                                               0.8928        114.741       0.8942        105.994
φ                                                               0.3546        35.216        0.3459        33.982


Table 6 Model Estimation Results: New York

                                                                Arrival Rate                Departure Rate
Parameter                                                       Coefficient   t-statistic   Coefficient   t-statistic
Intercept                                                       0.0811        2.613         0.0938        3.226
Meteorological
Temperature                                                     0.0028        3.663         0.0021        2.886
Relative Humidity                                               -0.0010       -5.072        -0.0010       -4.991
Rainy Weather                                                   -0.0277       -3.587        -0.0378       -4.471
Temporal
Weekend                                                         0.0198        2.065         0.0207        2.332
AM                                                              0.0649        9.330         0.1019        13.921
Midday                                                          0.0773        13.080        0.0817        13.540
PM                                                              0.0986        12.732        0.0789        9.725
Friday & Saturday Nights                                        0.0134        1.353         0.0052        0.523
Bicycle Infrastructure
Number of BIXI Stations in 250m Buffer                          0.0325        3.437         0.0238        2.726
Capacity of BIXI Stations in 250m Buffer                        -0.0010       -4.457        -0.0008       -3.489
Length of Bicycle Facility in 250m Buffer                       0.0004        0.051         -0.0010       -0.138
Street Length in 250m Buffer                                    -0.0117       -2.848        -0.0116       -3.033
Land Use and Built Environment
Metro Station in 250m Buffer                                    0.0133        3.015         0.0144        3.519
Employment density * AM                                         0.7390        9.230         0.0446        0.533
Employment density * PM                                         -0.4946       -2.673        -0.0431       -0.222
Number of Restaurants in 250m Buffer                            0.0003        6.951         0.0003        7.568
Number of Restaurants in 250m Buffer * AM                       0.0001        1.475         -0.0001       -1.268
Number of Restaurants in 250m Buffer * PM                       0.0003        6.950         0.0002        5.550
Number of Other Commercial Enterprises in 250m Buffer * PM      13.9214       2.648         15.0814       2.734
Number of Other Commercial Enterprises in 250m Buffer * Night   -2.1323       -1.027        -8.6761       -4.226
Population Density                                              0.9195        2.999         0.8795        3.099
ARMA Correlation Parameters
σ                                                               0.0405        47.444        0.0390        52.205
ρ                                                               0.8375        160.713       0.8314        143.155
φ                                                               0.7202        115.696       0.6355        85.493


3.4 Conclusions and Future Work

Montreal and New York are very different cities, but some common traits are apparent

when reviewing the datasets presented in this chapter. These similarities and differences are also

present in the model results, with some of our findings reinforcing the hypothesis of

heterogeneous behaviours of large BSS while others point to very similar responses to some

variable types. Specifically, the largely different weekend usage patterns uncovered by our

analysis suggest that city-specific factors are significant. However, several variables were

associated with similar outcomes in both cities, such as weather variables, bicycle infrastructure,

and several land-use attributes.

Overall, the results presented in this study are promising for the development of

transferable models of bicycle flows in urban areas. The majority of variables included in the

Montreal model remained significant when applied to New York data, which implies that overall

model structure is transferable, even though the coefficients themselves might not be. However,

practical problems associated with obtaining consistent data for different areas should not be

overlooked. It is also important to note that the scope of our analysis was limited to two cities

located in Eastern North America. How systems from different geographies or sizes compare

remains a largely open research question.


CHAPTER 4: CONCLUSION

4.1 Significant Contributions

Despite the growth of fourth-generation demand-responsive multimodal BSS in recent

years, the literature on these systems is relatively sparse. Most studies of the drivers of BSS

demand and usage patterns do not feature high spatial and temporal resolutions. Furthermore,

very few studies are concerned with bicycle availability at stations, focusing instead on arrival

and departure rates. The first major contribution of this thesis is to develop a detailed panel

mixed generalized ordered logit model of bicycle availability in Montreal’s BIXI system,

featuring detailed temporal and spatial resolutions, and accommodating exogenous variables

and station-level unobserved factors. The estimation results are mostly intuitive, and account for

the influence of several types of variables: temporal, meteorological, bicycle infrastructure,

land-use and built environment. Specifically, our results show that interactions of time of day with

land use impact availability, highlighting clear commuting trends. Overall, the system is used

more in the afternoon than in the morning, and dense areas tend to be associated with lower

availability levels. The validation exercise shows the model performs well, and the system state

predictions provided illustrate how useful this model can be to BSS operators when planning

rebalancing operations.

The second major contribution of this study is to provide a detailed comparison of two

large BSS by applying arrival and departure rate models developed in Montreal to data from

New York. The data show clearly that Montreal and New York are very different cities, even

though some variables present similar ranges. The results from our comparison reinforce the

findings of Sarkar et al. (2015) insofar as they emphasize the heterogeneity of large BSS

in terms of temporal patterns. However, our results also show similar outcomes in both


cities when it comes to weather variables, bicycle infrastructure, and several land-use attributes.

This suggests that large systems might be more similar than previous research suggested.

4.2 Future Research

The work presented in this thesis opens the way to several future research efforts. First,

the level of data aggregation should be investigated further. The original data for Montreal's BIXI

system was collected on a minute-by-minute basis, whereas the New York dataset contained

information about each trip made by users. This level of data resolution is too detailed for most

practical purposes, but whether the data should be aggregated to the 5-minute, 15-minute, half-hour

or, as in this thesis, hourly level is open to debate. Second, the influence of spatial

spillover effects from neighboring stations in the system needs to be incorporated into the models

employed. This is especially true when analyzing dense networks such as the ones in downtown

Montreal or Manhattan. Third, predictive models need to be tied to optimization routines to

improve routing decisions for rebalancing trucks. This opens opportunities for partnerships

between scholars and BSS operators, which could be useful to both parties. Fourth, the degree of

transferability of statistical frameworks developed in or applied to smaller networks and various

geographies should be explored. The scope of our analysis was limited to two large cities located

in Eastern North America. Finally, developing large consistent datasets spanning several cities

and containing meteorological, bicycle infrastructure, land-use and built-environment variables

would be very useful to researchers interested in BSS model transferability.
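The aggregation question raised above can be explored with very little code when trip-level data are available, as in the New York dataset. The Python sketch below counts departures per time bin for several candidate bin widths. The function name and the four trip start times are hypothetical and serve only to show the mechanics; they are not drawn from the Citi Bike data.

```python
from collections import Counter
from datetime import datetime


def bin_departures(start_times, bin_minutes):
    """Count trips per (date, bin index) for a bin width given in minutes."""
    counts = Counter()
    for t in start_times:
        minute_of_day = t.hour * 60 + t.minute
        counts[(t.date(), minute_of_day // bin_minutes)] += 1
    return counts


# Hypothetical trip start times at a single station.
trips = [
    datetime(2013, 9, 1, 8, 2),
    datetime(2013, 9, 1, 8, 17),
    datetime(2013, 9, 1, 8, 44),
    datetime(2013, 9, 1, 9, 5),
]

for width in (5, 15, 30, 60):
    counts = bin_departures(trips, width)
    print(width, len(counts), max(counts.values()))
```

Narrow bins produce many records with very small counts, while wide bins smooth over within-hour variation, which is the trade-off the choice of aggregation level has to balance.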


REFERENCES

Bachand-Marleau, J., Larsen, J., & El-Geneidy, A. M. (2011). Much-Anticipated Marriage of

Cycling and Transit: How Will It Work? Transportation Research Record: Journal of the

Transportation Research Board, 2247, 109–117. doi:10.3141/2247-13

Bachand-Marleau, J., Lee, B. H. Y., & El-Geneidy, A. M. (2012). Better Understanding of

Factors Influencing Likelihood of Using Shared Bicycle Systems and Frequency of Use.

Transportation Research Record: Journal of the Transportation Research Board, 2314,

66–71. doi:10.3141/2314-09

Bhat, C.R. (2001). "Quasi-Random Maximum Simulated Likelihood Estimation of the Mixed

Multinomial Logit Model", Transportation Research Part B, Vol. 35, No. 7, pp. 677-693.

Buck, D., Buehler, R., Happ, P., Rawls, B., Chung, P., & Borecki, N. (2013). Are Bikeshare

Users Different from Regular Cyclists?: A First Look at Short-Term Users, Annual

Members, and Area Cyclists in the Washington, D.C., Region. Transportation Research

Record: Journal of the Transportation Research Board, 2387, 112–119.

doi:10.3141/2387-13

Cervero, R., Caldwell, B., & Cuellar, J. (2013). Bike-and-ride: build it and they will come.

Journal of Public Transportation, 16.4, 83-105.

Citi Bike (2015) System data. Retrieved from https://www.citibikenyc.com/system-data

Conway, M. W. (2014). Predicting the Popularity of Bicycle Sharing Stations: An

Accessibility-Based Approach Using Linear Regression and Random Forests. Retrieved from

http://www.indicatrix.org/publications/2014/Conway-Bikeshare-Accessibility.pdf

DeMaio, P. (2009). Bike-sharing: History, Impacts, Models of Provision, and Future. Journal of

Public Transportation. Vol. 12, pp. 41-56.


Dutzik, T., and Baxandall, P. (2013). A new direction: Our changing relationship with driving

and the implications for America’s future. U.S. PIRG

Eluru N. (2013). "Evaluating Alternate Discrete Choice Frameworks for Modeling Ordinal

Discrete Variables," Accident Analysis & Prevention, 55 (1), pp. 1-11

Eluru, N., C.R. Bhat, and D.A. Hensher (2008). "A Mixed Generalized Ordered Response Model

for Examining Pedestrian and Bicyclist Injury Severity Level in Traffic Crashes",

Accident Analysis & Prevention, Vol. 40, No.3, pp. 1033-1054

Faghih-Imani A., N. Eluru, A. El-Geneidy, M. Rabbat, and U. Haq, (2014). How does land-use

and urban form impact bicycle flows: Evidence from the bicycle-sharing system (BIXI)

in Montreal, Journal of Transport Geography, Vol. 41, pp. 306-314.

Fishman, E., Washington, S., Haworth, N. (2014). Bike share’s impact on car use: Evidence

from the United States, Great Britain, and Australia, Transportation Research Part D:

Transport and Environment, Volume 31, August 2014, Pages 13-20, ISSN 1361-9209

Fricker, C., & Gast, N. (2014). Incentives and redistribution in homogeneous bike-sharing

systems with stations of finite capacity. EURO Journal on Transportation and Logistics,

1–31. doi:10.1007/s13676-014-0053-5

Goodman, A., Green, J., & Woodcock, J. (2014). The role of bicycle sharing systems in

normalising the image of cycling: An observational study of London cyclists. Journal of

Transport & Health, 1(1), 5–8. http://doi.org/10.1016/j.jth.2013.07.001

Hampshire, R., Lavanya, M., & Eluru, N. (2013). An Empirical Analysis of Bike Sharing Usage

and Rebalancing: Explaining Trip Generation and Attraction from Revealed Preference

Data. Technical Paper, Heinz College, Carnegie Mellon University

Kaufman, S. M., Gordon-Koven, L., Levenson, N. & Moss, M.L. (2015). Citi Bike: The First


Two Years. The Rudin Center for Transportation Policy and Management.

Kloimüllner, C., Papazek, P., Hu, B., & Raidl, G. R. (2014). Balancing Bicycle Sharing Systems:

An Approach for the Dynamic Case. Retrieved July 28, 2014 from

https://128.131.166.141/publications/bib/pdf/kloimuellner-14.pdf

Krykewycz, G. R., Puchalsky, C. M., Rocks, J., Bonnette, B., & Jaskiewicz, F. (2010). Defining

a Primary Market and Estimating Demand for Major Bicycle-Sharing Program in

Philadelphia, Pennsylvania. Transportation Research Record: Journal of the

Transportation Research Board, 2143, 117–124. doi:10.3141/2143-15

Meddin, R. and DeMaio, P., 2015. The Bike-Sharing World Map. Retrieved 13th July, 2015

from http://www.bikesharingworld.com

Murphy, E., & Usher, J. (2015). The Role of Bicycle-sharing in the City: Analysis of the Irish

Experience, International Journal of Sustainable Transportation, Vol. 9, pp. 116-125

Nair, R., Miller-Hooks, E., Hampshire, R. C., & Bušić, A. (2013). Large-Scale Vehicle Sharing

Systems: Analysis of Vélib’. International Journal of Sustainable Transportation, 7(1),

85–106. doi:10.1080/15568318.2012.660115

National Climatic Data Center (2015). Daily Summaries Station Details: Central Park. Retrieved

from http://www.ncdc.noaa.gov/cdoweb/datasets/GHCND/stations/

GHCND:USW00094728/detail

NHTS (2009). U.S. Department of Transportation, Federal Highway Administration, National

Household Travel Survey 2009. URL: http://nhts.ornl.gov.

New York City (2015). New York City Open Data. Retrieved from

https://nycopendata.socrata.com/

Public Bike System Company (2010). What we’ve achieved. Accessed April 17, 2014


http://www.publicbikesystem.com/what-we-achived/case-studies-info/?id=1

Rixey, R. A. (2013). Station-Level Forecasting of Bikesharing Ridership: Station Network

Effects in Three U.S. Systems. Transportation Research Record: Journal of the

Transportation Research Board, 2387, 46–55. doi:10.3141/2387-06

Sarkar, A., Lathia, N., & Mascolo, C. (2015). Comparing cities’ cycling patterns using online

shared bicycle maps. Transportation, 42(4), 541–559.

http://doi.org/10.1007/s11116-015-9599-9

Shaheen, S. A., Guzman, S., & Zhang, H. (2010). Bikesharing in Europe, the Americas, and

Asia: Past, Present, and Future. Transportation Research Record: Journal of the

Transportation Research Board, 2143, 159–167. doi:10.3141/2143-20

U.S. Census Bureau (2015). U.S. 2010 Census. Retrieved from

http://www.census.gov/2010census/

Van Lierop, D., Grimsrud, M., & El-Geneidy, A. (2013). Breaking into bicycle theft: Insights

from Montreal, Canada. Forthcoming International Journal of Sustainable Transportation

Wang, X., Lindsey, G., Schoner, J., Harrison, A. (2012). Modeling bike share station activity: the

effects of nearby businesses and jobs on trips to and from stations. Paper presented at the

92nd Transportation Research Board Annual Meeting 2012, Washington, DC

Yasmin. S., and N. Eluru (2013), "Evaluating Alternate Discrete Outcome Frameworks for

Modeling Crash Injury Severity," Accident Analysis & Prevention, 59 (1), pp. 506-52