Constrained Extended Kalman Filter: an Efficient
Improvement of Calibration for Dynamic Traffic Assignment
Models
by
Haizheng Zhang
B. Eng. Automation, Tsinghua University, 2013
Submitted to the Department of Civil and Environmental Engineering and the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degrees of
Master of Science in Transportation
and
Master of Science in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2016
© Massachusetts Institute of Technology 2016. All rights reserved.
Author: Department of Civil and Environmental Engineering, Department of Electrical Engineering and Computer Science, May 19, 2016
Certified by: Moshe E. Ben-Akiva, Edmund K. Turner Professor of Civil and Environmental Engineering, Thesis Supervisor
Certified by: Francisco C. Pereira, Professor, Technical University of Denmark, Thesis Supervisor
Certified by: Jacob K. White, Cecil H. Green Professor of Electrical Engineering and Computer Science, Thesis Reader
Accepted by: Leslie A. Kolodziejski, Chair, Department Committee on Graduate Students
Accepted by: Heidi Nepf, Chair, Graduate Program Committee
Constrained Extended Kalman Filter: an Efficient
Improvement of Calibration for Dynamic Traffic Assignment
Models
by
Haizheng Zhang
Submitted to the Department of Civil and Environmental Engineering and the Department of Electrical Engineering and Computer Science
on May 19, 2016, in partial fulfillment of the requirements for the degrees of
Master of Science in Transportation
and
Master of Science in Electrical Engineering and Computer Science
Abstract
The calibration (estimation of inputs and parameters) of dynamic traffic assignment (DTA) systems is crucial to traffic prediction accuracy, and thus critical to global traffic management applications that aim to reduce traffic congestion. To support real-time traffic management, the DTA calibration algorithm should also be online, in the sense of: 1) estimating inputs and parameters in a time interval based only on data up to that time; 2) performing calibration faster than real-time data generation. Generalized least squares (GLS) methods and Kalman filter-based methods have proved useful in online calibration.
However, in the literature, the road networks selected to test online calibration algorithms are usually simple and have a small number of parameters. Their effectiveness when applied to high dimensions and large networks is therefore not well established. In this thesis, we implemented the extended Kalman filter (EKF) and tested it on the Singapore expressway network with synthetic data that replicate real-world demand levels. The EKF diverges, and the DTA system performs even worse than when no calibration is applied. The problem lies in the truncation process in DTA systems: when estimated demand values are negative, they are truncated to 0, so the overall demand is overestimated. To overcome this problem, this thesis presents a modified EKF method, called constrained EKF. Constrained EKF solves the problem of overestimating the overall demand by imposing constraints on the posterior distribution of the state estimators and obtaining the maximum a posteriori (MAP) estimates within the feasible region. An algorithm that iteratively adds equality constraints, followed by the coordinate descent method, is applied to obtain the MAP estimates. In our case study, this constrained EKF implementation added less than 10 seconds of computation time and improved on the EKF significantly. Results show that it also outperforms GLS, probably because its inherent covariance update procedure adapts to changes better than the fixed covariance matrix setting in GLS.
The contributions of this thesis include: 1) applying online calibration algorithms to a large network with relatively high-dimensional parameters, 2) identifying drawbacks of a widely applied solution for online DTA calibration in a large network, 3) improving an existing algorithm from non-convergence to great performance, 4) proposing an efficient and simple method for the improved algorithm, 5) attaining better performance than an existing benchmark algorithm.
Thesis Supervisor: Moshe E. Ben-Akiva
Title: Edmund K. Turner Professor of Civil and Environmental Engineering
Thesis Supervisor: Francisco C. Pereira
Title: Professor, Technical University of Denmark
Thesis Reader: Jacob K. White
Title: Cecil H. Green Professor of Electrical Engineering and Computer Science
Acknowledgments
First and foremost, I would like to express my sincere gratitude to my advisor, Professor
Moshe Ben-Akiva. Your invaluable guidance, immense expertise and continuous
support made my MIT graduate study colorful and memorable. I learned a lot from
you. It is my honor to know you.
I express my deep gratitude to Professor Francisco Pereira, for your insightful
advice and extraordinary support through this tough but worthy journey. Thanks
to Professor Constantinos Antoniou for being a great source of knowledge and help.
Thanks also go to Dr. Ravi Seshadri, for your inspiration and guidance through my
master's research. All of you taught me how to be a better researcher. Many thanks
to Professor Jacob White, for your generous help with my dual master's degree in
EECS and invaluable suggestions about this thesis.
I would like to thank Katherine Rosa, the research administrator, and Eunice
Kim, the lab manager of our ITS Lab. Thanks for your help and patience in every
detail. I would like to thank Kiley Clapper and Janet Fischer, the administrators of
the departments of CEE and EECS for helping me all along.
Thanks to my roommates (and ex-roommate) Tianli Zhou, Hongyi Zhang, and
Chao Zhang. It has been enjoyable to share the journey with you. Special thanks
to Hongyi for being a good source of machine learning knowledge; it is always helpful to
discuss research with you. Special thanks to Yuelong Su, Lu Lu and Runmin Xu,
whom I knew in my undergrad at Tsinghua University. All of you are great examples
that lead me to who I am now. My graduate life would not be so colorful without you.
Thanks also go to my friends I met and knew at MIT, for being friendly, considerate
and helpful, which made my life at MIT much easier. Thank you, Yin Wang, Yan
Zhao, Linsen Chong, Nathaniel Bailey, Xiao Fang, Weikun Hu, Chiwei Yan, Xiang
Song, Shi Wang, Yundi Zhang, Yinjin Lee, Monique Stinson, Mazen Danaf, Bilge
Atasoy, Nathanael Cox, Yuhan Jia, Na Wu, Manxi Wu, Li Jin, Jeffrey Liu, Rui Sun,
He Sun, Lijun Sun, Yan Leng, Zhan Zhao and Yiwen Zhu.
I am grateful for the funding from Singapore-MIT Alliance for Research and Technology
(SMART) to support my study, research and trips to Singapore. Thanks also
go to the people I knew at SMART, including Kakali Basak, Yan Xu, Dong Wang,
Stephen Robinson, Yang Lu, Vinh-An Vu (Jenny). Every trip to Singapore is a unique
experience because of you.
I would also like to thank Zhiwei Lin, for all your love and support. Last but not
least, my deepest gratitude goes to my parents for your endless love, encouragement
and constant support. This work is dedicated to you.
Contents
1 Introduction and Background
1.1 Motivation
1.2 Introduction to Dynamic Traffic Assignment
1.3 Thesis Motivation and Outline
2 Literature Review on Calibration for DTA
2.1 Offline Calibration: Generalized Least Squares Model
2.2 Online Calibration
2.2.1 State-Space Formulation
2.2.2 Optimization Formulation
2.3 Summary
3 Methodology: Constrained Extended Kalman Filter in DTA Calibration
3.1 General Problem Formulation
3.1.1 State-Space Formulation in Details
3.1.2 The Idea of Deviations
3.2 Extended Kalman Filter and Variants in DTA Calibration
3.2.1 Extended Kalman Filter
3.2.2 Finite Difference and FD-EKF
3.2.3 Simultaneous Perturbation and SP-EKF
3.2.4 Characteristics of EKF in Online Calibration for DTA
3.3 Constrained EKF
3.3.1 Motivation
3.3.2 Optimization Formulation for Constrained Kalman Filter Estimates
3.3.3 An Efficient Near-Optimal Algorithm for EKF with Bound Constraints in DTA Calibration
3.3.4 Coordinate Descent Algorithm with Near-Optimal Initialization
3.4 Summary
4 Case Study: Constrained EKF on Singapore Expressway Network
4.1 Model Overview
4.1.1 DynaMIT Overview
4.1.2 MITSIMLab Overview
4.1.3 DynaMIT and MITSIMLab Integration
4.2 Calibration Framework for Singapore Expressway Network
4.2.1 Simulation Scenario Overview
4.2.2 MITSIMLab and DynaMIT Integration: Data Flow
4.2.3 Calibration Settings for EKF
4.2.4 Calibration Evaluation Criteria
4.3 Results
4.3.1 EKF Performance
4.3.2 Constrained EKF Performance
4.3.3 Comparison with GLS
4.4 Summary
5 Conclusions
5.1 Summary
5.2 Future Research Directions
List of Figures
1-1 General DTA framework (source: Ben-Akiva et al., 2010a)
3-1 Directed Graph Model in Kalman Filtering Scheme
3-2 2-D Posterior PDF Contour and Different Estimators
4-1 DynaMIT Real-Time Framework (source: Ben-Akiva et al., 2010a)
4-2 Singapore Road Network (source: Google Maps, 2016)
4-3 Singapore Road Network in DynaMIT and MITSIMLab
4-4 DynaMIT and MITSIMLab Integration Workflow
4-5 Sample W-SPSA Calibrated Demand
4-6 Sample Gaussian Kernel Smoothed Demand
4-7 Estimated versus Observed Flow Counts: EKF
4-8 Flow Count RMSN versus Time: EKF
4-9 Estimated versus Observed Flow Counts: Constrained EKF
4-10 Flow Count RMSN versus Time: Constrained EKF
4-11 Sample Calibrated Demand by EKF (before truncation)
4-12 Sample Calibrated Demand by Constrained EKF
4-13 Flow Count RMSN versus Time: GLS versus Constrained EKF
4-14 Flow Counts Comparison: Constrained EKF (left) vs GLS (right)
List of Tables
4.1 Estimated OD flows (veh/5min) for OD Pairs Related to Sensor 528
Chapter 1
Introduction and Background
1.1 Motivation
Traffic congestion in urban areas has been a hot topic due to its negative temporal, economic and external impacts such as delays, wasted fuel, frustrated motorists and air pollution. It is an old and well-known problem caused by soaring travel demand and an insufficient increase in transportation infrastructure. From 1982 to 2014, the total length of public roads in the United States increased from 3,865,894 miles to 4,194,257 miles, an 8.49% increase. During the same timeframe, vehicle miles traveled (VMT) increased by 90.61%, from 1,595,010 million miles to 3,040,220 million miles (FHWA, 2014). Correspondingly, the congestion cost per auto commuter has increased sharply, from $400 (in 2014 dollars) in 1982 to $960 in 2014, according to the 2015 Urban Mobility Scorecard (Schrank et al., 2015). To make matters worse, congestion is expected to continue increasing, according to the same source. Although the annual congestion cost decreased from 2008 to 2011 due to the recession, urban areas have recently experienced the same challenges as in the early 2000s, for instance, a growing population and job market that contribute to congestion (Schrank et al., 2015).
The 2015 Urban Mobility Scorecard also recommends a balanced and diversified approach to reduce congestion, comprising more policies, programs, projects, flexibility, options and understanding. Traffic management plays an important role among the mixed solutions guided by this approach. One important application is route guidance for drivers provided by Traffic Management Centers (TMCs), for instance, Advanced Traveler Information Systems (ATIS) in the Federal Highway Administration (FHWA) of the US Department of Transportation. Route guidance aims to achieve global traffic control to reduce congestion and its impact on energy consumption, gas emissions, delays and frustration. In order to provide correct and reliable guidance, TMCs should have global information, insights and preferably prediction abilities. Dynamic traffic assignment (DTA) systems are considered one of the most promising categories of tools to estimate traffic states and make state predictions. Built upon accurate predictions, in which drivers' reactions to possible route guidance strategies are also considered, the best strategy can be selected to reduce congestion to the lowest possible level. In this thesis, we examine the basis of route guidance operations, i.e. the DTA systems.
1.2 Introduction to Dynamic Traffic Assignment
Traditionally, traffic assignment has been derived from transportation demand fore-
casting (typically the four-step model), which comprises traffic generation, traffic
distribution, mode choice and route assignment. It is a process where traffic demand
(usually represented by a static Origin-Destination matrix) is loaded onto the road
network (Barcelo, 2010). As a result, the traffic flows are computed for the links in
the road network. In contrast, Dynamic Traffic Assignment (DTA) emphasizes time-
varying properties, meaning the traffic demand and flows are time-dependent. This
allows the flexibility to accommodate varying traffic scenarios, where underlying
patterns of time and space are evolving (Mahmassani, 2001). DTA has evolved sub-
stantially since the late 1970s. It is an essential tool for estimating and predicting
dynamic traffic flows on road networks.
Various formulations and solution approaches to DTA have been introduced, both analytical and simulation-based. Analytical models express the DTA problem in mathematical formulations for a specific objective (e.g. user equilibrium (UE) or system optimal (SO)). Optimization algorithms are usually applied to solve the analytical DTA and obtain its inputs and parameters. However, the conciseness and accuracy with which analytical models replicate traffic flow dynamics hold only for small networks. The analytical formulation has to be simplified for the optimization problem to be solvable, so some traffic relationships (e.g. driver behaviors, congestion) cannot be fully captured (Peeta and Ziliaskopoulos, 2001; Balakrishna, 2006). Simulation is therefore treated as the best way to model traffic due to its efficiency and accuracy. Interest in simulation-based DTA methods has also grown recently because they offer the advantage of accurately modeling driving behavior and response to guidance. Thus, the utilization of simulation-based DTA is important for traffic estimation. In this thesis, our focus is on simulation-based DTA, and DTA in the following chapters refers to simulation-based DTA.
Figure 1-1: General DTA framework (source: Ben-Akiva et al., 2010a)
The general workflow of a simulation-based DTA system comprises the dynamic traffic management system and the demand and supply modules, as shown in Figure 1-1. The management system dictates and provides inputs for the demand and supply modules. Input parameters include Origin-Destination matrices, incident/event information, weather conditions, traffic control strategies, etc. The interaction between demand and supply is simulated by the DTA system. While the simulation is running, the traffic conditions (e.g. flows, travel times, route choice fractions, queues, vehicle trajectories) are measured and output in a timely manner.
Existing simulation-based DTA models fall into three categories according to the level of detail with which they represent traffic dynamics: microscopic, mesoscopic and macroscopic (Lu, 2013).
Macroscopic DTA models represent the least detailed traffic dynamics, in which
the traffic is modeled as fluid flows and individual vehicles are not quantified. Thus
they are the most computationally efficient models, particularly on large networks.
Example simulation-based DTA systems are METANET (Wang et al., 2001), EMME
(INRO, 2015), and Visum (PTV, 2015b).
Microscopic DTA models have the most detailed simulation granularity, where
driving behaviors and car interactions are modeled. Examples of existing systems
include PARAMICS (Smith et al., 1995), MITSIMLab (Ben-Akiva et al., 2010b),
AIMSUN2 (Barcelo and Casas, 2005), CORSIM/TSIS (FHWA, 2015), VISSIM (PTV,
2015a), and TransModeler (FHWA, 2016).
Mesoscopic models are a combination of macroscopic and microscopic models, with
the aim of balancing between efficiency and accuracy. Mesoscopic models usually have
detailed representation for individual vehicles on the demand side, but car interactions
are not modeled on the supply side. DTA system examples include DynaMIT (Ben-
Akiva et al., 2010a), DYNASMART (Mahmassani et al., 1998) and Dynameq (Mahut
and Florian, 2010).
In this thesis, we use both microscopic and mesoscopic models to exploit their respective advantages. A microscopic model is selected to represent the real world, because of its accuracy in replicating real scenarios. A mesoscopic model is selected as our DTA system, due to its balance between efficiency and accuracy.
1.3 Thesis Motivation and Outline
In Section 1.1, the motivation of DTA systems was discussed. We stated that the aim
of a DTA system is to provide traffic state estimation and prediction, which can be
used for global route guidance. For that purpose, the DTA system is usually followed
by route strategy generation. Then TMC will guide the drivers with those generated
strategies. Drivers obtain the guidance information, make decisions and change the
existing traffic flow patterns, which will be captured in the surveillance data. Those
data will again be fed into the DTA system for the latest traffic state estimation.
Among all the components in the strategy generation and evaluation loop, traffic
state estimation is crucial for the DTA system, since it is the basis of other steps.
Since traffic state estimates depend on the inputs and parameters of the DTA system,
input and parameter estimation is also a crucial step. This step is called calibration.
The calibration procedure based on the surveillance data is the focus of this thesis.
The strategy generation and modeling of drivers’ reactions are not in the scope of
this thesis.
The remainder of this thesis is structured as follows. Chapter 2 summarizes and comments on the existing algorithms for offline and online calibration, and the scope of this research is narrowed down to extended Kalman filter (EKF) based algorithms in the online category. Chapter 3 gives a more detailed description of the EKF formulations and algorithms. In addition, the drawbacks of applying EKF to DTA parameter estimation are demonstrated, particularly for parameters with natural bound constraints. Following that observation, constrained EKF, a modification of EKF, is proposed to overcome the drawbacks discovered. Then a specific algorithm for implementing the constrained EKF is presented. In Chapter 4, a case study of the Singapore expressway network is conducted and demonstrated. This experiment is performed with synthetic data generated using an existing microscopic simulator (MITSIMLab), which was already calibrated offline with existing real sensor data. We then compare the estimated traffic flow counts with their counterparts generated by MITSIMLab, followed by performance comparisons of EKF, the Generalized Least Squares method and constrained EKF. In Chapter 5, conclusions are drawn.
Chapter 2
Literature Review on Calibration
for DTA
The DTA systems were introduced in Chapter 1. However, a prerequisite for using DTA is its capability to produce reliable estimates and predictions of traffic states compared with the real world. Furthermore, the prediction ability of a DTA model usually depends on its estimation accuracy, so estimation accuracy is of great importance. No matter what models are applied in different DTA systems, parameter estimation, which is also called model calibration, is a crucial step. Usually we have a real-world traffic scenario, and we measure/observe some information about the traffic states (these data are called measurements/observations), for instance, traffic flow counts, average travel speeds and link travel times. Based on these data, we want to estimate the parameters such that the simulated measurements from the DTA models fit the data. This is the general calibration problem for DTA models. More concisely, the calibration problem in the DTA context is defined as follows. Given a set of initial values for various parameters (OD flows, route choice parameters, road capacities and speed-density relationships) and measurements (e.g. aggregate flows, speeds and densities), estimate those parameters such that the error between the simulated outputs and observed values is minimized (Antoniou, 2004).
The calibration problem can be classified in different ways since there are at least two dimensions along which to view it. For instance, categorized by input or parameter type, there are supply calibration and demand calibration, meaning the calibration of inputs and parameters on the supply side and the demand side, respectively. There are also offline calibration and online calibration, where data availability at each time step and computation time are important considerations in algorithms for the latter. In this thesis, we classify the calibration algorithms into offline and online, and summarize them in each category.
2.1 Offline Calibration: Generalized Least Squares
Model
The objective of the offline calibration, in general, is to estimate the parameters
such that the simulated outputs fit the observed measurements for an average traffic
scenario. Here average means the measurements are expected values over a long
period of time, for instance, over a month. Note that this is distinct from
estimating static traffic assignment (STA) over a month, because this average is over
the same time interval for all days in a month. By averaging, we hope to include
day to day demand fluctuations, weather, etc. into the measurements, thus offline
calibration on these measurements will yield the parameters for an average day in
that month.
In terms of mathematical formulation, Lu formulated the offline calibration frame-
work as an optimization problem, with the following notation (Lu, 2013):
• $h$: interval index, $h \in \mathcal{H} = \{1, 2, \ldots, H\}$, where $\mathcal{H}$ is the set of simulation intervals
• $G$: road network, $G = \{G_h \mid h \in \mathcal{H}\}$
• $x$: time-dependent DTA parameters, e.g., OD flows, $x = \{x_h \mid h \in \mathcal{H}\}$
• $x^a$: a priori time-dependent parameter values, $x^a = \{x^a_h \mid h \in \mathcal{H}\}$
• $\beta$: other time-invariant DTA parameters, e.g., supply model parameters
• $\beta^a$: a priori values of other DTA parameters
• $M^o$: observed sensor measurements, $M^o = \{M^o_h \mid h \in \mathcal{H}\}$
• $M^s$: simulated sensor outputs, $M^s = \{M^s_h \mid h \in \mathcal{H}\}$
Then the offline calibration problem can be formulated as:
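$$\min_{x,\beta}\; z(M^o, M^s, x, \beta, x^a, \beta^a) = \sum_{h=1}^{H} \left[ z_1(M^o_h, M^s_h) + z_2(x_h, x^a_h) \right] + z_3(\beta, \beta^a) \tag{2.1}$$
subject to:
$$M^s_h = f(x_1, \ldots, x_h; \beta; G_1, \ldots, G_h) \tag{2.2}$$
$$l_{x_h} \le x_h \le u_{x_h} \tag{2.3}$$
$$l_{\beta} \le \beta \le u_{\beta} \tag{2.4}$$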
where $f$ is an abstraction of the DTA model, which takes $x$, $\beta$ and $G$ as inputs and generates the simulated sensor outputs $M^s_h$; $z$ is a loss function that measures the goodness-of-fit between simulated sensor outputs and observed measurements; $z_1$ is a specific loss function that quantifies the discrepancy between simulated and observed measurements; $z_2$ and $z_3$ are loss functions that penalize the parameters for moving away from their a priori values. Parameters $x_h$ and $\beta$ have lower bounds $l_{x_h}$, $l_{\beta}$ and upper bounds $u_{x_h}$, $u_{\beta}$, respectively.
In general, z1, z2 and z3 work as weights to balance the objective so as to minimize
the measurement discrepancy or to constrain parameters locally.
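As an illustration, the structure of objective (2.1) and the bound constraints (2.3) and (2.4) can be sketched in Python. This is a minimal sketch and not the thesis implementation: the squared-error form of each loss term, the scalar weights `w1`, `w2`, `w3`, and all function names are assumptions made for the example, and the simulated outputs are passed in directly rather than produced by a DTA model.

```python
import numpy as np


def squared_error(a, b):
    """Sum of squared differences between two vectors (assumed loss form)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(d @ d)


def offline_objective(M_obs, M_sim, x, x_prior, beta, beta_prior,
                      w1=1.0, w2=1.0, w3=1.0):
    """Evaluate z = sum_h [z1 + z2] + z3 with weighted squared-error terms.

    M_obs, M_sim, x and x_prior are sequences indexed by interval h.
    """
    total = 0.0
    for h in range(len(M_obs)):
        total += w1 * squared_error(M_obs[h], M_sim[h])  # z1: measurement fit
        total += w2 * squared_error(x[h], x_prior[h])    # z2: stay near x^a_h
    total += w3 * squared_error(beta, beta_prior)        # z3: stay near beta^a
    return total


def within_bounds(value, lower, upper):
    """Check the box constraints lower <= value <= upper element-wise."""
    value = np.asarray(value, dtype=float)
    return bool(np.all(value >= lower) and np.all(value <= upper))
```

Raising `w1` relative to `w2` and `w3` favors fitting the measurements, while raising `w2` or `w3` keeps the parameters closer to their a priori values.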
Generalized Least Squares (GLS) is a linear regression model. It is widely used for OD estimation in the field of DTA calibration. The GLS model is one configuration of the optimization framework stated above. In this model, the function $f$ is a multiplication of the assignment matrix (Ashok, 1996) with the OD parameters, which models the traffic assignment procedure. The model becomes more complicated when we consider the impact of OD parameters in previous intervals, and is given by (Balakrishna, 2002):
$$f(x_1, \ldots, x_h) = \sum_{p=h-p'}^{h} A^p_h x_p \tag{2.5}$$
where $A^p_h$ is the assignment matrix that maps OD flows departing in interval $p$ to measurements in interval $h$, and $p'$ is the number of previous intervals considered.
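The sum in (2.5) can be sketched as follows. This is an illustrative sketch only: the dictionary layout `A[(p, h)]`, the function name, and the choice to skip intervals before 0 are assumptions for the example, and the assignment matrices are taken as given rather than derived from a simulator.

```python
import numpy as np


def simulated_counts(A, x, h, p_prime):
    """Sum A[(p, h)] @ x[p] over p = h - p_prime, ..., h.

    A maps a (departure interval p, measurement interval h) pair to an
    assignment matrix of shape (n_sensors, n_od); x[p] is the OD flow vector
    of interval p. Intervals before 0 are skipped (an assumption made here).
    """
    first = max(0, h - p_prime)
    n_sensors = np.asarray(A[(h, h)]).shape[0]
    counts = np.zeros(n_sensors)
    for p in range(first, h + 1):
        counts += np.asarray(A[(p, h)], dtype=float) @ np.asarray(x[p], dtype=float)
    return counts
```

With `p_prime = 0` only the flows departing in the current interval contribute; larger values let vehicles that departed earlier reach the sensors in interval `h`.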
As for the z1 function, it is defined as:
$$z_1(M^o_h, M^s_h) = (M^o_h - M^s_h)' \, \Omega_h^{-1} \, (M^o_h - M^s_h) \tag{2.6}$$
The covariance matrix $\Omega_h$ is defined for the error between simulated and observed measurements for interval $h$. The dimension of $\Omega_h$ is n-by-n, where n is the number of measurements. Similarly, covariance matrices for the parameter errors $x_h - x^a_h$ and $\beta - \beta^a$ can also be estimated and used to construct $z_2$ and $z_3$. The covariance matrices are usually estimated from the residuals of an Ordinary Least Squares (OLS) model on the same OD estimation problem. Specifically, OLS is a special case of GLS, in which the covariance of errors is assumed to be an identity matrix.
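The quadratic form in (2.6) and its OLS special case can be illustrated with a few lines of NumPy. The function name is hypothetical, and the sketch assumes the covariance matrix is invertible.

```python
import numpy as np


def gls_loss(m_obs, m_sim, omega):
    """Return (m_obs - m_sim)' Omega^{-1} (m_obs - m_sim)."""
    r = np.asarray(m_obs, dtype=float) - np.asarray(m_sim, dtype=float)
    # Solve Omega y = r instead of forming the explicit inverse.
    return float(r @ np.linalg.solve(np.asarray(omega, dtype=float), r))
```

Passing an identity matrix as `omega` reduces the loss to the plain sum of squared residuals, which is the OLS case. A larger variance on a sensor down-weights that sensor's residual.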
The GLS formulation is advantageous for OD estimation, since the assignment
model has an analytical form and exact solutions are available (Balakrishna, 2002).
However, when we include other supply parameters, we lose the closed form advan-
tages. Thus the calibration of those parameters is harder for GLS.
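To illustrate why the analytical form is convenient, consider a single interval with a linear assignment model (simulated counts equal to the assignment matrix times the OD vector) and quadratic loss terms weighted by inverse covariances. Under these assumptions, and ignoring the bound constraints, the minimizer has a standard closed form. The sketch below is not taken from the thesis: `W` (the covariance of the OD error relative to the a priori values) and the function name are hypothetical.

```python
import numpy as np


def gls_od_estimate(A, m_obs, omega, x_prior, W):
    """Minimize (m - A x)' Omega^{-1} (m - A x) + (x - xa)' W^{-1} (x - xa).

    Setting the gradient to zero gives the linear system
    (A' Omega^{-1} A + W^{-1}) x = A' Omega^{-1} m + W^{-1} xa.
    """
    A = np.asarray(A, dtype=float)
    omega_inv = np.linalg.inv(np.asarray(omega, dtype=float))
    W_inv = np.linalg.inv(np.asarray(W, dtype=float))
    lhs = A.T @ omega_inv @ A + W_inv
    rhs = (A.T @ omega_inv @ np.asarray(m_obs, dtype=float)
           + W_inv @ np.asarray(x_prior, dtype=float))
    return np.linalg.solve(lhs, rhs)
```

Once supply parameters enter through a simulator, the counts are no longer a linear function of the unknowns and this closed form is lost, which is the difficulty described above.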
Notice that the GLS formulation can handle both demand and supply parameters.
The drawback is that when we do not have an analytical form, we have to rely on
the simulation (DTA model) to reveal the relationship to us. In order to obtain
good estimation results, we need efficient algorithms that take simulation outputs
and adjust the parameters. Several optimization methods are applied to solve for the
supply parameters in this GLS framework (Balakrishna, 2006). Balakrishna applied
the Box-SNOBFIT and simultaneous perturbation stochastic approximation (SPSA)
algorithms (Spall, 1992) to solve for the supply parameters, while the demand side is
still handled by GLS. This process is called joint calibration of demand and supply
parameters, and Balakrishna performed supply and demand calibration in a sequential
manner.
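For readers unfamiliar with SPSA, the core idea is that the whole gradient is approximated from only two loss evaluations per iteration, regardless of the number of parameters. The following is a generic sketch of the basic algorithm, not the implementation used by Balakrishna: the function names, the default gain constants, and the decay exponents 0.602 and 0.101 (commonly suggested values in the SPSA literature) are assumptions for the example, and a simple analytic loss stands in for a simulation run.

```python
import numpy as np


def spsa_gradient(loss, theta, c, rng):
    """Estimate the gradient from two evaluations with a random +/-1 perturbation."""
    theta = np.asarray(theta, dtype=float)
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    y_plus = loss(theta + c * delta)
    y_minus = loss(theta - c * delta)
    # Every component shares the same two evaluations.
    return (y_plus - y_minus) / (2.0 * c * delta)


def spsa_minimize(loss, theta0, n_iter=200, a=0.1, c=0.1, seed=0):
    """Run basic SPSA with decaying step and perturbation sizes."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(n_iter):
        a_k = a / (k + 1) ** 0.602   # step size
        c_k = c / (k + 1) ** 0.101   # perturbation size
        theta = theta - a_k * spsa_gradient(loss, theta, c_k, rng)
    return theta
```

In a calibration setting, each call to `loss` would correspond to one run of the simulator, which is why the two-evaluation property matters when the parameter vector is large.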
Recently, Lu proposed an enhanced SPSA method: W-SPSA (Lu, 2013). W-SPSA estimates the gradient using only the sensor counts related to a specific parameter, and those sensor counts are weighted according to relevance. It uses a weight matrix that stores the weights to estimate the enhanced gradient. Lu then conducted case studies on synthetic data and real data. Results showed that W-SPSA significantly outperforms SPSA in fitting traffic flow counts and speed data.
2.2 Online Calibration
The objective of online calibration is to estimate the correct parameters such that
the DTA model can represent the real-time traffic scenario, not an average scenario.
It tries to solve the calibration problem in a timely manner. There are two aspects
of online calibration, in terms of: 1) estimating the parameters in a time interval
only based on data up to that time; 2) performing calibration faster than real-time
data generation. The general optimization formulation from offline calibration still
applies, but online calibration minimizes only one part of the objective function that
relates to the current simulation interval h. Despite different model formulations, one
commonality is that traffic states are estimated with current and historical data, not
future data. Furthermore, in order to deploy to an online traffic management system,
the online calibration algorithm has to be finished within each time interval, before
measurements in the next interval are available.
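The two online requirements can be made concrete with a small driver loop. This is a hypothetical sketch: `calibrate` stands for any calibration routine, and the function name and return layout are assumptions for the example.

```python
import time


def run_online_calibration(calibrate, measurements, interval_seconds):
    """Call calibrate(history) once per interval and record timing.

    history contains only measurements up to and including interval h, so no
    future data is used. Returns (estimates, on_time), where on_time[h] tells
    whether interval h was calibrated within the interval length.
    """
    estimates, on_time = [], []
    for h in range(len(measurements)):
        history = measurements[: h + 1]   # data up to the current interval only
        start = time.perf_counter()
        estimates.append(calibrate(history))
        elapsed = time.perf_counter() - start
        on_time.append(elapsed < interval_seconds)
    return estimates, on_time
```

For a 5-minute interval, `interval_seconds` would be 300, and any interval flagged `False` in `on_time` indicates that the routine could not keep up with real-time data generation.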
The online calibration problem has been addressed by some studies over the past
decade, but not many algorithms have proved efficient and scalable. Similar to offline
calibration, inputs and parameters calibrated are generally two-fold: demand
and supply. In the following subsections, algorithms of online calibration for DTA
are reviewed and commented on.
Existing algorithms for solving such problems can be categorized according to their formulations. Following Omrani and Kattan (2012), there are two categories of formulations applied to online calibration for DTA systems, namely state-space formulations and optimization formulations. They apply to both demand and supply calibration.
2.2.1 State-Space Formulation
The first category is the state-space formulation. It is based on the idea of system control; demand and supply parameters are treated as state vectors, which evolve over time. The target is to estimate and predict the true state vector. At the same time, we are given measurements/observations from the real world, which carry information about the true state vector. The most widely applied approach to solving the state-space formulation is the Kalman filter-based method. In the Kalman filtering framework, state evolution is modeled by a transition equation between adjacent time steps, while the connection between states and measurements is described by a measurement equation. Depending on the assumptions made in the transition and measurement equations, the effects and challenges of the EKF can be very different.
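In generic form, the two equations can be written as below. The notation is an assumption made for illustration and may differ from the detailed formulation later in the thesis: $g_h$ denotes the transition function, $f$ the DTA model acting as the measurement function (its dependence on earlier intervals is dropped for brevity), and $w_h$, $v_h$ zero-mean noise terms with covariances $Q_h$ and $R_h$.

```latex
\begin{align}
x_{h+1} &= g_h(x_h) + w_h, & \mathrm{E}[w_h] &= 0, \quad \mathrm{Cov}(w_h) = Q_h \\
M^o_h   &= f(x_h) + v_h,   & \mathrm{E}[v_h] &= 0, \quad \mathrm{Cov}(v_h) = R_h
\end{align}
```

The first line is the transition equation and the second is the measurement equation. When $g_h$ and $f$ are linear, the standard Kalman filter applies. A nonlinear $f$, such as a simulator, calls for a linearization as in the EKF.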
On the transition equation side, one assumption is that the deviations of the demand and supply parameters from historical averages define a stationary time series (Ashok and Ben-Akiva, 1993). This assumption applies to scenarios where parameters follow patterns similar to those in the history. It is an elegant framework that captures the structural information in demand without explicitly knowing the pattern. It requires an offline calibrated demand to serve as the basis for constructing a starting point. Ashok and Ben-Akiva (2000) formulated a 4th-order autoregressive (AR) process on the deviations and developed a Kalman filter to estimate and predict OD demand in real time. Wang and Papageorgiou (2005) formulated demand and supply parameters in a stochastic macroscopic model, in which a random walk model is used as the transition equation to estimate traffic conditions in freeway stretches. In general, this assumption works well with a stationary random process with constant mean and variance (Zhou and Mahmassani, 2007). In the same article, Zhou and Mahmassani applied a polynomial trend filter on the deviations to allow more flexibility than the AR model. The authors also proposed a procedure to update the historical demand pattern online and applied the whole process to OD estimation on a network in Irvine.
When the evolution of the parameters over time differs from the pattern implied by
the historical parameters, the first assumption of a stationary time series fails. In
this case, a simple random walk model can be built on the absolute values of the
demand estimates with no prior information. The simple random walk model is an AR
model with a coefficient of 1 (i.e. Xk+1 = 1 × Xk, plus a noise term). Cremer and
Keller (1987) and Chang and Wu (1994) used a random walk model to make predictions
about dynamic route choice split proportions. The authors concluded that this
algorithm performed well in both accuracy and stability. However, this approach is
limited to scenarios where demand changes slowly, and it may not reflect non-linear
trends in time-dependent OD flows. Thus, when historical parameters are available,
the formulation under this assumption performs worse than the deviation-based one
discussed above.
In the works mentioned above, the measurement equations are all based on analytical
formulations, since the relationship between OD flows and measurements is largely
captured by the assignment matrix. For other supply and route choice parameters, an
analytical measurement equation is difficult to formulate because of their indirect
relationship with traffic flows. To solve this problem, numerical methods are applied
to obtain the linearization of the relationship, i.e. the gradient. In other words,
the DTA system is treated as a “black box” and the relationship between input
parameters and simulated outputs is calculated numerically (Antoniou et al., 2004,
2006). Note that the method is general, since no prior analytical form is assumed.
Based on this idea, Antoniou applied the extended Kalman filter (EKF), unscented
Kalman filter (UKF), limiting EKF (LimEKF) and iterated EKF (I-EKF) in two case
studies in the UK and California (Antoniou, 2004). The author calibrated the demand
and supply parameters with flow count and speed sensor data. The LimEKF has the
greatest computational advantage, with a complexity of O(1), which is practical for
online applications. EKF and UKF have similar computational complexity of O(n),
where n is the dimension of the parameter vector. In contrast, I-EKF performs
multiple iterations of EKF and is therefore more expensive. Although obtaining the
linearization of the system for EKF is time-consuming compared with LimEKF, it is
concluded that EKF is still the most straightforward approach to estimate OD demand.
In Antoniou’s calibration results, the EKF outperforms UKF and LimEKF in terms of
estimation and prediction accuracy. The author also concludes that additional runs
of I-EKF could further reduce the estimation error.
A last remark about the state-space model concerns its generality: it can be used to
calibrate all kinds of inputs and parameters, for demand, supply and other types that
may arise in the future. It can also handle all types of data, as long as they
capture the effects of the inputs and parameters.
2.2.2 Optimization Formulation
The optimization formulation follows the idea of GLS. It views the DTA online
calibration problem as a stochastic minimization problem. As in GLS, the goal is to
minimize a loss function that combines the error between the parameters and their a
priori values, the error between estimated and real-time measurements, and the error
between the model and reality. The numerical methods for estimating gradients used
in the state-space formulation are also useful in this framework. Huang (2010)
applied a Gradient Descent algorithm (GD) and a Conjugate Gradient algorithm (CG) as
direct optimization formulations for the Brisa A5 motorway between Lisbon and
Cascais, Portugal. A heuristic method, the Hooke-Jeeves Pattern Search algorithm
(PS), was also applied in the same scenario. The author compared those three
algorithms and the Extended Kalman Filter (EKF) with DynaMIT-R OD estimation, in
terms of estimation accuracy and computational performance. The author concluded
that EKF outperformed PS, CG and GD in decreasing the errors in fitting both flow
counts and speeds. GD and EKF have a computational advantage over PS and CG, in the
sense that less runtime is needed. Partitioned Simultaneous Perturbation EKF
(PSP-EKF) and PSP-GD were also implemented. It is concluded that PSP-based methods
are more efficient than their original counterparts. However, the estimation
accuracy of EKF is still better than that of all other methods, including PSP-EKF.
Other stochastic optimization algorithms, such as the Box complex and SNOBFIT, have
also been applied (Balakrishna, 2006). These algorithms were validated on a small
synthetic network, where the error rate was less than 10%. However, they were not
tested on large road networks, so they may have scalability problems when deployed
on complicated road networks where traffic flows interact.
In general, the optimization formulation is a work-around for GLS in the online
context, where the variables of one interval are optimized at a time. We make two
observations here. First, different loss functions can lead to very different state
estimates, which essentially corresponds to the selection of the covariance matrix
in GLS. Second, non-exact optimization methods need multiple iterations to converge,
which may conflict with the online requirement. Thus the optimization formulation
may have more parameters to tune than the state-space formulation.
2.3 Summary
We close with two important comments. First, multiple algorithms have been applied
to solve the online calibration problem. However, all the works mentioned above
were applied to highway corridors or freeway stretches, so the scalability of those
algorithms has not been tested. In other words, when the road network is large and
there are many parameters, the online calibration algorithms may not work properly.
Second, the extended Kalman filter algorithm has been reported to be more accurate
than the other algorithms. For these two reasons, we conducted the following research
to examine the accuracy of the extended Kalman filter on a large road network.
In addition, although computational performance is considered, the main focus of
the proposed research is calibration accuracy, because the numerical method used
in EKF, shown in Section 3.2.2, is fully parallelizable.
Chapter 3
Methodology: Constrained
Extended Kalman Filter in DTA
Calibration
In this chapter, the detailed state-space formulation is reviewed first. Then the
extended Kalman filter (EKF) algorithm and some variants applied in the field of DTA
calibration are summarized, together with some comments. In addition, the drawbacks
of EKF are discussed from the point of view of maximizing the a posteriori
probability density. Then the optimization formulation for constrained EKF is
discussed. To solve it, an algorithm that gives a near-optimal solution is proposed,
followed by the coordinate descent algorithm that obtains the true optimum within a
given precision. Finally, a summary is made.
3.1 General Problem Formulation
3.1.1 State-Space Formulation in Detail
As in Chapter 2, the state-space formulation views DTA calibration from the
perspective of control engineering. The inputs and parameters of DTA form the state
vector, which is assumed to evolve over time. The state-space formulation has been
studied comprehensively in control theory, and there are algorithms that estimate
the state vector efficiently in real time. Thus, introducing this formulation to DTA
calibration is beneficial. The state-space formulation is explained and discussed in
Estimation and Prediction of Time-Dependent Origin-Destination Flows (Ashok, 1996).
While the original formulation is only for origin-destination flow estimation and
prediction, the state vector can contain all kinds of parameters (e.g. demand and
supply parameters).
In order to express the formulation, the following notation is defined:
• h: interval index, h ∈ H = {1, 2, ..., H}, where H is the set of simulation
intervals into which time is discretized
• xh: state vector of time interval h
• Mh: measurements in time interval h
Then the state space model comprises the following equations:
• Transition equation
xh = fh−1(xh−1, ...,xh−p) +wh (3.1)
• Measurement equation
Mh = gh(xh, ...,xh−q+1) + vh (3.2)
where, p is the number of previous states that are believed to have relations with
xh; q is the number of states related to current measurement Mh; wh is the process
error term, which indicates the imperfection of the transition model fh; vh is the
observation error term, which absorbs the measurement error of Mh as well as the
imperfection of the model gh.
Usually the functions fh and gh are hard to model, since they depend not only on
multiple previous states, but also on the time step. The transition equation is
usually modeled as an autoregressive process, as stated in Estimation and Prediction
of Time-Dependent Origin-Destination Flows (Ashok, 1996):
x_h = \sum_{k=h-p}^{h-1} F_h^k x_k + w_h    (3.3)
where F_h^k is a square matrix representing the effect of x_k on x_h. If one assumes
that the autocorrelation structure remains constant over time, F_h^k depends only on
h − k, i.e. F_h^k = F^{h-k}. In fact, this assumption is often made for the sake of
model parsimony.
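As a hedged illustration of the autoregressive transition in Equation (3.3) under the time-invariant assumption, the following Python sketch predicts the next state from the p most recent states. The function name `ar_predict` and all numerical values are assumptions made for illustration only; they are not taken from the thesis. The noise term is omitted, so the result is the predicted mean.

```python
import numpy as np

def ar_predict(past_states, coeffs):
    """Predict x_h = sum over lag = 1..p of F^{lag} x_{h-lag} (noise term omitted).

    past_states: list [x_{h-1}, x_{h-2}, ..., x_{h-p}], each of shape (n,)
    coeffs:      list [F^1, F^2, ..., F^p], each of shape (n, n)
    """
    if len(past_states) != len(coeffs):
        raise ValueError("need one coefficient matrix per lagged state")
    n = past_states[0].shape[0]
    x_pred = np.zeros(n)
    for x_lag, F_lag in zip(past_states, coeffs):
        x_pred += F_lag @ x_lag
    return x_pred

# Toy example with n = 2 state variables and p = 2 lags (hypothetical values).
x_hm1 = np.array([10.0, 4.0])
x_hm2 = np.array([8.0, 6.0])
F1 = 0.6 * np.eye(2)   # diagonal coefficients: no cross-OD correlation
F2 = 0.4 * np.eye(2)
x_ar = ar_predict([x_hm1, x_hm2], [F1, F2])

# Random walk special case: p = 1 and F^1 = I, so the prediction equals x_{h-1}.
x_rw = ar_predict([x_hm1], [np.eye(2)])
```

With these toy values the AR(2) prediction is [9.2, 4.8], while the random walk simply carries the last state forward.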
Similarly, for the measurement model, an autoregressive process can also be ap-
plied, as shown in the following display.
M_h = \sum_{k=h-q+1}^{h} A_h^k x_k + v_h    (3.4)
where the matrix A_h^k accounts for the contribution of x_k to M_h. Specifically, if
the target is only origin-destination (OD) estimation and M_h is the aggregated
sensor counts, A_h^k is the assignment matrix, whose (i, j)-th element is the
fraction of the j-th OD flow in x_k that contributes to the i-th element of M_h.
Typically, the models f_h and g_h need to be constructed first in order to solve for
the state vectors x_k. Autoregressive models are one type of candidate function for
f_h and g_h and are easier to estimate, because the coefficient matrices
(F_h^k, A_h^k) act as linearizations when f_h and g_h are non-linear. When we have
enough data points for the same period h, we can use the least squares method to
estimate the coefficient matrices. If each A_h^k is a full matrix, we have
(n_M × n_x × q) parameters to estimate in the measurement model, where n_M is the
length of the vector M_h and n_x is the length of x_k. Similarly, for the transition
model, we need to estimate every element of all F_h^k (n_x × n_x × p parameters in
total). However, the amount of data available is usually not enough to estimate such
complicated models, especially a full matrix F_h^k in the transition model. Thus, one
simplification is to make F_h^k a diagonal matrix, in which correlations between
different OD pairs are not considered. One can further assume that the diagonal
elements are all equal, in other words, F_h^k is reduced to a scalar. The model can
also be simplified in the time dimension by reducing p. As for the measurement
model, it is more convenient to estimate it with numerical methods, since the
simulation-based DTA model can generate enough data points. In this research, the
formulations used are considerably simplified:
xh = fh(xh−1) +wh = xh−1 +wh (3.5)
Mh = gh(xh) + vh = Ahxh + vh (3.6)
Following the discussion about estimating Ah in the measurement equation, the DTA
model is treated as a “black box” and numerical methods are used. In other words,
the linear relations are estimated from data points generated by the DTA simulation.
Thus, all the relations between xh and Mh can be handled and measured, even for
types of state vectors and measurements for which no analytical formulation is
available.
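The idea of estimating a linear measurement matrix from simulated data points by least squares can be sketched as follows. This is only an illustrative sketch: the "simulator" is replaced by a hypothetical fixed linear map `A_true`, the helper name `estimate_linear_map` is invented here, and a real DTA model would be non-linear and noisy, so the recovered matrix would only be a local approximation.

```python
import numpy as np

def estimate_linear_map(X, M):
    """Least-squares estimate of A in M ~ A x from paired samples.

    X: array of shape (N, n_x), one sampled state vector per row
    M: array of shape (N, n_M), the corresponding simulated measurements
    Returns A_est of shape (n_M, n_x).
    """
    # lstsq solves X @ B ~ M for B of shape (n_x, n_M); A is its transpose.
    B, _, _, _ = np.linalg.lstsq(X, M, rcond=None)
    return B.T

# Hypothetical stand-in for the DTA simulator: a fixed linear map.
A_true = np.array([[0.5, 0.2, 0.0],
                   [0.0, 0.3, 0.7]])
rng = np.random.default_rng(0)
X = rng.uniform(50.0, 150.0, size=(20, 3))   # 20 sampled state vectors
M = X @ A_true.T                             # noise-free "simulated" outputs
A_est = estimate_linear_map(X, M)
```

Because the toy data are noise-free and there are more samples than unknowns per row, `A_est` reproduces `A_true` up to rounding error.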
We make some important comments on the simplification procedure. First, the
transition equation accounts for the relations between state vectors in different
time intervals. A more accurate transition equation is undoubtedly beneficial.
However, its positive impact is mostly on the prediction side, especially for
predicting the traffic states several time intervals ahead. As for its impact on
calibration, it provides a starting point (xh) for the measurement model. Thus the
effect of the simplification in Equation (3.5) on calibration is limited, provided
gh(·) is not a drastically changing function that depends heavily on the starting
point. Second, the simplification (approximation) of the measurement model is based
on the conjecture that most of the information in a measurement vector is already
utilized to infer the OD flows (Ashok, 1996). This conjecture is more likely to hold
when measurement errors are low and wh has low variance. In other words, when we
have enough information to infer the correct OD flow values, the measurement
simplification is acceptable. Finally, it is beneficial to include higher orders
(larger p and q) in both equations, but computational complexity can be a major
issue.
3.1.2 The Idea of Deviations
The idea of deviations comes from Dynamic Origin-Destination Matrix Estimation
and Prediction for Real-Time Traffic Management Systems (Ashok and Ben-Akiva,
1993). Since the autoregressive (AR) process is based on the assumption of temporal
interdependencies between OD flows in different time steps, it is beyond the
capability of the AR process to account for the structural information about trip
patterns. For instance, the morning peak and the evening peak are difficult to model
with a simple time-invariant AR process, because they clearly do not follow the same
transition pattern. Thus, a simple way to incorporate the temporal and spatial
structure of trip patterns is to build on historical (e.g. offline estimated) state
vectors x_h^H. The deviations of the state vector x_h and the measurements M_h are
hence defined as:
∂x_h = x_h − x_h^H    (3.7)
∂M_h = M_h − M_h^H    (3.8)
where M_h^H denotes the historical measurement values.
Then the transition equation Equation (3.5) and measurement equation Equa-
tion (3.6) now become:
∂xh = ∂xh−1 +wh (3.9)
∂Mh = Hh∂xh + vh (3.10)
After subtracting the historical values, the deviations ∂xh and ∂Mh are more likely
to be zero-mean random variables, and they represent the day-to-day fluctuations.
The error terms wh and vh are therefore also more likely to have zero mean, and it
is more reasonable to assume that they satisfy:
E[w_h] = 0, ∀h ∈ H    (3.11)
E[v_h] = 0, ∀h ∈ H    (3.12)
E[w_h v_h^T] = 0, ∀h ∈ H    (3.13)
E[w_h w_h^T] = Q_h, ∀h ∈ H    (3.14)
E[v_h v_h^T] = R_h, ∀h ∈ H    (3.15)
where Q_h and R_h are the covariance matrices of w_h and v_h in time step h,
respectively.
Further, we assume that the error terms across different time steps are uncorre-
lated:
E[w_h w_k^T] = 0, ∀h, k ∈ H, h ≠ k    (3.16)
E[v_h v_k^T] = 0, ∀h, k ∈ H, h ≠ k    (3.17)
Note that Hh in Equation (3.10) was specified as Ah in (Ashok and Ben-Akiva,
1993). This implies that the following two equations also hold, besides Equation (3.5)
and Equation (3.6):
x_h^H = x_{h-1}^H + w_h    (3.18)
M_h^H = A_h x_h^H + v_h    (3.19)
This indicates that the historical states and measurements follow the same linear
model, based on the assignment matrix, as the deviations; that is, the linear
measurement equation is assumed to hold globally. This is a major but unnecessary
constraint on the measurement model. Using deviations instead of absolute values is
a major improvement, because the historical values already account for the
non-linearity. The deviation form is thus a local linearization in the vicinity of
the historical values, not a global linearization. Here, in Equation (3.10), the
assumption is imposed on the deviations only, and the Hh matrix is not necessarily
the assignment matrix. In fact, it describes the local linear relationship around
the historical values. As will be demonstrated later in Section 3.2, the Hh matrix
is calculated through numerical methods, not through the assignment matrix generated
by the DTA model.
We conclude this section by the following remarks. First, in this thesis, the state
transition model is simplified as a random walk and the focus is on estimating the
measurement model gh, where a similar simplification is made. Second, the construc-
tion of the measurement model is general in handling different data and parameter
types, because it is based on local linearization with numerical methods. Finally,
Equation (3.9) and Equation (3.10) are utilized to implement the idea of deviation,
which is an elegant framework to avoid modeling the structural pattern in state vec-
tors.
3.2 Extended Kalman Filter and Variants in DTA
Calibration
Several Kalman filter variants have been applied to solve the state-space
formulation in the context of online calibration. Here the extended Kalman filter
(EKF) algorithm is reviewed first and its connection to the state-space model is
made explicit. Then its variants are summarized and commented upon. Finally, the
drawbacks of current EKF practice are addressed, which leads to the next section.
3.2.1 Extended Kalman Filter
The extended Kalman filter is an approach to handle non-linearity. The transition
and measurement equations are both non-linear in this case. The basic equations are:
• Transition equation
xh = f(xh−1,uh) +wh (3.20)
• Measurement equation
Mh = g(xh) + vh (3.21)
where f and g describe the transition and measurement relationships, which are
assumed fixed over time; uh are control vectors, which are assumed to be always 0 in
the DTA model because the objective is to calibrate the system rather than to
control it; wh and vh are uncorrelated zero-mean multivariate Gaussian error terms
with covariance matrices Qh and Rh, respectively.
By comparing the state-space model with the EKF assumptions, we conclude that EKF
makes the same assumptions as the simplified model in Equation (3.5) and Equation
(3.6), together with Equation (3.11) to Equation (3.17), except for two additional
assumptions: the time-invariant model form and the Gaussian distribution of the
error terms.
When the time-invariance assumption does not hold, the EKF algorithm can still
handle the time-dependent model, as discussed later. As for the Gaussian assumption,
the idea of deviations ensures zero mean, but the shape of the distribution could be
non-Gaussian. When we are modeling day-to-day traffic fluctuations, the Gaussian
assumption is likely to hold if the historical OD flows are calculated from enough
data.
With these basic assumptions stated, the extended Kalman filter algorithm is
displayed below.
For the extended Kalman filter algorithm, the input parameters are:
• x0: initial starting point (guess) of the state vector at time h = 0
• P 0: initial covariance matrix (guess) of x0
• Qh: time-variant covariance matrix of wh, h ∈ H
• Rh: time-variant covariance matrix of vh, h ∈ H
Algorithm 1 Extended Kalman Filter
Initialize
    x_{0|0} = x_0    (3.22)
    P_{0|0} = P_0    (3.23)
for h = 1 to H do
    Time Update
    Predicted state estimate
        x_{h|h-1} = f_{h-1}(x_{h-1|h-1})    (3.24)
    Transition equation linearization
        F_{h-1} = \left. \frac{\partial f_{h-1}}{\partial x} \right|_{x_{h-1|h-1}}    (3.25)
    Predicted covariance estimate
        P_{h|h-1} = F_{h-1} P_{h-1|h-1} F_{h-1}^\top + Q_h    (3.26)
    Measurement Update
    INPUT: real-time measurement M_h
    Measurement equation linearization
        H_h = \left. \frac{\partial g_h}{\partial x} \right|_{x_{h|h-1}}    (3.27)
    Near-optimal Kalman gain
        K_h = P_{h|h-1} H_h^\top \left( H_h P_{h|h-1} H_h^\top + R_h \right)^{-1}    (3.28)
    Updated state estimate
        x_{h|h} = x_{h|h-1} + K_h \left( M_h - g_h(x_{h|h-1}) \right)    (3.29)
    Updated covariance estimate
        P_{h|h} = P_{h|h-1} - K_h H_h P_{h|h-1}    (3.30)
    OUTPUT: posterior estimates x_{h|h} and P_{h|h}
end for

The Kalman filter is an online algorithm, which means that the measurements arrive
while the algorithm is running. Assuming we have the estimates from the last time
step h − 1, x_{h-1|h-1} and P_{h-1|h-1}, the algorithm can immediately give the
predicted estimates x_{h|h-1} and P_{h|h-1} according to the Time Update section.
These are called prior estimates, since they are based on the model only. The
subscript h|h−1 means that measurements up to time h−1 have been revealed and we are
predicting for time h. When the new measurements M_h become available, the estimates
are updated according to the Measurement Update section. The updated estimates
x_{h|h} and P_{h|h} are called posterior estimates.
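One iteration of Algorithm 1 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the implementation used in the thesis: the function name `ekf_step`, its argument names, and the scalar toy problem are all hypothetical, and the transition and measurement functions and their Jacobians are passed in as plain callables.

```python
import numpy as np

def ekf_step(x_prev, P_prev, M_h, f, F_jac, g, H_jac, Q, R):
    """One iteration of Algorithm 1 (time update followed by measurement update)."""
    # Time update: prior (predicted) estimates, Equations (3.24)-(3.26).
    x_prior = f(x_prev)
    F = F_jac(x_prev)
    P_prior = F @ P_prev @ F.T + Q

    # Measurement update, Equations (3.27)-(3.30).
    H = H_jac(x_prior)
    S = H @ P_prior @ H.T + R                # innovation covariance
    # K = P H^T S^{-1}, computed with a linear solve (S and P_prior symmetric).
    K = np.linalg.solve(S, H @ P_prior).T
    x_post = x_prior + K @ (M_h - g(x_prior))
    P_post = P_prior - K @ H @ P_prior
    return x_post, P_post

# Toy usage: random walk transition (3.5) and a linear measurement (3.6).
A = np.array([[1.0]])
x1, P1 = ekf_step(
    x_prev=np.array([0.0]), P_prev=np.array([[1.0]]), M_h=np.array([2.0]),
    f=lambda x: x, F_jac=lambda x: np.eye(1),
    g=lambda x: A @ x, H_jac=lambda x: A,
    Q=np.array([[0.0]]), R=np.array([[1.0]]),
)
```

In this scalar toy case the prior and the measurement are equally uncertain, so the gain is 0.5, the posterior estimate is 1.0 (halfway between the prior 0 and the measurement 2) and the posterior variance is 0.5.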
A directed graph model on which the extended Kalman filter is based is shown below.
The horizontal arrows are based on the state transition model, and the vertical
arrows are based on the measurement model. The directions of the arrows show the
causal relations over time. For each time step h, when we are given Mh, we can infer
xh and predict xh+1, just as Algorithm 1 shows.
Figure 3-1: Directed Graph Model in Kalman Filtering Scheme
We make some observations about the extended Kalman filter algorithm.
First, it is a non-linear extension of the linear Kalman filter (Kalman, 1960). It
linearizes the functions f and g locally so that the update equations of the linear
Kalman filter can be applied.
Second, in Algorithm 1, time-dependent fh−1 and gh are used instead of
time-invariant f and g, an extension that is necessary for application in the
traffic simulation field. For instance, the same OD flow from the CBD to a suburban
area will probably contribute less to congestion when added to the morning peak than
when added to the evening peak. By making those functions time-dependent, we allow
the relations to change with time. The relations fh−1 and gh can be thought of as
depending on the time step h; in reality they depend on the current traffic
situation, in terms of the traffic distribution over the network, congestion level,
incidents, weather and possibly the time of day. Recall that the focus of this
thesis is gh. In our previous simplification, we model the influence of the previous
state vectors xh, ...,xh−q+1 by using only gh(xh). However, this is not a major
compromise if the target is only to estimate the state for the current interval h,
since the influence of the previous state vectors is already captured by gh.
Third, the Kalman filter framework is general, in the sense that it does not con-
strain the types of parameters and measurements. So the framework can handle all
the parameters and measurements in the DTA model.
Last but not least, the two linearization steps are crucial, since they stand in for
the underlying non-linear functions. Again, in our setting, we care about the
measurement model, and the linearization is only estimated for gh. There are
different methods to estimate the linearization, which give rise to different EKF
variants; these are the focus of the following subsections.
3.2.2 Finite Difference and FD-EKF
The finite difference method is the most straightforward way to obtain the gradient
when the analytical form of gh is not available. In our setting, the gh function is
the simulation-based DTA model. Assuming gh(·) is a vector of dimension m and xh is
a vector of dimension n, the gradient is a matrix of dimension (m × n). The gradient
matrix in Equation (3.27) can then be calculated by
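H_h = \left. \begin{bmatrix}
\frac{\partial g_{h1}}{\partial x_1} & \cdots & \frac{\partial g_{h1}}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial g_{hm}}{\partial x_1} & \cdots & \frac{\partial g_{hm}}{\partial x_n}
\end{bmatrix} \right|_{x_{h|h-1}}    (3.31)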
where,
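\begin{bmatrix} \frac{\partial g_{h1}}{\partial x_i} \\ \vdots \\ \frac{\partial g_{hm}}{\partial x_i} \end{bmatrix}
\approx \frac{g_h(x_h + \boldsymbol{\delta}_i) - g_h(x_h - \boldsymbol{\delta}_i)}{2\delta_i}    (3.32)
\boldsymbol{\delta}_i = [0, 0, ..., \delta_i, ..., 0]^\top    (3.33)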
The δi vector is called the perturbation vector, as it indicates that the vector xh
is perturbed in its i-th element. Equation (3.32) approximates the i-th column of
the Hh matrix, and it shows the change in all m measurements caused by a change in
the i-th element of xh.
The method shown in Equation (3.32) is the central finite difference. We make some
remarks about this method. In our DTA setting, the simulation takes the place of gh.
Thus, to get one column of Hh, we need 2 runs of the simulation (gh), and 2n runs
for the whole matrix. This algorithm therefore has a complexity of O(n) for each Hh.
Note that the unit of complexity is one run of the simulation. Depending on the
network size and the number of simulated vehicles, the time needed for one run can
be very different.
Based on this method, the extended Kalman filter algorithm is called FD-EKF in
this thesis.
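The central finite difference in Equations (3.31) to (3.33) can be sketched as follows. The function `fd_jacobian` and the smooth toy function `toy_g` are hypothetical stand-ins: in the thesis setting each call to g would be one run of the DTA simulator, which is why the sketch counts the number of calls.

```python
import numpy as np

def fd_jacobian(g, x, delta):
    """Central finite-difference estimate of H = dg/dx at x.

    g:     callable mapping an (n,) vector to an (m,) vector (the simulator)
    x:     linearization point, shape (n,)
    delta: perturbation sizes, shape (n,)
    Uses 2 * n evaluations of g.
    """
    n = x.shape[0]
    columns = []
    for i in range(n):
        step = np.zeros(n)
        step[i] = delta[i]                   # perturb only the i-th element
        col = (g(x + step) - g(x - step)) / (2.0 * delta[i])
        columns.append(col)
    return np.column_stack(columns)          # shape (m, n)

# Hypothetical smooth stand-in for the simulator, with a counter for the runs.
calls = {"count": 0}

def toy_g(x):
    calls["count"] += 1
    return np.array([x[0] * x[1], x[0] ** 2 + 3.0 * x[2]])

x0 = np.array([2.0, 5.0, 1.0])
H_fd = fd_jacobian(toy_g, x0, delta=np.full(3, 1e-3))
```

For this toy function the exact Jacobian at `x0` is [[5, 2, 0], [4, 0, 3]], and the sketch uses 6 evaluations for n = 3. Since the columns are computed independently of one another, the loop over i is the natural place to parallelize.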
3.2.3 Simultaneous Perturbation and SP-EKF
Another method to calculate Hh is called simultaneous perturbation. It comes from
the idea of SPSA (Spall, 1992). Instead of perturbing the vector xh in one dimension
at a time, all dimensions are perturbed by a small amount at the same time.
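H_h = \left. \begin{bmatrix}
\frac{\partial g_{h1}}{\partial x_1} & \cdots & \frac{\partial g_{h1}}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial g_{hm}}{\partial x_1} & \cdots & \frac{\partial g_{hm}}{\partial x_n}
\end{bmatrix} \right|_{x_{h|h-1}}    (3.34)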
where,
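\begin{bmatrix} \frac{\partial g_{h1}}{\partial x_i} \\ \vdots \\ \frac{\partial g_{hm}}{\partial x_i} \end{bmatrix}
\approx \frac{g_h(x_h + \boldsymbol{\delta}) - g_h(x_h - \boldsymbol{\delta})}{2\delta_i}    (3.35)
\boldsymbol{\delta} = [\delta_1, \delta_2, ..., \delta_i, ..., \delta_n]^\top    (3.36)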
Each element of the perturbation vector is drawn randomly. Notice that in this case
all the columns of Hh share the same numerator in Equation (3.35), so only two
evaluations of gh are needed. Thus, to obtain an approximate Hh, we need O(1)
simulation runs. The discussion about the unit of complexity in FD-EKF still holds
here.
The extended Kalman filter with simultaneous perturbation is called SP-EKF
(Antoniou et al., 2007). Compared with FD-EKF, it saves much computation time, but
the approximated gradient matrix is inaccurate. Since all columns of Hh are
calculated from the same numerator vector, they are linearly dependent; in fact, the
rank of Hh is 1. Because of this characteristic, and because our target is to obtain
accurate parameter estimates, this thesis focuses on FD-EKF to obtain the most
accurate gradient estimation. The EKF discussed in the following sections of this
thesis is FD-EKF, unless SP is specified.
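The rank-1 property of the simultaneous perturbation estimate can be checked numerically with a small sketch. The function name `sp_jacobian`, the linear toy simulator, and the choice of random ±0.1 perturbations (a common SPSA choice, assumed here rather than taken from the thesis) are illustrative only.

```python
import numpy as np

def sp_jacobian(g, x, delta):
    """Simultaneous-perturbation estimate of H using only two runs of g."""
    diff = g(x + delta) - g(x - delta)       # shared numerator, shape (m,)
    # Column i is diff / (2 * delta_i): an outer product, hence rank <= 1.
    return np.outer(diff, 1.0 / (2.0 * delta))

def toy_g(x):
    # Hypothetical linear stand-in for the simulator; its true Jacobian is A.
    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 3.0]])
    return A @ x

rng = np.random.default_rng(1)
# Random +/- perturbation in every dimension (assumed perturbation scheme).
delta = 0.1 * rng.choice([-1.0, 1.0], size=3)
H_sp = sp_jacobian(toy_g, np.array([1.0, 1.0, 1.0]), delta)
rank = np.linalg.matrix_rank(H_sp)
```

Although the true Jacobian of the toy function has rank 2, the estimate `H_sp` has rank 1, because every column is a scaled copy of the same difference vector.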
3.2.4 Characteristics of EKF in Online Calibration for DTA
As stated before, the EKF is based on a linearization of the non-linear functions.
According to Online Calibration of Dynamic Traffic Assignment (Antoniou, 2004),
EKF outperforms UKF in terms of the error between simulated and observed
measurements. This observation holds for both estimation and prediction. It
demonstrates that EKF performs well in practice even though it only approximates the
non-linear model to the first order. This case study was conducted on a freeway with
ramps, which is considered a relatively low-complexity scenario with only 80
parameters for each 15-minute time interval. Since the goal of online calibration is
real-time performance, its performance on complex networks with larger dimensions
and shorter time intervals has yet to be investigated.
3.3 Constrained EKF
In this section, the situation when there are constraints on the state vectors is con-
sidered in the EKF framework. An efficient near-optimal method is proposed.
3.3.1 Motivation
When there are constraints on the state vector, a simple way to impose them is to
project the estimated state vector onto the feasible region. When the constraints
take the form of lower and upper bounds, we can simply project each element of the
state vector onto its corresponding feasible interval. This element-wise projection
is called truncation, and this term will be used often in the remainder of this
thesis. In the context of OD flow estimation, an OD flow variable should be
non-negative because the number of vehicles cannot be negative. Hence, in this case,
we have x ≥ 0 for this OD pair as a natural non-negativity constraint. Truncation is
thus needed for negative OD estimates before they can be fed into the DTA system.
Although efficient, this fix is not necessarily correct, because the estimators of
different dimensions are correlated. Truncating one variable while keeping the
others intact disregards its relation with the other variables.
As discussed above, there are natural non-negativity constraints on the OD values.
Since EKF is based on unconstrained Gaussian assumptions, it is likely that the
estimates violate those constraints. This is especially the case when the true
values of the OD flows are zero or close to zero and the estimated variance is
large. In this case, the Kalman filter tends to give estimates that scatter around
the true value. Suppose those true values are zero; the Kalman filter estimates will
then oscillate around 0, so for the OD pairs whose true values are 0, the estimates
will be either positive or negative. Due to the truncation, the negative values are
set to zero while the positive ones are kept. Thus, on average, we are estimating
positive values for OD pairs that should all be zero. Since this overestimation
happens in each interval, the error accumulates. Hence the calibrated DTA system
moves further and further away from the true traffic scenario we want to fit. A
detailed demonstration is given in Section 4.3.2 of the case study.
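The upward bias introduced by truncation can be illustrated with a small Monte Carlo sketch. The noise level `sigma`, the number of OD pairs, and the assumption of independent zero-mean Gaussian estimation noise are hypothetical simplifications chosen for illustration; they are not results from the thesis.

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 5.0                      # hypothetical std. dev. of the estimation noise
n_od = 100_000                   # number of OD pairs whose true flow is zero

# Unconstrained estimates scatter symmetrically around the true value 0.
estimates = rng.normal(loc=0.0, scale=sigma, size=n_od)
truncated = np.maximum(estimates, 0.0)   # element-wise projection onto x >= 0

mean_raw = estimates.mean()
mean_truncated = truncated.mean()
# For zero-mean Gaussian noise, E[max(X, 0)] = sigma / sqrt(2 * pi).
expected_bias = sigma / np.sqrt(2.0 * np.pi)
```

Under these assumptions the raw estimates average out close to zero, whereas the truncated estimates have a mean close to sigma / sqrt(2π), roughly 0.4 sigma, which is the per-interval overestimation described above.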
3.3.2 Optimization Formulation for Constrained Kalman Fil-
ter Estimates
In this subsection, the theoretical basis of the constrained Kalman filter is
discussed. First, the idea of the maximum a posteriori (MAP) estimate is introduced
to demonstrate the objective of EKF. Then, with this objective, the true MAP
estimate within the constraints is demonstrated in a two-dimensional example.
Finally, the general optimization formulation is presented.
As discussed in Section 3.2.1, the Kalman filter family assumes Gaussian dis-
tributed error terms. Thus, the state estimator, as a random vector, is also Gaussian
distributed. The Kalman filter estimates xh|h at time step h are essentially the max-
imum a posteriori (MAP) estimates, which are updated given the measurements and
the prior distribution (based on the transition model). Equation (3.30) gives the
posterior covariance matrix P h|h, which depicts the shape of the posterior Gaussian
distribution. Thus, we can reconstruct the posterior distribution as:
f_X(x) = \frac{1}{\sqrt{(2\pi)^n |P_{h|h}|}} \exp\left( -\frac{1}{2} (x - x_{h|h})^\top P_{h|h}^{-1} (x - x_{h|h}) \right)    (3.37)
where, n is the dimension of vector x.
For instance, as Figure 3-2 shows, the contours are those of the posterior
probability density function (PDF) for a 2-dimensional case. This is an example with
(x, y) ∼ N(µ, Σ), where
µ = [0.5, −1]^\top
Σ = \begin{bmatrix} 1 & 0.7 \\ 0.7 & 1 \end{bmatrix}
We can see that the “cross” is the center of the PDF, which is the MAP estimate of
the unconstrained Kalman filter. When we directly impose the constraints x ≥ 0,
y ≥ 0 by truncation, we get the “circle” point. But in terms of maximizing the a
posteriori probability density under the constraints, the “circle” point does not do
a good job. The true MAP estimate is the “asterisk” point.
Figure 3-2: 2-D Posterior PDF Contour and Different Estimators
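The two-dimensional example can be reproduced numerically with the sketch below, which compares the truncated estimate with a brute-force search for the constrained optimum. The grid range and resolution are arbitrary choices made here, and the helper name `mahalanobis_sq` is hypothetical. The resulting points should correspond to the “circle” and “asterisk” markers only under the assumption that the figure uses the same µ, Σ and constraints.

```python
import numpy as np

mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.7],
                  [0.7, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def mahalanobis_sq(p):
    """Quadratic form (p - mu)^T Sigma^{-1} (p - mu); smaller means higher density."""
    d = p - mu
    return float(d @ Sigma_inv @ d)

# Truncation: project each coordinate of the unconstrained estimate onto [0, inf).
truncated = np.maximum(mu, 0.0)

# Brute-force search over a grid on the feasible region x >= 0, y >= 0.
grid = np.linspace(0.0, 3.0, 301)                 # step 0.01 (arbitrary choice)
xx, yy = np.meshgrid(grid, grid, indexing="ij")
d0, d1 = xx - mu[0], yy - mu[1]
cost = (Sigma_inv[0, 0] * d0 ** 2 + 2.0 * Sigma_inv[0, 1] * d0 * d1
        + Sigma_inv[1, 1] * d1 ** 2)
i, j = np.unravel_index(np.argmin(cost), cost.shape)
constrained_map = np.array([grid[i], grid[j]])
```

Truncation gives the point (0.5, 0), whose quadratic form is about 1.96, while the grid search finds (1.2, 0) with a value of 1.0. Because x and y are positively correlated, fixing y at its bound shifts the best value of x as well, which element-wise truncation ignores.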
This problem is known as Kalman filtering with state inequality constraints. It is
discussed in (Simon and Simon, 2006; Simon, 2010). They formulated the problem as
a quadratic program with linear inequality constraints, as shown in the following
display. The solution of the problem gives the estimates that satisfy the inequality
constraints.
\max_{x} f_X(x) \Leftrightarrow \min_{x} (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x})    (3.38)
s.t. Dx \le d    (3.39)
where Σ is the covariance estimate for the state estimate \hat{x}; D is a known
s × n constant matrix, s is the number of constraints, n is the number of state
variables, and s ≤ n. Further, D is assumed to have full rank, i.e. rank s. If the
rank of D is less than s, we can always drop the redundant constraints to make it
full rank.
In Kalman Filtering with Inequality Constraints For Turbofan Engine Health
Estimation (Simon and Simon, 2006), the same quadratic optimization formulation is
proposed and the general idea of the active set method is discussed. In the same
source, the authors also proved that the variance of the constrained estimates is
smaller than that of the unconstrained estimates. The case study was performed on
turbofan health monitoring, where quadratic programming was used to solve the
optimization problem. Compared with the unconstrained Kalman filter, the constrained
Kalman filter reduces the estimation error by 50% on average. This suggests that
constraints will also be very helpful for the problem discussed in Section 3.3.1.
3.3.3 An Efficient Near-Optimal Algorithm for EKF with
Bound Constraints in DTA Calibration
In the specific context of DTA estimation, the constraints are usually imposed on individual parameters. In the OD estimation example, we have x ≥ 0. Another example is the supply parameters, where the supply parameter vector can have both lower and upper bounds, i.e. $s^{lb} \le s \le s^{ub}$. So in our DTA calibration case, we have the following optimization formulation after each measurement update:
$$\min_{x}\, (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x}) \tag{3.40}$$
$$\text{s.t.}\quad x^{lb} \le x \le x^{ub} \tag{3.41}$$
where $\hat{x} = \hat{x}_{h|h}$, $\Sigma = P_{h|h}$, and $x^{lb}$ and $x^{ub}$ are the lower bound and upper bound vectors for the state vector x.
An intuitive way to solve this problem builds on the truncation practice. For simplicity of the demonstration, we assume there is no upper bound for x and focus on the lower bound. When we truncate $\hat{x}$, we set the elements that violate the lower bounds to the corresponding elements of $x^{lb}$. This essentially adds equality constraints to the optimization problem. Let Set A contain the indices where truncation is performed, and let Set $A^c$ be the complement of Set A. To solve this problem, we can maximize the following conditional PDF:
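$$\max_{x_{A^c}} f_X\left(x_{A^c} \mid x_A = (x^{lb})_A\right) \tag{3.42}$$
Maximizing the conditional probability density (the objective function) is equivalent to maximizing the joint probability density $f_X\left(x_{A^c}, x_A = (x^{lb})_A\right)$, since:
$$f_X\left(x_{A^c} \mid x_A = (x^{lb})_A\right) = \frac{f_X\left(x_{A^c}, x_A = (x^{lb})_A\right)}{f_X\left(x_A = (x^{lb})_A\right)}$$
and the denominator is a fixed probability density for given $\hat{x}$ and $\Sigma$. Thus, we want to:
$$\max_{x_{A^c}} f_X\left(x_{A^c}, x_A = (x^{lb})_A\right) \;\Leftrightarrow\; \min_{x_{A^c}}\, (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x})\,\Big|_{x_A = (x^{lb})_A} \tag{3.43}$$
Now we prove that the solution of Equation (3.42) is:
$$x_{A^c} = \hat{x}_{A^c} + \Sigma_{A^c,A}(\Sigma_{A,A})^{-1}\left(x_A - \hat{x}_A\right) \tag{3.44}$$
$$x_A = (x^{lb})_A \tag{3.45}$$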
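Proof. Assume that every index in Set $A^c$ is less than every index in Set A. If not, we can always exchange rows and columns of $\Sigma$, and the corresponding elements of x and $\hat{x}$, such that this assumption holds. Thus $\Sigma$, x and $\hat{x}$ can be split into blocks. In addition, we use the following notation:
$$x = \begin{bmatrix} x_{A^c} \\ x_A \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{3.46}$$
$$\Sigma^{-1} = J = \begin{bmatrix} J_{A^c,A^c} & J_{A^c,A} \\ J_{A,A^c} & J_{A,A} \end{bmatrix} = \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} \tag{3.47}$$
$$\Sigma^{-1}\hat{x} = \begin{bmatrix} J_{11}\hat{x}_1 + J_{12}\hat{x}_2 \\ J_{21}\hat{x}_1 + J_{22}\hat{x}_2 \end{bmatrix} = h = \begin{bmatrix} h_{A^c} \\ h_A \end{bmatrix} = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \tag{3.48}$$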
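Thus,
$$(x - \hat{x})^\top \Sigma^{-1} (x - \hat{x}) = x^\top J x - 2\hat{x}^\top J x + \hat{x}^\top J \hat{x} \tag{3.49}$$
$$= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^\top \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 2 \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}^\top \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + C \tag{3.50}$$
$$= x_1^\top J_{11} x_1 + \left(2 (J_{12} x_2)^\top - 2 h_1^\top\right) x_1 + x_2^\top J_{22} x_2 - 2 h_2^\top x_2 + C \tag{3.51}$$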
where C is a constant that does not depend on x. Note that $x_2$ is fixed at $(x^{lb})_A$ and only $x_1$ is variable. Thus, according to the first-order condition:
$$2 J_{11} x_1 + 2 J_{12} x_2 - 2 h_1 = 0 \tag{3.52}$$
$$\Rightarrow\; x_1 = (J_{11})^{-1} h_1 - (J_{11})^{-1} J_{12} x_2 \tag{3.53}$$
We now claim that $J_{11}$ is invertible. Since $\Sigma$ is the covariance matrix of a non-degenerate multivariate normal distribution, it is invertible, so $\Sigma$ and J are both positive definite. Since the leading principal minors of J are all positive, the leading principal minors of $J_{11}$ are also positive. Hence, $J_{11}$ is also positive definite, and thus invertible.
Then we substitute $h_1$ back,
$$x_1 = \hat{x}_1 - (J_{11})^{-1} J_{12} \left(x_2 - \hat{x}_2\right) \tag{3.54}$$
where $x_2 = x_A = (x^{lb})_A$.
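In the following part, we will prove that $(J_{11})^{-1} J_{12} = -\Sigma_{12}(\Sigma_{22})^{-1}$, where $\Sigma$ is also divided into blocks like J:
$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \tag{3.55}$$
So, we can do row operations:
$$\begin{bmatrix} I & -\Sigma_{12}(\Sigma_{22})^{-1} \end{bmatrix} \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} = \begin{bmatrix} \Sigma' & 0 \end{bmatrix} \tag{3.56}$$
with $\Sigma' \triangleq \Sigma_{11} - \Sigma_{12}(\Sigma_{22})^{-1}\Sigma_{21}$.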
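Since $\Sigma J = I$, right-multiplying both sides of Equation (3.56) by J gives us
$$\begin{bmatrix} I & -\Sigma_{12}(\Sigma_{22})^{-1} \end{bmatrix} = \begin{bmatrix} \Sigma' & 0 \end{bmatrix} \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} = \begin{bmatrix} \Sigma' J_{11} & \Sigma' J_{12} \end{bmatrix} \tag{3.57}$$
from which we conclude the following by matching entries on both sides:
$$\Sigma' = (J_{11})^{-1} \tag{3.58}$$
$$-\Sigma_{12}(\Sigma_{22})^{-1} = (J_{11})^{-1} J_{12} \tag{3.59}$$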
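The two identities (3.58) and (3.59) can be sanity-checked numerically. The sketch below uses a randomly generated positive definite matrix as a stand-in covariance; the matrix size, the block split `k` and the variable names are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
Sigma = B @ B.T + 5.0 * np.eye(5)      # random symmetric positive definite matrix
J = np.linalg.inv(Sigma)

k = 3                                   # block 1 = first k indices (A^c), block 2 = the rest (A)
S11, S12 = Sigma[:k, :k], Sigma[:k, k:]
S21, S22 = Sigma[k:, :k], Sigma[k:, k:]
J11, J12 = J[:k, :k], J[:k, k:]

schur = S11 - S12 @ np.linalg.solve(S22, S21)    # Sigma' in Equation (3.56)
lhs = np.linalg.solve(J11, J12)                  # (J11)^{-1} J12
rhs = -S12 @ np.linalg.inv(S22)                  # -Sigma12 (Sigma22)^{-1}

check_358 = np.allclose(schur, np.linalg.inv(J11))
check_359 = np.allclose(lhs, rhs)
print(check_358, check_359)
```

Both checks hold up to floating-point tolerance for any positive definite input, which is consistent with the block-matrix argument above.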
Thus, returning to the general case in Equations (3.40) and (3.41): when some MAP estimates of the unconstrained case violate the bounds (their indices are in Set A), we can set them to the nearest bounds, and then obtain the conditional MAP with $x_A$ fixed to the bounds, according to Equation (3.44). Note that this conditional MAP is not guaranteed to satisfy the bounds for $x_{A^c}$. Thus we can do this iteratively, adding the indices where bounds are violated from Set $A^c$ to Set A, and then re-estimating the conditional MAP, until all elements whose indices are in Set $A^c$ are in the feasible region. The near-optimal algorithm is specified as Algorithm 2.

Algorithm 2 EKF with Iteratively Added Equality Constraints
Run EKF and obtain the state estimate $\hat{x}$ and the variance estimate $\Sigma$; n is the dimension of $\hat{x}$
Initialize
    $I \leftarrow \emptyset$
    $A \leftarrow \emptyset$
    $x \leftarrow \hat{x}$
do
    if $I \neq \emptyset$ then
        Adjust invalid state elements
            $x_{I_{lb}} \leftarrow x^{lb}_{I_{lb}}$    (3.60)
            $x_{I_{ub}} \leftarrow x^{ub}_{I_{ub}}$    (3.61)
        Find conditional MAP estimates
            $A \leftarrow A \cup I$    (3.62)
            $A^c \leftarrow \{1, 2, \dots, n\} \setminus A$    (3.63)
            $x_{A^c} = \hat{x}_{A^c} + \Sigma_{A^c,A}(\Sigma_{A,A})^{-1}(x_A - \hat{x}_A)$    (3.64)
    end if
    $I_{lb} \leftarrow \emptyset$
    $I_{ub} \leftarrow \emptyset$
    Identify invalid state indices
    for j = 1 to n do
        if $x_j < x^{lb}_j$ then $I_{lb} \leftarrow I_{lb} \cup \{j\}$
        else if $x_j > x^{ub}_j$ then $I_{ub} \leftarrow I_{ub} \cup \{j\}$
        end if
    end for
    $I \leftarrow I_{lb} \cup I_{ub}$    (3.65)
while $I \neq \emptyset$

Here $x_A = [x_{A(1)}, \dots, x_{A(|A|)}]^\top$, where A(j) is the j-th element of Set A and |A| is the cardinality of Set A; similarly, $\Sigma_{A^c,A} = [\Sigma_{i,j}]_{(i,j) \in A^c \times A}$.
Based on our experiments, this algorithm yields an objective function value (Equation (3.40)) around 2% worse than the optimum, but it is much more efficient than solving the same quadratic programming problem.
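A minimal NumPy sketch of Algorithm 2 is given here for illustration. It is not the thesis implementation: the function name `bound_constrained_map`, the boolean-mask representation of Set A, and the iteration cap of n + 1 passes are assumptions of this sketch.

```python
import numpy as np

def bound_constrained_map(x_hat, Sigma, lb, ub):
    """Clamp violating entries to the nearest bound, then re-estimate the
    remaining entries with the conditional mean of Equation (3.64)."""
    x_hat = np.asarray(x_hat, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    lb = np.broadcast_to(np.asarray(lb, dtype=float), x_hat.shape)
    ub = np.broadcast_to(np.asarray(ub, dtype=float), x_hat.shape)
    x = x_hat.copy()
    active = np.zeros(x.size, dtype=bool)        # Set A as a boolean mask
    for _ in range(x.size + 1):                  # each pass adds at least one index to A
        low, high = x < lb, x > ub               # I_lb and I_ub
        if not (low.any() or high.any()):
            break
        x[low] = lb[low]                         # (3.60)
        x[high] = ub[high]                       # (3.61)
        active |= low | high                     # (3.62)
        free = ~active                           # (3.63)
        if free.any():
            S_fa = Sigma[np.ix_(free, active)]
            S_aa = Sigma[np.ix_(active, active)]
            shift = np.linalg.solve(S_aa, x[active] - x_hat[active])
            x[free] = x_hat[free] + S_fa @ shift # (3.64)
    return x

# 2-D example from Section 3.3.2 with non-negativity constraints only.
x_c = bound_constrained_map([0.5, -1.0], [[1.0, 0.7], [0.7, 1.0]], lb=0.0, ub=np.inf)
print(x_c)
```

For the 2-D example of Figure 3-2, one pass fixes y at 0 and shifts x from 0.5 to 1.2; the second pass finds no violations and stops.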
3.3.4 Coordinate Descent Algorithm with Near-Optimal Initialization
When we are interested in the true optimum, this algorithm can also serve as an initial estimate, in other words, a starting point for the quadratic programming. Since we are dealing with decoupled constraints on each element, a coordinate descent method can be applied to solve the quadratic programming problem. In our experience it is also faster than the quadratic programming toolbox in MATLAB. The specific coordinate descent algorithm we used is the following.
Algorithm 3 Coordinate Descent
Initialize
    $x \leftarrow x^0$
    $\varepsilon \leftarrow 0.001$
    $Q \leftarrow \Sigma^{-1}$
    $b \leftarrow -\Sigma^{-1}\hat{x}$
    $Obj_{this} \leftarrow (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x})$
do
    for j = 1 to n do
        $x_j = x_j - \frac{1}{Q_{j,j}}\left(Q_{j,1:n}\,x + b_j\right)$    (3.66)
        $x_j \leftarrow \max(x_j, x^{lb}_j)$    (3.67)
        $x_j \leftarrow \min(x_j, x^{ub}_j)$    (3.68)
    end for
    $Obj_{last} \leftarrow Obj_{this}$    (3.69)
    $Obj_{this} \leftarrow (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x})$    (3.70)
while $Obj_{last} - Obj_{this} > \varepsilon$

There are several remarks on the coordinate descent. First, the step size in each update is fixed to $1/Q_{j,j}$. Since the objective function is quadratic, an update with this step size gives the optimal solution for $x_j$ when the other dimensions are fixed. Second, this algorithm is computationally inexpensive, because Equation (3.66) involves no matrix multiplication, only the inner product of one row of Q with x. Last but not least, in the specific context of OD estimation, other criteria could be used in the stopping rule. For instance, a distance measure (like the L1 norm) between the current and the last estimated state vector could be used in place of the objective function: when the change is less than ε, the algorithm stops. When the L1 norm is used, ε = 0.001 is good enough, because hundreds of additional updates will not move the current estimated state vector by more than 1 unit. This is especially the case for OD flow estimation, since OD flow values only need a precision of 1. Values of 1.8 and 2.4 do not make much difference for the DTA system because both will be rounded to 2 vehicles. So this near-optimal solution is likely to work well in the DTA estimation context.
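Algorithm 3 can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions: the function name `coordinate_descent` and the safety cap `max_sweeps` are not from the thesis, and the starting point `x0` is expected to be feasible (for example, the output of Algorithm 2 or a truncated estimate).

```python
import numpy as np

def coordinate_descent(x_hat, Sigma, lb, ub, x0, eps=1e-3, max_sweeps=10_000):
    """Cyclic coordinate descent for min (x - x_hat)^T Sigma^{-1} (x - x_hat)
    subject to lb <= x <= ub, following Equations (3.66)-(3.70)."""
    x_hat = np.asarray(x_hat, dtype=float)
    lb = np.broadcast_to(np.asarray(lb, dtype=float), x_hat.shape)
    ub = np.broadcast_to(np.asarray(ub, dtype=float), x_hat.shape)
    Q = np.linalg.inv(np.asarray(Sigma, dtype=float))
    b = -Q @ x_hat
    x = np.array(x0, dtype=float)                # copy of the starting point

    def obj(v):
        d = v - x_hat
        return float(d @ Q @ d)

    obj_this = obj(x)
    for _ in range(max_sweeps):
        for j in range(x.size):
            x[j] -= (Q[j] @ x + b[j]) / Q[j, j]  # exact 1-D minimizer, (3.66)
            x[j] = min(max(x[j], lb[j]), ub[j])  # project onto bounds, (3.67)-(3.68)
        obj_last, obj_this = obj_this, obj(x)    # (3.69)-(3.70)
        if obj_last - obj_this <= eps:
            break
    return x

# Start from the truncated estimate of the 2-D example in Section 3.3.2.
x_cd = coordinate_descent([0.5, -1.0], [[1.0, 0.7], [0.7, 1.0]],
                          lb=0.0, ub=np.inf, x0=[0.5, 0.0])
print(x_cd)
```

On the 2-D example the first sweep already moves x from 0.5 to 1.2 while y stays clipped at 0, and the second sweep detects no further improvement.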
One last comment about the computational performance: in our case study, the two algorithms need less than 10 seconds to run, which is much less time than running the quadprog function in MATLAB. However, this is only an empirical comparison. Firm conclusions about the efficiency of the proposed algorithms require more tests and comparisons.
3.4 Summary
In this chapter, the extended Kalman filter methodology was first reviewed. Then its application to DTA models was demonstrated with a simplification of the EKF models. In addition, two methods of calculating the gradient matrix were reviewed and discussed. Next, an extension of the Kalman filter family to the constrained case was discussed. When the constraints are decoupled and can be imposed on each element, a near-optimal solution was given based on maximizing the conditional posterior distribution. Finally, an efficient algorithm based on coordinate descent to solve for the constrained optimum of the Gaussian posterior was discussed.
Chapter 4
Case Study: Constrained EKF on
Singapore Expressway Network
In this chapter, a case study is conducted on the Singapore Expressway Network to demonstrate the performance of the methods discussed in Chapter 3. This chapter is structured as follows: First the DTA system and data source are introduced. Then, the Singapore expressway network is briefly summarized. Finally, calibration methods, including EKF, constrained EKF and GLS, are tested on the network, along with results and discussions.
4.1 Model Overview
4.1.1 DynaMIT Overview
The DTA system we want to calibrate is DynaMIT (DYnamic Network Assignment for the Management of Information to Travelers). It is a state-of-the-art DTA system that aims to provide Traffic Management Centers (TMC) with real-time traffic estimation and prediction. The ultimate goal is to provide traveler information and route guidance based on the estimation and prediction. Since traffic estimation is the basis for the remaining functionalities, accurate traffic estimation is crucial, and it therefore forms the topic of this subsection.
Figure 4-1: DynaMIT Real-Time Framework (source: Ben-Akiva et al. (2010a))
DynaMIT is a simulation-based DTA system. The DynaMIT real-time workflow is shown in Figure 4-1, which is built on the general framework, as can be seen in Figure 1-1. DynaMIT achieves traffic state estimation and prediction based on historical data, a priori parameter values, and real-time observations. Specifically, the state estimation and prediction are accomplished by the online calibration algorithm, operated on demand and supply models. The online calibration algorithm was discussed in Chapter 3. In the following paragraphs, concepts and core models of demand simulation and supply simulation will be introduced.
The demand simulation mimics the process by which vehicles are added to the road network. It uses time-dependent origin-destination (OD) matrices to represent the demand pattern. The OD matrices are an aggregation of the demand decisions made by individuals, including origin, destination, departure time and route choice. Since the demand simulator models individual travel demand, it is microscopic. In contrast, the supply simulation handles the process by which traffic moves through the road network. It is mesoscopic, meaning that the traffic dynamics are modeled at the level of road segments and lanes, not individual vehicles. It uses speed-density relationships and queuing theory to quantify the effect of queues, spillbacks, incidents and bottlenecks.
To illustrate how the DTA model handles trips, consider a family who want to go to a movie by car. They decide to depart at 6:30pm. The driver of their sedan decides to take a certain route to avoid the evening peak. So far, the trip generation, departure time and route choice of this vehicle in the road network are handled by the demand simulation. The vehicle movement is then determined by the congestion along the selected route, weather conditions, road conditions (work zones, lane closures), etc. The vehicle movement part is handled by the supply simulation.
Generally, DynaMIT is capable of efficiently modeling large networks since it is a mesoscopic DTA model. Thus DynaMIT is a DTA candidate that is convenient to calibrate and fine-tune, although an analytical form is generally not available for simulation-based DTA. The simulation efficiency also makes DynaMIT useful in real-time decision support. Detailed information about the DynaMIT framework and its applications is available in (Ben-Akiva et al., 2010a).
4.1.2 MITSIMLab Overview
In this thesis, MITSIMLab was configured to mimic traffic in the real world. MITSIMLab, or MITSIM, is short for MIcroscopic Traffic SIMulator Laboratory. It is designed to model traffic flows in road networks equipped with advanced traffic control, route guidance and surveillance systems. It originated in 1996 (Yang and Koutsopoulos, 1996) and was released as an open-source project in 2004 (more information: http://its.mit.edu/software/mitsimlab). It has been used widely in practice, with applications in the USA, the UK, Sweden, Italy, Switzerland, Malaysia, Japan, Korea, etc. It was also used to test and verify the driving behavior models under development within the NGSIM (Next Generation Simulation) project.
As a microscopic simulator, MITSIMLab captures the interactions among vehicles and roads, including car-following, lane-changing and traffic signal response behaviors. Thus it can model the vehicles on the road network at lane-level precision. Drivers' route choice decisions are represented by a probabilistic model in the presence of real-time traffic information fed by route guidance systems. More details of MITSIMLab are available in (Ben-Akiva et al., 2010b). In summary, MITSIMLab supports a comprehensive representation of the entire transportation system, considering vehicles, traffic control devices, algorithms and other elements of the TMC (e.g. the surveillance system). Thus it is able to provide simulated data that closely resemble real-time surveillance data from the real world, for instance, traffic flow count data.
4.1.3 DynaMIT and MITSIMLab Integration
As stated in the above subsections, DynaMIT is an efficient mesoscopic DTA system while MITSIMLab is a comprehensive representation of the real world. Thus we can take advantage of both the realism of MITSIMLab and the efficiency of DynaMIT. We simulate the real world with MITSIMLab, where demand is generated and surveillance data are recorded. Then we use the surveillance data to calibrate DynaMIT; in other words, we tune the parameters such that the sensor data simulated by DynaMIT fit the surveillance data provided by MITSIM.
The advantage of using MITSIMLab as the data feed instead of real field data is that we have total control over the “real world”, which is beneficial when testing our algorithms. Nevertheless, for the calibration algorithm to be deployed in the real world, we have to deal with real data in the end.
In the following sections, the data flow between MITSIMLab and DynaMIT is explained in detail.
4.2 Calibration Framework for Singapore Expressway Network
4.2.1 Simulation Scenario Overview
The case study was conducted on the Singapore Expressway network. The road
network of Singapore is shown with links (curves) in Figure 4-2. The links with
deeper color represent expressways in Singapore, which are extracted and modeled in
a network file. The network file contains detailed and precise representations of the
road links and segments, while the location, length, curvature, default speed limits,
different lanes and lane-connections are also specified. The same network file is shared
by DynaMIT and MITSIMLab.
Figure 4-2: Singapore Road Network (source: Google Maps, 2016 )
The Singapore Expressway network file has 939 nodes and 1157 links. The definitions of nodes and links agree with the basic definitions in network studies: each link has two end points, and nodes connect with one or more end points. Nodes handle the connection between links, and some nodes serve as loaders where vehicles are added to the network. A single link comprises one or more segments, for which coordinates, lanes, lane-connections, curvature, speed limits and free flow speed are specified. The network visualization in DynaMIT and MITSIMLab is shown in Figure 4-3.
Figure 4-3: Singapore Road Network in DynaMIT and MITSIMLab
There are 650 sensors over the network, each placed on a single segment, according
to the real sensor location provided by Land Transport Authority of Singapore (LTA).
Those sensors provide the surveillance data, which are traffic flow counts per 5 minutes
in our scenario.
On the demand side, the parameters are the flows of 1623 Origin-Destination (OD) pairs in each 5-minute interval from 17:00 to 21:30 (night peak). Since each loader can be the origin or destination of a trip, there are tens of thousands of possible OD pairs in total. Based on heuristic rules, unreasonably long trips caused by detours are eliminated (Lu, 2013), which decreases the dimension to 4121 OD pairs. The dimension is further decreased to 1623 OD pairs by eliminating OD pairs whose values are constantly 0 throughout the night peak. In brief, the state vector has a dimension of 1623.
In summary, the case study is performed on the Singapore Expressway network. The dimension of the OD parameters during the night peak we defined is 1623 × 54 = 87642. We calibrate them sequentially in time, with 1623 parameters calibrated at each time step.
4.2.2 MITSIMLab and DynaMIT Integration: Data Flow
As stated in the above sections, this research relies on the integration of MITSIMLab and DynaMIT. However, we need a set of realistic input parameters to approximate the real world and narrow the gap between the simulation laboratory and the real world. To this end, the data generation procedure and the calibration framework are discussed in this subsection.
Figure 4-4: DynaMIT and MITSIMLab Integration Workflow
Figure 4-4 shows the overall workflow of the integration. First, MITSIMLab is calibrated with traffic flow data provided by LTA, using the W-SPSA algorithm (Lu, 2013); this calibration was existing work. W-SPSA is an offline calibration algorithm that uses data collected over a past time period and tunes the parameters knowing the evolving patterns of the surveillance data. In other words, for each time interval being calibrated, the surveillance data in the following intervals are already known. This may give offline algorithms an advantage in obtaining accurate parameters. Returning to the specific procedure of MITSIMLab calibration, W-SPSA is run to obtain the offline calibrated demand for the whole day; then the demand in the night peak period (17:00-21:30) is extracted for further use.
The second step in constructing the integration scenario is data smoothing. As with other offline calibration algorithms, the objective of W-SPSA is Equation (2.1). Although there is a penalty for parameters that deviate greatly from their a priori values, the parameters obtained are not guaranteed to be smooth. In our case, the offline calibrated demand for some OD pairs is shown in Figure 4-5. The drastic demand change for the OD pair with index 1007 in Figure 4-5 is not a pattern we usually observe in the real world. Thus, to make the demand pattern more realistic, we smoothed the calibrated demand using Gaussian kernel smoothing with a bandwidth h of 0.16 hour (10 minutes). The resulting smoothed demand is shown in Figure 4-6. This smoothed demand file with 1623 OD pairs is then used as the true demand for MITSIMLab, which is regarded as our “real world”. With these input parameters, MITSIMLab is run to obtain the surveillance data, i.e. sensor flow counts.
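The smoothing step could look roughly like the sketch below. The thesis does not state the exact estimator, so this assumes a normalized (Nadaraya-Watson) Gaussian kernel with the 10-minute bandwidth used as the kernel standard deviation; the function name and the sample series are made up for illustration.

```python
import numpy as np

def gaussian_kernel_smooth(values, step_min=5.0, bandwidth_min=10.0):
    """Smooth a demand series sampled every step_min minutes with a
    normalized Gaussian kernel of standard deviation bandwidth_min."""
    values = np.asarray(values, dtype=float)
    t = np.arange(values.size) * step_min
    # weights[i, k]: influence of interval k on the smoothed value at interval i
    weights = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth_min) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

demand = np.array([2, 3, 2, 40, 3, 2, 3, 2], dtype=float)  # made-up series with one spike
smoothed = gaussian_kernel_smooth(demand)
print(np.round(smoothed, 2))
```

Because each row of weights sums to one, a constant series is left unchanged, while an isolated spike such as the one above is spread over neighboring intervals.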
Figure 4-5: Sample W-SPSA Calibrated Demand
Figure 4-6: Sample Gaussian Kernel Smoothed Demand
As for the third step, we need to construct the historical demand for DynaMIT, which serves as the starting point of our calibration algorithm. The dotted arrow in Figure 4-4 indicates that, in this particular experiment, our historical demand approximates the baseline of the true demand. This does not mean that the algorithm relies on a shared demand in order to work. We therefore also want to add some challenge for the calibration algorithm in the next step, so we construct the historical demand with the following formulation.
$$x^{H}_{h,i} = \left(0.75 + 0.15 \times \mathrm{rand}()\right) \times x^{true}_{h,i} \tag{4.1}$$
where h is the time interval, i is the index of the OD pair, and $x^{true}_{h,i}$ is the smoothed demand input to the MITSIMLab system for the i-th OD pair at time h. rand() is a function that generates a random number in the interval [−1, 1]. It could be a Gaussian random number generator, a uniform random number generator, etc. So the ratio of historical demand to true demand lies in [0.6, 0.9].
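Equation (4.1) is straightforward to reproduce. The sketch below assumes the uniform option for rand(); the function name, the seed and the placeholder "true" demand values are illustrative and are not thesis data.

```python
import numpy as np

def make_historical_demand(x_true, rng):
    """Equation (4.1) with rand() drawn uniformly from [-1, 1]."""
    x_true = np.asarray(x_true, dtype=float)
    r = rng.uniform(-1.0, 1.0, size=x_true.shape)
    return (0.75 + 0.15 * r) * x_true

rng = np.random.default_rng(42)
# Placeholder demand: 54 intervals x 1623 OD pairs of made-up values.
x_true = rng.integers(0, 50, size=(54, 1623)).astype(float)
x_hist = make_historical_demand(x_true, rng)

ratio = x_hist[x_true > 0] / x_true[x_true > 0]
print(ratio.min(), ratio.max())
```

With this choice every nonzero historical value lies between 60% and 90% of the corresponding true value, so the historical demand is both noisy and systematically underestimated.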
Thus, we constructed the historical demand by adding noise to the true demand and underestimating it. The objective is to generate a historical demand with a different pattern from the true demand, in order to test whether the calibration algorithm can recover the true demand in this setting. The point of choosing an underestimated demand is to avoid an oversaturated historical scenario, in which the DTA system being calibrated behaves very differently from the true traffic system and the whole calibration framework may break down.
4.2.3 Calibration Settings for EKF
Last but not least, it is crucial to discuss the calibration settings. As seen in Equations (3.7) and (3.8), our EKF implementation works with deviations. It is worthwhile to rewrite in terms of deviations the basic Kalman filtering equations for the DTA calibration context, shown in Equations (3.3) and (3.4):
$$\partial x_h = \sum_{k=h-p}^{h-1} F^k_h\, \partial x_k + w_h \tag{4.2}$$
$$\partial M_h = \sum_{k=h-q+1}^{h} A^k_h\, \partial x_k + v_h \tag{4.3}$$
Note that the deviations are calculated relative to the historical values. In the third step, we obtained the historical state vector (demand), but we also need the historical measurements to compute the deviations. Since the measurement equation represents the DTA model, the historical measurements should also reflect the DTA model. Thus the historical measurements should be obtained by running DynaMIT with the historical demand as input. To reduce the stochasticity in the historical measurements and obtain a better foundation for further calibration, 5 runs of DynaMIT are conducted. Each run uses a different random seed, and the average measurement values are taken as the historical measurements.
As for the other calibration settings for Kalman filters (as shown in Algorithm 1), there are several parameters to initialize, including $\partial x_0$, $Q_h$, $R_h$ and $P_0$. Since deviations are used, the state vector $\partial x_0$ is set to a zero vector. For $Q_h$, $R_h$ and $P_0$, which are the covariance matrices of the random vectors $w_h$, $v_h$ and $\partial x_0$ respectively, the initialization rule is to disregard the covariance across different elements of each random vector, i.e. to set the covariance matrices as diagonal matrices. In addition, the diagonal elements of $Q_h$ are configured such that the standard deviation of $w_h$ is α times the magnitude of $\partial x_h$. To handle the situation where $\partial x_h$ has near-zero values, the standard deviation of the random variable $w_{h,i}$ is set to $\max\{q_0, \alpha|\partial x_{h,i}|\}$. Thus, in mathematical form, the covariance $Q_h$ has the following relation with the random vector $\partial x_h$:
$$Q_h = \begin{bmatrix} \max\{q_0, \alpha|\partial x_{h,1}|\}^2 & 0 & \cdots & 0 \\ 0 & \max\{q_0, \alpha|\partial x_{h,2}|\}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \max\{q_0, \alpha|\partial x_{h,n}|\}^2 \end{bmatrix} \tag{4.4}$$
where $\partial x_{h,j}$ is the j-th element of the random vector $\partial x_h$ and α is the fraction to configure. In our case, α is set to 0.3, meaning that $\partial x_h$ has some probability of going to 0 as well as some probability of becoming twice as large. $q_0$ is set to 1, allowing elements of $\partial x_h$ with 0 values to change.
Similarly, for the covariance $R_h$, we have:
$$R_h = \begin{bmatrix} \max\{r_0, \beta|M_{h,1}|\}^2 & 0 & \cdots & 0 \\ 0 & \max\{r_0, \beta|M_{h,2}|\}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \max\{r_0, \beta|M_{h,n}|\}^2 \end{bmatrix} \tag{4.5}$$
Here, we drop the ∂ because the covariance accounts for the measurement error, which depends on the magnitude of the real sensor flow counts, not on the deviation between historical and real sensor flow counts. The fraction β is chosen to be 0.1, meaning that the standard deviation is 10% of the sensor readings. $r_0$ is set to 10, considering the magnitude of the sensor readings. Note that the imperfection of the local linearization in Equation (4.3) is also captured in $R_h$.
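The diagonal covariance construction of Equations (4.4) and (4.5) can be sketched as follows. The helper name `diag_cov` and the small input vectors are assumptions for illustration only; the parameter values are the ones stated in the text.

```python
import numpy as np

def diag_cov(values, floor, frac):
    """Diagonal covariance whose standard deviations are max{floor, frac * |value|}."""
    std = np.maximum(floor, frac * np.abs(np.asarray(values, dtype=float)))
    return np.diag(std ** 2)

dx_h = np.array([0.0, 2.0, -10.0])    # illustrative state deviations, not thesis data
M_h = np.array([0.0, 50.0, 400.0])    # illustrative sensor flow counts

Q_h = diag_cov(dx_h, floor=1.0, frac=0.3)    # q0 = 1, alpha = 0.3, Equation (4.4)
R_h = diag_cov(M_h, floor=10.0, frac=0.1)    # r0 = 10, beta = 0.1, Equation (4.5)
print(np.diag(Q_h), np.diag(R_h))
```

The floor terms matter for small entries: a deviation of 0 or 2 vehicles still gets a standard deviation of 1, and a sensor reading of 0 or 50 still gets a standard deviation of 10.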
As for the initialization of $P_0$, it can be configured in the same way as $Q_0$. In particular, with our initialization of $\partial x_0 = 0$, $P_0$ is a diagonal matrix determined by $q_0$ alone ($q_0^2$ on the diagonal, following Equation (4.4)).
After all the preparation discussed above (historical input parameters, historical measurements and surveillance data, i.e. sensor counts), we can test and compare the different algorithms discussed in Chapter 3. The candidate algorithms are GLS (implemented in DynaMIT), EKF, and EKF with bound constraints.
4.2.4 Calibration Evaluation Criteria
The objective of the DTA calibration is to obtain the parameters so that the simulated
measurement outputs are close to the observed measurements in the surveillance data.
In order to quantify the discrepancy, the root mean square normalized error (RMSN)
was selected as the loss function. It is normalized to represent percentage of difference.
Specifically, it is calculated according to:
$$\mathrm{RMSN} = \frac{\sqrt{N \sum_{j=1}^{N} (M_j - \hat{M}_j)^2}}{\sum_{j=1}^{N} M_j} \tag{4.6}$$
where $M_j$ is the j-th observed (true) measurement value and $\hat{M}_j$ is the j-th simulated (estimated) measurement value. N can be the total number of sensors in each time interval, in which case j is the index of sensors in a particular time interval, and the RMSN indicates how the calibration algorithm performs in that specific time interval. N can also be the total number of intervals, in which case j is the index of time intervals for a specific sensor, and the RMSN shows how well this sensor is fitted over the whole simulation period. Finally, if j is a unique index for each sensor in each time interval, the summation is over all sensors and all intervals of a calibration run, and the RMSN indicates the overall performance of the algorithm over all the sensors in the whole calibration period.
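The three usages of Equation (4.6) differ only in which values are pooled. A minimal sketch, with made-up counts and an assumed layout of rows as time intervals and columns as sensors:

```python
import numpy as np

def rmsn(observed, simulated):
    """Root mean square normalized error of Equation (4.6) over all given values."""
    observed = np.asarray(observed, dtype=float).ravel()
    simulated = np.asarray(simulated, dtype=float).ravel()
    n = observed.size
    return np.sqrt(n * np.sum((observed - simulated) ** 2)) / np.sum(observed)

# Rows are time intervals, columns are sensors (illustrative numbers only).
counts_obs = np.array([[100.0, 200.0], [150.0, 250.0]])
counts_sim = np.array([[110.0, 190.0], [150.0, 260.0]])

per_interval = [rmsn(o, s) for o, s in zip(counts_obs, counts_sim)]
per_sensor = [rmsn(o, s) for o, s in zip(counts_obs.T, counts_sim.T)]
overall = rmsn(counts_obs, counts_sim)
print(per_interval, per_sensor, overall)
```

Note that the overall value is not the average of the per-interval values, because both the numerator and the denominator are pooled before the ratio is taken.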
Here we distinguish the three usages. The first is useful when comparing calibration algorithms, since the performance of different algorithms in each time interval can be compared. The second is useful when we want to compare how well different sensors are fitted. The last is an overall metric. The first and the last are used in our comparison in Section 4.3.
4.3 Results
In this section, the calibration results of the EKF and the constrained EKF are reported and compared with the Generalized Least Squares (GLS) implementation (Wen, 2009). An analysis is conducted to explain the performance of the EKF, together with an analysis of a particular sensor on the Singapore Expressway network. The computational performance is also discussed.
4.3.1 EKF Performance
As stated in Section 4.2.3, the EKF is configured with:
q0 = 1, r0 = 10, α = 0.3, β = 0.1
In addition, the simulation time is from 17:00 to 21:30. Then the EKF is run to calibrate the DynaMIT system against the surveillance data generated by the MITSIMLab system.
Estimated versus Observed Sensor Counts
Figure 4-7 shows scatter plots of estimated versus observed sensor counts for six intervals: 17:00-17:05, 17:30-17:35, 18:05-18:10, 19:10-19:15, 20:15-20:20 and 21:20-21:25, respectively. In the first several intervals, the EKF results appear acceptable. The subplot for interval 19:10-19:15 shows that the algorithm clearly overestimates the sensor counts, since the estimated counts lie above the diagonal line. This pattern persists until the end of the simulation. Worse, in the last subplot in Figure 4-7, the estimated and observed sensor counts are even less correlated than in previous intervals.
Figure 4-7: Estimated versus Observed Flow Counts: EKF
Sensor Flow Count RMSN versus Time
Figure 4-8: Flow Count RMSN versus Time: EKF
The same pattern can also be seen in Figure 4-8. The figure demonstrates how
the RMSN varies with time. The green line shows the RMSN for all 650 sensor counts
in each time interval, where the historical flow counts substitute for estimated flow
counts. This line indicates the RMSN of the expected estimated flow counts when no
calibration is applied and historical demand is directly loaded to the network. The
blue line is the sensor count RMSN produced by EKF. So the lower the blue line is,
the better the algorithm performs.
Based on Figure 4-7 and Figure 4-8, we make the following observations. First, the RMSN appears fine for the first hour; then it deteriorates and the algorithm diverges. Second, although the RMSN tends to decrease toward the end, the scatter plot at time 21:25 suggests that the algorithm will probably not converge back. Third, the scatter plots at 19:15, 20:20 and 21:25 in Figure 4-7 show that the overall simulated sensor flow is much greater than the observed sensor flow.
Following the third observation, we checked the simulation output and found the
number of vehicles in the network keeps growing during calibration. At the same
time, the simulation speed was greatly reduced and there were severe gridlocks in
the DynaMIT system. Thus it is extremely hard for the algorithm to correct the
gridlocks.
Analysis of EKF Performance
Results show that the EKF performed badly in terms of RMSN. The reason, as also discussed in Section 3.3, lies in the correlation among estimators in different dimensions. In the EKF, the estimates are unrestricted real numbers. In our DTA calibration case, some demand estimates are negative, with high variances. Since the demand for an OD pair cannot be negative, we simply set the negative estimates to zero and leave the positive values unchanged. By doing this, we add more vehicles to the network than the EKF suggests. From the point of view of probability theory, the expectation of the sum of the estimates is changed by truncating the negative values. More fundamentally, we disregard the correlation among estimators: once some demand values are set to zero, the others are no longer the maximum a posteriori estimates, yet we keep them unchanged. To illustrate, suppose we have only two OD pairs, each with only one possible route, and both routes pass through one common sensor. Imagine the sensor reports that 20 vehicles passed in a 5-minute interval, and the EKF gives -10 and 30 as the estimated demand values. Following the truncation practice, we set -10 to 0 but keep 30, so the total loaded demand is 30 rather than 20. We are therefore overestimating the overall OD flow in each time interval. In our case study, the true demand values in each interval are sparse, meaning that a large proportion of the true OD flows are close to zero. In this situation, the overestimation problem is more severe. In previous research, this sparsity in OD flows was not addressed, which may explain why the EKF showed good performance in the literature.
Notably, when the EKF is followed by a truncation of negative values, we add extra vehicles to the network in each interval, and they remain until their trips are finished. More importantly, in the next time interval, the overestimated sensor counts caused by the overestimated OD flows lead the Kalman filter to reduce the overall level of demand by making some of the OD values even more negative, while keeping the sum of all estimated demand at a low level (equivalent to the expectation of the sum of the true demand). This implies that some positive OD values become larger, and the total number of vehicles after truncation is much greater than the sum of the true demand. After running the EKF for 3 hours, the congestion therefore accumulates to a significant level. This is why we end up with an oversaturated scenario, in which the simulated sensor counts are much higher than the observed counts.
An Example Estimation Step in EKF
Here we look at an example of the demand values related to the sensor with ID 528. This sensor captures the demand of 11 OD pairs (OD pair IDs renamed to 1-11). In our demand file for the MITSIMLab system, all 11 OD pairs have a value of 0 for a certain time interval, so Sensor 528 has a reading of 0 for that interval. The estimated demand values, however, are shown in Table 4.1. Note that the decimals come directly from the EKF estimates and will be rounded by the DTA system; they are kept here only to show the original estimates. According to the EKF, the sum of these estimates is negative and relatively close to 0, whereas the sum of the truncated estimates is almost 100. While the sum of the true OD values is approximated reasonably well by the EKF before truncation, the truncated estimates are far from the truth.
Table 4.1: Estimated OD flows (veh/5min) for OD Pairs Related to Sensor 528
OD pair id    Unconstrained EKF    Truncated EKF    Constrained EKF
1                         -1.07             0                  0
2                         14.85         14.85                  0
3                        -19.13             0                  0
4                        -31.50             0                  0
5                         -0.78             0                  0
6                          0.91          0.91                  0
7                         -7.93             0                  0
8                          8.78          8.78               0.06
9                         34.61         34.61                  0
10                        37.44         37.44                  0
11                       -53.12             0                  0
sum                      -16.93         96.59               0.06
This problem is solved after we apply constrained EKF. The estimates for the 11
OD pairs after applying constrained EKF are shown in the last row of Table 4.1. These
estimates are much better than the truncated EKF estimates in terms of obtaining
the true values and reducing total estimated demand.
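The effect of truncation in Table 4.1 can be checked with a few lines of arithmetic. The following minimal Python sketch takes the unconstrained EKF estimates from the table, sets the negative values to 0, and compares the two sums. The variable names are illustrative and are not part of the thesis implementation.

```python
# Unconstrained EKF estimates for the 11 OD pairs observed by sensor 528
# (values copied from Table 4.1, in veh/5min).
unconstrained = [-1.07, 14.85, -19.13, -31.50, -0.78, 0.91,
                 -7.93, 8.78, 34.61, 37.44, -53.12]

# Truncation as applied before feeding the DTA system: negatives become 0,
# positive estimates are left unchanged.
truncated = [max(0.0, x) for x in unconstrained]

sum_unconstrained = sum(unconstrained)
sum_truncated = sum(truncated)
# Demand effectively added to the network by the truncation step.
added_by_truncation = sum_truncated - sum_unconstrained

print(f"sum before truncation: {sum_unconstrained:.2f}")
print(f"sum after truncation:  {sum_truncated:.2f}")
print(f"demand added:          {added_by_truncation:.2f}")
```

The sum before truncation computed from the rounded entries differs from the tabulated -16.93 in the last digit, which is a rounding effect. The demand added by truncation equals the total magnitude of the discarded negative estimates, about 113.5 veh/5min for this single sensor and interval.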
4.3.2 Constrained EKF Performance
As stated in Section 3.3, Algorithms 2 and 3 are applied to implement constrained
EKF. Since an additional optimization step ensures that the estimates are feasible
(i.e. satisfy the non-negativity constraints), there is no need for truncation. The
scatter plot of estimated counts versus observed counts for constrained EKF is shown
in Figure 4-9. It covers the same last four intervals as Figure 4-7, where the
performance of EKF is poor, and indicates a clear improvement in fitting the sensor
counts.
Figure 4-9: Estimated versus Observed Flow Counts: Constrained EKF
Figure 4-10 shows the constrained EKF performance over time. Constrained EKF
manages to keep the overall RMSN at around 18% and maintains this low RMSN until
the calibration ends. In addition, the calibration algorithm still retains some of
the pattern of the historical values, since the two RMSN lines follow similar
patterns in the first several intervals.
Figure 4-10: Flow Count RMSN versus Time: Constrained EKF
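For reference, an RMSN value such as the one plotted above can be computed as in the sketch below. It assumes the normalized root mean square error definition commonly used in the DTA calibration literature, RMSN = sqrt(N * sum_i (y_i - yhat_i)^2) / sum_i y_i, where y_i are observed counts and yhat_i are simulated counts. The function name and the sample counts are hypothetical.

```python
import math

def rmsn(observed, estimated):
    """Root mean square normalized error between two count vectors."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    squared_error = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, estimated))
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("RMSN is undefined when observed counts sum to zero")
    return math.sqrt(n * squared_error) / total_observed

# Hypothetical sensor counts for one interval (veh/5min).
observed_counts = [100, 200]
estimated_counts = [110, 190]
print(f"RMSN = {rmsn(observed_counts, estimated_counts):.4f}")
```

Because the error is normalized by the total observed count, the measure is comparable across intervals with different traffic levels.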
As for the estimated OD flows, the same OD pairs shown before (OD ids 1000 to 1008)
are investigated. Figures 4-11 and 4-12 show the OD flow estimates for EKF and
constrained EKF, respectively. The horizontal axis is the hour of day, from 17:00 to
21:30, and the vertical axis indicates the OD flow value. The blue curve shows the
true demand in the MITSIMLab system (treated as the real world), while the orange
curve represents the estimated demand. The calibration objective is to make the
estimated curve fit the true curve, so that the algorithm recovers the true demand.
Here we make several observations. First, EKF estimates tend to oscillate and
violate the lower bound of 0. All estimated OD flows in Figure 4-11 violate the bound
at some point in time. Notably, this situation is completely avoided by constrained
EKF, as shown in Figure 4-12. Second, the overall overestimation by EKF is
significantly reduced by constrained EKF. For instance, the OD pairs with ids 1003
and 1005 in Figure 4-11 have abnormally large estimated values from 20:00 to 21:30.
This means that the algorithm constantly adds a large number of vehicles to the DTA
system, which is likely to make the system oversaturated. In comparison, while there
are still some spikes in the constrained EKF case in Figure 4-12, their magnitude is
only around 1. In general, the demand values are estimated well by constrained EKF.
Third, EKF estimates OD id 1007 well at first, but a negative spike lasting around
25 minutes makes the estimated demand curve discontinuous. In contrast, constrained
EKF provides good estimates throughout the calibration period. This implies that EKF
retains some ability to estimate states whose true values are far from the bounds.
However, because of the influence of other states whose true values are close to the
bounds, such states may still be estimated incorrectly. Finally, even for
constrained EKF, some OD flows are not fully recovered (e.g. the OD pair with id
1004). This implies that constrained EKF can be further improved.
Figure 4-11: Sample Calibrated Demand by EKF (before truncation)
Figure 4-12: Sample Calibrated Demand by Constrained EKF
4.3.3 Comparison with GLS
Based on the implementation in (Wen, 2009), the GLS algorithm is also tested and
compared. GLS requires a covariance matrix as input. To obtain it, the covariance
matrix is first set to the identity matrix, which amounts to estimating the
Ordinary Least Squares (OLS) model. After the OD and sensor estimates are obtained
with OLS, the variance of each OD pair and sensor count is estimated, and the
covariance matrix is replaced by the diagonal matrix of those variances. The
following figures compare the performance of constrained EKF and GLS.
From Figure 4-13, it can be concluded that constrained EKF performs worse than GLS
in the first several intervals, due to the imperfect covariance setting. After
several intervals, however, the performance of constrained EKF is comparable with
that of GLS, and it is superior to GLS in the last several intervals. This may be
due to a more accurate covariance matrix, which benefits from the covariance update
mechanism inherent in the Kalman filter family. From Figure 4-14, we can conclude
that constrained EKF tends to estimate sensor counts correctly (20:20) or to
underestimate them (21:25), while GLS tends to overestimate the sensor counts in
both intervals presented. For DTA systems, it is generally better for calibration to
start from an underestimated scenario than from an overestimated one, because of the
risk of creating unnecessarily saturated conditions. This may explain why
constrained EKF performs better at the end of the simulation period.
Figure 4-13: Flow Count RMSN versus Time: GLS versus Constrained EKF
Figure 4-14: Flow Counts Comparison: Constrained EKF (left) vs GLS (right)
4.4 Summary
In this chapter, a case study based on the Singapore expressway network was
presented. The data flow and calibration settings were described, followed by a
comparison of the calibration algorithms EKF, constrained EKF and GLS. It was
concluded that EKF does not converge in this setting because of poorly calibrated
values caused by the truncation of negative values, which is a necessary step to
feed the estimates to the DTA system. The reasons for the poor performance of EKF
were discussed and a specific example was given to demonstrate them. In addition,
constrained EKF and GLS were tested. The results indicate that both GLS and
constrained EKF converge, and that constrained EKF works better overall. This is
probably due to its inherent covariance update procedure, whereas GLS uses a fixed
covariance matrix.
Chapter 5
Conclusions
5.1 Summary
Traffic congestion is an active research topic because of its negative impacts:
wasted energy, exhaust gas, delays and frustration. Traffic policies, projects,
management, etc. work together as a balanced and diversified solution to congestion.
In order to achieve global traffic management, dynamic traffic assignment (DTA)
systems are widely used to support the operations of Traffic Management Centers
(TMCs). One valuable and important application of DTA is to provide route guidance
to drivers in order to alleviate overall traffic congestion.
However, the effectiveness of route guidance depends significantly on the fidelity
of the DTA system. To be specific, DTA systems model the process of loading time-
dependent traffic demand onto road networks and compute the traffic conditions at
link level. Without accurate demand and supply parameters, it is unlikely that DTA
systems will produce correct predictions and route guidance. Thus it is critical to
perform parameter estimation, i.e. DTA calibration, which is the topic of this thesis.
In addition to the accuracy requirement, the algorithm should also be online,
meaning that it can update parameter estimates using only the data collected so far.
Existing online calibration algorithms include generalized least squares (GLS)
estimation, the extended Kalman filter (EKF) and its variants. However, these
algorithms have usually been tested on freeway corridors. Their effectiveness has
not been demonstrated on a large network such as the entire Singapore expressway
system, where there are thousands of possible Origin-Destination (OD) pairs in every
time interval. The structure of the Singapore expressway network is also more
complex than that of freeway corridors, since many different traffic flows are
captured by a single sensor. Testing these online calibration algorithms on the
Singapore expressways is therefore important. In this thesis, the demand values for
OD pairs are the parameters to calibrate and the measurements are traffic flow
counts on links.
In general, a covariance matrix needs to be estimated in order to perform GLS,
while EKF updates the covariance matrix at every time step. Since GLS uses a fixed
covariance matrix throughout the whole simulation period, EKF would be expected to
perform better than GLS. However, the flow counts estimated by EKF are abnormally
poor, and the algorithm diverges. The reason is that some of the demand values
estimated by EKF are negative, and the DTA system truncates negative estimates to 0,
since negative demand does not make sense. This essentially adds more vehicles to
the DTA system, because it changes the total number of vehicles suggested by EKF.
This happens in every time interval, so the additional vehicles accumulate until we
obtain an oversaturated scenario and the algorithm diverges.
Inspired by similar research in control theory, an EKF formulation with constraints
(constrained EKF) is introduced to address this problem. An algorithm that
iteratively adds equality constraints, followed by a coordinate descent algorithm,
is proposed to obtain better maximum a posteriori (MAP) estimates. Notably, the
algorithm handles the general situation where both lower bounds and upper bounds
are present for the state vector. It provides estimates that satisfy the
constraints, so that no truncation is necessary for DTA systems.
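To illustrate the kind of problem a constrained update solves, the sketch below minimizes the quadratic 0.5 (x - x_hat)^T W (x - x_hat) subject to box bounds using projected coordinate descent. It is a generic textbook scheme and not a reproduction of Algorithms 2 and 3. The precision matrix W, the bounds and all names are assumptions chosen for a two-variable example.

```python
def box_constrained_map(x_hat, W, lower, upper, sweeps=100, tol=1e-10):
    """Minimize 0.5*(x - x_hat)^T W (x - x_hat) subject to lower <= x <= upper.

    W is assumed symmetric positive definite (an inverse covariance).
    Each coordinate is minimized exactly in turn and clipped to its bounds.
    """
    n = len(x_hat)
    # b = W @ x_hat, so the objective equals 0.5*x^T W x - b^T x + const.
    b = [sum(W[i][j] * x_hat[j] for j in range(n)) for i in range(n)]
    # Start from the truncated (clipped) unconstrained estimate.
    x = [min(max(x_hat[i], lower[i]), upper[i]) for i in range(n)]
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(n):
            off_diagonal = sum(W[i][j] * x[j] for j in range(n) if j != i)
            candidate = (b[i] - off_diagonal) / W[i][i]
            new_value = min(max(candidate, lower[i]), upper[i])
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:
            break
    return x

# Hypothetical two-state example: unconstrained estimate and precision matrix.
x_hat = [-10.0, 30.0]
W = [[2.0, 1.0], [1.0, 2.0]]
lower = [0.0, 0.0]
upper = [float("inf"), float("inf")]
constrained = box_constrained_map(x_hat, W, lower, upper)
print(constrained)
```

In this example the unconstrained estimate (-10, 30) would be truncated to (0, 30), whereas the constrained minimizer is (0, 25): holding the first state at its bound also adjusts the second state through the off-diagonal term of W, which plain truncation ignores.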
Finally, a case study with synthetic data for the Singapore expressway network is
conducted. The MITSIMLab and DynaMIT integration framework is applied, where
MITSIMLab generates the flow counts and the calibration algorithms are tested on the
DynaMIT system. To make the synthetic demand used by MITSIMLab realistic, it is
taken from an offline calibration against real flow counts on the Singapore
expressways. There are 1623 OD pairs, each of which must be estimated every 5
minutes. The simulation period is 17:00-21:30, thus 54 time intervals in total. The
settings for EKF and constrained EKF are the same. Results show that the EKF
algorithm diverges and fits the flow counts poorly. Constrained EKF performs much
better and fits the counts quite well. Compared with GLS, constrained EKF also
performs better in general, especially in later intervals. This coincides with the
intuition that EKF-based algorithms update the covariance automatically, which gives
them an advantage over GLS, where a fixed covariance matrix is used. The results
indicate that constrained EKF is a very competent candidate for DTA online
calibration. It has the potential to calibrate DTA systems with real world traffic
data.
5.2 Future Research Directions
More scenario testing
The results show convincingly that constrained EKF works much better than
unconstrained EKF. However, it cannot be concluded that it will always outperform
GLS. It would be beneficial to perform more case studies covering more scenarios,
preferably with more types of parameters and measurements, to establish whether its
superior performance holds more generally.
Applying directly to real world case studies
This research is conducted on a smoothed modification of real world data, based on
existing offline calibration results. Although the demand from which MITSIMLab
generates the flow counts is at the same level as real data on the Singapore
expressways, applying the algorithms to calibrate DynaMIT directly with real world
data would demonstrate their capability of fitting real scenarios.
Comparing Algorithms 2 and 3 with quadratic programming
In Chapter 3, we proposed Algorithms 2 and 3. According to empirical results, they
are much more efficient than quadratic programming in MATLAB. It would be
interesting to see whether our algorithms remain more efficient when required to
give the same calibration accuracy as quadratic programming in MATLAB. More tests
and comparisons would be beneficial.
Computational Performance
In the current implementation, the bottleneck of EKF and constrained EKF is the
estimation of the gradient matrix, which is currently calculated by central finite
differences in parallel. Notably, this calculation can be fully parallelized, as
long as there are enough cores to distribute the jobs. In our case, we use a 20-core
machine, and the calibration of each time interval takes about 12 minutes. Although
this exceeds the real-time target of at most 5 minutes per interval, meeting the
target is feasible with a more powerful distributed system.
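The structure of a parallel central finite difference can be sketched as follows. This is a minimal illustration, not the thesis implementation: the measurement function here is a hypothetical toy model standing in for a DTA simulation run, and a thread pool is used only to show that each column of the gradient matrix is an independent job. For real simulator runs, separate processes or machines would be the natural choice.

```python
from concurrent.futures import ThreadPoolExecutor

def central_difference_jacobian(h, x, step=1e-4, max_workers=4):
    """Approximate H[i][j] = d h_i / d x_j by central differences.

    Each column needs two evaluations of h and is independent of the
    others, so the columns are dispatched to a worker pool.
    """
    def column(j):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += step
        x_minus[j] -= step
        h_plus = h(x_plus)
        h_minus = h(x_minus)
        return [(p - m) / (2.0 * step) for p, m in zip(h_plus, h_minus)]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        columns = list(pool.map(column, range(len(x))))
    # Transpose the list of columns into a row-major matrix.
    return [list(row) for row in zip(*columns)]

def toy_measurement_model(x):
    # Hypothetical stand-in for a DTA run mapping OD flows to sensor counts.
    return [x[0] ** 2 + x[1], 3.0 * x[1]]

jacobian = central_difference_jacobian(toy_measurement_model, [2.0, 5.0])
print(jacobian)
```

A state vector of dimension n requires 2n function evaluations, which is why the cost scales with the number of OD pairs. Under the idealized assumption of linear scaling, 12 minutes on 20 cores corresponds to about 240 core-minutes per interval, so a 5-minute target would need on the order of 48 cores.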
As the dimension grows even larger, adding more CPUs may not be an economical way
to solve the problem. A future research direction is therefore to use machine
learning algorithms to perform regression on gradient matrices based on measurements
and estimated state vectors. With enough gradient matrices calculated offline,
together with their corresponding measurement and state vectors, a training set can
be constructed. Regression methods can then be applied and tested to see whether
future gradient matrices can be predicted. This would be a generalization of the
LimEKF idea, where gradient matrices are precomputed to greatly reduce computational
complexity. If the predictions are accurate and work well with EKF-based methods,
this would reduce the computational load significantly.
Autoregressive Degree and Prediction
Currently the implementation is based on a transition model of autoregressive (AR)
degree 1, where the deviation in the current interval is assumed to be the same as
in the previous interval. This is a rather simplified assumption and it affects how
the model predicts the state vector. It may not have much effect on the estimation
procedure, since in each measurement update step the state vector predicted by the
transition equation serves only as a starting value. As long as the starting point
is reasonable, the gradient estimation or linearization is similar, and thus the
measurement-updated estimates will be similar. However, prediction accuracy depends
strongly on the transition model, especially when predicting traffic states several
intervals ahead. In this case, the AR(1) model has limited predictive power. In
future research, prediction accuracy needs to be checked, and more complicated AR
models should be applied.
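The persistence assumption described above can be written as a one-line prediction rule. The sketch below assumes that deviations are measured from historical OD flows and that the most recent estimated deviation is carried forward unchanged. The function name, the indexing convention and the numbers are hypothetical.

```python
def predict_with_persistent_deviation(historical, estimated_now,
                                      current_index, steps_ahead):
    """Predict an OD flow by carrying the current deviation forward.

    historical     : historical OD flow per interval (veh/5min)
    estimated_now  : calibrated OD flow for interval `current_index`
    steps_ahead    : how many intervals into the future to predict
    """
    deviation = estimated_now - historical[current_index]
    return historical[current_index + steps_ahead] + deviation

# Hypothetical historical profile for one OD pair over four intervals.
historical_flow = [40.0, 50.0, 65.0, 60.0]
# Suppose calibration estimated 44 veh/5min in interval 1 (deviation of -6).
one_step = predict_with_persistent_deviation(historical_flow, 44.0, 1, 1)
two_step = predict_with_persistent_deviation(historical_flow, 44.0, 1, 2)
print(one_step, two_step)
```

Every future interval inherits the same deviation, so the predicted profile is simply the historical profile shifted by a constant. This is why the model carries little information for predictions several intervals ahead, and why a higher-order AR model that weights several past deviations is a natural extension.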
Bibliography
Antoniou, C. (2004). On-line calibration for dynamic traffic assignment, PhD thesis, Massachusetts Institute of Technology.
Antoniou, C., Ben-Akiva, M. and Koutsopoulos, H. (2004). Incorporating automated vehicle identification data into origin-destination estimation, Transportation Research Record: Journal of the Transportation Research Board (1882): 37–44.
Antoniou, C., Ben-Akiva, M. and Koutsopoulos, H. N. (2006). Dynamic traffic demand prediction using conventional and emerging data sources, IEE Proceedings-Intelligent Transport Systems, Vol. 153, IET, pp. 97–104.
Antoniou, C., Koutsopoulos, H. N. and Yannis, G. (2007). An efficient non-linear Kalman filtering algorithm using simultaneous perturbation and applications in traffic estimation and prediction, Intelligent Transportation Systems Conference, 2007. ITSC 2007. IEEE, IEEE, pp. 217–222.
Ashok, K. (1996). Estimation and prediction of time-dependent origin-destination flows, PhD thesis, Massachusetts Institute of Technology.
Ashok, K. and Ben-Akiva, M. E. (1993). Dynamic origin-destination matrix estimation and prediction for real-time traffic management systems, International Symposium on the Theory of Traffic Flow and Transportation (12th: 1993: Berkeley, Calif.). Transportation and traffic theory.
Ashok, K. and Ben-Akiva, M. E. (2000). Alternative approaches for real-time estimation and prediction of time-dependent origin–destination flows, Transportation Science 34(1): 21–36.
Balakrishna, R. (2002). Calibration of the demand simulator in a dynamic traffic assignment system, Master’s thesis, Massachusetts Institute of Technology.
Balakrishna, R. (2006). Off-line calibration for dynamic traffic assignment models, PhD thesis, Massachusetts Institute of Technology.
Barcelo, J. (2010). Models, traffic models, simulation, and traffic simulation, Fundamentals of Traffic Simulation, Springer, pp. 1–62.
Barcelo, J. and Casas, J. (2005). Dynamic network simulation with AIMSUN, Simulation Approaches in Transportation Analysis, Springer, pp. 57–98.
Ben-Akiva, M., Koutsopoulos, H. N., Antoniou, C. and Balakrishna, R. (2010a). Traffic simulation with DynaMIT, Fundamentals of Traffic Simulation, Springer, pp. 363–398.
Ben-Akiva, M., Koutsopoulos, H. N., Toledo, T., Yang, Q., Choudhury, C. F., Antoniou, C. and Balakrishna, R. (2010b). Traffic simulation with MITSIMLab, Fundamentals of Traffic Simulation, Springer, pp. 233–268.
Chang, G.-L. and Wu, J. (1994). Recursive estimation of time-varying origin-destination flows from traffic counts in freeway corridors, Transportation Research Part B: Methodological 28(2): 141–160.
Cremer, M. and Keller, H. (1987). A new class of dynamic methods for the identification of origin-destination flows, Transportation Research Part B: Methodological 21(2): 117–132.
FHWA (2015). CORSIM software. URL: http://ops.fhwa.dot.gov/trafficanalysistools/corsim.htm
FHWA (2016). Transmodeler software. URL: http://www.caliper.com/transmodeler/default.htm
FHWA, U. D. o. T. (2014). Series, highway statistics.
Huang, E. (2010). Algorithmic and implementation aspects of on-line calibration of dynamic traffic assignment, Master’s thesis, Massachusetts Institute of Technology.
INRO (2015). Emme software. URL: http://www.inro.ca/en/products/emme2/index.php
Kalman, R. E. (1960). A new approach to linear filtering and prediction problems, Journal of Basic Engineering 82(1): 35–45.
Lu, L. (2013). W-SPSA: an efficient stochastic approximation algorithm for the off-line calibration of dynamic traffic assignment models, Master’s thesis, Massachusetts Institute of Technology.
Mahmassani, H., Hawas, Y., Abdelghany, K., Abdelfatah, A., Chiu, Y., Kang, Y., Chang, G., Peeta, S., Taylor, R. and Ziliaskopoulos, A. (1998). DYNASMART-X; volume II: analytical and algorithmic aspects, Technical Rep. ST067 85.
Mahmassani, H. S. (2001). Dynamic network traffic assignment and simulation methodology for advanced system management applications, Networks and Spatial Economics 1(3-4): 267–292.
Mahut, M. and Florian, M. (2010). Traffic simulation with Dynameq, Fundamentals of Traffic Simulation, Springer, pp. 323–361.
Omrani, R. and Kattan, L. (2012). Demand and supply calibration of dynamic traffic assignment models: Past efforts and future challenges, Transportation Research Record: Journal of the Transportation Research Board (2283): 100–112.
Peeta, S. and Ziliaskopoulos, A. K. (2001). Foundations of dynamic traffic assignment: The past, the present and the future, Networks and Spatial Economics 1(3-4): 233–265.
PTV (2015a). VISSIM software. URL: http://vision-traffic.ptvgroup.com/en-uk/products/ptv-vissim/
PTV (2015b). Visum software. URL: http://vision-traffic.ptvgroup.com/en-uk/products/ptv-visum/
Schrank, D., Eisele, B., Lomax, T. and Bak, J. (2015). Urban mobility scorecard, Technical report, Technical Report August, Texas A&M Transportation Institute and INRIX, Inc.
Simon, D. (2010). Kalman filtering with state constraints: a survey of linear and nonlinear algorithms, Control Theory & Applications, IET 4(8): 1303–1318.
Simon, D. and Simon, D. L. (2006). Kalman filtering with inequality constraints for turbofan engine health estimation, Control Theory and Applications, IEE Proceedings-, Vol. 153, IET, pp. 371–378.
Smith, M., Duncan, G. and Druitt, S. (1995). Paramics: microscopic traffic simulation for congestion management, Dynamic Control of Strategic Inter-Urban Road Networks, IEE Colloquium on, IET, pp. 8–1.
Spall, J. C. (1992). Multivariate stochastic approximation using a simultaneous perturbation gradient approximation, Automatic Control, IEEE Transactions on 37(3): 332–341.
Wang, Y., Messmer, A. and Papageorgiou, M. (2001). Freeway network simulation and dynamic traffic assignment with METANET tools, Transportation Research Record: Journal of the Transportation Research Board 1776(1): 178–188.
Wang, Y. and Papageorgiou, M. (2005). Real-time freeway traffic state estimation based on extended Kalman filter: a general approach, Transportation Research Part B: Methodological 39(2): 141–167.
Wen, Y. (2009). Scalability of dynamic traffic assignment, PhD thesis, Massachusetts Institute of Technology.
Yang, Q. and Koutsopoulos, H. N. (1996). A microscopic traffic simulator for evaluation of dynamic traffic management systems, Transportation Research Part C: Emerging Technologies 4(3): 113–129.
Zhou, X. and Mahmassani, H. S. (2007). A structural state space model for real-time traffic origin–destination demand estimation and prediction in a day-to-day learning framework, Transportation Research Part B: Methodological 41(8): 823–840.