
Constrained Extended Kalman Filter: an Efficient

Improvement of Calibration for Dynamic Traffic Assignment

Models

by

Haizheng Zhang

B. Eng. Automation, Tsinghua University, 2013

Submitted to the Department of Civil and Environmental Engineering and the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degrees of

Master of Science in Transportation

and

Master of Science in Electrical Engineering and Computer Science

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2016

© Massachusetts Institute of Technology 2016. All rights reserved.

Author

Department of Civil and Environmental Engineering

Department of Electrical Engineering and Computer Science

May 19, 2016

Certified by: Moshe E. Ben-Akiva

Edmund K. Turner Professor of Civil and Environmental Engineering

Thesis Supervisor

Certified by: Francisco C. Pereira

Professor, Technical University of Denmark

Thesis Supervisor

Certified by: Jacob K. White

Cecil H. Green Professor of Electrical Engineering and Computer Science

Thesis Reader

Accepted by: Leslie A. Kolodziejski

Chair, Department Committee on Graduate Students

Accepted by: Heidi Nepf

Chair, Graduate Program Committee

Constrained Extended Kalman Filter: an Efficient

Improvement of Calibration for Dynamic Traffic Assignment

Models

by

Haizheng Zhang

Submitted to the Department of Civil and Environmental Engineering and the Department of Electrical Engineering and Computer Science

on May 19, 2016, in partial fulfillment of the requirements for the degrees of

Master of Science in Transportation

and

Master of Science in Electrical Engineering and Computer Science

Abstract

The calibration (estimation of inputs and parameters) for dynamic traffic assignment (DTA) systems is a crucial process for traffic prediction accuracy, and thus critical to global traffic management applications to reduce traffic congestion. In support of real-time traffic management, the DTA calibration algorithm should also be online, in terms of: 1) estimating inputs and parameters in a time interval based only on data up to that time; 2) performing calibration faster than real-time data generation. Generalized least squares (GLS) methods and Kalman filter-based methods have proved useful in online calibration.

However, in the literature, the road networks selected to test online calibration algorithms are usually simple and have a small number of parameters. Thus their effectiveness when applied to high dimensions and large networks is not well established. In this thesis, we implemented the extended Kalman filter (EKF) and tested it on the Singapore expressway network with synthetic data that replicate real-world demand levels. The EKF diverged, and the DTA system performed even worse than when no calibration was applied. The problem lies in the truncation process in DTA systems. When estimated demand values are negative, they are truncated to 0 and the overall demand is overestimated. To overcome this problem, this thesis presents a modified EKF method, called constrained EKF. Constrained EKF solves the problem of overestimating the overall demand by imposing constraints on the posterior distribution of the state estimators and obtaining the maximum a posteriori (MAP) estimates within the feasible region. An algorithm of iteratively adding equality constraints followed by the coordinate descent method is applied to obtain the MAP estimates. In our case study, this constrained EKF implementation added less than 10 seconds of computation time and improved EKF significantly. Results show that it also outperforms GLS, probably because its inherent covariance update procedure has an advantage in adapting to changes compared to the fixed covariance matrix setting in GLS.

The contributions of this thesis include: 1) conducting online calibration algorithms on a large network with relatively high dimensional parameters, 2) identifying drawbacks of a widely applied solution for online DTA calibration in a large network, 3) improving an existing algorithm from non-convergence to great performance, 4) proposing an efficient and simple method for the improved algorithm, 5) attaining better performance than an existing benchmark algorithm.

Thesis Supervisor: Moshe E. Ben-Akiva
Title: Edmund K. Turner Professor of Civil and Environmental Engineering

Thesis Supervisor: Francisco C. Pereira
Title: Professor, Technical University of Denmark

Thesis Reader: Jacob K. White
Title: Cecil H. Green Professor of Electrical Engineering and Computer Science


Acknowledgments

First and foremost, I would like to express my sincere gratitude to my advisor, Professor Moshe Ben-Akiva. Your invaluable guidance, immense expertise and continuous

support made my MIT graduate study colorful and memorable. I learned a lot from

you. It is my honor to know you.

I express my deep gratitude to Professor Francisco Pereira, for your insightful

advice and extraordinary support through this tough but worthy journey. Thanks

to Professor Constantinos Antoniou for being a great source of knowledge and help.

Thanks also go to Dr. Ravi Seshadri, for your inspiration and guidance through my

masters research. All of you taught me how to be a better researcher. Many thanks

to Professor Jacob White, for your generous help with my dual masters degree in

EECS and invaluable suggestions about this thesis.

I would like to thank Katherine Rosa, the research administrator, and Eunice

Kim, the lab manager of our ITS Lab. Thanks for your help and patience in every

detail. I would like to thank Kiley Clapper and Janet Fischer, the administrators of

department of CEE and EECS for helping me all along.

Thanks to my roommates (and ex-roommate) Tianli Zhou, Hongyi Zhang, and

Chao Zhang. It has been enjoyable to share the journey with you. Special thanks

to Hongyi for being a good source of machine learning knowledge; it is always helpful to

discuss research with you. Special thanks to Yuelong Su, Lu Lu and Runmin Xu,

whom I knew in my undergrad at Tsinghua University. All of you are great examples

that lead me to who I am now. My graduate life would not be so colorful without you.

Thanks also go to my friends I met and knew at MIT, for being friendly, considerate

and helpful, which made my life at MIT much easier. Thank you, Yin Wang, Yan

Zhao, Linsen Chong, Nathaniel Bailey, Xiao Fang, Weikun Hu, Chiwei Yan, Xiang

Song, Shi Wang, Yundi Zhang, Yinjin Lee, Monique Stinson, Mazen Danaf, Bilge

Atasoy, Nathanael Cox, Yuhan Jia, Na Wu, Manxi Wu, Li Jin, Jeffrey Liu, Rui Sun,

He Sun, Lijun Sun, Yan Leng, Zhan Zhao and Yiwen Zhu.

I am grateful for the funding from the Singapore-MIT Alliance for Research and Technology (SMART) to support my study, research and trips to Singapore. Thanks also

go to the people I knew at SMART, including Kakali Basak, Yan Xu, Dong Wang,

Stephen Robinson, Yang Lu, Vinh-An Vu (Jenny). Every trip to Singapore is a unique

experience because of you.

I would also like to thank Zhiwei Lin, for all your love and support. Last but not

least, my deepest gratitude goes to my parents for your endless love, encouragement

and constant support. This work is dedicated to you.


Contents

1 Introduction and Background
1.1 Motivation
1.2 Introduction to Dynamic Traffic Assignment
1.3 Thesis Motivation and Outline

2 Literature Review on Calibration for DTA
2.1 Offline Calibration: Generalized Least Squares Model
2.2 Online Calibration
2.2.1 State-Space Formulation
2.2.2 Optimization Formulation
2.3 Summary

3 Methodology: Constrained Extended Kalman Filter in DTA Calibration
3.1 General Problem Formulation
3.1.1 State-Space Formulation in Details
3.1.2 The Idea of Deviations
3.2 Extended Kalman Filter and Variants in DTA Calibration
3.2.1 Extended Kalman Filter
3.2.2 Finite Difference and FD-EKF
3.2.3 Simultaneous Perturbation and SP-EKF
3.2.4 Characteristics of EKF in Online Calibration for DTA
3.3 Constrained EKF
3.3.1 Motivation
3.3.2 Optimization Formulation for Constrained Kalman Filter Estimates
3.3.3 An Efficient Near-Optimal Algorithm for EKF with Bound Constraints in DTA Calibration
3.3.4 Coordinate Descent Algorithm with Near-Optimal Initialization
3.4 Summary

4 Case Study: Constrained EKF on Singapore Expressway Network
4.1 Model Overview
4.1.1 DynaMIT Overview
4.1.2 MITSIMLab Overview
4.1.3 DynaMIT and MITSIMLab Integration
4.2 Calibration Framework for Singapore Expressway Network
4.2.1 Simulation Scenario Overview
4.2.2 MITSIMLab and DynaMIT Integration: Data Flow
4.2.3 Calibration Settings for EKF
4.2.4 Calibration Evaluation Criteria
4.3 Results
4.3.1 EKF Performance
4.3.2 Constrained EKF Performance
4.3.3 Comparison with GLS
4.4 Summary

5 Conclusions
5.1 Summary
5.2 Future Research Directions

List of Figures

1-1 General DTA framework (source: Ben-Akiva et al., 2010a)
3-1 Directed Graph Model in Kalman Filtering Scheme
3-2 2-D Posterior PDF Contour and Different Estimators
4-1 DynaMIT Real-Time Framework (source: Ben-Akiva et al., 2010a)
4-2 Singapore Road Network (source: Google Maps, 2016)
4-3 Singapore Road Network in DynaMIT and MITSIMLab
4-4 DynaMIT and MITSIMLab Integration Workflow
4-5 Sample W-SPSA Calibrated Demand
4-6 Sample Gaussian Kernel Smoothed Demand
4-7 Estimated versus Observed Flow Counts: EKF
4-8 Flow Count RMSN versus Time: EKF
4-9 Estimated versus Observed Flow Counts: Constrained EKF
4-10 Flow Count RMSN versus Time: Constrained EKF
4-11 Sample Calibrated Demand by EKF (before truncation)
4-12 Sample Calibrated Demand by Constrained EKF
4-13 Flow Count RMSN versus Time: GLS versus Constrained EKF
4-14 Flow Counts Comparison: Constrained EKF (left) vs GLS (right)

List of Tables

4.1 Estimated OD flows (veh/5min) for OD Pairs Related to Sensor 528

Chapter 1

Introduction and Background

1.1 Motivation

Traffic congestion in urban areas has been a hot topic due to its negative temporal,

economic and external impacts such as delays, wasted fuel, frustrated motorists and

air pollution. It is an old and well-known problem caused by soaring travel demand and insufficient growth of transportation infrastructure. From 1982 to 2014, the total length of public roads in the United States increased from 3,865,894 miles to 4,194,257 miles, which is an 8.49% increase. During the same timeframe, the vehicle miles traveled (VMT) increased 90.61%, from 1,595,010 million miles to 3,040,220 million miles (FHWA, 2014). Meanwhile, the congestion cost per auto commuter has increased sharply from $400 (in 2014 dollars) in 1982 to $960 in 2014, according to the 2015 Urban Mobility Scorecard (Schrank et al., 2015). To make matters worse, congestion is

expected to continue increasing, according to the same source. Although the annual

congestion cost from 2008 to 2011 decreased due to the recession, recently urban areas

have generally experienced the same challenges as in the early 2000s, for instance, the

increasing population and job market that contributes to congestion (Schrank et al.,

2015).

The 2015 Urban Mobility Scorecard also recommends a balanced and diversified

approach to reduce congestion, comprising more policies, programs, projects, flexibility, options and understanding. Traffic management plays an important role among

the mixed solutions guided by this approach. One important application is the route

guidance for drivers in Traffic Management Centers (TMCs), for instance, Advanced

Traveler Information Systems (ATIS) in Federal Highway Administration (FHWA)

of the US Department of Transportation. The route guidance aims to achieve global

traffic control to reduce congestion and its impact on energy consumption, gas emissions, delays and frustration. In order to provide correct and reliable guidance, TMCs should have global information, insights and preferably prediction abilities. Dynamic traffic assignment (DTA) systems are considered one of the most promising categories of tools to estimate traffic states and make state predictions. Built upon accurate predictions,

where drivers’ reactions to possible route guidance strategies are also considered, the

best strategy could be selected to reduce congestion to the lowest possible level. In

this thesis, we examine the basis of route guidance operations, i.e. the DTA systems.

1.2 Introduction to Dynamic Traffic Assignment

Traditionally, traffic assignment has been derived from transportation demand forecasting (typically the four-step model), which comprises traffic generation, traffic

distribution, mode choice and route assignment. It is a process where traffic demand

(usually represented by a static Origin-Destination matrix) is loaded onto the road

network (Barcelo, 2010). As a result, the traffic flows are computed for the links in

the road network. In contrast, Dynamic Traffic Assignment (DTA) emphasizes time-

varying properties, meaning the traffic demand and flows are time-dependent. This

allows the flexibility to accommodate varying traffic scenarios, where underlying patterns of time and space are evolving (Mahmassani, 2001). DTA has evolved substantially since the late 1970s. It is an essential tool for estimating and predicting

dynamic traffic flows on road networks.

Various formulation and solution approaches to DTA have been introduced, both

analytical and simulation-based. Analytical models express the DTA problem in

mathematical formulations for a specific objective (e.g. user equilibrium (UE) or

system optimal (SO)). Optimization algorithms are usually applied to solve the analytical DTA problem and obtain its inputs and parameters. However, the conciseness and accuracy with which analytical models replicate traffic flow dynamics hold only for small networks. The analytical formulation has to be simplified for the optimization problem to be solvable, so some traffic relationships (e.g. driver behavior, congestion) cannot be fully captured (Peeta and Ziliaskopoulos, 2001; Balakrishna, 2006). Simulation is therefore often treated as the preferred way to model traffic, owing to its efficiency and accuracy. Interest in simulation-based DTA methods has also grown recently because they can model driving behavior and response to guidance in detail. Simulation-based DTA is thus important for traffic estimation. In this thesis, our focus is on simulation-based DTA, and DTA in the following chapters refers to simulation-based DTA.

Figure 1-1: General DTA framework (source: Ben-Akiva et al., 2010a)

The general work flow of a simulation-based DTA system comprises the dynamic

traffic management system, demand and supply modules, as shown in Figure 1-1. The

management system dictates and provides inputs for the demand and supply modules.

Input parameters include Origin-Destination matrices, incident/event information,

weather conditions, traffic control strategies, etc. The interaction between demand and supply is simulated by the DTA system. While the simulation is running, the traffic conditions (e.g. flows, travel times, route choice fractions, queues, vehicle trajectories) are measured and output in a timely manner.

Existing simulation-based DTA models fall into three categories according to the level of detail with which they represent traffic dynamics: microscopic, mesoscopic and macroscopic (Lu, 2013).

Macroscopic DTA models represent the least detailed traffic dynamics, in which

the traffic is modeled as fluid flows and individual vehicles are not quantified. Thus

they are the most computationally efficient models, particularly on large networks.

Example simulation-based DTA systems are METANET (Wang et al., 2001), EMME

(INRO, 2015), and Visum (PTV, 2015b).

Microscopic DTA models have the most detailed simulation granularity, where

driving behaviors and car interactions are modeled. Examples of existing systems

include PARAMICS (Smith et al., 1995), MITSIMLab (Ben-Akiva et al., 2010b),

AIMSUN2 (Barcelo and Casas, 2005), CORSIM/TSIS (FHWA, 2015), VISSIM (PTV,

2015a), and TransModeler (FHWA, 2016).

Mesoscopic models are a combination of macroscopic and microscopic models, with

the aim of balancing between efficiency and accuracy. Mesoscopic models usually have

detailed representation for individual vehicles on the demand side, but car interactions

are not modeled on the supply side. DTA system examples include DynaMIT (Ben-

Akiva et al., 2010a), DYNASMART (Mahmassani et al., 1998) and Dynameq (Mahut

and Florian, 2010).

In this thesis, we use both microscopic and mesoscopic models to exploit their respective advantages. A microscopic model is selected to represent the real world, because of its accuracy in replicating real scenarios. A mesoscopic model is selected to be our DTA system, due to its balance between efficiency and accuracy.

1.3 Thesis Motivation and Outline

In Section 1.1, the motivation for DTA systems was discussed. We stated that the aim

of a DTA system is to provide traffic state estimation and prediction, which can be

used for global route guidance. For that purpose, the DTA system is usually followed

by route strategy generation. Then TMC will guide the drivers with those generated

strategies. Drivers obtain the guidance information, make decisions and change the

existing traffic flow patterns, which will be captured in the surveillance data. Those

data will again be fed into the DTA system for the latest traffic state estimation.

Among all the components in the strategy generation and evaluation loop, traffic

state estimation is crucial for the DTA system, since it is the basis of other steps.

Since traffic state estimates depend on the inputs and parameters of the DTA system,

input and parameter estimation is also a crucial step. This step is called calibration.

The calibration procedure based on the surveillance data is the focus of this thesis.

The strategy generation and modeling of drivers’ reactions are not in the scope of

this thesis.

The remainder of this thesis is structured as follows. Chapter 2 summarizes and comments on the existing algorithms for offline and online calibration, and the scope of this research is narrowed down to extended Kalman filter (EKF) based algorithms in the online category. Chapter 3 gives a more detailed description of the EKF formulations and algorithms. In addition, the drawbacks of applying EKF to DTA parameter estimation are demonstrated, particularly for parameters with natural bound constraints. Following that observation, constrained EKF, a modification of EKF, is proposed to overcome the drawbacks discovered. Then a specific algorithm for implementing the constrained EKF is presented. In Chapter 4, a case study of the Singapore expressway network is conducted and demonstrated. This experiment is performed with synthetic data generated using an existing microscopic simulator (MITSIMLab), which was already calibrated offline with existing real sensor data. Then we compare the estimated traffic flow counts with their counterparts generated by MITSIMLab, followed by performance comparisons of EKF, the Generalized Least Squares method and constrained EKF. In Chapter 5, conclusions are drawn.

Chapter 2

Literature Review on Calibration for DTA

The DTA systems were introduced in Chapter 1. However, a prerequisite of using

DTA is its capability of producing reliable estimations and predictions of traffic states

when compared to the real world. Furthermore, the prediction ability of a DTA model is

usually dependent on its estimation accuracy. Thus the estimation accuracy is of much

importance. No matter what models are applied in different DTA systems, parameter

estimation, which is also called model calibration, is a crucial step. Usually we have

the real world traffic scenario, and we measure/observe some information (these data

are called measurements/observations) about the traffic states, for instance, traffic

flow counts, average travel speeds and link travel times. Then based on these data,

we want to estimate the parameters such that the simulated measurements from DTA

models fit the data. This is the general calibration problem for DTA models. In a

more concise way, the calibration problem in DTA context is defined as follows. Given

a set of initial values for various parameters (OD flows, route choice parameters,

road capacities and speed-density relationships) and measurements (e.g. aggregate

flows, speeds and densities), estimate those parameters such that the error between

the simulated outputs and observed values is minimized (Antoniou, 2004).

The calibration problem can be classified in different ways since there are at least

two dimensions to view this problem. For instance, categorized by the input or


parameter type, there are supply and demand calibration, meaning the calibration of

inputs and parameters on the supply side and the demand side, respectively. There

are also offline calibration and online calibration, where data availability at each

time step and computation time are important considerations in algorithms for the

latter. In this thesis, we classify the calibration algorithms into offline and online,

and summarize them in each category.

2.1 Offline Calibration: Generalized Least Squares Model

The objective of the offline calibration, in general, is to estimate the parameters

such that the simulated outputs fit the observed measurements for an average traffic

scenario. Here average means the measurements are expected values over a long

period of time, for instance, over a month. Note that this is distinct from

estimating static traffic assignment (STA) over a month, because this average is over

the same time interval for all days in a month. By averaging, we hope to include

day to day demand fluctuations, weather, etc. into the measurements, thus offline

calibration on these measurements will yield the parameters for an average day in

that month.

In terms of mathematical formulation, Lu formulated the offline calibration frame-

work as an optimization problem, with the following notation (Lu, 2013):

• $h$: interval index, $h \in \mathcal{H} = \{1, 2, \ldots, H\}$, where $\mathcal{H}$ is the set of simulation intervals

• $G$: road network, $G = \{G_h \mid h \in \mathcal{H}\}$

• $x$: time-dependent DTA parameters, e.g., OD flows, $x = \{x_h \mid h \in \mathcal{H}\}$

• $x^a$: a priori time-dependent parameter values, $x^a = \{x_h^a \mid h \in \mathcal{H}\}$

• $\beta$: other time-invariant DTA parameters, e.g., supply model parameters

• $\beta^a$: a priori values of other DTA parameters

• $M^o$: observed sensor measurements, $M^o = \{M_h^o \mid h \in \mathcal{H}\}$

• $M^s$: simulated sensor outputs, $M^s = \{M_h^s \mid h \in \mathcal{H}\}$

Then the offline calibration problem can be formulated as:

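$$\min_{x,\beta} \; z(M^o, M^s, x, \beta, x^a, \beta^a) = \sum_{h=1}^{H} \left[ z_1(M_h^o, M_h^s) + z_2(x_h, x_h^a) \right] + z_3(\beta, \beta^a) \tag{2.1}$$

subject to:

$$M_h^s = f(x_1, \ldots, x_h; \beta; G_1, \ldots, G_h) \tag{2.2}$$

$$l_{x_h} \le x_h \le u_{x_h} \tag{2.3}$$

$$l_\beta \le \beta \le u_\beta \tag{2.4}$$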

where $f$ is an abstraction of the DTA model, which takes $x$, $\beta$ and $G$ as inputs and generates the simulated sensor outputs $M_h^s$; $z$ is a loss function that measures the goodness-of-fit between simulated sensor outputs and observed measurements; $z_1$ is a specific loss function that quantifies the discrepancy between simulated and observed measurements; $z_2$ and $z_3$ are loss functions that penalize the parameters for moving away from their a priori values. Parameters $x_h$ and $\beta$ have lower bounds $l_{x_h}$, $l_\beta$ and upper bounds $u_{x_h}$, $u_\beta$, respectively.

In general, the relative magnitudes of $z_1$, $z_2$ and $z_3$ act as weights that balance the objective between minimizing the measurement discrepancy and keeping the parameters close to their a priori values.
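The structure of the objective in (2.1) can be illustrated with a minimal Python sketch. It assumes plain squared-error losses for $z_1$, $z_2$ and $z_3$ with scalar weights `w1`, `w2`, `w3`, and a toy linear function `toy_simulate` standing in for the DTA model $f$; all of these names and choices are hypothetical and are not taken from the thesis. Intervals are indexed from 0 in the code, and the bound constraints (2.3) and (2.4) are not enforced here.

```python
import numpy as np

def offline_objective(x, beta, x_prior, beta_prior, m_obs, simulate,
                      w1=1.0, w2=1.0, w3=1.0):
    """Evaluate z in (2.1), assuming squared-error losses.

    x, x_prior: arrays of shape (H, n_x), one row per interval.
    beta, beta_prior: arrays of shape (n_beta,).
    m_obs: array of shape (H, n_m) with observed measurements.
    simulate: stand-in for f; simulate(x[:h + 1], beta) returns the
        simulated measurements of interval h, as in (2.2).
    """
    total = 0.0
    for h in range(x.shape[0]):
        m_sim = simulate(x[: h + 1], beta)               # depends on x_1..x_h
        total += w1 * np.sum((m_obs[h] - m_sim) ** 2)    # z1 term
        total += w2 * np.sum((x[h] - x_prior[h]) ** 2)   # z2 term
    total += w3 * np.sum((beta - beta_prior) ** 2)       # z3 term, outside the sum
    return total

def toy_simulate(x_hist, beta):
    # Toy "DTA model": one sensor counting a scaled sum of current OD flows.
    return np.array([beta[0] * x_hist[-1].sum()])

x = np.array([[10.0, 20.0], [15.0, 25.0]])
beta = np.array([1.0])
m_obs = np.array([[30.0], [40.0]])
print(offline_objective(x, beta, x, beta, m_obs, toy_simulate))
```

With the prior equal to the parameters and observations that the toy model reproduces exactly, every term vanishes. Moving `x` away from the prior increases both the measurement term and the prior penalty.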

Generalized Least Squares (GLS) is a linear regression model. It is widely known

for OD estimation in the field of DTA calibration. The GLS model is one configuration of the optimization framework stated above. In this model, the function $f$ is a multiplication of the assignment matrix (Ashok, 1996) with the OD parameters, which models the traffic assignment procedure. The model becomes more complicated when we consider the impact of OD parameters in previous intervals, and is given by (Balakrishna, 2002):

$$f(x_1, \ldots, x_h) = \sum_{p=h-p'}^{h} A_h^p x_p \tag{2.5}$$
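As an illustration of (2.5), the following Python sketch sums the contributions of the current and previous intervals to the simulated counts. It assumes that the assignment matrices are stored in a dictionary keyed by `(h, p)` and that intervals are indexed from 0; the container layout, the function name and the numbers are hypothetical.

```python
import numpy as np

def simulated_counts(assign, x_hist, h, max_lag):
    """Sum the lagged assignment-matrix products for interval h.

    assign[(h, p)]: (n_sensors, n_od) matrix mapping OD flows of
        interval p to counts measured in interval h.
    x_hist[p]: OD flow vector of interval p.
    max_lag: number of previous intervals considered (p' in the text).
    """
    n_sensors = assign[(h, h)].shape[0]
    counts = np.zeros(n_sensors)
    for p in range(max(0, h - max_lag), h + 1):
        counts += assign[(h, p)] @ x_hist[p]
    return counts

# One sensor, two OD pairs, one previous interval.
assign = {
    (1, 1): np.array([[0.5, 0.0]]),   # share of interval-1 flows seen in interval 1
    (1, 0): np.array([[0.25, 1.0]]),  # share of interval-0 flows seen in interval 1
}
x_hist = [np.array([100.0, 40.0]), np.array([80.0, 60.0])]
counts = simulated_counts(assign, x_hist, h=1, max_lag=1)
print(counts)  # 0.5*80 + 0.25*100 + 1.0*40 = 105
```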

As for the $z_1$ function, it is defined as:

$$z_1(M_h^o, M_h^s) = (M_h^o - M_h^s)' \, \Omega_h^{-1} \, (M_h^o - M_h^s) \tag{2.6}$$

Covariance matrix $\Omega_h$ is defined for the error between simulated and observed measurements for interval $h$. The dimension of $\Omega_h$ is $n$-by-$n$, where $n$ is the number of measurements. Similarly, covariance matrices for the parameter errors $x_h - x_h^a$ and $\beta - \beta^a$ can also be estimated and used to construct $z_2$ and $z_3$. The covariance matrices are usually estimated from the residuals of an Ordinary Least Squares (OLS) model on the same OD estimation problem. Specifically, OLS is a special case of GLS, in which the covariance of errors is assumed to be an identity matrix.

The GLS formulation is advantageous for OD estimation, since the assignment

model has an analytical form and exact solutions are available (Balakrishna, 2002). However, when we include other supply parameters, we lose the closed-form advantages. Thus the calibration of those parameters is harder for GLS.
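To make the closed-form property concrete, the sketch below solves a single-interval GLS problem in Python. It assumes a quadratic penalty on the deviation from the a priori OD flows with covariance `w`, ignores lagged intervals and bound constraints, and uses hypothetical names and numbers throughout. Under these assumptions the minimizer satisfies the normal equations $(A'\Omega^{-1}A + W^{-1})\,x = A'\Omega^{-1}M^o + W^{-1}x^a$.

```python
import numpy as np

def gls_od_estimate(A, m_obs, x_prior, omega, w):
    """Closed-form GLS estimate of OD flows for one interval.

    Minimizes (m_obs - A x)' omega^{-1} (m_obs - A x)
            + (x - x_prior)' w^{-1} (x - x_prior).
    """
    omega_inv = np.linalg.inv(omega)
    w_inv = np.linalg.inv(w)
    lhs = A.T @ omega_inv @ A + w_inv
    rhs = A.T @ omega_inv @ m_obs + w_inv @ x_prior
    return np.linalg.solve(lhs, rhs)

A = np.array([[1.0, 1.0],
              [1.0, 0.0]])            # assignment matrix (two sensors, two OD pairs)
x_true = np.array([30.0, 20.0])
m_obs = A @ x_true                     # noise-free counts, for illustration only
x_prior = np.array([25.0, 25.0])

# A very diffuse prior lets the measurements dominate the estimate.
x_hat = gls_od_estimate(A, m_obs, x_prior, np.eye(2), 1e8 * np.eye(2))
print(x_hat)
```

Nothing in this closed form keeps the estimate non-negative, so a separate step is needed whenever the parameters have natural bounds.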

Notice that the GLS formulation can handle both demand and supply parameters.

The drawback is that when we do not have an analytical form, we have to rely on

the simulation (DTA model) to reveal the relationship to us. In order to obtain

good estimation results, we need efficient algorithms that take simulation outputs

and adjust the parameters. Several optimization methods are applied to solve for the

supply parameters in this GLS framework (Balakrishna, 2006). Balakrishna applied

the Box-SNOBFIT and simultaneous perturbation stochastic approximation (SPSA)

algorithms (Spall, 1992) to solve for the supply parameters, while the demand side is

still handled by GLS. This process is called joint calibration of demand and supply

parameters, and Balakrishna performed supply and demand calibration in a sequential

manner.


Recently, Lu proposed an enhanced SPSA method: W-SPSA (Lu, 2013). W-SPSA estimates the gradient for a specific parameter using only the sensor counts related to that parameter, and those sensor counts are weighted according to relevance. It uses a weight matrix that stores the weights to estimate the enhanced gradient. Lu then conducted case studies on synthetic data and real data. Results showed that W-SPSA significantly outperforms SPSA in fitting traffic flow counts and speed data.
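The difference between the two gradient estimates can be sketched in Python as follows. The first function is the standard SPSA two-sided estimate (Spall, 1992). The second is only one plausible reading of the weighting idea described above, in which `weights[j, i]` is an assumed relevance of measurement `j` to parameter `i`; it should not be taken as the exact W-SPSA formula. The toy error function and all names are hypothetical.

```python
import numpy as np

def spsa_gradient(errors, theta, c, delta):
    """Standard SPSA estimate: every parameter sees the total error change."""
    diff = errors(theta + c * delta) - errors(theta - c * delta)
    return diff.sum() / (2.0 * c * delta)

def weighted_spsa_gradient(errors, theta, c, delta, weights):
    """Weighted variant: parameter i only sees errors weighted by weights[:, i]."""
    diff = errors(theta + c * delta) - errors(theta - c * delta)
    return (weights.T @ diff) / (2.0 * c * delta)

target = np.array([1.0, 1.0, 1.0])

def errors(theta):
    # Toy per-measurement squared errors; measurement j depends only on theta[j].
    return (theta - target) ** 2

rng = np.random.default_rng(0)
theta = np.array([3.0, -1.0, 2.0])
delta = rng.choice([-1.0, 1.0], size=theta.size)   # simultaneous +/-1 perturbation

g_plain = spsa_gradient(errors, theta, 0.1, delta)
g_weighted = weighted_spsa_gradient(errors, theta, 0.1, delta, np.eye(3))
print(g_plain, g_weighted)
```

In this toy setting each measurement depends on exactly one parameter, so an identity weight matrix removes the noise contributed by unrelated measurements and the weighted estimate matches the true gradient `2 * (theta - target)`. A weight matrix of all ones reduces the weighted estimate to plain SPSA.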

2.2 Online Calibration

The objective of online calibration is to estimate the correct parameters such that

the DTA model can represent the real-time traffic scenario, not an average scenario.

It tries to solve the calibration problem in a timely manner. There are two aspects

of online calibration, in terms of: 1) estimating the parameters in a time interval

only based on data up to that time; 2) performing calibration faster than real-time

data generation. The general optimization formulation from offline calibration still

applies, but online calibration minimizes only one part of the objective function that

relates to the current simulation interval h. Despite different model formulations, one

commonality is that traffic states are estimated with current and historical data, not

future data. Furthermore, in order to deploy to an online traffic management system,

the online calibration algorithm has to be finished within each time interval, before

measurements in the next interval are available.
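The two requirements can be summarized in a small Python sketch of an online loop. The `calibrate` callable and the 300-second interval length are assumptions for illustration; the loop only shows that each estimate uses data up to the current interval and that the computation must fit within one interval.

```python
import time

def run_online(measurements, calibrate, interval_seconds):
    """Run a calibration routine interval by interval.

    measurements: sequence with one entry per time interval.
    calibrate: stand-in for the calibration algorithm; it receives only
        the measurements available so far.
    interval_seconds: length of one interval; calibration must finish sooner.
    """
    estimates = []
    for h in range(len(measurements)):
        start = time.perf_counter()
        estimate = calibrate(measurements[: h + 1])   # no future data is passed
        elapsed = time.perf_counter() - start
        if elapsed > interval_seconds:
            raise RuntimeError(f"interval {h}: calibration slower than real time")
        estimates.append(estimate)
    return estimates

def running_mean(data):
    # Trivial placeholder for a real calibration algorithm.
    return sum(data) / len(data)

online_estimates = run_online([10.0, 20.0, 30.0], running_mean, interval_seconds=300.0)
print(online_estimates)
```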

The online calibration problem has been addressed by some studies over the past decade, but few algorithms have proved efficient and scalable. Similar to offline calibration, inputs and parameters calibrated are generally two-fold: demand

and supply. In the following subsections, algorithms of online calibration for DTA

are reviewed and commented on.

Existing algorithms to solve such problems can be categorized according to different formulations. According to Omrani and Kattan (2012), there are two categories of formulations applied to online calibration for DTA systems, namely state-space formulations and optimization formulations. They apply to both demand and supply calibration.

2.2.1 State-Space Formulation

The first category is the state-space formulation. It is based on the idea of system

control; demand and supply parameters are treated as state vectors, which evolve over

time. The target is to estimate and predict the true state vector. At the same time,

we are given measurements/observations from the real world, which carry information about the true state vector. The most widely applied approach to solving the state-space formulation is the Kalman filter-based method. In the Kalman filtering framework, state evolution is modeled by a transition equation between adjacent time steps, while the connection between states and measurements is described by a measurement equation. Depending on the assumptions made in the transition and measurement equations, the effects and challenges of Kalman filter-based methods such as the extended Kalman filter (EKF) can be very different.

On the transition equation side, one assumption is that the deviations of the de-

mand and supply parameters from historical averages define a stationary time series

(Ashok and Ben-Akiva, 1993). This assumption applies to the scenarios where param-

eters follow similar patterns from history. It is an elegant framework that captures the

structural information in demand, without explicitly knowing the pattern. It requires

an offline calibrated demand to serve as the basis for constructing a starting point. Ashok
and Ben-Akiva (2000) formulated a 4th-order autoregressive
(AR) process on the deviations and developed a Kalman filter to estimate and predict
OD demand in real time. Wang and Papageorgiou (2005)
formulated demand and supply parameters in a stochastic macroscopic model,
in which a random walk model is used as the transition equation to estimate traffic conditions

in freeway stretches. In general, this assumption works well with a stationary random

process with constant mean and variance (Zhou and Mahmassani, 2007). In the same

article, Zhou and Mahmassani applied a polynomial trend filter on the deviations to

account for more flexibility than AR model. The authors also proposed a procedure

to update the historical demand pattern online and applied the whole process in OD

estimation to a network in Irvine.


When the evolution pattern of the parameters with time is different from the

pattern implied by historical parameters, the first assumption of stationary time

series fails. In this case, a simple random walk model can be built on the absolute

values of demand estimators with no prior information. The simple random walk

model is an AR model with an autocorrelation coefficient of 1 (i.e. $X_{k+1} = 1 \times X_k$
plus a noise term). Cremer and Keller (1987) and Chang and Wu (1994)
used a random walk model to make predictions about dynamic route choice

split proportions. The authors concluded that this algorithm performed well in both

accuracy and stability. However, this approach is limited to scenarios in which demand
changes slowly, and may not reflect non-linear trends in time-dependent OD flows. Thus

when historical parameters are available, the formulation under this assumption has

inferior performance to the one discussed above.

In the works mentioned above, the measurement equations are all based on analytical
formulations, since the relationship between OD flows and the measurements is largely
captured by the assignment matrix. For supply and route choice parameters, an analytical
formulation is difficult to obtain because of their indirect relationship with traffic flows.
To solve this problem, numerical methods are applied to obtain the linearization of the
relationship, i.e. the gradient. In other words, the DTA system is now treated as a
“black box” and the relationship

between input parameters and simulated outputs needs to be calculated (Antoniou

et al., 2004, 2006). Notice the method is general, since no prior analytical form is

assumed. Based on this idea, Antoniou applied extended Kalman filter (EKF), un-

scented Kalman filter (UKF), limiting EKF (LimEKF) and Iterated EKF (I-EKF)

in two cases in the UK and California (Antoniou, 2004). The author calibrated the

demand and supply parameters with flow count and speed sensor data. The LimEKF

has the most computational advantages with complexity of O(1), which is practical

for online applications. EKF and UKF have similar computation complexity of O(n),

where n is the dimension of the parameter vector. In contrast, I-EKF performs multiple
iterations of EKF and is thus more costly. Although obtaining the linearization of the
system for EKF is time-consuming compared with LimEKF, it is concluded that the EKF
is still the most straightforward approach to estimate OD demand. In Antoniou’s


calibration results, the EKF outperforms UKF and LimEKF in terms of estimation

and prediction accuracy. The author also concluded that additional runs of I-EKF

could further reduce the estimation error.

As discussed above, one last remark about the state space model is its generality,

since it can be used to calibrate all kinds of inputs and parameters for demand, supply

and other types in the future. It can also handle all types of data, as long as they

can capture effects of the inputs and parameters.

2.2.2 Optimization Formulation

The optimization formulation follows the idea of GLS. It is performed by viewing

the DTA online calibration problem as a stochastic minimization problem. As in
GLS, we want to minimize a loss function that combines the error between the
parameters and their a priori values with the error between estimated and real-time
measurements, together with the error between the model and reality. Numerical methods to estimate

gradient as used in state-space formulation are also useful in this framework. Huang,

E.(Huang, 2010) applied a Gradient Descent Algorithm (GD) and a Conjugate Gra-

dient Algorithm (CG) as direct optimization formulation for the Brisa A5 motorway

between Lisbon and Cascais, Portugal. A heuristic method – Hooke-Jeeves Pattern

Search Algorithm (PS) – was also applied in the same scenario. The author com-

pared those three algorithms and Extended Kalman Filter (EKF) with DynaMIT-R

OD estimation, in terms of estimation accuracy and computational performance. The

author concluded that EKF outperformed PS, CG and GD in decreasing errors in

fitting both flow counts and speeds. GD and EKF have computational advantage

over PS and CG, in the sense that less runtime is needed. Partitioned Simultaneous

Perturbation EKF (PSP-EKF) and PSP-GD were also implemented. It is concluded

that PSP-based methods are more efficient than their original counterparts. However,

the estimation accuracy of EKF still outperforms all other methods including PSP-

EKF. Other stochastic optimization algorithms such as the Box complex, SNOBFIT

were also applied(Balakrishna, 2006). The algorithm is validated on a small synthetic

network and the error rate is less than 10%. However, the algorithm was not tested


on large road networks; thus, these algorithms may have scalability problems when
deployed to complicated road networks, where traffic flows interact.

In general, the optimization formulation is a work-around for GLS in online con-

text, where variables in one interval are optimized at a time. We make two observa-

tions here. First, based on different loss functions, the state estimates can be very

different, which essentially correspond to the covariance matrix selection in GLS. Sec-

ond, non-exact optimization methods would also need multiple iterations to converge,

which may conflict with the online requirement. Thus the optimization formulation

may have more parameters to tune than the state-space formulation.

2.3 Summary

We close with two important comments. First, there were multiple algorithms applied

to solve the online calibration problem. However, all the works mentioned above

were applied to highway corridors or freeway stretches, thus the scalability of those

algorithms has not been tested. In other words, when the road network is large and

there are many parameters, the online calibration algorithms may not work properly.
Second, the extended Kalman filter algorithm has superior accuracy performance over
other algorithms. For these two reasons, we conducted the following research to

examine the accuracy performance of extended Kalman filter on a large road network.

In addition, although we consider the computation performance, the main focus of

the proposed research is the calibration accuracy, because the numerical method used

in EKF shown in Section 3.2.2 is fully parallelizable.


Chapter 3

Methodology: Constrained

Extended Kalman Filter in DTA

Calibration

In this section, the detailed state-space formulation is reviewed first. Then the ex-

tended Kalman filter (EKF) algorithm and some variants applied in the field of DTA

calibration are summarized together with some comments. In addition, the drawbacks

of EKF are discussed from the point of view of maximizing the posterior probability
density. Then the optimization formulation for constrained EKF is discussed. In order
to solve it, an algorithm that gives a near-optimal solution is proposed, followed

by the coordinate descent algorithm that obtains the true optimum within a given

precision. Finally, a summary is made.

3.1 General Problem Formulation

3.1.1 State-Space Formulation in Detail

As in Chapter 2, the state-space formulation is based on the view of DTA calibration

in control engineering. The inputs and parameters of DTA form the state vector,

and it is assumed to evolve over time. The state-space formulation has been studied


comprehensively in control theory, and there are algorithms that estimate state vector

in real-time efficiently. Thus, the introduction of this formulation to DTA calibration

is beneficial. The state-space formulation is explained and discussed in Estimation and

Prediction of Time-Dependent Origin-Destination Flows (Ashok, 1996). While the

original formulation is only for origin-destination flow estimation and prediction, the

state vector can contain all kinds of parameters (e.g. demand and supply parameters).

In order to express the formulation, the following notation is defined:

• h: interval index, $h \in \mathcal{H} = \{1, 2, \ldots, H\}$, where $\mathcal{H}$ is the set of simulation
intervals into which time is discretized

• xh: state vector of time interval h

• Mh: measurements in time interval h

Then the state space model comprises the following equations:

• Transition equation

\[ x_h = f_{h-1}(x_{h-1}, \ldots, x_{h-p}) + w_h \tag{3.1} \]

• Measurement equation

\[ M_h = g_h(x_h, \ldots, x_{h-q+1}) + v_h \tag{3.2} \]

where, p is the number of previous states that are believed to have relations with

xh; q is the number of states related to current measurement Mh; wh is the process

error term, which indicates the imperfection of the transition model fh; vh is the

observation error term, which absorbs the measurement error of Mh as well as the

imperfection of the model gh.

Usually the functions fh and gh are hard to model, since they depend not only on
multiple previous states, but also on the time step. The transition equation is usually


modeled as an autoregressive process as stated in Estimation and Prediction of Time-

Dependent Origin-Destination Flows (Ashok, 1996):

\[ x_h = \sum_{k=h-p}^{h-1} F_h^k x_k + w_h \tag{3.3} \]

where $F_h^k$ is a square matrix representing the effect of $x_k$ on $x_h$. If one makes the
assumption that the autocorrelation structure remains constant over time, $F_h^k$ would
only depend on $h - k$, i.e. $F_h^k = F^{h-k}$. In fact, this assumption is often made for the
sake of model parsimony.
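As a minimal sketch of Equation (3.3) under the time-invariant assumption $F_h^k = F^{h-k}$, the deterministic part of the transition is a weighted sum of the p most recent states. The function name `ar_transition`, the diagonal lag matrices and the numerical values are illustrative assumptions, not part of the original formulation.

```python
import numpy as np

def ar_transition(history, lag_matrices):
    """Deterministic part of an AR(p) transition: sum over lags j of F^j x_{h-j}.

    history[j-1] holds x_{h-j}; lag_matrices[j-1] holds the lag-j matrix F^j.
    """
    x_next = np.zeros_like(history[0], dtype=float)
    for x_past, F in zip(history, lag_matrices):
        x_next += F @ x_past
    return x_next

# Hypothetical example with p = 2 and two state variables.
history = [np.array([2.0, 4.0]), np.array([4.0, 8.0])]   # x_{h-1}, x_{h-2}
lag_matrices = [0.5 * np.eye(2), 0.25 * np.eye(2)]       # lag-1 and lag-2 matrices
x_pred = ar_transition(history, lag_matrices)            # noise term w_h omitted
```

With diagonal lag matrices, as in the simplification discussed later in this section, each state variable is predicted from its own past values only.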

Similarly, for the measurement model, an autoregressive process can also be ap-

plied, as shown in the following display.

\[ M_h = \sum_{k=h-q+1}^{h} A_h^k x_k + v_h \tag{3.4} \]

where the matrix $A_h^k$ accounts for the contribution of $x_k$ to $M_h$. Specifically, if our
target is only origin-destination (OD) estimation and $M_h$ is the vector of aggregated sensor
counts, $A_h^k$ is the assignment matrix, whose $(i, j)$-th element is the fraction of the $j$-th
OD flow in $x_k$ that contributes to the $i$-th element of $M_h$.

Typically, the models fh and gh need to be constructed first in order to solve for

the state vectors xk. The autoregressive models are one type of candidate functions

for $f_h$ and $g_h$, which are easier to estimate. This is because the coefficient matrices
($F_h^k$, $A_h^k$) work as the linearization when $f_h$ and $g_h$ are non-linear. When we have
enough data points for the same period h, we can use the least squares method to estimate
the coefficient matrices. Thus, if each $A_h^k$ is a full matrix, we have ($n_M \times n_x \times q$)
parameters to estimate in the whole model, where $n_M$ is the length of vector $M_h$ and
$n_x$ is the length of $x_k$. Similarly, for the transition model, we also need to estimate
each element of all $F_h^k$ ($n_x \times n_x \times p$ parameters in total). However, the amount of data
available is usually not enough to estimate such complicated models, especially for a
full matrix $F_h^k$ in the transition model. Thus, a simplification is to take $F_h^k$ to be a diagonal
matrix, where correlations between different OD pairs are not considered. One can
make a further assumption that the diagonal elements have the same magnitude, in other
words, the matrix $F_h^k$ is reduced to a scalar $F_h^k$. The model can also be simplified in the time

dimension by reducing p. As for the measurement model, it is more convenient to

estimate with numerical methods, since we already have the simulation-based DTA

model to generate enough data points for us. In this research, the formulations used

are rather simplified:

\[ x_h = f_h(x_{h-1}) + w_h = x_{h-1} + w_h \tag{3.5} \]

\[ M_h = g_h(x_h) + v_h = A_h x_h + v_h \tag{3.6} \]

Following the discussion about estimating Ah in the measurement equation, the DTA

model is treated as a “black box”, then numerical methods are used. In other words,

the linear relations are estimated based on data points generated by the DTA sim-

ulation. Thus, all the relations between xh and Mh can be handled and measured,

even for the types of state vectors and measurements where no analytical formulation

is available.
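The "black box" idea can be illustrated with a minimal sketch: sample state vectors, run the simulator on each, and fit the linear map by least squares. The function `black_box` below is a hypothetical stand-in for one DTA simulation run (here a fixed linear map, so the fit can be checked exactly); the sample size and value ranges are also assumptions.

```python
import numpy as np

def black_box(x):
    # Hypothetical stand-in for one DTA simulation run (assumption):
    # a fixed linear map, so the estimate can be checked exactly.
    A_true = np.array([[0.6, 0.0, 0.3],
                       [0.1, 0.9, 0.0]])
    return A_true @ x

rng = np.random.default_rng(0)
X = rng.uniform(50.0, 100.0, size=(20, 3))      # 20 sampled state vectors
M = np.array([black_box(x) for x in X])         # matching simulated outputs

# Solve X @ A.T ~= M in the least squares sense.
A_T, *_ = np.linalg.lstsq(X, M, rcond=None)
A_est = A_T.T                                    # estimated (2 x 3) linear relation
```

With a real simulator the relation is non-linear and noisy, so the fitted matrix is only a local approximation around the sampled states.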

We make some important comments on the simplification procedure. First, the

transition equation accounts for the relations between state vectors in different time

intervals. A more accurate transition equation is undoubtedly beneficial. However,

its positive impact is more on the prediction side, especially for predicting the traffic

states of several time intervals ahead. As for its impact on calibration, it gives a

starting point (xh) for the measurement model. Thus the effect of the simplification

in Equation (3.5) on calibration is limited, if gh(·) is not a drastic changing function

that depends heavily on the starting point. Second, the measurement model simpli-

fication/approximation is based on the same conjecture that most information in a

measurement vector is already utilized to infer the OD flows (Ashok, 1996). This

conjecture is more likely to hold when measurement errors are low, and wh has low

variance. In other words, when we have enough information to infer the correct OD

flow values, the measurement simplification is acceptable. Finally, it is beneficial to


include higher degrees in both equations, but computational complexity can be a

major issue.

3.1.2 The Idea of Deviations

The idea of deviations comes from Dynamic Origin-Destination Matrix Estimation

and Prediction for Real-Time Traffic Management Systems (Ashok and Ben-Akiva,

1993). Since the autoregressive (AR) process is based on the assumption of temporal

interdependencies between OD flows in different time steps, it is beyond the capability

of the AR process to account for the structural information about trip patterns. For

instance, the morning peak and the evening peak are difficult to model with
a simple time-invariant AR process, because they clearly do not follow the same
transition pattern. Thus, a simple way to incorporate the temporal and spatial
structure of trip patterns is to build on historical (e.g. offline) estimates of the state vectors,
$x_h^H$. The deviations of the state vector $x_h$ and the measurements $M_h$ are hence defined as:

\[ \partial x_h = x_h - x_h^H \tag{3.7} \]

\[ \partial M_h = M_h - M_h^H \tag{3.8} \]

where $M_h^H$ is the vector of historical measurement values.

The transition equation (3.5) and the measurement equation (3.6) then become:

\[ \partial x_h = \partial x_{h-1} + w_h \tag{3.9} \]

\[ \partial M_h = H_h \partial x_h + v_h \tag{3.10} \]

After subtracting the historical values, the deviations ∂xh and ∂Mh are more likely
to be zero-mean random variables, and they represent the day-to-day fluctuations.
The error terms wh and vh are therefore also more likely to have zero mean, and it is
more reasonable for us to assume the following about the error terms:


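\[ E[w_h] = 0, \quad \forall h \in \mathcal{H} \tag{3.11} \]

\[ E[v_h] = 0, \quad \forall h \in \mathcal{H} \tag{3.12} \]

\[ E[w_h v_h^\top] = 0, \quad \forall h \in \mathcal{H} \tag{3.13} \]

\[ E[w_h w_h^\top] = Q_h, \quad \forall h \in \mathcal{H} \tag{3.14} \]

\[ E[v_h v_h^\top] = R_h, \quad \forall h \in \mathcal{H} \tag{3.15} \]

where $Q_h$ and $R_h$ are the covariance matrices of $w_h$ and $v_h$ in time step h, respectively.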

Further, we assume that the error terms across different time steps are uncorre-

lated:

\[ E[w_h w_k^\top] = 0, \quad \forall h, k \in \mathcal{H},\ h \neq k \tag{3.16} \]

\[ E[v_h v_k^\top] = 0, \quad \forall h, k \in \mathcal{H},\ h \neq k \tag{3.17} \]

Note that Hh in Equation (3.10) was specified as Ah in (Ashok and Ben-Akiva,

1993). This implies that the following two equations also hold, besides Equation (3.5)

and Equation (3.6):

\[ x_h^H = x_{h-1}^H + w_h \tag{3.18} \]

\[ M_h^H = A_h x_h^H + v_h \tag{3.19} \]

This indicates that the historical states and measurements follow the same linear model,
based on the assignment matrix, as the deviations. In other words, it assumes that the
linear measurement equation holds globally. This is a major but unnecessary constraint
for the measurement model. Using deviations instead of the absolute values is a major improvement

because the historical values already account for the non-linearity. Thus the devia-

tion form is a local linearization in the vicinity of the historical values, not a global

linearization. Here in Equation (3.10), the assumption is imposed on deviations only,

and the Hh matrix is not necessarily the assignment matrix. In fact, it depicts the


local linear relationship around the historical values. As will be demonstrated later in

Section 3.2, the Hh matrix is calculated through numerical methods, not through

assignment matrix generated by DTA model.

We conclude this section by the following remarks. First, in this thesis, the state

transition model is simplified as a random walk and the focus is on estimating the

measurement model gh, where a similar simplification is made. Second, the construc-

tion of the measurement model is general in handling different data and parameter

types, because it is based on local linearization with numerical methods. Finally,

Equation (3.9) and Equation (3.10) are utilized to implement the idea of deviation,

which is an elegant framework to avoid modeling the structural pattern in state vec-

tors.
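To make the local linearization concrete, the following minimal sketch applies the deviation form to a toy non-linear measurement function. The function `g`, the historical values and the size of the deviation are all illustrative assumptions; in the thesis setting $g_h$ is the DTA simulator and $H_h$ is obtained numerically.

```python
import numpy as np

def g(x):
    # Hypothetical non-linear measurement function (assumption).
    return x ** 2

x_hist = np.array([10.0, 20.0])      # historical state x^H
M_hist = g(x_hist)                   # historical measurements M^H
H = np.diag(2.0 * x_hist)            # local slope of g at x^H

x = x_hist + np.array([0.5, -0.5])   # today's state, close to history
dx = x - x_hist                      # deviation, as in Equation (3.7)
M_dev = M_hist + H @ dx              # deviation-form prediction, noise omitted
err = np.abs(g(x) - M_dev)           # for this g the error equals dx**2
```

The approximation error grows with the square of the deviation here, which is why the linear model is only required to hold near the historical values.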

3.2 Extended Kalman Filter and Variants in DTA

Calibration

There have been several Kalman filter variants applied to solve the state-space
formulation in the context of online calibration. Here the extended Kalman filter (EKF)
algorithm is reviewed first and its connection to the state-space model is made
explicit. Then its variants are summarized and commented upon. Last but not least,

the drawbacks of current practices of EKF are addressed and this leads to the next

section.

3.2.1 Extended Kalman Filter

The extended Kalman filter is an approach to handle non-linearity. The transition

and measurement equations are both non-linear in this case. The basic equations are:

• Transition equation

\[ x_h = f(x_{h-1}, u_h) + w_h \tag{3.20} \]


• Measurement equation

\[ M_h = g(x_h) + v_h \tag{3.21} \]

where f and g depict the transition and measurement relationships, which are
assumed fixed over time; uh are control vectors, which we assume to be always 0 in the DTA
model because the objective here is to calibrate the system instead of controlling it;
wh and vh are uncorrelated zero-mean multivariate Gaussian error terms, with covariance
matrices Qh and Rh, respectively.

By comparing the state-space model and the EKF assumptions, we conclude that
EKF makes the same assumptions as the simplified model in Equation (3.5) and
Equation (3.6), together with Equation (3.11) to Equation (3.17), except for two
additional assumptions: the time-invariant model form and the Gaussian distribution of the
error terms.

When the time-independent assumption does not hold, the EKF algorithm in-

herently handles the time-dependent model, as discussed later. As for the Gaussian

assumption, the idea of deviations ensures zero mean, but the shape could be non-

Gaussian. When we are modeling day-to-day traffic fluctuations, if the historical OD

flows are calculated from enough data, the Gaussian assumption is likely to hold.

With these assumptions in place, the extended Kalman filter algorithm is given in
Algorithm 1.

The input parameters of the extended Kalman filter algorithm are:

• x0: initial starting point (guess) of the state vector at time h = 0

• P 0: initial covariance matrix (guess) of x0

• Qh: time-variant covariance matrix of wh, h ∈ H

• Rh: time-variant covariance matrix of vh, h ∈ H

The Kalman filter is an online algorithm, which means the measurements arrive

while the algorithm is running. Assuming we have the estimates in the last time step


$h-1$, namely $x_{h-1|h-1}$ and $P_{h-1|h-1}$, the algorithm can immediately give us the predicted
estimates $x_{h|h-1}$ and $P_{h|h-1}$ according to the Time Update step. These are called prior
estimates, since they are based on the model alone. The subscript $h|h-1$ means that
measurements up to time $h-1$ have been revealed and we are predicting for time h. When the new
measurements $M_h$ become available, the estimates are updated in the Measurement Update step. The
updated estimates $x_{h|h}$ and $P_{h|h}$ are called posterior estimates.
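The two steps can be sketched as follows. This is a minimal sketch using the textbook Kalman recursion with the random walk transition of Equation (3.5) and a given linearization $H_h$; whether it matches Algorithm 1 line by line is an assumption, and the function names and the scalar example are illustrative.

```python
import numpy as np

def time_update(x_post, P_post, Q):
    # Random walk transition: the prior mean is the last posterior mean,
    # and the uncertainty grows by the process noise covariance Q.
    return x_post.copy(), P_post + Q

def measurement_update(x_prior, P_prior, M, g, H, R):
    # Kalman gain from the prior covariance and the linearization H.
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    # Correct the prior with the innovation M - g(x_prior).
    x_post = x_prior + K @ (M - g(x_prior))
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_post, P_post

# Hypothetical scalar example: linear g with H = 1, unit prior and noise variance.
H = np.array([[1.0]])
g = lambda x: H @ x
x_prior, P_prior = time_update(np.array([0.0]), np.array([[1.0]]), np.array([[0.0]]))
x_post, P_post = measurement_update(
    x_prior, P_prior, np.array([2.0]), g, H, np.array([[1.0]])
)
```

In this example the prior and the measurement are equally uncertain, so the posterior mean lies halfway between them and the posterior variance is halved.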

The directed graph model on which the extended Kalman filter is based is shown in Figure 3-1.
The horizontal arrows are based on the state transition model, and the vertical arrows are
based on the measurement model. The directions of the arrows show the causal relations
in temporal order. For each time step h, when we are given Mh, we can infer xh
and predict xh+1, just as Algorithm 1 shows.

Figure 3-1: Directed Graph Model in Kalman Filtering Scheme

We make some observations about the extended Kalman filter algorithm.

First, it is a non-linear extension to the linear Kalman filter (Kalman, 1960). It
linearizes the functions f and g locally so that the linear Kalman filter update
formulas can be applied.

Second, in Algorithm 1, time-dependent $f_{h-1}$ and $g_h$ are used instead of
time-invariant f and g. This extension is necessary for applying the extended Kalman filter
in the traffic simulation field. For instance, the same additional OD flow from the CBD to a
suburban area will probably contribute less to congestion when added to the morning peak
than when added to the evening peak. By making those functions time-dependent, we allow
the relations to change with time. We can think of the relations $f_{h-1}$ and $g_h$ as depending
on time step h, while in reality they depend on the current traffic situation, in terms of
traffic distribution over the network, congestion level, incidents, weather and perhaps
time of day. Recall that the focus of this thesis is $g_h$. In our previous simplification,
we model the influence of the previous state vectors $x_h, \ldots, x_{h-q+1}$ by only using $g_h(x_h)$.
However, this is not a major compromise if our target is only to estimate the state for the
current interval h, since the influence of previous state vectors is already captured by $g_h$.

Third, the Kalman filter framework is general, in the sense that it does not con-

strain the types of parameters and measurements. So the framework can handle all

the parameters and measurements in the DTA model.

Last but not least, the two linearization steps are crucial, since they represent

the underlying non-linear function. Again, in our settings, we care about the mea-

surement model, and the linearization is only estimated for gh. There are different

methods to estimate the linearization, and thus there are different EKF variants, and

they will be the focus in the following part.

3.2.2 Finite Difference and FD-EKF

The finite difference method is a way to obtain the gradient. It is the most straight-

forward way to calculate the gradient when we do not have the analytical formulation

of gh. In our setting, the gh function is the simulation-based DTA model. Assuming

gh(·) is a vector of dimension m and xh is a vector of dimension n, the gradient is a

matrix of dimension (m× n). Then the gradient matrix as shown in Equation (3.27)


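can be calculated by

\[ H_h = \left. \begin{bmatrix} \frac{\partial g_{h1}}{\partial x_1} & \cdots & \frac{\partial g_{h1}}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_{hm}}{\partial x_1} & \cdots & \frac{\partial g_{hm}}{\partial x_n} \end{bmatrix} \right|_{x_{h|h-1}} \tag{3.31} \]

where,

\[ \begin{bmatrix} \frac{\partial g_{h1}}{\partial x_i} \\ \vdots \\ \frac{\partial g_{hm}}{\partial x_i} \end{bmatrix} \approx \frac{g_h(x_h + \boldsymbol{\delta}_i) - g_h(x_h - \boldsymbol{\delta}_i)}{2\delta_i} \tag{3.32} \]

\[ \boldsymbol{\delta}_i = [0, 0, \ldots, \delta_i, \ldots, 0]^\top \tag{3.33} \]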

The vector $\boldsymbol{\delta}_i$ is called the perturbation vector, as it indicates that the vector $x_h$
is perturbed at its i-th element by the scalar $\delta_i$; Equation (3.32) approximates the i-th
column of the $H_h$ matrix, and it shows the change in all m measurements caused by a change
in the i-th element of $x_h$.

The method shown in Equation (3.32) is central finite difference. We make some

remarks about this method. In our DTA setting, the simulation plays the role of gh.
Thus, in order to get one column of Hh, we need 2 runs of the simulation (gh), i.e. 2n runs
for all n columns. This algorithm therefore has a complexity of O(n) for each Hh. Notice
that the unit of complexity

is one run of simulation. Depending on the network size and number of simulated

vehicles, the time needed for one run can be very different.

Based on this method, the extended Kalman filter algorithm is called FD-EKF in

this thesis.
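Equations (3.31) to (3.33) can be sketched as follows. The function `g` is a hypothetical stand-in for the simulation-based DTA model, and the call counter is added only to make the 2n evaluations visible; the step size is an arbitrary assumption.

```python
import numpy as np

def fd_jacobian(g, x, delta):
    """Central finite difference estimate of H; uses 2 * len(x) calls to g."""
    n = len(x)
    columns = []
    for i in range(n):
        step = np.zeros(n)
        step[i] = delta[i]                      # perturb the i-th element only
        columns.append((g(x + step) - g(x - step)) / (2.0 * delta[i]))
    return np.column_stack(columns)             # shape (m, n)

calls = {"n": 0}

def g(x):
    # Hypothetical stand-in for one DTA simulation run (assumption).
    calls["n"] += 1
    return np.array([x[0] * x[1], x[0] + 3.0 * x[2]])

x = np.array([2.0, 5.0, 1.0])
H = fd_jacobian(g, x, delta=np.full(3, 1e-3))
```

Because each column needs its own pair of runs, the n column computations are independent of each other and can be dispatched in parallel.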

3.2.3 Simultaneous Perturbation and SP-EKF

Another method to calculate the Hh is called simultaneous perturbation. It comes

from the idea of SPSA (Spall, 1992). Instead of perturbing the vector xh in each


dimension, all dimensions are perturbed a small amount at the same time.

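\[ H_h = \left. \begin{bmatrix} \frac{\partial g_{h1}}{\partial x_1} & \cdots & \frac{\partial g_{h1}}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_{hm}}{\partial x_1} & \cdots & \frac{\partial g_{hm}}{\partial x_n} \end{bmatrix} \right|_{x_{h|h-1}} \tag{3.34} \]

where,

\[ \begin{bmatrix} \frac{\partial g_{h1}}{\partial x_i} \\ \vdots \\ \frac{\partial g_{hm}}{\partial x_i} \end{bmatrix} \approx \frac{g_h(x_h + \boldsymbol{\delta}) - g_h(x_h - \boldsymbol{\delta})}{2\delta_i} \tag{3.35} \]

\[ \boldsymbol{\delta} = [\delta_1, \delta_2, \ldots, \delta_i, \ldots, \delta_n]^\top \tag{3.36} \]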

Each element of the perturbation vector is drawn randomly. Notice that in this
case all the columns of Hh share the same numerator in Equation (3.35), so we
only need two evaluations of gh. Thus, to obtain an approximate Hh, we need
O(1) evaluations. The discussion about the unit of complexity in FD-EKF still holds here.

The extended Kalman filter with simultaneous perturbation is called SP-EKF

(Antoniou et al., 2007). Compared with the FD-EKF, it saves much computation
time, but the approximated gradient matrix is less accurate. Since all columns
of Hh are calculated with the same numerator vector, they are linearly dependent.
In fact, the rank of Hh is 1. Because of this characteristic, and because our target is to
obtain accurate parameter estimates, this thesis focuses on FD-EKF, to obtain the most

accurate gradient estimation. The EKF discussed in the following sections of this

thesis is FD-EKF, unless SP is specified.
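The rank-1 property can be checked with a minimal sketch of Equations (3.34) to (3.36). The function `g` is a hypothetical stand-in for the DTA simulation, and drawing each perturbation as plus or minus a fixed magnitude is an assumption in the spirit of SPSA.

```python
import numpy as np

def sp_jacobian(g, x, delta):
    """Simultaneous perturbation estimate of H; uses 2 calls to g."""
    diff = g(x + delta) - g(x - delta)          # shared numerator, length m
    return np.outer(diff, 1.0 / (2.0 * delta))  # column i is diff / (2 * delta_i)

def g(x):
    # Hypothetical stand-in for one DTA simulation run (assumption).
    return np.array([x[0] * x[1], x[0] + 3.0 * x[2]])

rng = np.random.default_rng(0)
delta = 1e-3 * rng.choice([-1.0, 1.0], size=3)  # random +/- perturbation per dimension
H_sp = sp_jacobian(g, np.array([2.0, 5.0, 1.0]), delta)
rank = np.linalg.matrix_rank(H_sp)
```

Every column of `H_sp` is a scalar multiple of the same difference vector, so the estimate cannot distinguish the effect of one state variable from another within a single draw.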

3.2.4 Characteristics of EKF in Online Calibration for DTA

As stated before, the EKF is based on a linearization of the non-linear functions.

According to Online Calibration of Dynamic Traffic Assignment (Antoniou, 2004),

EKF outperforms UKF in terms of error between simulated and observed measure-

ments. This observation holds for both estimation and prediction. This demonstrates

that EKF has good performance in practice even though it only approximates the


non-linear model to the first order. This case study was conducted on a freeway with

ramps, which is considered a relatively low complexity scenario with only 80 parame-

ters for each 15 minute time interval. Since the goal of online calibration is real-time

performance, its performance on complex networks with larger dimension and shorter

time intervals has yet to be investigated.

3.3 Constrained EKF

In this section, the situation when there are constraints on the state vectors is con-

sidered in the EKF framework. An efficient near-optimal method is proposed.

3.3.1 Motivation

When we have constraints on the state vectors, a simple way to impose

them is to project the estimated state vector into the feasible region. When the

constraints are in the form of lower and upper bounds, we can simply project each

element of the state vector into its corresponding feasible region. This element-wise

projection is called truncation, and this term will be used often in the remaining part.

In the context of OD flow estimation, we know that for a certain OD flow variable,

it should be non-negative because the number of vehicles cannot be negative. Hence,

in this case, we have x ≥ 0 for this OD pair, as a natural non-negative constraint.

Thus the truncation is needed for negative OD estimates in order to be fed into the

DTA system. Although efficient, this fix is not necessarily correct, because estimators

of different dimensions are correlated. Truncating one variable while keeping others

intact disregards its relation with other variables.

As discussed above, there are natural non-negative constraints on the OD values.

Since EKF is based on unconstrained Gaussian assumptions, it is likely that the

estimates violate those constraints. It is especially the case when the true values of

the OD flows are zero or close to zero, and the estimated variance is large. In this case,

the Kalman filter tends to give estimates with noise around the true value. If
those true values are zero, the Kalman filter estimates will oscillate around 0. For all


the OD pairs with 0 as true values, the estimates will be either positive or negative.

Then, due to the truncation, those negative values will be set to zeros. Thus, on

average we are estimating positive values for those OD pairs that should be all zero.

Since this overestimation happens for each interval, the error accumulates. Hence the

calibrated DTA system will be further and further away from the true traffic scenario

we want to fit. A detailed demonstration is given in Section 4.3.2 of the case study.
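The upward bias introduced by truncation in a single interval can be reproduced with a minimal sketch. The standard normal estimation noise and the sample size are assumptions made only for illustration; the sketch does not model how the error accumulates across intervals.

```python
import numpy as np

rng = np.random.default_rng(0)
true_flow = 0.0                                   # an OD pair with no trips
# Assumed zero-mean estimation noise around the true value.
estimates = true_flow + rng.normal(0.0, 1.0, size=10_000)
truncated = np.maximum(estimates, 0.0)            # element-wise projection onto x >= 0

mean_raw = estimates.mean()                       # close to the true value 0
mean_truncated = truncated.mean()                 # clearly positive: biased upward
```

The raw estimates average out near zero, while the truncated ones do not, because truncation removes only the negative half of the noise.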

3.3.2 Optimization Formulation for Constrained Kalman Fil-

ter Estimates

In this subsection, the theoretical basis of the constrained Kalman filter is discussed.

First the maximum a posteriori (MAP) estimate idea is introduced to demonstrate

the objective of EKF. Then, with this objective, the true MAP estimate within the
constraints is demonstrated in a two-dimensional example. Finally, the general

optimization formulation is presented.

As discussed in Section 3.2.1, the Kalman filter family assumes Gaussian dis-

tributed error terms. Thus, the state estimator, as a random vector, is also Gaussian

distributed. The Kalman filter estimates xh|h at time step h are essentially the max-

imum a posteriori (MAP) estimates, which are updated given the measurements and

the prior distribution (based on the transition model). Equation (3.30) gives the

posterior covariance matrix P h|h, which depicts the shape of the posterior Gaussian

distribution. Thus, we can reconstruct the posterior distribution as:

\[ f_X(x) = \frac{1}{\sqrt{(2\pi)^n |P_{h|h}|}} \exp\left( -\frac{1}{2} (x - x_{h|h})^\top P_{h|h}^{-1} (x - x_{h|h}) \right) \tag{3.37} \]

where, n is the dimension of vector x.

For instance, as Figure 3-2 shows, the contours are the posterior probability den-

sity function (PDF) for a 2-dimensional case. This is an example with (x, y) ∼


$\mathcal{N}(\mu, \Sigma)$, where

\[ \mu = [0.5, -1]^\top, \qquad \Sigma = \begin{bmatrix} 1 & 0.7 \\ 0.7 & 1 \end{bmatrix} \]

We can see that the “cross” is the center of the PDF, which is the MAP estimate for

unconstrained Kalman filter. When we directly impose the constraints x ≥ 0, y ≥ 0,

we get the “circle” point. But in terms of maximizing a posteriori probability density

under the constraints, the “circle” point certainly does not do a good job. The true

MAP estimate should be the “asterisk” point.

Figure 3-2: 2-D Posterior PDF Contour and Different Estimators
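The three estimators in this example can be computed with a minimal sketch. The constrained point is obtained here by fixing the violated coordinate y at its bound and taking the Gaussian conditional mean of x given y = 0; using this shortcut for the "asterisk" point is an assumption that happens to be exact for this two-dimensional case, and the variable names are illustrative.

```python
import numpy as np

mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.7],
                  [0.7, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def cost(p):
    # Quadratic form in the exponent of the Gaussian PDF;
    # a smaller value means a higher posterior density.
    d = np.asarray(p) - mu
    return float(d @ Sigma_inv @ d)

x_unconstrained = mu                       # the "cross"
x_truncated = np.maximum(mu, 0.0)          # the "circle": (0.5, 0)

# Fix y at its bound 0 and take the Gaussian conditional mean of x given y = 0.
x_given_y0 = mu[0] + Sigma[0, 1] / Sigma[1, 1] * (0.0 - mu[1])
x_constrained = np.array([x_given_y0, 0.0])   # candidate for the "asterisk"
```

For these numbers the constrained point is (1.2, 0) with a cost of 1.0, against a cost of about 1.96 for the truncated point (0.5, 0), so the truncated point lies on a lower density contour.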

This problem is defined as Kalman filter with state inequality constraints. It is

discussed in (Simon and Simon, 2006; Simon, 2010). They formulated the problem as


a quadratic program with linear inequality constraints, as shown below. The solution
of the problem gives the estimates under the inequality constraints.

\[ \max_x f_X(x) \Leftrightarrow \min_x (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x}) \tag{3.38} \]

\[ \text{s.t.} \quad Dx \le d \tag{3.39} \]

where $\Sigma$ is the covariance estimate for the state estimate $\hat{x}$; $D$ is a known $s \times n$
constant matrix, $s$ is the number of constraints, $n$ is the number of state variables,
and $s \le n$. Further, $D$ is assumed to have full row rank, i.e. rank $s$. If the rank of $D$ is less
than $s$, we can always drop the redundant constraints to make it full rank.

In Kalman Filtering with Inequality Constraints for Turbofan Engine Health Estimation (Simon and Simon, 2006), the same quadratic optimization formulation is proposed and the general idea of the active set method is discussed. The authors also prove that the variance of the constrained estimates is smaller than that of the unconstrained estimates. Their case study was performed on turbofan health monitoring, where quadratic programming was used to solve the optimization problem. Compared with the unconstrained Kalman filter, the constrained Kalman filter reduced the estimation error by 50% on average. This suggests that the constraints will also be helpful for the problem discussed in Section 3.3.1.

3.3.3 An Efficient Near-Optimal Algorithm for EKF with

Bound Constraints in DTA Calibration

In the specific context of DTA estimation, the constraints are usually imposed on individual parameters. In the OD estimation example, we have $x \ge 0$. Another example is the supply parameters, where the supply parameter vector could have both upper and lower bounds, i.e. $s^{lb} \le s \le s^{ub}$. So in our DTA calibration case, we have the following optimization formulation after each measurement update:

$$\min_{x} \, (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x}) \tag{3.40}$$

$$\text{s.t.} \quad x^{lb} \le x \le x^{ub} \tag{3.41}$$

where $\hat{x} = x_{h|h}$, $\Sigma = P_{h|h}$, and $x^{lb}$ and $x^{ub}$ are the lower bound and upper bound vectors for the state vector $x$.

An intuitive way of solving this problem builds on the truncation practice. For simplicity of the demonstration, we assume there is no upper bound for $x$ and focus on the lower bound. When we truncate the unconstrained estimate $\hat{x}$, we set the elements that violate the lower bounds to the corresponding elements of $x^{lb}$. This essentially adds equality constraints to the optimization problem. Let Set $A$ contain the indices where truncation is performed, and let Set $A^c$ be the complement of Set $A$. In order to solve this problem, we can maximize the following conditional PDF:

$$\max_{x_{A^c}} \; f_X\left(x_{A^c} \mid x_A = (x^{lb})_A\right) \tag{3.42}$$

Maximizing the conditional probability (objective function) is equivalent to maximizing the joint probability $f_X\left(x_{A^c}, x_A = (x^{lb})_A\right)$, since:

$$f_X\left(x_{A^c} \mid x_A = (x^{lb})_A\right) = \frac{f_X\left(x_{A^c}, x_A = (x^{lb})_A\right)}{f_X\left(x_A = (x^{lb})_A\right)}$$

and the denominator is a fixed probability density for given $\hat{x}$ and $\Sigma$. Thus, we want to:

$$\max_{x_{A^c}} \; f_X\left(x_{A^c}, x_A = (x^{lb})_A\right) \;\Leftrightarrow\; \min_{x_{A^c}} \; \left. (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x}) \right|_{x_A = (x^{lb})_A} \tag{3.43}$$

Now we prove that the solution of Equation (3.42) is:

$$x_{A^c} = \hat{x}_{A^c} + \Sigma_{A^c,A} (\Sigma_{A,A})^{-1} (x_A - \hat{x}_A) \tag{3.44}$$

$$x_A = (x^{lb})_A \tag{3.45}$$

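Proof. Assume that every index in Set $A^c$ is less than every index in Set $A$. If not, we can always exchange rows and columns of $\Sigma$ and elements of $x$ such that this assumption holds. Thus $\Sigma$ and $x$ can be split into blocks. In addition, we use the following notation, where $\hat{x}$ is the unconstrained estimate, partitioned in the same way as $x$:

$$x = \begin{bmatrix} x_{A^c} \\ x_A \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{3.46}$$

$$\Sigma^{-1} = J = \begin{bmatrix} J_{A^c,A^c} & J_{A^c,A} \\ J_{A,A^c} & J_{A,A} \end{bmatrix} = \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} \tag{3.47}$$

$$\Sigma^{-1}\hat{x} = \begin{bmatrix} J_{11}\hat{x}_1 + J_{12}\hat{x}_2 \\ J_{21}\hat{x}_1 + J_{22}\hat{x}_2 \end{bmatrix} = h = \begin{bmatrix} h_{A^c} \\ h_A \end{bmatrix} = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \tag{3.48}$$

Thus,

$$(x - \hat{x})^\top \Sigma^{-1} (x - \hat{x}) = x^\top J x - 2\hat{x}^\top J x + \hat{x}^\top J \hat{x} \tag{3.49}$$

$$= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^\top \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 2 \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}^\top \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + C \tag{3.50}$$

$$= x_1^\top J_{11} x_1 + \left(2(J_{12}x_2)^\top - 2h_1^\top\right) x_1 + x_2^\top J_{22} x_2 - 2h_2^\top x_2 + C \tag{3.51}$$

where $C$ is a constant that does not depend on $x$. Note that $x_2$ is fixed at $(x^{lb})_A$ and only $x_1$ is variable. Thus, according to the first-order condition:

$$2J_{11}x_1 + 2J_{12}x_2 - 2h_1 = 0 \tag{3.52}$$

$$\Rightarrow \; x_1 = (J_{11})^{-1}h_1 - (J_{11})^{-1}J_{12}x_2 \tag{3.53}$$

We now claim that $J_{11}$ is invertible. Since $\Sigma$ is the covariance matrix of a multivariate normal, it is invertible, so $\Sigma$ and $J$ are both positive definite. The leading principal minors of $J$ are all positive, and the leading principal minors of $J_{11}$ are among them, so they are also positive. Hence $J_{11}$ is also positive definite and thus invertible.

Then we substitute $h_1$ back:

$$x_1 = \hat{x}_1 - (J_{11})^{-1}J_{12}(x_2 - \hat{x}_2) \tag{3.54}$$

where $x_2 = x_A = (x^{lb})_A$.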

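In the following part, we prove that $(J_{11})^{-1}J_{12} = -\Sigma_{12}(\Sigma_{22})^{-1}$, where $\Sigma$ is divided into blocks in the same way as $J$:

$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \tag{3.55}$$

Applying a block row operation gives:

$$\begin{bmatrix} I & -\Sigma_{12}(\Sigma_{22})^{-1} \end{bmatrix} \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} = \begin{bmatrix} \Sigma' & 0 \end{bmatrix} \tag{3.56}$$

with $\Sigma' \triangleq \Sigma_{11} - \Sigma_{12}(\Sigma_{22})^{-1}\Sigma_{21}$.

Since $\Sigma J = I$, right-multiplying both sides of Equation (3.56) by $J$ gives us

$$\begin{bmatrix} I & -\Sigma_{12}(\Sigma_{22})^{-1} \end{bmatrix} = \begin{bmatrix} \Sigma' & 0 \end{bmatrix} \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} = \begin{bmatrix} \Sigma' J_{11} & \Sigma' J_{12} \end{bmatrix} \tag{3.57}$$

from which we conclude the following by matching entries on both sides:

$$\Sigma' = (J_{11})^{-1} \tag{3.58}$$

$$-\Sigma_{12}(\Sigma_{22})^{-1} = (J_{11})^{-1}J_{12} \tag{3.59}$$

Substituting Equation (3.59) into Equation (3.54) yields Equation (3.44), which completes the proof.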
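The two block identities in Equations (3.58) and (3.59) can be checked numerically. The following NumPy sketch builds an arbitrary positive definite covariance (the test matrix, its size and the split point `k` are assumptions made only for this check) and compares both sides of each identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3                        # k free indices (A^c), n - k fixed indices (A)
G = rng.normal(size=(n, n))
Sigma = G @ G.T + n * np.eye(n)    # symmetric positive definite test covariance
J = np.linalg.inv(Sigma)           # J = Sigma^{-1}

S11, S12 = Sigma[:k, :k], Sigma[:k, k:]
S21, S22 = Sigma[k:, :k], Sigma[k:, k:]
J11, J12 = J[:k, :k], J[:k, k:]

# Equation (3.59): -Sigma12 (Sigma22)^{-1} = (J11)^{-1} J12
lhs = -S12 @ np.linalg.inv(S22)
rhs = np.linalg.solve(J11, J12)
print(np.allclose(lhs, rhs))

# Equation (3.58): Sigma' = Sigma11 - Sigma12 (Sigma22)^{-1} Sigma21 = (J11)^{-1}
schur = S11 - S12 @ np.linalg.solve(S22, S21)
print(np.allclose(schur, np.linalg.inv(J11)))
```

A numerical check of this kind does not replace the proof, but it is a quick guard against sign or index mistakes when implementing Equation (3.44).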

Thus, going back to the general case in Equations (3.40) and (3.41): when elements of the unconstrained MAP estimate violate the bounds (their indices form Set $A$), we can set them to the nearest bounds and then obtain the conditional MAP estimate with $x_A$ fixed at the bounds, according to Equation (3.44). Note that this conditional MAP estimate is not guaranteed to satisfy the bounds for $x_{A^c}$. Thus we can do this iteratively, moving the indices where bounds are violated from Set $A^c$ to Set $A$ and re-estimating the conditional MAP, until all elements whose indices are in Set $A^c$ are in the feasible region. The near-optimal algorithm is specified as Algorithm 2.

Algorithm 2 EKF with Iteratively Added Equality Constraints

Run EKF and obtain state estimate $\hat{x}$ and variance estimate $\Sigma$; $n$ is the dimension of $\hat{x}$
Initialize
    $I \leftarrow \emptyset$, $A \leftarrow \emptyset$, $x \leftarrow \hat{x}$
do
    if $I \ne \emptyset$ then
        Adjust invalid state elements
            $x_{I_{lb}} \leftarrow x^{lb}_{I_{lb}}$   (3.60)
            $x_{I_{ub}} \leftarrow x^{ub}_{I_{ub}}$   (3.61)
        Find conditional MAP estimates
            $A \leftarrow A \cup I$   (3.62)
            $A^c \leftarrow \{1, 2, \ldots, n\} \setminus A$   (3.63)
            $x_{A^c} = \hat{x}_{A^c} + \Sigma_{A^c,A} (\Sigma_{A,A})^{-1} (x_A - \hat{x}_A)$   (3.64)
    end if
    $I_{lb} \leftarrow \emptyset$, $I_{ub} \leftarrow \emptyset$
    Identify invalid state indices
    for $j = 1$ to $n$ do
        if $x_j < x^{lb}_j$ then $I_{lb} \leftarrow I_{lb} \cup \{j\}$
        else if $x_j > x^{ub}_j$ then $I_{ub} \leftarrow I_{ub} \cup \{j\}$
        end if
    end for
    $I \leftarrow I_{lb} \cup I_{ub}$   (3.65)
while $I \ne \emptyset$

In Algorithm 2, $x_A = [x_{A(1)}, \ldots, x_{A(|A|)}]^\top$, where $A(j)$ is the $j$-th element of Set $A$ and $|A|$ is the cardinality of Set $A$. Similarly, $\Sigma_{A^c,A} = [\Sigma_{i,j}]_{(i,j) \in A^c \times A}$.

Based on our experiments, this algorithm yields a value of the objective function (Equation (3.40)) that is around 2% worse than the optimum, but it is much more efficient than solving the same quadratic programming problem.
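The iteration of Algorithm 2 can be sketched in NumPy as follows. This is an illustrative reading of the pseudocode rather than the implementation used in the case study: the function name `bound_constrained_map`, the boolean-mask representation of Set A and the `max_iter` safeguard are assumptions. The example reuses the 2-dimensional Gaussian of Figure 3-2 with lower bound 0 and no upper bound.

```python
import numpy as np


def bound_constrained_map(x_hat, Sigma, lb, ub, max_iter=None):
    """Near-optimal bound-constrained estimate in the spirit of Algorithm 2."""
    x_hat = np.asarray(x_hat, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    n = x_hat.size
    lb = np.broadcast_to(np.asarray(lb, dtype=float), (n,))
    ub = np.broadcast_to(np.asarray(ub, dtype=float), (n,))
    x = x_hat.copy()
    active = np.zeros(n, dtype=bool)            # Set A as a boolean mask
    # Each pass fixes at least one new index, so n + 1 passes suffice.
    for _ in range(max_iter or n + 1):
        low, high = x < lb, x > ub              # I_lb and I_ub
        violated = low | high                   # I, Equation (3.65)
        if not violated.any():
            break
        x[low], x[high] = lb[low], ub[high]     # Equations (3.60), (3.61)
        active |= violated                      # Equation (3.62)
        free = ~active                          # Equation (3.63)
        if free.any():
            # Equation (3.64): conditional mean of the free block.
            gain = np.linalg.solve(Sigma[np.ix_(active, active)],
                                   x[active] - x_hat[active])
            x[free] = x_hat[free] + Sigma[np.ix_(free, active)] @ gain
    return x, active


x_hat = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.7], [0.7, 1.0]])
x_con, fixed = bound_constrained_map(x_hat, Sigma, lb=0.0, ub=np.inf)
print(x_con, fixed)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of the block of Σ indexed by A explicitly, which is a design choice of this sketch and not something the thesis prescribes.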

3.3.4 Coordinate Descent Algorithm with Near-Optimal Initialization

When we are interested in the true optimum, the near-optimal algorithm (Algorithm 2) can also serve as an initial estimate, in other words, a starting point for the quadratic program. Since the constraints are decoupled across elements, a coordinate descent method can be applied to solve the quadratic programming problem. In our experiments it was also faster than the quadratic programming toolbox in MATLAB. The specific coordinate descent algorithm we used is given as Algorithm 3.

Algorithm 3 Coordinate Descent

Initialize
    $x \leftarrow x_0$
    $\varepsilon \leftarrow 0.001$
    $Q \leftarrow \Sigma^{-1}$
    $b \leftarrow -\Sigma^{-1}\hat{x}$
    $Obj_{this} \leftarrow (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x})$
do
    for $j = 1$ to $n$ do
        $x_j = x_j - \frac{1}{Q_{j,j}} \left(Q_{j,1:n}\, x + b_j\right)$   (3.66)
        $x_j \leftarrow \max(x_j, x^{lb}_j)$   (3.67)
        $x_j \leftarrow \min(x_j, x^{ub}_j)$   (3.68)
    end for
    $Obj_{last} \leftarrow Obj_{this}$   (3.69)
    $Obj_{this} \leftarrow (x - \hat{x})^\top \Sigma^{-1} (x - \hat{x})$   (3.70)
while $Obj_{last} - Obj_{this} > \varepsilon$

There are several remarks about the coordinate descent. First, the step size in each update is fixed to $1/Q_{j,j}$. Since the objective function is quadratic, an update with this step size gives the optimal solution for $x_j$ when the other dimensions are fixed. Second, this algorithm is computationally inexpensive, because Equation (3.66) involves no matrix multiplications, only the product of one row of $Q$ with $x$. Last but not least, in the specific context of OD estimation, other criteria could be used as the stopping rule. For instance, a distance measure (such as the L1 norm) between the current and the previous estimated state vectors could be used in place of the objective function. When the change is less than $\varepsilon$, the algorithm stops. When the L1 norm is used, $\varepsilon = 0.001$ is good enough, because hundreds of additional updates will not move the current estimated state vector by more than 1 unit. This is especially relevant for OD flow estimation, since OD flow values only need a precision of 1. Values of 1.8 and 2.4 do not make much difference to the DTA system because both will be rounded to 2 vehicles. So this near-optimal solution is likely to work well in the DTA estimation context.
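A compact NumPy sketch of the coordinate descent in Algorithm 3 is given below. The function name `coordinate_descent`, the `max_sweeps` safeguard and the default starting point (the truncated estimate, used when no `x0` is supplied) are assumptions of this sketch; in the thesis the starting point is the output of Algorithm 2.

```python
import numpy as np


def coordinate_descent(x_hat, Sigma, lb, ub, x0=None, eps=1e-3, max_sweeps=1000):
    """Minimize (x - x_hat)^T Sigma^{-1} (x - x_hat) subject to lb <= x <= ub."""
    x_hat = np.asarray(x_hat, dtype=float)
    Q = np.linalg.inv(np.asarray(Sigma, dtype=float))   # Q <- Sigma^{-1}
    b = -Q @ x_hat                                      # b <- -Sigma^{-1} x_hat
    n = x_hat.size
    lb = np.broadcast_to(np.asarray(lb, dtype=float), (n,))
    ub = np.broadcast_to(np.asarray(ub, dtype=float), (n,))
    start = x_hat if x0 is None else np.asarray(x0, dtype=float)
    x = np.clip(start, lb, ub)                          # feasible starting point

    def objective(v):
        d = v - x_hat
        return float(d @ Q @ d)

    obj = objective(x)
    for _ in range(max_sweeps):
        for j in range(n):
            x[j] -= (Q[j] @ x + b[j]) / Q[j, j]         # Equation (3.66)
            x[j] = min(max(x[j], lb[j]), ub[j])         # Equations (3.67), (3.68)
        obj_last, obj = obj, objective(x)               # Equations (3.69), (3.70)
        if obj_last - obj <= eps:
            break
    return x, obj


x_hat = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.7], [0.7, 1.0]])
x_cd, obj_cd = coordinate_descent(x_hat, Sigma, lb=0.0, ub=np.inf)
print(x_cd, round(obj_cd, 4))
```

On the 2-dimensional example of Figure 3-2 the sweep moves the truncated point (0.5, 0) to (1.2, 0) in one pass and stops after the second pass, when the objective no longer improves.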

One last comment about the computational performance: in our case study, the two algorithms need less than 10 seconds to run, which is much less time than running the quadprog function in MATLAB. However, this is an empirical comparison. Firm conclusions about the efficiency of the proposed algorithms require more tests and comparisons.

3.4 Summary

In this chapter, the extended Kalman filter methodology was first reviewed. Then its application to DTA models was demonstrated with a simplification of the EKF models. In addition, two methods of calculating the gradient matrix were reviewed and discussed. Next, an extension of the Kalman filter family to the constrained case was discussed. For the case where the constraints are decoupled and can be imposed on each element, a near-optimal solution was given based on maximizing the conditional posterior distribution. Finally, an efficient algorithm based on coordinate descent to solve for the constrained optimum was discussed.

Chapter 4

Case Study: Constrained EKF on

Singapore Expressway Network

In this chapter, a case study is conducted on the Singapore Expressway Network to

demonstrate the performance of the methods discussed in Chapter 3. This chapter is

structured as follows: First the DTA system and data source are introduced. Then,

the Singapore expressway network is briefly summarized. Finally, calibration meth-

ods, including EKF, constrained EKF and GLS are tested on the network, along with

results and discussions.

4.1 Model Overview

4.1.1 DynaMIT Overview

The DTA system we want to calibrate is DynaMIT (DYnamic Network Assignment

for the Management of Information to Travelers). It is a state-of-the-art DTA system

that aims at providing Traffic Management Centers (TMC) with real-time traffic

estimation and prediction. The ultimate goal is to provide traveler information and

route guidance, based on the estimation and prediction. Since traffic estimation is the basis for the remaining functionalities, obtaining an accurate traffic estimate is crucial, which is why it forms the topic of this subsection.

Figure 4-1: DynaMIT Real-Time Framework (source: Ben-Akiva et al. (2010a))

DynaMIT is a simulation-based DTA system. The DynaMIT real-time work flow

is shown in Figure 4-1, which is built on the general framework, as can be seen in

Figure 1-1. DynaMIT achieves traffic state estimation and prediction based on his-

torical data, a priori parameter values, and real-time observations. Specifically, the

state estimation and prediction are accomplished by the online calibration algorithm,

operated on demand and supply models. The online calibration algorithm was dis-

cussed in Chapter 3. In the following paragraphs, concepts and core models of demand

simulation and supply simulation will be introduced.

The demand simulation mimics the process by which vehicles are added to the road network. It utilizes time-dependent origin-destination (OD) matrices to represent the demand pattern. The OD matrices aggregate the demand decisions of individuals, including origin, destination, departure time and route choice. Since the demand simulator models individual travel demand, it is microscopic. In contrast, the supply simulation handles the process by which traffic proceeds through the road network. It is mesoscopic, meaning the traffic dynamics are modeled at the level of road segments and lanes, not individual vehicles. It utilizes speed-density relationships and queuing theory to quantify the effect of queues, spillbacks, incidents and bottlenecks.

To illustrate how the DTA model handles trips, consider a family who want to go to a movie by car. They decide to depart at 6:30pm. The driver of their sedan decides to take a certain route to avoid the evening peak. So far, the trip generation, departure time and route choice of this vehicle are handled by the demand simulation. The vehicle movement is then determined by the congestion along the selected route, weather conditions, road conditions (work zone, lane closure), etc. The vehicle movement part is handled by the supply simulation.

Generally, DynaMIT is capable of efficiently modeling large networks since it is a mesoscopic DTA model. Thus DynaMIT is a DTA candidate that is convenient to calibrate and fine-tune, although an analytical form is generally not available for simulation-based DTA. The simulation efficiency also makes DynaMIT useful in real-time decision support. Detailed information about the DynaMIT framework and its applications is available in (Ben-Akiva et al., 2010a).

4.1.2 MITSIMLab Overview

In this thesis, MITSIMLab was configured to mimic the traffic in the real world. MITSIMLab, or MITSIM, is short for MIcroscopic Traffic SIMulator Laboratory. It is designed to model traffic flows in road networks equipped with advanced traffic control, route guidance and surveillance systems. It dates from 1996 (Yang and Koutsopoulos, 1996) and was released as an open-source project in 2004 (more information: http://its.mit.edu/software/mitsimlab). It has been used widely in practice, with applications in the USA, the UK, Sweden, Italy, Switzerland, Malaysia, Japan, Korea, etc. It was also used to test and verify the driving behavior models under development within the NGSIM (Next Generation Simulation) project.

As a microscopic simulator, MITSIMLab captures the interactions among vehicles and roads, including car-following, lane-changing and traffic signal response behaviors. Thus it can model the vehicles on the road network at lane-level precision. Drivers’ route choice decisions are captured by a probabilistic model in the presence of real-time traffic information fed by route guidance systems. More details of MITSIMLab are available in (Ben-Akiva et al., 2010b). In summary, MITSIMLab supports a comprehensive representation of the entire transportation system, considering vehicles, traffic control devices, algorithms and other elements of a TMC (e.g. surveillance system). Thus it has the ability to provide simulated data that closely resemble real-time surveillance data from the real world, for instance, traffic flow count data.

4.1.3 DynaMIT and MITSIMLab Integration

As stated in the above subsections, DynaMIT is an efficient mesoscopic DTA system while MITSIMLab is a comprehensive representation of the real world. Thus we can take advantage of the realism of MITSIMLab and the efficiency of DynaMIT. We can simulate the real world with MITSIMLab, where demand is generated and surveillance data are recorded. Then we use the surveillance data to calibrate DynaMIT; in other words, we tune the parameters such that the DynaMIT simulated sensor data fit the surveillance data provided by MITSIMLab.

The advantage of using MITSIMLab as the data feed instead of real field data is that we have total control over the “real world”, which is beneficial when we are testing our algorithms. Nevertheless, for the calibration algorithm to be deployed in the real world, we have to deal with real data in the end.

In the following sections, the data flow between MITSIMLab and DynaMIT is explained in detail.

4.2 Calibration Framework for Singapore Expressway Network

4.2.1 Simulation Scenario Overview

The case study was conducted on the Singapore Expressway network. The road

network of Singapore is shown with links (curves) in Figure 4-2. The links with

deeper color represent expressways in Singapore, which are extracted and modeled in

a network file. The network file contains detailed and precise representations of the

road links and segments, while the location, length, curvature, default speed limits,

different lanes and lane-connections are also specified. The same network file is shared

by DynaMIT and MITSIMLab.

Figure 4-2: Singapore Road Network (source: Google Maps, 2016 )

The Singapore Expressway network file has 939 nodes and 1157 links. The definitions of nodes and links agree with the basic definitions in network studies: each link has two end points and nodes connect with one or more end points. Nodes handle the connection between links, and some nodes serve as loaders where vehicles are added to the network. A single link comprises one or more segments, for which coordinates, lanes, lane-connections, curvature, speed limits and free flow speed are specified. The network visualization in DynaMIT and MITSIMLab is shown in Figure 4-3.

Figure 4-3: Singapore Road Network in DynaMIT and MITSIMLab

There are 650 sensors over the network, each placed on a single segment, according

to the real sensor location provided by Land Transport Authority of Singapore (LTA).

Those sensors provide the surveillance data, which are traffic flow counts per 5 minutes

in our scenario.

On the demand side, the parameters are the flows of 1623 Origin-Destination (OD) pairs in each 5-minute interval from 17:00 to 21:30 (night peak). Since each loader can be the origin or destination of a trip, there are tens of thousands of possible OD pairs in total. Based on heuristic rules, unreasonably long trips caused by detours are eliminated (Lu, 2013), which decreases the dimension to 4121 OD pairs. The dimension is further decreased to 1623 OD pairs by eliminating OD pairs whose values are constantly 0 throughout the night peak. In brief, the state vector has a dimension of 1623.

In summary, the case study is performed on the Singapore Expressway network. The night peak from 17:00 to 21:30 contains 54 five-minute intervals, so the dimension of the OD parameters during the night peak is 1623 × 54 = 87642. We calibrate them sequentially in time, with 1623 parameters calibrated at each time step.

4.2.2 MITSIMLab and DynaMIT Integration: Data Flow

As stated in the above sections, this research relies on the integration of MITSIMLab and DynaMIT. However, we need a set of realistic input parameters to approximate the real world and decrease the gap between the simulation laboratory and the real world. To this end, the data generation procedure and the calibration framework are discussed in this subsection.

Figure 4-4: DynaMIT and MITSIMLab Integration Workflow

Figure 4-4 shows the overall workflow of the integration. First of all, MITSIMLab is calibrated with traffic flow data provided by LTA, using the W-SPSA algorithm (Lu, 2013). This calibration was carried out in existing work. W-SPSA is an offline calibration algorithm that uses the data collected in a past time period and tunes the parameters knowing the evolving patterns of the surveillance data. In other words, for each time interval being calibrated, the surveillance data in the following intervals are already known. This may give offline algorithms an advantage in obtaining accurate parameters. Returning to the specific procedure of MITSIMLab calibration, W-SPSA is run to obtain the offline calibrated demand for the whole day, and then the demand in the night peak period (17:00-21:30) is extracted for further use.

The second step in constructing the integration scenario is data smoothing. As with other offline calibration algorithms, the objective of W-SPSA is Equation (2.1). Although there is a penalty for parameters that deviate greatly from their a priori values, it is not guaranteed that the parameters we obtain will be smooth. In our case, the offline calibrated demand for some OD pairs can be seen in Figure 4-5. The drastic demand change for the OD pair with index 1007 in Figure 4-5 is not the usual pattern we observe in the real world. Thus, in order to make the demand pattern more realistic, we smoothed the demand with a Gaussian kernel smoother with bandwidth h of 0.16 hour (approximately 10 minutes). The resulting smoothed demand is shown in Figure 4-6. This smoothed demand file with 1623 OD pairs is then used as the true demand for MITSIMLab, which is deemed our “real world”. With these input parameters, MITSIMLab is run to obtain the surveillance data, i.e. sensor flow counts.
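The smoothing step can be illustrated with a short Python sketch. The thesis does not state which estimator was used, so the Nadaraya-Watson form below (a weighted average with Gaussian weights) is an assumption, as are the function name `gaussian_kernel_smooth` and the demand series, which is hypothetical and not taken from the calibrated demand file. Only the bandwidth of 0.16 hour and the 5-minute interval come from the text.

```python
import math


def gaussian_kernel_smooth(times, values, bandwidth):
    """Nadaraya-Watson smoother with a Gaussian kernel (assumed form)."""
    smoothed = []
    for t in times:
        # Gaussian weight of every sample relative to the evaluation time t.
        weights = [math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in times]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


# Hypothetical demand (veh / 5 min) of one OD pair with a single spike.
times = [17.0 + 5.0 * k / 60.0 for k in range(12)]    # hours, 5-minute steps
demand = [10, 10, 10, 10, 10, 40, 10, 10, 10, 10, 10, 10]
smooth = gaussian_kernel_smooth(times, demand, bandwidth=0.16)
print([round(v, 1) for v in smooth])
```

The spike is spread over the neighboring intervals, so the smoothed series changes gradually instead of jumping from 10 to 40 and back within 10 minutes.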

Figure 4-5: Sample W-SPSA Calibrated Demand

Figure 4-6: Sample Gaussian Kernel Smoothed Demand

As for the third step, we need to construct the historical demand for DynaMIT, which works as the starting point of our calibration algorithm. The dotted arrow in Figure 4-4 indicates that in this particular experiment, our historical demand approximates the baseline of the true demand. This does not mean that the algorithm requires a shared demand in order to work. We also want to add some challenge to the calibration algorithm in the next step, so we construct the historical demand with the following formulation:

$$x^{H}_{h,i} = (0.75 + 0.15 \times \mathrm{rand}()) \times x^{true}_{h,i} \tag{4.1}$$

where $h$ is the time interval, $i$ is the index of the OD pair and $x^{true}_{h,i}$ is the smoothed demand input to the MITSIMLab system for the $i$-th OD pair at time $h$. rand() is a function that generates a random number in the interval $[-1, 1]$. It could be, for example, a uniform random number generator or a Gaussian random number generator restricted to this interval. So the ratio of historical demand to true demand lies in $[0.6, 0.9]$.
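Equation (4.1) can be sketched in Python as follows. The uniform generator is one of the options mentioned in the text; the function name `historical_demand`, the fixed seed and the demand values are hypothetical and serve only as an illustration.

```python
import random


def historical_demand(true_demand, rng):
    """Apply Equation (4.1) element-wise with a uniform rand() on [-1, 1]."""
    return [(0.75 + 0.15 * rng.uniform(-1.0, 1.0)) * x for x in true_demand]


rng = random.Random(42)                   # fixed seed, only for repeatability
x_true = [0.0, 12.0, 35.5, 80.0]          # hypothetical OD flows (veh / 5 min)
x_hist = historical_demand(x_true, rng)

# Ratio of historical to true demand for the non-zero OD pairs.
ratios = [h / t for h, t in zip(x_hist, x_true) if t > 0]
print([round(r, 3) for r in ratios])
```

Each ratio falls between 0.6 and 0.9, and OD pairs with zero true demand keep a zero historical demand.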

Thus, we constructed the historical demand by adding noise to and underestimating the true demand. The objective is to generate a historical demand that has a different pattern from the true demand, to test whether the calibration algorithm can recover the true demand in this setting. The point of choosing an underestimated demand is to avoid an oversaturated historical scenario, in which case the DTA system being calibrated behaves very differently from the true traffic system, and the whole calibration framework may break down.

4.2.3 Calibration Settings for EKF

Last but not least, it is crucial to discuss the calibration settings. As seen in Equa-

tions (3.7) and (3.8), our EKF implementation works with deviations. It is worthwhile

to rewrite with deviations the basic Kalman filtering equations in DTA calibration

context as shown in Equations (3.3) and (3.4).

$$\partial x_h = \sum_{k=h-p}^{h-1} F^{k}_{h} \, \partial x_k + w_h \tag{4.2}$$

$$\partial M_h = \sum_{k=h-q+1}^{h} A^{k}_{h} \, \partial x_k + v_h \tag{4.3}$$

Note that the deviations are calculated relative to the historical values. In the third step, we obtained the historical state vector (demand), but we also need the historical measurements to obtain the deviations. Since the measurement equation represents the DTA model, the historical measurements should also reflect the DTA model. Thus the historical measurements should be obtained by running DynaMIT with the historical demand input. In order to reduce the stochasticity in the historical measurements and obtain a better foundation for further calibration, 5 runs of DynaMIT are conducted. Each run uses a different random seed, and the average measurement values are taken as the historical measurements.

As for other calibration settings for Kalman filters (as shown in Algorithm 1), there are several parameters to initialize, including $\partial x_0$, $Q_h$, $R_h$ and $P_0$. Since deviations are used, the state vector $\partial x_0$ is set to a zero vector. For $Q_h$, $R_h$ and $P_0$, which are the covariance matrices of the random vectors $w_h$, $v_h$ and $\partial x_0$ respectively, the initialization rule is to disregard the covariance across different elements in each random vector, i.e. to set the covariance matrices as diagonal matrices. In addition, the diagonal elements of $Q_h$ are configured such that the standard deviation of $w_h$ is $\alpha$ times the magnitude of $\partial x_h$. In order to handle the situation where $\partial x_h$ has near-zero values, the standard deviation of the random variable $w_{h,i}$ is set to $\max\{q_0, \alpha|\partial x_{h,i}|\}$. Thus, in mathematical formulation, the covariance $Q_h$ has the following relation with the random vector $\partial x_h$:

$$Q_h = \begin{bmatrix} \max\{q_0, \alpha|\partial x_{h,1}|\}^2 & 0 & \cdots & 0 \\ 0 & \max\{q_0, \alpha|\partial x_{h,2}|\}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \max\{q_0, \alpha|\partial x_{h,n}|\}^2 \end{bmatrix} \tag{4.4}$$

where $\partial x_{h,j}$ is the $j$-th element of the random vector $\partial x_h$ and $\alpha$ is the fraction to configure. In our case, $\alpha$ is set to 0.3, indicating that $\partial x_h$ has some probability of going to 0 as well as some probability of becoming twice as large. $q_0$ is set to 1, allowing elements of $\partial x_h$ with 0 values to change.

Similarly, for covariance $R_h$, we have:

$$R_h = \begin{bmatrix} \max\{r_0, \beta|M_{h,1}|\}^2 & 0 & \cdots & 0 \\ 0 & \max\{r_0, \beta|M_{h,2}|\}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \max\{r_0, \beta|M_{h,n}|\}^2 \end{bmatrix} \tag{4.5}$$

Here we drop the $\partial$ because the covariance accounts for the measurement error, which depends on the magnitude of the real sensor flow counts, not on the deviation between historical and real sensor flow counts. The fraction $\beta$ is chosen to be 0.1, meaning the standard deviation is 10% of the sensor readings. $r_0$ is set to 10, considering the magnitude of sensor readings. Note that the imperfection of the local linearization in Equation (4.3) is also captured in $R_h$.

As for the initialization of $P_0$, it can be configured as $Q_0$. In particular, with our initialization of $\partial x_0 = 0$, $P_0$ is a diagonal matrix with $q_0^2$ on the diagonal, which equals 1 here since $q_0 = 1$.
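The diagonal entries of Equations (4.4) and (4.5) can be computed with one small helper. The sketch below is an illustration under the stated settings (α = 0.3, q0 = 1, β = 0.1, r0 = 10); the function name `diag_variances` and the input vectors are hypothetical.

```python
def diag_variances(values, fraction, floor):
    """Diagonal of Q_h or R_h: (max{floor, fraction * |value|})^2 per element."""
    return [max(floor, fraction * abs(v)) ** 2 for v in values]


dx_h = [0.0, 2.0, -20.0]        # hypothetical demand deviations (veh / 5 min)
M_h = [0.0, 50.0, 400.0]        # hypothetical sensor counts (veh / 5 min)

Q_diag = diag_variances(dx_h, fraction=0.3, floor=1.0)     # alpha, q0
R_diag = diag_variances(M_h, fraction=0.1, floor=10.0)     # beta, r0
P0_diag = diag_variances([0.0] * len(dx_h), fraction=0.3, floor=1.0)
print(Q_diag, R_diag, P0_diag)
```

The floor values keep the variances strictly positive for zero deviations and zero counts, so the filter can still move those elements away from zero.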

After all the preparation (historical input parameters, measurements and surveil-

lance data (sensor counts)) discussed above, we can test and compare different algo-

rithms discussed in Chapter 3. The candidate algorithms are GLS (implemented in

DynaMIT), EKF, and EKF with bound constraints.

4.2.4 Calibration Evaluation Criteria

The objective of the DTA calibration is to obtain the parameters so that the simulated

measurement outputs are close to the observed measurements in the surveillance data.

In order to quantify the discrepancy, the root mean square normalized error (RMSN)

was selected as the loss function. It is normalized to represent percentage of difference.

Specifically, it is calculated according to:

$$\mathrm{RMSN} = \frac{\sqrt{N \sum_{j=1}^{N} (M_j - \hat{M}_j)^2}}{\sum_{j=1}^{N} M_j} \tag{4.6}$$

where $M_j$ is the $j$-th observed (true) measurement value and $\hat{M}_j$ is the $j$-th simulated (estimated) measurement value. $N$ can be the total number of sensors in each time interval, in which case $j$ is the index of sensors in a particular time interval. RMSN then indicates how the calibration algorithm works in a specific time interval. $N$ can also be the total number of intervals, in which case $j$ is the index of time intervals for a specific sensor. RMSN then shows how well this sensor is fitted over the whole simulation period. Finally, if $j$ is a unique index for each sensor in each time interval, the summation is over all sensors and all intervals of a calibration run. RMSN then indicates the overall performance of the algorithm over all the sensors in the whole calibration period.
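Equation (4.6) translates directly into code. The following Python sketch assumes the observed and simulated values are passed as two equally long sequences; the function name `rmsn` and the sample counts are illustrative.

```python
import math


def rmsn(observed, simulated):
    """Equation (4.6): sqrt(N * sum((M - M_hat)^2)) / sum(M)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have equal length")
    n = len(observed)
    squared_error = sum((m - s) ** 2 for m, s in zip(observed, simulated))
    return math.sqrt(n * squared_error) / sum(observed)


observed = [100.0, 200.0, 300.0]      # hypothetical sensor counts
simulated = [110.0, 190.0, 300.0]     # hypothetical simulated counts
value = rmsn(observed, simulated)
print(round(value, 4))
```

The same function covers all three usages described above; only the way the measurements are grouped into the two sequences changes (per interval, per sensor, or over all sensors and intervals).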

Here we distinguish the three usages. The first is useful when comparing calibration algorithms, since the performance of different algorithms in each time interval can be compared. The second is useful when we want to compare how well different sensors are fitted. The last is an overall metric. The first and the last are used in our comparison in Section 4.3.

4.3 Results

In this section, the calibration results of the EKF and the constrained EKF are reported and compared with the Generalized Least Squares (GLS) implementation (Wen, 2009). An analysis is conducted to explain the performance of the EKF, together with an analysis of a particular sensor on the Singapore Expressway network. The computational performance is also discussed.

4.3.1 EKF Performance

As stated in Section 4.2.3, the EKF is configured with:

q0 = 1, r0 = 10, α = 0.3, β = 0.1

In addition, the simulation time is from 17:00 to 21:30. Then the EKF is run to

calibrate the DynaMIT system against the surveillance data generated by MITSIM-

Lab system.

Estimated versus Observed Sensor Counts

Figure 4-7 shows the scatter plots of estimated versus observed sensor counts for six intervals: 17:00-17:05, 17:30-17:35, 18:05-18:10, 19:10-19:15, 20:15-20:20 and 21:20-21:25, respectively. In the first several intervals, the results of the EKF appear acceptable. The subplot for interval 19:10-19:15 shows that the algorithm is clearly overestimating the observed sensor counts, since the estimated sensor counts lie above the diagonal line. This pattern persists until the end of the simulation. Worse still, in the last subplot in Figure 4-7, the estimated and observed sensor counts are even less correlated than in previous intervals.

Figure 4-7: Estimated versus Observed Flow Counts: EKF

Sensor Flow Count RMSN versus Time

Figure 4-8: Flow Count RMSN versus Time: EKF

The same pattern can also be seen in Figure 4-8. The figure demonstrates how

the RMSN varies with time. The green line shows the RMSN for all 650 sensor counts

in each time interval, where the historical flow counts substitute for estimated flow

counts. This line indicates the RMSN of the expected estimated flow counts when no

calibration is applied and historical demand is directly loaded to the network. The

blue line is the sensor count RMSN produced by EKF. So the lower the blue line is,

the better the algorithm performs.

Based on Figure 4-7 and Figure 4-8, we have the following observations. First, the RMSN seems to be fine for the first hour, then it deteriorates and the algorithm diverges. Second, although the RMSN tends to decrease at the end, the scatter plot at time 21:25 suggests that the algorithm will probably not converge back. Third, the scatter plots at 19:15, 20:20 and 21:25 in Figure 4-7 show that the overall simulated sensor flow is much greater than the observed sensor flow.

Following the third observation, we checked the simulation output and found that the number of vehicles in the network keeps growing during calibration. At the same time, the simulation speed was greatly reduced and there were severe gridlocks in the DynaMIT system, which are extremely hard for the algorithm to correct.

Analysis of EKF Performance

Results show that the EKF performed badly in terms of RMSN. The reason, as also revealed in Section 3.3, lies in the correlation between estimators in different dimensions. In the EKF, the estimators are unrestricted real numbers. In our case of DTA calibration, some estimates of demand are negative with high variances. Since the demand for an OD pair cannot be negative, we simply set the negative estimates to zero and leave the positive values unchanged. By doing this, we add more vehicles to the network than the EKF suggests. From the point of view of probability theory, the expectation of the sum of the estimates is changed by truncating negative values. More fundamentally, we disregard the correlation among estimators. Once some demand values are set to zero, the others are no longer the maximum a posteriori estimates, but we keep them unchanged. To demonstrate the idea, suppose we have only two OD pairs, each with only one possible route, and both routes pass through one common sensor. Imagine the sensor reports that 20 vehicles passed in a 5-minute interval, and the EKF gives us -10 and 30 as the estimated demand values. Then according to the way we use the estimates, we set -10 to 0, but 30 remains. So we are overestimating the overall OD flow in each time interval. In our case study, the true demand values in each interval are sparse, meaning a large proportion of the true OD flows are close to zero. In this situation, the overestimation problem is more severe. In previous research works, this sparsity in OD flows is not addressed, which may be the reason that the EKF performs well in the literature.
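The two-OD example can be made concrete with a few lines of Python. The estimates (-10, 30) and the sensor count of 20 come from the text; the posterior covariance is not given there, so the equal variances and the correlation of -0.9 used below are purely hypothetical values chosen to reflect that the sensor observes only the sum of the two flows.

```python
# Two OD pairs observed by one sensor; unconstrained EKF estimate (-10, 30).
x_hat = (-10.0, 30.0)

# Hypothetical posterior covariance: equal variances, negative correlation.
variance, rho = 100.0, -0.9
cov_12 = rho * variance

# Truncation: clip the negative element and leave the other one untouched.
truncated = (max(x_hat[0], 0.0), x_hat[1])

# Conditional MAP, Equation (3.44): fix element 1 at its bound 0 and shift
# element 2 by Sigma_21 / Sigma_11 * (0 - x_hat_1).
fixed_1 = 0.0
shifted_2 = x_hat[1] + cov_12 / variance * (fixed_1 - x_hat[0])
conditional = (fixed_1, shifted_2)

print(sum(x_hat), sum(truncated), sum(conditional))
```

Under these assumed values the truncated estimates sum to 30 vehicles, well above the 20 reported by the sensor, while the conditional estimate sums to about 21 because the second OD flow is reduced when the first one is raised to zero.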

Notably, when applying the EKF followed by a truncation of negative values, we add more vehicles to the network in each interval, and they remain until their trips are finished. More importantly, in the next time interval, the overestimated sensor counts caused by the overestimated OD flows will make the Kalman filter reduce the overall level of demand by setting some of the OD values even more negative, while keeping the sum of all estimated demand at a low level (equivalent to the expectation of the sum of the true demand). This implies that some positive OD flows will become greater, and the total number of vehicles after truncation is much greater than the sum of the true demand. So by running the EKF for 3 hours, the congestion accumulates to a significant level. That is the reason why we have an oversaturated scenario and thus simulated sensor counts that are much greater than the observed counts.

An Example Estimation Step in EKF

Here we look at an example of the demand values related to the sensor with ID 528. This sensor captures the demand for 11 OD pairs (OD pair IDs renamed to 1-11). In our demand file for the MITSIMLab system, all 11 OD pairs have zero demand for a certain time interval, so Sensor 528 reads 0 for that interval. The estimated demand values, however, are shown in Table 4.1. Note that the decimals come directly from the EKF estimates and will be rounded by the DTA system; they are kept here only to show the original estimates. According to the EKF, the sum of the estimates is negative (-16.93) and much closer to the true total of 0 than the sum of the truncated estimates, which is almost 100 (96.59). While the unconstrained EKF estimates the true OD values reasonably well in aggregate before truncation, the truncated estimates are far from the truth.

Table 4.1: Estimated OD flows (veh/5min) for OD Pairs Related to Sensor 528

OD pair id                1       2       3       4       5       6       7       8       9      10      11     sum
Unconstrained EKF     -1.07   14.85  -19.13  -31.50   -0.78    0.91   -7.93    8.78   34.61   37.44  -53.12  -16.93
Truncated EKF             0   14.85       0       0       0    0.91       0    8.78   34.61   37.44       0   96.59
Constrained EKF           0       0       0       0       0       0       0    0.06       0       0       0    0.06

This problem is solved after we apply the constrained EKF. The estimates for the 11 OD pairs after applying the constrained EKF are shown in the last row of Table 4.1. These estimates are much closer to the true values than the truncated EKF estimates, and the total estimated demand drops from 96.59 to 0.06.
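The totals discussed for Sensor 528 can be reproduced from the unconstrained and constrained rows of Table 4.1. The snippet below is a small check, not part of the calibration code; the variable names are illustrative, and the totals may differ from the table in the last digit because the table entries are rounded.

```python
import numpy as np

# Unconstrained and constrained EKF estimates for OD pairs 1-11 (Table 4.1).
unconstrained = np.array([-1.07, 14.85, -19.13, -31.50, -0.78, 0.91,
                          -7.93, 8.78, 34.61, 37.44, -53.12])
constrained = np.array([0, 0, 0, 0, 0, 0, 0, 0.06, 0, 0, 0])

# Truncation as performed before feeding the DTA system: negatives become 0,
# positive estimates are left unchanged.
truncated = np.clip(unconstrained, 0.0, None)

for name, estimates in [("unconstrained", unconstrained),
                        ("truncated", truncated),
                        ("constrained", constrained)]:
    print(f"{name:14s} total = {estimates.sum():7.2f} veh/5min")
```

The true total is 0, so truncation turns a moderate negative bias into roughly 97 extra vehicles per interval at this single sensor.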

4.3.2 Constrained EKF Performance

As stated in Section 3.3, Algorithms 2 and 3 are applied to implement the constrained EKF. Since we use an additional optimization step to ensure that the estimates are feasible (i.e., satisfy the non-negativity constraints), there is no need for truncation. The scatter plot of estimated counts versus observed counts for the constrained EKF is shown in Figure 4-9. It shows the same last four intervals as Figure 4-7, where the performance of the EKF is poor, and indicates a clear improvement in fitting the sensor counts.

Figure 4-9: Estimated versus Observed Flow Counts: Constrained EKF

Figure 4-10 demonstrates the constrained EKF performance over time. The constrained EKF manages to keep the overall RMSN at around 18% and maintains this low RMSN until the calibration ends. The calibration algorithm still bears some pattern of the historical values, since we observe similar patterns between the two RMSN lines in the first several intervals.

Figure 4-10: Flow Count RMSN versus Time: Constrained EKF
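For reference, the RMSN reported here is assumed to follow the definition commonly used in the DTA calibration literature, sqrt(N * sum((estimated - observed)^2)) / sum(observed), where N is the number of sensor counts. The function name and the toy counts below are illustrative and are not taken from the case study.

```python
import numpy as np

def rmsn(observed, estimated):
    """Normalized root mean square error between two count vectors.

    Assumed definition: sqrt(N * sum((est - obs)^2)) / sum(obs).
    """
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    n = observed.size
    return np.sqrt(n * np.sum((estimated - observed) ** 2)) / np.sum(observed)

# Toy sensor counts (hypothetical values).
observed = [100.0, 200.0, 300.0]
estimated = [110.0, 190.0, 300.0]
print(f"RMSN = {rmsn(observed, estimated):.3f}")
```

Because the error is normalized by the total observed count, intervals with very different traffic levels can be compared on one curve, as in the figure above.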

As for the estimated OD flows, the same OD pairs (OD ids 1000 to 1008) shown before are investigated. Figures 4-11 and 4-12 show the OD flow estimates of the EKF and the constrained EKF, respectively. The horizontal axis is the hour of day, from 17:00 to 21:30, and the vertical axis is the OD flow value. The blue curve shows the true demand in the MITSIMLab system (treated as the real world), while the orange curve represents the estimated demand. The calibration objective is to make the estimated curve fit the true curve, so that the algorithm recovers the true demand. We make several observations. First, the EKF estimates tend to oscillate and violate the lower bound of 0. All estimated OD flows in Figure 4-11 violate the bound at some point in time. This situation is completely avoided by the constrained EKF, as shown in Figure 4-12. Second, the overall overestimation by the EKF is significantly reduced by the constrained EKF. For instance, the OD pairs with ids 1003 and 1005 in Figure 4-11 have abnormally large estimated values from 20:00 to 21:30. This means that the algorithm constantly adds a large number of vehicles to the DTA system, which is likely to oversaturate it. In comparison, while there are still some spikes in the constrained EKF estimates in Figure 4-12, their magnitude is only around 1. In general, the demand values are estimated well by the constrained EKF. Third, the EKF estimated OD id 1007 well at first, but a negative spike lasting around 25 minutes made the estimated demand curve discontinuous. In contrast, the constrained EKF provided good estimates continuously over the whole calibration period. This implies that the EKF retains some ability to estimate states whose true values are far from the bounds; however, because of the impact of other states whose true values are close to the bounds, such states may still be estimated incorrectly. Finally, for the constrained EKF, some OD flows are not fully recovered (e.g., the OD pair with id 1004). This implies that the constrained EKF can be further improved.

Figure 4-11: Sample Calibrated Demand by EKF (before truncation)

Figure 4-12: Sample Calibrated Demand by Constrained EKF

4.3.3 Comparison with GLS

Based on the implementation in (Wen, 2009), the GLS algorithm is also tested and compared. GLS requires a covariance matrix as input. To obtain it, the covariance matrix is first set to the identity matrix, which amounts to estimating an Ordinary Least Squares (OLS) model. After we obtain the OD and sensor count estimates with OLS, the variance of each OD pair and each sensor count is estimated, and the covariance matrix is then replaced by the diagonal matrix of those variances. The following figures compare the constrained EKF with GLS.
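The two-stage covariance setup described above can be sketched as follows. This is a simplified illustration that assumes a linear measurement model `y = A x + noise` shared by all intervals, with the rows of `A` standing for the stacked a priori OD values and sensor counts; the function names, the matrix `A`, and the synthetic data are hypothetical, and the implementation of (Wen, 2009) may differ in its details.

```python
import numpy as np

def weighted_least_squares(A, y, variances):
    """Minimize sum_i (y_i - A_i x)^2 / variances_i over x."""
    w = 1.0 / np.sqrt(variances)
    x, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
    return x

def two_stage_gls(A, Y, eps=1e-9):
    """A: (m, n) measurement matrix; Y: (T, m) measurements for T intervals."""
    n_rows = A.shape[0]
    # Stage 1: identity covariance, i.e. ordinary least squares.
    identity_var = np.ones(n_rows)
    X_ols = np.array([weighted_least_squares(A, y, identity_var) for y in Y])
    # Estimate one variance per measurement row from the OLS residuals.
    residuals = Y - X_ols @ A.T
    variances = residuals.var(axis=0) + eps  # eps avoids division by zero
    # Stage 2: re-estimate with the diagonal covariance matrix.
    X_gls = np.array([weighted_least_squares(A, y, variances) for y in Y])
    return X_gls, variances

# Synthetic data (assumed): three precise and three noisy measurement rows.
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(6, 2))
X_true = rng.uniform(5.0, 15.0, size=(40, 2))
noise_std = np.array([0.1, 0.1, 0.1, 2.0, 2.0, 2.0])
Y = X_true @ A.T + rng.normal(0.0, noise_std, size=(40, 6))

X_gls, variances = two_stage_gls(A, Y)
print("estimated variances:", np.round(variances, 3))
```

Note that the variances are estimated once and then held fixed for the whole period, which is the main contrast with the covariance update of the Kalman filter family.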

From Figure 4-13, it can be concluded that the constrained EKF performs worse than GLS in the first several intervals, due to the imperfect initial covariance setting. After several intervals, however, the performance of the constrained EKF is comparable to that of GLS, and in the last several intervals it is superior. This may be due to a more accurate covariance matrix, which benefits from the covariance update mechanism inherent in the Kalman filter family. From Figure 4-14, we can conclude that the constrained EKF tends to estimate sensor counts correctly (20:20) or to underestimate them (21:25), while GLS tends to overestimate the sensor counts in both intervals presented. For DTA systems, it is generally better to start the calibration of an interval from an underestimated scenario than from an overestimated one, because overestimation risks creating unnecessarily saturated conditions. This may explain why the constrained EKF performs better at the end of the simulation period.

Figure 4-13: Flow Count RMSN versus Time: GLS versus Constrained EKF

Figure 4-14: Flow Counts Comparison: Constrained EKF (left) vs GLS (right)

4.4 Summary

In this chapter, a case study based on the Singapore expressway network was presented. The data flow and calibration settings were also described, followed by a comparison of the EKF, constrained EKF and GLS calibration algorithms. It was concluded that the EKF does not converge in this setting because of the poorly calibrated values caused by truncating negative estimates, a step that is necessary to feed the estimates to the DTA system. The reasons for the poor performance of the EKF were discussed and a specific example was given. In addition, the constrained EKF and GLS were tested. The results indicate that both GLS and the constrained EKF converge, and that the constrained EKF works better overall. This is probably due to its inherent covariance update procedure, whereas GLS uses a fixed covariance matrix.

Chapter 5

Conclusions

5.1 Summary

Traffic congestion is an active research topic because of its negative impacts: wasted energy, exhaust emissions, delays and frustration. Traffic policies, projects, management, etc. work together as a balanced and diversified solution to congestion. In order to achieve global traffic management, dynamic traffic assignment (DTA) systems are widely used to support the operations of Traffic Management Centers (TMCs). One valuable application of DTA is to provide route guidance to drivers in order to alleviate overall traffic congestion.

However, the effectiveness of route guidance depends significantly on the fidelity of the DTA system. Specifically, DTA systems model the process of loading time-dependent traffic demand onto road networks and compute the traffic conditions at the link level. Without accurate demand and supply parameters, it is unlikely that DTA systems will produce correct predictions and route guidance. It is therefore critical to perform parameter estimation, i.e., DTA calibration, which is the topic of this thesis.

In addition to the accuracy requirement, the algorithm should also be online, meaning that it can update parameter estimates using only the data collected so far. Existing online calibration algorithms include generalized least squares (GLS) estimation and the extended Kalman filter (EKF) and its variants. However, these algorithms have usually been applied to freeway corridors. Their effectiveness has not been demonstrated on a large network such as the entire Singapore expressway system, where there are thousands of possible Origin-Destination (OD) pairs in every time interval. The structure of the Singapore expressway network is also more complex than that of freeway corridors, since many different traffic flows are captured by a particular sensor. Testing those online calibration algorithms on the Singapore expressways is therefore important. In this thesis, the demand values for OD pairs are the parameters to calibrate and the measurements are traffic flow counts on links.

In general, a covariance matrix needs to be estimated to perform GLS, while the EKF updates the covariance matrix in every time step. Since GLS uses a fixed covariance matrix throughout the whole simulation period, the EKF should perform better than GLS. However, the flow counts estimated by the EKF are abnormally poor, and the algorithm diverges. The reason is that some of the EKF-estimated demand values are negative, and the DTA system truncates negative estimates to 0, since negative demand does not make sense. This essentially adds more vehicles to the DTA system, because it changes the total number of vehicles suggested by the EKF. This happens in every time interval and the additional vehicles accumulate, so we end up with an oversaturated scenario and the algorithm diverges.

Inspired by similar research in control theory, an EKF formulation with constraints (constrained EKF) is introduced to address this problem. An algorithm that iteratively adds equality constraints, followed by a coordinate descent algorithm, is proposed to obtain better maximum a posteriori (MAP) estimates. Notably, the algorithm handles the general situation where both lower bounds and upper bounds are present for the state vector. It provides estimates that satisfy the constraints, so no truncation is necessary for DTA systems.
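To illustrate what a bound-constrained MAP estimate looks like, the sketch below runs a generic projected coordinate descent on the Gaussian MAP objective. It is not Algorithm 2 or 3 from Chapter 3; the function name, the covariance matrix, and the bounds are hypothetical, and the only assumption is that the posterior is Gaussian with mean `x_hat` and covariance `P`.

```python
import numpy as np

def box_constrained_map(x_hat, P, lower, upper, sweeps=200, tol=1e-10):
    """Minimize (x - x_hat)^T P^{-1} (x - x_hat) subject to lower <= x <= upper."""
    x_hat = np.asarray(x_hat, dtype=float)
    lower = np.broadcast_to(np.asarray(lower, dtype=float), x_hat.shape)
    upper = np.broadcast_to(np.asarray(upper, dtype=float), x_hat.shape)
    Q = np.linalg.inv(P)              # precision matrix
    x = np.clip(x_hat, lower, upper)  # feasible starting point (plain truncation)
    for _ in range(sweeps):
        x_old = x.copy()
        for i in range(x.size):
            d = x - x_hat
            # Exact minimizer along coordinate i with the others held fixed,
            # then projected back onto [lower_i, upper_i].
            off_diag = Q[i] @ d - Q[i, i] * d[i]
            x[i] = np.clip(x_hat[i] - off_diag / Q[i, i], lower[i], upper[i])
        if np.max(np.abs(x - x_old)) < tol:
            break
    return x

# Hypothetical three-state example with lower and upper bounds.
x_hat = np.array([-10.0, 30.0, 5.0])
P = np.array([[100.0, -90.0, 0.0],
              [-90.0, 100.0, 0.0],
              [0.0, 0.0, 4.0]])
x_map = box_constrained_map(x_hat, P, lower=0.0, upper=[50.0, 25.0, 50.0])
print("truncated  :", np.clip(x_hat, 0.0, None))
print("constrained:", x_map)
```

Unlike truncation, the constrained estimate also moves the second state, because the covariance couples it to the state that was pushed onto its bound.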

Finally, a case study with synthetic data for the Singapore expressway network is conducted. The MITSIMLab and DynaMIT integration framework is applied, where MITSIMLab generates the flow counts and the calibration algorithms are tested on the DynaMIT system. To make the synthetic demand used by MITSIMLab realistic, it is taken from an offline calibration against real flow counts on the Singapore expressways. There are 1623 OD pairs, each of which needs to be estimated every 5 minutes. The simulation period is 17:00-21:30, i.e., 54 time intervals in total. The settings for the EKF and the constrained EKF are the same. Results show that the EKF algorithm diverges and fits the flow counts poorly, whereas the constrained EKF performs much better and fits the counts quite well. Compared to GLS, the constrained EKF also performs better in general, especially in later intervals. This coincides with the intuition that EKF-based algorithms update the covariance automatically, which gives them an advantage over GLS, where a fixed covariance matrix is used. The results indicate that the constrained EKF is a very competent candidate for online DTA calibration and has the potential to calibrate DTA systems with real-world traffic data.

5.2 Future Research Directions

More scenario testing

The results show convincingly that the constrained EKF works much better than the unconstrained EKF. However, it cannot be concluded that it will always outperform GLS. It would be beneficial to perform more case studies covering more scenarios, preferably with more types of parameters and measurements, to demonstrate its superior performance.

Applying directly to real world case studies

This research is conducted on a smoothed modification of real-world data, which is based on existing offline calibration results. Although the demand from which MITSIMLab generates the flow counts is at the same level as the real data on the Singapore expressways, applying the algorithms to calibrate DynaMIT directly with real-world data would demonstrate their capability to fit real scenarios.

Comparing Algorithms 2 and 3 with quadratic programming

In Chapter 3, we proposed Algorithms 2 and 3. According to empirical results, this approach is much more efficient than the quadratic programming solver in MATLAB. It would be interesting to see whether our algorithm remains more efficient when it is required to reach the same calibration accuracy as quadratic programming in MATLAB. More tests and comparisons will be beneficial.

Computational Performance

In the current implementation, the bottleneck of the EKF and the constrained EKF is the estimation of the gradient matrix, which is currently calculated by central finite differences in parallel. This calculation can be fully parallelized, as long as we have enough cores to distribute the jobs. In our case, we use a 20-core machine, and the calibration of each time interval takes about 12 minutes. The target for real-time operation is a computation time within 5 minutes per interval, which should be feasible with a more powerful distributed system.
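The gradient computation described above can be sketched as a central finite-difference Jacobian whose columns are evaluated independently. In this sketch `h` is a cheap stand-in for the simulator that maps a state vector to sensor counts, and the function name, step size, and thread pool are assumptions for illustration only. Each column needs two evaluations of `h`, so perturbing every one of the 1623 OD flows separately would require 3246 simulator runs per interval, which is why the work is distributed.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def central_difference_jacobian(h, x, step=1e-3, map_fn=map):
    """Approximate dh/dx column by column with central differences.

    map_fn may be the built-in map (serial) or an executor's map (parallel);
    the columns do not depend on each other.
    """
    x = np.asarray(x, dtype=float)

    def column(j):
        e = np.zeros_like(x)
        e[j] = step
        forward = np.asarray(h(x + e), dtype=float)
        backward = np.asarray(h(x - e), dtype=float)
        return (forward - backward) / (2.0 * step)

    return np.column_stack(list(map_fn(column, range(x.size))))

# Toy measurement function (hypothetical): 2 sensors observing 3 OD flows.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

def h(x):
    return A @ (x + 0.01 * x ** 2)

x0 = np.array([10.0, 20.0, 30.0])
with ThreadPoolExecutor(max_workers=4) as pool:
    H = central_difference_jacobian(h, x0, map_fn=pool.map)
print(H)
```

With a real simulator each call to `h` is a full simulation run, so a process pool or a cluster scheduler would replace the thread pool used here.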

As the dimension grows even larger, adding more CPUs may not be an economical way to solve the problem. A future research direction is therefore to use machine learning algorithms to perform regression on gradient matrices based on measurements and estimated state vectors. With enough gradient matrices calculated offline, together with their corresponding measurement and state vectors, we can construct a training set. Regression methods can then be applied and tested to see whether they can predict future gradient matrices. This would be a generalization of the LimEKF idea, where gradient matrices are precomputed to greatly reduce computational complexity. If the predictions are accurate and work well with EKF-based methods, this would reduce the computational load significantly.

Autoregressive Degree and Prediction

Currently the implementation is based on a transition model of autoregressive (AR) degree 1, where the deviation in the current interval is assumed to be the same as in the previous interval. This is a rather simplified assumption and it affects how the model predicts the state vector. It may not affect the estimation procedure much, since in each measurement update step the state vector predicted by the transition equation only serves as a starting value. As long as the starting point is reasonable, the gradient estimation (linearization) is similar, and thus the measurement-updated estimates will be similar. However, prediction accuracy depends strongly on the transition model, especially when predicting traffic states several intervals ahead. In this case, the AR(1) model has limited predictive power. In future research, prediction accuracy needs to be checked, and more sophisticated AR models should be applied.
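The difference between the current AR(1) transition and a higher-degree model can be sketched as follows, assuming the state is expressed as a deviation from historical values. The function name, the numbers, and the AR(2) coefficients are hypothetical; in practice the coefficients would have to be estimated from data.

```python
import numpy as np

def predict_state(historical, past_deviations, coefficients):
    """Predict the next state as historical value plus an AR-weighted deviation.

    historical:      (n,) historical values for the interval to predict
    past_deviations: list of (n,) deviations, most recent first
    coefficients:    list of scalars or (n, n) matrices, one per lag
    """
    deviation = np.zeros_like(np.asarray(historical, dtype=float))
    for F, d in zip(coefficients, past_deviations):
        d = np.asarray(d, dtype=float)
        deviation = deviation + (np.asarray(F) @ d if np.ndim(F) == 2 else F * d)
    return np.asarray(historical, dtype=float) + deviation

historical = np.array([10.0, 20.0])
past_deviations = [np.array([2.0, -1.0]),   # previous interval
                   np.array([4.0, 1.0])]    # two intervals ago

# AR(1) with unit coefficient: the deviation is carried over unchanged.
x_ar1 = predict_state(historical, past_deviations, [1.0])
# AR(2) with assumed coefficients 0.7 and 0.3.
x_ar2 = predict_state(historical, past_deviations, [0.7, 0.3])
print("AR(1) prediction:", x_ar1)
print("AR(2) prediction:", x_ar2)
```

Only the prediction step changes; the measurement update would still start from whichever predicted state the transition model supplies.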

Bibliography

Antoniou, C. (2004). On-line calibration for dynamic traffic assignment, PhD thesis, Massachusetts Institute of Technology.

Antoniou, C., Ben-Akiva, M. and Koutsopoulos, H. (2004). Incorporating automated vehicle identification data into origin-destination estimation, Transportation Research Record: Journal of the Transportation Research Board (1882): 37–44.

Antoniou, C., Ben-Akiva, M. and Koutsopoulos, H. N. (2006). Dynamic traffic demand prediction using conventional and emerging data sources, IEE Proceedings-Intelligent Transport Systems, Vol. 153, IET, pp. 97–104.

Antoniou, C., Koutsopoulos, H. N. and Yannis, G. (2007). An efficient non-linear Kalman filtering algorithm using simultaneous perturbation and applications in traffic estimation and prediction, Intelligent Transportation Systems Conference, 2007. ITSC 2007. IEEE, IEEE, pp. 217–222.

Ashok, K. (1996). Estimation and prediction of time-dependent origin-destination flows, PhD thesis, Massachusetts Institute of Technology.

Ashok, K. and Ben-Akiva, M. E. (1993). Dynamic origin-destination matrix estimation and prediction for real-time traffic management systems, International Symposium on the Theory of Traffic Flow and Transportation (12th: 1993: Berkeley, Calif.). Transportation and traffic theory.

Ashok, K. and Ben-Akiva, M. E. (2000). Alternative approaches for real-time estimation and prediction of time-dependent origin–destination flows, Transportation Science 34(1): 21–36.

Balakrishna, R. (2002). Calibration of the demand simulator in a dynamic traffic assignment system, Master’s thesis, Massachusetts Institute of Technology.

Balakrishna, R. (2006). Off-line calibration for dynamic traffic assignment models, PhD thesis, Massachusetts Institute of Technology.

Barcelo, J. (2010). Models, traffic models, simulation, and traffic simulation, Fundamentals of Traffic Simulation, Springer, pp. 1–62.

Barcelo, J. and Casas, J. (2005). Dynamic network simulation with AIMSUN, Simulation Approaches in Transportation Analysis, Springer, pp. 57–98.

Ben-Akiva, M., Koutsopoulos, H. N., Antoniou, C. and Balakrishna, R. (2010a). Traffic simulation with DynaMIT, Fundamentals of Traffic Simulation, Springer, pp. 363–398.

Ben-Akiva, M., Koutsopoulos, H. N., Toledo, T., Yang, Q., Choudhury, C. F., Antoniou, C. and Balakrishna, R. (2010b). Traffic simulation with MITSIMLab, Fundamentals of Traffic Simulation, Springer, pp. 233–268.

Chang, G.-L. and Wu, J. (1994). Recursive estimation of time-varying origin-destination flows from traffic counts in freeway corridors, Transportation Research Part B: Methodological 28(2): 141–160.

Cremer, M. and Keller, H. (1987). A new class of dynamic methods for the identification of origin-destination flows, Transportation Research Part B: Methodological 21(2): 117–132.

FHWA (2015). CORSIM software. URL: http://ops.fhwa.dot.gov/trafficanalysistools/corsim.htm

FHWA (2016). Transmodeler software. URL: http://www.caliper.com/transmodeler/default.htm

FHWA, U. D. o. T. (2014). Series, highway statistics.

Huang, E. (2010). Algorithmic and implementation aspects of on-line calibration of dynamic traffic assignment, Master’s thesis, Massachusetts Institute of Technology.

INRO (2015). Emme software. URL: http://www.inro.ca/en/products/emme2/index.php

Kalman, R. E. (1960). A new approach to linear filtering and prediction problems, Journal of Basic Engineering 82(1): 35–45.

Lu, L. (2013). W-SPSA: an efficient stochastic approximation algorithm for the off-line calibration of dynamic traffic assignment models, Master’s thesis, Massachusetts Institute of Technology.

Mahmassani, H., Hawas, Y., Abdelghany, K., Abdelfatah, A., Chiu, Y., Kang, Y., Chang, G., Peeta, S., Taylor, R. and Ziliaskopoulos, A. (1998). DYNASMART-X; volume II: analytical and algorithmic aspects, Technical Rep. ST067 85.

Mahmassani, H. S. (2001). Dynamic network traffic assignment and simulation methodology for advanced system management applications, Networks and Spatial Economics 1(3-4): 267–292.

Mahut, M. and Florian, M. (2010). Traffic simulation with Dynameq, Fundamentals of Traffic Simulation, Springer, pp. 323–361.

Omrani, R. and Kattan, L. (2012). Demand and supply calibration of dynamic traffic assignment models: Past efforts and future challenges, Transportation Research Record: Journal of the Transportation Research Board (2283): 100–112.

Peeta, S. and Ziliaskopoulos, A. K. (2001). Foundations of dynamic traffic assignment: The past, the present and the future, Networks and Spatial Economics 1(3-4): 233–265.

PTV (2015a). VISSIM software. URL: http://vision-traffic.ptvgroup.com/en-uk/products/ptv-vissim/

PTV (2015b). Visum software. URL: http://vision-traffic.ptvgroup.com/en-uk/products/ptv-visum/

Schrank, D., Eisele, B., Lomax, T. and Bak, J. (2015). Urban mobility scorecard, Technical report, Technical Report August, Texas A&M Transportation Institute and INRIX, Inc.

Simon, D. (2010). Kalman filtering with state constraints: a survey of linear and nonlinear algorithms, Control Theory & Applications, IET 4(8): 1303–1318.

Simon, D. and Simon, D. L. (2006). Kalman filtering with inequality constraints for turbofan engine health estimation, Control Theory and Applications, IEE Proceedings-, Vol. 153, IET, pp. 371–378.

Smith, M., Duncan, G. and Druitt, S. (1995). Paramics: microscopic traffic simulation for congestion management, Dynamic Control of Strategic Inter-Urban Road Networks, IEE Colloquium on, IET, pp. 8–1.

Spall, J. C. (1992). Multivariate stochastic approximation using a simultaneous perturbation gradient approximation, Automatic Control, IEEE Transactions on 37(3): 332–341.

Wang, Y., Messmer, A. and Papageorgiou, M. (2001). Freeway network simulation and dynamic traffic assignment with METANET tools, Transportation Research Record: Journal of the Transportation Research Board 1776(1): 178–188.

Wang, Y. and Papageorgiou, M. (2005). Real-time freeway traffic state estimation based on extended Kalman filter: a general approach, Transportation Research Part B: Methodological 39(2): 141–167.

Wen, Y. (2009). Scalability of dynamic traffic assignment, PhD thesis, Massachusetts Institute of Technology.

Yang, Q. and Koutsopoulos, H. N. (1996). A microscopic traffic simulator for evaluation of dynamic traffic management systems, Transportation Research Part C: Emerging Technologies 4(3): 113–129.

Zhou, X. and Mahmassani, H. S. (2007). A structural state space model for real-time traffic origin–destination demand estimation and prediction in a day-to-day learning framework, Transportation Research Part B: Methodological 41(8): 823–840.
