real-time traffic pattern analysis and inference with ... · real-time traffic pattern analysis...

7
Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang 1, 2 , Yiwei Xiao 2 , Xike Xie 1 , Ruoyu Chen 2 , and Hengchang Liu 1 1 School of Computer Science and Technology, University of Science and Technology of China 2 School of Software Engineering, University of Science and Technology of China {angyan, xkxie, hcliu}@ustc.edu.cn, {xiaoyw, chenry}@mail.ustc.edu.cn Abstract Recent advances in video surveillance systems en- able a new paradigm for intelligent urban traffic management systems. Since surveillance cameras are usually sparsely located to cover key regions of the road under surveillance, it is a big challenge to perform a complete real-time traffic pattern anal- ysis based on incomplete sparse surveillance in- formation. As a result, existing works mostly fo- cus on predicting traffic volumes with historical records available at a particular location and may not provide a complete picture of real-time traffic patterns. To this end, in this paper, we go beyond existing works and tackle the challenges of traf- fic flow analysis from three perspectives. First, we train the transition probabilities to capture vehicles’ movement patterns. The transition probabilities are trained from third-party vehicle GPS data, and thus can work in the area even if there is no camera. Second, we exploit the Multivariate Normal Dis- tribution model together with the transferred prob- abilities to estimate the unobserved traffic patterns. Third, we propose an algorithm for real-time traffic inference with surveillance as a complement source of information. Finally, experiments on real-world data show the effectiveness of our approach. 1 Introduction The proliferation of urban video surveillance systems gives prominence to advanced traffic services, from personalized route optimization to macrolevel traffic administration. Such surveillance data applications include urban vehicle driving optimization [Gonzalez et al., 2007; Schmitt and Jula, 2006; Leduc, 2008], road network traffic flow analysis [Bas et al., 2007; Liu et al., 2013; Suzuki and Nakamura, 2006], and intelligent transportation realization [Zhang et al., 2011; Wang, 2010; Lu et al., 2009]. Most traffic analysis with such surveillance systems as- sumes a dense camera distributions and so as a full coverage of road network traffic surveillance. However, the sparsity of camera distribution can hardly be avoided in real scenarios, due to the high deployment overheads and the dynamic na- ture of urban road networks. On the other hand, there is an increasing need for utilizing such camera surveillance data for traffic inference, especially for most small and medium- sized cities of China, where floating vehicles are not well em- ployed and therefore the solutions with historical trajectory collections are not applicable. For instance, in Suzhou, a lead- ing city in deploying urban traffic monitoring systems, there are 107 intersections in the industrial park while only 41.1% intersections are equipped with traffic surveillance systems. There also exist extensive studies on forecasting traffic conditions with time series analysis. However, such seeming- ly related techniques fall short in making predictions with the incomplete information caused by the partial coverage of traffic surveillance. We can summarize existing works of this field into two categories, neural network based methods [Yasdi, 1999; Dia, 2001] and spatial topology based method- s [De Fabritiis et al., 2008; Min and Wynter, 2011]. Neural network based methods predict a road segment’s traffic flow by matching the real time information with his- torical patterns. In particular, Yasdi et al. [Yasdi, 1999] utilize Jordan architecture based neural networks and Dia et al. [Dia, 2001] develop time-lag recurrent neural networks. Neverthe- less, such predictions are done based on the long-term data accumulation of the target road segment. Therefore, it is not clear how to extend them to support road networks that are only partially surveilled. Spatial topology based methods mostly consider the cor- relation between interconnected road segments. Particularly, Fabritiis et al. [De Fabritiis et al., 2008] predict short-term (15 to 30 minutes) road travel speeds of the Italian motorway net- work with historical and real-time floating-car-collected traf- fic speed information, and Min et al. [Min and Wynter, 2011] define a time lag function to describe the spatiotemporal cor- relations of a road segment and its counterparts in the neigh- borhood. Without exception, these spatial topology methods cannot abstract the correlations between road segments which are not surveilled either. In summary, previous works on traffic volume prediction take the completeness of the road network surveillance as an essential ingredient. It is not clear how prediction can be made if the some parts of the surveillance is missing. In this work, we tackle the challenge in three folds. First, we mod- el the traffic volume of the entire road network with trans- ferred transition probabilities from a third-party GPS dataset. Second, we use a model that takes transition probabilities as Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) 3571

Upload: others

Post on 06-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

Real-time Traffic Pattern Analysis and Inferencewith Sparse Video Surveillance Information

Yang Wang1,2, Yiwei Xiao2, Xike Xie1, Ruoyu Chen2, and Hengchang Liu1

1 School of Computer Science and Technology, University of Science and Technology of China2 School of Software Engineering, University of Science and Technology of China{angyan, xkxie, hcliu}@ustc.edu.cn, {xiaoyw, chenry}@mail.ustc.edu.cn

AbstractRecent advances in video surveillance systems en-able a new paradigm for intelligent urban trafficmanagement systems. Since surveillance camerasare usually sparsely located to cover key regions ofthe road under surveillance, it is a big challenge toperform a complete real-time traffic pattern anal-ysis based on incomplete sparse surveillance in-formation. As a result, existing works mostly fo-cus on predicting traffic volumes with historicalrecords available at a particular location and maynot provide a complete picture of real-time trafficpatterns. To this end, in this paper, we go beyondexisting works and tackle the challenges of traf-fic flow analysis from three perspectives. First, wetrain the transition probabilities to capture vehicles’movement patterns. The transition probabilities aretrained from third-party vehicle GPS data, and thuscan work in the area even if there is no camera.Second, we exploit the Multivariate Normal Dis-tribution model together with the transferred prob-abilities to estimate the unobserved traffic patterns.Third, we propose an algorithm for real-time trafficinference with surveillance as a complement sourceof information. Finally, experiments on real-worlddata show the effectiveness of our approach.

1 IntroductionThe proliferation of urban video surveillance systems givesprominence to advanced traffic services, from personalizedroute optimization to macrolevel traffic administration. Suchsurveillance data applications include urban vehicle drivingoptimization [Gonzalez et al., 2007; Schmitt and Jula, 2006;Leduc, 2008], road network traffic flow analysis [Bas etal., 2007; Liu et al., 2013; Suzuki and Nakamura, 2006],and intelligent transportation realization [Zhang et al., 2011;Wang, 2010; Lu et al., 2009].

Most traffic analysis with such surveillance systems as-sumes a dense camera distributions and so as a full coverageof road network traffic surveillance. However, the sparsity ofcamera distribution can hardly be avoided in real scenarios,due to the high deployment overheads and the dynamic na-ture of urban road networks. On the other hand, there is an

increasing need for utilizing such camera surveillance datafor traffic inference, especially for most small and medium-sized cities of China, where floating vehicles are not well em-ployed and therefore the solutions with historical trajectorycollections are not applicable. For instance, in Suzhou, a lead-ing city in deploying urban traffic monitoring systems, thereare 107 intersections in the industrial park while only 41.1%intersections are equipped with traffic surveillance systems.

There also exist extensive studies on forecasting trafficconditions with time series analysis. However, such seeming-ly related techniques fall short in making predictions withthe incomplete information caused by the partial coverageof traffic surveillance. We can summarize existing works ofthis field into two categories, neural network based methods[Yasdi, 1999; Dia, 2001] and spatial topology based method-s [De Fabritiis et al., 2008; Min and Wynter, 2011].

Neural network based methods predict a road segment’straffic flow by matching the real time information with his-torical patterns. In particular, Yasdi et al. [Yasdi, 1999] utilizeJordan architecture based neural networks and Dia et al. [Dia,2001] develop time-lag recurrent neural networks. Neverthe-less, such predictions are done based on the long-term dataaccumulation of the target road segment. Therefore, it is notclear how to extend them to support road networks that areonly partially surveilled.

Spatial topology based methods mostly consider the cor-relation between interconnected road segments. Particularly,Fabritiis et al. [De Fabritiis et al., 2008] predict short-term (15to 30 minutes) road travel speeds of the Italian motorway net-work with historical and real-time floating-car-collected traf-fic speed information, and Min et al. [Min and Wynter, 2011]define a time lag function to describe the spatiotemporal cor-relations of a road segment and its counterparts in the neigh-borhood. Without exception, these spatial topology methodscannot abstract the correlations between road segments whichare not surveilled either.

In summary, previous works on traffic volume predictiontake the completeness of the road network surveillance asan essential ingredient. It is not clear how prediction can bemade if the some parts of the surveillance is missing. In thiswork, we tackle the challenge in three folds. First, we mod-el the traffic volume of the entire road network with trans-ferred transition probabilities from a third-party GPS dataset.Second, we use a model that takes transition probabilities as

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3571

Page 2: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

Figure 1: Examples (I. Surveillance Camera Distribution; II. Before-/After Camera Data Processing of Vehicles)

inputs to make the incomplete surveillance space approxi-mately complemented. The model follows multivariate nor-mal distributions, which are considered as a general stochas-tic method for capturing travel behaviours on road network-s [Lam and Xu, 1999]. Also, with extensive data analysis,we observe that the transferred probabilities rest with timevarying traffic contexts, e.g., workdays and weekends, rushand non-rush hours. It means that the utilization of the prob-abilities has to contend with specific spatiotemporal contexts.Third, we propose a novel approach to accurately infer thereal-time traffic volume with only partially camera-equippedroad networks. The results are cross-validated with real data.

Our work is based on a real project, SkyEye, in collab-oration with Suzhou Traffic Police Department. To our bestknowledge, this is the first work on inferring real-time traf-fic volumes with sparse road video cameras. The distributionof traffic surveillance cameras and exemplified surveillanceinformation is shown in Figure 1. The GPS data is collectedfrom 4,303 taxicabs of a year, i.e., from May, 2015 to June,2016. We also collected all the video surveillance informationwith 44 camera-equipped road intersections of two-month,including the April and May of 2016. Experiments show thatour proposal can improve the estimation accuracy by up to30%.

The rest of this paper is organized as follows. Section II in-troduces preliminaries and formalizes the problem. Section I-II investigates our technical proposals including the inferencealgorithm. Section IV presents empirical studies and SectionV concludes the paper.

2 Problem DefinitionIn this section, we formally define basic concepts as well asthe problem studied in the work.

Definition 1 (Road Network.) A road network can be mod-eled as a directed graph G(V, E), where the vertex set V de-notes all the road intersections and the edge set E refers toall road segments. Given two road intersections vi, vj ∈ V ,eij ∈ E is the road segment between vi and vj .

In SkyEye, traffic surveillance is implemented by camerasthat equipped on the road intersections. The captured imageand videos are then analyzed and parsed, as shown in Figure1. Accordingly, the traffic volume statistics can be obtained.The intersections can thus be classified into two sets, camera-equipped intersections Vc and camera-free intersections Vc,satisfying Vc ∪ Vc = V and Vc ∩ Vc = ∅.

Definition 2 (Intersection Traffic Volume.) Regarding in-tersection vj , its traffic volume information in a specific timeinterval ∆t can be formulated by Vj(∆t), meaning the trafficvolume of the intersection vj during ∆t is Vj vehicles pertime unit.

Definition 3 (Trajectories.) Considering a moving objec-t o, e.g., a taxicab, the trajectory of o within a giv-en time interval ∆t can be formulated by To(∆t) ={v1(t1), v2(t2), · · · , vk(tk)}ti∈∆t, which means object o s-tarts its journey at intersection v1 of time t1 and ends at in-tersection vk of time tk. Here, vi(ti) refers to the snapshot oftime ti that trajectory To moving across intersection vi.

Notice that trajectories and traffic flows vary with differentsettings of time intervals ∆t. We define trajectories and trafficvolume in accordance with the time-varying urban traffic fea-tures, because it is commonly accepted that the traffic flowschange regularly over time in practice [Wang et al., 2014;2016; Zheng et al., 2010; Yuan et al., 2013; 2012] 1.

Definition 4 (Inference with Incomplete Surveillance.)Given road network G(V, E) and intersections under surveil-lance Vc, for any given time interval ∆t, our purpose isto design a method such that the traffic volume Vi(∆t) ofintersection (vi ∈ Vc) without surveillance can be estimated.

The quality of the estimation or inference is measured byEquation (1). Here, V̂i denotes the estimated value of trafficvolume by our method. For a camera-equipped intersection,the accuracy is equal to 100%. For a camera-free case, theaccuracy measure the ratio of the real value to the estimat-ed value and is normalized to 1 by setting the dominator assummation of real value and the estimation error.

Inference Accuracy =Vi(∆t)

Vi(∆t) + |V̂i(∆t)− Vi(∆t)|(1)

3 Real-time Traffic Volume InferringAlgorithm

In this section, we propose the solution framework for thetraffic analysis and inference problem. Figure 2 illustratesthe architecture of our solution which consists of three majorcomponents: 1) training transition probabilities from third-party datasets; 2) unobserved traffic pattern completion withmodel-based approximations; 3) read-time traffic inferencealgorithms. We detail each step in the following subsections.

3.1 Time-varying Traffic Influence ModelsBetween Neighboring Intersections

Now we model the transition probability between road inter-sections. Suppose intersection vi is a m-way road intersec-tions. To find the probability distribution of an incoming ve-hicle of vi to other m− 1 directions within a specific period,

1The setting of ∆t should balance the tradeoff between the accu-racies and temporal granularity. In our implementation, we slice thetemporal information into slots of 30 minutes following the commonsettings. Notice that such a setting may be related to the results ofinference accuracy but is orthogonal to the generalities of our pro-posals

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3572

Page 3: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

Figure 2: Algorithm overview

e.g., ∆t, helps to infer the traffic patterns of vi’s neighboringintersections.

Let vi and vj be two adjacent intersections. The transitionprobability from vi to vj , denoted as p∆t(vi, vj), representsthe probability of a vehicle moving from intersection vi tovj within time interval ∆t, and thus enables quantifying thetraffic volume of vj , given that all vj’s neighboring intersec-tions are known. In our implementation, we adopt an order-1Markov transition probability matrix, B∆t, for summarizingtransition probabilities of all pairs of road intersections in theurban road network, as shown in Equation (2).

B∆t =

p∆t(v1, v1) · · · p∆t(v1, vk) · · · p∆t(v1, v|V|)

.

.

.. . .

.

.

. . . . ...

p∆t(vj , v1) · · · p∆t(vj , vk) · · · p∆t(vj , v|V|)

.

.

. . . . ...

. . ....

p∆t(v|V|, v1) · · · p∆t(v|V|, vk) · · · p∆t(v|V|, v|V|)

(2)

We assume an acyclic road network such that the transi-tion probability from an intersection to itself equals 0, i.e.,p∆t(vj , vj) = 0. Also, for two intersection vj and vk, if theyare not adjacent, the corresponding transition probabilities e-qual to 0, i.e., p∆t(vj , vk) = p∆t(vk, vj) = 0. According toour analysis, the traffic patterns in weekdays and weekendsare significantly different. So, the traffic influence is modeledfor them separately.

The transition probabilities are trained with third-party G-PS datasets in our implementation. With transition probabil-ities, the mutual influence between road intersections are es-tablished. In the sequel, we study how the trained probabili-ties are further used for estimating unobserved intersections.

3.2 Traffic Volume Analysis of Entire RoadNetwork

As concluded in [Lam and Xu, 1999], the traffic volume ofroad networks follows a MND (Multivariate Normal Distribu-tion). Recall that earlier, we introduce the time-varying prop-erties of traffic volumes. So, we assume, for ∆t and vj , the

traffic volume is V̂j(∆t), j = 1 · · · |V|, where Vj can be cal-culated by:

V̂j(∆t) =

µvj (∆t) + εvj (∆t)+∑vk∈N (vj)

p∆t(vk, vj) ∗ Vk(∆t)

(3)

where εvj(∆t) ∼ N(0, (σ∆t

vj)2), and N (vj) is the neighbor

intersection set of intersection vj .For the entire road network and ∆t, we use a vector TV

to denote the traffic volumes of all intersections. After substi-tuting elements with Equation (3), we have:

TV(∆t) = M(∆t) + B∆t ∗TV(∆t) + E(∆t) (4)where M(∆t) = [µv1

(∆t), µv2(∆t), · · · , µv|V|(∆t)]

T , andE(∆t) = [εv1(∆t), εv2(∆t), · · · , εv|V|(∆t)]

T . Based on thedefinition of MND, we can have:

TV(∆t) ∼ N

(1−B∆t)

−1M(∆t),(

(1−B∆t)−1∗

Υ(∆t)((1−B∆t)T

)−1

) (5)

where Υ(∆t) is the diagonal matrix with elements of(σ∆t

vj)2’s:

Υ(∆t) =

(σ∆t

v1)2

0 · · · 0

0 (σ∆tv2

)2 · · · 0

......

. . ....

0 0 · · · (σ∆tv|V|

)2

(6)

Note that we have already calculated the matrix B∆t inthe last subsection. So far, we have derived the equation ofTV (∆t) whose input parameters can be calculated by Equa-tions (2) and (6).

As we have analyzed, the road intersections can be with orwithout camera surveillance. Let the number of intersectionswith camera be K. We use a K × |V| indicator matrix P toindicate the existence of camera surveillance. The equation ofP is as follows.

P =

a11 a12 · · · a1|V|a21 a22 · · · a2|V|

......

......

aK1 aK2 · · · aK|V|

(7)

If an intersection vj is camera-equipped, the element ajj isset to 1. In total, K columns are set to 1 because there are Kintersections are camera-equipped. Defining as such, we candenote the observed traffic volume as below.

TVOBS (∆t) = P×TV(∆t)

∼ N

P(1−B∆t)

−1M(∆t),(P(1−B∆t)

−1∗Υ(∆t)((1−B∆t)

T )−1

PT

)(8)

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3573

Page 4: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

The purpose here is to estimate the unknown parametersµvi(∆t) and (σ∆t

vi)2. To do that, we utilize maximum likeli-

hood estimation:

L(TV OBS (∆t)) ∝

∣∣∣∣ P(1−B∆t)−1∗

Υ(∆t)((1−B∆t)T )−1

PT

∣∣∣∣−12

exp

− 1

2(TVOBS (∆t)−P(1−BTP )−1M(∆t))

T ∗(P(1−B∆t)

−1∗Υ(∆t)((1−B∆t)

T )−1

PT

)−1

∗(TVOBS (∆t)−P(1−B∆t)−1M(∆t))

(9)

Very likely, we cannot estimate all unknown µvi(∆t) and

(σ∆tvi

)2 since the number of unknown parameters is more thanthe number of observations. Therefore, we group the intersec-tions in nearby regions and make those camera-free intersec-tions have the same µvi(∆t) and (σ∆t

vi)2 value to the nearest

camera-equipped intersections to them. By solving the max-imum likelihood in Equation (9), we can have the estimationresults M̂(∆t) and Υ̂(∆t).

3.3 Inferring Real-time Traffic Volumes ofCamera-free Intersections

We have shown how unknown parameters can be estimatedand how the MND for the road network traffic volume can bemodeled. Now, we describe the mechanism which can inferreal-time traffic volumes of camera-free intersections with theapproximated MND. We first show how to use the calculatedTV (∆t) to infer the real-time traffic volumes of camera-freeintersections. For simplicity, we formally define r and Γ:

r(∆t) = (1−B∆t)−1

M̂(∆t)

Γ(∆t) = (1−B∆t)−1

Υ̂(∆t)((1−B∆t)T

)−1

)

(10)

then we have TV(∆t) ∼ N(r(∆t),Γ(∆t)). After generat-ing the MND, we can proceed to calculate the real-time trafficvolumes of intersections without fixed video cameras by com-puting conditional expectation of MND. We first re-arrangethe matrix TV(∆t) as:

TVRE (∆t) =

[TVMISSING(∆t)

TVOBS (∆t)

](11)

where there are |V| −K and K elements inTVMISSING(∆t) and TVOBS (∆t) respectively, andthe goal is to compute E[TVMISSING(∆t) |TVOBS (∆t) ].According to the re-arranged matrix TVRE (∆t), wealso re-arrange matrix M(∆t) and Υ(∆t) as MRE (∆t)and ΥRE (∆t) respectively, then partition rRE (∆t) andΓRE (∆t) as:

rRE(∆t) =

[r1(∆t)

r2(∆t)

](12)

ΓRE(∆t) =

[Γ11(∆t) Γ12(∆t)

Γ21(∆t) Γ22(∆t)

](13)

where there are |V| −K and K elements in r1(∆t) andr2(∆t), and Γ11(∆t), Γ12(∆t), Γ21(∆t) and Γ22(∆t) are(|V| −K) ∗ (|V| −K), (|V| −K) ∗K, K ∗ (|V| −K) andK ∗ K sub-matrices respectively. Based on the definition ofconditional expectation of MND, we have:

E[TVMISSING(∆t) |TVOBS (∆t) ]

= r1(∆t) +

Γ12(∆t) ∗ (Γ22(∆t))−1∗

(TVOBS (∆t)− r2(∆t))

(14)

3.4 TISV AlgorithmNow, we present the solution that consists of two parts.

The first part is to extract the time-varying traffic influencemodels between neighboring intersections and to extract theholistic time-varying traffic patterns of the entire road net-work. The travel patterns are obtained by mining the histor-ical trajectory dataset of taxicabs and by mining the datasetof fixed road surveillance cameras. This part only needs to beexecuted once and the results can be used to infer the trafficvolumes of urban camera-free intersections in real-time in thesecond part. Note that the first part should be executed for alltime slots of a day.

Algorithm 1 The TISV Algorithm1: procedure TRAFFICVOLUMEINFERRING(∆t)

Initialization: Real-time observed traffic volume dataset Θ ofcamera-equipped intersection set Ψ

2: TVOBS (∆t)← Θ3: rRE (∆t)← rearrange r(∆t) with Ψ4: ΓRE (∆t)← rearrange Γ(∆t) with Ψ5: K ← |Ψ|

6:(

r1(∆t)r2(∆t)

)← Divide rRE (∆t) with K

7:(

Γ11(∆t) Γ12(∆t)Γ21(∆t) Γ22(∆t)

)← Divide ΓRE (∆t)

with K

8: a← (Γ22(∆t))−1

9: b← TVOBS (∆t)− r2(∆t)10: TVMISSING(∆t)← r1(∆t) + Γ12(∆t) ∗ a ∗ b11: for i = 1 to |V| −K do12: Return V̂j(∆t)13: end for14: end procedure

The second part is the Traffic Volume Inferring with S-parse Video Surveillance Cameras algorithm (TISV in short),which is triggered if the real-time traffic volume informationis to be inferred. In particular, to infer the traffic volumesof camera-free intersections, we first obtain the traffic vol-ume information of all camera-equipped intersections, thenre-arrange and re-calculate the corresponding sub-matrices ofthe MND on-the-fly. Finally, the expected traffic volume val-ues for all camera-free intersections can be calculated. The

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3574

Page 5: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

pseudo code of the TISV algorithm is given in Algorithm 1.The parameters in Algorithm 1 are as defined previously.

4 ExperimentsWe present the experimental setup in Section 4.1 and showthe results in Section 4.2.

4.1 Experimental SetupRoad Networks. Our experiment is carried out on the main

urban area of Suzhou Industrial Park covering 45.5 square k-ilometers with a road network of 107 road intersections and298 road segments. Among the 107 intersections, 44 are e-quipped with fixed video surveillance systems.

Taxicab Systems. Historical records of 4,303 taxicabs arecollected for training transition probabilities. All taxicabs areequipped with a GPS-based navigation system and 3G/4Gnetwork and upload their ID, location, speed, direction, etc.in every two minutes.

Verification. The inference performance is measured bycross-validation. We randomly choose 35% camera-equippedintersections as verifying intersections. Assuming the surveil-lance results of the chosen intersections are missing, we caninfer them by the information of other camera-equipped inter-sections and then compare with the ground truth. An exampleregion is shown in Figure 3, where camera-equipped intersec-tions are marked by red dots, and the verifying intersectionsare illustrated by blue dots. We use the dataset of taxicab-s from May 1, 2015 to April 30, 2016 to train the transitionprobabilities, then use the surveillance data of the same peri-od to build the holistic traffic estimation models. We then ran-domly select another 10 days, including both workdays andweekends, to test the accuracy of inference of the real-timetraffic volumes for the verifying intersections. In all testingcases, we consider time slots from 7 am to 8 pm, for eachday.

Competitors. We evaluate the performance of our TISV al-gorithm by comparing it with the OKA (One Kilometer Av-erage), LR (Linear Regression), K-means and LSTM (LongShort-Term Memory) strategies. The OKA strategy assumesthe real-time traffic volume of a camera-free intersection e-quals to the average traffic volume of all camera-equipped in-tersections within one kilometer. The LR strategy establishes

Figure 3: An example region with verifying intersections

an linear regression model for each camera-equipped inter-section, and assumes its nearest camera-free intersection fol-lows the same linear regression model. The K-means strategypartitions all intersections into clusters and assume the traf-fic volume of a camera-free intersection equals to the one ofits cluster center. The LSTM strategy approximates the traf-fic flow of a camera-free intersection by its nearest camera-equipped intersection.

4.2 ResultsWe first evaluate the effectiveness of our proposed TISV al-gorithm by comparing its accuracy with several competitors.Then, we made detailed testing to show how the proposal per-forms with varying factors. In the sequel, we use AR to denoteinference accuracy for brevity.

EffectivenessImpact of day type. We show the effectiveness of our pro-

posal in Figure 4. It can be observed that the accuracy of TISVmethod is steadily above 70%. Compared with the base-line methods, i.e., OKA, LR, K-means and LSTM, our so-lution can increase the accuracy by 31.19%, 29.36%, 31.62%and 8.68%, respectively. The LSTM method performs bet-ter than other baselines. But the performance is unstable, i.e.,for weekends. By contrast, our method is stable towards daytypes. The reason behind is that our method addresses the in-fluences of traffic volumes in a global view, whereas othermethods only consider the local effect of the traffic volume,e.g., on the neighborhood camera-equipped intersections.

Impact of time period. We examine the performance withrespect to the effect of time slots in Figures 5 and 6, respec-tively. In particular, we show the results on weekdays in Fig-ure 5, and the results on weekends in Figure 6.

From the two figures, we can observe that our solution out-performs all competitors and achieves an accuracy as high as75%. For traffic volume estimation, different predicability isachieved due to the different travel patterns, which is con-sistent with the observations of Figure 4. More, we find thatthe inference accuracy works better in rush hours. For exam-ple, in Figure 5, the accuracy can be as high as about 80% at17pm. It is because that during rush hours the travel patternsare with more regularities. Thus, it enables a better estima-tion on traffic volume inference. Finally, we discover the per-formances of five solutions in morning rush hours are muchbetter than those of afternoon. It is because of the centralizeddistribution of traffic flows during rush hours. Also, this isconsistent with the skewed distribution of business districtsand factories where a large population work in a relatively s-mall region. However, the traffic flow during afternoon rushhours are more decentralized due to the more evenly distribut-ed residential regions.

AnalysisWe test the accuracy of our proposed TISV algorithm withvarying factors for the real-time intersection traffic volumeinference. In particular, we investigate: i) the impact of thenumber of camera-equipped intersections; ii) the impact ofthe traffic volume; and iii) the impact of the density ofcamera-equipped intersections.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3575

Page 6: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

1 2 3 4 5 6 7 8 9 10

Day

0.40

0.45

0.50

0.55

0.60

0.65

0.70

0.75

0.80

0.85

0.90

0.95

1.00A

verag

e A

R v

alu

e

weekdays weekends

TISV OKA K-means LR LSTM

Figure 4: Impacts of weekdays andweekends

7:008:00

9:0010:00

11:0012:00

13:0014:00

15:0016:00

17:0018:00

19:000.40

0.45

0.50

0.55

0.60

0.65

0.70

0.75

0.80

0.85

0.90

0.95

1.00

Averag

e A

R v

alu

e

weekdays

rushhours

rushhours

Time

TISV OKA K-means LR LSTM

Figure 5: Impacts of time periods inweekdays

7:008:00

9:0010:00

11:0012:00

13:0014:00

15:0016:00

17:0018:00

19:000.40

0.45

0.50

0.55

0.60

0.65

0.70

0.75

0.80

0.85

0.90

0.95

1.00

Averag

e A

R v

alu

e

weekends

rushhours

rushhours

rushhours

Time

TISV OKA K-means LR LSTM

Figure 6: Impacts of time periods inweekends

Figure 7: Impacts of percentages ofcamera-equipped intersections

Figure 8: Impacts of traffic vol-umes

2 3 4 5 9

Number of camera-equipped intersections within a square

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

Averag

e A

R v

alu

e

TISV OKA K-means LR LSTM

Figure 9: Impacts of densities ofcamera-equipped intersections

Impact of percentage of camera-equipped intersection-s. We investigate the effect of the percentages of camera-equipped intersections by varying the percentage of camera-equipped intersections from 15% to 30%. The results are re-ported in Figure 7. It can be observed that the accuracy ra-tio of our solution increases roughly with the increase of thepercentage of camera-equipped intersections. Because the M-ND can be better modelled if with larger number of camera-equipped intersections. Also, it reflects that the performancesof our solution for weekdays and weekends are close, whichmeans our solution is general to the day types.

Impact of traffic volume. We study the effect of real timevolume of camera-free intersections on the prediction accura-cy in Figure 8. It can be observed that the performance of oursolution increases in accordance with the increase of actualtraffic volumes of camera-free intersections and that our so-lution outperforms the other four in all settings. In particular,the accuracy ratio of our algorithm has an increase of morethan 25% while the actual traffic volumes of camera-free in-tersections increase from 0-20 vehicles per minute to 180-200vehicles per minute. Also, we can observe that the perfor-mances of the OKA, K-means, LR and LSTM strategies arerelatively better while the actual traffic volumes of camera-free intersections are between 20 and 60 vehicles per minute.For most urban intersections, the traffic volume is within sucha range during the day time, and the traffic flows follow stablepatterns as well. Compare with the competitors, our solutionconsiders the mutual influence between intersections whichresults in a better estimation.

Impact of density of camera-equipped intersections. Final-ly, we evaluate the performances of our method as well ascompetitors while changing the density of camera-equippedintersections. To do that, we partition the road network intoequal-sized squares, as exemplified in Figure 3. The densityof camera-equipped intersections is classified into five levels,i.e., 2, 3, 4, 5, 9 per square, respectively. The results are re-ported in Figure 9. We can observe that with the increase ofthe number of camera-equipped intersections, the inferenceaccuracy increases accordingly. By contrast, the competitors’performance does not change. This is because our solutionapproximates the traffic volume model from a global view,whereas all others only consider intersections in the neigh-borhood.

5 ConclusionRecent technology advances provide us new opportunities toimprove traditional real-time traffic surveillance system. Inthis paper, we model the time-varying traffic influence modelbetween neighboring intersections with transition probabili-ties, then use MNDs to approximate the traffic volumes ofthe entire road network. We further propose a novel methodto infer the possible real-time traffic volume of a camera-free intersection with the real-time traffic volume informationof sparsely distributed camera-equipped intersections. Perfor-mance evaluations from real datasets demonstrate the effec-tiveness of our proposal.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3576

Page 7: Real-time Traffic Pattern Analysis and Inference with ... · Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information Yang Wang1;2, Yiwei Xiao2,

AcknowledgmentsWe gratefully acknowledge anonymous reviewers for readingthis paper and giving valuable comments. This paper is par-tially supported by the NSFC (No.61672487, No.61472384,No.61772492, No.61772492, No.U1301256), Jiangsu Natu-ral Science Foundation (No.BK20151239, No.BK20171240,No.BK20171240), and Fundamental Research Funds for theCentral Universities. Xike Xie is supported by the CAS Pio-neer Hundred Talents Program.

References[Bas et al., 2007] Erhan Bas, A Murat Tekalp, and F Sibel

Salman. Automatic vehicle counting from video for trafficflow analysis. In 2007 IEEE Intelligent Vehicles Sympo-sium, pages 392–397. Ieee, 2007.

[De Fabritiis et al., 2008] Corrado De Fabritiis, RobertoRagona, and Gaetano Valenti. Traffic estimation and pre-diction based on real time floating car data. In 2008 11thInternational IEEE Conference on Intelligent Transporta-tion Systems, pages 197–203. IEEE, 2008.

[Dia, 2001] Hussein Dia. An object-oriented neural networkapproach to short-term traffic forecasting. European Jour-nal of Operational Research, 131(2):253–261, 2001.

[Gonzalez et al., 2007] Hector Gonzalez, Jiawei Han, Xi-aolei Li, Margaret Myslinska, and John Paul Sondag.Adaptive fastest path computation on a road network: atraffic mining approach. In Proceedings of the 33rd inter-national conference on Very large data bases, pages 794–805. VLDB Endowment, 2007.

[Lam and Xu, 1999] William HK Lam and G Xu. A trafficflow simulator for network reliability assessment. Journalof advanced transportation, 33(2):159–182, 1999.

[Leduc, 2008] Guillaume Leduc. Road traffic data: Collec-tion methods and applications. Working Papers on Energy,Transport and Climate Change, 1(55), 2008.

[Liu et al., 2013] Honghai Liu, Shengyong Chen, andNaoyuki Kubota. Intelligent video systems and analytics:A survey. IEEE Transactions on Industrial Informatics,9(3):1222–1233, 2013.

[Lu et al., 2009] Rongxing Lu, Xiaodong Lin, Haojin Zhu,and Xuemin Shen. Spark: a new vanet-based smart parkingscheme for large parking lots. In INFOCOM 2009, IEEE,pages 1413–1421. IEEE, 2009.

[Min and Wynter, 2011] Wanli Min and Laura Wynter. Real-time road traffic prediction with spatio-temporal correla-tions. Transportation Research Part C: Emerging Tech-nologies, 19(4):606–616, 2011.

[Schmitt and Jula, 2006] E Schmitt and Hossein Jula. Vehi-cle route guidance systems: classification and comparison.In 2006 IEEE Intelligent Transportation Systems Confer-ence, pages 242–247. IEEE, 2006.

[Suzuki and Nakamura, 2006] Kazufumi Suzuki and Hidek-i Nakamura. Trafficanalyzer-the integrated video imageprocessing system for traffic flow analysis. In Proceed-ings of the 13th ITS World Congress, London, 8-12 Octo-ber 2006, 2006.

[Wang et al., 2014] Yang Wang, Liusheng Huang, TianboGu, Hao Wei, Kai Xing, and Junshan Zhang. Data-driventraffic flow analysis for vehicular communications. InINFOCOM, 2014 Proceedings IEEE, pages 1977–1985.IEEE, 2014.

[Wang et al., 2016] Yang Wang, Erkun Yang, Wei Zheng,Liusheng Huang, Hengchang Liu, and Binxin Liang. A re-alistic and optimized v2v communication system for taxi-cabs. In 2016 IEEE 36th International Conference onDistributed Computing Systems (ICDCS), pages 139–148.IEEE, 2016.

[Wang, 2010] Fei-Yue Wang. Parallel control and manage-ment for intelligent transportation systems: Concepts, ar-chitectures, and applications. IEEE Transactions on Intel-ligent Transportation Systems, 11(3):630–638, 2010.

[Yasdi, 1999] Ramin Yasdi. Prediction of road traffic usinga neural network approach. Neural computing & applica-tions, 8(2):135–142, 1999.

[Yuan et al., 2012] Nicholas Jing Yuan, Yu Zheng, and XingXie. Segmentation of urban areas using road networks.Technical Report, 2012.

[Yuan et al., 2013] Jing Yuan, Yu Zheng, Xing Xie, andGuangzhong Sun. T-drive: Enhancing driving directionswith taxi drivers’ intelligence. Knowledge and Data Engi-neering, IEEE Transactions on, 25(1):220–232, 2013.

[Zhang et al., 2011] Junping Zhang, Fei-Yue Wang, Kun-feng Wang, Wei-Hua Lin, Xin Xu, and Cheng Chen.Data-driven intelligent transportation systems: A survey.IEEE Transactions on Intelligent Transportation Systems,12(4):1624–1639, 2011.

[Zheng et al., 2010] Yu Zheng, Jing Yuan, Wenlei Xie, XingXie, and Guangzhong Sun. Drive smartly as a taxi driv-er. In Ubiquitous Intelligence & Computing and 7th Inter-national Conference on Autonomic & Trusted Computing(UIC/ATC), 2010 7th International Conference on, pages484–486. IEEE, 2010.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

3577