congestion attack detection in intelligent traffic signal

17
Research Article Congestion Attack Detection in Intelligent Traffic Signal System: Combining Empirical and Analytical Methods Yingxiao Xiang , 1 Wenjia Niu , 1 Endong Tong , 1 Yike Li , 1 Bowei Jia , 1 Yalun Wu , 1 Jiqiang Liu , 1 Liang Chang , 2 and Gang Li 3 1 Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University, Beijing, China 2 Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, China 3 Australia Centre for Cyber Security Research and Innovation, Deakin University, Geelong, Australia Correspondence should be addressed to Wenjia Niu; [email protected] and Endong Tong; [email protected] Received 11 June 2021; Revised 31 July 2021; Accepted 11 October 2021; Published 31 October 2021 Academic Editor: Jian Weng Copyright©2021YingxiaoXiangetal.isisanopenaccessarticledistributedundertheCreativeCommonsAttributionLicense, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. e intelligent traffic signal (I-SIG) system aims to perform automatic and optimal signal control based on traffic situation awareness by leveraging connected vehicle (CV) technology. However, the current signal control algorithm is highly vulnerable to CV data spoofing attacks. ese vulnerabilities can be exploited to create congestion in an intersection and even trigger a cascade failure in the traffic network. To avoid this issue, timely and accurate congestion attack detection and identification are essential. is work proposes a congestion attack detection approach by combining empirical prediction and analytical verification. First, we collect a range of traffic images that correspond to specific traffic snapshots which are vulnerable to potential data spoofing attacks. Based on these traffic images, an improved generative adversarial network is trained to predict whether a forthcoming attack will cause congestion with a high probability. Meanwhile, we define a group of traffic flow features. After exploring features and conducting a thorough analysis, a TGRU (tree-regularized gated recurrent unit)-based approach is proposed to verify whether congestion occurs. When we find a possible attack that can cause congestion with high probability and subsequent traffic flows also prove congestion, we can say there is a congestion attack. us, we can realize timely and accurate congestion attack detection by integrating empirical prediction and analytical veri- fication. Extensive experiments demonstrate that our approach performs well in congestion attack detection accuracy and timeliness. 1. Introduction Connected vehicle (CV) technology [1, 2] empowers vehicles to communicate with the surrounding environment (roadside units and traffic signal control infrastructure) and is now transforming today’s transportation systems. As one key component, the intelligent traffic signal (I-SIG) system [3] is responsible for performing dynamic and optimal signal control. It is based on automatic traffic situation awareness by leveraging the emerging communication infrastructure of the space-air- ground integrated network (SAGIN) [4, 5] with the advantages of coverage, flexibility, and so on. For instance, since September 2016, a series of I-SIG systems have been deployed in California, Florida, and New York by the U.S. Department of Trans- portation (USDOT) as a CV Pilot Program [1]. ese systems are currently under testing and not yet widespread. Unfortunately, such dramatically increased connectivity also opens a new door for cyberattacks. Recently, such I-SIG has exposed a vulnerability of the controlled optimization of phases (COP) algorithm [6, 7]. Attackers can compromise the on-board units on their vehicles and send malicious messages (such as those containing speed and location) to influence the traffic control decisions at specific times, thus causing unexpected heavy traffic congestion. Some data show that a single attack vehicle can cause a total delay 11 times greater than the total delay before the attack [8], posing a significant barrier to the development and de- ployment of I-SIG systems on a wide scale in the future. Previous research [8] reveals such congestion attacks on the COP algorithm, analyzes how congestion attacks affect the COP algorithm decisions, and explains how to launch an attack using data spoofing in SAGIN. However, developers Hindawi Security and Communication Networks Volume 2021, Article ID 1632825, 17 pages https://doi.org/10.1155/2021/1632825

Upload: others

Post on 28-Dec-2021

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Congestion Attack Detection in Intelligent Traffic Signal

Research ArticleCongestion Attack Detection in Intelligent Traffic SignalSystem Combining Empirical and Analytical Methods

YingxiaoXiang 1WenjiaNiu 1EndongTong 1YikeLi 1BoweiJia 1YalunWu 1

Jiqiang Liu 1 Liang Chang 2 and Gang Li3

1Beijing Key Laboratory of Security and Privacy in Intelligent Transportation Beijing Jiaotong University Beijing China2Guangxi Key Laboratory of Trusted Software Guilin University of Electronic Technology Guilin China3Australia Centre for Cyber Security Research and Innovation Deakin University Geelong Australia

Correspondence should be addressed to Wenjia Niu niuwjbjtueducn and Endong Tong edtongbjtueducn

Received 11 June 2021 Revised 31 July 2021 Accepted 11 October 2021 Published 31 October 2021

Academic Editor Jian Weng

Copyright copy 2021YingxiaoXiang et al+is is an open access article distributed under theCreative CommonsAttribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

+e intelligent traffic signal (I-SIG) system aims to perform automatic and optimal signal control based on traffic situation awareness byleveraging connected vehicle (CV) technology However the current signal control algorithm is highly vulnerable to CV data spoofingattacks +ese vulnerabilities can be exploited to create congestion in an intersection and even trigger a cascade failure in the trafficnetwork To avoid this issue timely and accurate congestion attack detection and identification are essential +is work proposes acongestion attack detection approach by combining empirical prediction and analytical verification First we collect a range of trafficimages that correspond to specific traffic snapshots which are vulnerable to potential data spoofing attacks Based on these traffic imagesan improved generative adversarial network is trained to predict whether a forthcoming attack will cause congestion with a highprobability Meanwhile we define a group of traffic flow features After exploring features and conducting a thorough analysis a TGRU(tree-regularized gated recurrent unit)-based approach is proposed to verify whether congestion occurs When we find a possible attackthat can cause congestion with high probability and subsequent traffic flows also prove congestion we can say there is a congestionattack +us we can realize timely and accurate congestion attack detection by integrating empirical prediction and analytical veri-fication Extensive experiments demonstrate that our approach performs well in congestion attack detection accuracy and timeliness

1 Introduction

Connected vehicle (CV) technology [1 2] empowers vehicles tocommunicate with the surrounding environment (roadsideunits and traffic signal control infrastructure) and is nowtransforming todayrsquos transportation systems As one keycomponent the intelligent traffic signal (I-SIG) system [3] isresponsible for performing dynamic and optimal signal controlIt is based on automatic traffic situation awareness by leveragingthe emerging communication infrastructure of the space-air-ground integrated network (SAGIN) [4 5] with the advantagesof coverage flexibility and so on For instance since September2016 a series of I-SIG systems have been deployed in CaliforniaFlorida and New York by the US Department of Trans-portation (USDOT) as a CV Pilot Program [1] +ese systemsare currently under testing and not yet widespread

Unfortunately such dramatically increased connectivityalso opens a new door for cyberattacks Recently such I-SIGhas exposed a vulnerability of the controlled optimization ofphases (COP) algorithm [6 7] Attackers can compromisethe on-board units on their vehicles and send maliciousmessages (such as those containing speed and location) toinfluence the traffic control decisions at specific times thuscausing unexpected heavy traffic congestion Some datashow that a single attack vehicle can cause a total delay 11times greater than the total delay before the attack [8]posing a significant barrier to the development and de-ployment of I-SIG systems on a wide scale in the future

Previous research [8] reveals such congestion attacks onthe COP algorithm analyzes how congestion attacks affectthe COP algorithm decisions and explains how to launch anattack using data spoofing in SAGIN However developers

HindawiSecurity and Communication NetworksVolume 2021 Article ID 1632825 17 pageshttpsdoiorg10115520211632825

may still lack a deep understanding of such I-SIG attacks anddefenses raising some pressing concerns (1) What is theeffect of different phases where the attack vehicle is located+e different phases of the attack vehicle can cause differentcongestion effects (2) What is the quantified correlationbetween the attack and congestion degree +e quantifiedcorrelation refers to the potential relationship between theattack and congestion degree once identified we can inferwhether the attack occurred according to the congestiondegree (3) Are there any potential features to be utilized forrevealing the above correlation It is necessary to analyze thecongestion attack mechanism firstly to solve these issues+e challenges of solving these issues include how to au-tomatically explore multiple and multidimensional featuresto quantify the traffic flow characteristics under no attackand congestion attack and analyze the correlation betweenattack features and attack effects +us demystifying thecongestion attack based on the COP mechanism throughquantified features and exploring new analysis methods willbenefit all stakeholders for I-SIG including transportationSAGIN and security specialists

We demystify the attack and corresponding congestionfrom a machine learning perspective by exploring andutilizing quantified features We deeply analyze dataspoofing in SAGIN and the COP algorithm vulnerabilityunder two different attack strategies To explore the effect ofdifferent phases of the attack vehicle we consider utilizinghigh-level image features and design a novel analysis modelbased on the cycle generative adversarial network(CycleGAN) [9] to reflect the relation between the attackand the congestion caused by the attack +us we canpredict whether a forthcoming attack will cause congestionand the congestion effect according to the traffic image at aspecific moment To explore the quantified correlationbetween the attack and congestion degree we utilize trafficflow features and the TGRU classification model [10] (anexplainable gated recurrent unit-based model [11] with treeregularization) to verify whether a congestion attack occursbased on all vehiclesrsquo trajectory data in an intersectionFollowing analysis we also give some promising sugges-tions for defending I-SIG systems against a congestionattack

We implement the I-SIG and experiment through vi-sualized simulation in VISSIM [12] +e experiment showsthe effectiveness of our approach We find that feature-basedmachine learning can reflect the correlation between theattack and congestion degree well +rough the deeplearning-based training the CycleGAN-based approachoutput visualized results with satisfied prediction comparedwith real values the MAE and RMSE of the congestiondegree are near 002 and 003 respectively and theMAE andRMSE of the congestion degree are near 094 and 114respectively TGRU has a 084 precision and 079 recall onpredicting the spoofing attack based on 30 features Gen-erally for defenses we suggest improving the estimation ofvehicle location and speed (EVLS) [7] algorithm of I-SIG ifwe would like to keep a limited cost which requires fewerauthentication mechanisms and SAGIN reinforcementefforts

We summarize our contributions as follows

(1) We perform the study to demystify the attack toI-SIG and the corresponding congestion from amachine learning perspective by exploring differentkinds of features through supervised learning andunsupervised learning

(2) For predicting the spoofing congestion attack weautomatically explore the image feature to quantifythe traffic flow characteristics under no attack andcongestion attack And we propose a CycleGAN-based approach to analyze the potential relationshipbetween the congestion attack and correspondingresults two stages later based on the image feature

(3) For verifying the spoofing congestion attack wepropose a TGRU-based approach to explore theunderlying relationship between the congestion at-tack and traffic flow feature at the current momentbased on the traffic flow features which are firstlydefined in this work

(4) We evaluate our approach empirically from the realCOP algorithm through VISSIM We collect 4476high-quality image samples and 3600 traffic flow datafor the experiment which enables us to demonstratethe effectiveness of our approach compared withground truth

2 Preliminaries

21 SAGIN Infrastructure of I-SIG Figure 1 presents thebasic architecture for the space-air-ground integrated net-work of I-SIG in which two main segments are included aspace segment and a ground segment +e I-SIG of the CVenvironment is located in the network-based ground seg-ment +ere are three main components within the groundsegment on-board units (OBUs) roadside units (RSUs) andsignal planning units +ese refer to the devices installed invehicles roadside servers and traffic lights respectivelyBoth vehicle-to-vehicle (V2V) [13] communication andvehicle-to-infrastructure (V2I eg roadside servers) [14]communication adopt the dedicated short-range commu-nications (DSRC) [15] transmission protocol as 80211p-based wireless communication this provides a channel andenables high-speed direct communication Every vehiclebroadcasts anonymously and surrounding vehicles receivemessages Messages containing critical information arecalled basic safety messages (BSMs) +ese contain core dataelements including vehicle size position speed headingacceleration and brake system status Compared withDSRC the communication from the RSU to the signalplanning unit adopts the US National TransportationCommunications for Intelligent Transportation SystemProtocol (NTCIP) [16] By providing two-way communi-cation between vehicles and traffic signals NTCIP is spe-cially designed to achieve interpretability andinterchangeability between computers and electronic trafficcontrol equipment from different manufacturers thus in-creasing use in smart city initiatives

2 Security and Communication Networks

22 I-SIG Data Flow +e data flow of the I-SIG system isrevealed in Figure 2 Each OBU of a vehicle sends BSMs tothe RSU for real-time trajectory collection+en the data arepreprocessed to form an arrival table (Table 1) to be used asinput for signal planning which contains COP and EVLSalgorithms If the penetration rate (PR) of OBU for a vehicleis less than 95 the arrival table will be sent to EVLS for anupdate Otherwise it will be directly sent to the COP al-gorithm for planning According to the results of the COPalgorithm a downward signaling command will be trans-ferred to the phase signal controller After each stage ofsignal control the status of the signal will be returned asfeedback for continuous COP planning

+ere are 8 traffic signals in I-SIG as shown in Figure 3called phases odd numbers are for left-turn lanes evennumbers are for through lanes Table 1 is the arrival tablewhich is sent to the signal planning model In Table 1 Ti

i (0le ileM) denotes the time to arrive at the stop bar fromthe current location I-SIG sets M 130 seconds covering aBSM statistic of over two minutes Nij(i isin [0 M] j isin [1 8])

means that in phase j there will beNij vehicles that are goingto reach the stop bar within Ti seconds Here the stop bar isset in front of the traffic light as it is marked in real roadintersections

+e EVLS is based onWiedemannrsquos car-following modeland is used to fill the blank monitoring area of the moni-toring segment and insert vehicle data between OBU-equipped vehicles

+e key is to estimate the number of queued vehiclesBecause it is assumed that a queue always begins at the stopbar the last vehicle in the queue needs to be found to de-termine the queue length

First the historical distances to the stop bar and stoptime of the last stopped connected vehicle and the second-to-the-last stopped connected vehicle in the queue arecalculated these are denoted as Lq1 Tq1 Lq2 and Tq2 re-spectively +e current time is Tc and the estimated queuelength is Les Assuming that the queue propagation speed vq

is constant we have

vq Lq1 minus Lq2

Tq1 minus Tq2

Les minus Lq1

TC minus Tq1 (1)

+en

Les Lq1 + vq TC minus Tq11113872 1113873 (2)

If the average vehicle length is C the number N0i ofvehicles in queue is then calculated as follows

N0i Les

C i isin [1 8] (3)

Although such estimation provides effective support fora low PR it also introduces a new threat of data spoofingattack to the COP algorithm

3 Demystifying Attack on COP

31Data Spoofing0reat +ere are two data spoofing attackstrategies proposed in I-SIG (Figure 4) +e first one is adirect attack on the arrival table without considering PR thesecond one is an indirect attack on EVLS when the PR is lessthan 95

+e first strategy is for arrival time and phase spoofingfor both the full deployment period (PR ge 95) and

Space network

Ground network

CV environment

V2V

OBUs

RSUs

LegendDSRC Transmission (V2V V2I)

OBUsBSM

BSM

Groundsegment

V2I

Signal planning

BSM

NTCIP Transmission

GPSsatellites

Figure 1 +e architecture for space-air-ground integrated network of I-SIG

Security and Communication Networks 3

transition period (PR lt 95) +e attacker changes thelocation and speed information in vehicle BSMs to alter thevehiclersquos arrival time and requested phase thus the corre-sponding arrival table elements in Table 1 are changed +isattack strategy can directly attack input data flow no matterwhat the PR is As shown in Figure 4(a) the attacker adds aspoofed vehicle into the original vehicle queue at any lo-cation +e insertion of a spoofed vehicle makes the queuelonger Moreover there is an increase in the duration of thegreen light allocated by the COP algorithm for the currentphase which delays the next start time of the green light ofall phases thus increasing the delay for vehicles to passthrough the intersection

+e second strategy is for queue-length spoofing forthe transition period only +is strategy aims to extend the

queue length estimated by the EVLS algorithm bychanging the location and speed values in BSMsFigure 4(b) shows that the attacker adds a stopped vehiclewith the farthest distance to the stop bar Owing to theEVLS algorithm estimating the queue length based on thelocation of the last stopped connected vehicle this attackcauses the estimated queue length Les calculated byequation (2) to increase +erefore the number of vehiclesin the queue N0i calculated by equation (3) increases aswell

32 Planning-Level CongestionAnalysis +eCOP algorithmis responsible for traffic signal planning thus it is essentialfor planning-level congestion analysis of I-SIG

Trajectory collection

Signal planning

COP EVLSPhase signal

controller

Signalling

Signal status

Arrival table

BSMPreprocessingOBU ę

Trajectorydata

Figure 2 Data flow of the I-SIG system

Table 1 Arrival table Numbers 1 to 8 are phases and T0 to TM are the remaining arrival time of vehicles

Phase 1 2 middot middot middot 8T0 N01 N02 middot middot middot N08T1 N11 N12 middot middot middot N18T2 N21 N22 middot middot middot N28middot middot middot middot middot middot middot middot middot middot middot middot middot middot middot

TM NM1 NM2 middot middot middot NM8

1 6

2 5

7

4

83I-SIG

Figure 3 I-SIG signal control scenario including 8 phases

4 Security and Communication Networks

+rough reading the published COP-related papers[6 7] and analyzing the implementation code we reveal amore complete and detailed COP algorithm for the first time(Algorithms 1 and 2) +e authors in [6] first proposed aCOP algorithm that allows optimization of various per-formance indices including delay stops and queue lengthsfor the optimal control of a single intersection However itdid not support flexible or dual ring and phase sequencesand it is difficult to understand for most readers due to thelack of the algorithm flow Based on the COP algorithm theauthors in [7] presented a real-time adaptive traffic controlalgorithm by utilizing data from connected vehicles tooptimize the phase sequence However they did not providethe details of the algorithm Compared with [6 7] Algo-rithm 1 is the first algorithm that provides a complete anddetailed flow of signal planning

In Table 2 we list the meanings of the mathematicalsymbols that appear in the two algorithms

+e design of the COP algorithm uses the collaborationof two-stage planning and operation +e COP algorithmplans signals for the next-stage based on the vehiclersquos esti-mation and such planned signal duration will be operated atthe next-stage signal control time +us this is a continuousalternate process in a fixed phase sequence which meansthat the I-SIG system cannot change the order and durationof phases in the current stage since this is set in the previousstage When bringing foresight of planning such a designalso opens the door to attack signal planning in order toaffect next-stage operation continuously

+e spoofing of the arrival table affects the variables Atk

and planPrp and the later calculation of Delayr in line 19 ofAlgorithms 1 +e change in Delayr causes the variables

optVr in line 21 optGr0 in line 22 and optGr1 in line 23 ofAlgorithms 1 to change as well Finally the outputs planPrpoptGrp xlowastj and vj are changed

33 High-Level Image Feature-Based Congestion AttackPrediction In this subsection we employ an image feature-based CycleGAN to explain the relationship between thephase where the spoofed vehicle is located and the con-gestion image features two stages later

As mentioned in the Data Spoofing+reat section thereare two data spoofing attack strategies but either attack willcause congestion in a period Different phases of spoofedvehicles lead to different congestion effects +erefore theimage features of intersection congestion are also different+e CycleGAN model can mine the potential relationshipbetween two different types (X and Y) of images +roughtraining CycleGAN can generate the corresponding imagesY according to X and generate the related images Xaccording to Y +erefore we utilize the CycleGANmodel topredict the congestion effects according to the phases ofspoofed vehicles which were considered the attack featurein order to reveal the relationship between the phase of thespoofed vehicle and the caused congestion image feature

+e CycleGAN architecture is illustrated in Figure 5One training sample is a pair of image xi and image yi toform (xi yi) xi isin X and yi isin Y xi refers to the processedtraffic image at the spoofing time and yi is the processedtraffic congestion image two stages later +e image pro-cessing consists of three steps (1) filter out environmentbackground (2) extract four images in which each has 2phases at one intersection and (3) join these four images

1 6

2 574

83

Add a spoofed vehicle at any location as a late

arrived one

The equipped vehicles

The unequipped vehicles

(a)

e equipped vehicles

e unequipped vehicles

1 6

2 574

83

Add a spoofed vehicle at the location with maximum increased queue length

Queuingregion

Increased queue length

(b)

Figure 4 Two strategies of congestion data spoofing attack PR is short for penetration rate (a) Direct attack on arrival table withoutconsidering PR (b) Indirect attack on EVLS when PR is less than 95

Security and Communication Networks 5

+e plan of the optimal green duration and phase sequenceRequire Atk G

rp

min Grpmax R T

(1) Set j 0 vj 0(2) Xmin

j max G00min + R + G01

min + R G10min + R + G11

min + R1113966 1113967

(3) Xmaxj min G00

max + R + G01max + R G10

max + R + G11max + R1113864 1113865

(4) Tprime T minus sjminus 1 minus middot middot middot minus s1(5) for r 0 1 do(6) plan Pr0 1 + jlowast 2 + rlowast 4(7) plan Pr1 2 + jlowast 2 + rlowast 4(8) end for(9) for sj 1 T do(10) if sj geXmin

j and sj lemin Xmaxj Tprime1113966 1113967 then

(11) xj sj

(12) effect G0 effect G1 sj minus 2R

(13) optV0 optV1 999990(14) for r 0 1 do(15) for i Gr0

min Gr0max do

(16) tGr0 i

(17) tGr1 effectGr minus tGr0(18) if tGr1 geGr1

min and tGr1 leGr1max then

(19) Delayr f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Calculated by Algorithm 2(20) if Delayr lt optVr then(21) optVr Delayr

(22) optGr0 tGr0(23) optGr1 tGr1(24) end if(25) end if(26) end for(27) end for(28) else(29) vj 99999(30) xj 0(31) end if(32) end for(33) xlowastj optGr0 + R + optGr1 + R

(34) fj(xlowastj ) optV0 + optV1(35) vj fj(xlowastj ) + vjminus 1

Ensure planPrp optGrp xlowastj vj

(36) if jlt 2 then(37) j j + 1 go to step 2(38) end if

ALGORITHM 1 +e COP algorithm

+e delay calculation of ring r at stage j f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Require r p1 p2 g1 g2 xj Atk

(1) l0p1 A0p1 l0p2 A0p2(2) for i 1 xj do(3) lip1 liminus 1p1 minus Dip1 + Aip1(4) lip2 liminus 1p2 minus Dip2 + Aip2(5) di lip1 + lip2(6) end for(7) st

Dip1 1 if ileg1 and (i + 1)2 0

0 if g1lt ilexj1113896

Dip2 1 if (g1 + R)lt ile (g1 + R + g2) and (i + 1)2 0

0 i (g1 + R + g2)lt ilexj and ile (g1 + R)1113896

(8) Delayr 1113936xj

i1 di

Ensure Delayr

ALGORITHM 2 +e delay calculation algorithm

6 Security and Communication Networks

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 2: Congestion Attack Detection in Intelligent Traffic Signal

may still lack a deep understanding of such I-SIG attacks anddefenses raising some pressing concerns (1) What is theeffect of different phases where the attack vehicle is located+e different phases of the attack vehicle can cause differentcongestion effects (2) What is the quantified correlationbetween the attack and congestion degree +e quantifiedcorrelation refers to the potential relationship between theattack and congestion degree once identified we can inferwhether the attack occurred according to the congestiondegree (3) Are there any potential features to be utilized forrevealing the above correlation It is necessary to analyze thecongestion attack mechanism firstly to solve these issues+e challenges of solving these issues include how to au-tomatically explore multiple and multidimensional featuresto quantify the traffic flow characteristics under no attackand congestion attack and analyze the correlation betweenattack features and attack effects +us demystifying thecongestion attack based on the COP mechanism throughquantified features and exploring new analysis methods willbenefit all stakeholders for I-SIG including transportationSAGIN and security specialists

We demystify the attack and corresponding congestionfrom a machine learning perspective by exploring andutilizing quantified features We deeply analyze dataspoofing in SAGIN and the COP algorithm vulnerabilityunder two different attack strategies To explore the effect ofdifferent phases of the attack vehicle we consider utilizinghigh-level image features and design a novel analysis modelbased on the cycle generative adversarial network(CycleGAN) [9] to reflect the relation between the attackand the congestion caused by the attack +us we canpredict whether a forthcoming attack will cause congestionand the congestion effect according to the traffic image at aspecific moment To explore the quantified correlationbetween the attack and congestion degree we utilize trafficflow features and the TGRU classification model [10] (anexplainable gated recurrent unit-based model [11] with treeregularization) to verify whether a congestion attack occursbased on all vehiclesrsquo trajectory data in an intersectionFollowing analysis we also give some promising sugges-tions for defending I-SIG systems against a congestionattack

We implement the I-SIG and experiment through vi-sualized simulation in VISSIM [12] +e experiment showsthe effectiveness of our approach We find that feature-basedmachine learning can reflect the correlation between theattack and congestion degree well +rough the deeplearning-based training the CycleGAN-based approachoutput visualized results with satisfied prediction comparedwith real values the MAE and RMSE of the congestiondegree are near 002 and 003 respectively and theMAE andRMSE of the congestion degree are near 094 and 114respectively TGRU has a 084 precision and 079 recall onpredicting the spoofing attack based on 30 features Gen-erally for defenses we suggest improving the estimation ofvehicle location and speed (EVLS) [7] algorithm of I-SIG ifwe would like to keep a limited cost which requires fewerauthentication mechanisms and SAGIN reinforcementefforts

We summarize our contributions as follows

(1) We perform the study to demystify the attack toI-SIG and the corresponding congestion from amachine learning perspective by exploring differentkinds of features through supervised learning andunsupervised learning

(2) For predicting the spoofing congestion attack weautomatically explore the image feature to quantifythe traffic flow characteristics under no attack andcongestion attack And we propose a CycleGAN-based approach to analyze the potential relationshipbetween the congestion attack and correspondingresults two stages later based on the image feature

(3) For verifying the spoofing congestion attack wepropose a TGRU-based approach to explore theunderlying relationship between the congestion at-tack and traffic flow feature at the current momentbased on the traffic flow features which are firstlydefined in this work

(4) We evaluate our approach empirically from the realCOP algorithm through VISSIM We collect 4476high-quality image samples and 3600 traffic flow datafor the experiment which enables us to demonstratethe effectiveness of our approach compared withground truth

2 Preliminaries

21 SAGIN Infrastructure of I-SIG Figure 1 presents thebasic architecture for the space-air-ground integrated net-work of I-SIG in which two main segments are included aspace segment and a ground segment +e I-SIG of the CVenvironment is located in the network-based ground seg-ment +ere are three main components within the groundsegment on-board units (OBUs) roadside units (RSUs) andsignal planning units +ese refer to the devices installed invehicles roadside servers and traffic lights respectivelyBoth vehicle-to-vehicle (V2V) [13] communication andvehicle-to-infrastructure (V2I eg roadside servers) [14]communication adopt the dedicated short-range commu-nications (DSRC) [15] transmission protocol as 80211p-based wireless communication this provides a channel andenables high-speed direct communication Every vehiclebroadcasts anonymously and surrounding vehicles receivemessages Messages containing critical information arecalled basic safety messages (BSMs) +ese contain core dataelements including vehicle size position speed headingacceleration and brake system status Compared withDSRC the communication from the RSU to the signalplanning unit adopts the US National TransportationCommunications for Intelligent Transportation SystemProtocol (NTCIP) [16] By providing two-way communi-cation between vehicles and traffic signals NTCIP is spe-cially designed to achieve interpretability andinterchangeability between computers and electronic trafficcontrol equipment from different manufacturers thus in-creasing use in smart city initiatives

2 Security and Communication Networks

22 I-SIG Data Flow +e data flow of the I-SIG system isrevealed in Figure 2 Each OBU of a vehicle sends BSMs tothe RSU for real-time trajectory collection+en the data arepreprocessed to form an arrival table (Table 1) to be used asinput for signal planning which contains COP and EVLSalgorithms If the penetration rate (PR) of OBU for a vehicleis less than 95 the arrival table will be sent to EVLS for anupdate Otherwise it will be directly sent to the COP al-gorithm for planning According to the results of the COPalgorithm a downward signaling command will be trans-ferred to the phase signal controller After each stage ofsignal control the status of the signal will be returned asfeedback for continuous COP planning

+ere are 8 traffic signals in I-SIG as shown in Figure 3called phases odd numbers are for left-turn lanes evennumbers are for through lanes Table 1 is the arrival tablewhich is sent to the signal planning model In Table 1 Ti

i (0le ileM) denotes the time to arrive at the stop bar fromthe current location I-SIG sets M 130 seconds covering aBSM statistic of over two minutes Nij(i isin [0 M] j isin [1 8])

means that in phase j there will beNij vehicles that are goingto reach the stop bar within Ti seconds Here the stop bar isset in front of the traffic light as it is marked in real roadintersections

+e EVLS is based onWiedemannrsquos car-following modeland is used to fill the blank monitoring area of the moni-toring segment and insert vehicle data between OBU-equipped vehicles

+e key is to estimate the number of queued vehiclesBecause it is assumed that a queue always begins at the stopbar the last vehicle in the queue needs to be found to de-termine the queue length

First the historical distances to the stop bar and stoptime of the last stopped connected vehicle and the second-to-the-last stopped connected vehicle in the queue arecalculated these are denoted as Lq1 Tq1 Lq2 and Tq2 re-spectively +e current time is Tc and the estimated queuelength is Les Assuming that the queue propagation speed vq

is constant we have

vq Lq1 minus Lq2

Tq1 minus Tq2

Les minus Lq1

TC minus Tq1 (1)

+en

Les Lq1 + vq TC minus Tq11113872 1113873 (2)

If the average vehicle length is C the number N0i ofvehicles in queue is then calculated as follows

N0i Les

C i isin [1 8] (3)

Although such estimation provides effective support fora low PR it also introduces a new threat of data spoofingattack to the COP algorithm

3 Demystifying Attack on COP

31Data Spoofing0reat +ere are two data spoofing attackstrategies proposed in I-SIG (Figure 4) +e first one is adirect attack on the arrival table without considering PR thesecond one is an indirect attack on EVLS when the PR is lessthan 95

+e first strategy is for arrival time and phase spoofingfor both the full deployment period (PR ge 95) and

Space network

Ground network

CV environment

V2V

OBUs

RSUs

LegendDSRC Transmission (V2V V2I)

OBUsBSM

BSM

Groundsegment

V2I

Signal planning

BSM

NTCIP Transmission

GPSsatellites

Figure 1 +e architecture for space-air-ground integrated network of I-SIG

Security and Communication Networks 3

transition period (PR lt 95) +e attacker changes thelocation and speed information in vehicle BSMs to alter thevehiclersquos arrival time and requested phase thus the corre-sponding arrival table elements in Table 1 are changed +isattack strategy can directly attack input data flow no matterwhat the PR is As shown in Figure 4(a) the attacker adds aspoofed vehicle into the original vehicle queue at any lo-cation +e insertion of a spoofed vehicle makes the queuelonger Moreover there is an increase in the duration of thegreen light allocated by the COP algorithm for the currentphase which delays the next start time of the green light ofall phases thus increasing the delay for vehicles to passthrough the intersection

+e second strategy is for queue-length spoofing forthe transition period only +is strategy aims to extend the

queue length estimated by the EVLS algorithm bychanging the location and speed values in BSMsFigure 4(b) shows that the attacker adds a stopped vehiclewith the farthest distance to the stop bar Owing to theEVLS algorithm estimating the queue length based on thelocation of the last stopped connected vehicle this attackcauses the estimated queue length Les calculated byequation (2) to increase +erefore the number of vehiclesin the queue N0i calculated by equation (3) increases aswell

32 Planning-Level CongestionAnalysis +eCOP algorithmis responsible for traffic signal planning thus it is essentialfor planning-level congestion analysis of I-SIG

Trajectory collection

Signal planning

COP EVLSPhase signal

controller

Signalling

Signal status

Arrival table

BSMPreprocessingOBU ę

Trajectorydata

Figure 2 Data flow of the I-SIG system

Table 1 Arrival table Numbers 1 to 8 are phases and T0 to TM are the remaining arrival time of vehicles

Phase 1 2 middot middot middot 8T0 N01 N02 middot middot middot N08T1 N11 N12 middot middot middot N18T2 N21 N22 middot middot middot N28middot middot middot middot middot middot middot middot middot middot middot middot middot middot middot

TM NM1 NM2 middot middot middot NM8

1 6

2 5

7

4

83I-SIG

Figure 3 I-SIG signal control scenario including 8 phases

4 Security and Communication Networks

+rough reading the published COP-related papers[6 7] and analyzing the implementation code we reveal amore complete and detailed COP algorithm for the first time(Algorithms 1 and 2) +e authors in [6] first proposed aCOP algorithm that allows optimization of various per-formance indices including delay stops and queue lengthsfor the optimal control of a single intersection However itdid not support flexible or dual ring and phase sequencesand it is difficult to understand for most readers due to thelack of the algorithm flow Based on the COP algorithm theauthors in [7] presented a real-time adaptive traffic controlalgorithm by utilizing data from connected vehicles tooptimize the phase sequence However they did not providethe details of the algorithm Compared with [6 7] Algo-rithm 1 is the first algorithm that provides a complete anddetailed flow of signal planning

In Table 2 we list the meanings of the mathematicalsymbols that appear in the two algorithms

+e design of the COP algorithm uses the collaborationof two-stage planning and operation +e COP algorithmplans signals for the next-stage based on the vehiclersquos esti-mation and such planned signal duration will be operated atthe next-stage signal control time +us this is a continuousalternate process in a fixed phase sequence which meansthat the I-SIG system cannot change the order and durationof phases in the current stage since this is set in the previousstage When bringing foresight of planning such a designalso opens the door to attack signal planning in order toaffect next-stage operation continuously

+e spoofing of the arrival table affects the variables Atk

and planPrp and the later calculation of Delayr in line 19 ofAlgorithms 1 +e change in Delayr causes the variables

optVr in line 21 optGr0 in line 22 and optGr1 in line 23 ofAlgorithms 1 to change as well Finally the outputs planPrpoptGrp xlowastj and vj are changed

33 High-Level Image Feature-Based Congestion AttackPrediction In this subsection we employ an image feature-based CycleGAN to explain the relationship between thephase where the spoofed vehicle is located and the con-gestion image features two stages later

As mentioned in the Data Spoofing+reat section thereare two data spoofing attack strategies but either attack willcause congestion in a period Different phases of spoofedvehicles lead to different congestion effects +erefore theimage features of intersection congestion are also different+e CycleGAN model can mine the potential relationshipbetween two different types (X and Y) of images +roughtraining CycleGAN can generate the corresponding imagesY according to X and generate the related images Xaccording to Y +erefore we utilize the CycleGANmodel topredict the congestion effects according to the phases ofspoofed vehicles which were considered the attack featurein order to reveal the relationship between the phase of thespoofed vehicle and the caused congestion image feature

+e CycleGAN architecture is illustrated in Figure 5One training sample is a pair of image xi and image yi toform (xi yi) xi isin X and yi isin Y xi refers to the processedtraffic image at the spoofing time and yi is the processedtraffic congestion image two stages later +e image pro-cessing consists of three steps (1) filter out environmentbackground (2) extract four images in which each has 2phases at one intersection and (3) join these four images

1 6

2 574

83

Add a spoofed vehicle at any location as a late

arrived one

The equipped vehicles

The unequipped vehicles

(a)

e equipped vehicles

e unequipped vehicles

1 6

2 574

83

Add a spoofed vehicle at the location with maximum increased queue length

Queuingregion

Increased queue length

(b)

Figure 4 Two strategies of congestion data spoofing attack PR is short for penetration rate (a) Direct attack on arrival table withoutconsidering PR (b) Indirect attack on EVLS when PR is less than 95

Security and Communication Networks 5

+e plan of the optimal green duration and phase sequenceRequire Atk G

rp

min Grpmax R T

(1) Set j 0 vj 0(2) Xmin

j max G00min + R + G01

min + R G10min + R + G11

min + R1113966 1113967

(3) Xmaxj min G00

max + R + G01max + R G10

max + R + G11max + R1113864 1113865

(4) Tprime T minus sjminus 1 minus middot middot middot minus s1(5) for r 0 1 do(6) plan Pr0 1 + jlowast 2 + rlowast 4(7) plan Pr1 2 + jlowast 2 + rlowast 4(8) end for(9) for sj 1 T do(10) if sj geXmin

j and sj lemin Xmaxj Tprime1113966 1113967 then

(11) xj sj

(12) effect G0 effect G1 sj minus 2R

(13) optV0 optV1 999990(14) for r 0 1 do(15) for i Gr0

min Gr0max do

(16) tGr0 i

(17) tGr1 effectGr minus tGr0(18) if tGr1 geGr1

min and tGr1 leGr1max then

(19) Delayr f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Calculated by Algorithm 2(20) if Delayr lt optVr then(21) optVr Delayr

(22) optGr0 tGr0(23) optGr1 tGr1(24) end if(25) end if(26) end for(27) end for(28) else(29) vj 99999(30) xj 0(31) end if(32) end for(33) xlowastj optGr0 + R + optGr1 + R

(34) fj(xlowastj ) optV0 + optV1(35) vj fj(xlowastj ) + vjminus 1

Ensure planPrp optGrp xlowastj vj

(36) if jlt 2 then(37) j j + 1 go to step 2(38) end if

ALGORITHM 1 +e COP algorithm

+e delay calculation of ring r at stage j f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Require r p1 p2 g1 g2 xj Atk

(1) l0p1 A0p1 l0p2 A0p2(2) for i 1 xj do(3) lip1 liminus 1p1 minus Dip1 + Aip1(4) lip2 liminus 1p2 minus Dip2 + Aip2(5) di lip1 + lip2(6) end for(7) st

Dip1 1 if ileg1 and (i + 1)2 0

0 if g1lt ilexj1113896

Dip2 1 if (g1 + R)lt ile (g1 + R + g2) and (i + 1)2 0

0 i (g1 + R + g2)lt ilexj and ile (g1 + R)1113896

(8) Delayr 1113936xj

i1 di

Ensure Delayr

ALGORITHM 2 +e delay calculation algorithm

6 Security and Communication Networks

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 3: Congestion Attack Detection in Intelligent Traffic Signal

22 I-SIG Data Flow +e data flow of the I-SIG system isrevealed in Figure 2 Each OBU of a vehicle sends BSMs tothe RSU for real-time trajectory collection+en the data arepreprocessed to form an arrival table (Table 1) to be used asinput for signal planning which contains COP and EVLSalgorithms If the penetration rate (PR) of OBU for a vehicleis less than 95 the arrival table will be sent to EVLS for anupdate Otherwise it will be directly sent to the COP al-gorithm for planning According to the results of the COPalgorithm a downward signaling command will be trans-ferred to the phase signal controller After each stage ofsignal control the status of the signal will be returned asfeedback for continuous COP planning

+ere are 8 traffic signals in I-SIG as shown in Figure 3called phases odd numbers are for left-turn lanes evennumbers are for through lanes Table 1 is the arrival tablewhich is sent to the signal planning model In Table 1 Ti

i (0le ileM) denotes the time to arrive at the stop bar fromthe current location I-SIG sets M 130 seconds covering aBSM statistic of over two minutes Nij(i isin [0 M] j isin [1 8])

means that in phase j there will beNij vehicles that are goingto reach the stop bar within Ti seconds Here the stop bar isset in front of the traffic light as it is marked in real roadintersections

+e EVLS is based onWiedemannrsquos car-following modeland is used to fill the blank monitoring area of the moni-toring segment and insert vehicle data between OBU-equipped vehicles

+e key is to estimate the number of queued vehiclesBecause it is assumed that a queue always begins at the stopbar the last vehicle in the queue needs to be found to de-termine the queue length

First the historical distances to the stop bar and stoptime of the last stopped connected vehicle and the second-to-the-last stopped connected vehicle in the queue arecalculated these are denoted as Lq1 Tq1 Lq2 and Tq2 re-spectively +e current time is Tc and the estimated queuelength is Les Assuming that the queue propagation speed vq

is constant we have

vq Lq1 minus Lq2

Tq1 minus Tq2

Les minus Lq1

TC minus Tq1 (1)

+en

Les Lq1 + vq TC minus Tq11113872 1113873 (2)

If the average vehicle length is C the number N0i ofvehicles in queue is then calculated as follows

N0i Les

C i isin [1 8] (3)

Although such estimation provides effective support fora low PR it also introduces a new threat of data spoofingattack to the COP algorithm

3 Demystifying Attack on COP

31Data Spoofing0reat +ere are two data spoofing attackstrategies proposed in I-SIG (Figure 4) +e first one is adirect attack on the arrival table without considering PR thesecond one is an indirect attack on EVLS when the PR is lessthan 95

+e first strategy is for arrival time and phase spoofingfor both the full deployment period (PR ge 95) and

Space network

Ground network

CV environment

V2V

OBUs

RSUs

LegendDSRC Transmission (V2V V2I)

OBUsBSM

BSM

Groundsegment

V2I

Signal planning

BSM

NTCIP Transmission

GPSsatellites

Figure 1 +e architecture for space-air-ground integrated network of I-SIG

Security and Communication Networks 3

transition period (PR lt 95) +e attacker changes thelocation and speed information in vehicle BSMs to alter thevehiclersquos arrival time and requested phase thus the corre-sponding arrival table elements in Table 1 are changed +isattack strategy can directly attack input data flow no matterwhat the PR is As shown in Figure 4(a) the attacker adds aspoofed vehicle into the original vehicle queue at any lo-cation +e insertion of a spoofed vehicle makes the queuelonger Moreover there is an increase in the duration of thegreen light allocated by the COP algorithm for the currentphase which delays the next start time of the green light ofall phases thus increasing the delay for vehicles to passthrough the intersection

+e second strategy is for queue-length spoofing forthe transition period only +is strategy aims to extend the

queue length estimated by the EVLS algorithm bychanging the location and speed values in BSMsFigure 4(b) shows that the attacker adds a stopped vehiclewith the farthest distance to the stop bar Owing to theEVLS algorithm estimating the queue length based on thelocation of the last stopped connected vehicle this attackcauses the estimated queue length Les calculated byequation (2) to increase +erefore the number of vehiclesin the queue N0i calculated by equation (3) increases aswell

32 Planning-Level CongestionAnalysis +eCOP algorithmis responsible for traffic signal planning thus it is essentialfor planning-level congestion analysis of I-SIG

Trajectory collection

Signal planning

COP EVLSPhase signal

controller

Signalling

Signal status

Arrival table

BSMPreprocessingOBU ę

Trajectorydata

Figure 2 Data flow of the I-SIG system

Table 1 Arrival table Numbers 1 to 8 are phases and T0 to TM are the remaining arrival time of vehicles

Phase 1 2 middot middot middot 8T0 N01 N02 middot middot middot N08T1 N11 N12 middot middot middot N18T2 N21 N22 middot middot middot N28middot middot middot middot middot middot middot middot middot middot middot middot middot middot middot

TM NM1 NM2 middot middot middot NM8

1 6

2 5

7

4

83I-SIG

Figure 3 I-SIG signal control scenario including 8 phases

4 Security and Communication Networks

+rough reading the published COP-related papers[6 7] and analyzing the implementation code we reveal amore complete and detailed COP algorithm for the first time(Algorithms 1 and 2) +e authors in [6] first proposed aCOP algorithm that allows optimization of various per-formance indices including delay stops and queue lengthsfor the optimal control of a single intersection However itdid not support flexible or dual ring and phase sequencesand it is difficult to understand for most readers due to thelack of the algorithm flow Based on the COP algorithm theauthors in [7] presented a real-time adaptive traffic controlalgorithm by utilizing data from connected vehicles tooptimize the phase sequence However they did not providethe details of the algorithm Compared with [6 7] Algo-rithm 1 is the first algorithm that provides a complete anddetailed flow of signal planning

In Table 2 we list the meanings of the mathematicalsymbols that appear in the two algorithms

+e design of the COP algorithm uses the collaborationof two-stage planning and operation +e COP algorithmplans signals for the next-stage based on the vehiclersquos esti-mation and such planned signal duration will be operated atthe next-stage signal control time +us this is a continuousalternate process in a fixed phase sequence which meansthat the I-SIG system cannot change the order and durationof phases in the current stage since this is set in the previousstage When bringing foresight of planning such a designalso opens the door to attack signal planning in order toaffect next-stage operation continuously

+e spoofing of the arrival table affects the variables Atk

and planPrp and the later calculation of Delayr in line 19 ofAlgorithms 1 +e change in Delayr causes the variables

optVr in line 21 optGr0 in line 22 and optGr1 in line 23 ofAlgorithms 1 to change as well Finally the outputs planPrpoptGrp xlowastj and vj are changed

33 High-Level Image Feature-Based Congestion AttackPrediction In this subsection we employ an image feature-based CycleGAN to explain the relationship between thephase where the spoofed vehicle is located and the con-gestion image features two stages later

As mentioned in the Data Spoofing+reat section thereare two data spoofing attack strategies but either attack willcause congestion in a period Different phases of spoofedvehicles lead to different congestion effects +erefore theimage features of intersection congestion are also different+e CycleGAN model can mine the potential relationshipbetween two different types (X and Y) of images +roughtraining CycleGAN can generate the corresponding imagesY according to X and generate the related images Xaccording to Y +erefore we utilize the CycleGANmodel topredict the congestion effects according to the phases ofspoofed vehicles which were considered the attack featurein order to reveal the relationship between the phase of thespoofed vehicle and the caused congestion image feature

+e CycleGAN architecture is illustrated in Figure 5One training sample is a pair of image xi and image yi toform (xi yi) xi isin X and yi isin Y xi refers to the processedtraffic image at the spoofing time and yi is the processedtraffic congestion image two stages later +e image pro-cessing consists of three steps (1) filter out environmentbackground (2) extract four images in which each has 2phases at one intersection and (3) join these four images

1 6

2 574

83

Add a spoofed vehicle at any location as a late

arrived one

The equipped vehicles

The unequipped vehicles

(a)

e equipped vehicles

e unequipped vehicles

1 6

2 574

83

Add a spoofed vehicle at the location with maximum increased queue length

Queuingregion

Increased queue length

(b)

Figure 4 Two strategies of congestion data spoofing attack PR is short for penetration rate (a) Direct attack on arrival table withoutconsidering PR (b) Indirect attack on EVLS when PR is less than 95

Security and Communication Networks 5

+e plan of the optimal green duration and phase sequenceRequire Atk G

rp

min Grpmax R T

(1) Set j 0 vj 0(2) Xmin

j max G00min + R + G01

min + R G10min + R + G11

min + R1113966 1113967

(3) Xmaxj min G00

max + R + G01max + R G10

max + R + G11max + R1113864 1113865

(4) Tprime T minus sjminus 1 minus middot middot middot minus s1(5) for r 0 1 do(6) plan Pr0 1 + jlowast 2 + rlowast 4(7) plan Pr1 2 + jlowast 2 + rlowast 4(8) end for(9) for sj 1 T do(10) if sj geXmin

j and sj lemin Xmaxj Tprime1113966 1113967 then

(11) xj sj

(12) effect G0 effect G1 sj minus 2R

(13) optV0 optV1 999990(14) for r 0 1 do(15) for i Gr0

min Gr0max do

(16) tGr0 i

(17) tGr1 effectGr minus tGr0(18) if tGr1 geGr1

min and tGr1 leGr1max then

(19) Delayr f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Calculated by Algorithm 2(20) if Delayr lt optVr then(21) optVr Delayr

(22) optGr0 tGr0(23) optGr1 tGr1(24) end if(25) end if(26) end for(27) end for(28) else(29) vj 99999(30) xj 0(31) end if(32) end for(33) xlowastj optGr0 + R + optGr1 + R

(34) fj(xlowastj ) optV0 + optV1(35) vj fj(xlowastj ) + vjminus 1

Ensure planPrp optGrp xlowastj vj

(36) if jlt 2 then(37) j j + 1 go to step 2(38) end if

ALGORITHM 1 +e COP algorithm

+e delay calculation of ring r at stage j f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Require r p1 p2 g1 g2 xj Atk

(1) l0p1 A0p1 l0p2 A0p2(2) for i 1 xj do(3) lip1 liminus 1p1 minus Dip1 + Aip1(4) lip2 liminus 1p2 minus Dip2 + Aip2(5) di lip1 + lip2(6) end for(7) st

Dip1 1 if ileg1 and (i + 1)2 0

0 if g1lt ilexj1113896

Dip2 1 if (g1 + R)lt ile (g1 + R + g2) and (i + 1)2 0

0 i (g1 + R + g2)lt ilexj and ile (g1 + R)1113896

(8) Delayr 1113936xj

i1 di

Ensure Delayr

ALGORITHM 2 +e delay calculation algorithm

6 Security and Communication Networks

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 4: Congestion Attack Detection in Intelligent Traffic Signal

transition period (PR lt 95) +e attacker changes thelocation and speed information in vehicle BSMs to alter thevehiclersquos arrival time and requested phase thus the corre-sponding arrival table elements in Table 1 are changed +isattack strategy can directly attack input data flow no matterwhat the PR is As shown in Figure 4(a) the attacker adds aspoofed vehicle into the original vehicle queue at any lo-cation +e insertion of a spoofed vehicle makes the queuelonger Moreover there is an increase in the duration of thegreen light allocated by the COP algorithm for the currentphase which delays the next start time of the green light ofall phases thus increasing the delay for vehicles to passthrough the intersection

+e second strategy is for queue-length spoofing forthe transition period only +is strategy aims to extend the

queue length estimated by the EVLS algorithm bychanging the location and speed values in BSMsFigure 4(b) shows that the attacker adds a stopped vehiclewith the farthest distance to the stop bar Owing to theEVLS algorithm estimating the queue length based on thelocation of the last stopped connected vehicle this attackcauses the estimated queue length Les calculated byequation (2) to increase +erefore the number of vehiclesin the queue N0i calculated by equation (3) increases aswell

32 Planning-Level CongestionAnalysis +eCOP algorithmis responsible for traffic signal planning thus it is essentialfor planning-level congestion analysis of I-SIG

Trajectory collection

Signal planning

COP EVLSPhase signal

controller

Signalling

Signal status

Arrival table

BSMPreprocessingOBU ę

Trajectorydata

Figure 2 Data flow of the I-SIG system

Table 1 Arrival table Numbers 1 to 8 are phases and T0 to TM are the remaining arrival time of vehicles

Phase 1 2 middot middot middot 8T0 N01 N02 middot middot middot N08T1 N11 N12 middot middot middot N18T2 N21 N22 middot middot middot N28middot middot middot middot middot middot middot middot middot middot middot middot middot middot middot

TM NM1 NM2 middot middot middot NM8

1 6

2 5

7

4

83I-SIG

Figure 3 I-SIG signal control scenario including 8 phases

4 Security and Communication Networks

+rough reading the published COP-related papers[6 7] and analyzing the implementation code we reveal amore complete and detailed COP algorithm for the first time(Algorithms 1 and 2) +e authors in [6] first proposed aCOP algorithm that allows optimization of various per-formance indices including delay stops and queue lengthsfor the optimal control of a single intersection However itdid not support flexible or dual ring and phase sequencesand it is difficult to understand for most readers due to thelack of the algorithm flow Based on the COP algorithm theauthors in [7] presented a real-time adaptive traffic controlalgorithm by utilizing data from connected vehicles tooptimize the phase sequence However they did not providethe details of the algorithm Compared with [6 7] Algo-rithm 1 is the first algorithm that provides a complete anddetailed flow of signal planning

In Table 2 we list the meanings of the mathematicalsymbols that appear in the two algorithms

+e design of the COP algorithm uses the collaborationof two-stage planning and operation +e COP algorithmplans signals for the next-stage based on the vehiclersquos esti-mation and such planned signal duration will be operated atthe next-stage signal control time +us this is a continuousalternate process in a fixed phase sequence which meansthat the I-SIG system cannot change the order and durationof phases in the current stage since this is set in the previousstage When bringing foresight of planning such a designalso opens the door to attack signal planning in order toaffect next-stage operation continuously

+e spoofing of the arrival table affects the variables Atk

and planPrp and the later calculation of Delayr in line 19 ofAlgorithms 1 +e change in Delayr causes the variables

optVr in line 21 optGr0 in line 22 and optGr1 in line 23 ofAlgorithms 1 to change as well Finally the outputs planPrpoptGrp xlowastj and vj are changed

33 High-Level Image Feature-Based Congestion AttackPrediction In this subsection we employ an image feature-based CycleGAN to explain the relationship between thephase where the spoofed vehicle is located and the con-gestion image features two stages later

As mentioned in the Data Spoofing+reat section thereare two data spoofing attack strategies but either attack willcause congestion in a period Different phases of spoofedvehicles lead to different congestion effects +erefore theimage features of intersection congestion are also different+e CycleGAN model can mine the potential relationshipbetween two different types (X and Y) of images +roughtraining CycleGAN can generate the corresponding imagesY according to X and generate the related images Xaccording to Y +erefore we utilize the CycleGANmodel topredict the congestion effects according to the phases ofspoofed vehicles which were considered the attack featurein order to reveal the relationship between the phase of thespoofed vehicle and the caused congestion image feature

+e CycleGAN architecture is illustrated in Figure 5One training sample is a pair of image xi and image yi toform (xi yi) xi isin X and yi isin Y xi refers to the processedtraffic image at the spoofing time and yi is the processedtraffic congestion image two stages later +e image pro-cessing consists of three steps (1) filter out environmentbackground (2) extract four images in which each has 2phases at one intersection and (3) join these four images

1 6

2 574

83

Add a spoofed vehicle at any location as a late

arrived one

The equipped vehicles

The unequipped vehicles

(a)

e equipped vehicles

e unequipped vehicles

1 6

2 574

83

Add a spoofed vehicle at the location with maximum increased queue length

Queuingregion

Increased queue length

(b)

Figure 4 Two strategies of congestion data spoofing attack PR is short for penetration rate (a) Direct attack on arrival table withoutconsidering PR (b) Indirect attack on EVLS when PR is less than 95

Security and Communication Networks 5

+e plan of the optimal green duration and phase sequenceRequire Atk G

rp

min Grpmax R T

(1) Set j 0 vj 0(2) Xmin

j max G00min + R + G01

min + R G10min + R + G11

min + R1113966 1113967

(3) Xmaxj min G00

max + R + G01max + R G10

max + R + G11max + R1113864 1113865

(4) Tprime T minus sjminus 1 minus middot middot middot minus s1(5) for r 0 1 do(6) plan Pr0 1 + jlowast 2 + rlowast 4(7) plan Pr1 2 + jlowast 2 + rlowast 4(8) end for(9) for sj 1 T do(10) if sj geXmin

j and sj lemin Xmaxj Tprime1113966 1113967 then

(11) xj sj

(12) effect G0 effect G1 sj minus 2R

(13) optV0 optV1 999990(14) for r 0 1 do(15) for i Gr0

min Gr0max do

(16) tGr0 i

(17) tGr1 effectGr minus tGr0(18) if tGr1 geGr1

min and tGr1 leGr1max then

(19) Delayr f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Calculated by Algorithm 2(20) if Delayr lt optVr then(21) optVr Delayr

(22) optGr0 tGr0(23) optGr1 tGr1(24) end if(25) end if(26) end for(27) end for(28) else(29) vj 99999(30) xj 0(31) end if(32) end for(33) xlowastj optGr0 + R + optGr1 + R

(34) fj(xlowastj ) optV0 + optV1(35) vj fj(xlowastj ) + vjminus 1

Ensure planPrp optGrp xlowastj vj

(36) if jlt 2 then(37) j j + 1 go to step 2(38) end if

ALGORITHM 1 +e COP algorithm

+e delay calculation of ring r at stage j f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Require r p1 p2 g1 g2 xj Atk

(1) l0p1 A0p1 l0p2 A0p2(2) for i 1 xj do(3) lip1 liminus 1p1 minus Dip1 + Aip1(4) lip2 liminus 1p2 minus Dip2 + Aip2(5) di lip1 + lip2(6) end for(7) st

Dip1 1 if ileg1 and (i + 1)2 0

0 if g1lt ilexj1113896

Dip2 1 if (g1 + R)lt ile (g1 + R + g2) and (i + 1)2 0

0 i (g1 + R + g2)lt ilexj and ile (g1 + R)1113896

(8) Delayr 1113936xj

i1 di

Ensure Delayr

ALGORITHM 2 +e delay calculation algorithm

6 Security and Communication Networks

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 5: Congestion Attack Detection in Intelligent Traffic Signal

+rough reading the published COP-related papers[6 7] and analyzing the implementation code we reveal amore complete and detailed COP algorithm for the first time(Algorithms 1 and 2) +e authors in [6] first proposed aCOP algorithm that allows optimization of various per-formance indices including delay stops and queue lengthsfor the optimal control of a single intersection However itdid not support flexible or dual ring and phase sequencesand it is difficult to understand for most readers due to thelack of the algorithm flow Based on the COP algorithm theauthors in [7] presented a real-time adaptive traffic controlalgorithm by utilizing data from connected vehicles tooptimize the phase sequence However they did not providethe details of the algorithm Compared with [6 7] Algo-rithm 1 is the first algorithm that provides a complete anddetailed flow of signal planning

In Table 2 we list the meanings of the mathematicalsymbols that appear in the two algorithms

+e design of the COP algorithm uses the collaborationof two-stage planning and operation +e COP algorithmplans signals for the next-stage based on the vehiclersquos esti-mation and such planned signal duration will be operated atthe next-stage signal control time +us this is a continuousalternate process in a fixed phase sequence which meansthat the I-SIG system cannot change the order and durationof phases in the current stage since this is set in the previousstage When bringing foresight of planning such a designalso opens the door to attack signal planning in order toaffect next-stage operation continuously

+e spoofing of the arrival table affects the variables Atk

and planPrp and the later calculation of Delayr in line 19 ofAlgorithms 1 +e change in Delayr causes the variables

optVr in line 21 optGr0 in line 22 and optGr1 in line 23 ofAlgorithms 1 to change as well Finally the outputs planPrpoptGrp xlowastj and vj are changed

33 High-Level Image Feature-Based Congestion AttackPrediction In this subsection we employ an image feature-based CycleGAN to explain the relationship between thephase where the spoofed vehicle is located and the con-gestion image features two stages later

As mentioned in the Data Spoofing+reat section thereare two data spoofing attack strategies but either attack willcause congestion in a period Different phases of spoofedvehicles lead to different congestion effects +erefore theimage features of intersection congestion are also different+e CycleGAN model can mine the potential relationshipbetween two different types (X and Y) of images +roughtraining CycleGAN can generate the corresponding imagesY according to X and generate the related images Xaccording to Y +erefore we utilize the CycleGANmodel topredict the congestion effects according to the phases ofspoofed vehicles which were considered the attack featurein order to reveal the relationship between the phase of thespoofed vehicle and the caused congestion image feature

+e CycleGAN architecture is illustrated in Figure 5One training sample is a pair of image xi and image yi toform (xi yi) xi isin X and yi isin Y xi refers to the processedtraffic image at the spoofing time and yi is the processedtraffic congestion image two stages later +e image pro-cessing consists of three steps (1) filter out environmentbackground (2) extract four images in which each has 2phases at one intersection and (3) join these four images

1 6

2 574

83

Add a spoofed vehicle at any location as a late

arrived one

The equipped vehicles

The unequipped vehicles

(a)

e equipped vehicles

e unequipped vehicles

1 6

2 574

83

Add a spoofed vehicle at the location with maximum increased queue length

Queuingregion

Increased queue length

(b)

Figure 4 Two strategies of congestion data spoofing attack PR is short for penetration rate (a) Direct attack on arrival table withoutconsidering PR (b) Indirect attack on EVLS when PR is less than 95

Security and Communication Networks 5

+e plan of the optimal green duration and phase sequenceRequire Atk G

rp

min Grpmax R T

(1) Set j 0 vj 0(2) Xmin

j max G00min + R + G01

min + R G10min + R + G11

min + R1113966 1113967

(3) Xmaxj min G00

max + R + G01max + R G10

max + R + G11max + R1113864 1113865

(4) Tprime T minus sjminus 1 minus middot middot middot minus s1(5) for r 0 1 do(6) plan Pr0 1 + jlowast 2 + rlowast 4(7) plan Pr1 2 + jlowast 2 + rlowast 4(8) end for(9) for sj 1 T do(10) if sj geXmin

j and sj lemin Xmaxj Tprime1113966 1113967 then

(11) xj sj

(12) effect G0 effect G1 sj minus 2R

(13) optV0 optV1 999990(14) for r 0 1 do(15) for i Gr0

min Gr0max do

(16) tGr0 i

(17) tGr1 effectGr minus tGr0(18) if tGr1 geGr1

min and tGr1 leGr1max then

(19) Delayr f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Calculated by Algorithm 2(20) if Delayr lt optVr then(21) optVr Delayr

(22) optGr0 tGr0(23) optGr1 tGr1(24) end if(25) end if(26) end for(27) end for(28) else(29) vj 99999(30) xj 0(31) end if(32) end for(33) xlowastj optGr0 + R + optGr1 + R

(34) fj(xlowastj ) optV0 + optV1(35) vj fj(xlowastj ) + vjminus 1

Ensure planPrp optGrp xlowastj vj

(36) if jlt 2 then(37) j j + 1 go to step 2(38) end if

ALGORITHM 1 +e COP algorithm

+e delay calculation of ring r at stage j f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Require r p1 p2 g1 g2 xj Atk

(1) l0p1 A0p1 l0p2 A0p2(2) for i 1 xj do(3) lip1 liminus 1p1 minus Dip1 + Aip1(4) lip2 liminus 1p2 minus Dip2 + Aip2(5) di lip1 + lip2(6) end for(7) st

Dip1 1 if ileg1 and (i + 1)2 0

0 if g1lt ilexj1113896

Dip2 1 if (g1 + R)lt ile (g1 + R + g2) and (i + 1)2 0

0 i (g1 + R + g2)lt ilexj and ile (g1 + R)1113896

(8) Delayr 1113936xj

i1 di

Ensure Delayr

ALGORITHM 2 +e delay calculation algorithm

6 Security and Communication Networks

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 6: Congestion Attack Detection in Intelligent Traffic Signal

+e plan of the optimal green duration and phase sequenceRequire Atk G

rp

min Grpmax R T

(1) Set j 0 vj 0(2) Xmin

j max G00min + R + G01

min + R G10min + R + G11

min + R1113966 1113967

(3) Xmaxj min G00

max + R + G01max + R G10

max + R + G11max + R1113864 1113865

(4) Tprime T minus sjminus 1 minus middot middot middot minus s1(5) for r 0 1 do(6) plan Pr0 1 + jlowast 2 + rlowast 4(7) plan Pr1 2 + jlowast 2 + rlowast 4(8) end for(9) for sj 1 T do(10) if sj geXmin

j and sj lemin Xmaxj Tprime1113966 1113967 then

(11) xj sj

(12) effect G0 effect G1 sj minus 2R

(13) optV0 optV1 999990(14) for r 0 1 do(15) for i Gr0

min Gr0max do

(16) tGr0 i

(17) tGr1 effectGr minus tGr0(18) if tGr1 geGr1

min and tGr1 leGr1max then

(19) Delayr f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Calculated by Algorithm 2(20) if Delayr lt optVr then(21) optVr Delayr

(22) optGr0 tGr0(23) optGr1 tGr1(24) end if(25) end if(26) end for(27) end for(28) else(29) vj 99999(30) xj 0(31) end if(32) end for(33) xlowastj optGr0 + R + optGr1 + R

(34) fj(xlowastj ) optV0 + optV1(35) vj fj(xlowastj ) + vjminus 1

Ensure planPrp optGrp xlowastj vj

(36) if jlt 2 then(37) j j + 1 go to step 2(38) end if

ALGORITHM 1 +e COP algorithm

+e delay calculation of ring r at stage j f(r planPr0 planPr1 tGr0 tGr1 xj Atk)

Require r p1 p2 g1 g2 xj Atk

(1) l0p1 A0p1 l0p2 A0p2(2) for i 1 xj do(3) lip1 liminus 1p1 minus Dip1 + Aip1(4) lip2 liminus 1p2 minus Dip2 + Aip2(5) di lip1 + lip2(6) end for(7) st

Dip1 1 if ileg1 and (i + 1)2 0

0 if g1lt ilexj1113896

Dip2 1 if (g1 + R)lt ile (g1 + R + g2) and (i + 1)2 0

0 i (g1 + R + g2)lt ilexj and ile (g1 + R)1113896

(8) Delayr 1113936xj

i1 di

Ensure Delayr

ALGORITHM 2 +e delay calculation algorithm

6 Security and Communication Networks

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 7: Congestion Attack Detection in Intelligent Traffic Signal

from top to bottom to form one sample image according tothe phase order of phase (47) phase (83) phase (25) andphase (61) Here the number of phases is consistent withthat shown in Figure 3 which is joined by the four parts of anintersection from top to down to form one sample image

+ere are four neural networks in the CycleGAN ar-chitecture two generative networks (G and F) and twodiscriminant networks (DX and DY) +e generator Ggenerates a fake image 1113957y which is similar to y given realimage x ie G X⟶ Y Meanwhile F generates a fakeimage 1113957x which is similar to x given real image y ieF Y⟶ X +e adversarial discriminator DX aims todistinguish whether the input image is x and outputsprobability P(x) SimilarlyDY aims to discriminate whetherthe input image is y and outputs probability P(y)

For x isin X x⟶ G(x)⟶ F(G(x)) asymp x is a cyclecalled forward cycle consistency Similarly for y isin Y

y⟶ F(y)⟶ G(F(y)) asymp y is a cycle called backwardcycle consistency +ere are two kinds of losses adversarialloss and cycle consistency loss Adversarial loss can onlyguarantee that the samples generated by the generator aredistributed with the real samples but we want the imagesbetween the corresponding domains to correspond one byone +at is X-Y-X can also be migrated back So forwardcycle consistency and backward cycle consistency are used tomake the samples generated by two generators not con-tradict each other

Adversarial Loss +is refers to the difference in datasetdistribution between generated images and correspondingreal images For discriminators DX and DY the closer theoutput value is to 1 the smaller the loss is

+e losses of G and F can be calculated as LG and LFrespectively as follows

LG G DY X Y( 1113857 EysimPdata(y) logDY(y)1113858 1113859 + ExsimPdata(x) log 1 minus DY(G(x))( 11138571113858 1113859 (4)

LF F DX Y X( 1113857 ExsimPdata(x) logDX(x)1113858 1113859 + EysimPdata(y) log 1 minus DY(G(y))( 11138571113858 1113859 (5)

Cycle Consistency Loss +is prevents the learned mappingsG and F from contradicting each other making F(G(x)) asymp x

and (F(y)) asymp y +e loss of Lcyc(G F) is calculated by thefollowing equation

Lcyc(G F) ExsimPdata(x) F(G(x)) minus x11113858 1113859

+ EysimPdata(y) G(F(y)) minus y11113858 1113859(6)

+e total loss for CycleGAN is

Table 2 Mathematical symbols used in the COP algorithm

t Index of arrival time k Global phase index

Atk

Element of arrival table denoting the number of vehicle arrivals forphase k at time t p Local phase index

r Ring index in each stage Grp

min Minimum green time of phase p in ring rGrpmax Maximum green time of phase p in ring r R Duration of yellow light and red light

T Total number of discrete time steps in the planning horizon in seconds j Index of stage

vj

Value function given state j which represents the accumulatedperformance measure for the current and all previous stages Xmin

j Minimum possible length of stage j

Xmaxj Maximum possible length of stage j sj

State variable denoting the total number oftime steps allocated to stage j

PlanPrp

Planned phase of phase p in ring r EffectGr

Effective total green light time of ring r instage j

Opt Vr Optimal delay of ring r in stage j Opt Grp Optimal green duration of phase p in ring rxj Length of stage j under the optimal solution xlowastj Length of stage j under the optimal solution

fj(xj) Performance measure at stage j likNumber of vehicle departing for phase k at

time tDik Number of vehicle departing for phase k at time t Delayr Delay of ring r at stage j

Table 3 Feature composition schema through selecting equal features from traffic flow head and tail

Flow head (10 s) Flow tail (10 s)Macrofeatures CR αCR βCR ICD αICD βIC D CR αCR βCR ICD αICD βICD

MicrofeaturesPCD1 PCD2 PCD8 PCD1 PCD2 PCD8αPCD1

αPCD2 αPCD8

αPCD1 αPCD2

αPCD8

βPCD1 βPCD2

βPCD8βPCD1

βPCD2 βPCD8

Security and Communication Networks 7

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 8: Congestion Attack Detection in Intelligent Traffic Signal

L G F DX DY( 1113857 LG G DY X Y( 1113857 + LF F DX Y X( 1113857

+ λLcyc(G F)(7)

in which λ is an important parameter +en the objectivefunction of the CycleGAN is defined as follows

Glowast Flowast

arg minGF

maxDXDY

L G F DX DY( 1113857 (8)

+e detailed implementation of neural networks inCycleGAN will be described in the following experimentsetup

34 Traffic Flow Feature-Based Congestion AttackVerification In this subsection we use a deep learning-based decision tree model TGRU to explain the relationshipbetween the traffic flow features and the congestion attack+is relationship can then be used to verify if congestion isoccurring +e TGRU model is an interpretable depth time-series model which is very suitable for intersection trafficflow features with time characteristics At the same timeinterpretability helps to analyze better the relationship be-tween traffic flow features and the congestion attack

+e input of the congestion prediction is the traffic imagefeature of the intersection and the prediction model outputsthe congestion affects two stages later according to the imagefeature which indicates that whether the congestion willoccur However after the congestion prediction the veri-fication model is used to verify whether the congestionattack is occurring +e verification input is the definedtraffic flow feature that is calculated according to vehiclesrsquoinformation When we find a possible attack that can causecongestion with high probability and subsequent trafficflows also verify congestion then we can predict there existsa congestion attack +us we can realize timely and accurate

congestion attack detection by integrating empirical pre-diction and analytical verification

Feature Definition Based on Traffic Flow To measure thecongestion effects caused by spoofed vehicles we proposecapacity ratio and congestion degree as well as an attackacceleration and attack amplification ratio based on capacityratio and congestion degree We define features as follows

(1) Vehicle Capacity Ratio (CR) Cmaxk is the maximum

vehicle capacity of each phase and the vehicle ca-pacity of all 8 phases is computed asCmaxtotal 1113936

8k1 Cmax

k +en the vehicle CR can bedenoted by CR 1113936

8k1 NkCmax

total where Nk is thevehicle number of the kth phase

(2) Congestion Degree (CD) +e number of vehiclesqueuing in the kth phase is denoted as Qk Qnormal isthe number of vehicles during normal queuing and isa constant +en the CD of the kth phase can becomputed by PCDk QkQnormal and the global CDfor an intersection is ICD 1113936

8k1 PCDk

(3) Attack Acceleration Let t0 be the start time of thedata spoofing attack +en the accelerations of CRPCDk and ICD at time t are respectively calculatedby αCR(t) (CR(t) minus CR(t0))(t minus t0) αPCD(t k)

(PCD(t k) minus PCD(t0 k))(t minus t0) and αICD(t)

(ICD(t) minus ICD(t0))(t minus t0)

(4) Attack Amplification Ratio Let t0 be the start time ofthe data spoofing attack +en the amplificationratio of CR PCDk and ICD at time t is respectivelycalculated by βCR(t) CR(t)CR(t0) βPCD(t k)

PCD(t k)PCD(t0 k) and βICD(t) ICD

(t)ICD(t0)

Forward cycle consistency

P (y) P (x)DY DX

G

G

F

F

Backward cycle consistency

y~

x~

x

y

Figure 5 CycleGAN architecture

8 Security and Communication Networks

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 9: Congestion Attack Detection in Intelligent Traffic Signal

Features are divided into macrofeatures and micro-features for the sake of discussing interpretability dependingon whether they are a feature of the whole intersection or aspecific phase (Table 3) Macrofeatures measure the con-gestion characteristics of the whole intersection andmicrofeatures measure the phase of a single signal phaseUnlike the traditional traffic flow characteristics such astraffic flow traffic density and speed the traffic flow featureswe defined are related to attacks and are divided into thefeatures for all single signal phases and the features of thewhole intersection For a traffic flow of 1800 seconds weonly sample the first 10 seconds of flow head and the last 10seconds of flow tail For flow head we choose features ofmacro micro or both and then we choose the same featuresfrom the flow tail +erefore the number of features is from20 to 600 We use the Z-score as a standardization to adjustfeature values For values (x1 x2 xn) of one feature in allsamples the new value is computed by xprime xi minus xs inwhich s is the standard deviation and x is the mean value of(x1 x2 xn)

TGRU Model We try data spoofing exhaustedly using thelast vehicle and collect time-sequence samples For such

data we use TGRU a time-series model with decision treeregularization for interpretability Figure 6 shows the TGRUarchitecture for end-to-end calculation

+ere are four main calculators sigm tanh plus andHadamard product Sigm refers to the sigmoid function andtanh refers to the hyperbolic tangent function +e objectivefunction is as follows

minW

λψ(W) + 1113944N

n11113944

T

t1loss ynt 1113957ynt xn W( 1113857( 1113857⎛⎝ ⎞⎠ (9)

where λ (λgt 0) is the regularization strengthW is the wholeparameter space N is the sample number and T denotes asampling frequency in one series+e logistic loss function isbinary cross entropy

Next a single binary decision tree that accurately re-produces the networkrsquos thresholded binary predictions 1113957yn

given input xn is found+en the complexity of this decisiontree as the output of Ω(W) is measured +e complexity ismeasured by the average decision path length ie the av-erage number of decision nodes that must be touched tomake a prediction for an input example xn A regularizationfunction 1113957Ω(W) is used to map W to an estimate of the

ht-1

rt

yt

ht

zt

xt

1-zt

hprimet

sigm

sigm

tanh

λ

PlusHadamard product

Figure 6 TGRU architecture

Table 4 Experimental environment configuration

Platform Experimental environment Environmental configuration

COP and VISSIM

Operating system Windows 10CPU AMD Ryzen5 3550H with Radeon Vega Mobile Gfx 210GHzRAM 16G

Software PTV VISSIM 430 Visual Studio 2019

TGRU and CycleGAN

Operating system Ubuntu 16046 LTSCPU Intel(R) Core(TM) i7-9700F CPU 300GHzRAM 32GGPU MSI GeForce RTX 2070 VENTUS

Graphic memory 151MiBFramework TensorFlow-gpu-1140

Security and Communication Networks 9

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 10: Congestion Attack Detection in Intelligent Traffic Signal

average path length and is implemented by a multilayerperception (MLP) approximator

+en tree regularization is conducted and its objectivefunction is defined as follows

minξ

1113944

J

i1Ω Wj1113872 1113873 minus 1113957Ω Wj ξ1113872 11138731113872 1113873

2⎛⎝ ⎞⎠ + λξ22 (10)

where J is the size of the candidate dataset of W and vector ξdenotes the parameters of this chosen MLP approximator

4 Experiment

41 Setup +e platform and experimental environmentconfiguration are shown in Table 4 We use a PC to run theCOP algorithm and VISSIM for real-time traffic flow signalcontrol and corresponding traffic simulation We use an-other GPU server for both TGRU and CycleGAN training

VISSIM the traffic simulation platform can capture anddisplay the changes of traffic signal and traffic flow plannedby the COP algorithm in real-time as shown in Figure 7Table 5 shows the sample datasets for TGRU and CycleGANIn TGRU we train a 3-layer MLP with 100 first-layer nodes100 second-layer nodes and 10 third-layer nodes In theCycleGAN the generator contains encoding transforma-tion and decoding Encoding includes one 7times 7 Convolu-tion-InstanceNorm-ReLU layer with stride 1 and two 3times 3Convolution-InstanceNorm-ReLU layers with stride 2Transformation includes 9 residual blocks for 256times 256images and two 3times 3 convolutional layers with the samenumber of filters on both layers Finally decoding includestwo 3 times 3 fractional strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7times 7 Convolution-InstanceNorm-ReLU layer with stride 1 In the discrimi-nator networks we use 70times 70 PatchGANs [17] and the

discriminator architecture includes four 4times 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2 +e lastlayer contains a convolution to produce a 1-dimensionaloutput

42 Congestion Attack Prediction and Visualized AnalysisWe evaluate the performance of the CycleGANmodel basedon image features

EvaluationMetric ForN samples testing we further evaluatethe CR PCD and ICD based on the mean absolute error(MAE) and root mean squared error (RMSE)We haveMAEand RMSE of CR expressed as follows

MAECR 1N

1113944

N

i1CR

iminus

1113958CR

i111386811138681113868111386811138681113868

111386811138681113868111386811138681113868

RMSECR

1N

1113944

N

i1CR

iminus 1113958CRi1113872 1113873

2

11139741113972

(11)

where CRi is the real value and 1113958CRi is the estimated valueSimilarly we have MAEPCDk

RMSEPCDk MAEICD and

RMSEICD

Visualized Results and Quantitative Qnalysis In Figure 8 thefirst column is the original image x the second one is theoutput imageG(x) by CycleGAN and the third column givesthe real image y with congestion Our approach has a sat-isfied generator and can predict a future result of congestionattacks to provide a visualization for better humanunderstanding

Table 6ndash8 show MAE and RMSE values under differentevaluation metrics and test sets Tables 6 and 7 show the CR

Figure 7 VISSIM simulation environment+e planned results of the COP algorithm and the real-time traffic flow are displayed in VISSIM

Table 5 Sample datasets for TGRU and CycleGAN

TGRU Feature number 32Sample number 610

CycleGAN Image (256times 256 pixels) number of X 2238Image (256times 256 pixels) number of Y 2238

10 Security and Communication Networks

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 11: Congestion Attack Detection in Intelligent Traffic Signal

and CD measurements of one intersection and display asatisfying prediction compared with the ground truth Asshown in Table 6 in 10-fold cross validation our CycleGANhas a pretty good performance in CR prediction It has verysmall MAE and RMSE values 00205 and 00225 respec-tively which is better than that obtained with 4-fold crossvalidation

For ICD (Table 7) the 4-fold cross validation results ofMAE and RMSE are better than those of 10-fold crossvalidation reaching 08100 and 09987 respectively Wepresent the detailed values of each phase forMAE and RMSEof congestion degree in Table 8 We can see that throughcomparing values based on the training set and cross vali-dation our CycleGAN-based model does not overfit by

training +e best results are at k 3 and we have the lowestvalues of MAEPCDk

and RMSEPCDk(02250 02050 02050

02617 02519 and 02360) compared with the values ofother phases However the errors at k 5 increase a lotwhich is why MAEICD and RMSEICD approach 1 +is isbecause the fewer the vehicles the better the prediction effectof the model However the attack vehicle is at phase 3 whichhas the least number of queues while the congestion occursat phase 5 which has the largest number of queues+erefore the prediction effect of phase 3 is the best of the 8phases so the lowest MAE and RMSE values are k 3 whilethe prediction errors at phase 5 increased a lot

We present bar charts for MAE and RMSE of 8-phasecongestion degree in Figures 9 and 10 respectively In

Input x Output G (x) Real y

Figure 8 Visualized CycleGAN output compared with real traffic image two stages later based on three different original image inputs at thebeginning of spoofing attack +e blue dots represent OBU-equipped vehicles whereas the while dots represent unequipped vehicles

Table 6 MAECRand RMSECR obtained on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAECR 00257 00213 00205RMSECR 00310 00256 00225+e bold values denote the minimum values of MAE or RMSE on different training sets

Security and Communication Networks 11

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 12: Congestion Attack Detection in Intelligent Traffic Signal

Figure 9 the best average value of MAE is based on 10-foldcross validation (with a value of 03844) and the worstaverage value is based on 4-fold cross validation (with avalue of 04481) In Figure 10 we have similar results forRMSE the best and the worst are 04471 and 05491 re-spectively Both average values mean that the CycleGAN isrobust and that we have good feature capture in ourapproach

In addition we compare the performance of theCycleGAN model with that of pix2pix [17] another GAN-based model by quantitatively analyzing experimental re-sults from the whole intersection and the specific phasesperspective respectively Here we use the experimentalresults under 4-fold cross validation For the measurementsof the whole intersection as shown in Table 9 we haveMAECR 00213 RMSECR 00256 MAEICD 08100 andRMSEICD 09987 for CycleGAN and MAECR 01167RMSECR 01297 MAEICD 37917 and RMSEIC D

38500 for pix2pix We can see that CycleGAN has lowerMAE and RMSE values than pix2pix +erefore theCycleGAN model has a better performance than the pix2pixmodel on the measurements of the whole intersection

We further compare the model performance for thespecific phases Table 10 shows the detailed MAE and RMSEvalues of each phase for CycleGAN and pix2pix +ere arethe lowest MAE and RMSE values at k 3 for both models02050 and 02538 for CycleGAN and 02519 and 09830 forpix2pix For all phases the MAE and RMSE values ofCycleGAN are lower than those of pix2pix +erefore theCycleGAN model also has a better performance than thepix2pix model on the measurements of all phases

To sum up in the CycleGAN-based prediction modelwe extract four-direction road images of the intersection andperform phase-based composition for generating a newsample image to quantify the traffic flow characteristicsBased on the image feature the CycleGAN-based approachanalyzes the potential relationship between the congestionattack and the corresponding congestion effect two stageslater Also the model is used to analyze the congestioneffects that different phases of the attack vehicle causedMeanwhile we can obtain the visualized results based on theimage feature +e experimental results on the CycleGAN-based model and compared experiments with the pix2pixmodel demonstrated the superiority of the CycleGAN-basedmodel

43 Congestion Attack Verification Here we evaluate theperformance of the TGRU model based on traffic flowfeatures We use the confusion matrix accuracy AUC valueprecision recall and F1-score

+e TGRU model is trained to distinguish whether theintersection is under a spoofing attack based on traffic flowfeatures We collect time-series traffic flow data for3600 seconds under both the normal state and attack stateWe consider 1-second intervals as time steps Each datavector xnt has 30 features as defined in Section 34 Eachoutcome ynt is a binary label marking whether the

intersection is under a spoofing attack +e sequence lengthis set to 20 seconds considering that the maximum greentime of each signal is 20 seconds Hence 360 samples areobtained in total 180 samples of which contain 3600 trafficflow data and are used for training and the other 180 samplesare used for testing

We apply the model to the test set and calculate its AUCvalue accuracy precision recall and F1-score these valuesare reported in Table 11 We see that for different parametersettings the TGRU model with our defined traffic featurescan achieve a great prediction quality+eAUC values are allapproximately 08 and the accuracy values are 079 079and 075 when using different parameter settings Fur-thermore the average values of precision recall and F1-score are satisfying almost near 08

Figure 11 shows the three ROC [18] curves of TGRUCorresponding AUC values are shown as well We can seethat these curves are similar and their AUC values (082085 and 078) are all around 08 Moreover the TGRUmodel has similar performance with different parametersettings this indicates that our defined classification featuresare efficient and the different parameter settings have littleeffect on TGRU modelrsquos performance

+e decision tree generated by Graphviz [19] is shown inFigure 12 For 3600 traffic flows this tree has 9 levels Fromtop to down according to each feature value the flow datacan be grouped into different classes step by step For ex-ample when X [13]le 0068 there are 32 traffic flows of 57flows correctly predicted as the class of spoofing attack 1 thisindicates the importance of the 13th dimension feature iethe congestion degree of the 8th phase PCD8 in predictingthe class of spoofing attack 1

Also we compare the TGRU model with a time-seriesprediction method seasonal autoregressive integratedmoving average (SARIMA) Here it is detected whether thecongestion occurs or not based on traffic flow features Wecarry out experiments for different approaches under dif-ferent traffic flow feature sets We choose the primary trafficflow data for the first feature set as the traffic flow feature FS1+e second feature set FS2 is shown in Table 3 According tothe two approaches we construct two feature sets based onthe traffic flow data we collect As shown in Table 12 theaccuracy values of SARIMA and TGRU on the feature setFS1 are 0744 and 0772 respectively and on the feature setFS2 are 0784 and 0790 respectively which demonstratesthat the TGUR model based on our defined traffic flowfeatures is superior to others

In conclusion in the TGRU-based verificationmodel wepropose some timing characteristics including capacityratio congestion degree attack acceleration and attackamplification ratio to measure the congestion effects basedon traffic flow Based on the defined traffic flow features theTGRU-based model is used to analyze the underlying re-lationship between the congestion attack and traffic flowfeatures at the current moment Meanwhile the decision treehelps better interpret the relationship between traffic flowfeatures and the congestion attack +e experimental resultson the TGRU-based model and compared experiments with

12 Security and Communication Networks

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 13: Congestion Attack Detection in Intelligent Traffic Signal

the SARIMA model demonstrated the superiority of theTGRU-based approach

5 Defense Suggestions

To proactively address the congestion attack of the I-SIGsystem this section discusses how to defend against theattacks assessed above

EVLS Improvement for COP Reinforcement As estimated bythe USDOT [20] I-SIG may take 25ndash30 years to reach a 95PR for intelligent transportation systems +us for I-SIGunder a real low PR I-SIG needs to adopt an EVLS algorithmto estimate non-OBU-equipped vehiclesrsquo location and speedIn the current I-SIG system design the congestion attack on

the COP algorithm utilizes a nonrobust estimation of EVLSHowever it is possible to improve EVLS and thus reinforcethe COP algorithm For single global positioning system(GPS) spoofing we can introduce more collaborationmechanisms from the transportation field such as thecar-following model A natural way to accomplish this is tosignificantly improve queue-length prediction In theexisting EVLS this could be realized by adding a newsoftware module that interacts with the COP algorithmSuch implementation has a low cost and brings little changeto the original COP algorithm

Another problem is the high impact of PR on securitywhich we have to change In the current design when the PRis smaller the impact of the attack on the system is moresignificant because the system cannot accurately obtain the

Table 7 MAEICD and RMSEICD on training set and with 4-fold cross validation or 10-fold cross validation

Training set 4-fold cross validation 10-fold cross validationMAEICD 08350 08100 11800RMSEICD 10500 09987 13609+e bold values denote the minimum values of MAE or RMSE on different training sets

Table 8 MAEPCD and RMSEPCD on training set and with 4-fold cross validation or 10-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

Training set 4-fold cross validation 10-fold cross validation Training set 4-fold cross validation 10-fold cross validation

k 1 05650 04450 06575 06749 05497 07884k 2 03850 05050 03625 05074 06268 04242k 3 02250 02050 02050 02617 02519 02360k 4 03450 03200 02250 04319 03733 02639k 5 08300 09200 05925 09859 11070 05720k 6 03850 05350 04125 05035 06535 05188k 7 03900 03750 03600 04701 04759 04602k 8 04300 02800 02600 04990 03550 03131

0

02

04

06

08

1

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

04481

0384404444

PCD (Training set)PCD (4-CV)PCD (10-CV)

Figure 9 Bar chart of MAEPCDk

Security and Communication Networks 13

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 14: Congestion Attack Detection in Intelligent Traffic Signal

queue length with fewer data We do not suggest providingtwo alternative versions of EVLS (ie one for high and onefor low PR respectively) Although we analyzed a car-fol-lowing model in work [21] we believe that a more usefulmodel with a collaboration mechanism should be studiedthis will make the estimation of EVLS more accurate as wellas COP security more robust

Authentication and Anomaly Detection In the current designauthentication is realized through communication betweenOBUandRSUHowever the attack vehiclemight not be a newlyjoining vehicle or an unauthenticated vehicle in fact it can be anormal vehicle with legal authentication +us although au-thentication reinforcement is not the solution it can be used to

aid in anomaly detection+e idea is that a vehicle cannot appearsomewhere suddenly from the beginning authentication weshould perform analysis on time-series trajectory data to dis-cover any anomaly behavior this requires a powerful RSU withmore computing ability and storing capacity In addition to ananomaly detection algorithm implementation needs the supportof a collaboration mechanism of multiple I-SIGs this is acomplex global design of intelligent transportation and has notbeen realized yet We believe this is critical work that must beaccomplished before wide I-SIG deployment

Prevent Cold-Start Attack Essentially the congestion at-tack is a type of insider attack +us it is challenging toperform anomaly detection for such an attack in a pretty

K = 2K = 1 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8

AveragePCD (Training set)AveragePCD (4-CV)AveragePCD (10-CV)

05491

0447105418

PCD (Training set)PCD (4-CV)PCD (10-CV)

0

02

04

06

08

1

12

Figure 10 Bar chart of RMSEPCDk

Table 9 MAECR RMSECR MAEICD and RMSEICD of CycleGAN and pix2pix with 4-fold cross validation

CycleGAN pix2pixMAECR 00213 01167RMSECR 00256 01297MAEICD 08100 37917RMSEICD 09987 38500

Table 10 MAEPCD and RMSEPCD of CycleGAN and pix2pix with 4-fold cross validation (k denotes the kth phase)

MAEPCDkRMSEPCDk

CycleGAN pix2pix CycleGAN pix2pix

k 1 04450 29500 05497 32383k 2 05050 09850 06268 11697k 3 02050 02538 02519 09830k 4 03200 17550 03733 27336k 5 09200 33283 11070 34750k 6 05350 11250 06535 11307k 7 03750 19500 04759 23499k 8 02800 03342 03550 05393+e bold values denote the minimum values of MAE or RMSE when k varies from 1 to 8

14 Security and Communication Networks

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 15: Congestion Attack Detection in Intelligent Traffic Signal

short time only based on nearby vehicle speed and lo-cation information +is means that we cannot avoid thefirst spoofing data entering the arrival table of the COP

algorithm We suggest that emerging blockchain tech-nology especially light blockchain should be consideredto rebuild I-SIG or even the whole intelligent

Table 11 TGRU performance in terms of AUC value accuracy precision recall and F1-score

Iteration times AUC AccuracyClassification report

Precision Recall F1-scoreIters_retrain 25

082 0790 normal 071 098 082

Num_iters 300 1 attack 097 059 074Average 084 079 078

Iters_retrain 50085 079

0 normal 072 095 082

Num_iters 1000 1 attack 093 064 076Average 083 079 079

Iters_retrain 100078 075

0 normal 069 090 078

Num_iters 3000 1 attack 086 060 071Average 078 075 075

00

02

04

06

08

10

00 02 04 06 08 10

True

Pos

itive

Rat

e

False Positive Rate

25300 (AUC = 082)501000 (AUC = 085)1003000 (AUC = 078)

Figure 11 ROC curve of TGRU with AUC values

Figure 12 Whole decision tree of 9-level depth

Table 12 Comparison of different prediction approaches

Feature set FS1 FS2SARIMA 0744 0784TGRU 0772 0790+e bold values are the maximum of accuracy when the prediction approach is different

Security and Communication Networks 15

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 16: Congestion Attack Detection in Intelligent Traffic Signal

transportation system After that any data of one nodehave to be verified by all other nodes +is would result innearly no chance for spoofed data to be accepted How-ever the cost of rebuilding the system is obviouslyenormous and more attention should be paid to the lightblockchain to test the trade-off between efficiency andsecurity Regardless we still believe that this is a prom-ising future for I-SIG security defense

6 Related Work

Data Spoofing Attacks in SAGIN +e SAGIN has a het-erogeneous structure including vehicle nodes roadsideinfrastructure mobile terminal users drones airships andother stratospheric nodes as well as high altitude satellitenodes this brings security challenges [22] such as thevarious attacks of authenticity identity confidentiality dataintegrity and privacy [23] As a SAGIN-based intelligenttransportation system deployed in California Florida andNew York by the USDOT I-SIG is exposed to data spoofingattacks [8] which can cause heavy congestion Such anattack is a position-faking attack of GPS spoofing but isdifferent from a tunnel attack In a tunnel attack each ve-hicle of a Vehicular Ad hoc NETwork (VANET) [24 25] isequipped with a positioning system (receiver) +e attackcan be achieved using a transmitter generating localizationsignals stronger than those generated by the real satellites[26 27] +e victim could be waiting for a GPS signal afterleaving a physical tunnel or a jammed-up area In com-parison the position spoofing attack to I-SIG refers to anauthenticated vehicle only sending the wrong position toaffect the COP algorithm which has lower attack cost andeasier implementation In such an attack the data spoofing isjust one factor while themechanism of the COP algorithm isthe key factor Furthermore for the GPS spoofing attack ourwork focuses on algorithm-level security analysis under aspoofing attack

Congestion Attack Analysis +e previous work [8] revealsthe existence of such congestion attacks on the COP algo-rithm It analyzes how congestion attacks affect COP de-cisions and explains how to execute an attack using dataspoofing in SAGIN However it lacks consideration aboutthe potential features and the quantified correlation betweenthe attack and congestion degree In comparison we de-mystify the attack on I-SIG and corresponding congestionfrom a machine learning perspective by exploring differentkinds of features based on both supervised learning andunsupervised learning In addition as the first utilization ofboth traffic flow features and image features our work caninspire all stakeholders of I-SIG including experts oftransportation SAGIN and security

7 Conclusions

Toward the spoofing to connected vehicle technology andthe SAGIN a congestion attack has been revealed on theCOP algorithm of I-SIG which performs dynamic andoptimal signal control based on automatic traffic situation

awareness Owing to the lack of quantified feature-levelanalysis we demystify the attack on I-SIG and the corre-sponding congestion from both supervised learning andunsupervised learning We propose a CycleGAN-basedapproach to analyze the potential relations between thecongestion attack and the corresponding results two stageslater We also present a TGRU-based approach to explorethe relations between the congestion attack and traffic flowfeatures at a certain moment In our experiment we collecthigh-quality 4476 image samples and 3600 attack-orientedtraffic flow data We then evaluate our approach empiricallyusing the COP algorithm and VISSIM and our results showthe effectiveness of our approach compared with groundtruth

+is work is expected to inspire a series of follow-upstudies on the security of CV-based I-SIG but not limited to(1) more machine learning-based approaches (2) moreconcrete defense implementation on SAGIN-based I-SIGand (3) more feature fusion for attack and defense analysis

Data Availability

All data generated or analyzed during this study are ownedby all the authors and will be used to our further research+e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

+e authors declare that they have no conflicts of interest

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under Grant nos 61972025 6180238961672092 U1811264 and 61966009 the National Key RampDProgram of China under Grant nos 2020YFB1005604 and2020YFB2103802 and the Guangxi Key Laboratory ofTrusted Software under Grant no KX201902

References

[1] U S Dot connected vehicle pilot deployment programhttpswwwitsdotgovpilots

[2] Connected Vehicle Applications httpswwwitsdotgovpilotscv_pilot_appshtm

[3] Usdot Multimodal Intelligent Traffic Safety System (Mmitss)httpswwwitsdotgovresearch_archivesdmabundlemmitss_planhtm

[4] J Liu Y Shi Z M Fadlullah and N Kato ldquoSpace-air-groundintegrated network a surveyrdquo IEEE Communications Surveysamp Tutorials vol 20 no 4 pp 2714ndash2741 2018

[5] W Zhang L Li N Zhang T Han and S Wang ldquoAir-groundintegrated mobile edge networks a surveyrdquo IEEE Accessvol 8 pp 125998ndash126018 2020

[6] S Sen and K L Head ldquoControlled optimization of phases atan intersectionrdquo Transportation Science vol 31 no 1pp 5ndash17 1997

[7] Y Feng K L Z Head S Khoshmagham andM Zamanipour ldquoA real-time adaptive signal control in a

16 Security and Communication Networks

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17

Page 17: Congestion Attack Detection in Intelligent Traffic Signal

connected vehicle environmentrdquo Transportation ResearchPart C Emerging Technologies vol 55 pp 460ndash473 2015

[8] Q A Chen Y Yin Y Feng Z M Mao and H X LiuldquoExposing Congestion Attack on Emerging Connected Ve-hicle Based Traffic Signal Controlrdquo in Proceedings of theNetwork and Distributed System Security Symposiumpp 391ndash3915 San Diego CA USA February 2018

[9] J Zhu T Park P Isola and A A Efros ldquoUnpaired image-to-image translation using cycle-consistent adversarial net-worksrdquo in Proceedings of the 2017 IEEE International Con-ference on Computer Vision (ICCV) pp 2242ndash2251 VeniceItaly October 2017

[10] M Wu M C Hughes S Parbhoo M Zazzi V Roth andF Doshivelez ldquoBeyond sparsity tree regularization of deepmodels for interpretabilityrdquo in Proceedings of the 0e 0irty-Second AAAI Conference on Artificial Intelligence pp 1670ndash1678 New Orleans LI USA February 2018

[11] K Cho B Van Merrienboer C Gulcehre et al ldquoLearningphrase representations using rnn encoderndashdecoder for sta-tistical machine translationrdquo in Empirical Methods in NaturalLanguage Processing pp 1724ndash1734 Springer Berlin Ger-many 2014

[12] Ptv Vissim httpvision-trafficptvgroupcomen-usproductsptv-vissim

[13] A Krok ldquoUs department of transportation hopes to mandate v2vcommunicationsrdquo 2016 httpswwwcnetcomroadshownewsus-department-of-transportation-hopes-to-mandate-v2v-communications

[14] H Xiong Z Tan R Zhang and S He ldquoA new dual axle driveoptimization control strategy for electric vehicles using ve-hicle-to-infrastructure communicationsrdquo IEEE Transactionson Industrial Informatics vol 16 no 4 pp 2574ndash2582 2020

[15] J B Kenney ldquoDedicated short-range communications (dsrc)standards in the United Statesrdquo Proceedings of the IEEEvol 99 no 7 pp 1162ndash1182 2011

[16] R K Patel and E J Seymour ldquo+e national transportationcommunication for its protocol (ntcip) for transportationinteroperabilityrdquo in Proceedings of the Conference on Intel-ligent Transportation Systems pp 543ndash548 Boston MA USANovember 1997

[17] P Isola J Zhu T Zhou and A A Efros ldquoImage-to-imagetranslation with conditional adversarial networksrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5967ndash5976 Honolulu HIUSA July 2017

[18] X Sun and W Xu ldquoFast implementation of DeLongrsquos al-gorithm for comparing the areas under correlated receiveroperating characteristic curvesrdquo IEEE Signal Processing Let-ters vol 21 no 11 pp 1389ndash1393 2014

[19] Graphviz-graph Visualization Software httpsgraphvizorg[20] Vehicle-infrastructure integration (vii) initiative benefit-cost

analysis httpswwwpcbitsdotgovconnected_vehiclehtm[21] X Gao J Liu Y Li et al ldquoQueue length estimation based

defence against data poisoning attack for traffic signal con-trolrdquo IFIP Advances in Information and CommunicationTechnology pp 254ndash265 2020

[22] M Arshad Z Ullah M Khalid et al ldquoBeacon trust man-agement system and fake data detection in vehicular ad-hocnetworksrdquo IET Intelligent Transport Systems vol 13 no 5pp 780ndash788 2019

[23] A Kapadia N Triandopoulos C Cornelius D Peebles andD Kotz ldquoAnonysense opportunistic and privacy-preservingcontext collectionrdquo in Proceedings of the International

Conference on Pervasive Computing pp 280ndash297 New YorkNY USA May 2018

[24] S Zeadally R Hunt Y-S Chen A Irwin and A HassanldquoVehicular ad hoc networks (vanets) status results andchallengesrdquo Telecommunication Systems vol 50 no 4pp 217ndash241 2012

[25] X Zhong L Li Y Zhang B Zhang W Zhang and T YangldquoOodt obstacle aware opportunistic data transmission forcognitive radio ad hoc networksrdquo IEEE Transactions onCommunications vol 68 no 6 pp 3654ndash3666 2020

[26] Z Lu G Qu and Z Liu ldquoA survey on recent advances invehicular network security trust and privacyrdquo IEEE Trans-actions on Intelligent Transportation Systems vol 20 no 2pp 760ndash776 2019

[27] A Boualouache S-M Senouci and S Moussaoui ldquoA surveyon pseudonym changing strategies for vehicular ad-hocnetworksrdquo IEEE Communications Surveys amp Tutorials vol 20no 1 pp 770ndash790 2018

Security and Communication Networks 17