

Research Article
Network Traffic Prediction Based on Deep Belief Network and Spatiotemporal Compressive Sensing in Wireless Mesh Backbone Networks

Laisen Nie,1 Xiaojie Wang,2 Liangtian Wan,2 Shui Yu,3 Houbing Song,4 and Dingde Jiang5

1School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China
2School of Software, Dalian University of Technology, Dalian 116620, China
3School of Information Technology, Deakin University, Burwood, VIC 3125, Australia
4Department of Electrical, Computer, Software, and Systems Engineering, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
5School of Astronautics & Aeronautics, University of Electronic Science and Technology of China, Chengdu 611731, China

Correspondence should be addressed to Xiaojie Wang; wangxj1988@mail.dlut.edu.cn

Received 9 September 2017; Accepted 7 December 2017; Published 4 January 2018

Academic Editor: Kuan Zhang

Copyright © 2018 Laisen Nie et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Wireless mesh networks are widely used to provide decentralized access for users and other intelligent devices. They can also serve as the infrastructure of last-few-miles connectivity for various network applications, for example, the Internet of Things (IoT) and mobile networks. The wireless mesh backbone network has attracted extensive attention because of its large capacity and low cost. Network traffic prediction is important for network planning and routing configurations that are implemented to improve the quality of service for users. This paper proposes a network traffic prediction method based on a deep learning architecture and the Spatiotemporal Compressive Sensing method. The proposed method first adopts the discrete wavelet transform to extract the low-pass component of network traffic, which describes its long-range dependence. Then, a prediction model is built by learning a deep architecture based on the deep belief network from the extracted low-pass component. The remaining high-pass component, which expresses the gusty and irregular fluctuations of network traffic, is predicted by the Spatiotemporal Compressive Sensing method. Combining the predictors of the two components yields a predictor of network traffic. In our simulations, the proposed prediction method outperforms three existing methods.

1. Introduction

The Wireless Mesh Network (WMN) provides ubiquitous, last-few-miles connectivity for future wireless services, for example, IoT, 5G mobile networks, and cognitive radio. It is also a promising solution for IoT crowdsensing applications, since it connects a large group of individuals with computing and sensing capabilities. Compared with other wireless architectures (e.g., ad hoc networks), the WMN offers high capacity, robustness, and low-cost deployment [1]. Hence, it is increasingly popular as an emerging access paradigm in practice. With the rapid development of mobile communications, mobile cloud systems, and IoT, the applications provided by wireless networks have become numerous. In addition, big data now plays a crucial role in both industry and daily life. The substantial growth of the WMN (both in scale and in service) brings a series of new capacity challenges; for example, huge traffic demands may cause network congestion in a wireless mesh backbone network.

Mesh routers constitute the principal infrastructure of the WMN, known as the wireless mesh backbone network. The self-organized manner of the WMN architecture reinforces the resilience of the network to failures, but it also gives rise to some limitations, typically the resource allocation problem. Above all, network management operations are essential for providing a cost-effective way to improve the performance of a WMN. These network management operations are implemented in terms of the relevant network traffic information. For instance, to improve the quality of service for users, predictive network planning is necessary for ISPs. This planning is carried out according to the future traces of network traffic flows between all possible origin-destination (OD) node pairs [2, 3].

Hindawi, Wireless Communications and Mobile Computing, Volume 2018, Article ID 1260860, 10 pages, https://doi.org/10.1155/2018/1260860

Figure 1: Prediction method. [Flowchart with the blocks: input data; discrete wavelet transform; low-pass component; train DBN, with a "training is over" decision; prediction for low-pass component; sparsity regularized matrix factorization; prediction for high-pass component; inverse discrete wavelet transform; output results. Decision branches are labeled yes/no.]

A great number of methods have been proposed for network traffic prediction in traditional IP backbone networks [4–8]. Statistical methods have been widely adopted in this field. Originally, simple models such as Autoregressive (AR) and Autoregressive Integrated Moving Average (ARIMA) were used to capture the short-range dependence (SRD) of network traffic [4]. However, current network traffic exhibits long-range dependence (LRD) and multifractal features arising from the behaviors of terminals [4]. For this reason, the Fractional Autoregressive Integrated Moving Average (FARIMA) model and the Multifractal Wavelet Model (MWM) have been applied to the network traffic prediction problem [5]. With the growing variety of network services and applications, the characteristics of network traffic have become much more complex; for instance, traffic exhibits nonlinear features [4, 6]. Hence, some methods use the GARCH model to model network traffic for prediction. In addition, many methods based on hybrid models have been proposed to predict network traffic [7]. However, these methods are not suitable for traffic prediction in a wireless mesh backbone network [2]. Generally, users join a Wireless Mesh Network randomly. They also often have complicated individual associations, which differ significantly from those of the users of a traditional IP backbone network. Although the users' applications in the two networks may coincide, the dominant applications usually differ.

Motivated by this issue, we propose a network traffic prediction method based on the deep belief network (DBN) and the Spatiotemporal Compressive Sensing (STCS) method, named the Deep Belief Network and Spatiotemporal Compressive Sensing (DBNSTCS) model. To the best of our knowledge, this is the first paper that focuses on traffic prediction for wireless mesh backbone networks and takes the spatiotemporal characteristic into account for prediction. We treat the long-range dependence and the irregular fluctuations of network traffic independently; see Figure 1 [9]. The discrete wavelet transform (DWT) divides the network traffic into two components, represented by the scaling coefficients and the discrete wavelet transform coefficients. In other words, the DWT acts like a filter that decomposes the network traffic into a low-pass component and a high-pass component. The low-pass component expresses the long-range dependence of network traffic, and the high-pass one captures the gusty and irregular fluctuations. LRD means that the network traffic at any time depends on multiple previous traffic data. The LRD of network traffic derives from a series of interacting factors (e.g., the behaviors of users). To describe these multifarious relationships, the low-pass component is predicted by a deep architecture based on the DBN, which can learn the LRD of network traffic. The short-range and irregular fluctuations are predicted by the STCS method, which is known as an excellent interpolation algorithm.

The contributions of the paper are as follows:

(i) We use the DWT to extract the low-pass and high-pass components of network traffic. The DWT of a time series can be viewed as passing this time series through a low-pass filter and a high-pass filter, respectively. Hence, we can obtain the low-pass and high-pass components of network traffic. They show the low-pass approximation and the details of network traffic, respectively. In our method, we predict the two types of coefficients independently.

(ii) We propose a deep architecture based on the DBN to capture the low-pass component of network traffic. By training this deep architecture on a training set built from known network traffic, it can describe the LRD characteristic of traffic flows and carry out a prediction for network traffic.

(iii) We adopt Spatiotemporal Compressive Sensing to fit the gusty and irregular fluctuations of network traffic. We first assume that the high-pass component obeys a spatiotemporal dependence. Then, we obtain a predictor of network traffic by the Sparsity Regularized Matrix Factorization (SRMF) method.

The remainder of this paper is organized as follows. Section 2 reviews related work on the network traffic prediction problem. Section 3 introduces definitions concerning network traffic, the DWT technique, the DBN theory, and the Spatiotemporal Compressive Sensing method. We propose our prediction method in Section 4 and verify its performance in Section 5. Section 6 concludes the paper.

2. Related Work

Many researchers have investigated network traffic prediction, which is instructive for congestion control, predictive network planning, and intelligent routing [10–14]. Existing network traffic prediction techniques fall into four categories: linear time series methods, nonlinear time series methods, hybrid model methods, and decomposition model methods.

Linear time series methods (e.g., AR, MA, and ARMA) are frequently used to model end-to-end traffic flows for prediction. According to more recent research findings, however, traffic flows exhibit markedly nonlinear features under complex user behaviors and various applications. Typically, the GARCH model in [10] is used to model burst characteristics. Neural networks are also a valid way to track traffic flows with nonlinear characteristics [12].

With the rapid development of network services, the ISP network has become heterogeneous and complex. Traffic flows show manifold statistical characteristics, such as LRD, SRD, heavy-tailed distributions, and multifractal features. Therefore, researchers adopt hybrid models to model traffic flows with complicated distributions. Methods based on hybrid models take advantage of two or more models to capture the traces of traffic flows. In [4], the authors combine the ARIMA model with the GARCH model to fit several characteristics of traffic flows (i.e., the LRD and SRD characteristics). The proposed hybrid model can also model the self-similarity and multifractal features of traffic flows. The authors use the autocorrelation function and the partial autocorrelation function to determine the parameters of the hybrid model. The fourth category, as mentioned above, is the decomposition model methods, in which the traffic flows are divided into several components that are then modeled and predicted separately. These methods can be viewed as an evolution of the hybrid model methods. In [13], the authors jointly use the Stationary Wavelet Transform (SWT), the Quantum Genetic Algorithm (QGA), and the Backpropagation Neural Network (BPNN) to predict wireless network traffic. They first decompose the traffic flows by the SWT, so that the traffic flows are made up of several stationary components. After that, all these components are predicted by a BPNN trained using the QGA. The authors in [14] decompose the traffic of a large-scale cellular network into regular and random components by a classic time series decomposition method.

3. Background

3.1. Traffic Matrix. A traffic matrix is a representation of network traffic. After collecting network traffic information, operators implement appropriate network management functions in terms of this information, which is expressed as the traffic matrix. If we denote an OD flow by $x_{pq}(t)$, which describes the mean volume of the traffic flow from the origin node $p$ to the destination node $q$ in the $t$th time slot, then the traffic matrix is defined as

$$X = \begin{bmatrix} x_{11}(1) & x_{11}(2) & \cdots & x_{11}(T) \\ x_{12}(1) & x_{12}(2) & \cdots & x_{12}(T) \\ \vdots & \vdots & & \vdots \\ x_{NN}(1) & x_{NN}(2) & \cdots & x_{NN}(T) \end{bmatrix}, \tag{1}$$

where $p, q \in \{1, 2, \ldots, N\}$ and $t \in \{1, 2, \ldots, T\}$. This traffic matrix reports the network traffic data over $T$ time slots. Generally, the length of a time slot is 5 or 15 minutes.
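To make the layout of (1) concrete, the following minimal Python sketch stacks OD flows into an $N^2 \times T$ matrix, one row per OD pair in the order $(1,1), (1,2), \ldots, (N,N)$. The function name `build_traffic_matrix`, the dictionary input, and the traffic volumes are illustrative assumptions, not part of the paper.

```python
def build_traffic_matrix(od_flows, num_nodes, num_slots):
    """Stack OD flows row by row; row (p-1)*N + (q-1) holds x_pq(1..T)."""
    X = []
    for p in range(1, num_nodes + 1):
        for q in range(1, num_nodes + 1):
            # Missing OD pairs are treated as all-zero flows (an assumption).
            series = od_flows.get((p, q), [0.0] * num_slots)
            if len(series) != num_slots:
                raise ValueError(f"OD flow {(p, q)} must have {num_slots} samples")
            X.append(list(series))
    return X


# Hypothetical 2-node network observed over T = 3 time slots (made-up volumes).
flows = {
    (1, 2): [5.0, 7.0, 6.0],
    (2, 1): [2.0, 3.0, 2.5],
}
X = build_traffic_matrix(flows, num_nodes=2, num_slots=3)
```

With $N = 2$ the matrix has $N^2 = 4$ rows, and row index 1 holds the flow from node 1 to node 2.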

3.2. Discrete Wavelet Transform. A time series $f(t)$ can be expressed by

$$f(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}\, 2^{-L/2} \phi\!\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}\, 2^{-l/2} \psi\!\left(\frac{t}{2^{l}} - n\right), \tag{2}$$

where $c_{L,n}$ and $d_{l,n}$ are the scaling and discrete wavelet transform coefficients, $2^{-L/2}\phi(t/2^{L} - n)$ is the scaling function at scale $L$, and $2^{-l/2}\psi(t/2^{l} - n)$ is the wavelet function. The scaling coefficients express a coarse approximation of $f(t)$. The discrete wavelet transform coefficients represent the details of $f(t)$. Hence, (2) can be viewed as passing the time series through a filter and then obtaining a representation of $f(t)$ by the combination of low-pass and high-pass approximations.

Figure 2: Restricted Boltzmann Machine. [Diagram: a visible layer of visible units connected to a hidden layer of hidden units.]

Figure 3: Deep Belief Network with two Restricted Boltzmann Machines. [Diagram: a visible layer and two hidden layers, $h_1$ and $h_2$.]
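The filter view of the DWT can be illustrated with a one-level transform. The sketch below uses the Haar wavelet, which is an assumption made here for simplicity; the text does not state which wavelet family is used. The helper names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence.

    Returns (c, d): scaling (low-pass) and wavelet (high-pass) coefficients,
    each of length len(x) / 2.
    """
    if len(x) % 2:
        raise ValueError("sequence length must be even")
    s = math.sqrt(2.0)
    half = len(x) // 2
    c = [(x[2 * n] + x[2 * n + 1]) / s for n in range(half)]
    d = [(x[2 * n] - x[2 * n + 1]) / s for n in range(half)]
    return c, d


def haar_idwt(c, d):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for cn, dn in zip(c, d):
        x.extend([(cn + dn) / s, (cn - dn) / s])
    return x


# Made-up series with 8 samples.
x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
c, d = haar_dwt(x)
x_rec = haar_idwt(c, d)
```

Each scaling coefficient is a scaled local average of two neighboring samples and each wavelet coefficient is a scaled local difference, so `c` follows the slow trend of `x` while `d` is zero wherever two neighboring samples are equal. Applying `haar_idwt` to `(c, d)` recovers `x` up to floating-point rounding.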

3.3. Deep Belief Network. The DBN is a common deep learning primitive. It is a combination of a number of Restricted Boltzmann Machines (RBMs) [15–17]. An RBM is a two-layer undirected graphical model consisting of a visible layer and a hidden layer, denoted by $v$ and $h$ (see Figure 2) [9, 15]. Each unit in a layer is connected with all units of the other layer by undirected edges. The units in the same layer are not connected with each other. Figure 3 shows an example of a DBN architecture with two RBMs [9]. The DBN is a stack of many RBMs. The values of all units are stochastic variables [16]. Generally, they obey a Bernoulli distribution or a Gaussian distribution. When the visible and hidden units are Gaussian and Bernoulli, respectively, we have the following conditional distributions:

$$P(v_i \mid h) = \mathcal{N}\!\left(b_i + \sum_{j=1}^{J} w_{ij} h_j,\; 1\right), \qquad P(h_j = 1 \mid v) = \operatorname{sigm}\!\left(a_j + \sum_{i=1}^{I} w_{ij} v_i\right), \tag{3}$$

where $\mathcal{N}(b_i + \sum_{j=1}^{J} w_{ij} h_j, 1)$ denotes the Gaussian distribution whose mean and variance are $b_i + \sum_{j=1}^{J} w_{ij} h_j$ and 1, $\operatorname{sigm}(z) = \exp(z)/(1 + \exp(z))$ is the sigmoid function, and $I$ and $J$ are the numbers of visible and hidden units, respectively [18]. $b_i$ and $a_j$ are the biases of the visible and hidden units, and $w_{ij}$ expresses the symmetric interaction term between the visible unit $v_i$ and the hidden unit $h_j$. For an RBM, the joint probability distribution function over visible and hidden units can be denoted by

$$P(v, h) = \frac{\exp(-E(v, h))}{\sum_{v, h} \exp(-E(v, h))}, \tag{4}$$

where $E(v, h)$ is termed the energy function, defined as

$$E(v, h) = \frac{1}{2} \sum_{i=1}^{I} (v_i - b_i)^2 - \sum_{j=1}^{J} a_j h_j - \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ij} v_i h_j. \tag{5}$$

The quadratic term enters (5) with a positive sign so that, under (4), the conditional distribution of each visible unit is the unit-variance Gaussian given in (3).
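As an illustration of how the conditionals in (3) relate to the energy function, the sketch below implements both for a tiny Gaussian-Bernoulli RBM and performs one Gibbs sampling step. It assumes the standard Gaussian-Bernoulli energy, in which the quadratic visible term is $\frac{1}{2}\sum_i (v_i - b_i)^2$ with a positive sign; this sign is what makes the visible conditional a Gaussian with mean $b_i + \sum_j w_{ij} h_j$. All weights, biases, and function names are hypothetical.

```python
import math
import random


def sigm(z):
    """Sigmoid; algebraically equal to exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_prob(v, W, a):
    """P(h_j = 1 | v) for every hidden unit; W[i][j] couples v_i and h_j."""
    return [sigm(a[j] + sum(W[i][j] * v[i] for i in range(len(v))))
            for j in range(len(a))]


def visible_mean(h, W, b):
    """Mean of the unit-variance Gaussian P(v_i | h) for every visible unit."""
    return [b[i] + sum(W[i][j] * h[j] for j in range(len(h)))
            for i in range(len(b))]


def energy(v, h, W, a, b):
    """Gaussian-Bernoulli RBM energy (positive quadratic visible term)."""
    quad = 0.5 * sum((vi - bi) ** 2 for vi, bi in zip(v, b))
    bias = sum(aj * hj for aj, hj in zip(a, h))
    inter = sum(W[i][j] * v[i] * h[j]
                for i in range(len(v)) for j in range(len(h)))
    return quad - bias - inter


def gibbs_step(v, W, a, b, rng):
    """Sample h ~ P(h | v), then v' ~ P(v | h)."""
    h = [1 if rng.random() < p else 0 for p in hidden_prob(v, W, a)]
    v_new = [rng.gauss(m, 1.0) for m in visible_mean(h, W, b)]
    return v_new, h


# Hypothetical parameters: 3 visible units, 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
a = [0.0, 0.1]
b = [0.2, -0.1, 0.0]
v = [1.0, 0.5, -0.5]
v_next, h_sample = gibbs_step(v, W, a, b, random.Random(0))
```

Switching a hidden unit $h_j$ from 0 to 1 lowers the energy by exactly $a_j + \sum_i w_{ij} v_i$, which is the argument of the sigmoid in (3).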

To train the DBN, a greedy layer-wise strategy is employed. The parameters are updated by maximizing the log probability $\log P(v)$.

3.4. Spatiotemporal Compressive Sensing and Sparsity Regularized Matrix Factorization. Compressive sensing is a recent sampling technique for signal processing that makes good use of the structure or redundancy of real-world signals. It takes advantage of an adaptive sampling scheme to sense these structured signals. In detail, the structure of such a signal means that it can be denoted by a vector that has only a few nonzero elements (i.e., the vector is sparse). In the adaptive sampling scheme, a random matrix called the measurement matrix is used to implement compressing and coding concurrently. During the decoding phase, the compressive sensing reconstruction algorithm is an effective approach to the resulting ill-posed inverse problem.

Figure 4: Decomposition of OD 63. (a) Low-pass component of OD 63; (b) high-pass component of OD 63. [Plots of each component against the time slot, 0 to 1000; vertical axes scaled by $10^7$.]

As a derivative of compressive sensing, the spatiotemporal compressive sensing technique is commonly used as an interpolation algorithm to recover the missing elements of a data set. The spatiotemporal feature means that the values of neighboring elements in the data set are similar. Based on this feature, the SRMF method is proposed in [19], where the missing elements can be recovered precisely even when the data loss probability is very high. The SRMF method is also an accurate tool for prediction: in the prediction process, the elements that need to be predicted are viewed as consecutive missing elements.

4. Our Methodology

4.1. Decomposition of Network Traffic. We assume that the known network traffic is denoted by $X$, each OD flow of which is a time series $x_{pq}(t)$, where $t = 1, 2, \ldots, T$. According to (2), it can be expressed as

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}^{pq}\, 2^{-L/2} \phi\!\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}^{pq}\, 2^{-l/2} \psi\!\left(\frac{t}{2^{l}} - n\right). \tag{6}$$

If we set the scale to 1, then we have

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{1,n}^{pq}\, 2^{-1/2} \phi\!\left(\frac{t}{2} - n\right) + \sum_{n=-\infty}^{+\infty} d_{1,n}^{pq}\, 2^{-1/2} \psi\!\left(\frac{t}{2} - n\right). \tag{7}$$

The above equation divides the network traffic into two components. One is the low-pass approximation (given by the scaling coefficients), which exhibits the LRD of the network traffic $x_{pq}(t)$; the other is the high-pass approximation (given by the discrete wavelet transform coefficients), which expresses the gusty and irregular fluctuations of the network traffic $x_{pq}(t)$. For a traffic matrix that describes the volume of traffic between all OD node pairs, the low-pass and high-pass components can accordingly be denoted by two matrices.

Figures 4 and 5 give two examples of the traffic flow decomposition by the DWT. We randomly select two OD flows from the real network traffic data set and plot their low-pass and high-pass components. The low-pass components are periodic, which means that they are much easier to predict than the high-pass components shown in Figures 4(b) and 5(b). For this reason, the two components are predicted independently in this paper.

4.2. Deep Architecture for Low-Pass Component Prediction. For an OD flow $x_{pq}(t)$, we assume that the length $T$ of the series is an even number. In this case, the number of scaling coefficients is $T/2$. The deep architecture for prediction is shown in Figure 6. There are $M$ hidden layers in this architecture. Both the hidden and the input layers have $T/2$ units. At the top of the deep architecture, a single neuron, defined as the logistic regression, is employed for prediction. The logistic regression model is made up of a hidden layer with $T/2$ units and an output layer with one unit. The deep architecture is trained by the backpropagation algorithm in our method. The parameters of the deep architecture are determined layer by layer [18].
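The shape of this architecture can be sketched as a forward pass: $M$ fully connected hidden layers of $T/2$ units each, followed by a single logistic output unit. The sketch below is illustrative only. It assumes sigmoid hidden units and scaling coefficients normalized to $[0, 1]$ (the logistic output lies in $(0, 1)$), and it uses placeholder weights rather than trained ones; none of the names come from the paper.

```python
import math


def sigm(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(x, W, bias):
    """Fully connected sigmoid layer; W[j][i] weights input i into unit j."""
    return [sigm(bj + sum(wji * xi for wji, xi in zip(Wj, x)))
            for Wj, bj in zip(W, bias)]


def forward(c, hidden, out_w, out_b):
    """Propagate the scaling coefficients c through the stack of hidden
    layers, then through the single logistic output unit."""
    h = c
    for W, bias in hidden:
        h = layer(h, W, bias)
    return sigm(out_b + sum(w * hi for w, hi in zip(out_w, h)))


# Toy dimensions (assumptions): T/2 = 4 units per layer, M = 3 hidden layers.
units, M = 4, 3
# Placeholder all-zero hidden weights; a trained DBN would supply real ones.
hidden = [([[0.0] * units for _ in range(units)], [0.0] * units)
          for _ in range(M)]
c = [0.2, 0.4, 0.6, 0.8]   # normalized scaling coefficients (made up)
y_zero = forward(c, hidden, out_w=[0.0] * units, out_b=0.0)
y_bias = forward(c, hidden, out_w=[1.0] * units, out_b=-2.0)
```

In practice the hidden weights would be initialized by greedy RBM pretraining and then fine-tuned by backpropagation, as described in the text; the sketch only fixes the data flow and the layer sizes.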

In our method, we first collect $K$ training samples, denoted by $(x^{1}_{pq}(t), \ldots, x^{K}_{pq}(t))$. Then we have the scaling coefficient training set $(c^{1}_{pq}, \ldots, c^{K}_{pq})$. Meanwhile, the corresponding predictors are $(\hat{c}^{1}_{pq}, \ldots, \hat{c}^{K}_{pq})$. By training the proposed deep architecture using the scaling coefficient training set, we obtain a relationship between the input and output scaling coefficients. We then use the scaling coefficients of $x_{pq}(t)$ as an input, and a predictor of the scaling coefficient is obtained.

Figure 5: Decomposition of OD 105. (a) Low-pass component of OD 105; (b) high-pass component of OD 105. [Plots of each component against the time slot, 0 to 1000; vertical axis scaled by $10^8$ in (a) and by $10^7$ in (b).]

Figure 6: Deep architecture for prediction. [Diagram: scaling coefficients $c$ at the input, a DBN with hidden layers $h_1$ to $h_M$, and a logistic regression layer at the top producing the output.]

4.3. Spatiotemporal Compressive Sensing for High-Pass Component Prediction. The discrete wavelet transform coefficients are predicted with SRMF, which is a matrix-oriented interpolation algorithm. Therefore, unlike the prediction of the scaling coefficients, where each OD flow is predicted independently, we predict the discrete wavelet transform coefficients of all OD flows at the same time. According to (7), the discrete wavelet transform coefficients of all OD flows constitute a matrix, denoted by $D$ in this paper. The matrix $D$ consists of two portions: one comes from the training data obtained from measured network traffic, and the other needs to be predicted. We denote the final prediction result by $\hat{D}$; it is obtained from the following regularized optimization model:

$$\left\| D - \hat{D} \right\|_F^2 + \lambda \left( \| L \|_F^2 + \| R \|_F^2 \right) + \left\| (L R^{T}) H \right\|_F^2, \tag{8}$$

where $\| \cdot \|_F$ and $(\cdot)^{T}$ denote the Frobenius norm and transposition, respectively. The matrix $H$, the so-called temporal constraint matrix (shown in (9)), describes the temporal neighbors:

$$H = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}. \tag{9}$$

The matrices $L$ and $R$ come from the singular value decomposition of the matrix $D$, that is,

$$D = U \Sigma V^{T}, \tag{10}$$

where $U$ and $V$ are two unitary matrices and $\Sigma$ is a diagonal matrix whose diagonal elements are the singular values of $D$. The matrices $L$ and $R$ are equal to $U \Sigma^{1/2}$ and $V \Sigma^{1/2}$, respectively.
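The role of each term in (8) can be illustrated with a small pure-Python sketch. It departs from the text in three clearly labeled ways, all of them assumptions made for illustration: the fit term is evaluated only on known entries through a 0/1 `mask` (the entries to be predicted are treated as missing), the factors are initialized randomly rather than from the SVD in (10), and plain gradient descent replaces whatever solver the authors used. Function names, step size, and data are hypothetical.

```python
import random


def matmul(A, B):
    """Plain-Python product of A (m x k) and B (k x n)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def temporal_matrix(T):
    """H as in (9): ones on the diagonal, -1 on the superdiagonal."""
    return [[1.0 if j == i else -1.0 if j == i + 1 else 0.0
             for j in range(T)] for i in range(T)]


def sq_norm(A):
    """Squared Frobenius norm."""
    return sum(x * x for row in A for x in row)


def objective(D, mask, L, R, lam, H):
    """Masked fit + lam * (|L|^2 + |R|^2) + |(L R^T) H|^2."""
    X = matmul(L, transpose(R))
    fit = sum(m * (x - d) ** 2
              for xr, dr, mr in zip(X, D, mask)
              for x, d, m in zip(xr, dr, mr))
    return fit + lam * (sq_norm(L) + sq_norm(R)) + sq_norm(matmul(X, H))


def srmf(D, mask, rank=2, lam=0.1, step=0.005, iters=2000, seed=0):
    """Gradient descent on the masked objective; returns the factors L, R."""
    rng = random.Random(seed)
    n, T = len(D), len(D[0])
    H = temporal_matrix(T)
    HHt = matmul(H, transpose(H))
    L = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    R = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(T)]
    for _ in range(iters):
        X = matmul(L, transpose(R))
        smooth = matmul(X, HHt)
        # Gradient of the fit and temporal terms with respect to X = L R^T.
        G = [[2 * m * (x - d) + 2 * s
              for x, d, m, s in zip(xr, dr, mr, sr)]
             for xr, dr, mr, sr in zip(X, D, mask, smooth)]
        gL, gR = matmul(G, R), matmul(transpose(G), L)
        L = [[l - step * (g + 2 * lam * l) for l, g in zip(lr, gr)]
             for lr, gr in zip(L, gL)]
        R = [[r - step * (g + 2 * lam * r) for r, g in zip(rr, gr)]
             for rr, gr in zip(R, gR)]
    return L, R


# Made-up high-pass coefficients: 3 OD flows, 6 slots, last slot unknown.
D = [[1.0, 1.1, 1.2, 1.1, 1.0, 0.0],
     [0.5, 0.6, 0.7, 0.6, 0.5, 0.0],
     [0.8, 0.8, 0.9, 0.9, 0.8, 0.0]]
mask = [[1, 1, 1, 1, 1, 0] for _ in D]
L, R = srmf(D, mask)
X_hat = matmul(L, transpose(R))
prediction = [row[-1] for row in X_hat]   # estimates for the unknown slot
```

Multiplying $L R^{T}$ by $H$ on the right replaces every column except the first by its difference from the preceding column, so the last term of (8) penalizes abrupt changes between consecutive time slots. It is this term that ties the unknown last column to the observed ones.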

Finally, from the predictors of the scaling and discrete wavelet transform coefficients, we predict the network traffic by the inverse discrete wavelet transform. Algorithm 1 gives the details of our method.
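The overall flow (decompose, predict each component separately, recombine) can be sketched for a single OD flow as follows. This is a minimal illustration under stated assumptions: it uses a one-level Haar DWT, and the two predictors are simple placeholders passed in as functions (persistence for the low-pass part, zero for the high-pass part), standing in for the trained DBN and for SRMF. With a one-level transform, one predicted coefficient pair corresponds to two future time slots.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence: returns (c, d)."""
    pairs = list(zip(x[0::2], x[1::2]))
    c = [(a + b) / SQRT2 for a, b in pairs]
    d = [(a - b) / SQRT2 for a, b in pairs]
    return c, d


def haar_idwt(c, d):
    """Inverse one-level Haar DWT."""
    out = []
    for cn, dn in zip(c, d):
        out += [(cn + dn) / SQRT2, (cn - dn) / SQRT2]
    return out


def predict_next_pair(x, predict_low, predict_high):
    """Predict the next two samples of x from one predicted (c, d) pair."""
    c, d = haar_dwt(x)
    c_next = predict_low(c)    # placeholder for the trained DBN
    d_next = predict_high(d)   # placeholder for SRMF
    return haar_idwt([c_next], [d_next])


def persistence(c):
    """Placeholder low-pass predictor: repeat the last scaling coefficient."""
    return c[-1]


def zero_detail(d):
    """Placeholder high-pass predictor: assume no fluctuation."""
    return 0.0


x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0]   # made-up OD flow
x_future = predict_next_pair(x, persistence, zero_detail)
```

With these placeholders, both predicted samples equal the mean of the last two observed samples; swapping in learned predictors changes only the two function arguments.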

Algorithm 1: DBNSTCS algorithm.
(1) Input: training set $(x^{1}_{pq}(t), x^{2}_{pq}(t), \ldots, x^{K}_{pq}(t))$; known network traffic $X$
(2) Output: network traffic predictor $\hat{X}(T+1)$
(3) for $i = 1, \ldots, K$ do
(4)     $c^{i}_{pq} \leftarrow \mathrm{DWT}(x^{i}_{pq}(t))$
(5) end for
(6) Train the deep architecture using $(c^{1}_{pq}, c^{2}_{pq}, \ldots, c^{K}_{pq})$ as a training set
(7) $[C, D] \leftarrow \mathrm{DWT}(X)$
(8) $\hat{C} \leftarrow \mathrm{DBN}(C)$
(9) $\hat{D} \leftarrow \mathrm{SRMF}(D)$
(10) $\hat{X}(T+1) \leftarrow \mathrm{IDWT}(\hat{C}, \hat{D})$
(11) return $\hat{X}(T+1)$

5. Simulation Results and Analysis

This section verifies the performance of our prediction method. In our simulations, a real network traffic data set with 2016 time slots is sampled on a time scale of 5 minutes. For exact prediction, the first 2000 time slots are used as the prior information to train the deep architecture and the matrix $D$. We compare our method with three state-of-the-art methods in the network traffic prediction field, namely, the principal component analysis (PCA) method [20], the Tomogravity method [21], and the SRMF method [19]. The proposed method is implemented in MATLAB on a single machine with a Core i5 central processing unit, 4 GB of memory, and 1664 MB of graphics processing unit (GPU) memory. Meanwhile, we set $M = 8$ and $T = 800$.

Figure 7: Real traffic data versus their predictors via DBNSTCS. [Scatter plot: x-axis, predictors of DBNSTCS; y-axis, real network traffic; both axes range from 0 to 9, scaled by $10^7$.]

We first plot the real network traffic against the predictors produced by each of the four methods. Figure 7 displays the prediction results of our method. The $x$-axis and $y$-axis denote the predictors and the real network traffic, respectively. From Figure 7, we see that our method has low prediction biases for small network traffic flows. By contrast, its predictions for large network traffic are positively biased. The same conclusion can be drawn for PCA from Figure 8: for large network traffic, it also has positively biased predictions, while for small network traffic PCA has a much larger prediction bias. Tomogravity has consistently positive prediction biases for large network traffic, as shown in Figure 9, whereas for small network traffic it achieves the desired prediction error. For large network traffic, SRMF (Figure 10) shows both positive and negative prediction biases.

Figure 8: Real traffic data versus their predictors via PCA. [Scatter plot: x-axis, predictors of PCA; y-axis, real network traffic; both axes range from 0 to 9, scaled by $10^7$.]

Now we refer to the spatial and temporal relative errors asa metric to compare four methods The spatial and temporalrelative errors are defined as

SRE (119899) = radicsum119879119905=1 (119909119899 (119905) minus 119909119899 (119905))2radicsum119879119905=1 1199091198992 (119905)

TRE (119905) = radicsum1198732

119899=1 (119909119899 (119905) minus 119909119899 (119905))2radicsum1198732

119899=1 1199091198992 (119905) (11)

where $x_n(t)$ and $\hat{x}_n(t)$ are the $n$th end-to-end network traffic flow and its predictor, respectively. As mentioned above, $p, q \in \{1, 2, \ldots, N\}$; thus the number of OD flows is $N^2$. Figure 11(a) exhibits the spatial relative errors (SREs) of the four methods. The $x$-axis gives the identities of the end-to-end network traffic flows, which are sorted in descending order with respect to their means, and the $y$-axis is the SRE. From this simulation, we find that our method has a consistently low SRE compared with the other three methods. Figure 11(b) shows the TREs of the four methods: the TREs of PCA are much higher than those of the other methods, and the TRE of DBNSTCS is the lowest among the four methods. The cumulative distributions of SRE and TRE, shown in Figure 12, present the prediction errors more directly. Moreover, DBNSTCS achieves a much more prominent improvement in TRE than in SRE. That is because our method predicts the low-pass component of each traffic flow independently and, in contrast to the high-pass component, this component has a significant effect on prediction. Hence, the weak improvement in SRE is caused by predicting the high-pass component using the Spatiotemporal Compressive Sensing method.

Figure 9: Real traffic data versus their predictors via Tomogravity ($x$-axis: predictors of Tomogravity, $\times 10^{7}$; $y$-axis: real network traffic, $\times 10^{7}$).

Figure 10: Real traffic data versus their predictors via SRMF ($x$-axis: predictors of SRMF, $\times 10^{7}$; $y$-axis: real network traffic, $\times 10^{7}$).
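The two error metrics in (11) can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the traffic and its predictors are stored as arrays with one row per OD flow and one column per time slot, and the function names and toy numbers are hypothetical rather than taken from the simulation.

```python
import numpy as np


def spatial_relative_error(x, x_hat):
    """SRE(n) from (11): one value per flow (row); arrays have shape (flows, T)."""
    num = np.sqrt(np.sum((x - x_hat) ** 2, axis=1))
    den = np.sqrt(np.sum(x ** 2, axis=1))
    return num / den


def temporal_relative_error(x, x_hat):
    """TRE(t) from (11): one value per time slot (column)."""
    num = np.sqrt(np.sum((x - x_hat) ** 2, axis=0))
    den = np.sqrt(np.sum(x ** 2, axis=0))
    return num / den


# Toy data (hypothetical): flow 0 is predicted exactly, flow 1 is predicted as zero.
x = np.array([[3.0, 4.0], [6.0, 8.0]])
x_hat = np.array([[3.0, 4.0], [0.0, 0.0]])
sre = spatial_relative_error(x, x_hat)   # per-flow errors
tre = temporal_relative_error(x, x_hat)  # per-slot errors
```

Summing over the time axis gives one SRE per flow, while summing over the flow axis gives one TRE per time slot, which matches the two panels of Figure 11.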

A small error or bias does not by itself mean that a prediction method is usable: even with low error, a method fails to provide precise predictors when its variance is high. Hence, the standard deviation is used in our simulation as a metric for variance, defined as

\[
\mathrm{SD}(n) = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} \left(x_n(t) - \hat{x}_n(t) - b(n)\right)^2}, \tag{12}
\]

where $b(n) = (1/T)\sum_{t=1}^{T}\left(x_n(t) - \hat{x}_n(t)\right)$ is the bias and $T$ is the length of the predicted traffic data set. Figure 13 shows the bias versus the standard deviation of the four methods. We find that the four methods perform very differently with respect to variance. Tomogravity has the largest variance, and the PCA method also exhibits relatively high variance, whereas the DBNSTCS and SRMF methods show relatively low variance.
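The bias and standard deviation of (12) can be computed per flow as in the following sketch. The array layout (one row per flow, one column per time slot), the function name, and the toy values are assumptions made for illustration.

```python
import numpy as np


def bias_and_sd(x, x_hat):
    """Per-flow bias b(n) and standard deviation SD(n) of the prediction error."""
    err = x - x_hat                      # shape (flows, T)
    n_slots = err.shape[1]
    bias = err.mean(axis=1)              # b(n) = (1/T) * sum_t (x - x_hat)
    centered = err - bias[:, None]
    sd = np.sqrt(np.sum(centered ** 2, axis=1) / (n_slots - 1))
    return bias, sd


# Toy data (hypothetical): flow 0 has zero bias but spread, flow 1 a constant offset.
x = np.array([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])
x_hat = np.array([[0.0, 2.0, 4.0], [3.0, 3.0, 3.0]])
bias, sd = bias_and_sd(x, x_hat)
```

The toy case shows why both quantities are reported: a predictor can be unbiased yet scattered (flow 0), or biased yet perfectly consistent (flow 1).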

Finally, the performance improvement ratio (PIR) is shown in Figure 14 as an overall evaluation. It is defined as

\[
\mathrm{PIR} = \frac{\sum_{n=1}^{N^2}\sum_{t=1}^{T} \left|\hat{x}_{n,a}(t)\right| - \sum_{n=1}^{N^2}\sum_{t=1}^{T} \left|\hat{x}_{n,b}(t)\right|}{\sum_{n=1}^{N^2}\sum_{t=1}^{T} \left|\hat{x}_{n,a}(t)\right|}, \tag{13}
\]

where $\hat{x}_{n,a}(t)$ and $\hat{x}_{n,b}(t)$ denote the predictors via algorithms $a$ and $b$, respectively. The performance improvement ratios of DBNSTCS are 68.74%, 5.24%, and 14.70% relative to PCA, Tomogravity, and SRMF, respectively.
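As a small illustration, the ratio in (13) can be evaluated literally on two arrays of per-flow, per-slot quantities. The function name and the toy arrays are hypothetical; which quantity is fed in for algorithms $a$ and $b$ follows the definition above and is not decided by this sketch.

```python
import numpy as np


def performance_improvement_ratio(values_a, values_b):
    """Literal evaluation of (13): relative reduction of the accumulated magnitudes."""
    total_a = np.sum(np.abs(values_a))
    total_b = np.sum(np.abs(values_b))
    return (total_a - total_b) / total_a


# Toy data (hypothetical): the magnitudes for b are a quarter of those for a.
values_a = np.array([[4.0, -4.0], [2.0, -2.0]])
values_b = np.array([[1.0, -1.0], [0.5, -0.5]])
pir = performance_improvement_ratio(values_a, values_b)
```

Multiplying the result by 100 gives the percentage scale used in Figure 14.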

6. Conclusions and Future Work

This paper focuses on the problem of network traffic prediction in wireless mesh backbone networks and proposes a hierarchical prediction method. The proposed method divides the network traffic into two components and then predicts each component with a different model. In detail, it takes advantage of DBN and Spatiotemporal Compressive Sensing for network traffic prediction. The DWT is applied to divide the network traffic into two components, that is, the long-range dependence component and the fluctuation component, represented by the low-pass and high-pass components, respectively. A deep architecture consisting of a DBN and a logistic regression layer is proposed to predict the low-pass component. Meanwhile, the high-pass component is predicted by the SRMF method, which can capture its spatiotemporal characteristic. We assess the performance of the proposed prediction method and compare it with three methods that are widely used for network traffic prediction. According to the simulation, our method achieves low prediction error, especially in terms of TRE. The main bottleneck of wireless mesh backbone network traffic prediction is the prediction accuracy for the irregular fluctuations of network traffic. Therefore, a prediction algorithm aimed at low-pass components is necessary in the future.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Figure 11: SRE and TRE of DBNSTCS, PCA, Tomogravity, and SRMF. (a) Spatial relative error versus flow ID (from smallest to largest in mean). (b) Temporal relative error versus time slot order.

Figure 12: Cumulative distributions of SRE and TRE for DBNSTCS, PCA, Tomogravity, and SRMF. (a) Spatial relative error. (b) Temporal relative error.

Figure 13: Bias versus standard deviation for DBNSTCS, Tomogravity, PCA, and SRMF ($x$-axis: standard deviation in error; $y$-axis: bias).

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61701406, 61701413) and the Fundamental Research Funds for the Central Universities (3102017OQD022).

Figure 14: Performance improvement ratio (%) of DBNSTCS relative to PCA, Tomogravity, and SRMF.

References

[1] T. Qiu, R. Qiao, and D. Wu, "EABS: an event-aware backpressure scheduling scheme for emergency internet of things," IEEE Transactions on Mobile Computing, 2017.

[2] T. B. Reddy, B. S. Manoj, and R. Rao, "On the accuracy of sampling schemes for wireless network characterization," in Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC 2008, pp. 3314–3319, USA, April 2008.

[3] T. Qiu, A. Zhao, F. Xia, W. Si, and D. O. Wu, "ROSE: Robustness Strategy for Scale-Free Wireless Sensor Networks," IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2944–2959, 2017.

[4] B. Zhou, D. He, and Z. Sun, "Traffic predictability based on ARIMA/GARCH model," in Proceedings of the 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, pp. 200–207, Spain, April 2006.

[5] P. Kuanhoong, I. Tan, and C. YikKeong, "BitTorrent network traffic forecasting with ARIMA," vol. 4, pp. 143–156, 2012.

[6] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M. L. Wymore, and A. K. Sangaiah, "A Robust Time Synchronization Scheme for Industrial Internet of Things," IEEE Transactions on Industrial Informatics, pp. 1-1.

[7] P. Calyam and A. Devulapalli, "Modeling of multi-resolution active network measurement time-series," in Proceedings of the 33rd IEEE Conference on Local Computer Networks, LCN 2008, pp. 900–907, Canada, October 2008.

[8] J. Huang, C. Xing, and C. Wang, "Simultaneous wireless information and power transfer: technologies, applications, and research challenges," IEEE Communications Magazine, vol. 55, no. 11, pp. 26–32, 2017.

[9] L. Nie, D. Jiang, S. Yu, and H. Song, "Network traffic prediction based on deep belief network in wireless mesh backbone networks," in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–5, San Francisco, CA, USA, March 2017.

[10] N. C. Anand, C. Scoglio, and B. Natarajan, "GARCH - Non-linear time series model for traffic modeling and prediction," in Proceedings of the NOMS 2008 - IEEE/IFIP Network Operations and Management Symposium: Pervasive Management for Ubiquitous Networks and Services, pp. 694–697, Brazil, April 2008.

[11] J. Huang, Q. Duan, C.-C. Xing, and H. Wang, "Topology control for building a large-scale and energy-efficient internet of things," IEEE Wireless Communications Magazine, vol. 24, no. 1, pp. 67–73, 2017.

[12] M. Joshi and T. Hadi, A review of network traffic analysis and prediction techniques, Computer Science, 2015.

[13] Y. Liu, B. Li, X. Sun, and Z. Zhou, "A fusion model of SWT, QGA and BP neural network for wireless network traffic prediction," in Proceedings of the 15th IEEE International Conference on Communication Technology (ICCT '13), pp. 769–774, IEEE, Guilin, China, November 2013.

[14] F. Xu, Y. Lin, J. Huang et al., "Big data driven mobile traffic understanding and forecasting: a time series approach," IEEE Transactions on Services Computing, vol. 9, no. 5, pp. 796–805, 2016.

[15] W. Huang, G. Song, H. Hong, and K. Xie, "Deep architecture for traffic flow prediction: deep belief networks with multitask learning," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2191–2201, 2014.

[16] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[17] J. Huang, Q. Duan, Y. Zhao, Z. Zheng, and W. Wang, "Multicast Routing for Multimedia Communications in the Internet of Things," IEEE Internet of Things Journal, vol. 4, no. 1, pp. 215–224, 2017.

[18] R. B. Palm, "Prediction as a candidate for learning deep hierarchical models of data," Tech. Rep., Technical University of Denmark, 2012.

[19] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu, "Spatio-temporal compressive sensing and internet traffic matrices (Extended Version)," IEEE/ACM Transactions on Networking, vol. 20, no. 3, pp. 662–676, 2012.

[20] A. Soule, A. Lakhina, N. Taft et al., "Traffic matrices: Balancing measurements, inference and modeling," in Proceedings of the SIGMETRICS 2005: International Conference on Measurement and Modeling of Computer Systems, pp. 362–373, Canada, June 2005.

[21] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, "Computation of IP traffic from link," in Proceedings of the ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems, pp. 206–217, USA, June 2003.


Figure 1: Prediction method. The input data are decomposed by the discrete wavelet transform. The low-pass component is used to train the DBN until training is over and is then predicted by it; the high-pass component is predicted by sparsity regularized matrix factorization. The two predictions are combined by the inverse discrete wavelet transform to give the output results.

the performance of a WMN. These network management operations are implemented in terms of the relevant network traffic information. For instance, in order to improve the quality of service for users, predictive network planning is necessary for ISPs. This planning is carried out according to the future traces of network traffic flows between all possible origin-destination (OD) node pairs [2, 3].

A great number of methods have been proposed to deal with the network traffic prediction problem in traditional IP backbone networks [4–8]. Statistical methods have been widely adopted in this field. Originally, simple models such as Autoregressive (AR) and Autoregressive Integrated Moving Average (ARIMA) were used to capture the short-range dependence (SRD) of network traffic [4]. However, current network traffic exhibits a long-range dependence (LRD) characteristic and multifractal features arising from the behaviors of terminals [4]. In this case, the Fractional Autoregressive Integrated Moving Average (FARIMA) model and the Multifractal Wavelet model (MWM) are used to deal with the network traffic prediction problem [5]. With the variety of network services and applications, the characteristics of network traffic have become much more complex; for instance, it exhibits some nonlinear features [4, 6]. Hence, some methods use the GARCH model to model network traffic for prediction. Besides, many methods based on hybrid models have been proposed to predict network traffic [7]. However, these methods are not suitable for network traffic prediction in a wireless mesh backbone network [2]. Generally, users join a Wireless Mesh Network randomly. Additionally, they often have complicated individual associations, which differ significantly from those of the users of a traditional IP backbone network. Although the users' applications in the two networks may coincide, the dominant applications usually differ.

Motivated by this issue, we propose a network traffic prediction method based on the deep belief network (DBN) and the Spatiotemporal Compressive Sensing (STCS) method, named the Deep Belief Network and Spatiotemporal Compressive Sensing (DBNSTCS) model. To the best of our knowledge, this is the first paper that focuses on the problem of traffic prediction for wireless mesh backbone networks and takes into account the spatiotemporal characteristic for prediction. We treat the long-range dependence and the irregular fluctuation behaviors of network traffic independently; see Figure 1 [9]. By the discrete wavelet transform (DWT), the network traffic is divided into two components, described by the scaling coefficients and the discrete wavelet transform coefficients. Namely, the DWT acts like a filter that decomposes the network traffic into a low-pass component and a high-pass component. The low-pass component expresses the long-range dependence of network traffic, and the high-pass one describes the gusty and irregular fluctuations. LRD means that the network traffic at any time depends upon multiple previous traffic data. The LRD of network traffic derives from a series of interacting factors (e.g., the behaviors of users). To describe these multifarious relationships, the low-pass component is predicted by a deep architecture based on DBN, which can deeply learn the LRD of network traffic. For the short-range and irregular fluctuations, the STCS method, known as an excellent interpolation algorithm, is employed to predict them.

The contributions of this paper are as follows:

(i) We use the DWT to extract the low-pass and high-pass components of network traffic. The DWT of a time series can be viewed as passing this time series through a low-pass filter and a high-pass filter, respectively. Hence, we obtain the low-pass and high-pass components of network traffic, which show the low-pass approximation and the details of network traffic, respectively. In our method, we predict the two types of coefficients independently.

(ii) We propose a deep architecture based on DBN to capture the low-pass component of network traffic. By training this architecture on a training set built from known network traffic, it can describe the LRD characteristic of traffic flows and carry out a prediction for network traffic.

(iii) We adopt Spatiotemporal Compressive Sensing to fit the gusty and irregular fluctuations of network traffic. We first assume that the high-pass component obeys a spatiotemporal dependence. Then, we obtain a predictor of network traffic by the Sparsity Regularized Matrix Factorization (SRMF) method.

The remainder of this paper is organized as follows. Section 2 reviews the related work on the network traffic prediction problem. In Section 3, we introduce some definitions about network traffic, the DWT technique, the DBN theory, and the Spatiotemporal Compressive Sensing method. We propose our prediction method in Section 4 and verify its performance in Section 5. Section 6 concludes the paper.

2. Related Work

Many researchers have investigated network traffic prediction, which is instructive for congestion control, predictive network planning, and intelligent routing [10–14]. The existing network traffic prediction techniques fall into four categories: linear time series methods, nonlinear time series methods, hybrid model methods, and decomposition model methods.

The linear time series methods (e.g., AR, MA, and ARMA) are frequently used to model end-to-end traffic flows for prediction. According to recent research findings, traffic flows exhibit markedly nonlinear features under complex user behaviors and various applications. Typically, the GARCH model in [10] is used to model the burst characteristics. Besides, the neural network is also a valid method to track traffic flows with nonlinear characteristics [12].

With the rapid development of network services, the ISP network has become a heterogeneous and complex network. Traffic flows show manifold statistical characteristics, such as LRD, SRD, heavy-tailed distribution, and multifractal features. Therefore, researchers adopt hybrid models to model traffic flows with complicated distributions. The methods based on hybrid models take advantage of two or more models to capture the traces of traffic flows. In [4], the authors combine the ARIMA model with the GARCH model to fit several characteristics of traffic flows (i.e., LRD and SRD characteristics). Meanwhile, the proposed hybrid model can also model the self-similarity and multifractal features of traffic flows. The autocorrelation function and the partial autocorrelation function are employed to determine the parameters of the hybrid model. The fourth category, as mentioned above, is the decomposition model methods, in which the traffic flows are divided into several components that are then modeled and predicted separately. These methods can be viewed as an evolution of the hybrid model methods. In [13], the authors jointly use the Stationary Wavelet Transform (SWT), the Quantum Genetic Algorithm (QGA), and the Backpropagation Neural Network (BPNN) to predict wireless network traffic. They first decompose the traffic flows by the SWT such that the traffic flows are made up of several stationary components. After that, all these components are predicted by a BPNN trained using the QGA. The authors in [14] decompose the traffic of a large-scale cellular network into regular and random components by a classic time series decomposition method.

3. Background

3.1. Traffic Matrix. A traffic matrix is a representation of network traffic. After collecting network traffic information, the operators implement appropriate network management functions in terms of this information, which is expressed as the traffic matrix. If we denote an OD flow by $x_{pq}(t)$, which describes the mean volume of the traffic flow from the origin node $p$ to the destination node $q$ in the $t$th time slot, then the traffic matrix is defined as

\[
X = \begin{bmatrix}
x_{11}(1) & x_{11}(2) & \cdots & x_{11}(T) \\
x_{12}(1) & x_{12}(2) & \cdots & x_{12}(T) \\
\vdots & \vdots & \ddots & \vdots \\
x_{NN}(1) & x_{NN}(2) & \cdots & x_{NN}(T)
\end{bmatrix}, \tag{1}
\]

where $p, q \in \{1, 2, \ldots, N\}$ and $t \in \{1, 2, \ldots, T\}$. This traffic matrix reports the network traffic data over $T$ time slots. Generally, the length of a time slot is 5 or 15 minutes.
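To make the layout of (1) concrete, the sketch below stacks per-pair OD volumes into an $N^2 \times T$ matrix. The input format (an array indexed by origin, destination, and time slot), the function name, and the toy values are assumptions for illustration only.

```python
import numpy as np


def build_traffic_matrix(od_volumes):
    """Stack OD flows x_pq(t) into the N^2-by-T traffic matrix of (1).

    od_volumes has shape (N, N, T); row p * N + q (0-based) holds flow (p, q),
    so the rows run x_11, x_12, ..., x_NN as in (1).
    """
    n_origins, n_destinations, n_slots = od_volumes.shape
    if n_origins != n_destinations:
        raise ValueError("expected a square set of OD pairs")
    return od_volumes.reshape(n_origins * n_destinations, n_slots)


# Toy data (hypothetical): N = 3 nodes and T = 4 time slots.
N, T = 3, 4
od = np.arange(N * N * T, dtype=float).reshape(N, N, T)
X = build_traffic_matrix(od)
```

Each row of `X` is then one OD flow as a time series, which is the form used by the per-flow and matrix-wide steps later in the paper.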

3.2. Discrete Wavelet Transform. A time series $f(t)$ can be expressed as

\[
f(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}\, 2^{-L/2}\, \phi\!\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}\, 2^{-l/2}\, \psi\!\left(\frac{t}{2^{l}} - n\right), \tag{2}
\]

where $c_{L,n}$ and $d_{l,n}$ are the scaling and discrete wavelet transform coefficients, $2^{-L/2}\phi(t/2^{L} - n)$ is the scaling function at scale $L$, and $2^{-l/2}\psi(t/2^{l} - n)$ is the so-called wavelet function. The scaling coefficients express a coarse approximation of $f(t)$, and the discrete wavelet transform coefficients represent the details of $f(t)$. Hence, (2) can be viewed as passing the time series through a filter and then obtaining a representation of $f(t)$ by the combination of low-pass and high-pass approximations.

Figure 2: Restricted Boltzmann Machine (a visible layer of visible units and a hidden layer of hidden units).

Figure 3: Deep Belief Network with two Restricted Boltzmann Machines (a visible layer and two hidden layers, $h_1$ and $h_2$).

3.3. Deep Belief Network. The DBN is a common deep learning primitive. It is a combination of a number of Restricted Boltzmann Machines (RBMs) [15–17]. An RBM, which is a two-layer undirected graphical model, consists of a visible layer and a hidden layer, denoted by $v$ and $h$ (see Figure 2) [9, 15]. Each unit in a layer is connected with all units of the other layer by undirected edges, while units in the same layer are not connected with each other. Figure 3 shows an example of a DBN architecture with two RBMs [9]; in general, the DBN is a stack of many RBMs. The values of all units are stochastic variables [16]. Generally, they obey a Bernoulli distribution or a Gaussian distribution. When the visible and hidden units are Gaussian and Bernoulli, respectively, we have the following conditional distributions:

\[
P(v_i \mid h) = \mathcal{N}\!\left(b_i + \sum_{j=1}^{J} w_{ij} h_j,\; 1\right), \qquad
P(h_j = 1 \mid v) = \mathrm{sigm}\!\left(a_j + \sum_{i=1}^{I} w_{ij} v_i\right), \tag{3}
\]

where $\mathcal{N}(b_i + \sum_{j=1}^{J} w_{ij} h_j, 1)$ denotes the Gaussian distribution whose mean and variance are $b_i + \sum_{j=1}^{J} w_{ij} h_j$ and 1, $\mathrm{sigm}(z) = \exp(z)/(1 + \exp(z))$ is the sigmoid function, and $I$ and $J$ are the numbers of visible and hidden units, respectively [18]. $b_i$ and $a_j$ are the biases of the visible and hidden units, and $w_{ij}$ expresses the symmetric interaction term between the visible unit $v_i$ and the hidden unit $h_j$. For an RBM, the joint probability distribution function over the visible and hidden units can be denoted by

\[
P(v, h) = \frac{\exp(-E(v, h))}{\sum_{v, h} \exp(-E(v, h))}, \tag{4}
\]

where $E(v, h)$ is termed the energy function, defined as

\[
E(v, h) = \frac{1}{2} \sum_{i=1}^{I} (v_i - b_i)^2 - \sum_{j=1}^{J} a_j h_j - \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ij} v_i h_j. \tag{5}
\]
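The conditional distributions in (3) translate directly into one block Gibbs sampling step for a single RBM, sketched below in NumPy. The function names, the array shapes, and the random toy parameters are assumptions for illustration; this is not the training code used in the paper.

```python
import numpy as np


def sigm(z):
    """Sigmoid, equal to exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + np.exp(-z))


def hidden_probabilities(v, W, a):
    """P(h_j = 1 | v) from (3); W has shape (I, J), a has shape (J,)."""
    return sigm(a + v @ W)


def visible_mean(h, W, b):
    """Mean of the unit-variance Gaussian P(v_i | h) from (3); b has shape (I,)."""
    return b + W @ h


def gibbs_step(v, W, a, b, rng):
    """Sample h given v (Bernoulli), then v given h (Gaussian with variance 1)."""
    p_h = hidden_probabilities(v, W, a)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    v_new = visible_mean(h, W, b) + rng.standard_normal(b.shape)
    return v_new, h


# Toy parameters (hypothetical): I = 4 visible units and J = 3 hidden units.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 3))
a = np.zeros(3)
b = np.zeros(4)
v0 = rng.standard_normal(4)
v1, h1 = gibbs_step(v0, W, a, b, rng)
```

Because units within a layer are not connected, each layer can be sampled in one vectorized operation given the other layer.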

To train the DBN, a layer-wise greedy strategy is employed, and the parameters are updated by maximizing the log probability $\log P(v)$.

3.4. Spatiotemporal Compressive Sensing and Sparsity Regularized Matrix Factorization. Compressive sensing is a novel sampling technique for signal processing that makes good use of the structure or redundancy of real-world signals. It takes advantage of an adaptive sampling scheme to sense these structured signals. In detail, the structure of such a signal means that it can be represented by a vector that has only a few nonzero elements (i.e., the vector is sparse). In the adaptive sampling scheme, a random matrix called the measurement matrix is used to implement compressing and coding concurrently. During the decoding phase, the compressive sensing reconstruction algorithm is an effective approach to dealing with the ill-posed inverse problem.

Figure 4: Decomposition of OD 63 over time slots 0 to 1000. (a) Low-pass component of OD 63. (b) High-pass component of OD 63.

As a derivative of compressive sensing, the spatiotemporal compressive sensing technique is commonly used as an interpolation algorithm to recover the missing elements of a data set. The spatiotemporal feature means that the values of neighboring elements in the data set are similar. Based on this feature, the SRMF method is proposed in [19], where the missing elements can be recovered precisely even though the data loss probability is very high. Besides, the SRMF method is also an accurate tool for prediction: in the prediction process, the elements that need to be predicted are viewed as consecutive missing elements.

4. Our Methodology

4.1. Decomposition of Network Traffic. We assume that the known network traffic is denoted by $X$, each OD flow of which is a time series $x_{pq}(t)$, where $t = 1, 2, \ldots, T$. According to (2), it can be expressed as

\[
x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}^{pq}\, 2^{-L/2}\, \phi\!\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}^{pq}\, 2^{-l/2}\, \psi\!\left(\frac{t}{2^{l}} - n\right). \tag{6}
\]

If we set the scale to be 1, then we have

\[
x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{1,n}^{pq}\, 2^{-1/2}\, \phi\!\left(\frac{t}{2} - n\right) + \sum_{n=-\infty}^{+\infty} d_{1,n}^{pq}\, 2^{-1/2}\, \psi\!\left(\frac{t}{2} - n\right). \tag{7}
\]

The above equation divides the network traffic into two components. One is the low-pass approximation (given by the scaling coefficients), which exhibits the LRD of the network traffic $x_{pq}(t)$; the other is the high-pass approximation (described by the discrete wavelet transform coefficients), which expresses the gusty and irregular fluctuation behaviors of $x_{pq}(t)$. For a traffic matrix that describes the volume of traffic between all OD node pairs, its low-pass and high-pass approximation components can be denoted by two matrices, respectively.

Figures 4 and 5 give two examples of traffic flow decomposition by the DWT. We randomly select two OD flows from the real network traffic data set and plot their low-pass and high-pass components. We see that the low-pass components are periodic, which means that they are much easier to predict than the high-pass components shown in Figures 4(b) and 5(b). For this reason, the two components are predicted independently in this paper.
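A scale-1 decomposition of the kind described above can be sketched with the Haar wavelet, which needs only NumPy. The choice of the Haar wavelet is an assumption made here for simplicity, since the wavelet family is not stated; the function names and the toy series are likewise hypothetical.

```python
import numpy as np


def haar_dwt(x):
    """Single-level Haar DWT of an even-length series.

    Returns the scaling (low-pass) and wavelet (high-pass) coefficients,
    each of length T / 2.
    """
    x = np.asarray(x, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("series length must be even")
    even, odd = x[0::2], x[1::2]
    c = (even + odd) / np.sqrt(2.0)  # coarse approximation
    d = (even - odd) / np.sqrt(2.0)  # details
    return c, d


def haar_idwt(c, d):
    """Inverse of haar_dwt: rebuild the series from both coefficient sets."""
    x = np.empty(2 * c.size)
    x[0::2] = (c + d) / np.sqrt(2.0)
    x[1::2] = (c - d) / np.sqrt(2.0)
    return x


# Toy OD flow (hypothetical values) with T = 6 time slots.
flow = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 3.0])
c, d = haar_dwt(flow)
rebuilt = haar_idwt(c, d)
```

The two outputs each have half the length of the input, in line with the $T/2$ scaling coefficients used in Section 4.2, and the inverse transform recovers the original series once both components are available.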

4.2. Deep Architecture for Low-Pass Component Prediction. For an OD flow $x_{pq}(t)$, we assume that the length $T$ of this series is an even number. In this case, the number of scaling coefficients is $T/2$. The deep architecture for prediction is plotted in Figure 6. There are $M$ hidden layers in this architecture, and both the hidden and the input layers have $T/2$ units. At the top of the deep architecture, a single neuron implementing logistic regression is employed for prediction. The logistic regression model is made up of a hidden layer with $T/2$ units and an output layer with one unit. The deep architecture is trained by the backpropagation algorithm in our method, and its parameters are determined layer by layer [18].

In our method, we first collect $K$ training samples, denoted by $(x_{pq}^{1}(t), \ldots, x_{pq}^{K}(t))$. Then we have the scaling coefficient training set $(c_{pq}^{1}, \ldots, c_{pq}^{K})$. Meanwhile, the corresponding predictors are $(\hat{c}_{pq}^{1}, \ldots, \hat{c}_{pq}^{K})$. By training the proposed deep architecture using the scaling coefficient training set, we obtain a relationship between the input and output scaling coefficients. Furthermore, we use the scaling coefficients of $x_{pq}(t)$ as an input, and then a predictor of the scaling coefficient is obtained.

Figure 5: Decomposition of OD 105 over time slots 0 to 1000. (a) Low-pass component of OD 105. (b) High-pass component of OD 105.

Figure 6: Deep architecture for prediction (a DBN with hidden layers $h_1, \ldots, h_M$ topped by a logistic regression layer).

4.3. Spatiotemporal Compressive Sensing for High-Pass Component Prediction. SRMF is a matrix-oriented interpolation algorithm. Therefore, unlike the prediction of the scaling coefficients, where each OD flow is predicted independently, we predict the discrete wavelet transform coefficients of all OD flows at the same time. According to (7), the discrete wavelet transform coefficients of all OD flows constitute a matrix, denoted by $D$ in this paper. The matrix $D$ consists of two portions: one is the training data obtained from measured network traffic data, and the other needs to be predicted. We denote the final prediction result by $\hat{D}$, which can be obtained from the following regularized optimization model:

\[
\min \; \left\|D - \hat{D}\right\|_F^2 + \lambda \left(\|L\|_F^2 + \|R\|_F^2\right) + \left\|(L R^{T}) H\right\|_F^2, \tag{8}
\]

where $\|\cdot\|_F$ and $(\cdot)^{T}$ denote the Frobenius norm and transposition, respectively. The matrix $H$, the so-called temporal constraint matrix, describes the temporal neighbors:

\[
H = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 \\
0 & 1 & -1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}. \tag{9}
\]

The matrices $L$ and $R$ come from the singular value decomposition of the matrix $D$, that is,

\[
D = U \Sigma V^{T}, \tag{10}
\]

where $U$ and $V$ are two unitary matrices and $\Sigma$ is a diagonal matrix whose diagonal elements are the singular values of $D$. The matrices $L$ and $R$ are equal to $U\Sigma^{1/2}$ and $V\Sigma^{1/2}$, respectively.
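The optimization model (8) with the initialization from (10) can be sketched as follows. Several choices here are assumptions rather than details given in the text: the predictor is taken to be $\hat{D} = LR^{T}$, the data-fit term is evaluated only on observed entries through a 0/1 mask (the entries to be predicted are masked out), the factors are truncated to a small rank, and plain gradient descent with backtracking stands in for whatever solver was actually used. All names and toy values are hypothetical.

```python
import numpy as np


def temporal_constraint_matrix(n):
    """H from (9): ones on the diagonal and -1 on the superdiagonal."""
    return np.eye(n) - np.eye(n, k=1)


def objective(L, R, D, mask, H, lam):
    """Value of (8), with the fit term restricted to observed entries (assumption)."""
    X = L @ R.T
    fit = np.sum((mask * (X - D)) ** 2)
    reg = lam * (np.sum(L ** 2) + np.sum(R ** 2))
    smooth = np.sum((X @ H) ** 2)
    return fit + reg + smooth


def gradients(L, R, D, mask, H, lam):
    """Gradients of the objective with respect to L and R."""
    X = L @ R.T
    G = 2.0 * mask * (X - D) + 2.0 * (X @ H) @ H.T
    return G @ R + 2.0 * lam * L, G.T @ L + 2.0 * lam * R


def srmf_sketch(D, mask, rank=2, lam=0.1, iters=200):
    """Return L @ R.T and the objective history of a simple descent scheme."""
    H = temporal_constraint_matrix(D.shape[1])
    # Initialization in the spirit of (10): L = U S^(1/2), R = V S^(1/2),
    # computed from the zero-filled observed data and truncated to `rank`.
    U, s, Vt = np.linalg.svd(mask * D, full_matrices=False)
    L = U[:, :rank] * np.sqrt(s[:rank])
    R = Vt[:rank].T * np.sqrt(s[:rank])
    step, history = 1.0, []
    for _ in range(iters):
        f = objective(L, R, D, mask, H, lam)
        history.append(f)
        g_l, g_r = gradients(L, R, D, mask, H, lam)
        # Backtracking: halve the step until the objective does not increase.
        while step > 1e-12:
            L_new, R_new = L - step * g_l, R - step * g_r
            if objective(L_new, R_new, D, mask, H, lam) <= f:
                L, R = L_new, R_new
                step *= 2.0
                break
            step *= 0.5
    history.append(objective(L, R, D, mask, H, lam))
    return L @ R.T, history


# Toy data (hypothetical): 3 flows, 12 slots, last two slots treated as unknown.
rng = np.random.default_rng(0)
slots = np.arange(12)
D_toy = np.outer([1.0, 2.0, 0.5], np.sin(slots / 3.0))
D_toy += 0.01 * rng.standard_normal(D_toy.shape)
mask = np.ones_like(D_toy)
mask[:, -2:] = 0.0
D_hat, history = srmf_sketch(D_toy, mask)
```

The last two columns of `D_hat` play the role of the predicted coefficients; their quality depends on `rank`, `lam`, and the number of iterations, none of which are tuned here.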

Finally, according to the predictors of the scaling and discrete wavelet transform coefficients, we predict the network traffic by the inverse discrete wavelet transform. Algorithm 1 presents the details of our method.

5. Simulation Results and Analysis

This section verifies the performance of our prediction method. In our simulations, a real network traffic data set with 2016 time slots, sampled on a time scale of 5 minutes, is used. For accurate prediction, the first 2000 time slots are used as the prior information to train the deep architecture and the matrix $D$. We compare our method with three state-of-the-art methods in the network traffic prediction field, that is, the principal component analysis (PCA) method [20], the Tomogravity method [21], and the SRMF method [19]. The proposed method is implemented in MATLAB on a single machine with a Core i5 central processing unit, 4 GB of memory, and 1664 MB of graphics processing unit (GPU) memory. Meanwhile, we set $M = 8$ and $T = 800$.

Algorithm 1: DBNSTCS algorithm.
    Input: training set $(x_{pq}^{1}(t), x_{pq}^{2}(t), \ldots, x_{pq}^{K}(t))$, known network traffic $X$
    Output: network traffic predictor $\hat{X}(T+1)$
    for $i = 1, \ldots, K$ do
        $c_{pq}^{i} \leftarrow \mathrm{DWT}(x_{pq}^{i}(t))$
    end for
    Train the deep architecture using $(c_{pq}^{1}, c_{pq}^{2}, \ldots, c_{pq}^{K})$ as a training set
    $[C, D] \leftarrow \mathrm{DWT}(X)$
    $\hat{C} \leftarrow \mathrm{DBN}(C)$
    $\hat{D} \leftarrow \mathrm{SRMF}(D)$
    $\hat{X}(T+1) \leftarrow \mathrm{IDWT}(\hat{C}, \hat{D})$
    return $\hat{X}(T+1)$

Figure 7: Real traffic data versus their predictors via DBNSTCS ($x$-axis: predictors of DBNSTCS, $\times 10^{7}$; $y$-axis: real network traffic, $\times 10^{7}$).

We first plot the real network traffic versus the predictors obtained by the four methods. Figure 7 displays the prediction results of our method. The $x$-axis and $y$-axis denote the predictors and the real network traffic, respectively. From Figure 7, we see that our method has low prediction biases for small network traffic flows. By contrast, our method shows positive prediction biases for large network traffic. The same conclusion can be obtained for PCA from Figure 8: for large network traffic, it also has positive prediction biases. For small network traffic, PCA has a much larger prediction bias. Tomogravity has consistently positive prediction biases for large network traffic, as shown in Figure 9. For small network traffic, Tomogravity shows a desirable prediction error. Besides, for large network traffic, SRMF in Figure 10 shows both positive and negative prediction biases to some extent.

Figure 8: Real traffic data versus their predictors via PCA ($x$-axis: predictors of PCA; $y$-axis: real network traffic).

Now we use the spatial and temporal relative errors as metrics to compare the four methods. The spatial and temporal relative errors are defined as

$$\mathrm{SRE}(n) = \frac{\sqrt{\sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t)\right)^2}}{\sqrt{\sum_{t=1}^{T} x_n^2(t)}}, \qquad \mathrm{TRE}(t) = \frac{\sqrt{\sum_{n=1}^{N^2} \left(\hat{x}_n(t) - x_n(t)\right)^2}}{\sqrt{\sum_{n=1}^{N^2} x_n^2(t)}}, \tag{11}$$

where $x_n(t)$ and $\hat{x}_n(t)$ are the $n$th end-to-end network traffic flow and its predictor. As mentioned above, $p, q \in \{1, 2, \ldots, N\}$; thus the number of OD flows is $N^2$. Figure 11(a) exhibits the spatial relative errors (SREs) of the four methods. The $x$-axis gives the identities of the end-to-end network traffic flows, which are sorted in descending order with respect to their means. Meanwhile, the $y$-axis is the SRE. From this simulation, we find that our method has a consistently low SRE compared with the other three methods. Figure 11(b) shows the TREs of the four methods. It shows that the TREs of PCA are much higher than those of the other methods. The TRE of DBNSTCS is the lowest among the four methods. The cumulative distributions of SRE and TRE are shown in Figure 12, which presents the prediction error more directly. Besides, we find that DBNSTCS achieves a much more prominent improvement in TRE than in SRE. That is because our method predicts the low-pass component of each traffic flow independently. In addition, in contrast to the high-pass component, this component has a significant effect on prediction. Hence, the weak improvement in SRE is caused by predicting the high-pass component using the Spatiotemporal Compressive Sensing method.

Figure 9: Real traffic data versus their predictors via Tomogravity ($x$-axis: predictors of Tomogravity; $y$-axis: real network traffic).

Figure 10: Real traffic data versus their predictors via SRMF ($x$-axis: predictors of SRMF; $y$-axis: real network traffic).
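The two error metrics in (11) can be sketched in a few lines of Python. This is a minimal illustration, not the MATLAB implementation used in the paper; the function names and the list-of-lists layout (one row per OD flow, one column per time slot) are assumptions, and flows or time slots whose real traffic is all zero are assumed not to occur.

```python
import math

def spatial_relative_error(actual, predicted):
    """SRE(n): one value per OD flow, aggregated over all time slots."""
    result = []
    for x, x_hat in zip(actual, predicted):
        num = math.sqrt(sum((p - a) ** 2 for a, p in zip(x, x_hat)))
        den = math.sqrt(sum(a * a for a in x))
        result.append(num / den)
    return result

def temporal_relative_error(actual, predicted):
    """TRE(t): one value per time slot, aggregated over all OD flows."""
    num_slots = len(actual[0])
    result = []
    for t in range(num_slots):
        num = math.sqrt(sum((x_hat[t] - x[t]) ** 2
                            for x, x_hat in zip(actual, predicted)))
        den = math.sqrt(sum(x[t] ** 2 for x in actual))
        result.append(num / den)
    return result

# Toy data (hypothetical): 2 OD flows, 2 time slots.
actual = [[3.0, 4.0], [6.0, 8.0]]
predicted = [[3.0, 2.0], [6.0, 8.0]]
sre = spatial_relative_error(actual, predicted)
tre = temporal_relative_error(actual, predicted)
print(sre, tre)
```

SRE summarizes how well each flow is tracked over time, while TRE summarizes how well the whole set of flows is predicted in each time slot.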

A small error or bias alone does not mean that a prediction method is usable. Even if it has a low error, a method fails to provide precise predictors when it has a high variance. Hence, the standard deviation is included in our simulation as a metric for variance, which is defined as

$$\mathrm{SD}(n) = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t) - b(n)\right)^2}, \tag{12}$$

where $b(n) = (1/T)\sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t)\right)$. In (12), $T$ is the length of the predicted traffic data set. Figure 13 shows the bias versus the standard deviation of the four methods. We find that the four methods perform very differently with respect to variance. Tomogravity has a larger variance than the other methods. The PCA method also exhibits a relatively high variance, whereas the DBNSTCS and SRMF methods show a relatively low variance.
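The bias $b(n)$ and the standard deviation in (12) can be computed for a single flow as sketched below. The function name is hypothetical, and the error is taken as predictor minus real traffic, which is an assumed sign convention.

```python
import math

def bias_and_sd(actual, predicted):
    """Return (b(n), SD(n)) for one OD flow.

    The error is taken as predictor minus real traffic (assumed convention).
    Requires at least two time slots because of the 1/(T-1) factor.
    """
    errors = [p - a for a, p in zip(actual, predicted)]
    T = len(errors)
    b = sum(errors) / T
    sd = math.sqrt(sum((e - b) ** 2 for e in errors) / (T - 1))
    return b, sd

# Toy data (hypothetical): one flow over three time slots.
b, sd = bias_and_sd([10.0, 20.0, 30.0], [11.0, 23.0, 32.0])
print(b, sd)
```

A method can have a bias close to zero and still be imprecise if the errors scatter widely around that bias, which is why both quantities are reported together.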

Finally, the performance improvement ratio is shown in Figure 14 as an overall evaluation. The performance improvement ratio is defined as

$$\mathrm{PIR} = \frac{\sum_{n=1}^{N^2}\sum_{t=1}^{T} \left|\hat{x}_{n,a}(t)\right| - \sum_{n=1}^{N^2}\sum_{t=1}^{T} \left|\hat{x}_{n,b}(t)\right|}{\sum_{n=1}^{N^2}\sum_{t=1}^{T} \left|\hat{x}_{n,a}(t)\right|}, \tag{13}$$

where $\hat{x}_{n,a}(t)$ and $\hat{x}_{n,b}(t)$ denote the predictors via the algorithms $a$ and $b$, respectively. The performance improvement ratios of DBNSTCS are 68.74%, 5.24%, and 14.70% relative to PCA, Tomogravity, and SRMF, respectively.
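A direct transcription of (13) is sketched below. The function name is hypothetical, and the inputs are taken here to be per-flow, per-slot prediction errors of the algorithms $a$ and $b$; that reading is an assumption, made so that a positive ratio means that $b$ improves on $a$.

```python
def performance_improvement_ratio(values_a, values_b):
    """(sum|a| - sum|b|) / sum|a| over all flows and time slots.

    values_a, values_b: lists of rows (one row per flow, one entry per slot).
    Treating the entries as prediction errors is an assumption of this sketch.
    """
    total_a = sum(abs(v) for row in values_a for v in row)
    total_b = sum(abs(v) for row in values_b for v in row)
    return (total_a - total_b) / total_a

# Toy data (hypothetical): 2 flows, 2 time slots.
errors_a = [[4.0, -4.0], [2.0, 0.0]]
errors_b = [[1.0, -1.0], [0.5, 0.0]]
pir = performance_improvement_ratio(errors_a, errors_b)
print(pir)
```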

6. Conclusions and Future Work

This paper focuses on the problem of network traffic prediction in wireless mesh backbone networks and proposes a hierarchical prediction method. The proposed method divides the network traffic into two components and then predicts each component with a different model. In detail, the proposed method takes advantage of DBN and Spatiotemporal Compressive Sensing for network traffic prediction. In our method, the DWT is applied to divide the network traffic into two components, that is, the long-range dependence component and the fluctuation component, represented by the low-pass and high-pass components, respectively. A deep architecture consisting of a DBN layer and a logistic regression architecture is proposed to predict the low-pass component. Meanwhile, the other component is predicted by the SRMF method, which can capture the spatiotemporal characteristic of the high-pass component. We assess the performance of the proposed prediction method and compare it with three methods that are widely used for network traffic prediction. According to the simulation, our method performs well in terms of prediction error, especially in TRE. The main bottleneck of wireless mesh backbone network traffic prediction is the prediction accuracy for the irregular fluctuations of network traffic. Therefore, a prediction algorithm aimed at low-pass components is necessary in the future.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Figure 11: SRE and TRE of DBNSTCS, PCA, Tomogravity, and SRMF. (a) Spatial relative error versus flow ID (from smallest to largest in mean). (b) Temporal relative error versus time slot order.

Figure 12: Cumulative distributions of SRE and TRE for DBNSTCS, PCA, Tomogravity, and SRMF. (a) Spatial relative error. (b) Temporal relative error.

Figure 13: Bias versus standard deviation for DBNSTCS, Tomogravity, PCA, and SRMF ($x$-axis: standard deviation in error; $y$-axis: bias).

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61701406, 61701413) and the Fundamental Research Funds for the Central Universities (3102017OQD022).

Figure 14: Performance improvement ratio (%) of DBNSTCS relative to PCA, Tomogravity, and SRMF.

References

[1] T. Qiu, R. Qiao, and D. Wu, “EABS: an event-aware backpressure scheduling scheme for emergency internet of things,” IEEE Transactions on Mobile Computing, 2017.

[2] T. B. Reddy, B. S. Manoj, and R. Rao, “On the accuracy of sampling schemes for wireless network characterization,” in Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC 2008, pp. 3314–3319, USA, April 2008.

[3] T. Qiu, A. Zhao, F. Xia, W. Si, and D. O. Wu, “ROSE: Robustness Strategy for Scale-Free Wireless Sensor Networks,” IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2944–2959, 2017.

[4] B. Zhou, D. He, and Z. Sun, “Traffic predictability based on ARIMA/GARCH model,” in Proceedings of the 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, pp. 200–207, Spain, April 2006.

[5] P. Kuanhoong, I. Tan, and C. YikKeong, “BitTorrent network traffic forecasting with ARIMA,” vol. 4, pp. 143–156, 2012.

[6] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M. L. Wymore, and A. K. Sangaiah, “A Robust Time Synchronization Scheme for Industrial Internet of Things,” IEEE Transactions on Industrial Informatics, pp. 1-1.

[7] P. Calyam and A. Devulapalli, “Modeling of multi-resolution active network measurement time-series,” in Proceedings of the 33rd IEEE Conference on Local Computer Networks, LCN 2008, pp. 900–907, Canada, October 2008.

[8] J. Huang, C. Xing, and C. Wang, “Simultaneous wireless information and power transfer: technologies, applications, and research challenges,” IEEE Communications Magazine, vol. 55, no. 11, pp. 26–32, 2017.

[9] L. Nie, D. Jiang, S. Yu, and H. Song, “Network traffic prediction based on deep belief network in wireless mesh backbone networks,” in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–5, San Francisco, CA, USA, March 2017.

[10] N. C. Anand, C. Scoglio, and B. Natarajan, “GARCH - Non-linear time series model for traffic modeling and prediction,” in Proceedings of the NOMS 2008 - IEEE/IFIP Network Operations and Management Symposium: Pervasive Management for Ubiquitous Networks and Services, pp. 694–697, Brazil, April 2008.

[11] J. Huang, Q. Duan, C.-C. Xing, and H. Wang, “Topology control for building a large-scale and energy-efficient internet of things,” IEEE Wireless Communications Magazine, vol. 24, no. 1, pp. 67–73, 2017.

[12] M. Joshi and T. Hadi, A review of network traffic analysis and prediction techniques, Computer Science, 2015.

[13] Y. Liu, B. Li, X. Sun, and Z. Zhou, “A fusion model of SWT, QGA and BP neural network for wireless network traffic prediction,” in Proceedings of the 15th IEEE International Conference on Communication Technology (ICCT ’13), pp. 769–774, IEEE, Guilin, China, November 2013.

[14] F. Xu, Y. Lin, J. Huang, et al., “Big data driven mobile traffic understanding and forecasting: a time series approach,” IEEE Transactions on Services Computing, vol. 9, no. 5, pp. 796–805, 2016.

[15] W. Huang, G. Song, H. Hong, and K. Xie, “Deep architecture for traffic flow prediction: deep belief networks with multitask learning,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2191–2201, 2014.

[16] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[17] J. Huang, Q. Duan, Y. Zhao, Z. Zheng, and W. Wang, “Multicast Routing for Multimedia Communications in the Internet of Things,” IEEE Internet of Things Journal, vol. 4, no. 1, pp. 215–224, 2017.

[18] R. B. Palm, “Prediction as a candidate for learning deep hierarchical models of data,” Tech. Rep., Technical University of Denmark, 2012.

[19] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu, “Spatio-temporal compressive sensing and internet traffic matrices (Extended Version),” IEEE/ACM Transactions on Networking, vol. 20, no. 3, pp. 662–676, 2012.

[20] A. Soule, A. Lakhina, N. Taft, et al., “Traffic matrices: Balancing measurements, inference and modeling,” in Proceedings of the SIGMETRICS 2005: International Conference on Measurement and Modeling of Computer Systems, pp. 362–373, Canada, June 2005.

[21] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, “Computation of IP traffic from link,” in Proceedings of the ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems, pp. 206–217, USA, June 2003.



time series can be viewed as passing this time series through a low-pass filter and a high-pass filter, respectively. Hence, we can obtain the low-pass and high-pass components of network traffic, which show the low-pass approximation and the details of network traffic, respectively. In our method, we predict the two types of coefficients independently.

(ii) We propose a deep architecture based on DBN to capture the low-pass component of network traffic. By training this deep architecture on a training set built from known network traffic, the deep architecture can describe the LRD characteristic of traffic flows and carry out a prediction for network traffic.

(iii) We adopt the Spatiotemporal Compressive Sensing to fit the gusty and irregular fluctuations of network traffic. We first assume that the high-pass component obeys a spatiotemporal dependence. Then we obtain a predictor of network traffic by the Sparsity Regularized Matrix Factorization (SRMF) method.

The remaining parts of this paper are organized as follows. Section 2 reviews the related work on the network traffic prediction problem. In Section 3, we introduce some definitions about network traffic, the DWT techniques, the DBN theory, and the Spatiotemporal Compressive Sensing method. We propose our prediction method in Section 4. Then we verify the performance of our method in Section 5. Section 6 concludes this paper.

2. Related Work

Many researchers have investigated network traffic prediction, which is instructive for congestion control, predictive network planning, and intelligent routing [10–14]. The existing network traffic prediction techniques fall into four categories: linear time series methods, nonlinear time series methods, hybrid model methods, and decomposition model methods.

The linear time series methods (e.g., AR, MA, and ARMA) are frequently used to model end-to-end traffic flows for prediction. According to recent research findings, traffic flows exhibit markedly nonlinear features under complex user behaviors and various applications. Typically, the GARCH model in [10] is used to model the burst characteristics. Besides, the neural network is also a valid method to track traffic flows with nonlinear characteristics [12].

With the rapid development of network services, the ISP network has become a heterogeneous and complex network. Traffic flows show manifold statistical characteristics, such as LRD, SRD, heavy-tailed distribution, and multifractal features. Therefore, researchers adopt hybrid models to model traffic flows with complicated distributions. The methods based on hybrid models take advantage of two or more models to capture the traces of traffic flows. In [4], the authors combine the ARIMA model with the GARCH model to fit several characteristics of traffic flows (i.e., LRD and SRD characteristics). Meanwhile, the proposed hybrid model can also model the self-similarity and multifractal features of traffic flows. The autocorrelation function and the partial autocorrelation function are employed by the authors to determine the parameters of the hybrid model. The fourth category, as mentioned above, is the decomposition model methods, in which the traffic flows are divided into several components. Based on this decomposition, the obtained components are modeled and predicted separately. These methods can be viewed as an evolution of the hybrid model methods. In [13], the authors jointly use the Stationary Wavelet Transform (SWT), the Quantum Genetic Algorithm (QGA), and the Backpropagation Neural Network (BPNN) to predict wireless network traffic. They first decompose the traffic flows by the SWT such that the traffic flows are made up of several stationary components. After that, all these components are predicted by a BPNN trained using the QGA. The authors in [14] decompose the traffic of a large-scale cellular network into regular and random components by a classic time series decomposition method.

3. Background

3.1. Traffic Matrix. A traffic matrix is a representation of network traffic. After collecting network traffic information, the operators implement appropriate network management functions based on this information. During this period, the network traffic information is expressed as the traffic matrix. If we denote an OD flow by $x_{pq}(t)$, which describes the mean volume of the traffic flow from the origin node $p$ to the destination node $q$ in the $t$th time slot, then the traffic matrix is defined as

$$X = \begin{bmatrix} x_{11}(1) & x_{11}(2) & \cdots & x_{11}(T) \\ x_{12}(1) & x_{12}(2) & \cdots & x_{12}(T) \\ \vdots & \vdots & \ddots & \vdots \\ x_{NN}(1) & x_{NN}(2) & \cdots & x_{NN}(T) \end{bmatrix}, \tag{1}$$

where $p, q \in \{1, 2, \ldots, N\}$ and $t \in \{1, 2, \ldots, T\}$. This traffic matrix reports the network traffic data over $T$ time slots. Generally, the length of a time slot is 5 or 15 minutes.
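As a small illustration of the layout in (1), the sketch below stacks OD flows into a matrix with $N^2$ rows and $T$ columns, ordered as $x_{11}, x_{12}, \ldots, x_{NN}$. The dictionary keyed by `(p, q)` and the function name are assumptions made for this example only.

```python
def build_traffic_matrix(od_flows, num_nodes):
    """Stack OD flows row by row in the order x_11, x_12, ..., x_NN.

    od_flows: dict mapping (p, q), with 1 <= p, q <= num_nodes, to a list of
    T traffic volumes (hypothetical input format).
    Returns a list of num_nodes**2 rows, each of length T.
    """
    rows = []
    for p in range(1, num_nodes + 1):
        for q in range(1, num_nodes + 1):
            rows.append(list(od_flows[(p, q)]))
    return rows

# Toy data (hypothetical): N = 2 nodes, T = 3 time slots.
flows = {
    (1, 1): [0.0, 0.0, 0.0],
    (1, 2): [5.0, 6.0, 7.0],
    (2, 1): [1.0, 2.0, 3.0],
    (2, 2): [0.0, 0.0, 0.0],
}
X = build_traffic_matrix(flows, num_nodes=2)
print(len(X), len(X[0]))
```

With this ordering, the row index of the flow from $p$ to $q$ is $(p-1)N + (q-1)$ when counting from zero.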

3.2. Discrete Wavelet Transform. A time series $f(t)$ can be expressed by

$$f(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}\, 2^{-L/2} \phi\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}\, 2^{-l/2} \psi\left(\frac{t}{2^{l}} - n\right), \tag{2}$$

where $c_{L,n}$ and $d_{l,n}$ are the scaling and discrete wavelet transform coefficients, $2^{-L/2}\phi(t/2^{L} - n)$ is the scaling function at scale $L$, and $2^{-l/2}\psi(t/2^{l} - n)$ is the so-called wavelet function. The scaling coefficients express a coarse approximation of $f(t)$. The discrete wavelet transform coefficients represent the details of $f(t)$. Hence, (2) can be viewed as passing the time series through a filter and then obtaining a representation of $f(t)$ by the combination of low-pass and high-pass approximations.

Figure 2: Restricted Boltzmann Machine (a visible layer of visible units and a hidden layer of hidden units).

Figure 3: Deep Belief Network with two Restricted Boltzmann Machines (a visible layer and two hidden layers, $h_1$ and $h_2$).

3.3. Deep Belief Network. The DBN is a common deep learning primitive. It is a combination of a number of Restricted Boltzmann Machines (RBMs) [15–17]. An RBM, which is a two-layer undirected graphical model, consists of the visible and hidden layers denoted by $v$ and $h$ (shown in Figure 2) [9, 15]. Each unit in a layer is connected with all units of the other layer by undirected edges. The units in the same layer are not connected with each other. Figure 3 shows an example of a DBN architecture with two RBMs [9]. The DBN is a stack of many RBMs. The values of all units are stochastic variables [16]. Generally, they obey a Bernoulli distribution or a Gaussian distribution. When the visible and hidden units are Gaussian and Bernoulli, respectively, we have the following conditional distributions:

$$P(v_i \mid h) = \mathcal{N}\left(b_i + \sum_{j=1}^{J} w_{ij} h_j,\ 1\right), \qquad P(h_j = 1 \mid v) = \mathrm{sigm}\left(a_j + \sum_{i=1}^{I} w_{ij} v_i\right), \tag{3}$$

where $\mathcal{N}(b_i + \sum_{j=1}^{J} w_{ij} h_j, 1)$ denotes the Gaussian distribution whose mean and variance are $b_i + \sum_{j=1}^{J} w_{ij} h_j$ and 1, and $\mathrm{sigm}(z) = \exp(z)/(1 + \exp(z))$ is the sigmoid function. $I$ and $J$ are the numbers of visible and hidden units, respectively [18]. $b_i$ and $a_j$ are the biases of the visible and hidden units. $w_{ij}$ expresses the symmetric interaction term between the visible unit $v_i$ and the hidden unit $h_j$. For an RBM, the joint probability distribution function over the visible and hidden units can be denoted by

$$P(v, h) = \frac{\exp(-E(v, h))}{\sum_{v, h} \exp(-E(v, h))}, \tag{4}$$

where $E(v, h)$ is termed the energy function, defined as

$$E(v, h) = \frac{1}{2} \sum_{i=1}^{I} (b_i - v_i)^2 - \sum_{j=1}^{J} a_j h_j - \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ij} v_i h_j. \tag{5}$$
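The conditional distributions in (3) translate into two short functions. The sketch below is only an illustration of those formulas for a Gaussian-Bernoulli RBM; the function names and the nested-list weight layout `W[i][j]` are assumptions, and no training step is shown.

```python
import math

def sigm(z):
    """Sigmoid, algebraically equal to exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_activation_probs(v, W, a):
    """P(h_j = 1 | v) for every hidden unit j.

    W[i][j] couples visible unit i and hidden unit j; a[j] is the hidden bias.
    """
    return [sigm(a[j] + sum(W[i][j] * v[i] for i in range(len(v))))
            for j in range(len(a))]

def visible_means(h, W, b):
    """Mean of the unit-variance Gaussian P(v_i | h) for every visible unit i."""
    return [b[i] + sum(W[i][j] * h[j] for j in range(len(h)))
            for i in range(len(b))]

# Toy parameters (hypothetical): I = 2 visible units, J = 2 hidden units.
W = [[0.0, 1.0],
     [0.0, -1.0]]
a = [0.0, 0.0]
b = [0.5, -0.5]
probs = hidden_activation_probs([2.0, 1.0], W, a)
means = visible_means([0.0, 1.0], W, b)
print(probs, means)
```

Sampling a hidden vector from `probs` and then a visible vector around `means` gives one step of the alternating (Gibbs) sampling that RBM training procedures typically build on.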

To train the DBN, the idea is to employ a layer-wise greedy strategy. Besides, the parameters are updated by maximizing the log probability $\log P(v)$.

3.4. Spatiotemporal Compressive Sensing and Sparsity Regularized Matrix Factorization. Compressive sensing is a sampling technique for signal processing developed in recent years, which makes good use of the structure or redundancy of real-world signals. It takes advantage of an adaptive sampling scheme to sense these structured signals. In detail, the structure of these signals means that each of them can be represented by a vector that has only a few nonzero elements (i.e., the vector is sparse). In the adaptive sampling scheme, a random matrix called the measurement matrix is used to implement compressing and coding concurrently. During the decoding phase, the compressive sensing reconstruction algorithm is an effective approach to the ill-posed inverse problem.

As a derivative of compressive sensing, the spatiotemporal compressive sensing technique is commonly used as an interpolation algorithm to recover the missing elements of a data set. The spatiotemporal feature means that the values of neighboring elements in the data set are similar. Based on this feature, the SRMF method is proposed in [19], where the missing elements can be recovered precisely even when the data loss probability is very high. Besides, the SRMF method is also an accurate tool for prediction. In the prediction process, the elements that need to be predicted are viewed as consecutive missing elements.

Figure 4: Decomposition of OD 63. (a) Low-pass component of OD 63 versus time slot. (b) High-pass component of OD 63 versus time slot.
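The idea of treating the values to be predicted as consecutive missing elements can be made concrete with an observation mask. The following sketch is only an illustration under assumed names (`append_missing_columns`, `horizon`); it prepares the input of an interpolation method and does not perform the interpolation itself.

```python
def append_missing_columns(D, horizon, fill=0.0):
    """Append `horizon` unknown columns to every row of D.

    Returns (values, observed), where observed[n][t] is 1 for measured entries
    and 0 for the appended entries that an interpolation method would fill in.
    The fill value is a placeholder and carries no information.
    """
    values = [list(row) + [fill] * horizon for row in D]
    observed = [[1] * len(row) + [0] * horizon for row in D]
    return values, observed

# Toy data (hypothetical): 2 flows, 3 measured time slots, 2 slots to predict.
D = [[0.5, -0.2, 0.1],
     [1.5, 0.3, -0.4]]
values, observed = append_missing_columns(D, horizon=2)
print(values)
print(observed)
```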

4. Our Methodology

4.1. Decomposition of Network Traffic. We assume that the known network traffic is denoted by $X$, each OD flow of which is denoted by a time series $x_{pq}(t)$, where $t = 1, 2, \ldots, T$. According to (2), it can be denoted by

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}^{pq}\, 2^{-L/2} \phi\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}^{pq}\, 2^{-l/2} \psi\left(\frac{t}{2^{l}} - n\right). \tag{6}$$

If we set the scale to be 1, then we have

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{1,n}^{pq}\, 2^{-1/2} \phi\left(\frac{t}{2} - n\right) + \sum_{n=-\infty}^{+\infty} d_{1,n}^{pq}\, 2^{-1/2} \psi\left(\frac{t}{2} - n\right). \tag{7}$$

The above equation divides the network traffic into two components. One is the low-pass approximation (given by the scaling coefficients), which exhibits the LRD of the network traffic $x_{pq}(t)$, and the other is the high-pass approximation (described by the discrete wavelet transform coefficients), which expresses the gusty and irregular fluctuation behaviors of the network traffic $x_{pq}(t)$. For a traffic matrix that describes the volume of traffic between all OD node pairs, its low-pass and high-pass approximation components can be denoted by two matrices, respectively.

Figures 4 and 5 give two examples of the traffic flow decomposition by the DWT. We select two OD flows from the real network traffic data set randomly and plot their low-pass and high-pass components, respectively. We see that the low-pass components are periodic, which means that they are much easier to predict than the high-pass components shown in Figures 4(b) and 5(b). Therefore, the two components are predicted independently in this paper.
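A scale-1 decomposition of one OD flow can be sketched as follows. The paper does not state which wavelet is used, so this example assumes the orthonormal Haar wavelet purely for illustration; the function names are hypothetical.

```python
import math

def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length sequence.

    Returns (c, d): scaling (low-pass) and wavelet (high-pass) coefficients,
    each of length len(x) // 2.
    """
    if len(x) % 2 != 0:
        raise ValueError("the series length must be even")
    s = math.sqrt(2.0)
    c = [(x[2 * k] + x[2 * k + 1]) / s for k in range(len(x) // 2)]
    d = [(x[2 * k] - x[2 * k + 1]) / s for k in range(len(x) // 2)]
    return c, d

def haar_idwt(c, d):
    """Inverse of haar_dwt: rebuild the series from both coefficient sets."""
    s = math.sqrt(2.0)
    x = []
    for ck, dk in zip(c, d):
        x.append((ck + dk) / s)
        x.append((ck - dk) / s)
    return x

# Toy OD flow (hypothetical values), T = 4 time slots.
flow = [2.0, 4.0, 6.0, 8.0]
c, d = haar_dwt(flow)
rebuilt = haar_idwt(c, d)
print(c, d, rebuilt)
```

Because the transform is invertible, predictors of the two coefficient sets can be recombined into a traffic predictor with `haar_idwt`, which mirrors the role of the inverse DWT in the method.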

4.2. Deep Architecture for Low-Pass Component Prediction. For an OD flow $x_{pq}(t)$, we assume that the length of this series is an even number. In this case, the number of scaling coefficients is $T/2$. The deep architecture for prediction is plotted in Figure 6. There are $M$ hidden layers in this architecture. Both the hidden and the input layers have $T/2$ units. At the top of the deep architecture, a single neuron defined as the logistic regression is employed for prediction. The logistic regression model is made up of a hidden layer with $T/2$ units and an output layer with one unit. The deep architecture is trained by the backpropagation algorithm in our method. The parameters of the deep architecture are determined layer by layer [18].
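The shape of this architecture (an input layer and $M$ hidden layers of equal width, topped by a single logistic output unit) can be sketched as a plain forward pass. The code below is an untrained illustration with random weights: the function names, the weight initialization, and the assumption that inputs are normalized to $[0, 1]$ are not taken from the paper, and neither layer-wise pretraining nor backpropagation is shown.

```python
import math
import random

def sigm(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Small random weights W[i][j] and zero biases (assumed initialization)."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return W, [0.0] * n_out

def build_architecture(num_units, num_hidden_layers, seed=0):
    """num_hidden_layers sigmoid layers of width num_units plus one output unit."""
    rng = random.Random(seed)
    hidden = [init_layer(num_units, num_units, rng)
              for _ in range(num_hidden_layers)]
    output = init_layer(num_units, 1, rng)
    return hidden, output

def forward(c, hidden, output):
    """Propagate normalized scaling coefficients c to one value in (0, 1)."""
    act = list(c)
    for W, bias in hidden:
        act = [sigm(bias[j] + sum(W[i][j] * act[i] for i in range(len(act))))
               for j in range(len(bias))]
    W, bias = output
    return sigm(bias[0] + sum(W[i][0] * act[i] for i in range(len(act))))

# Toy sizes (hypothetical): 4 units per layer and 3 hidden layers.
hidden, output = build_architecture(num_units=4, num_hidden_layers=3)
y = forward([0.1, 0.5, 0.3, 0.9], hidden, output)
print(y)
```

Since the output unit is a sigmoid, its value lies in $(0, 1)$ and would have to be mapped back to the original coefficient range after prediction.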

In our method, we first collect $K$ training samples, denoted by $(x^{1}_{pq}(t), \ldots, x^{K}_{pq}(t))$. Then we have the scaling coefficient training set $(c^{1}_{pq}, \ldots, c^{K}_{pq})$. Meanwhile, the corresponding predictors are $(\hat{c}^{1}_{pq}, \ldots, \hat{c}^{K}_{pq})$. By training the proposed deep architecture using the scaling coefficient training set, we can obtain a relationship between the input and output scaling coefficients. Furthermore, we use the scaling coefficients of $x_{pq}(t)$ as an input, and then a predictor of the scaling coefficient is obtained.

Figure 5: Decomposition of OD 105. (a) Low-pass component of OD 105 versus time slot. (b) High-pass component of OD 105 versus time slot.

Figure 6: Deep architecture for prediction (input $c$, DBN hidden layers $h_1$ to $h_M$, and a logistic regression output).

4.3. Spatiotemporal Compressive Sensing for High-Pass Component Prediction. For the discrete wavelet transform coefficients, we use SRMF, which is a matrix-oriented interpolation algorithm. Therefore, different from the prediction of the scaling coefficients, where each OD flow is predicted independently, we predict the discrete wavelet transform coefficients of all OD flows at the same time. According to (7), the discrete wavelet transform coefficients of all OD flows constitute a matrix, denoted by $D$ in this paper. The matrix $D$ consists of two portions. One is from the training data obtained from measured network traffic data, and the other needs to be predicted. We denote the final prediction result by $\hat{D}$; it can then be predicted by the following regularized optimization model:

$$\left\|\hat{D} - D\right\|_F^2 + \lambda\left(\|L\|_F^2 + \|R\|_F^2\right) + \left\|(LR^T)H\right\|_F^2, \tag{8}$$

where the notations $\|\cdot\|_F$ and $(\cdot)^T$ denote the Frobenius norm and transposition, respectively. The matrix $H$, the so-called temporal constraint matrix (shown in (9)), describes the temporal neighbors:

$$H = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}. \tag{9}$$

The matrices $L$ and $R$ are from the singular value decomposition of the matrix $D$, that is,

$$D = U\Sigma V^T. \tag{10}$$

$U$ and $V$ are two unitary matrices, and $\Sigma$ is a diagonal matrix whose diagonal elements are the singular values of $D$. The matrices $L$ and $R$ are equal to $U\Sigma^{1/2}$ and $V\Sigma^{1/2}$, respectively.
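To make the roles of the terms in (8) and of the matrix $H$ in (9) concrete, the sketch below evaluates the objective for given factors. It is an illustration only: it assumes $\hat{D} = LR^T$, treats every entry of $D$ as observed, uses hypothetical function names, and performs no minimization.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_sq(A):
    """Squared Frobenius norm."""
    return sum(v * v for row in A for v in row)

def temporal_constraint_matrix(T):
    """T x T matrix with 1 on the diagonal and -1 on the superdiagonal, as in (9)."""
    H = [[0.0] * T for _ in range(T)]
    for i in range(T):
        H[i][i] = 1.0
        if i + 1 < T:
            H[i][i + 1] = -1.0
    return H

def srmf_objective(D, L, R, lam):
    """Value of (8) with D_hat = L R^T (assumption of this sketch)."""
    D_hat = matmul(L, transpose(R))
    H = temporal_constraint_matrix(len(D_hat[0]))
    fit = frobenius_sq([[dh - d for dh, d in zip(row_hat, row)]
                        for row_hat, row in zip(D_hat, D)])
    reg = lam * (frobenius_sq(L) + frobenius_sq(R))
    smooth = frobenius_sq(matmul(D_hat, H))
    return fit + reg + smooth

# Toy data (hypothetical): 2 flows, 3 time slots, rank-1 factors.
D = [[1.0, 1.0, 1.0],
     [2.0, 2.0, 2.0]]
L = [[1.0], [2.0]]
R = [[1.0], [1.0], [1.0]]
value = srmf_objective(D, L, R, lam=0.5)
print(value)
```

Multiplying by $H$ on the right takes differences between temporally adjacent columns, so the last term penalizes predictors that change abruptly from one time slot to the next.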

Finally, according to the predictors of the scaling and discrete wavelet transform coefficients, we predict the network traffic by the inverse discrete wavelet transform. Algorithm 1 gives the details of our method.

5 Simulation Results and Analysis

This section will verify the performance of our predictionmethod In our simulations a real network traffic data setwith 2016 time slots is sampled on a time scale of 5minutesFor exact prediction the first 2000 time slots are used asthe prior information to train the deep architecture and the

Wireless Communications and Mobile Computing 7

(1) Input training set (1199091119901119902(119905) 1199092119901119902(119905) 119909119870119901119902(119905)) known network traffic119883(2) Output network traffic predictor119883(119879 + 1)(3) for 119894 = 1 119870 do(4) 119888119894119901119902 larr DWT(119909119894119901119902(119905))(5) end for(6) Training deep architecture using (1198881119901119902 1198882119901119902 119888119870119901119902) as a training set(7) [119862119863] larr DWT (119883)(8) 119862 larr DBN (119862)(9) 119863 larr SRMF (119863)(10)119883(119879 + 1) larr IDWT (119862 D)(11) return 119883(119879 + 1)

Algorithm 1 DBNSTCS algorithm

Predictors of DBNSTCS

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 7 Real traffic data versus their predictors via DBNSTCS

matrix 119863 We will compare our method with three state-of-the-art methods in network traffic prediction field that isthe principal component analysis (PCA) method [20] theTomogravity method [21] and the SRMF method [19] Theproposed method is implemented by MATLAB in a singlemachine with Core i5 central processing unit 4 GBmemoryand 1664 MB graphics processing unit (GPU) memoryMeanwhile we set119872 = 8 and 119879 = 800

We first plot the real network traffic versus their pre-dictors from four methods respectively Figure 7 displaysthe prediction results of our method The 119909-axis and 119910-axis denote predictors and real network traffic respectivelyFrom Figure 7 we see that our method has low predictionbiases for small network traffic flows By contrast ourmethodshows positive predictions for large network trafficThe sameconclusion can be obtained from Figure 8 For large networktraffic it also has positive predictions For small networktraffic PCA has much larger prediction bias Tomogravityhas consistently positive predictions for large network trafficshown by Figure 9 For small network traffic Tomogravityshows a desired prediction error Besides for large network

Predictors of PCA

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 8 Real traffic data versus their predictors via PCA

traffic SRMF in Figure 10 has positive or negative predictionsmore or less

We now use the spatial and temporal relative errors as metrics to compare the four methods. The spatial and temporal relative errors are defined as

$$\mathrm{SRE}(n) = \frac{\sqrt{\sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t)\right)^2}}{\sqrt{\sum_{t=1}^{T} x_n^2(t)}}, \qquad \mathrm{TRE}(t) = \frac{\sqrt{\sum_{n=1}^{N^2} \left(\hat{x}_n(t) - x_n(t)\right)^2}}{\sqrt{\sum_{n=1}^{N^2} x_n^2(t)}}, \tag{11}$$

where $x_n(t)$ and $\hat{x}_n(t)$ are the $n$th end-to-end network traffic flow and its predictor. As mentioned above, $p, q \in \{1, 2, \ldots, N\}$; thus the number of OD flows is $N^2$. Figure 11(a) exhibits the spatial relative errors (SREs) of the four methods. The $x$-axis gives the identities of the end-to-end network traffic flows, which are sorted in descending order with respect to their means. Meanwhile, the $y$-axis is the SRE. From this simulation, we find that our method has a consistently low SRE compared with the other three methods. Figure 11(b) shows the temporal relative errors (TREs) of the four methods. It shows that the TREs of PCA are much higher than those of the other methods, and that the TRE of DBNSTCS is the lowest among the four methods. The cumulative distributions of SRE and TRE are shown in Figure 12, which displays the prediction error more directly. Besides, we find that DBNSTCS achieves a much more prominent improvement in TRE than in SRE. That is because our method predicts the low-pass component of each traffic flow independently. In addition, in contrast to the high-pass component, this component has a significant effect on prediction. Hence, the weak improvement in SRE is caused by predicting the high-pass component using the Spatiotemporal Compressive Sensing method.

Figure 9: Real traffic data versus their predictors via Tomogravity (x-axis: predictors of Tomogravity, ×10^7; y-axis: real network traffic, ×10^7).

Figure 10: Real traffic data versus their predictors via SRMF (x-axis: predictors of SRMF, ×10^7; y-axis: real network traffic, ×10^7).
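The two error metrics in (11) can be computed as in the following sketch. It assumes that the real traffic and the predictors are stored as nested lists indexed as `[flow][time slot]`; the function names and the toy data are illustrative only.

```python
import math

def sre(real, pred):
    """Spatial relative error of each flow (one value per row)."""
    out = []
    for x, x_hat in zip(real, pred):
        num = math.sqrt(sum((b - a) ** 2 for a, b in zip(x, x_hat)))
        den = math.sqrt(sum(a * a for a in x))  # zero for an all-zero flow
        out.append(num / den)
    return out

def tre(real, pred):
    """Temporal relative error of each time slot (one value per column)."""
    flows = range(len(real))
    out = []
    for t in range(len(real[0])):
        num = math.sqrt(sum((pred[n][t] - real[n][t]) ** 2 for n in flows))
        den = math.sqrt(sum(real[n][t] ** 2 for n in flows))
        out.append(num / den)
    return out

real = [[3.0, 4.0], [6.0, 8.0]]
pred = [[3.0, 4.0], [6.0, 3.0]]  # only the last entry is mispredicted
print(sre(real, pred))  # [0.0, 0.5]
print(tre(real, pred))  # first slot has zero error, second is about 0.559
```

SRE aggregates the error of one flow over all time slots, whereas TRE aggregates the error of one time slot over all flows, so the same misprediction affects one entry of each vector.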

A small error or bias alone does not make a prediction method usable. Even with a low error, a method fails to provide precise predictors when it has high variance. Hence, the standard deviation is included in our simulation as a metric for variance, which is defined as

$$\mathrm{SD}(n) = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t) - b(n)\right)^2}, \tag{12}$$

where $b(n) = (1/T)\sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t)\right)$ is the bias. In (12), $T$ is the length of the predicted traffic data set. Figure 13 shows the bias versus the standard deviation of the four methods. We find that the four methods perform very differently with respect to variance. Tomogravity has the largest variance among the four methods, and the PCA method also exhibits relatively high variance. In contrast, the DBNSTCS and SRMF methods show relatively low variance.
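The bias and the standard deviation in (12) can be sketched as follows for a single flow. The sign convention (predictor minus real value) is an assumption of this sketch, and the toy series are illustrative only.

```python
import math

def bias(real, pred):
    """Mean prediction error b(n) of one flow."""
    return sum(p - r for r, p in zip(real, pred)) / len(real)

def std_dev(real, pred):
    """Sample standard deviation of the prediction error around its bias."""
    b = bias(real, pred)
    total = sum((p - r - b) ** 2 for r, p in zip(real, pred))
    return math.sqrt(total / (len(real) - 1))

real = [10.0, 12.0, 11.0, 13.0]
shifted = [11.0, 13.0, 12.0, 14.0]  # always 1 too high
noisy = [12.0, 10.0, 13.0, 11.0]    # errors alternate between +2 and -2

print(bias(real, shifted), std_dev(real, shifted))  # 1.0 0.0
print(bias(real, noisy), std_dev(real, noisy))      # zero bias, deviation about 2.31
```

The second predictor has zero bias but a large standard deviation, which is the case the paragraph above warns about: a low average error does not by itself indicate precise predictors.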

Finally, the performance improvement ratio is shown in Figure 14 as an overall evaluation. The performance improvement ratio is defined as

$$\mathrm{PIR} = \frac{\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| x_{n,a}(t) \right| - \sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| x_{n,b}(t) \right|}{\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| x_{n,a}(t) \right|}, \tag{13}$$

where $x_{n,a}(t)$ and $x_{n,b}(t)$ denote the predictors via the algorithms $a$ and $b$, respectively. The performance improvement ratios of DBNSTCS are 68.74%, 5.24%, and 14.70% relative to PCA, Tomogravity, and SRMF, respectively.
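The ratio in (13) can be evaluated with the short sketch below. The function is agnostic to what the two inputs contain and simply sums absolute values over all flows and time slots; feeding it per-entry error values, as in the toy example, is an assumption of this sketch rather than something stated in (13).

```python
def pir(a, b):
    """Performance improvement ratio of (13) for two [flow][time slot] tables."""
    total_a = sum(abs(v) for row in a for v in row)
    total_b = sum(abs(v) for row in b for v in row)
    return (total_a - total_b) / total_a

values_a = [[4.0, -2.0], [1.0, 3.0]]  # absolute values sum to 10
values_b = [[2.0, -1.0], [0.5, 1.5]]  # absolute values sum to 5
print(pir(values_a, values_b))  # 0.5
```

A value of 0.5 means that the total for algorithm $b$ is half of the total for algorithm $a$, that is, a 50% improvement of $b$ over $a$.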

6. Conclusions and Future Work

This paper focuses on the problem of network traffic prediction in wireless mesh backbone networks and proposes a hierarchical prediction method. The proposed method divides the network traffic into two components and then predicts each component with a different model. In detail, it takes advantage of the DBN and Spatiotemporal Compressive Sensing for network traffic prediction. In our method, the DWT is applied to divide the network traffic into two components, that is, the long-range dependence component and the fluctuation component, represented by the low-pass and high-pass components, respectively. A deep architecture consisting of a DBN layer and a logistic regression layer is proposed to predict the low-pass component. Meanwhile, the high-pass component is predicted by the SRMF method, which can capture its spatiotemporal characteristics. We assess the performance of the proposed prediction method and compare it with three methods that are widely used for network traffic prediction. According to the simulation, our method performs well in terms of prediction error, especially in TRE. The main bottleneck of wireless mesh backbone network traffic prediction is the prediction accuracy for the irregular fluctuations of network traffic. Therefore, a prediction algorithm aimed at the high-pass components is necessary in the future.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Figure 11: SRE and TRE of DBNSTCS, PCA, Tomogravity, and SRMF. (a) Spatial relative error versus flow ID (from smallest to largest in mean). (b) Temporal relative error versus time slot order.

Figure 12: Cumulative distributions of SRE and TRE for DBNSTCS, PCA, Tomogravity, and SRMF. (a) Spatial relative error. (b) Temporal relative error.

Figure 13: Bias versus standard deviation for DBNSTCS, Tomogravity, PCA, and SRMF (x-axis: standard deviation in error; y-axis: bias).

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61701406, 61701413) and the Fundamental Research Funds for the Central Universities (3102017OQD022).

Figure 14: Performance improvement ratio (%) of DBNSTCS relative to PCA, Tomogravity, and SRMF.

References

[1] T. Qiu, R. Qiao, and D. Wu, "EABS: an event-aware backpressure scheduling scheme for emergency internet of things," IEEE Transactions on Mobile Computing, 2017.

[2] T. B. Reddy, B. S. Manoj, and R. Rao, "On the accuracy of sampling schemes for wireless network characterization," in Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC 2008, pp. 3314–3319, USA, April 2008.

[3] T. Qiu, A. Zhao, F. Xia, W. Si, and D. O. Wu, "ROSE: Robustness Strategy for Scale-Free Wireless Sensor Networks," IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2944–2959, 2017.

[4] B. Zhou, D. He, and Z. Sun, "Traffic predictability based on ARIMA/GARCH model," in Proceedings of the 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, pp. 200–207, Spain, April 2006.

[5] P. Kuanhoong, I. Tan, and C. YikKeong, "BitTorrent network traffic forecasting with ARIMA," vol. 4, pp. 143–156, 2012.

[6] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M. L. Wymore, and A. K. Sangaiah, "A Robust Time Synchronization Scheme for Industrial Internet of Things," IEEE Transactions on Industrial Informatics, pp. 1-1.

[7] P. Calyam and A. Devulapalli, "Modeling of multi-resolution active network measurement time-series," in Proceedings of the 33rd IEEE Conference on Local Computer Networks, LCN 2008, pp. 900–907, Canada, October 2008.

[8] J. Huang, C. Xing, and C. Wang, "Simultaneous wireless information and power transfer: technologies, applications, and research challenges," IEEE Communications Magazine, vol. 55, no. 11, pp. 26–32, 2017.

[9] L. Nie, D. Jiang, S. Yu, and H. Song, "Network traffic prediction based on deep belief network in wireless mesh backbone networks," in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–5, San Francisco, CA, USA, March 2017.

[10] N. C. Anand, C. Scoglio, and B. Natarajan, "GARCH - Non-linear time series model for traffic modeling and prediction," in Proceedings of the NOMS 2008 - IEEE/IFIP Network Operations and Management Symposium: Pervasive Management for Ubiquitous Networks and Services, pp. 694–697, Brazil, April 2008.

[11] J. Huang, Q. Duan, C.-C. Xing, and H. Wang, "Topology control for building a large-scale and energy-efficient internet of things," IEEE Wireless Communications Magazine, vol. 24, no. 1, pp. 67–73, 2017.

[12] M. Joshi and T. Hadi, A review of network traffic analysis and prediction techniques, Computer Science, 2015.

[13] Y. Liu, B. Li, X. Sun, and Z. Zhou, "A fusion model of SWT, QGA and BP neural network for wireless network traffic prediction," in Proceedings of the 15th IEEE International Conference on Communication Technology (ICCT '13), pp. 769–774, IEEE, Guilin, China, November 2013.

[14] F. Xu, Y. Lin, J. Huang et al., "Big data driven mobile traffic understanding and forecasting: a time series approach," IEEE Transactions on Services Computing, vol. 9, no. 5, pp. 796–805, 2016.

[15] W. Huang, G. Song, H. Hong, and K. Xie, "Deep architecture for traffic flow prediction: deep belief networks with multitask learning," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2191–2201, 2014.

[16] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[17] J. Huang, Q. Duan, Y. Zhao, Z. Zheng, and W. Wang, "Multicast Routing for Multimedia Communications in the Internet of Things," IEEE Internet of Things Journal, vol. 4, no. 1, pp. 215–224, 2017.

[18] R. B. Palm, "Prediction as a candidate for learning deep hierarchical models of data," Tech. Rep., Technical University of Denmark, 2012.

[19] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu, "Spatio-temporal compressive sensing and internet traffic matrices (Extended Version)," IEEE/ACM Transactions on Networking, vol. 20, no. 3, pp. 662–676, 2012.

[20] A. Soule, A. Lakhina, N. Taft et al., "Traffic matrices: Balancing measurements, inference and modeling," in Proceedings of the SIGMETRICS 2005: International Conference on Measurement and Modeling of Computer Systems, pp. 362–373, Canada, June 2005.

[21] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, "Computation of IP traffic from link," in Proceedings of the ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems, pp. 206–217, USA, June 2003.


Figure 2: Restricted Boltzmann Machine (a visible layer of visible units and a hidden layer of hidden units).

Figure 3: Deep Belief Network with two Restricted Boltzmann Machines (a visible layer and two hidden layers, h1 and h2).

discrete wavelet transform coefficients represent the details of $f(t)$. Hence, (2) can be viewed as passing the time series through a filter and then obtaining a representation of $f(t)$ as the combination of low-pass and high-pass approximations.

3.3. Deep Belief Network. The DBN is a common deep learning primitive. It is a combination of a number of Restricted Boltzmann Machines (RBMs) [15–17]. An RBM, which is a two-layer undirected graphical model, consists of a visible layer and a hidden layer, denoted by $v$ and $h$ (shown in Figure 2) [9, 15]. Each unit in a layer is connected with all units of the other layer by undirected edges. The units in the same layer are not connected with each other. Figure 3 shows an example of a DBN architecture with two RBMs [9]; in general, a DBN is a stack of many RBMs. The values of all units are stochastic variables [16]. Generally, they obey a Bernoulli distribution or a Gaussian distribution. When the visible and hidden units are Gaussian and Bernoulli, respectively, we have the following conditional distributions:

$$P(v_i \mid h) = \mathcal{N}\left(b_i + \sum_{j=1}^{J} w_{ij} h_j,\ 1\right), \qquad P(h_j = 1 \mid v) = \mathrm{sigm}\left(a_j + \sum_{i=1}^{I} w_{ij} v_i\right), \tag{3}$$

where $\mathcal{N}(b_i + \sum_{j=1}^{J} w_{ij} h_j, 1)$ denotes the Gaussian distribution whose mean and variance are $b_i + \sum_{j=1}^{J} w_{ij} h_j$ and 1, respectively; $\mathrm{sigm}(z) = \exp(z)/(1 + \exp(z))$ is the sigmoid function; $I$ and $J$ are the numbers of visible and hidden units, respectively [18]; $b_i$ and $a_j$ are the biases of the visible and hidden units; and $w_{ij}$ expresses the symmetric interaction term between the visible unit $v_i$ and the hidden unit $h_j$. For an RBM, the joint probability distribution function over visible and hidden units can be denoted by

$$P(v, h) = \frac{\exp(-E(v, h))}{\sum_{v, h} \exp(-E(v, h))}, \tag{4}$$

where $E(v, h)$ is termed the energy function, defined as

$$E(v, h) = \frac{1}{2} \sum_{i=1}^{I} (v_i - b_i)^2 - \sum_{j=1}^{J} a_j h_j - \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ij} v_i h_j. \tag{5}$$
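The relation between the energy function and the conditional distributions in (3) can be checked numerically with the following sketch. It is a minimal illustration with hand-picked parameters, not a trained model. The quadratic term of the energy is taken with a positive sign, which is what the unit-variance Gaussian conditional in (3) requires; all function names are illustrative.

```python
import math

def sigm(z):
    """Sigmoid function, equal to exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def energy(v, h, a, b, w):
    """Energy of a Gaussian-Bernoulli RBM with unit variance.

    v: visible values, h: hidden 0/1 states, b: visible biases,
    a: hidden biases, w[i][j]: weight between v_i and h_j.
    """
    quad = 0.5 * sum((vi - bi) ** 2 for vi, bi in zip(v, b))
    hid = sum(aj * hj for aj, hj in zip(a, h))
    inter = sum(w[i][j] * v[i] * h[j]
                for i in range(len(v)) for j in range(len(h)))
    return quad - hid - inter

def p_hidden_on(j, v, a, w):
    """P(h_j = 1 | v) as given in (3)."""
    return sigm(a[j] + sum(w[i][j] * v[i] for i in range(len(v))))

def visible_mean(i, h, b, w):
    """Mean of the Gaussian P(v_i | h) in (3); its variance is 1."""
    return b[i] + sum(w[i][j] * h[j] for j in range(len(h)))

# Toy RBM with two visible units and one hidden unit.
v, b = [0.5, -1.0], [0.1, 0.2]
a, w = [0.3], [[0.4], [-0.6]]

# With a single hidden unit, normalizing exp(-E) over h in {0, 1}
# must reproduce the sigmoid expression.
e_on = math.exp(-energy(v, [1], a, b, w))
e_off = math.exp(-energy(v, [0], a, b, w))
print(e_on / (e_on + e_off), p_hidden_on(0, v, a, w))
```

The two printed values agree up to rounding, which confirms that the hidden-unit conditional in (3) follows from the Boltzmann form in (4).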

To train the DBN, the idea is to employ a layer-wise greedy strategy. Besides, the parameters are updated by maximizing the log probability $\log P(v)$.

3.4. Spatiotemporal Compressive Sensing and Sparsity Regularized Matrix Factorization. Compressive sensing is a recent sampling technique for signal processing that makes good use of the structure or redundancy of real-world signals. It takes advantage of an adaptive sampling scheme to sense these structured signals. In detail, the structure of such a signal means that it can be denoted by a vector that has only a few nonzero elements (i.e., the vector is sparse). In the adaptive sampling scheme, a random matrix called the measurement matrix is used to implement compressing and coding concurrently. During the decoding phase, the compressive sensing reconstruction algorithm is an effective approach to the ill-posed inverse problem.

Figure 4: Decomposition of OD 63. (a) Low-pass component of OD 63 versus time slot (×10^7). (b) High-pass component of OD 63 versus time slot (×10^7).

As a derivative of compressive sensing, the spatiotemporal compressive sensing technique is commonly used as an interpolation algorithm to recover the missing elements of a data set. The spatiotemporal feature means that the values of neighboring elements in the data set are approximately similar. Based on this feature, the SRMF method is proposed in [19], where the missing elements can be recovered precisely even when the data loss probability is very high. Besides, the SRMF method is also an accurate tool for prediction: in the prediction process, the elements that need to be predicted are viewed as consecutive missing elements.

4. Our Methodology

4.1. Decomposition of Network Traffic. We assume that the known network traffic is denoted by $X$, each OD flow of which is denoted by a time series $x_{pq}(t)$, where $t = 1, 2, \ldots, T$. According to (1), it can be denoted by

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c^{pq}_{L,n}\, 2^{-L/2}\, \phi\left(\frac{t}{2^L} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d^{pq}_{l,n}\, 2^{-l/2}\, \psi\left(\frac{t}{2^l} - n\right). \tag{6}$$

If we set the scale to be 1, then we have

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c^{pq}_{1,n}\, 2^{-1/2}\, \phi\left(\frac{t}{2} - n\right) + \sum_{n=-\infty}^{+\infty} d^{pq}_{1,n}\, 2^{-1/2}\, \psi\left(\frac{t}{2} - n\right). \tag{7}$$

The above equation divides the network traffic into two components. One is the low-pass approximation (given by the scaling coefficients) that exhibits the long-range dependence (LRD) of the network traffic $x_{pq}(t)$, and the other is the high-pass approximation (described by the discrete wavelet transform coefficients) that expresses the gusty and irregular fluctuation behaviors of the network traffic $x_{pq}(t)$. For a traffic matrix that describes the volume of traffic between all OD node pairs, its low-pass and high-pass approximation components can accordingly be denoted by two matrices, respectively.

Figures 4 and 5 give two examples of the traffic flow decomposition by the DWT. We randomly select two OD flows from the real network traffic data set and plot their low-pass and high-pass components, respectively. We see that the low-pass components are periodic, which means that they are much easier to predict than the high-pass components shown in Figures 4(b) and 5(b). For this reason, the two components are predicted independently in this paper.
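The scale-1 decomposition in (7) can be illustrated in the time domain with the sketch below. It assumes the Haar wavelet, for which $\phi$ is constant over each pair of samples and $\psi$ takes the values $+1$ and $-1$ on the two samples of a pair; the wavelet actually used is not specified in the text, and the function name is illustrative.

```python
import math

def haar_components(x):
    """Split x into its scale-1 low-pass and high-pass components (Haar)."""
    if len(x) % 2 != 0:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    low, high = [], []
    for i in range(0, len(x), 2):
        c = (x[i] + x[i + 1]) / s  # scaling coefficient c_{1,n}
        d = (x[i] - x[i + 1]) / s  # wavelet coefficient d_{1,n}
        low += [c / s, c / s]      # c * 2^(-1/2) * phi(t/2 - n)
        high += [d / s, -d / s]    # d * 2^(-1/2) * psi(t/2 - n)
    return low, high

x = [4.0, 2.0, 7.0, 9.0]
low, high = haar_components(x)
print(low)   # pairwise means: approximately [3, 3, 8, 8]
print(high)  # deviations from the means: approximately [1, -1, -1, 1]
print([a + b for a, b in zip(low, high)])  # recovers x up to rounding
```

The low-pass component carries the slowly varying level of the flow and the high-pass component carries the sample-to-sample fluctuation, and their sum reproduces the original series.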

4.2. Deep Architecture for Low-Pass Component Prediction. For an OD flow $x_{pq}(t)$, we assume that the length of the series is an even number. In this case, the number of scaling coefficients is $T/2$. The deep architecture for prediction is plotted in Figure 6. There are $M$ hidden layers in this architecture. Both the hidden and the input layers have $T/2$ units. At the top of the deep architecture, a single neuron defined as the logistic regression is employed for prediction. The logistic regression model is made up of a hidden layer with $T/2$ units and an output layer with one unit. The deep architecture is trained by the backpropagation algorithm in our method. The parameters of the deep architecture are determined layer by layer [18].
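The layer shapes described above can be sketched as a plain forward pass. This is only an illustration of the dimensions, under several assumptions: the weights are random and untrained (no RBM pretraining or backpropagation is performed), every unit uses a sigmoid activation, and the input coefficients are assumed to be normalized to $[0, 1]$ so that the single sigmoid output can represent a normalized coefficient. The helper names are hypothetical.

```python
import math
import random

def sigm(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(n_in, n_out, rng):
    """Fully connected layer with small random weights and zero biases."""
    return {
        "w": [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }

def forward(layer, x):
    """Sigmoid activations of one layer for the input vector x."""
    return [sigm(b + sum(wi * xi for wi, xi in zip(row, x)))
            for row, b in zip(layer["w"], layer["b"])]

def build(T, M, seed=0):
    """M hidden layers of T/2 units each, topped by a single output unit."""
    rng = random.Random(seed)
    half = T // 2
    hidden = [make_layer(half, half, rng) for _ in range(M)]
    output = make_layer(half, 1, rng)
    return hidden, output

def predict(net, coeffs):
    """Propagate T/2 normalized scaling coefficients to one output value."""
    hidden, output = net
    x = coeffs
    for layer in hidden:
        x = forward(layer, x)
    return forward(output, x)[0]

net = build(T=16, M=8)  # small T keeps the sketch fast
print(predict(net, [i / 8.0 for i in range(8)]))  # a value in (0, 1)
```

In the method itself, the weights would be obtained by training on the scaling coefficient training set rather than drawn at random.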

In our method, we first collect $K$ training samples, denoted by $(x^1_{pq}(t), \ldots, x^K_{pq}(t))$. Then, we have the scaling coefficient training set $(c^1_{pq}, \ldots, c^K_{pq})$. Meanwhile, the corresponding predictors are $(\hat{c}^1_{pq}, \ldots, \hat{c}^K_{pq})$. By training the proposed deep architecture using the scaling coefficient training set, we obtain a relationship between input and output scaling coefficients. Furthermore, we use the scaling coefficients of $x_{pq}(t)$ as an input, and then a predictor of the scaling coefficient is obtained.

Figure 5: Decomposition of OD 105. (a) Low-pass component of OD 105 versus time slot (×10^8). (b) High-pass component of OD 105 versus time slot (×10^7).

Figure 6: Deep architecture for prediction (input $c$, DBN hidden layers h1 to hM, and a logistic regression layer on top).

4.3. Spatiotemporal Compressive Sensing for High-Pass Component Prediction. For the discrete wavelet transform coefficients, we use SRMF, which is a matrix-oriented interpolation algorithm. Therefore, different from the prediction of the scaling coefficients, where each OD flow is predicted independently, we predict the discrete wavelet transform coefficients of all OD flows at the same time. According to (7), the discrete wavelet transform coefficients of all OD flows constitute a matrix, denoted by $D$ in this paper. The matrix $D$ consists of two portions: one is the training data obtained from measured network traffic data, and the other needs to be predicted. We denote the final prediction result by $\hat{D}$; it can then be obtained from the following regularized optimization model:

$$\left\| D - \hat{D} \right\|_F^2 + \lambda \left( \| L \|_F^2 + \| R \|_F^2 \right) + \left\| (L R^T) H \right\|_F^2, \tag{8}$$

where the notations $\| \cdot \|_F$ and $(\cdot)^T$ denote the Frobenius norm and transposition, respectively. The matrix $H$, the so-called temporal constraint matrix (shown in (9)), describes the temporal neighbors:

$$H = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}. \tag{9}$$

The matrices $L$ and $R$ are obtained from the singular value decomposition of the matrix $D$, that is,

$$D = U \Sigma V^T. \tag{10}$$

$U$ and $V$ are two unitary matrices, and $\Sigma$ is a diagonal matrix whose diagonal elements are the singular values of $D$. The matrices $L$ and $R$ are equal to $U \Sigma^{1/2}$ and $V \Sigma^{1/2}$, respectively.
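The three terms of the objective in (8) can be evaluated for given factors as in the sketch below. It only computes the objective value and does not perform the optimization. Taking the predictor to be the product $L R^T$ is an assumption of this sketch, the factors in the example are chosen by hand rather than obtained from an SVD, and the helper names are illustrative.

```python
def temporal_matrix(T):
    """T x T matrix of (9): 1 on the diagonal and -1 directly above it."""
    H = [[0.0] * T for _ in range(T)]
    for k in range(T):
        H[k][k] = 1.0
        if k + 1 < T:
            H[k][k + 1] = -1.0
    return H

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def frob2(A):
    """Squared Frobenius norm."""
    return sum(v * v for row in A for v in row)

def objective(D, L, R, lam):
    """Value of (8), with the predictor taken as L R^T (an assumption)."""
    D_hat = matmul(L, transpose(R))
    fit = frob2([[d - dh for d, dh in zip(r1, r2)]
                 for r1, r2 in zip(D, D_hat)])
    reg = lam * (frob2(L) + frob2(R))
    smooth = frob2(matmul(D_hat, temporal_matrix(len(D_hat[0]))))
    return fit + reg + smooth

# Rank-1 toy matrix: two flows that are constant over three time slots.
D = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
L = [[1.0], [2.0]]
R = [[1.0], [1.0], [1.0]]
print(objective(D, L, R, lam=0.1))  # approximately 5.8
```

Here the fit term is 0, the regularization term is $0.1 \times (5 + 3) = 0.8$, and the temporal term is 5. With $H$ as written in (9), right-multiplication replaces every column except the first by its difference from the preceding column, so a flow that is constant in time contributes only through its first entry.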

Finally, according to the predictors of the scaling and discrete wavelet transform coefficients, we predict the network traffic by the inverse discrete wavelet transform. Algorithm 1 presents the details of our method.

5. Simulation Results and Analysis

This section verifies the performance of our prediction method. In our simulations, a real network traffic data set with 2016 time slots, sampled on a time scale of 5 minutes, is used. For exact prediction, the first 2000 time slots are used as the prior information to train the deep architecture and the matrix $D$.

Wireless Communications and Mobile Computing 7

(1) Input training set (1199091119901119902(119905) 1199092119901119902(119905) 119909119870119901119902(119905)) known network traffic119883(2) Output network traffic predictor119883(119879 + 1)(3) for 119894 = 1 119870 do(4) 119888119894119901119902 larr DWT(119909119894119901119902(119905))(5) end for(6) Training deep architecture using (1198881119901119902 1198882119901119902 119888119870119901119902) as a training set(7) [119862119863] larr DWT (119883)(8) 119862 larr DBN (119862)(9) 119863 larr SRMF (119863)(10)119883(119879 + 1) larr IDWT (119862 D)(11) return 119883(119879 + 1)

Algorithm 1 DBNSTCS algorithm

Predictors of DBNSTCS

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 7 Real traffic data versus their predictors via DBNSTCS

matrix 119863 We will compare our method with three state-of-the-art methods in network traffic prediction field that isthe principal component analysis (PCA) method [20] theTomogravity method [21] and the SRMF method [19] Theproposed method is implemented by MATLAB in a singlemachine with Core i5 central processing unit 4 GBmemoryand 1664 MB graphics processing unit (GPU) memoryMeanwhile we set119872 = 8 and 119879 = 800

We first plot the real network traffic versus their pre-dictors from four methods respectively Figure 7 displaysthe prediction results of our method The 119909-axis and 119910-axis denote predictors and real network traffic respectivelyFrom Figure 7 we see that our method has low predictionbiases for small network traffic flows By contrast ourmethodshows positive predictions for large network trafficThe sameconclusion can be obtained from Figure 8 For large networktraffic it also has positive predictions For small networktraffic PCA has much larger prediction bias Tomogravityhas consistently positive predictions for large network trafficshown by Figure 9 For small network traffic Tomogravityshows a desired prediction error Besides for large network

Predictors of PCA

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 8 Real traffic data versus their predictors via PCA

traffic SRMF in Figure 10 has positive or negative predictionsmore or less

Now we refer to the spatial and temporal relative errors asa metric to compare four methods The spatial and temporalrelative errors are defined as

SRE (119899) = radicsum119879119905=1 (119909119899 (119905) minus 119909119899 (119905))2radicsum119879119905=1 1199091198992 (119905)

TRE (119905) = radicsum1198732

119899=1 (119909119899 (119905) minus 119909119899 (119905))2radicsum1198732

119899=1 1199091198992 (119905) (11)

where 119909119899(119905) and 119909119899(119905) are the 119899th end-to-end networktraffic flow and its predictor As mentioned above 119901 119902 isin1 2 119873 thus the number of OD flows is1198732 Figure 11(a)exhibits the spatial relative errors (SREs) of four methodsThe 119909-axis is the identities of end-to-end network trafficflows The end-to-end network traffic flows are sorted indescending order with respect to their means Meanwhilethe 119910-axis is the SRE From this simulation we find that

8 Wireless Communications and Mobile Computing

Predictors of Tomogravity

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 9 Real traffic data versus their predictors via Tomogravity

Predictors of SRMF

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 10 Real traffic data versus their predictors via SRMF

our method has a consistently low SRE comparing with theother three methods Figure 11(b) shows the TREs of fourmethods From Figure 11(b) it shows that the TREs of PCAare much higher than that of the other methods The TRE ofDBNSTCS is the lowest one in four methodsThe cumulativedistributions of SRE and TRE are shown by Figure 12 It canshow the prediction error more directly Besides we find thatDBNSTCS has much more prominent improvement in TREcomparing with SRE That is because our method predictsthe low-pass component of each traffic flow independentlyIn addition contrasting to the high-pass component thiscomponent has a significant effect on prediction Hencethe weak improvement of SRE is caused by predicting thehigh-pass component using the Spatiotemporal CompressiveSensing method

The diminutive error or bias of a prediction method doesnot mean it is available Though it has low error it fails to

provide precise predictors when it has high variance Hencethe standard deviation is involved in our simulation as ametric for variance which is defined as

SD (119899) = radic 1119879 minus 1119879sum119905=1

(119909119899 (119905) minus 119909119899 (119905) minus 119887 (119899))2 (12)

where 119887(119899) = (1119879)sum119879119905=1(119909119899(119905) minus 119909119899(119905)) In (12) 119879 is thelength of predicted traffic data set Figure 13 shows the biasversus standard deviation of four methods We find thatthe four methods perform very differently with respect tovariance Tomogravity has larger variance compared withthe other methods In contrast the PCA method exhibitsrelatively high variance The DBNSTCS and SRMF methodsshow relatively low variance

Finally the performance improvement ratio is shownin Figure 14 as an overall evaluation The performanceimprovement ratio is defined as

PIR = sum1198732

119899=1sum119879119905=1 1003816100381610038161003816119909119899119886 (119905)1003816100381610038161003816 minus sum1198732

119899=1sum119879119905=1 1003816100381610038161003816119909119899119887 (119905)1003816100381610038161003816sum1198732

119899=1sum119879119905=1 1003816100381610038161003816119909119899119886 (119905)1003816100381610038161003816 (13)

where 119909119899119886(119905) and 119909119899119887(119905) denote the predictors via the algo-rithms 119886 and 119887 respectively The performance improvementratios of DBNSTCS are 6874 524 and 1470 to PCATomogravity and SRMF

6 Conclusions and Future Work

This paper focuses on the problem of network traffic pre-diction in wireless mesh backbone networks and proposesa hierarchical prediction method The proposed hierarchicalprediction method divides the network traffic into twocomponents and then predicts each component by differentmodels In detail the proposed method takes advantage ofDBN and Spatiotemporal Compressive Sensing for networktraffic prediction In our method the DWT is applied todividing the network traffic into two components that isthe long-range dependence component and the fluctuationcomponent represented by the low-pass and high-pass com-ponents respectively A deep architecture consisting of aDBN layer and a logistic regression architecture is proposedto predict the low-pass component Meanwhile the other ispredicted by the SRMF method which can capture the spa-tiotemporal characteristic of the high-pass component Weassess the performance of the proposed prediction methodand compare it with three methods that are widely used fornetwork traffic prediction According to the simulation ourmethod is well in prediction error especially in TRE Themain bottleneck of wireless mesh backbone network trafficprediction is the predicted accuracy for the irregular fluctu-ations of network traffic Thereby the prediction algorithmaiming at low-pass components is necessary in the future

Conflicts of Interest

The authors declare that they have no conflicts of interest

Wireless Communications and Mobile Computing 9

0

50

100

150

200Sp

atia

l rel

ativ

e err

or

DBNSTCSPCATomogravity

SRMF

0 20 40 60 80 100 120 140

(a) Flow ID from smallest to largest in mean

Tem

pora

l rel

ativ

eer

ror

08

06

04

02

0

2 4 6 8 10 12 14 16

DBNSTCSPCATomogravity

SRMF

(b) Time slot order

Figure 11 SRE and TRE

Cum

ulat

ive

distr

ibut

ion

of S

RE

0 20 40 60 80 100 120 140 160 180

08

06

04

02

0

DBNSTCSPCATomogravity

SRMF

(a) Spatial relative error

Cum

ulat

ive

distr

ibut

ion

of T

RE0 02 04 06 08 1

1

05

0

DBNSTCSPCATomogravity

SRMF

(b) Temporal relative error

Figure 12 Cumulative distributions of SRE and TRE

Standard deviation in error

minus4

minus3

minus2

0

1

2

3

Bias

DBNSTCSTomogravityPCA

times107

times1013

0 2 4 6 8 10 12

minus1

SRMF

Figure 13 Bias versus standard deviation

Acknowledgments

This work was supported by the National Natural Sci-ence Foundation of China (61701406 61701413) and theFundamental Research Funds for the Central Universities(3102017OQD022)

0

10

20

30

40

50

60

70

80

DBNSTCS PCADBNSTCS TomogravityDBNSTCS SRMF

()

Figure 14 Performance improvement ratio

References

[1] T. Qiu, R. Qiao, and D. Wu, "EABS: an event-aware backpressure scheduling scheme for emergency internet of things," IEEE Transactions on Mobile Computing, 2017.

[2] T. B. Reddy, B. S. Manoj, and R. Rao, "On the accuracy of sampling schemes for wireless network characterization," in Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC 2008, pp. 3314–3319, USA, April 2008.

[3] T. Qiu, A. Zhao, F. Xia, W. Si, and D. O. Wu, "ROSE: Robustness Strategy for Scale-Free Wireless Sensor Networks," IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2944–2959, 2017.

[4] B. Zhou, D. He, and Z. Sun, "Traffic predictability based on ARIMA/GARCH model," in Proceedings of the 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, pp. 200–207, Spain, April 2006.

[5] P. Kuanhoong, I. Tan, and C. YikKeong, "BitTorrent network traffic forecasting with ARIMA," vol. 4, pp. 143–156, 2012.

[6] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M. L. Wymore, and A. K. Sangaiah, "A Robust Time Synchronization Scheme for Industrial Internet of Things," IEEE Transactions on Industrial Informatics, pp. 1-1.

[7] P. Calyam and A. Devulapalli, "Modeling of multi-resolution active network measurement time-series," in Proceedings of the 33rd IEEE Conference on Local Computer Networks, LCN 2008, pp. 900–907, Canada, October 2008.

[8] J. Huang, C. Xing, and C. Wang, "Simultaneous wireless information and power transfer: technologies, applications, and research challenges," IEEE Communications Magazine, vol. 55, no. 11, pp. 26–32, 2017.

[9] L. Nie, D. Jiang, S. Yu, and H. Song, "Network traffic prediction based on deep belief network in wireless mesh backbone networks," in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–5, San Francisco, CA, USA, March 2017.

[10] N. C. Anand, C. Scoglio, and B. Natarajan, "GARCH - Non-linear time series model for traffic modeling and prediction," in Proceedings of the NOMS 2008 - IEEE/IFIP Network Operations and Management Symposium: Pervasive Management for Ubiquitous Networks and Services, pp. 694–697, Brazil, April 2008.

[11] J. Huang, Q. Duan, C.-C. Xing, and H. Wang, "Topology control for building a large-scale and energy-efficient internet of things," IEEE Wireless Communications Magazine, vol. 24, no. 1, pp. 67–73, 2017.

[12] M. Joshi and T. Hadi, A review of network traffic analysis and prediction techniques, Computer Science, 2015.

[13] Y. Liu, B. Li, X. Sun, and Z. Zhou, "A fusion model of SWT, QGA and BP neural network for wireless network traffic prediction," in Proceedings of the 15th IEEE International Conference on Communication Technology (ICCT '13), pp. 769–774, IEEE, Guilin, China, November 2013.

[14] F. Xu, Y. Lin, J. Huang et al., "Big data driven mobile traffic understanding and forecasting: a time series approach," IEEE Transactions on Services Computing, vol. 9, no. 5, pp. 796–805, 2016.

[15] W. Huang, G. Song, H. Hong, and K. Xie, "Deep architecture for traffic flow prediction: deep belief networks with multitask learning," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2191–2201, 2014.

[16] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[17] J. Huang, Q. Duan, Y. Zhao, Z. Zheng, and W. Wang, "Multicast Routing for Multimedia Communications in the Internet of Things," IEEE Internet of Things Journal, vol. 4, no. 1, pp. 215–224, 2017.

[18] R. B. Palm, "Prediction as a candidate for learning deep hierarchical models of data," Tech. Rep., Technical University of Denmark, 2012.

[19] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu, "Spatio-temporal compressive sensing and internet traffic matrices (Extended Version)," IEEE/ACM Transactions on Networking, vol. 20, no. 3, pp. 662–676, 2012.

[20] A. Soule, A. Lakhina, N. Taft et al., "Traffic matrices: Balancing measurements, inference and modeling," in Proceedings of the SIGMETRICS 2005: International Conference on Measurement and Modeling of Computer Systems, pp. 362–373, Canada, June 2005.

[21] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, "Computation of IP traffic from link," in Proceedings of the ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems, pp. 206–217, USA, June 2003.



Figure 4: Decomposition of OD 63, plotted against time slot. (a) Low-pass component of OD 63. (b) High-pass component of OD 63.

matrix called the measurement matrix is used to implement compressing and coding concurrently. During the decoding phase, the compressive sensing reconstruction algorithm is an effective approach to the resulting ill-posed inverse problem.

As a derivative of compressive sensing, the spatiotemporal compressive sensing technique is commonly used as an interpolation algorithm to recover the missing elements of a data set. The spatiotemporal feature means that the values of neighboring elements in the data set are similar. Based on this feature, the SRMF method is proposed in [19], where the missing elements can be recovered precisely even when the data loss probability is very high. Besides, the SRMF method is also an accurate tool for prediction. In the prediction process, the elements that need to be predicted are viewed as consecutive missing elements.

4. Our Methodology

4.1. Decomposition of Network Traffic. We assume that the known network traffic is denoted by $X$, each OD flow of which is a time series $x_{pq}(t)$, where $t = 1, 2, \ldots, T$. According to (1), it can be written as

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{L,n}^{pq}\, 2^{-L/2}\, \phi\!\left(\frac{t}{2^{L}} - n\right) + \sum_{l=1}^{L} \sum_{n=-\infty}^{+\infty} d_{l,n}^{pq}\, 2^{-l/2}\, \psi\!\left(\frac{t}{2^{l}} - n\right). \tag{6}$$

If we set the scale to 1, then we have

$$x_{pq}(t) = \sum_{n=-\infty}^{+\infty} c_{1,n}^{pq}\, 2^{-1/2}\, \phi\!\left(\frac{t}{2} - n\right) + \sum_{n=-\infty}^{+\infty} d_{1,n}^{pq}\, 2^{-1/2}\, \psi\!\left(\frac{t}{2} - n\right). \tag{7}$$

The above equation divides the network traffic into two components. One is the low-pass approximation (given by the scaling coefficients), which exhibits the LRD of the network traffic $x_{pq}(t)$. The other is the high-pass component (described by the discrete wavelet transform coefficients), which expresses the gusty and irregular fluctuations of the network traffic $x_{pq}(t)$. For a traffic matrix that describes the volume of traffic between all OD node pairs, the low-pass and high-pass components can accordingly be denoted by two matrices.

Figures 4 and 5 give two examples of traffic flow decomposition by the DWT. We randomly select two OD flows from the real network traffic data set and plot their low-pass and high-pass components. The low-pass components are periodic, which makes them much easier to predict than the high-pass components shown in Figures 4(b) and 5(b). For this reason, the two components are predicted independently in this paper.
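As a minimal sketch of the scale-1 decomposition in (7), the following Python code assumes the Haar wavelet, since the text does not name the wavelet family. The helper names `haar_dwt`, `haar_idwt`, and `split_components` are hypothetical and chosen only for illustration.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT of an even-length series.

    Returns (c, d): scaling (low-pass) and wavelet (high-pass)
    coefficients, each of length T/2. The Haar wavelet is an assumption.
    """
    x = np.asarray(x, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("series length must be even")
    even, odd = x[0::2], x[1::2]
    c = (even + odd) / np.sqrt(2.0)
    d = (even - odd) / np.sqrt(2.0)
    return c, d

def haar_idwt(c, d):
    """Inverse of haar_dwt: rebuild the length-T series from (c, d)."""
    c = np.asarray(c, dtype=float)
    d = np.asarray(d, dtype=float)
    x = np.empty(2 * c.size)
    x[0::2] = (c + d) / np.sqrt(2.0)
    x[1::2] = (c - d) / np.sqrt(2.0)
    return x

def split_components(x):
    """Split a series into low-pass and high-pass parts that sum to x."""
    c, d = haar_dwt(x)
    low = haar_idwt(c, np.zeros_like(d))   # keep only scaling coefficients
    high = haar_idwt(np.zeros_like(c), d)  # keep only wavelet coefficients
    return low, high
```

For the toy series `[4, 6, 10, 12]`, the low-pass part is the pairwise mean `[5, 5, 11, 11]` and the high-pass part is the residual `[-1, 1, -1, 1]`; a smoother wavelet would give a different split.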

4.2. Deep Architecture for Low-Pass Component Prediction. For an OD flow $x_{pq}(t)$, we assume that the length of the series is an even number. In this case, the number of scaling coefficients is $T/2$. The deep architecture for prediction is shown in Figure 6. There are $M$ hidden layers in this architecture. Both the hidden and the input layers have $T/2$ units. At the top of the deep architecture, a single neuron defined as the logistic regression is employed for prediction. The logistic regression model is made up of a hidden layer with $T/2$ units and an output layer with one unit. The deep architecture is trained by the backpropagation algorithm in our method. The parameters of the deep architecture are determined layer by layer [18].

In our method, we first collect $K$ training sets, denoted by $(x_{pq}^{1}(t), \ldots, x_{pq}^{K}(t))$. Then we have the scaling coefficient training set $(c_{pq}^{1}, \ldots, c_{pq}^{K})$. Meanwhile, the corresponding predictors are $(\hat{c}_{pq}^{1}, \ldots, \hat{c}_{pq}^{K})$. By training the proposed deep architecture on the scaling coefficient training set, we obtain a relationship between input and output scaling coefficients. Furthermore, we use the scaling coefficients of $x_{pq}(t)$ as an input, and a predictor of the scaling coefficient is then obtained.

Figure 5: Decomposition of OD 105, plotted against time slot. (a) Low-pass component of OD 105. (b) High-pass component of OD 105.

Figure 6: Deep architecture for prediction (input $c$, DBN hidden layers $h_1$ to $h_M$, logistic regression output).

4.3. Spatiotemporal Compressive Sensing for High-Pass Component Prediction. SRMF is a matrix-oriented interpolation algorithm. Therefore, unlike the prediction of the scaling coefficients, where each OD flow is predicted independently, we predict the discrete wavelet transform coefficients of all OD flows at the same time. According to (7), the discrete wavelet transform coefficients of all OD flows constitute a matrix, denoted by $D$ in this paper. The matrix $D$ consists of two portions. One is the training data obtained from measured network traffic, and the other needs to be predicted. We denote the final prediction result by $\hat{D}$, which can be obtained from the following regularized optimization model:

$$\left\| D - \hat{D} \right\|_F^2 + \lambda \left( \left\| L \right\|_F^2 + \left\| R \right\|_F^2 \right) + \left\| \left( L R^T \right) H \right\|_F^2, \tag{8}$$

where $\|\cdot\|_F$ and $(\cdot)^T$ denote the Frobenius norm and transposition, respectively. The matrix $H$, called the temporal constraint matrix and shown in (9), describes the temporal neighbors:

$$H = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}. \tag{9}$$

The matrices $L$ and $R$ come from the singular value decomposition of the matrix $D$, that is,

$$D = U \Sigma V^T. \tag{10}$$

$U$ and $V$ are two unitary matrices, and $\Sigma$ is a diagonal matrix whose diagonal elements are the singular values of $D$. The matrices $L$ and $R$ are equal to $U \Sigma^{1/2}$ and $V \Sigma^{1/2}$, respectively.
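To make the pieces of (8) to (10) concrete, the sketch below builds $H$, derives $L$ and $R$ from an SVD, and evaluates the objective. It is an illustration under stated assumptions, not the full SRMF solver of [19]: it assumes $\hat{D} = L R^T$, treats $D$ as fully observed (no missing-entry mask, no alternating minimization), and introduces a hypothetical `rank` truncation parameter and hypothetical function names.

```python
import numpy as np

def temporal_constraint(n_cols):
    """Matrix H of (9): ones on the diagonal, -1 on the superdiagonal."""
    return np.eye(n_cols) - np.eye(n_cols, k=1)

def svd_factors(D, rank):
    """L = U Sigma^(1/2), R = V Sigma^(1/2), truncated to `rank` (assumed)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    root = np.sqrt(s[:rank])
    L = U[:, :rank] * root     # scales each column of U
    R = Vt[:rank, :].T * root  # scales each column of V
    return L, R

def srmf_objective(D, L, R, lam):
    """Value of objective (8), assuming the estimate is D_hat = L R^T."""
    D_hat = L @ R.T
    H = temporal_constraint(D.shape[1])
    fit = np.linalg.norm(D - D_hat, "fro") ** 2
    reg = lam * (np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(R, "fro") ** 2)
    smooth = np.linalg.norm(D_hat @ H, "fro") ** 2
    return fit + reg + smooth
```

With this $H$, column $j$ of $(L R^T) H$ is the difference between columns $j$ and $j-1$ of $L R^T$ (the first column is kept as is), so the last term of (8) penalizes abrupt changes between neighboring time slots.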

Finally, according to the predictors of the scaling and discrete wavelet transform coefficients, we predict the network traffic by the inverse discrete wavelet transform. Algorithm 1 gives the details of our method.

5. Simulation Results and Analysis

This section verifies the performance of our prediction method. In our simulations, a real network traffic data set with 2016 time slots is sampled on a time scale of 5 minutes. For exact prediction, the first 2000 time slots are used as the prior information to train the deep architecture and the matrix $D$. We compare our method with three state-of-the-art methods in the network traffic prediction field, that is, the principal component analysis (PCA) method [20], the Tomogravity method [21], and the SRMF method [19]. The proposed method is implemented in MATLAB on a single machine with a Core i5 central processing unit, 4 GB of memory, and 1664 MB of graphics processing unit (GPU) memory. Meanwhile, we set $M = 8$ and $T = 800$.

Algorithm 1: DBNSTCS algorithm.
Input: training set $(x_{pq}^{1}(t), x_{pq}^{2}(t), \ldots, x_{pq}^{K}(t))$; known network traffic $X$
Output: network traffic predictor $\hat{X}(T + 1)$
for $i = 1, \ldots, K$ do
    $c_{pq}^{i} \leftarrow \mathrm{DWT}(x_{pq}^{i}(t))$
end for
Train the deep architecture using $(c_{pq}^{1}, c_{pq}^{2}, \ldots, c_{pq}^{K})$ as a training set
$[C, D] \leftarrow \mathrm{DWT}(X)$
$\hat{C} \leftarrow \mathrm{DBN}(C)$
$\hat{D} \leftarrow \mathrm{SRMF}(D)$
$\hat{X}(T + 1) \leftarrow \mathrm{IDWT}(\hat{C}, \hat{D})$
return $\hat{X}(T + 1)$

Figure 7: Real traffic data versus their predictors via DBNSTCS ($x$-axis: predictors of DBNSTCS; $y$-axis: real network traffic).

We first plot the real network traffic versus its predictors for each of the four methods. Figure 7 displays the prediction results of our method. The $x$-axis and $y$-axis denote the predictors and the real network traffic, respectively. From Figure 7, we see that our method has low prediction bias for small network traffic flows. By contrast, it shows positive prediction bias for large network traffic. A similar conclusion can be drawn for PCA from Figure 8: for large network traffic, it also has positive prediction bias, while for small network traffic, PCA has a much larger prediction bias. Tomogravity has consistently positive prediction bias for large network traffic, as shown in Figure 9. For small network traffic, Tomogravity shows a desirable prediction error. Besides, for large network traffic, SRMF in Figure 10 shows both positive and negative prediction biases to some degree.

Figure 8: Real traffic data versus their predictors via PCA ($x$-axis: predictors of PCA; $y$-axis: real network traffic).

Next, we use the spatial and temporal relative errors as metrics to compare the four methods. They are defined as

$$\begin{aligned} \mathrm{SRE}(n) &= \frac{\sqrt{\sum_{t=1}^{T} \left( \hat{x}_n(t) - x_n(t) \right)^2}}{\sqrt{\sum_{t=1}^{T} x_n^2(t)}}, \\ \mathrm{TRE}(t) &= \frac{\sqrt{\sum_{n=1}^{N^2} \left( \hat{x}_n(t) - x_n(t) \right)^2}}{\sqrt{\sum_{n=1}^{N^2} x_n^2(t)}}, \end{aligned} \tag{11}$$

where $x_n(t)$ and $\hat{x}_n(t)$ are the $n$th end-to-end network traffic flow and its predictor. As mentioned above, $p, q \in \{1, 2, \ldots, N\}$; thus the number of OD flows is $N^2$. Figure 11(a) shows the spatial relative errors (SREs) of the four methods. The $x$-axis gives the identities of the end-to-end network traffic flows, which are sorted in ascending order of their means (smallest to largest), and the $y$-axis is the SRE. From this simulation, we find that our method has a consistently low SRE compared with the other three methods. Figure 11(b) shows the TREs of the four methods. The TREs of PCA are much higher than those of the other methods, and the TRE of DBNSTCS is the lowest of the four. The cumulative distributions of SRE and TRE are shown in Figure 12, which presents the prediction error more directly. Besides, we find that DBNSTCS achieves a much more prominent improvement in TRE than in SRE. That is because our method predicts the low-pass component of each traffic flow independently, and, in contrast to the high-pass component, this component has a significant effect on prediction. Hence, the weaker improvement in SRE is caused by predicting the high-pass component using the Spatiotemporal Compressive Sensing method.

Figure 9: Real traffic data versus their predictors via Tomogravity ($x$-axis: predictors of Tomogravity; $y$-axis: real network traffic).

Figure 10: Real traffic data versus their predictors via SRMF ($x$-axis: predictors of SRMF; $y$-axis: real network traffic).
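The relative errors in (11) can be computed directly from a matrix of true traffic and a matrix of predictions. The sketch below assumes both are arrays of shape (number of OD flows, number of time slots); the function names are hypothetical.

```python
import numpy as np

def spatial_relative_error(x_true, x_pred):
    """SRE(n) of (11): one value per OD flow (rows), summing over time."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    num = np.sqrt(np.sum((x_pred - x_true) ** 2, axis=1))
    den = np.sqrt(np.sum(x_true ** 2, axis=1))
    return num / den

def temporal_relative_error(x_true, x_pred):
    """TRE(t) of (11): one value per time slot (columns), summing over flows."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    num = np.sqrt(np.sum((x_pred - x_true) ** 2, axis=0))
    den = np.sqrt(np.sum(x_true ** 2, axis=0))
    return num / den
```

A flow or time slot whose true traffic is identically zero makes the denominator zero, so such rows or columns would need to be filtered out beforehand.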

A small error or bias does not by itself mean that a prediction method is usable. Even with low error, a method fails to provide precise predictors when it has high variance. Hence, the standard deviation is included in our simulation as a metric for variance. It is defined as

$$\mathrm{SD}(n) = \sqrt{\frac{1}{T - 1} \sum_{t=1}^{T} \left( \hat{x}_n(t) - x_n(t) - b(n) \right)^2}, \tag{12}$$

where $b(n) = (1/T) \sum_{t=1}^{T} (\hat{x}_n(t) - x_n(t))$. In (12), $T$ is the length of the predicted traffic data set. Figure 13 shows the bias versus the standard deviation of the four methods. The four methods perform very differently with respect to variance. Tomogravity has the largest variance, and PCA also exhibits relatively high variance, whereas the DBNSTCS and SRMF methods show relatively low variance.
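As a brief illustration of (12), the per-flow bias $b(n)$ and standard deviation can be obtained with NumPy as sketched below. The sketch assumes the error is taken as predictor minus true traffic and that arrays have shape (number of OD flows, number of time slots); `bias_and_sd` is a hypothetical name.

```python
import numpy as np

def bias_and_sd(x_true, x_pred):
    """Per-flow bias b(n) and standard deviation SD(n) from (12)."""
    err = np.asarray(x_pred, dtype=float) - np.asarray(x_true, dtype=float)
    bias = err.mean(axis=1)
    # ddof=1 gives the 1/(T-1) normalization around the mean error b(n)
    sd = err.std(axis=1, ddof=1)
    return bias, sd
```

Plotting `bias` against `sd` for each method gives a scatter of the kind summarized in Figure 13.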

Finally, the performance improvement ratio is shown in Figure 14 as an overall evaluation. The performance improvement ratio is defined as

$$\mathrm{PIR} = \frac{\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| x_n^{a}(t) \right| - \sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| x_n^{b}(t) \right|}{\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| x_n^{a}(t) \right|}, \tag{13}$$

where $x_n^{a}(t)$ and $x_n^{b}(t)$ denote the predictors via the algorithms $a$ and $b$, respectively. The performance improvement ratios of DBNSTCS are 68.74%, 5.24%, and 14.70% relative to PCA, Tomogravity, and SRMF, respectively.
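The ratio in (13) is a one-line computation once the two arrays are available. The sketch below is an interpretation rather than the authors' code: it assumes that the arrays for algorithms $a$ and $b$ hold per-entry prediction errors (the text calls them predictors), that $a$ is the baseline and $b$ the method being evaluated, and it uses the hypothetical name `performance_improvement_ratio`.

```python
import numpy as np

def performance_improvement_ratio(err_a, err_b):
    """PIR of (13): relative reduction in summed absolute values from a to b.

    err_a, err_b: arrays of shape (n_flows, n_slots) for the baseline a
    and the evaluated method b (assumed here to be prediction errors).
    Returns a fraction; multiply by 100 for a percentage.
    """
    total_a = np.sum(np.abs(err_a))
    total_b = np.sum(np.abs(err_b))
    return (total_a - total_b) / total_a
```

A positive value means that method $b$ has a smaller summed absolute value than baseline $a$.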

6. Conclusions and Future Work

This paper focuses on the problem of network traffic prediction in wireless mesh backbone networks and proposes a hierarchical prediction method. The proposed method divides the network traffic into two components and then predicts each component with a different model. In detail, it takes advantage of the DBN and Spatiotemporal Compressive Sensing for network traffic prediction. In our method, the DWT is applied to divide the network traffic into two components, that is, the long-range dependence component and the fluctuation component, represented by the low-pass and high-pass components, respectively. A deep architecture consisting of a DBN and a logistic regression layer is proposed to predict the low-pass component. Meanwhile, the high-pass component is predicted by the SRMF method, which can capture its spatiotemporal characteristics. We assess the performance of the proposed prediction method and compare it with three methods that are widely used for network traffic prediction. According to the simulation, our method achieves low prediction error, especially in terms of TRE. The main bottleneck of wireless mesh backbone network traffic prediction is the prediction accuracy for the irregular fluctuations of network traffic. Therefore, a prediction algorithm aimed at the high-pass components is necessary in the future.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Page 6: Network Traffic Prediction Based on Deep Belief Network and …downloads.hindawi.com/journals/wcmc/2018/1260860.pdf · 2019-07-30 · low-pass approximation and the details of network

6 Wireless Communications and Mobile Computing

0

02

04

06

08

1

12

14

16

18

2

Low

-pas

s com

pone

nt o

f OD

105

900 1000Time slot

0 100 200 300 400 500 600 700 800

times108

(a) Low-pass component of OD 105

minus25

minus2

minus15

minus1

minus05

0

05

1

15

Hig

h-pa

ss co

mpo

nent

of O

D 1

05

times107

900 1000Time slot

0 100 200 300 400 500 600 700 800

(b) High-pass component of OD 105

Figure 5 Decomposition of OD 105

Logisticregression

DBN

c

ℎM

ℎ1

c

Figure 6 Deep architecture for prediction

predictors are (1198881119901119902 119888119870119901119902) By training the proposed deeparchitecture using the scaling coefficient training set wecan obtain a relationship between input and output scalingcoefficients Furthermore we use the scaling coefficients of119909119901119902(119905) as an input and then a predictor of the scalingcoefficient will be achieved

43 Spatiotemporal Compressive Sensing for High-Pass Com-ponent Prediction For the discrete wavelet transform coef-ficients SRMF is a matrix-oriented interpolation algorithmTherefore different from the prediction of the scaling coef-ficients where each OD flow is predicted independently wepredict the discrete wavelet transform coefficients of all ODflows at the same time According to (7) the discrete wavelettransform coefficients of all OD flows constitute a matrixdenoted by 119863 in this paper The matrix 119863 consists of twoportions One is from the training data obtained bymeasurednetwork traffic data and the other needs to be predictedWe denote the final predicting result by 119863 and then it

can be predicted by the following regularized optimizationmodel 10038171003817100381710038171003817119863 minus 119863100381710038171003817100381710038172119865 + 120582 (1198712119865 + 1198772119865) + 10038171003817100381710038171003817(119871119877119879)119867100381710038171003817100381710038172119865 (8)

where the notations sdot 119865 and (sdot)119879 denote the Frobenius normand transposition respectively The matrix 119867 so-called thetemporal constraint matrix (shown by (9)) describes thetemporal neighbors

119867 = [[[[[[[

1 minus1 0 sdot sdot sdot 00 1 minus1 sdot sdot sdot 0 0 0 0 sdot sdot sdot 1

]]]]]]] (9)

The matrices 119871 and 119877 are from the singular value decompo-sition of the matrix119863 that is

119863 = 119880Σ119881119879 (10)

119880 and 119879 are two unitary matrices and Σ is a diagonal matrixwhose diagonal elements are the singular values of 119863 Thematrices 119871 and 119877 are equal to 119880Σ12 and 119881Σ12

Finally according to the predictors of the scalingand discrete wavelet transform coefficients we predictthe network traffic by inverse discrete wavelet transformAlgorithm 1 proposes the details of our method

5 Simulation Results and Analysis

This section will verify the performance of our predictionmethod In our simulations a real network traffic data setwith 2016 time slots is sampled on a time scale of 5minutesFor exact prediction the first 2000 time slots are used asthe prior information to train the deep architecture and the

Wireless Communications and Mobile Computing 7

(1) Input training set (1199091119901119902(119905) 1199092119901119902(119905) 119909119870119901119902(119905)) known network traffic119883(2) Output network traffic predictor119883(119879 + 1)(3) for 119894 = 1 119870 do(4) 119888119894119901119902 larr DWT(119909119894119901119902(119905))(5) end for(6) Training deep architecture using (1198881119901119902 1198882119901119902 119888119870119901119902) as a training set(7) [119862119863] larr DWT (119883)(8) 119862 larr DBN (119862)(9) 119863 larr SRMF (119863)(10)119883(119879 + 1) larr IDWT (119862 D)(11) return 119883(119879 + 1)

Algorithm 1 DBNSTCS algorithm

Predictors of DBNSTCS

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 7 Real traffic data versus their predictors via DBNSTCS

matrix 119863 We will compare our method with three state-of-the-art methods in network traffic prediction field that isthe principal component analysis (PCA) method [20] theTomogravity method [21] and the SRMF method [19] Theproposed method is implemented by MATLAB in a singlemachine with Core i5 central processing unit 4 GBmemoryand 1664 MB graphics processing unit (GPU) memoryMeanwhile we set119872 = 8 and 119879 = 800

We first plot the real network traffic versus the predictors from each of the four methods. Figure 7 displays the prediction results of our method. The $x$-axis and $y$-axis denote the predictors and the real network traffic, respectively. From Figure 7, we see that our method has low prediction biases for small network traffic flows. By contrast, our method shows positively biased predictions for large network traffic. The same conclusion can be obtained for PCA from Figure 8: for large network traffic, it also has positively biased predictions. For small network traffic, PCA has a much larger prediction bias. Tomogravity has consistently positively biased predictions for large network traffic, as shown in Figure 9. For small network traffic, Tomogravity achieves a desirably low prediction error. Besides, for large network traffic, SRMF (Figure 10) shows both positively and negatively biased predictions to some degree.

Figure 8: Real traffic data versus their predictors via PCA ($x$-axis: predictors of PCA, $\times 10^7$; $y$-axis: real network traffic, $\times 10^7$).

Now we use the spatial and temporal relative errors as metrics to compare the four methods. The spatial and temporal relative errors are defined as

$$
\begin{aligned}
\mathrm{SRE}(n) &= \frac{\sqrt{\sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t)\right)^2}}{\sqrt{\sum_{t=1}^{T} x_n^2(t)}}, \\
\mathrm{TRE}(t) &= \frac{\sqrt{\sum_{n=1}^{N^2} \left(\hat{x}_n(t) - x_n(t)\right)^2}}{\sqrt{\sum_{n=1}^{N^2} x_n^2(t)}},
\end{aligned}
\tag{11}
$$

where $x_n(t)$ and $\hat{x}_n(t)$ are the $n$th end-to-end network traffic flow and its predictor, respectively. As mentioned above, $p, q \in \{1, 2, \ldots, N\}$; thus the number of OD flows is $N^2$. Figure 11(a) exhibits the spatial relative errors (SREs) of the four methods. The $x$-axis gives the identities of the end-to-end network traffic flows, which are sorted in descending order with respect to their means. Meanwhile, the $y$-axis is the SRE. From this simulation, we find that our method has a consistently low SRE compared with the other three methods.

Figure 9: Real traffic data versus their predictors via Tomogravity ($x$-axis: predictors of Tomogravity, $\times 10^7$; $y$-axis: real network traffic, $\times 10^7$).

Figure 10: Real traffic data versus their predictors via SRMF ($x$-axis: predictors of SRMF, $\times 10^7$; $y$-axis: real network traffic, $\times 10^7$).

Figure 11(b) shows the TREs of the four methods. The TREs of PCA are much higher than those of the other methods, and the TRE of DBNSTCS is the lowest among the four methods. The cumulative distributions of SRE and TRE are shown in Figure 12, which presents the prediction error more directly. Besides, we find that DBNSTCS achieves a much more prominent improvement in TRE than in SRE. That is because our method predicts the low-pass component of each traffic flow independently. In addition, in contrast to the high-pass component, this component has a significant effect on prediction. Hence, the weak improvement in SRE is caused by predicting the high-pass component using the Spatiotemporal Compressive Sensing method.
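As a minimal sketch of how the two relative errors can be evaluated, the Python functions below compute the SRE per flow and the TRE per time slot. The data layout (a list of flows, each a list over time slots) and the function names are assumptions made for illustration; the sketch also assumes that no flow and no time slot has all-zero real traffic, since the denominators would then vanish.

```python
import math


def spatial_relative_error(actual, predicted):
    """SRE(n) for every flow n.

    actual[n][t] is the real traffic of flow n at slot t;
    predicted[n][t] is its predictor (assumed layout).
    """
    sre = []
    for x_n, xh_n in zip(actual, predicted):
        num = math.sqrt(sum((xh - x) ** 2 for x, xh in zip(x_n, xh_n)))
        den = math.sqrt(sum(x ** 2 for x in x_n))
        sre.append(num / den)
    return sre


def temporal_relative_error(actual, predicted):
    """TRE(t) for every time slot t, summing over all flows."""
    num_slots = len(actual[0])
    tre = []
    for t in range(num_slots):
        num = math.sqrt(
            sum((xh_n[t] - x_n[t]) ** 2 for x_n, xh_n in zip(actual, predicted))
        )
        den = math.sqrt(sum(x_n[t] ** 2 for x_n in actual))
        tre.append(num / den)
    return tre


if __name__ == "__main__":
    real = [[3.0, 4.0], [1.0, 5.0]]
    pred = [[3.0, 4.0], [1.0, 0.0]]
    print(spatial_relative_error(real, pred))
    print(temporal_relative_error(real, pred))
```

The SRE aggregates over time for one flow, whereas the TRE aggregates over flows for one time slot, so the two metrics expose different weaknesses of a predictor.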

A small error or bias does not by itself mean that a prediction method is usable. Even if a method has low error, it fails to provide precise predictors when it has high variance. Hence, the standard deviation is included in our simulation as a metric for variance, which is defined as

$$
\mathrm{SD}(n) = \sqrt{\frac{1}{T - 1} \sum_{t=1}^{T} \left(\hat{x}_n(t) - x_n(t) - b(n)\right)^2},
\tag{12}
$$

where $b(n) = (1/T) \sum_{t=1}^{T} (\hat{x}_n(t) - x_n(t))$. In (12), $T$ is the length of the predicted traffic data set. Figure 13 shows the bias versus the standard deviation of the four methods. We find that the four methods perform very differently with respect to variance. Tomogravity has larger variance than the other methods. The PCA method also exhibits relatively high variance, whereas the DBNSTCS and SRMF methods show relatively low variance.
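The bias and standard deviation of one flow can be computed as in the following sketch. It assumes that the prediction error is taken as predictor minus real traffic; the opposite convention flips the sign of the bias and leaves the standard deviation unchanged. The function names are ours.

```python
import math


def bias(actual_flow, predicted_flow):
    """b(n): mean prediction error of one flow over T time slots."""
    errors = [xh - x for x, xh in zip(actual_flow, predicted_flow)]
    return sum(errors) / len(errors)


def error_std(actual_flow, predicted_flow):
    """SD(n): sample standard deviation of the error around its bias."""
    num_slots = len(actual_flow)
    if num_slots < 2:
        raise ValueError("at least two time slots are required")
    b = bias(actual_flow, predicted_flow)
    squared = sum(
        (xh - x - b) ** 2 for x, xh in zip(actual_flow, predicted_flow)
    )
    return math.sqrt(squared / (num_slots - 1))


if __name__ == "__main__":
    real = [1.0, 2.0, 3.0, 4.0]
    pred = [2.0, 3.0, 4.0, 5.0]  # constant offset: biased, but zero variance
    print(bias(real, pred), error_std(real, pred))
```

A predictor with a constant offset, as in the usage example, is biased but perfectly consistent, which is why the two quantities are reported together.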

Finally, the performance improvement ratio is shown in Figure 14 as an overall evaluation. The performance improvement ratio is defined as

$$
\mathrm{PIR} = \frac{\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left|\hat{x}_{n,a}(t)\right| - \sum_{n=1}^{N^2} \sum_{t=1}^{T} \left|\hat{x}_{n,b}(t)\right|}{\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left|\hat{x}_{n,a}(t)\right|},
\tag{13}
$$

where $\hat{x}_{n,a}(t)$ and $\hat{x}_{n,b}(t)$ denote the predictors via the algorithms $a$ and $b$, respectively. The performance improvement ratios of DBNSTCS are 68.74%, 5.24%, and 14.70% relative to PCA, Tomogravity, and SRMF, respectively.
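The ratio in (13) is a normalized difference of two absolute sums, which the short sketch below evaluates. The function simply applies the formula to whatever per-flow, per-slot series are supplied for algorithms $a$ and $b$; the nested-list layout and the function name are assumptions for illustration.

```python
def performance_improvement_ratio(values_a, values_b):
    """PIR of algorithm b relative to algorithm a.

    values_a[n][t] and values_b[n][t] hold the quantities for flow n at
    slot t under algorithms a and b (assumed layout).
    """
    total_a = sum(abs(v) for flow in values_a for v in flow)
    total_b = sum(abs(v) for flow in values_b for v in flow)
    if total_a == 0:
        raise ValueError("the reference algorithm has a zero absolute sum")
    return (total_a - total_b) / total_a


if __name__ == "__main__":
    reference = [[4.0, -4.0], [2.0, 0.0]]
    candidate = [[1.0, -1.0], [0.5, 0.0]]
    print(performance_improvement_ratio(reference, candidate))
```

Multiplying the returned value by 100 gives the percentage form used when the ratios are reported.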

6. Conclusions and Future Work

This paper focuses on the problem of network traffic prediction in wireless mesh backbone networks and proposes a hierarchical prediction method. The proposed method divides the network traffic into two components and then predicts each component with a different model. In detail, it takes advantage of the DBN and Spatiotemporal Compressive Sensing for network traffic prediction. In our method, the DWT is applied to divide the network traffic into two components, that is, the long-range dependence component and the fluctuation component, represented by the low-pass and high-pass components, respectively. A deep architecture consisting of a DBN layer and a logistic regression architecture is proposed to predict the low-pass component. Meanwhile, the high-pass component is predicted by the SRMF method, which can capture its spatiotemporal characteristics. We assess the performance of the proposed prediction method and compare it with three methods that are widely used for network traffic prediction. According to the simulation, our method performs well in terms of prediction error, especially TRE. The main bottleneck of wireless mesh backbone network traffic prediction is the prediction accuracy for the irregular fluctuations of network traffic. Therefore, a prediction algorithm aimed at the high-pass (fluctuation) component is necessary in the future.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Figure 11: SRE and TRE. (a) Spatial relative error versus flow ID, from smallest to largest in mean. (b) Temporal relative error versus time slot order. Legend: DBNSTCS, PCA, Tomogravity, SRMF.

Figure 12: Cumulative distributions of SRE and TRE. (a) Cumulative distribution of SRE versus spatial relative error. (b) Cumulative distribution of TRE versus temporal relative error. Legend: DBNSTCS, PCA, Tomogravity, SRMF.

Figure 13: Bias versus standard deviation ($x$-axis: standard deviation in error; $y$-axis: bias). Legend: DBNSTCS, Tomogravity, PCA, SRMF.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61701406, 61701413) and the Fundamental Research Funds for the Central Universities (3102017OQD022).

Figure 14: Performance improvement ratio (%) of DBNSTCS relative to PCA, Tomogravity, and SRMF.

References

[1] T. Qiu, R. Qiao, and D. Wu, “EABS: an event-aware backpressure scheduling scheme for emergency internet of things,” IEEE Transactions on Mobile Computing, 2017.

[2] T. B. Reddy, B. S. Manoj, and R. Rao, “On the accuracy of sampling schemes for wireless network characterization,” in Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC 2008, pp. 3314–3319, USA, April 2008.

[3] T. Qiu, A. Zhao, F. Xia, W. Si, and D. O. Wu, “ROSE: Robustness Strategy for Scale-Free Wireless Sensor Networks,” IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2944–2959, 2017.

[4] B. Zhou, D. He, and Z. Sun, “Traffic predictability based on ARIMA/GARCH model,” in Proceedings of the 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, pp. 200–207, Spain, April 2006.

[5] P. Kuanhoong, I. Tan, and C. YikKeong, “BitTorrent network traffic forecasting with ARIMA,” vol. 4, pp. 143–156, 2012.

[6] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M. L. Wymore, and A. K. Sangaiah, “A Robust Time Synchronization Scheme for Industrial Internet of Things,” IEEE Transactions on Industrial Informatics, pp. 1-1.

[7] P. Calyam and A. Devulapalli, “Modeling of multi-resolution active network measurement time-series,” in Proceedings of the 33rd IEEE Conference on Local Computer Networks, LCN 2008, pp. 900–907, Canada, October 2008.

[8] J. Huang, C. Xing, and C. Wang, “Simultaneous wireless information and power transfer: technologies, applications, and research challenges,” IEEE Communications Magazine, vol. 55, no. 11, pp. 26–32, 2017.

[9] L. Nie, D. Jiang, S. Yu, and H. Song, “Network traffic prediction based on deep belief network in wireless mesh backbone networks,” in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–5, San Francisco, CA, USA, March 2017.

[10] N. C. Anand, C. Scoglio, and B. Natarajan, “GARCH - Non-linear time series model for traffic modeling and prediction,” in Proceedings of the NOMS 2008 - IEEE/IFIP Network Operations and Management Symposium: Pervasive Management for Ubiquitous Networks and Services, pp. 694–697, Brazil, April 2008.

[11] J. Huang, Q. Duan, C.-C. Xing, and H. Wang, “Topology control for building a large-scale and energy-efficient internet of things,” IEEE Wireless Communications Magazine, vol. 24, no. 1, pp. 67–73, 2017.

[12] M. Joshi and T. Hadi, A review of network traffic analysis and prediction techniques, Computer Science, 2015.

[13] Y. Liu, B. Li, X. Sun, and Z. Zhou, “A fusion model of SWT, QGA and BP neural network for wireless network traffic prediction,” in Proceedings of the 15th IEEE International Conference on Communication Technology (ICCT ’13), pp. 769–774, IEEE, Guilin, China, November 2013.

[14] F. Xu, Y. Lin, J. Huang, et al., “Big data driven mobile traffic understanding and forecasting: a time series approach,” IEEE Transactions on Services Computing, vol. 9, no. 5, pp. 796–805, 2016.

[15] W. Huang, G. Song, H. Hong, and K. Xie, “Deep architecture for traffic flow prediction: deep belief networks with multitask learning,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2191–2201, 2014.

[16] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[17] J. Huang, Q. Duan, Y. Zhao, Z. Zheng, and W. Wang, “Multicast Routing for Multimedia Communications in the Internet of Things,” IEEE Internet of Things Journal, vol. 4, no. 1, pp. 215–224, 2017.

[18] R. B. Palm, “Prediction as a candidate for learning deep hierarchical models of data,” Tech. Rep., Technical University of Denmark, 2012.

[19] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu, “Spatio-temporal compressive sensing and internet traffic matrices (Extended Version),” IEEE/ACM Transactions on Networking, vol. 20, no. 3, pp. 662–676, 2012.

[20] A. Soule, A. Lakhina, N. Taft, et al., “Traffic matrices: Balancing measurements, inference and modeling,” in Proceedings of the SIGMETRICS 2005: International Conference on Measurement and Modeling of Computer Systems, pp. 362–373, Canada, June 2005.

[21] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, “Computation of IP traffic from link,” in Proceedings of the ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems, pp. 206–217, USA, June 2003.



Page 8: Network Traffic Prediction Based on Deep Belief Network and …downloads.hindawi.com/journals/wcmc/2018/1260860.pdf · 2019-07-30 · low-pass approximation and the details of network

8 Wireless Communications and Mobile Computing

Predictors of Tomogravity

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 9 Real traffic data versus their predictors via Tomogravity

Predictors of SRMF

0

1

2

3

4

5

6

7

8

9

Real

net

wor

k tr

affic

0 1 2 3 4 5 6 7 8 9

times107

times107

Figure 10 Real traffic data versus their predictors via SRMF

our method has a consistently low SRE comparing with theother three methods Figure 11(b) shows the TREs of fourmethods From Figure 11(b) it shows that the TREs of PCAare much higher than that of the other methods The TRE ofDBNSTCS is the lowest one in four methodsThe cumulativedistributions of SRE and TRE are shown by Figure 12 It canshow the prediction error more directly Besides we find thatDBNSTCS has much more prominent improvement in TREcomparing with SRE That is because our method predictsthe low-pass component of each traffic flow independentlyIn addition contrasting to the high-pass component thiscomponent has a significant effect on prediction Hencethe weak improvement of SRE is caused by predicting thehigh-pass component using the Spatiotemporal CompressiveSensing method

The diminutive error or bias of a prediction method doesnot mean it is available Though it has low error it fails to

provide precise predictors when it has high variance Hencethe standard deviation is involved in our simulation as ametric for variance which is defined as

SD (119899) = radic 1119879 minus 1119879sum119905=1

(119909119899 (119905) minus 119909119899 (119905) minus 119887 (119899))2 (12)

where 119887(119899) = (1119879)sum119879119905=1(119909119899(119905) minus 119909119899(119905)) In (12) 119879 is thelength of predicted traffic data set Figure 13 shows the biasversus standard deviation of four methods We find thatthe four methods perform very differently with respect tovariance Tomogravity has larger variance compared withthe other methods In contrast the PCA method exhibitsrelatively high variance The DBNSTCS and SRMF methodsshow relatively low variance

Finally the performance improvement ratio is shownin Figure 14 as an overall evaluation The performanceimprovement ratio is defined as

PIR = sum1198732

119899=1sum119879119905=1 1003816100381610038161003816119909119899119886 (119905)1003816100381610038161003816 minus sum1198732

119899=1sum119879119905=1 1003816100381610038161003816119909119899119887 (119905)1003816100381610038161003816sum1198732

119899=1sum119879119905=1 1003816100381610038161003816119909119899119886 (119905)1003816100381610038161003816 (13)

where 119909119899119886(119905) and 119909119899119887(119905) denote the predictors via the algo-rithms 119886 and 119887 respectively The performance improvementratios of DBNSTCS are 6874 524 and 1470 to PCATomogravity and SRMF

6 Conclusions and Future Work

This paper focuses on the problem of network traffic pre-diction in wireless mesh backbone networks and proposesa hierarchical prediction method The proposed hierarchicalprediction method divides the network traffic into twocomponents and then predicts each component by differentmodels In detail the proposed method takes advantage ofDBN and Spatiotemporal Compressive Sensing for networktraffic prediction In our method the DWT is applied todividing the network traffic into two components that isthe long-range dependence component and the fluctuationcomponent represented by the low-pass and high-pass com-ponents respectively A deep architecture consisting of aDBN layer and a logistic regression architecture is proposedto predict the low-pass component Meanwhile the other ispredicted by the SRMF method which can capture the spa-tiotemporal characteristic of the high-pass component Weassess the performance of the proposed prediction methodand compare it with three methods that are widely used fornetwork traffic prediction According to the simulation ourmethod is well in prediction error especially in TRE Themain bottleneck of wireless mesh backbone network trafficprediction is the predicted accuracy for the irregular fluctu-ations of network traffic Thereby the prediction algorithmaiming at low-pass components is necessary in the future

Conflicts of Interest

The authors declare that they have no conflicts of interest

Wireless Communications and Mobile Computing 9

Figure 11: SRE and TRE. (a) Spatial relative error versus flow ID, ordered from smallest to largest in mean. (b) Temporal relative error versus time slot order. Methods compared: DBNSTCS, PCA, Tomogravity, and SRMF.

Figure 12: Cumulative distributions of SRE and TRE. (a) Spatial relative error. (b) Temporal relative error. Methods compared: DBNSTCS, PCA, Tomogravity, and SRMF.

Figure 13: Bias versus standard deviation in error for DBNSTCS, Tomogravity, PCA, and SRMF.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61701406, 61701413) and the Fundamental Research Funds for the Central Universities (3102017OQD022).

Figure 14: Performance improvement ratio (%) of DBNSTCS relative to PCA, Tomogravity, and SRMF.

References

[1] T. Qiu, R. Qiao, and D. Wu, "EABS: an event-aware backpressure scheduling scheme for emergency internet of things," IEEE Transactions on Mobile Computing, 2017.

[2] T. B. Reddy, B. S. Manoj, and R. Rao, "On the accuracy of sampling schemes for wireless network characterization," in Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC 2008, pp. 3314–3319, USA, April 2008.

[3] T. Qiu, A. Zhao, F. Xia, W. Si, and D. O. Wu, "ROSE: Robustness Strategy for Scale-Free Wireless Sensor Networks," IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2944–2959, 2017.

[4] B. Zhou, D. He, and Z. Sun, "Traffic predictability based on ARIMA/GARCH model," in Proceedings of the 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, pp. 200–207, Spain, April 2006.

[5] P. Kuanhoong, I. Tan, and C. YikKeong, "BitTorrent network traffic forecasting with ARIMA," vol. 4, pp. 143–156, 2012.

[6] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M. L. Wymore, and A. K. Sangaiah, "A Robust Time Synchronization Scheme for Industrial Internet of Things," IEEE Transactions on Industrial Informatics, pp. 1-1.

[7] P. Calyam and A. Devulapalli, "Modeling of multi-resolution active network measurement time-series," in Proceedings of the 33rd IEEE Conference on Local Computer Networks, LCN 2008, pp. 900–907, Canada, October 2008.

[8] J. Huang, C. Xing, and C. Wang, "Simultaneous wireless information and power transfer: technologies, applications, and research challenges," IEEE Communications Magazine, vol. 55, no. 11, pp. 26–32, 2017.

[9] L. Nie, D. Jiang, S. Yu, and H. Song, "Network traffic prediction based on deep belief network in wireless mesh backbone networks," in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–5, San Francisco, CA, USA, March 2017.

[10] N. C. Anand, C. Scoglio, and B. Natarajan, "GARCH - Non-linear time series model for traffic modeling and prediction," in Proceedings of the NOMS 2008 - IEEE/IFIP Network Operations and Management Symposium: Pervasive Management for Ubiquitous Networks and Services, pp. 694–697, Brazil, April 2008.

[11] J. Huang, Q. Duan, C.-C. Xing, and H. Wang, "Topology control for building a large-scale and energy-efficient internet of things," IEEE Wireless Communications Magazine, vol. 24, no. 1, pp. 67–73, 2017.

[12] M. Joshi and T. Hadi, A review of network traffic analysis and prediction techniques, Computer Science, 2015.

[13] Y. Liu, B. Li, X. Sun, and Z. Zhou, "A fusion model of SWT, QGA and BP neural network for wireless network traffic prediction," in Proceedings of the 15th IEEE International Conference on Communication Technology (ICCT '13), pp. 769–774, IEEE, Guilin, China, November 2013.

[14] F. Xu, Y. Lin, J. Huang et al., "Big data driven mobile traffic understanding and forecasting: a time series approach," IEEE Transactions on Services Computing, vol. 9, no. 5, pp. 796–805, 2016.

[15] W. Huang, G. Song, H. Hong, and K. Xie, "Deep architecture for traffic flow prediction: deep belief networks with multitask learning," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2191–2201, 2014.

[16] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[17] J. Huang, Q. Duan, Y. Zhao, Z. Zheng, and W. Wang, "Multicast Routing for Multimedia Communications in the Internet of Things," IEEE Internet of Things Journal, vol. 4, no. 1, pp. 215–224, 2017.

[18] R. B. Palm, "Prediction as a candidate for learning deep hierarchical models of data," Tech. Rep., Technical University of Denmark, 2012.

[19] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu, "Spatio-temporal compressive sensing and internet traffic matrices (Extended Version)," IEEE/ACM Transactions on Networking, vol. 20, no. 3, pp. 662–676, 2012.

[20] A. Soule, A. Lakhina, N. Taft et al., "Traffic matrices: Balancing measurements, inference and modeling," in Proceedings of the SIGMETRICS 2005: International Conference on Measurement and Modeling of Computer Systems, pp. 362–373, Canada, June 2005.

[21] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, "Computation of IP traffic from link," in Proceedings of the ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems, pp. 206–217, USA, June 2003.
