Internet Traffic Classification Using Multifractal Analysis Approach
Yulios Zavala Huaman, Jeferson Wilian de Godoy Stênico, Lee Luan Ling
International Journal of Advanced Computer Science, Vol. 3, No. 8, Pp. 388-394, Aug. 2013.
Manuscript Received: 15 Apr. 2011
Revised: 5 May 2011
Accepted: 21 Jun. 2011
Published: 15 Jul. 2011
Keywords: Network Traffic, Traffic Classification, Multifractal
Analysis, Multiplicative Cascade
Abstract: In this work, we present a traffic classifier based on the
multifractal theory of network traffic. Specifically, we use the
concept of binomial multiplicative cascades to obtain a feature vector
for the classification scheme. This vector is formed by the variances
of the multipliers in the multiplicative cascade view of the traffic.
We evaluate the performance of the proposed technique with a popular
ML software package, and the results show that it is viable, with
traffic classification rates over 90%.
1. Introduction
Global data traffic on the Internet has grown rapidly and is expected
to quadruple over the next few years, as Cisco predicts [1], driven
mainly by the greater number of devices (tablets and smartphones), the
growth in users, higher broadband speeds and the increase of video on
the network. In this scenario the correct classification of traffic
types plays an important role.
Network management tasks, such as workload characterization, capacity
planning, route provisioning, traffic control and policing, depend on
the identification and classification of network traffic [2]. Network
operators need to know what is flowing through their networks in real
time so they can react quickly to avoid problems and achieve their
business goals. Thus, if these operators want to block incoming
traffic of a given protocol on their network, or if an ISP (Internet
Service Provider) wants to process different types of connections with
different priorities, e.g. limiting the delay of real-time data, the
identification of the protocol in use is key [3].
Therefore the precise classification of network traffic is
essential for various activities related to networks, from
security monitoring to accounting, and from Quality of
Service to providing operators with useful forecasts for
long-term provisioning [4].
Increasingly, new applications are being deployed on the Internet,
e.g. P2P, VoIP and video, applications that have become popular
quickly and that increase the unpredictable use of ports. With this
evolution of the traffic, the traditional classification techniques,
such as those based on well-known port numbers or on packet payload
analysis [5], are not effective for all types of network traffic, or
cannot be deployed because of concerns about the security or privacy
of the data.
Yulios Zavala, Jeferson Wilian de Godoy Stênico and Lee Luan Ling
are with the School of Electrical and Computer Engineering, State
University of Campinas – Unicamp, PO Box 6101-13.083-970, Campinas,
SP, Brazil, e-mail: {yulios,jeferson,lee}@decom.fee.unicamp.br.
Because of this importance, different techniques have been studied
and used to classify network traffic. In this paper we propose a new
multifractal technique, based on multiplicative cascades, to develop a
reliable Internet traffic classifier. After extracting the
characteristics of a group of records with multifractal theory, we use
them as inputs to a machine learning algorithm to determine the
performance and feasibility of multifractal analysis.
The paper is organized as follows: Section 2 reviews related work in
this field. Section 3 gives an overview of the basic concepts of
multifractals and multiplicative cascades and describes the proposed
technique. Section 4 presents the experiments and results. Finally,
Section 5 concludes the paper.
2. Related Work
A considerable number of works study techniques for classifying
network and Internet traffic. This section provides an overview of the
techniques and systems that are related to our work.
The classical techniques use the well-known port numbers defined by
IANA (Internet Assigned Numbers Authority) to identify Internet
traffic (e.g. Domain Name Service applications commonly use port 53).
This approach is now ineffective because some applications use dynamic
port numbers to disguise their traffic as that of known applications
(e.g. the current generation of P2P applications). The work of
Karagiannis et al. shows the problems of the classic techniques [6].
Another technique is the analysis of packet payloads, as in [7]. This
technique searches the payloads for features that differentiate one
application from another, but it can be difficult or even impossible
to apply when the analyzed applications use proprietary protocols or
encrypted traffic.
The need to work with traffic patterns, large sets of
multi-dimensional data and various types of traffic attributes is the
reason for the introduction of ML (Machine Learning) techniques in
this field. Nguyen et al. survey and compare the literature in the
field of ML-based traffic classification in [8]. Moore et al. [4]
proposed 249 flow discriminators and used machine
learning to select the best ones for classifying new flows [9].
Similar strategies were applied in [10-14] to determine the class or
protocol type of the traffic analyzed. Although there is extensive
work in the field of traffic classification, some important issues
remain unresolved and, consequently, the majority of ML-based
techniques are not used by network operators [15].
3. Methodology
Traffic on communication networks is analyzed using probabilistic
processes that represent the demand that users impose on network
resources. The variables considered include the inter-arrival time of
packets, the time between connections, the length of connections, the
packet length, and the duration between sessions. Early research
assumed that inter-arrival times were independent of each other and of
the amount of demand. Subsequently it became necessary to include the
effect of correlation between these variables, and Poisson traffic
models, in which the correlation falls exponentially over time, came
into use.
Important theoretical concepts for network analysts appeared in 1941
with Kolmogorov, who introduced the concept of self-similarity to
describe processes that scale without changes in their statistical
properties [16], and in 1977, when Mandelbrot proposed the term
fractal to describe irregular objects [17]. Building on these
concepts, in 1993 Leland et al. [18], using Ethernet traffic collected
in the network of the Bellcore Morristown Research and Engineering
Center, demonstrated that traffic traces of modern high-speed data
networks exhibit fractal properties, such as self-similarity and
long-range dependence (LRD). It was found that these properties,
especially long-range dependence, have a strong influence on network
performance [19], yet are not adequately modeled by Poisson processes
or, more generally, Markov models.
In contrast to the self-similar or monofractal behavior,
some recent studies suggest that the measured TCP/IP and
WAN ATM traffic flows exhibit a more complex scaling
behavior, which is consistent with multifractals [20, 21].
Multifractal based traffic modeling is more general than the
monofractal based (e.g., self-similar and long range
dependent), and provides a more accurate and detailed
description of network traffic series in different time scales
[22].
Many different multifractal traffic models have been proposed. The
most widely studied ones include: MWM (Multifractal Wavelet Model)
[23], AWMM (Adaptive Wavelet Based Multifractal Model) [24], MMNB
(Multifractal Model based in Newton Binomial) [25], Multi-scaling
Models with Lognormal [26] and Pareto [27] distributed traffic loads,
and VVGM (Variable Variance Gaussian Multiplier) [28].
This section explains the procedure for constructing a conservative
multiplicative cascade and presents the construction of the inverse
cascade as a method to verify that a given set of data is consistent
with a conservative cascade construction.
A. Multifractal
The concept of multifractal process was introduced by Mandelbrot in
the context of turbulence [16]. Multifractal theory has since found
applications in several areas that need to describe non-linear
phenomena with a multiplicative structure, such as stock prices [29],
geophysical phenomena [30], evolution of DNA [15], traffic modeling
[23-25], [28], and others. Network traffic that is considered
multifractal has a strong dependence on its inherent structure, with
bursts occurring at various scales. These characteristics make the
network performance worse than that estimated using Gaussian and
short-range-dependence models [31].
The simplest multifractal is typically constructed by an iterative
procedure called a multiplicative cascade [17]. In this study we use
the concept of multiplicative cascades to build our Internet traffic
classifier.
B. Multiplicative Cascades
Definition 1: A multiplicative cascade is an iterative process that
fragments a given set into smaller pieces according to some geometric
rule and, at the same time, redistributes the total mass of the given
set according to another rule.
The binomial cascade, in which each set is divided into two parts at
every step, is the simplest way to obtain a multifractal process.
Starting from the closed interval $[0,1]$, a multiplicative cascade is
generated as follows.
Let $m_0 = r$ and $m_1 = 1 - r$ be the two multipliers for cascade
generation, possibly with random $r$. At stage $N = 0$ of the cascade
iteration, we have the unit measure, denoted by $\mu_0$, uniformly
distributed on the interval $[0,1]$. At stage $N = 1$ the initial
measure is divided into two parts, mass $m_0$ on the subinterval
$[0,1/2]$ and mass $m_1$ on $[1/2,1]$. At stage $N = 2$, the interval
$[0,1/2]$ is again divided into two subintervals $[0,1/4]$ and
$[1/4,1/2]$, and the procedure is repeated for the interval $[1/2,1]$,
giving the following measures [17]:
$$\mu_2[0,1/4] = m_0 m_0, \quad \mu_2[1/4,1/2] = m_0 m_1, \quad \mu_2[1/2,3/4] = m_1 m_0, \quad \mu_2[3/4,1] = m_1 m_1 \qquad \text{(Equ. 1)}$$
This process is iterated for $k$ levels, and at each stage the total
measure is preserved, since $m_0 + m_1 = 1$. At the $k$th stage of the
cascade, the mass is fragmented over the dyadic subintervals of type
$[t, t + 2^{-k}]$, each with its corresponding measure $\mu$. Let
$\varphi_0$ and $\varphi_1$ denote the relative frequencies of 0's and
1's, respectively, in the cascade development, i.e. among the $k$
left/right choices that lead to the interval. The measure $\mu$ of the
dyadic interval $[t, t + 2^{-k}]$ is given by:
$$\mu[t, t + 2^{-k}] = \mu[\Delta_k] = m_0^{k\varphi_0}\, m_1^{k\varphi_1} \qquad \text{(Equ. 2)}$$
Figure 1 illustrates the formation of this cascade for two stages.
Fig.1. Binomial Multiplicative Cascade
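The construction in Equ. 1 and Equ. 2 can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the function names `binomial_cascade` and `interval_measure`, the default value r = 0.3, and the uniform draw used for the random case are all assumptions made for the example.

```python
import random


def binomial_cascade(stages, r=0.3, rng=None):
    """Return the measures of the 2**stages dyadic intervals of [0, 1].

    With rng=None the fixed multipliers m0 = r and m1 = 1 - r are used at
    every split. With an rng, each split draws its own multiplier uniformly
    from [0, 1), which is just one hypothetical choice of random generator.
    """
    measures = [1.0]  # stage 0: unit mass on [0, 1]
    for _ in range(stages):
        next_measures = []
        for mass in measures:
            m0 = r if rng is None else rng.random()
            next_measures.append(mass * m0)          # left half
            next_measures.append(mass * (1.0 - m0))  # right half
        measures = next_measures
    return measures


def interval_measure(index, stages, r=0.3):
    """Equ. 2 for fixed multipliers: m0**(#zeros) * m1**(#ones).

    The zeros and ones are counted in the `stages`-digit binary address of
    the interval, read as the sequence of left (0) / right (1) choices.
    """
    ones = bin(index).count("1")
    zeros = stages - ones
    return (r ** zeros) * ((1.0 - r) ** ones)
```

For two stages and r = 0.3 the sketch reproduces the four products of Equ. 1 (approximately 0.09, 0.21, 0.21 and 0.49), and the masses sum to one at every stage because each split uses a pair of multipliers that adds up to one.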
Conservative cascades are a type of multiplicative cascade that
conserves mass at all stages. Conservative cascades arise naturally in
the data network context, and the inverse-cascade construction
provides a simple heuristic for checking whether or not a given data
set conforms to an underlying conservative cascade construction [32].
The main objective of building the inverse cascade is therefore to
check whether the conservative rule holds when the mass of an initial
interval is redistributed over its two subintervals and, if so, to
infer the relevant statistical properties of the conservative cascade
generator.
As an example, we use the WWW (web traffic) and bitTorrent data sets.
We take the measured arrival-time traffic data as stage $N$, the
finest scale. The traffic series at cascade stage $(N-1)$ is obtained
by adding consecutive values of stage $N$ in non-overlapping blocks of
size 2. In general, given the data at scale $(N-j)$, $X_i^{N-j}$,
$i = 1, \dots, 2^{N-j}$, we obtain the data at scale $(N-j-1)$ by
adding consecutive values of stage $(N-j)$ as follows:
$$X_i^{N-j-1} = X_{2i-1}^{N-j} + X_{2i}^{N-j} \qquad \text{(Equ. 3)}$$
for $i = 1, \dots, 2^{N-j-1}$. This procedure terminates when the
aggregate is a single value at the final stage of the cascade. An
estimate of the multipliers $r_j^{(i)}$ can be obtained by the
following equation, adapted from [32]:
$$r_j^{(i)} = \frac{X_{2i-1}^{N-j}}{X_i^{N-j-1}} \qquad \text{(Equ. 4)}$$
for $i = 1, \dots, 2^{N-j-1}$. We can consider the $r_j^{(i)}$ as
samples of the multiplier distribution $f_{R_j}(r)$ at stage $j$. The
multiplier distribution at scale $j$ can be obtained from the
histogram of the $r_j^{(i)}$.
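The aggregation and multiplier estimation of Equ. 3 and Equ. 4 can be sketched as follows. The paper's implementation was written in MATLAB and is not reproduced here; the function names `aggregate` and `inverse_cascade_multipliers`, the power-of-two length check, and the decision to skip pairs whose parent mass is zero are assumptions made for this illustration.

```python
def aggregate(series):
    """Equ. 3: sum non-overlapping pairs to obtain the next coarser stage."""
    return [series[k] + series[k + 1] for k in range(0, len(series), 2)]


def inverse_cascade_multipliers(series):
    """Return a list whose entry j holds the multipliers r_j^(i) of Equ. 4.

    `series` is the finest-scale data (stage N) and must have length 2**N.
    Each multiplier is the left child divided by its parent. Pairs whose
    parent mass is zero are skipped (an assumption; the handling of empty
    intervals is not described in the text).
    """
    n = len(series)
    if n < 2 or n & (n - 1):
        raise ValueError("series length must be a power of two >= 2")
    multipliers = []
    child = list(series)
    while len(child) > 1:
        parent = aggregate(child)
        multipliers.append(
            [child[2 * i] / parent[i] for i in range(len(parent)) if parent[i] != 0]
        )
        child = parent
    return multipliers
```

As a small worked case, the series [1, 3, 2, 2] aggregates to [4, 4] and then to [8], giving multipliers [0.25, 0.5] at j = 0 and [0.5] at j = 1.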
Figures 2 and 3 show levels 6 and 9 of the inverse cascade
construction using the WWW traffic trace, and Figure 4 shows the
multiplier histograms for the bitTorrent traffic trace at stages
N = 5 and N = 8. It can be observed that the distribution of the
multipliers (the generator of the multiplicative cascade) is
approximately Gaussian, with mean 0.5.
From the distributions obtained, we estimate the variance at each
stage of the inverse cascade, as seen in Figure 5.
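Fig. 2. Inverse Cascade – Stage 6 (multiplier versus time)
Fig. 3. Inverse Cascade – Stage 9 (multiplier versus time)
Fig. 4. Multiplier histograms fR(r): (a) Stage N = 5, (b) Stage N = 8
Fig. 5. Measured Variance per stage (WWW and bitTorrent)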
C. Proposed Approach
The main goal of this work is to show the performance and viability
of our multifractal Internet traffic classifier.
Consider the arrival time of a set of packets that belong to the same
type of traffic. We use the inverse cascade construction explained in
the section above to obtain the multipliers of this variable.
The variances of the multipliers obtained are placed in a vector
called the “feature vector”. For example, if we analyze a set of 256
packets we get a cascade of eight levels (256 = 2^8), and the feature
vector is formed by the variance of the multipliers at each level;
thus, this vector has eight values.
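A minimal sketch of this feature extraction is given below, assuming that the variance at each level is the population variance of the left-child multipliers and that the levels are ordered from the finest scale to the coarsest. Both choices, as well as the function name `multiplier_variances`, are assumptions made for illustration and are not stated in the text.

```python
from statistics import pvariance


def multiplier_variances(series):
    """Return one multiplier variance per inverse-cascade level.

    `series` must have a power-of-two length, e.g. 256 values give 8 levels.
    Levels are ordered from the finest scale to the coarsest (assumed order).
    """
    features = []
    child = list(series)
    while len(child) > 1:
        # Aggregate non-overlapping pairs, then take left child / parent.
        parent = [child[k] + child[k + 1] for k in range(0, len(child), 2)]
        r = [child[2 * i] / p for i, p in enumerate(parent) if p != 0]
        # A level with no usable multipliers contributes 0.0 (assumption).
        features.append(pvariance(r) if r else 0.0)
        child = parent
    return features
```

With this sketch a window of 256 inter-arrival times yields a vector of eight values. The last entry is always zero, because the coarsest level contains a single multiplier.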
We build a training dataset that consists of pairs {features, label},
where features is a vector of traffic features and label is a name
that identifies the network application that generated the traffic.
In the training phase we use the C4.5 supervised ML method [33], given
its high accuracy and low overhead compared to other ML techniques.
The C4.5 algorithm generates a decision tree from the data by
recursive partitioning. The algorithm considers all possible tests
that can divide the data set and selects the test that yields the
highest information gain. For each discrete attribute, a test with n
outcomes is considered, where n is the number of possible values the
attribute can take. For each continuous attribute, a binary test is
performed on each of the attribute values. At each node, the system
must decide which test to use to split the data. In an offline phase,
C4.5 builds a model from a previously built, pre-classified training
dataset. In the identification phase we first extract the feature
vector of the set of traffic records to analyze. This vector is then
passed to the trained model, built in the previous phase, to generate
the application prediction.
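The split selection described above can be illustrated for a single continuous attribute. The sketch below follows the information-gain criterion as stated in the text. Quinlan's C4.5 additionally normalizes the gain into a gain ratio, which is omitted here. The function names and the toy values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Gain of the binary test `value <= threshold` on a continuous attribute."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # the test does not split the data
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def best_threshold(values, labels):
    """Try each observed value (except the largest) as a candidate threshold."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t), default=None)
```

For example, with the made-up multiplier variances [0.01, 0.02, 0.08, 0.09] labelled www, www, bittorrent, bittorrent, the threshold 0.02 separates the two classes completely and has a gain of 1 bit.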
Figure 6 shows the scheme used in the process of
classifying an unknown traffic.
Fig 6. Classification Technique.
4. Experimental Results
In this section we explain how the experiments were configured, the
datasets used and the categories of traffic analyzed.
We tested the proposed technique on three traffic datasets. The first
dataset consisted of anonymized payload traces collected at two edge
links located in Italy, Spain and USA. The UNINA trace [34] was
captured at the 200 Mbps link that connects the University network to
the rest of the Internet. These traces are in tcpdump format. For the
www traffic category we used Trace1, which is the traffic to TCP port
80 generated by clients inside the network of the University of Naples
Federico II (Unina) reaching the outside world. As a traffic sample
for the Mail category we used Trace1, which is the traffic to TCP port
25 generated by clients inside the Unina network reaching the outside
world. Table 1 shows this packet dataset.
TABLE 1
PACKET DATA SET
Class Port Instances
www 80 70000
Mail 25 60000
The second dataset is the evaluation dataset used in [35]. It consists
of seven traces collected at the Gigabit access link of the
Universitat Politècnica de Catalunya (UPC), which connects about 25
faculties and 40 departments (geographically distributed over 10
campuses) to the Internet through the Spanish Research and Education
network (RedIRIS). We use only four application types (BitTorrent,
Domain Name Service - DNS, HTTP and VoIP) of the UPC-II trace. Table 2
shows this flow dataset.
TABLE 2
FLOWS DATA SET
Application Instances
BitTorrent 70000
DNS 60000
HTTP 60000
Skype 70000
The last dataset used was NSL-KDD [37], an improved version of the
KDD-99 dataset used for The Third International Knowledge Discovery
and Data Mining Tools Competition. The competition task was to build a
network intrusion detector, a predictive model capable of
distinguishing between 'bad' connections, called intrusions or
attacks, and 'good' normal connections. Table 3 shows this anomalies
dataset.
TABLE 3
ANOMALIES DATA SET
Application Instances
Normal 67343
Attacks 58630
We performed our experiments on an Intel Pentium Dual Core 1.86 GHz
with 2.00 GB of RAM. The inverse cascade construction algorithm was
implemented in MATLAB 7. We used the WEKA ML software [36] to build
the J48 decision tree, an open-source Java implementation of the
original C4.5. This software was also used by Moore et al. to perform
their analysis [4].
To evaluate the performance of the classification algorithm we used
the following metrics: Detection Rate (DR), True Positive Rate (TPR)
and False Positive Rate (FPR). For each traffic class, TPR and FPR are
defined as:
$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad \text{(Equ. 5)}$$
$$\mathrm{FPR} = \frac{FP}{FP + TN} \qquad \text{(Equ. 6)}$$
where TP is the number of correctly classified positive samples, TN
is the number of correctly classified negative samples, FP is the
number of negative samples incorrectly classified as positive and FN
is the number of positive samples incorrectly classified as negative.
Thus, TPR is the ratio of the number of positive samples correctly
classified to the total number of positive samples, and FPR is the
ratio of the number of negative samples incorrectly classified to the
total number of negative samples. The metric DR is defined as:
=
(Equ. 7)
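The per-class rates can be computed directly from the true and predicted labels, as in the sketch below. The function names are hypothetical. The helper `overall_accuracy` returns the overall fraction of correctly classified samples, which is only an assumption about how DR might be computed and not the paper's stated formula.

```python
def class_rates(y_true, y_pred, positive):
    """Return (TPR, FPR) for the class `positive`, treating all others as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label (assumed DR)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In a two-class problem the FPR of one class equals one minus the TPR of the other, since the negatives of one class are exactly the positives of the other.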
Tables 4 and 5 show the classification rate metrics for the Packets
dataset using seven (7n) and eight (8n) cascade stages with two
traffic features (Inter-arrival Time and Packet Size).
TABLE 4
PACKETS DATASET-TRAIN RESULTS
Class   7 Stages            8 Stages
        TPR(%)   FPR(%)     TPR(%)   FPR(%)
www     90.7     30.1       97.4     15.8
Smtp    69.9     9.3        84.2     2.6
TABLE 5
PACKETS DATASET-TEST RESULTS
Class   7 Stages            8 Stages
        TPR(%)   FPR(%)     TPR(%)   FPR(%)
www     84.7     34.6       97.6     28.2
Smtp    65.4     15.3       71.8     2.4
Tables 4 and 5 show that with seven cascade stages we obtain a true
positive rate of about 90% for www traffic on the training set, and
about 97% with eight stages. With a greater number of stages the
feature vector is larger, which implies a better classification of the
traffic. This is confirmed by the detection rates shown in Table 6.
TABLE 6
PACKETS RESULTS
        7 Stages   8 Stages
        DR (%)     DR (%)
Train   81.06      91.32
Test    75.46      85.18
Tables 7 and 8 show the classification rate metrics for the Flows
dataset using seven (7n) and eight (8n) cascade stages with three
traffic features (Packets Number, Flow Bytes and Flow Time).
TABLE 7
FLOWS DATASET-TRAIN RESULTS
Application   7 Stages            8 Stages
              TPR(%)   FPR(%)     TPR(%)   FPR(%)
Bittorrent    86.6     2.4        93.4     0.7
Dns           99.8     4.4        98.7     1.7
Http          93.6     0.1        99.8     0.6
Skype         98.2     0.5        98.9     0.1
TABLE 8
FLOWS DATASET-TEST RESULTS
Application   7 Stages            8 Stages
              TPR(%)   FPR(%)     TPR(%)   FPR(%)
Bittorrent    72.9     5.8        73.8     5.8
Dns           82.1     8.9        87.2     0.6
Http          98.7     0.1        99.9     0.1
Skype         92.9     3.3        90.5     0.1
The classification rates in Tables 7 and 8 are higher than those
obtained for the Packets dataset, which we attribute to the use of
more features in the analysis, in this case three. Thus the use of
more features improves the performance of our technique, as also shown
by the rates in Table 9.
TABLE 9
FLOWS RESULTS
        7 Stages   8 Stages
        DR (%)     DR (%)
Train   94.43      97.63
Test    86.50      87.65
Tables 10 and 11 show the classification rate metrics for the
Anomalies dataset using six (6n) and seven (7n) cascade stages with
four traffic features (Count, srv_count, dst_host_count,
dst_host_srv_count).
TABLE 10
ANOMALIES DATASET-TRAIN RESULTS
Class     6 Stages            7 Stages
          TPR(%)   FPR(%)     TPR(%)   FPR(%)
Normal    99.4     0.5        99.9     0.2
Attacks   99.5     0.6        99.8     0.1
Fig. 7 True Positive Rate of levels 5 to 10 for Anomalies Train Dataset.
TABLE 11
ANOMALIES DATASET-TEST RESULTS
Class     6 Stages            7 Stages
          TPR(%)   FPR(%)     TPR(%)   FPR(%)
Normal    98.7     22.5       99.9     17.0
Attacks   77.5     1.3        83.0     0.1
Fig. 8 True Positive Rate of levels 5 to 10 for Anomalies Test Dataset
For the last dataset analyzed, Tables 10, 11 and 12 show a good
classification rate. This demonstrates the viability of our
classification technique, which can achieve higher accuracy than
traditional techniques for traffic flow classification, which are
often no more accurate than 50-70% [4].
TABLE 12
ANOMALIES RESULTS
        6 Stages   7 Stages
        DR (%)     DR (%)
Train   94.44      99.89
Test    86.61      90.28
5. Conclusions
In this work, we have evaluated our traffic classifier on three
datasets composed of packets, flows and anomaly records. In our
experiments, the classification technique based on multifractal
multiplicative cascades can achieve detection rates above 90%.
The multifractal classification technique extracts traffic features
to build a model in an offline phase, which is later used to identify
network traffic online. The approach showed good performance in
classifying the record traces studied, in experiments with six, seven
and eight cascade stages. We believe that the performance of this
technique can be refined through the selection of the optimum number
of cascade levels used for the analysis.
References
[1] Cisco. “Global Internet Traffic Projected to Quadruple by
2015,” (2013). Available: http://newsroom.cisco.com/home1
(last accessed April 2013)
[2] R. Alshammari & A. N. Zincir-Heywood, “A Flow Based
Approach for SSH Traffic Detection,” (2007) In Systems,
Man and Cybernetics. ISIC. IEEE International Conference
on, pp. 296-301.
[3] J. Hurley, E. Garcia-Palacios & S. Sezer, “Classifying
Network Protocols: A 'Two-Way' Flow Approach,” (2011)
Communications, IET, vol. 5, pp. 79-89.
[4] A. W. Moore & D. Zuev, “Internet Traffic Classification
Using Bayesian Analysis Techniques,” (2005) SIGMETRICS
Perform. Eval. Rev., vol. 33, pp. 50-60.
[5] J. Erman, A. Mahanti & M. Arlitt, “Internet Traffic
Identification Using Machine Learning,” (2006) In Global
Telecommunications Conference, GLOBECOM '06. IEEE,
2006, pp. 1-6.
[6] T. Karagiannis, A. Broido, M. Faloutsos & K. Claffy,
“Transport Layer Identification of P2P Traffic,” (2004) In
Proc. of IMC'04.
[7] S. Sen, O. Spatscheck, & D. Wang, “Accurate, Scalable
In-Network Identification of P2P Traffic Using Application
Signatures,” (2004) In WWW2004, New York, USA, May
17-22.
[8] T. T. T. Nguyen, & G. Armitage, “A Survey of Techniques
for Internet Traffic Classification Using Machine Learning,”
(2008). IEEE Communications Surveys & Tutorials, vol.
10, pp. 56-76.
[9] A. Moore, D. Zuev, & M. Crogan, “Discriminators for Use
in Flow-Based Classification,” (2005) Technical Report, Intel
Research.
[10] D. Zuev & A. Moore, “Traffic Classification Using a
Statistical Approach,” (2005) Lect. Notes Comput. Sci., 3431, pp.
321–324.
[11] G.P.S. Junior, J.E.B. Maia, R. Holanda & J.N. Sousa,“P2P
Traffic Identification Using Cluster Analysis,”(2007). Global
Information Infrastructure Symp., giis, pp. 128–133.
[12] S. Zander, T. Nguyen & G. Armitage, “Automated Traffic
Classification and Application Identification Using Machine
Learning,” (2005). Proc. IEEE Conference on Local
Computer Networks 30th Anniversary, LCN.
[13] M. Roughan, S. Sen, O. Spatscheck & N. Duffield,
“Class-of-Service Mapping for QoS: A Statistical
Signature-Based Approach to IP Traffic Classification,”
(2004). Proc. Fourth ACM Sigcomm Conference on Internet
Measurement, pp. 135–148.
[14] A. Mcgregor, M. Hall, P. Lorier & J. Brunskill, “Flow
Clustering Using Machine Learning Techniques,”(2004)
Proc. Fifth Passive and Active Measurement Workshop, Pam.
[15] D. R. Bickel & B.J. West, “Multiplicative and Fractal
Processes in DNA Evolution,”(1998) Fractals, 6, 211–217.
[16] A.N. Kolmogorov, “A Refinement of Previous Hypotheses
Concerning The Local Structure of Turbulence in a Viscous
Incompressible Fluid at High Reynolds Number,” (1962)
J. Fluid Mech., 13, 82–85.
[17] A. Feldmann, A. Gilbert, W. Willinger & T.G. Kurtz, “The
Changing Nature of Network Traffic: Scaling
Phenomena,” (1998) ACM Computer Communication
Review, v.28, p.5-29.
[18] W. Leland, M. Taqqu, W. Willinger & D. Wilson, “On The
Self-Similar Nature of Ethernet Traffic,” (1994) (Extended
Version), IEEE/ACM Transactions on Networking, v.2,n.1,
pp 1-15.
[19] I. Norros, “A Storage Model with Self-Similar Input,” (1994)
Queueing Systems,16, pp.387-396.
[20] M. S. Taqqu, V. Teverovsky & W. Willinger, “Is network
traffic self-similar or multifractal?,” (1997) Fractals, vol. 5,
pp. 63-74.
[21] A. Feldmann, A. C. Gilbert & W. Willinger, “Data Networks as
Cascades: Investigating The Multifractal Nature of Internet
WAN Traffic,” (1998). Computer Communication Review.
[22] J. Vinay Ribeiro, R. H. Riedi, M. S. Crouse & R. G.
Baraniuk, “Multiscale Queuing Analysis of
Long-Range-Dependent Network Traffic,” (2000). IEEE
INFOCOM 2000.
[23] R.H Riedi, M.S. Crouse, V.J. Ribeiro & R.G. Baraniuk, “A
Multifractal Wavelet Model with Application to Network
Traffic,” (1999) IEEE Transactions on Information Theory.
[24] F.H.T.Vieira & L.L. Lee, “Adaptive Wavelet Based
Multifractal Model Applied to the Effective Bandwidth
Estimation of Network Traffic Flows,” (2009). IET
Communications.
[25] J.W.G. Stenico & L.L. Lee, “A New Binomial Conservative
Multiplicative Cascade Approach for Network Traffic
Modeling,” (2013). In 27th IEEE International Conference on
Advanced Information Networking and Applications – IEEE
AINA 2013.
[26] J.W.G. Stenico & L.L. Lee, “A Multifractal Based Dynamic
Bandwidth Allocation Approach for Network Traffic Flows,”
(2010) IEEE International Conference on Communications
(ICC), 23-27 May 2010, pages 1 – 6.
[27] J.W.G. Stenico & L.L. Lee, “A New Approach for Buffer
Queueing Evaluation under Network Flows with Multi-Scale
Characteristics,” (2012) In: International Joint Conferences
on Computer, Information, and Systems Sciences, and
Engineering (CISSE 12), 2012, University of Bridgeport -
EUA.
[28] P.M. Krishna, V.M. Gadre & U.B. Desai, “Multifractal Based
Network Traffic Modeling,” (2003) Kluwer Academic
Publishers.
[29] B. B. Mandelbrot, L. Calvet & A. Fisher, “Large Deviations
and The Distribution of Price Changes,”(1997). Discussion
paper No 1165 of the Cowles Foundation for Economics at
Yale University.
[30] V. Gupta & E. Waymire, “A Statistical Analysis of
Mesoscale Rainfall as a Random Cascade” (1993) Journal of
Applied Meteorology, 32, 251–267.
[31] T.D. Dang, S. Molnár & I.Maricza,“Queuing Performance
Estimation for General Multifractal Traffic,”(2003). Int. J.
Commun. Syst., vol 16 no 2, pp 117–136.
[32] A.C. Gilbert, W. Willinger & A. Feldmann, “Scaling
Analysis of Conservative Cascades, with Applications to
Network Traffic,” (1999) AT&T Labs.-Res., Florham Park,
NJ.Information Theory, IEEE Transactions on, vol. 45, pp.
971-991.
[33] J.R. Quinlan, “C4.5: Programs for Machine Learning,” (1993)
Morgan Kaufmann. Publishers Inc. San Francisco, CA, USA.
[34] Unina. “Network Tools And Traffic Traces,” (2013)
Available: http://www.grid.unina.it/Traffic/Traces/ttraces.php
(last accessed April 2013).
[35] V. Carela-Español, P. Barlet-Ros, A.C. Aparicio & S. Pareta,
“Analysis of The Impact of Sampling on NetFlow Traffic
Classification,” (2011) Computer Networks, vol. 55, pp.
1083-1099.
[36] G. Holmes, A. Donkin & H. Witten, “WEKA: A Machine
Learning Workbench,” (1994) In Intelligent Information
Systems. Proceedings of the Second Australian and New
Zealand Conference on, Brisbane, Qld., Australia, pp.
357-361.
[37] M. Tavallaee, E. Bagheri, W. Lu & A. Ghorbani, “A Detailed
Analysis of the KDD CUP 99 Data Set,” (2009) IEEE
Symposium on Computational Intelligence for Security and
Defense Applications (CISDA), pp. 1-6.
Yulios Zavala Huaman graduated in Ingeniería de Sistemas e
Informática from Universidad Nacional Mayor de San Marcos (2005) and
is a master's student in the Laboratory of Pattern Recognition and
Communication Networks, Universidade Estadual de Campinas - Unicamp.
He has experience in the area of Electrical Engineering, with emphasis
on telecommunications. His research is specialized in application and
service identification. His interests span network monitoring,
machine learning and data mining.
Jeferson Wilian de Godoy Stênico received the B.S. in Mathematics
from Universidade Estadual Paulista Júlio de Mesquita Filho – UNESP,
Brazil (2006) and the M.Sc. in Electrical Engineering from State
University of Campinas – Unicamp, Brazil (2009), and is currently a
Ph.D. student in Electrical Engineering at State University of
Campinas – Unicamp. His current research interests include network
traffic modeling, network design, performance analysis and
communications systems.
Lee Luan Ling received the B.S. and M.Sc. degrees in Electrical
Engineering from University of São Paulo (1980) and State University
of Campinas (1984), respectively, in São Paulo, Brazil. In 1991 he
received a Ph.D. degree in Electrical Engineering from Cornell
University, Ithaca, USA. In 1984 he became a faculty member at the
School of Electrical and Computer Engineering, State University of
Campinas, where he is currently a Full Professor. His current research
interests include pattern recognition, handwriting recognition,
biometrics, image processing, artificial intelligence, video
monitoring and surveillance, network traffic modeling, and network
design and performance analysis.