Connect. Communicate. Collaborate
A Network Security Service for GÉANT2 (and beyond…)
Maurizio Molina, DANTE
TNC 08, Bruges, 20th May 2008
Outline
• The vision
• Proof of concept
• Supporting tools
• Service Outlook
The vision: enhance NRENs’ security
• NRENs have their CERTs to deal with security
• and collaborate with each other
– Trusted Introducer
– GN2 JRA2
• and DANTE can filter traffic on GN2 if NRENs request it…
! BUT !
• Can we be more proactive towards NREN CERTs by exploiting the visibility of the GN2 core?
The vision (cont.): enhance NRENs’ security
• To spot security anomalies in the GN2 core you need data
• Good old SNMP? Too coarse!
• Router logs? OK, but you need to know what you’re looking for
• Run a darknet?
– It’s not where a core network makes a difference
– Others already do it
• NetFlow? yes, but you need good tools!
• Routing data? Only as a complement to NetFlow
Proof of concept: what can we see with NetFlow data?
NfSen, enhanced with self-written Anomaly Detection extensions
NetFlow collected on all peering interfaces
1/1,000 sampling
3k flows/s
Bits, packets or flows? What to use?
• Flows/s are more indicative of security incidents
• But with fixed thresholds, small interesting peaks will disappear in daily cycles!
OK, we’re smart… let’s filter!
[Figure: block diagram of the filter, with the input, two states X1 and X2, gains K1 and K2, and the forecast error as output]
k1=(1-p1)*(1-p2) ; k2= (1-p1)+(1-p2)
Choice: p1=p2=0.9
• The “error” is used in control loops
• Here we use it to spot a deviation from a baseline
• It’s an “observer”
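The slide gives the gains but not the update equations, so the following is only a sketch under an assumption: that the diagram is a two-state predictor in which X1 tracks the level of the signal and X2 its slope, with K2 correcting X1 and K1 correcting X2. With that (assumed) structure, the gains k1 and k2 above place both poles of the estimation error at p1 and p2. The function names are illustrative, not taken from the NfSen extensions.

```python
def observer_gains(p1=0.9, p2=0.9):
    """Gains as given on the slide for poles p1 and p2."""
    k1 = (1 - p1) * (1 - p2)
    k2 = (1 - p1) + (1 - p2)
    return k1, k2


def forecast_errors(samples, p1=0.9, p2=0.9):
    """Return the forecast error for each sample.

    Assumed model (not stated on the slide): x1 is the predicted level,
    x2 the predicted slope; each step predicts x1 + x2 and corrects both
    states with the forecast error.
    """
    k1, k2 = observer_gains(p1, p2)
    x1 = samples[0] if samples else 0.0  # start on the first sample
    x2 = 0.0
    errors = []
    for value in samples:
        error = value - x1              # deviation from the baseline forecast
        errors.append(error)
        x1, x2 = x1 + x2 + k2 * error, x2 + k1 * error
    return errors
```

On a flat series the error stays at zero. A single spike produces a large error followed by smaller errors of the opposite sign while the states settle back, so the output has to be read with that rebound in mind.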
Does it help? Not if we stick to volumes (e.g. flows/s)…
UDP flows (filtered)
TCP flows (filtered)
Are there other more “security sensitive” features?
• Recent work on Anomaly Detection suggests focusing on the concentration or dispersion of
– Flows per IP source address
– Flows per IP destination address
– Flows per IP source port
– Flows per IP destination port
• AKA “IP features entropies”
Explanation of IP feature entropy
[Figure: two plots of the fraction of total flows received per IP address (y-axis, 0 to 0.25) against IP (ranked) (x-axis). Panels: “Normal” and “Traffic more focused towards a few hosts”.]
The entropy H is:
$$H(x) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}$$
where $n_i$ is the number of flows for the i-th value (e.g. the i-th IP address), $S$ is the total number of flows and $N$ is the number of distinct values.
H varies between 0 (“one point takes all”) and $\log_2 N$ (uniform distribution)
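The entropy of a feature can be computed directly from per-value flow counts. The sketch below assumes the flows of one time bin are available as a plain list of feature values (for example destination IP addresses as strings); the function name and input format are illustrative only.

```python
import math
from collections import Counter


def feature_entropy(values):
    """Entropy (in bits) of the distribution of flows over feature values.

    `values` holds one entry per flow, e.g. its destination IP address.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        share = count / total           # fraction of flows for this value
        entropy -= share * math.log2(share)
    return entropy
```

If all flows go to one address the result is 0, and four equally loaded addresses give 2 bits, matching the two extremes described above.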
IP feature entropy (simplified)
[Figure: two plots of the fraction of total flows received per IP address (y-axis, 0 to 0.25) against IP (ranked) (x-axis). Panels: “Normal” (f=0.6) and “Traffic more focused towards a few hosts” (f=0.81).]
• Percentage of flows associated to top N src IPs, dst IPs, src ports, dst ports
• We tried N = 1, 10, 100, 500
• N=10 was the best choice (anomalies appear more evident)
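A minimal sketch of this simplified measure, assuming the flows of one time bin are given as a list of feature values (one entry per flow). The function name is hypothetical; the default N=10 follows the choice reported above.

```python
from collections import Counter


def top_n_fraction(values, n=10):
    """Fraction of flows that belong to the n most frequent feature values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total
```

The same function can be applied separately to source IPs, destination IPs, source ports and destination ports, giving four time series per traffic aggregate.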
IP features entropies (after “observer” filtering)
UDP features entropies
TCP features entropies
10 days of GN2 traffic
Drilling down on a TCP peak
- Concentration of DST IPs and DST ports receiving flows
- Dispersion of SRC IPs and SRC ports
• An IRC server in Slovenia receiving a lot of 60-byte SYN packets on port 6667, mainly from a /16 subnet of a university in the Netherlands.
• Likely a “BotNet war”?
- The “bounce” is due to the filter, and needs a state machine to be correctly interpreted!
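The slides do not describe the state machine, so the following is purely a hypothetical illustration of the idea: after an excursion raises an alert, excursions in the opposite direction are treated as filter rebound for a hold-off period rather than as new anomalies. The state names, `threshold` and `holdoff` are all assumptions made for this sketch.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"        # waiting for a new excursion
    HOLDOFF = "holdoff"  # alert raised; the filter may still be bouncing


def alert_indices(errors, threshold, holdoff):
    """Return sample indices where a new anomaly starts.

    Opposite-sign excursions within `holdoff` samples of an alert are
    ignored as rebound; same-sign excursions extend the hold-off.
    """
    state = State.IDLE
    direction = 0
    remaining = 0
    alerts = []
    for index, error in enumerate(errors):
        exceeds = abs(error) > threshold
        if state is State.IDLE:
            if exceeds:
                alerts.append(index)
                direction = 1 if error > 0 else -1
                remaining = holdoff
                state = State.HOLDOFF
        else:
            remaining -= 1
            if exceeds and error * direction > 0:
                remaining = holdoff  # the anomaly is still ongoing
            if remaining <= 0:
                state = State.IDLE
    return alerts
```

A plain threshold on the same series would flag the rebound samples as separate events; here a peak and its rebound produce a single alert, provided the hold-off is tuned to the settling time of the filter.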
Drilling down on a UDP peak
- Concentration of SRC and DST IPs and SRC ports
- Dispersion of DST ports
• Port scan of a host in CARNET, from 4 hosts, using 29-byte packets
- Observe again the “bounce”!
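The two drill-downs suggest how the direction of change of each feature can give a first hint about the anomaly type. The sketch below encodes only the two patterns reported on these slides as a lookup table; it is an illustration, not the classification logic of any tool, and the function name and string labels are assumptions.

```python
CONC = "concentration"
DISP = "dispersion"

# Keys are (src IPs, dst IPs, src ports, dst ports).
# Only the two patterns observed in the drill-downs are listed.
HINTS = {
    (DISP, CONC, DISP, CONC): "flood: many sources towards one service",
    (CONC, CONC, CONC, DISP): "port scan: few sources probing one host",
}


def anomaly_hint(src_ip, dst_ip, src_port, dst_port):
    """Map the per-feature changes to an initial hint, if one is known."""
    pattern = (src_ip, dst_ip, src_port, dst_port)
    return HINTS.get(pattern, "unknown: drill down manually")
```

Any pattern outside the table falls back to manual drill-down, since a hint of this kind is only a starting point for looking at the underlying flows.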
And on smaller aggregates? “DWS” NREN example
- Concentration of SRC and DST IPs and SRC ports
- Dispersion of DST ports
• A routing shift event lasting a few hours (primary to backup access) triggers a lot of “noise”
• One MUST be able to correlate feature entropies & traffic shifts!
• Other than that, peaks are still very clear!
- Observe again the “bounce”!
And on smaller aggregates? “NON DWS” NREN example
• Fewer peaks, but still evident
Lessons learnt so far
• IP feature entropies also reveal low-volume anomalies, and can give an initial hint about the anomaly type, but:
– they need a state machine to be interpreted
– fully automatic conclusions are difficult
– one must not lose sight of big volume shifts and macroscopic events!
• A lot of anomalies are “observable” on DWS connectivity
– Good reason for having a security service protecting DWS customers!
– But we’ve seen attacks/scans between NRENs as well
Moving forward
• With NfSen and self-written extensions we have enough evidence that:
– anomalies are observable in the GÉANT2 core
– novel automatic methodologies for their classification are applicable
• However, we are looking at commercial tools for moving to a service
– To reduce the effort to engineer / maintain / evolve code
– Scalability and tool support are an issue for a service
Tools requirements
• Detection of both low and high volume anomalies
– (DoS, DDoS, host and network scans, worms, phishing sites, etc.)
• Automatic classification, collection of evidence
• Detection of anomaly entry points, suggestion of ACLs
• Give correct indications also in the presence of sudden traffic shifts due to routing changes/network outages
• Robustness to occasional loss of NetFlow records
• Work well also with sampled NetFlow
Tools’ comparison
• Work just started, no conclusion yet
• We just report “lessons learnt” so far
– On-paper analysis of some tools (four in some detail)
– Interaction with vendors
Tools approaches
• IP features entropy + volume
– Pros: no additional info needed, works with low sampling rate, can catch a wide range of anomalies
– Cons: needs drill down after “alert”
• Volume + “fingerprints”
– Pros: precise, an alert is already “a conclusion”
– Cons: won’t catch what you don’t look for
• Per host behavioural analysis
– Pros: precise
– Cons: scalability? robustness to low sampling?
Tools common features
• Require NetFlow on ingress links only
• Capable of doing NetFlow v5 and v9
• Require SNMP access to routers to read configuration data
Tools distinguishing features
• BGP processing
– create POP to POP (or even prefix to prefix) matrices
– correlate big volume shifts to routing changes
• Internal routing (e.g. IS-IS) processing
– traffic split of peers on internal (backbone) links
• NetFlow collection on multiple points (routing tracing)
– But this is not really a plus, rather an additional burden for NOT using routing data!
Tools distinguishing features (cont.)
• Different approaches to distinguish “normal” from “not normal” behaviour
– Principal Component Analysis
– Host type classification & rather complex “scoring” system
– Moving averages
– Fixed thresholds
Service Outlook
• Primary recipients: NREN CERTs
• Info provided:
– Security alerts about all types of discovered anomalies
– Collected evidence
– Suggested mitigation actions
– Periodic summary reports
• Other recipients: APMs, NREN PC, EU Commission
– For strategic decisions
Acknowledgements
• Prof. Francesco Donati and Dott.ssa Gabriella Caporaletti of EICAS Automazione S.p.A., for the useful hints on the “observer” design and tuning
• Peter Haag from SWITCH for the development of NfSen