agilis: an on-line map reduce environment for collaborative security
Sapienza Università di Roma, Dipartimento di Informatica e Sistemistica
Middleware Laboratory (MIDLAB)
AGILIS: an on-line map reduce environment for collaborative security
Roberto Baldoni
Università degli Studi di Roma “La Sapienza”
http://www.dis.uniroma1.it/~baldoni/
Prin Meeting - San Vito di Cadore
Joint Work with IBM Haifa
in the context of CoMiFin EU Project
14/2/2011
Focus and structure of the talk
■ Requirements coming from the financial context
■ Collaborative event processing for cyber security
■ Edge vs. centralized event processing over the Internet
■ Agilis
■ Esper
The case of the Financial Critical Infrastructure
The case of Collaborative Cyber Security in Financial Ecosystem
■ "Webification" of critical financial services, such as home banking, online trading, and remote payments
■ Cross-domain interactions that span the boundaries of different organizations are already in place in financial contexts
■ Heterogeneous infrastructure systems, such as telecommunication providers, banks, and credit card companies, working on heterogeneous data
The case of Collaborative Cyber Security in Financial Ecosystem
■ A payment card fraud (2008)
■ 100 compromised payment cards were used by a network of coordinated attackers to withdraw cash from 130 different ATMs in 49 countries worldwide, totaling 9 million US dollars.
■ High degree of coordination: the fraud took half an hour to execute.
■ It evaded all the local monitoring techniques used for detecting anomalies in payment card usage patterns.
■ The fraud was detected only later, after aggregating all the information gathered locally by each financial institution involved in the scam.
The case of Collaborative Cyber Security in Financial Ecosystem
■ Distributed Denial of Service (DDoS) attack (2007, Northern Europe)
■ Aim: render web-based financial services unreachable to legitimate users.
■ The DDoS attack targeted a credit card company and two DNS servers.
■ Internet connectivity was restored only after several manual trial-and-error attempts by the network administrators of the attacked systems and of their Internet Service Providers (ISPs).
■ Long preparation time (days), short attack time (seconds)
Economics of a DDoS
■ Aim: render Internet-based financial services unreachable to legitimate users.
■ Use of botnets (which can now be rented with a credit card in a few minutes)
■ Three examples of DDoS campaigns in cyberwarfare:
■ Estonia 2007
■ Georgia 2008
■ Iran (in progress!). The Stuxnet worm invaded Iran's Supervisory Control and Data Acquisition (SCADA) systems
McAfee report 2010, “In the Crossfire: Critical Infrastructure in the Age of Cyber War”
Economics of a DDoS
McAfee report 2010, “In the Crossfire: Critical Infrastructure in the Age of Cyber War”
■ Cost of downtime from major attacks exceeds U.S. $6 million per day
■ Damage to reputation
■ Loss of personal information about customers
■ One out of five DDoS attacks is accompanied by extortion
The case of Collaborative Cyber Security in Financial Ecosystem
■ Neither of the previous attacks can be detected quickly using only the information available at the IT infrastructure of a single financial player (i.e., using local monitoring)
■ Need for information sharing
■ Exchange of non-sensitive status information
■ Setup of agreements
■ Advantages of a global monitoring system
■ Damage mitigation
■ Quick reaction
Barriers to Collaboration
[Figure: Lloyds, Unicredit, France Telecom, EDF, AT&T, SWIFT, and UBS exchanging events and warnings over the Internet]
■Barriers to collaboration
■Understanding the economics
■Trust
■Legal Issues
Collaborative event processing for cyber security: the CoMiFin Project
[Figure: three layers: Internet level, Collaboration level, Application level]
Collaborative Cyber Security Platform
■ Monitoring and reaction to threats (MitM, stealthy scan, phishing, …)
■ Black/white list distribution (for credit reputation, trust level, …)
■ Anti-terrorism lists (with name check VAS)
■ Anti-money-laundering monitoring
■ Risk management support
■ Some requirements on the platform
■ Uneven workload over time
■ High throughput
■ High computational power
■ Large storage capabilities
■ Timeliness
The notion of semantic room (SR)
■ Contract
■ The set of processing and data sharing services provided by the SR, along with its data protection, privacy, isolation, trust, security, dependability, and performance requirements.
■ The contract also contains the hardware and software requirements a member has to provision in order to be admitted into the SR.
■ Objective
■ Each SR has a specific strategic objective to meet (e.g., detecting large-scale stealthy scans or Man-in-the-Middle attacks)
■ Deployment
■ Highly flexible, to accommodate the use of different technologies for implementing the processing and sharing within the SR (i.e., the SR logic or functionality).
The notion of semantic room: relationship with cloud computing
[Figure: Internet level, Collaboration level, Application level]
■Private cloud
■ Deployment of the semantic room through the federation of computing and storage capabilities at each member
■ Each member brings a private cloud to federate
■Public Cloud
■ Deployment of the semantic room on a third party cloud provider
■ The third party owns all computing and storage capabilities
■Hybrid approach
Data Management problems in the semantic room
■ Jurisdiction and regulation (Where and how will data be governed?)
■ Ownership of Data (Who owns the data in the semantic room?)
■ Data Portability
■Data anonymization
■Data Retention/Permanence (What happens to data over time?)
■Security and Privacy (How is data secure and protected?)
■ Reliability, liability, and quality of service of the partners of the semantic room
■ Government Surveillance (How much data can the government get from a semantic room?)
■ …
A specific collaborative platform: CoMiFin Architecture
Related work
■ IBM System S [ICDCS 06]
■ High cost of ownership
■ Centralized data management
■ No cooperative approach
■ Cooperative Intrusion Detection Systems (e.g., DShield)
■ Correlation among local warnings
■ High cost of ownership
■ Obscure data management
Preventing Stealthy Scan through centralized processing
Collaborative Stealthy scan
An attacker performs port scanning simultaneously at multiple sites, trying to identify TCP/UDP ports that have been left open. Those ports can then be used as attack vectors.
Added value of collaboration:
■ Ability to identify an attacker who tries to conceal the activity by accessing only a small number of ports within each individual domain
Action taken:
■ Blacklist the IP addresses
■ Update historical records
Collaborative Stealthy scan detection
Attack subjects:
■ External web servers in the DMZs of the SR members
Pattern:
■ "Unusually" high number of requests
■ Originating from a particular source IP address, and
■ Directed to distinct (machine, port) pairs
Action taken:
■ Matching source IPs are banned from future access to the external web servers
Defining the attack
■ Use of common scanning tools (nmap)
■ Use of real traces (e.g., ITOC US Army)
Collaborative Stealthy scan detection
■ Rank-Syn algorithm.
■ Analyze the sequence of SYN, ACK, and RST packets in the three-way TCP handshake. In normal activity the observed sequence is (i) SYN, (ii) SYN-ACK, (iii) ACK.
■ In the presence of a SYN port scan, the connection instead looks like this: (i) SYN, (ii) SYN-ACK, (iii) RST (or nothing).
■ For a given IP address, if the number of incomplete connections exceeds a certain threshold T, we can conclude that the IP address is likely carrying out malicious port scanning.
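The counting rule above can be illustrated with a minimal Python sketch. It is not the CoMiFin implementation: the packet tuple layout `(client, server, port, flag)`, the function name `suspected_scanners`, and the sample addresses are assumptions made only for this example.

```python
from collections import Counter


def suspected_scanners(packets, threshold):
    """Return client IPs whose incomplete handshakes exceed the threshold T.

    packets: iterable of (client, server, port, flag) tuples (hypothetical
    layout), where flag is one of "SYN", "SYN-ACK", "ACK", "RST".
    """
    state = {}  # (client, server, port) -> last handshake step seen
    for client, server, port, flag in packets:
        key = (client, server, port)
        if flag == "SYN":
            state[key] = "SYN"
        elif flag == "SYN-ACK" and state.get(key) == "SYN":
            state[key] = "SYN-ACK"
        elif flag == "ACK" and state.get(key) == "SYN-ACK":
            state[key] = "DONE"
        # An RST, or no third packet at all, leaves the connection incomplete.

    incomplete = Counter(
        client for (client, _, _), step in state.items() if step != "DONE"
    )
    return {ip for ip, count in incomplete.items() if count > threshold}


# One legitimate client completes its handshake; one scanner probes 3 ports.
packets = [
    ("192.0.2.10", "203.0.113.1", 80, "SYN"),
    ("192.0.2.10", "203.0.113.1", 80, "SYN-ACK"),
    ("192.0.2.10", "203.0.113.1", 80, "ACK"),
]
for port in (21, 22, 23):
    packets += [
        ("198.51.100.7", "203.0.113.1", port, "SYN"),
        ("198.51.100.7", "203.0.113.1", port, "SYN-ACK"),
        ("198.51.100.7", "203.0.113.1", port, "RST"),
    ]

print(suspected_scanners(packets, threshold=2))
```

The sketch treats any connection that never reaches the final ACK as incomplete, which covers both the RST case and the "nothing" case.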
Stealthy scan detection
Raw data: tcpdump
10:53:14.647181 IP 9.148.30.136.pop3 > 9.148.17.85.madcap: R 0:0(0) ack 1 win 0
10:53:14.653813 IP 9.148.17.85.sis-emt > 9.148.30.136.xfer: S 268426387:268426387(0) win 65535 <mss 1460,nop,nop,sackOK>
10:53:14.653817 IP 9.148.30.136.xfer > 9.148.17.85.sis-emt: R 0:0(0) ack 268426388 win 0
Normalized data format:
■ LogEvent: sourceIp, destIp, sourcePort, destPort, startTime, endTime, bytesSent, bytesRecieved, returnStatus
Online data summary
Historical Data Format
BlackList: List of IP addresses
[Table columns: Source IP, Rate]
[Table columns: Source IP, Requests count, Port count]
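As an illustration of the normalization step, the following Python sketch parses tcpdump lines shaped like the three raw lines on this slide. It is only an assumption about how an adapter might work: `PacketRecord`, `LINE_RE`, and `parse_line` are hypothetical names, and the record fills only the address, port, and flag fields, not the full LogEvent (startTime, endTime, byte counts, and returnStatus would need connection-level tracking).

```python
import re
from typing import NamedTuple, Optional


class PacketRecord(NamedTuple):
    """Per-packet fields recoverable from one tcpdump line (hypothetical type)."""
    time: str
    source_ip: str
    source_port: str
    dest_ip: str
    dest_port: str
    flags: str


# Matches lines of the form shown on the slide:
#   HH:MM:SS.ffffff IP a.b.c.d.port > e.f.g.h.port: FLAGS ...
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<source_ip>\d+(?:\.\d+){3})\.(?P<source_port>[\w-]+) > "
    r"(?P<dest_ip>\d+(?:\.\d+){3})\.(?P<dest_port>[\w-]+): "
    r"(?P<flags>\S+)"
)


def parse_line(line: str) -> Optional[PacketRecord]:
    """Return a PacketRecord, or None if the line has a different shape."""
    match = LINE_RE.match(line.strip())
    return PacketRecord(**match.groupdict()) if match else None


sample = (
    "10:53:14.653813 IP 9.148.17.85.sis-emt > 9.148.30.136.xfer: "
    "S 268426387:268426387(0) win 65535 <mss 1460,nop,nop,sackOK>"
)
record = parse_line(sample)
print(record)
```

Ports appear as service names (for example `sis-emt`) because the sample lines were captured with name resolution enabled, so the sketch keeps them as strings.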
Example of semantic room for stealthy scan: Ingredients
[Figure: Branch 1 … Branch N gateways (sniffer, adapter, I/O socket) send POJOs to the Esper CEP Engine (I/O socket, Main Engine, EPL queries over input streams); output streams (scanner list, suspected IPs) go to a Subscriber]
Testbed: latency measurement
Two Semantic rooms
■Man-In-The-Middle Attack
■Stealthy Scan
Preventing Stealthy Scan through edge processing
Example of semantic room for stealthy scan: Ingredients
■ WebSphere eXtreme Scale (WXS): in-memory distributed storage
■ High-level language for processing logic: Jaql (SQL-like, supports flows)
■ Distributed processing runtime: MapReduce
■ Distributed file system for long-term storage: HDFS
■ Agilis consists of a distributed network of processing and storage elements hosted on a cluster of machines (possibly geographically dispersed)
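To make the MapReduce ingredient concrete, here is a small in-process Python sketch of how the stealthy-scan pattern (many requests from one source IP to distinct (machine, port) pairs) could be phrased as a map step and a reduce step. It does not use Hadoop, Jaql, or WXS, and it is not the Agilis query; the function names, the event tuple layout, and the threshold parameters are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(events):
    """Map: emit (source IP, (machine, port)) for every observed request."""
    for source_ip, dest_ip, dest_port in events:
        yield source_ip, (dest_ip, dest_port)


def shuffle(pairs):
    """Group mapper output by key, as the MapReduce runtime would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(source_ip, targets, min_requests, min_targets):
    """Reduce: count requests and distinct targets for one source IP."""
    requests = len(targets)
    distinct = len(set(targets))
    if requests >= min_requests and distinct >= min_targets:
        return source_ip, requests, distinct
    return None


def detect(events, min_requests, min_targets):
    groups = shuffle(map_phase(events))
    results = (
        reduce_phase(ip, targets, min_requests, min_targets)
        for ip, targets in groups.items()
    )
    return [r for r in results if r is not None]


# A scanner touches 4 distinct (machine, port) pairs; a normal client
# repeatedly contacts the same web server port.
events = [("198.51.100.7", "203.0.113.1", p) for p in (21, 22, 23, 25)]
events += [("192.0.2.10", "203.0.113.1", 80)] * 5

print(detect(events, min_requests=3, min_targets=3))
```

Because each source IP is reduced independently, the reduce step can run in parallel on different machines, which is the property a distributed runtime exploits.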
Data Dissemination: Agilis
[Figure: components shown: Gateway; Jaql interpreter and Jaql query; AGILIS (re-defines InputFormat, OutputFormat); Distributed In-Memory Store (WXS) with storage containers and Cat 1, Cat 2; Distributed File System (HDFS); Jaql Adapter; Map-Reduce (Hadoop) with Job Tracker and TaskTrackers; HDFS Adapter; WXS Adapter]
Example of semantic room for stealthy scan: architecture
Collaborative Stealthy scan detection with Agilis
Detection of stealthy scan
Demo: done at IBM Haifa Research Lab (2009)
■ Simple, homemade attacks
■ Artificial traces
■ Simple stealthy scan detection algorithm
8 Linux machines on a LAN, each with:
■ 2 GB of RAM and
■ 20 GB of disk space
■ One machine hosted all the management processes (JobTracker, WXS catalog)
Each of the remaining 7 hosts modeled a single SR participant
■ DMZ web server under attack
■ TaskTracker and WXS data server
Scenarios:
■ A single intruding host generated a series of TCP/SYN requests targeting a fixed set of 300 unique ports on each of the 7 attacked servers
■ Requests injected at constant rates of 10, 20, and 30 req/server/sec
■ Ratio of attack to legitimate traffic: 1:5
■ Blacklisting threshold: 20,000 requests and 1,000 unique ports
■ Processing window: 4 minutes
Results:
■ No overload
■ Detection latency: 700 sec, 430 sec, and 330 sec (for the three injection rates, respectively)
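The interaction between a blacklisting threshold and a processing window can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the demo's code: it assumes a sliding window, assumes both thresholds must be met, assumes timestamps arrive in non-decreasing order, and uses the hypothetical class name `WindowedBlacklist`.

```python
from collections import deque


class WindowedBlacklist:
    """Blacklist a source IP once, within the current window, it has issued
    at least min_requests requests to at least min_targets distinct targets."""

    def __init__(self, window_sec, min_requests, min_targets):
        self.window_sec = window_sec
        self.min_requests = min_requests
        self.min_targets = min_targets
        self.events = deque()  # (timestamp, source_ip, target), oldest first
        self.blacklist = set()

    def observe(self, ts, source_ip, target):
        """Record one request; return True if source_ip is blacklisted."""
        self.events.append((ts, source_ip, target))
        # Evict events that have fallen out of the window.
        while self.events[0][0] <= ts - self.window_sec:
            self.events.popleft()
        targets = [t for _, ip, t in self.events if ip == source_ip]
        if (len(targets) >= self.min_requests
                and len(set(targets)) >= self.min_targets):
            self.blacklist.add(source_ip)
        return source_ip in self.blacklist


# Parameters mirroring the demo: 4-minute window, 20,000 requests,
# 1,000 unique targets.
demo_detector = WindowedBlacklist(window_sec=240, min_requests=20_000,
                                  min_targets=1_000)

# Toy run with small thresholds so the behavior is easy to follow.
toy = WindowedBlacklist(window_sec=10, min_requests=3, min_targets=3)
flags = [toy.observe(t, "198.51.100.7", ("203.0.113.1", 20 + t))
         for t in range(3)]
print(flags)
```

The linear scan in `observe` keeps the sketch short; a real deployment would maintain per-IP counters instead of rescanning the window on every request.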
Demo: work we are doing in our lab (2010)
The video shows a semantic room implemented in Agilis using
■ Nmap for producing the attack
■ Real TCP dumps
Joint work with Giorgia Lodi and Leonardo Aniello