network intrusion detection dean final, actual version


Page 1: Network Intrusion Detection Dean Final, actual version

Network Intrusion Detection

By: Jack Song, Julina Zhang, Kerry Jones
Advisors: Dr. Don Brown, Dr. Hyojung Kang, Dr. Malathi Veeraraghavan
Client: UVA Information Security, Policy, and Records Office (ISPRO)
Sponsors: UVA SEAS / Leidos

Page 2: Network Intrusion Detection Dean Final, actual version

Agenda

● Team Members
● Project Objectives
● Progress to Date
● Deliverables
● Potential Sponsors

Page 3: Network Intrusion Detection Dean Final, actual version

Team Members - Data Science Institute

Jack Song
● Majored in Computer Science at UVA

Julina Zhang
● Majored in Statistics and Economics at UVA

Kerry Jones
● Majored in Government and Geography at UMD

Page 4: Network Intrusion Detection Dean Final, actual version

Team Members - Advisors

Dr. Donald E. Brown
● Director of the Data Science Institute
● Dept. of Systems and Information Engineering

Dr. Malathi Veeraraghavan
● Dept. of Electrical & Computer Engineering

Dr. Hyojung Kang
● Dept. of Systems and Information Engineering

Page 5: Network Intrusion Detection Dean Final, actual version

Team Members

Jason Belford
● Chief Information Security Officer

Jeff Collyer
● Information Security Engineer

Page 6: Network Intrusion Detection Dean Final, actual version

Team Members

Sourav Maji
● Third-year PhD student in Computer Engineering

Ron Hutchins
● Vice President for Information Technology

Page 7: Network Intrusion Detection Dean Final, actual version

Objectives

● Detect anomalous traffic leaving the UVA network using machine learning and data mining.

● Develop a network intrusion detection prototype.

Page 8: Network Intrusion Detection Dean Final, actual version

Agenda

● Team Members
● Project Objectives
● Progress to Date
● Deliverables
● Potential Sponsors

Page 9: Network Intrusion Detection Dean Final, actual version

Background - Approaches

● Lancope StealthWatch
● Previous approaches
  ○ Density-Based Spatial Clustering of Applications with Noise (Erman, Arlitt, Mahanti)
  ○ K-Means Clustering (Erman, Arlitt, Mahanti)
  ○ One-Class Support Vector Machine (Locke, Wang, Paschalidis)
  ○ Neural Network (Locke, Wang, Paschalidis)
  ○ Hierarchical Clustering (Ling, Rosti, Swanson)
  ○ Isolation Forest (Liu, Ting, Zhou)
● Our approach
  ○ Isolation Forest - An unsupervised learning method that uses a tree structure to isolate anomalies.

Page 10: Network Intrusion Detection Dean Final, actual version

Our progress, at a glance

Timeline figure (progress over the course of time), in three phases:

Initial data
- ISPRO
- Preprocessing
- Wireshark
- Filtering

Filtered data
- Unsupervised methods
- Isolation Forest
- Didn’t work out well

NetFlow data
- Collection server
- Power Edge
- TShark
- Conversation data
- Better ‘Unit’
- Preliminary results

Page 11: Network Intrusion Detection Dean Final, actual version

Initial data phase

Data from ISPRO

+ Data preprocessing

+ Data filtering by source IPs within UVA network

Result: a subset of the packet capture data containing all connections initiated within the UVA network
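A minimal sketch of the source-IP filtering step, assuming each record is a dict with a `src` field. The network prefix below is a documentation-only placeholder (RFC 5737), not the actual campus address range, and the function names are hypothetical.

```python
import ipaddress

# Placeholder prefix from the RFC 5737 documentation range.
# The real internal address ranges would have to be supplied here.
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_internal(ip_text, networks=CAMPUS_NETWORKS):
    """Return True if ip_text parses as an address inside any given network."""
    try:
        addr = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # malformed or empty field
    return any(addr in net for net in networks)


def filter_by_source(rows, networks=CAMPUS_NETWORKS):
    """Keep only the records whose 'src' field is an internal address."""
    return [row for row in rows if is_internal(row["src"], networks)]


sample = [
    {"src": "192.0.2.10", "dst": "198.51.100.7", "length": 60},
    {"src": "198.51.100.7", "dst": "192.0.2.10", "length": 1500},
    {"src": "not-an-ip", "dst": "192.0.2.10", "length": 40},
]
kept = filter_by_source(sample)
```

Only the first sample record survives, since it is the only one whose source address lies inside the placeholder prefix.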


Page 12: Network Intrusion Detection Dean Final, actual version

Initial Data Preprocessing

ISPRO data (1 TB)
→ One pcap file (50 GB per 6 min)
→ Wireshark/TShark: 50 GB → 5 GB; .pcap → .csv
→ Python script: summary statistics; algorithms
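One way the .pcap → .csv step could be scripted is by calling TShark from Python. This is a sketch only: the choice of exported fields and the function names are assumptions, not the team's actual script.

```python
import subprocess

# Assumed set of packet fields to export; adjust to the analysis at hand.
FIELDS = ["frame.time_epoch", "ip.src", "ip.dst", "frame.len"]


def build_tshark_command(pcap_path, fields=FIELDS):
    """Build a tshark command line that prints the chosen fields as CSV."""
    cmd = ["tshark", "-r", pcap_path, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    # Emit a header row and use a comma as the field separator.
    cmd += ["-E", "header=y", "-E", "separator=,"]
    return cmd


def pcap_to_csv(pcap_path, csv_path):
    """Run tshark (must be installed) and write its output to csv_path."""
    with open(csv_path, "w") as out:
        subprocess.run(build_tshark_command(pcap_path), stdout=out, check=True)
```

Exporting only a few fields per packet is what makes the reduced CSV so much smaller than the raw capture.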

Page 13: Network Intrusion Detection Dean Final, actual version

Filtered data phase

Result from last phase

Created source - destination IP pairs

Calculated frequency and mean length for each pair

+ Isolation Forest

Provided an initial view, but more is needed.
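The two per-pair features can be sketched as follows, assuming each packet is a `(src, dst, length)` tuple; the function name is hypothetical. The resulting feature pairs are the kind of input an anomaly detector such as Isolation Forest would take.

```python
from collections import defaultdict


def pair_features(packets):
    """Return {(src, dst): (frequency, mean_length)} for each IP pair."""
    counts = defaultdict(int)
    total_length = defaultdict(int)
    for src, dst, length in packets:
        counts[(src, dst)] += 1
        total_length[(src, dst)] += length
    return {
        pair: (counts[pair], total_length[pair] / counts[pair])
        for pair in counts
    }


packets = [
    ("10.0.0.1", "203.0.113.5", 100),
    ("10.0.0.1", "203.0.113.5", 300),
    ("10.0.0.2", "203.0.113.9", 60),
]
features = pair_features(packets)
```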


Page 14: Network Intrusion Detection Dean Final, actual version

Filtered data phase, what we’ve learned

Packet capture data ONLY captures packets

+ Need to capture the entire use session

Need netFlow records data


Page 15: Network Intrusion Detection Dean Final, actual version

NetFlow data phase -- Now

● Setting up a collection server
  ○ Power Edge
● Conversation data & TShark
● Better ‘Unit of comparison’
  ○ Include port number
● Preliminary analysis
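To illustrate what including the port number in the unit of comparison changes, the sketch below groups conversation records by (source IP, destination IP, destination port) instead of by IP pair alone. The record field names (`dport`, `bytes_out`, `bytes_in`) are assumptions made for this example.

```python
from collections import defaultdict


def aggregate_conversations(records):
    """Aggregate conversation records per (src, dst, destination port)."""
    units = defaultdict(
        lambda: {"conversations": 0, "bytes_out": 0, "bytes_in": 0}
    )
    for rec in records:
        unit = units[(rec["src"], rec["dst"], rec["dport"])]
        unit["conversations"] += 1
        unit["bytes_out"] += rec["bytes_out"]
        unit["bytes_in"] += rec["bytes_in"]
    return dict(units)


records = [
    {"src": "10.0.0.1", "dst": "203.0.113.5", "dport": 80,
     "bytes_out": 500, "bytes_in": 9000},
    {"src": "10.0.0.1", "dst": "203.0.113.5", "dport": 80,
     "bytes_out": 700, "bytes_in": 4000},
    {"src": "10.0.0.1", "dst": "203.0.113.5", "dport": 53,
     "bytes_out": 90, "bytes_in": 120},
]
units = aggregate_conversations(records)
```

With the port in the key, web and DNS traffic between the same two hosts become separate units rather than being averaged together.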

Page 16: Network Intrusion Detection Dean Final, actual version


Page 17: Network Intrusion Detection Dean Final, actual version

Summary Statistics

Count: 157,313
Unique Source IPs: 11,514
Unique Destination IPs: 13,113
Unique Destination Ports: 1,631
Unique Source Ports: 48,925
Average Duration: 31 seconds
Average Packets, Source to Destination: 34 packets
Average Packets, Destination to Source: 31 packets
Average Bytes, Source to Destination: 10,172 bytes
Average Bytes, Destination to Source: 58,134 bytes

Page 18: Network Intrusion Detection Dean Final, actual version

Top Five Most Frequently Used Destination Ports

Destination Port    Count     Number of Unique Source IP pairs
80 (HTTP)           66,390    11,238
443 (HTTPS)         38,422    954
25 (SMTP)           24,277    39
6                   20,387    1
3                   957       2
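A table like the one above could be produced with a short counting routine. This sketch assumes each conversation is reduced to a `(src, dport)` tuple and that the third column counts distinct source addresses per port, which is an interpretation of the column header rather than something the slide states.

```python
from collections import Counter, defaultdict


def top_destination_ports(records, n=5):
    """Return [(port, conversation count, distinct sources)] for the top n ports."""
    counts = Counter()
    sources = defaultdict(set)
    for src, dport in records:
        counts[dport] += 1
        sources[dport].add(src)
    return [
        (port, count, len(sources[port]))
        for port, count in counts.most_common(n)
    ]


records = [
    ("10.0.0.1", 80),
    ("10.0.0.2", 80),
    ("10.0.0.1", 443),
    ("10.0.0.1", 80),
]
top = top_destination_ports(records)
```

A port with a high count but very few distinct sources is the kind of row worth a closer look.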

Page 19: Network Intrusion Detection Dean Final, actual version


Page 20: Network Intrusion Detection Dean Final, actual version

NetFlow data phase, next steps

● Finish setting up Power Edge
  ○ Shell script
  ○ Cron job
    ■ Automation of daily data collection
● Go into specifics, “symptoms”
  ○ DNS tunneling
  ○ Phishing

Page 21: Network Intrusion Detection Dean Final, actual version

Identified Cyber Security Needs

● Identifying anomalous behavior in traffic leaving the UVA network

○ Source data: NetFlow records

○ Traffic from hosts with static public IP addresses

● DNS Tunneling

○ Data theft using port 53 as a pathway

● Phishing Attack

○ Obtain sensitive information by disguising and baiting.
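As an illustration of the DNS tunneling symptom, one simple heuristic is to total the bytes each host sends to destination port 53 and flag hosts above a threshold. This is an assumed example, not the project's detection method; the field names and the threshold value are hypothetical and would need tuning against real traffic.

```python
DNS_PORT = 53


def flag_heavy_dns(records, max_bytes_out=10_000):
    """Return sorted source IPs whose total bytes sent to port 53 exceed the threshold."""
    totals = {}
    for rec in records:
        if rec["dport"] != DNS_PORT:
            continue  # only DNS-bound conversations are of interest here
        totals[rec["src"]] = totals.get(rec["src"], 0) + rec["bytes_out"]
    return sorted(src for src, sent in totals.items() if sent > max_bytes_out)


records = [
    {"src": "10.0.0.1", "dport": 53, "bytes_out": 120},
    {"src": "10.0.0.2", "dport": 53, "bytes_out": 8_000},
    {"src": "10.0.0.2", "dport": 53, "bytes_out": 7_500},
    {"src": "10.0.0.3", "dport": 80, "bytes_out": 50_000},
]
suspects = flag_heavy_dns(records)
```

Ordinary DNS queries are small, so a host pushing large volumes toward port 53 stands out; heavy web traffic on port 80 is ignored by this check.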


Page 22: Network Intrusion Detection Dean Final, actual version

Challenges

1. Domain knowledge
2. Size of data
   a. 36 min of data, approx. 270 GB
3. IP addresses
   a. Dynamic vs. static
   b. Private vs. public
4. Unlabeled data → unsupervised learning

Page 23: Network Intrusion Detection Dean Final, actual version

Deliverables

● Paper
● Network intrusion detection prototype
● Shell script

Page 24: Network Intrusion Detection Dean Final, actual version

Potential Sponsors

● NSF Cybersecurity Innovation for Cyberinfrastructure (CICI)

● NSF Secure and Trustworthy Cyberspace (SaTC) programs

● DHS CyberSecurity Division programs

● DOE Cybersecurity for Energy program

● Industry, specifically NTT Labs and Cisco


Page 25: Network Intrusion Detection Dean Final, actual version

References

1. Ashfaq, Rana Aamir Raza, et al. "Fuzziness Based Semi-Supervised Learning Approach for Intrusion Detection System." Information Sciences (2016).
2. Fung, Carol, and Raouf Boutaba. Intrusion Detection Networks. CRC Press, 2013.
3. Fung, Carol, and Raouf Boutaba. Intrusion Detection Networks: A Key to Distributed Security. CRC Press, 2013.
4. Erman, Jeffrey, Martin Arlitt, and Anirban Mahanti. "Traffic Classification Using Clustering Algorithms." Proceedings of the 2006 SIGCOMM Workshop on Mining Network Data. ACM, 2006. 281-286.
5. Farnham, Greg. "Detecting DNS Tunneling." SANS Institute InfoSec Reading Room. 2013.
6. Grimes, Robert. Detect network anomalies with StealthWatch. 2014. IDG. 2016. <http://www.infoworld.com/article/2848768/security/detect-network-anomalies-with-stealthwatch.html>.
7. Locke, R., J. Wang, and I. Paschalidis. "Anomaly Detection Techniques for Data Exfiltration Attempts." Boston University Center for Information and Systems Engineering, 2012.
8. Sommer, Robin, and Vern Paxson. "Outside the Closed World: On Using Machine Learning for Network Intrusion Detection." 2010 IEEE Symposium on Security and Privacy (2010).
9. Ling, Yuning, Marcus Rosti, and Gregory Swanson. "A Hands-off Approach to Network Intrusion Detection." IEEE Systems and Information Engineering Design Conference (SIEDS). Charlottesville: IEEE, 2016. 216-220.
10. Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation-Based Anomaly Detection." ACM Transactions on Knowledge Discovery from Data (TKDD) 6.1 (2012): 3.

Page 26: Network Intrusion Detection Dean Final, actual version

Isolation Forest

• Unsupervised learning method
• Builds an ensemble of iTrees for a given data set.
• The anomalies are those observations with the shortest average path length from the root node.
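The idea can be sketched in plain Python, following the general scheme of Liu, Ting, and Zhou: each tree splits on a random feature at a random value, and points that end up close to the root on average receive scores near 1. This is a simplified illustration with made-up data, not the implementation used in the project; all function names are hypothetical.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split the points on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}  # cannot split on this feature
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, plus an adjustment for unsplit leaves."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    branch = "left" if point[node["dim"]] < node["split"] else "right"
    return path_length(point, node[branch], depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Score each point in (0, 1]; higher means easier to isolate."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for p in points:
        mean_depth = sum(path_length(p, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Made-up data: a 10 x 5 grid of "normal" points plus one distant outlier.
points = [(float(i % 10), float(i // 10)) for i in range(50)] + [(100.0, 100.0)]
scores = anomaly_scores(points)
most_anomalous = points[scores.index(max(scores))]
```

The distant point is usually separated by the very first split, so its average path is the shortest and its score the highest.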


Page 27: Network Intrusion Detection Dean Final, actual version

Preliminary Results of iForest
