Data Mining NetFlow So What’s Next? Mark E Kane FloCon 2005 20 September 05

TRANSCRIPT

Page 1: Data Mining NetFlow...Cost of Participating in Data Mining NO YES 10 10 10 Red Haring Time Lost to Investigate and Clean Up After Crime YES NO ∞ ∞ ∞ NO NO 0 0 0 - Crime Prevented

Data Mining NetFlow
So What’s Next?

Mark E Kane
FloCon 2005
20 September 05

Page 2:

Objectives

Data Mining, very briefly
Frequency Patterns
Discoveries
Realizations
Changes Made

Page 3:

Data Mining

Data Mining – automated extraction of previously unknown data that is interesting and potentially useful.

Page 4:

Cost of Participating in Data Mining

Reality  Result of Data Mining  Example Analyst Hours  Example Investigator Hours  Example SysAdmin Hours  Result
NO       YES                    10                     10                          10                      Red Herring
YES      NO                     ∞                      ∞                           ∞                       Time Lost to Investigate and Clean Up After Crime
NO       NO                     0                      0                           0                       -
YES      YES                    10                     10                          10                      Crime Prevented / Prosecuted

Page 5:

Complexity of Mining NetFlow

Sheer Volume
Complex Protocol Analysis
Ambiguous Interpretations
Very Smart Adversaries

Page 6:

Common Investigator Issues

Undermanned and overworked
Varied knowledge base
Does not own networks
No direct reporting structure

Page 7:

Data Mining Techniques

Primary Techniques
  Rule and Tree Induction
  Characterization
  Classification
  Regression
  Association
  Clustering

Other Techniques
  Dependency Modeling
  Change Detection
  Trend Analysis
  Deviation Detection
  Link Analysis
  Pattern Analysis
  Spatiotemporal Data Mining
  Mining Path Traversal Patterns
  Mining Sequential/Frequent Patterns

Uncertain Reasoning Techniques
  Fuzzy Logic
  Neural Networks
  Bayesian Networks
  Genetic Algorithms
  Rough Set Theory

Page 8:

Frequency Patterns

Mining Frequent Patterns in Data Streams at Multiple Time Granularities
(Giannella, Han, Pei, Yan, and Yu)

Support Decision Making
Past Less Significant than Present
Record Reduction
Time-Tilted Windows

Page 9:

Interpreting Time-Tilted Windows

DAY        Window         0         1         2         3
           Transition     N    Y    N    Y    N    Y    N    Y
           Size           1    1    2    2    4    4    8    8
Monday                    9
Tuesday                  15    9
Wednesday                 6        12
Thursday                  6    6   12
Friday                   12         6   12
Saturday                 16   12    6   12
Sunday                    6        14         9
Monday                   12    6   14         9
Tuesday                  15    9   14         9

Day 1: 9 events

Day 2: 15 events (two buckets)

Day 3: 6 events (two buckets)

Day 4: 6 events (two buckets)

Day 5: 16 events (three buckets)

Day 6: 12 events (four buckets)
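
The bucket values in the table above can be reproduced with a short sketch. It is an illustration only, not the presenter's implementation. It assumes two slots per window level, and it assumes that when a level is full its two values are averaged and pushed to the next level. The averaging rule is inferred from the slide's numbers (15 and 9 become 12). The class name TiltedWindow and its methods are hypothetical.

```python
class TiltedWindow:
    """Hypothetical logarithmic time-tilted window with two slots per level."""

    def __init__(self):
        # levels[i] holds at most two values, newest first.
        # A value at level i summarizes 2**i days.
        self.levels = []

    def add(self, value, level=0):
        if level == len(self.levels):
            self.levels.append([])
        slots = self.levels[level]
        if len(slots) < 2:
            # Room left at this level: the new value becomes the newest slot.
            slots.insert(0, value)
            return
        # Level is full: average the two old values (assumed merge rule),
        # keep only the new value here, and push the average one level up.
        merged = sum(slots) / 2
        self.levels[level] = [value]
        self.add(merged, level + 1)

    def snapshot(self):
        # Flatten the levels into one list, most recent bucket first.
        return [v for slots in self.levels for v in slots]


# Daily event counts from the slide, Monday through the second Tuesday.
daily_counts = [9, 15, 6, 6, 12, 16, 6, 12, 15]

window = TiltedWindow()
history = []
for count in daily_counts:
    window.add(count)
    history.append(window.snapshot())

for day, buckets in enumerate(history, start=1):
    print(f"Day {day}: {buckets}")
```

Under these assumptions nine days of counts are held in four buckets, which is the record reduction the technique is meant to provide: recent days keep full detail while older days are summarized at coarser granularity.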

Page 10:

Presenting Frequency Patterns

Page 11:

Data Mining Discoveries

Failed email servers
Previously unknown trusted relationships
Encryption without authentication
Possible but unproven intrusions

Page 12:

Data Mining Results

Frustrated Investigators
Frustrated Analysts
One Very Frustrated Developer

Page 13:

Changes to Employ Data Mining

Establish common basis of understanding
Establish criteria for reporting
  Geo-Resolution
  Timeliness
  Volume
Establish reporting procedures

Page 14:

Questions

Mark Kane

mkane @ ddktechgroup.com