Visualization in the Age of Big Data

Raffael Marty, CEO. Visualization In The Age of Big Data. HoneyNet Project Workshop, Stavanger, Norway, May 2015

Upload: raffael-marty

Post on 28-Jul-2015


TRANSCRIPT

Page 1: Visualization in the Age of Big Data

Raffael Marty, CEO

Visualization In The Age of Big Data

HoneyNet Project Workshop Stavanger, Norway

May, 2015

Page 2: Visualization in the Age of Big Data

Security. Analytics. Insight.

How Compromises Are Detected

Mandiant M-Trends 2014 Threat Report

Attackers in networks before detection

27 days

229 days

Average time to resolve a cyber attack

Seems Like Cyber Security Is Not Working

Page 3: Visualization in the Age of Big Data

Breaches can be detected early, or even prevented, if we look at the data

Monitoring To The Rescue

Page 4: Visualization in the Age of Big Data

Interactive Visualization

Page 5: Visualization in the Age of Big Data

I am Raffy - I do Viz!

IBM Research

Page 6: Visualization in the Age of Big Data

• Security Landscape

• What is Going Wrong?

• A New Approach

• Security Analytics

• Big Data Lake

• Visualization

• Challenges

• Data Discovery and Exploration

• Examples

Overview

Page 7: Visualization in the Age of Big Data

Monitoring Tools

Scoring

Behavior

Log Mgmt

Threat Feeds

Context

Ticket

IR

False Positive

Manual Triage

Sandboxes

Data Sources

Firewall

IPS

Proxy

AV

Endpoint

SIEM

Page 8: Visualization in the Age of Big Data

• Products / Tools
  • Firewall - Blocks traffic based on pre-defined rules
  • Web Application Firewall - Monitors for signs of known malicious activity in Web traffic
  • Intrusion Prevention System - Looks for ‘signs’ of known attacks in traffic and protocol violations
  • Anti Virus - Looks for ‘signs’ of known attacks on the end system
  • Malware Sandbox - Runs new binaries and monitors their behavior for malicious signs
  • Security Information Management - Uses pre-defined rules to correlate signs from different data streams to augment intelligence
  • Vulnerability Scanning - Searches for known vulnerabilities and vulnerable software

• Rely on pattern matching and signature-based knowledge from the past
  • Reactive -> always behind
  • Unknown and new threats -> won’t be detected
  • ‘Imperfect’ patterns and rules -> cause a lot of false positives

We Are Monitoring - What is Going Wrong?

Defense Has Been Relying On Past Knowledge

Page 9: Visualization in the Age of Big Data

Security Analytics

Page 10: Visualization in the Age of Big Data

A New Approach

ENABLE analysts to leverage their knowledge effectively and efficiently

• scalability - big data based, extensible platform

• visualization - interactive exploration of billions of events

• knowledge - capture from experts

- leverage machines to guide

- automate where possible

- enable collaboration

We Need Analysts in the Loop!

(not better algorithms)

Page 11: Visualization in the Age of Big Data

• Intercept attacks (APT) early in the kill chain

• Detecting intrusions

• Detecting data leaks

• Network-based anomaly detection

• Threat Intelligence

• Attack surface analysis

• Speed up forensic investigations and incident response

• Insider threat detection

• User behavior monitoring

• Privilege abuse

• Fraud detection

• Compliance

• Continuous monitoring

• Risk quantification and metrics

• Business improvements

• Spending justification for security

• Spending optimization (esp. cloud)

Use-Cases Enabled Through Analytics

[Screenshot: analytics console with the tabs Data Stores, Analytics, Forensics, Models, and Admin. Panels show source --> destination IP pairs, Anomalies, a time-series Decomposition (Data, Seasonal, Trend), and Anomaly Details.]

Find Intruders and ‘New Attacks’

Resolve Incidents Quicker

Communicate Findings

Page 12: Visualization in the Age of Big Data

Analytics Platform - How It’s Done

Rules Patterns Scoring

context

data

Security Big Data Lake

• Explore & Hunt

• Visual Forensics

Behavior Anomaly Detection

• Alert Triage

Visualization

Analytics

• Visualization in the center
• Not relying on past knowledge
• Analytics to support, not alert

Page 13: Visualization in the Age of Big Data

Visualization

Page 14: Visualization in the Age of Big Data

Visualization To …

Present / Communicate

Discover / Explore

Page 15: Visualization in the Age of Big Data

Unknown Unknowns - Visualization Is Central

“There are 1000 ways for someone to steal information. If we knew how, we could have prevented it. Visualization helps find that one way.”

- CISO UBS Switzerland

Page 16: Visualization in the Age of Big Data

Visualization Example (Unknown Unknowns)

PixlCloud is a visual analytics platform for cyber security.

This example shows a heatmap of behavior over time.

In this case, we see activity per user. We can see that ‘vincent’ is visually different from all of the other users. He shows up very lightly over the entire time period. This seems to be something to look into.

We were able to find this purely visually, without understanding the data more intrinsically.
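
The aggregation behind a per-user activity heatmap can be sketched as follows. This is a minimal illustration, not PixlCloud code: the `build_heatmap` helper, the `(user, hour)` event shape, and the sample events are all assumptions made for the example.

```python
from collections import Counter

def build_heatmap(events, hours):
    """Count events per (user, hour) and return one row of counts per user."""
    counts = Counter(events)
    users = sorted({user for user, _ in events})
    return {user: [counts[(user, h)] for h in hours] for user in users}

# Hypothetical sample: 'vincent' is active at low volume in every hour,
# while the other users show short bursts during the day.
events = (
    [("alice", 9)] * 5 + [("alice", 10)] * 7
    + [("bob", 14)] * 6
    + [("vincent", h) for h in range(24)]
)
heatmap = build_heatmap(events, range(24))

# Render each row as text; a denser glyph means more activity in that hour.
glyphs = " .:#"
for user, row in heatmap.items():
    line = "".join(glyphs[min(count, 3)] for count in row)
    print(f"{user:>8} |{line}|")
```

In the rendered rows, a user with faint but continuous activity stands out against users with a few dense cells, which is the kind of visual difference the slide describes.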

Page 17: Visualization in the Age of Big Data

Why Visualization?

the stats ...

http://en.wikipedia.org/wiki/Anscombe%27s_quartet

the data...
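
The point of Anscombe's quartet can be checked numerically. The sketch below uses the published quartet values and computes the mean of x, the mean of y, and the Pearson correlation for each of the four datasets; the `pearson` helper and the variable names are introduced here for illustration only.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Datasets I-III share the same x values; dataset IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

summary = {
    name: (mean(xs), mean(ys), pearson(xs, ys))
    for name, (xs, ys) in QUARTET.items()
}
for name, (mx, my, r) in summary.items():
    print(f"{name:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The summary statistics come out nearly identical for all four datasets, even though their scatter plots look completely different. That is why the slide contrasts "the stats" with "the data".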

Page 18: Visualization in the Age of Big Data

Why Visualization?

http://en.wikipedia.org/wiki/Anscombe%27s_quartet

Human analyst:
• pattern detection
• remembers context
• fantastic intuition
• can predict

Page 19: Visualization in the Age of Big Data

• Access to data

• Parsed data and data context

• Data architecture for central data access and fast queries

• Application of data mining (how?, what?, scalable, …)

• Visualization tools that support

• Complex visual types (parallel coordinates, treemaps, heat maps, link graphs)

• Linked views

• Data mining (clustering, …)

• Visual analytics workflow

Visualization Challenges

Page 20: Visualization in the Age of Big Data

Access paradigms for a backend:

• Analytical queries - mainly for visual interaction

• Accessing large amounts of data in aggregated ways

• Support for intelligent caching (reduce slow re-query of data)

• Statistics - answering frequent ‘aggregation’ queries very fast

• Ad-hoc search

• Raw data retrieval

• Context - deal with data context for time-series data

Enablement - Data Layer Requirements

Note: No mention of HADOOP!

Page 21: Visualization in the Age of Big Data

Big Data Lake

Page 22: Visualization in the Age of Big Data

The Big Data Lake

• One central location to store all cyber security data
• “Data collected only once and third party software leveraging it”
• Scalability and interoperability

• Hard problems:
  • Parsing: can you re-parse?
  • Data store capabilities (search, analytics, distributed processing, etc.)
  • Access to data: SQL (even in Hadoop context), how can products access the data?

Prevent Re-Collection?

Page 23: Visualization in the Age of Big Data

The Security Data Lake - Federated Data Access

SIEM

dispatcher

SIEM connector SIEM console

Prod A

AD / LDAP

HR

IDS

FW

Prod B

DBs

Data Lake

SNMP

Many many challenges!

Page 24: Visualization in the Age of Big Data

Data Lake Version 0.5a

SIEM

columnar or search engine or log management

processing

SIEM connector

raw logs

SIEM console

SQL or search interface

processing filtering

HDFS

lake

Current solutions (log mgmt / SIEM):
- not open
- don’t scale

Page 25: Visualization in the Age of Big Data

Data Discovery & Exploration

Page 26: Visualization in the Age of Big Data

Visualize Me Lots (>1TB) of Data

Page 27: Visualization in the Age of Big Data

Information Visualization Mantra

Overview -> Zoom / Filter -> Details on Demand

Principle by Ben Shneiderman

Page 28: Visualization in the Age of Big Data

SecViz Examples

Page 29: Visualization in the Age of Big Data

Additional information about objects, such as:

• machine
  • roles
  • criticality
  • location
  • owner
  • …

• user
  • roles
  • office location
  • …

Add Context

source destination

machine and user context

machine role

user role
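
Adding machine and user context to a traffic record amounts to a lookup-and-join step. The sketch below assumes two small lookup tables; the `MACHINES` and `USERS` dictionaries, their field names, and the `add_context` helper are hypothetical stand-ins for an asset inventory and a user directory.

```python
# Hypothetical context tables; in practice these might be fed from an
# asset inventory and a directory service.
MACHINES = {
    "10.0.0.5": {"role": "workstation", "criticality": "low", "owner": "alice"},
    "10.0.1.20": {"role": "database", "criticality": "high", "owner": "dba-team"},
}
USERS = {
    "alice": {"role": "sales", "office": "Zurich"},
}

def add_context(flow):
    """Return a copy of the flow enriched with machine and user roles."""
    enriched = dict(flow)
    for side in ("source", "destination"):
        machine = MACHINES.get(flow[side], {})
        enriched[f"{side}_role"] = machine.get("role", "unknown")
        owner = machine.get("owner")
        enriched[f"{side}_user_role"] = USERS.get(owner, {}).get("role", "unknown")
    return enriched

flow = {"source": "10.0.0.5", "destination": "10.0.1.20", "port": 1433}
print(add_context(flow))
```

With roles attached, a visualization can color or group by machine role and user role instead of raw addresses. Addresses or owners missing from the tables are marked "unknown" rather than dropped.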

Page 30: Visualization in the Age of Big Data

Traffic Flow Analysis With Context

Page 31: Visualization in the Age of Big Data

An Analytical Example - Monitor Password Resets

threshold

outliers have different magnitudes

Page 32: Visualization in the Age of Big Data

Approximate Curve

fitting a curve

distance to curve
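
One way to make "distance to curve" concrete is to fit a curve and flag the points whose residual is unusually large. The sketch below assumes a straight-line least-squares fit and a cutoff of two standard deviations of the residuals. Both choices, and the daily password-reset counts, are illustrative assumptions rather than the method shown on the slide.

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def outliers(xs, ys, k=2.0):
    """Return the x values whose distance to the fitted line exceeds k sigma."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    return [x for x, r in zip(xs, residuals) if abs(r) > k * spread]

# Hypothetical daily password-reset counts with one spike on day 5.
days = list(range(10))
resets = [3, 4, 4, 5, 6, 30, 7, 8, 8, 9]
print(outliers(days, resets))
```

Measuring distance to the fitted curve rather than comparing against a fixed threshold lets the baseline drift upward without every later day being flagged.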

Page 33: Visualization in the Age of Big Data

• Holt-Winters is exponential smoothing
• Lets you define thresholds for alerting!

Data Mining Applied

• Hard to define alert threshold

better threshold
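
The idea of a smoothing-based alert threshold can be sketched with single exponential smoothing. Note the simplification: full Holt-Winters adds trend and seasonal components, which are omitted here. The `smoothing_alerts` function, the parameters `alpha` and `k`, and the sample series are assumptions made for illustration.

```python
def smoothing_alerts(series, alpha=0.3, k=3.0):
    """Flag indices whose value lies outside forecast +/- k * deviation.

    The forecast is a single exponentially smoothed level; the deviation is
    an exponentially smoothed absolute forecast error.
    """
    level = series[0]
    deviation = None
    alerts = []
    for i, value in enumerate(series[1:], start=1):
        error = abs(value - level)
        # Compare against the band before this point updates the model.
        if deviation is not None and error > k * deviation:
            alerts.append(i)
        deviation = error if deviation is None else alpha * error + (1 - alpha) * deviation
        level = alpha * value + (1 - alpha) * level
    return alerts

# Hypothetical hourly counts with one spike at index 6.
counts = [10, 12, 11, 13, 12, 11, 40, 12, 11]
print(smoothing_alerts(counts))
```

Because the band follows the recent level and variability of the series, the threshold adapts to the data instead of being a hand-picked constant.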

Page 34: Visualization in the Age of Big Data

copyright (c) 2013 pixlcloud | creating actionable data stories

Internet Service Provider

• Monitoring entire network
• Shows scans across customers on port 445 (Windows shares)

new worm emerging

Page 35: Visualization in the Age of Big Data

Machine Learning - Clustering Users

Source: Email logs

Explanation: The graph shows email communications between employees and outside people.

By clustering the data, different user groups become visible automatically. It turned out that there was an entire cluster that we could not assign to a known group of users!

unknown

product teams

sales and marketing

competition

Page 36: Visualization in the Age of Big Data

Intra-Role Anomaly - Random Order

users

time

dc(machines)

Page 37: Visualization in the Age of Big Data

Intra-Role Anomaly - With Seriation

Page 38: Visualization in the Age of Big Data

Intra-Role Anomaly - Sorted by User Role

Administrator

Sales

Development

Finance

Admin???
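
The data behind these intra-role views is a users-by-time matrix of dc(machines), the distinct count of machines each user touched, with rows ordered by role. The sketch below is an illustrative assumption: the login tuples, the `roles` mapping, and both helper functions are invented for the example.

```python
from collections import defaultdict

def distinct_machines(logins):
    """Map (user, day) to the number of distinct machines the user touched."""
    seen = defaultdict(set)
    for user, day, machine in logins:
        seen[(user, day)].add(machine)
    return {key: len(machines) for key, machines in seen.items()}

def rows_by_role(logins, roles, days):
    """Return (role, user, dc-per-day) rows, grouped by role and then user."""
    dc = distinct_machines(logins)
    users = sorted(roles, key=lambda u: (roles[u], u))
    return [(roles[u], u, [dc.get((u, d), 0) for d in days]) for u in users]

# Hypothetical logins: 'cy' is in Sales but touches as many machines as an admin.
roles = {"ann": "Sales", "bo": "Administrator", "cy": "Sales"}
logins = [
    ("ann", 0, "m1"), ("ann", 0, "m1"), ("ann", 1, "m1"),
    ("bo", 0, "m1"), ("bo", 0, "m2"), ("bo", 0, "m3"),
    ("bo", 1, "m2"), ("bo", 1, "m4"),
    ("cy", 0, "m1"), ("cy", 0, "m2"), ("cy", 0, "m3"), ("cy", 0, "m4"), ("cy", 0, "m5"),
    ("cy", 1, "m1"), ("cy", 1, "m2"), ("cy", 1, "m3"), ("cy", 1, "m6"),
]
rows = rows_by_role(logins, roles, days=[0, 1])
for role, user, counts in rows:
    print(f"{role:<14} {user:<4} {counts}")
```

Once rows are grouped by role, a user whose counts resemble a different role's profile stands out from their peers, which is the "Admin???" question the slide raises.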

Page 39: Visualization in the Age of Big Data

• This looks interesting

• What is it?

• Green -> Port 53

• Only port 53?

• What IPs?

• What’s the time behavior?

• The graph doesn’t answer these questions

Graphs - A Story

Page 40: Visualization in the Age of Big Data

Graphs - A Story

• Adding a port histogram

• Select DNS traffic and see if other ports light up.

Note how this is a user experience challenge!
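
The interaction described here, selecting DNS traffic in one view and watching which other ports light up in the histogram, boils down to propagating a selection between views. A minimal sketch follows, assuming flows are `(source, destination, dest_port)` tuples; the sample flows and the `linked_port_histogram` helper are hypothetical.

```python
from collections import Counter

# Hypothetical flows: (source, destination, dest_port).
flows = [
    ("10.0.0.5", "8.8.8.8", 53),
    ("10.0.0.5", "8.8.8.8", 53),
    ("10.0.0.5", "203.0.113.7", 443),
    ("10.0.0.9", "198.51.100.2", 80),
]

def linked_port_histogram(flows, selected_port):
    """Port histogram restricted to sources that appear in the selection."""
    selected_sources = {src for src, _, port in flows if port == selected_port}
    return Counter(port for src, _, port in flows if src in selected_sources)

hist = linked_port_histogram(flows, selected_port=53)
for port, count in sorted(hist.items()):
    print(f"port {port:>5}: {'#' * count}")
```

Selecting port 53 keeps only the machines that sent DNS traffic, so any other bar that remains in the histogram shows what else those machines were doing.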

Page 41: Visualization in the Age of Big Data

• Linked Views

• Histograms for

• Source

• Port (Source)

• Destination

• Parallel coordinates (||-coord)

DNS Traffic - A Closer Look

Page 42: Visualization in the Age of Big Data

Bringing It All Together

Page 43: Visualization in the Age of Big Data

Bringing It All Together

[Screenshot: analytics console with the tabs Data Stores, Analytics, Forensics, Models, and Admin. Panels show source --> destination IP pairs, Anomalies, a time-series Decomposition (Data, Seasonal, Trend), and Anomaly Details.]

“Hunt”

Explain

Visual Search

• Big data backend
• Own visualization engine (Web-based)
• Visualization workflows

Page 44: Visualization in the Age of Big Data

http://secviz.org

List: secviz.org/mailinglist

Twitter: @secviz

Share, discuss, challenge, and learn about security visualization.

Security Visualization Community

Page 45: Visualization in the Age of Big Data

BlackHat Workshop

Visual Analytics - Delivering Actionable Security Intelligence

August 1-6 2015, Las Vegas, USA

big data | analytics | visualization

http://secviz.org

Page 46: Visualization in the Age of Big Data

[email protected]

http://slideshare.net/zrlram

http://secviz.org and @secviz

Further resources: