
Data Analytics: Dealing with Cyber Storms Cybersecurity Analysis Suite (CASe) for Precision and Prediction in an Era of Big Data

Joseph Kielman Science Advisor 18 May 2015


PART 1: Operational Need

•  Data analysis challenges are well documented*
  •  Annual global IP traffic will surpass the zettabyte threshold (1.3 zettabytes) by the end of 2016. In 2016, global IP traffic will reach 1.3 zettabytes per year or 110.3 exabytes per month.
  •  Global IP traffic has increased eightfold over the past 5 years, and will increase threefold over the next 5 years. Overall, IP traffic will grow at a compound annual growth rate (CAGR) of 29 percent from 2011 to 2016.
  •  In 2016, the gigabyte equivalent of all movies ever made will cross global IP networks every 3 minutes. Global IP networks will deliver 12.5 petabytes every 5 minutes in 2016.
  •  The number of devices connected to IP networks will be nearly three times as high as the global population in 2016.
•  This work focuses on the analysis portion of the data processing cycle and applying it to the cyber security environment
  •  US CERT projects need for analysis will rise to 10B events/day when EINSTEIN 3 becomes operational
  •  Predictive analysis will help anticipate cybersecurity events – key as number and frequency of events increases

*Source: Cisco VNI forecast, 2011-2016
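The traffic and event figures above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check under stated assumptions: decimal units (1 zettabyte = 1,000 exabytes, 1 exabyte = 1,000 petabytes), a 365-day year, and illustrative constant names that do not come from the Cisco forecast.

```python
# Figures quoted on the slide (Cisco VNI forecast, 2011-2016, and US CERT).
EB_PER_MONTH_2016 = 110.3   # exabytes per month in 2016
CAGR = 0.29                 # compound annual growth rate, 2011 to 2016
YEARS = 5                   # length of the forecast period
EVENTS_PER_DAY = 10e9       # projected US CERT analysis load with EINSTEIN 3

# Assumption: decimal units (1 ZB = 1,000 EB; 1 EB = 1,000 PB).
zb_per_year = EB_PER_MONTH_2016 * 12 / 1000

# Assumption: a 365-day year, i.e. 365 * 24 * 12 five-minute intervals.
pb_per_5_min = EB_PER_MONTH_2016 * 12 * 1000 / (365 * 24 * 12)

# Total growth implied by compounding the CAGR over the period.
growth_multiple = (1 + CAGR) ** YEARS

# Average event rate implied by the daily projection.
events_per_second = EVENTS_PER_DAY / 86_400

print(f"{zb_per_year:.2f} ZB per year")
print(f"{pb_per_5_min:.1f} PB per 5 minutes")
print(f"{growth_multiple:.2f}x growth over {YEARS} years")
print(f"{events_per_second:,.0f} events per second")
```

Under these assumptions, 110.3 exabytes per month corresponds to about 1.3 zettabytes per year and roughly 12.6 petabytes every five minutes, a 29 percent CAGR compounds to about 3.6 times over five years, and 10B events/day averages out to roughly 116,000 events per second.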


[Figure: analytics framework diagram. Progression: Data, Information, Knowledge, Wisdom. Stages: Organization, Cognition, Prediction. Tagline: Connect the Dots. Labeled functions: Content Management; Aggregation; Integration; Extraction; Link Discovery; Pattern Analysis; Graph Matching; Evidence Extraction; Visual Analytics; Synthesis; Analysis; Predictive/Prospective; Discovery; Policy; Context, Culture, Genes; Cognitive/Behavioral Analytics.]


Analytics for Complex Information (ACI)

Being Prospective rather than Retrospective means enabling users to harness Massive data, which comes in Multiple modes and Multiple types, through Multiple devices, in Diverse user environments, in order to make decisions in real time.

Dynamic Information, Made Actionable, In Real-Time


Information after Analytics

•  What part is relevant?

•  Am I missing something?

•  What does it mean?

•  When will I need it?

•  How do I use it?

•  Can I trust it?

•  Is it private or personal?

•  How do I convey it to others?

•  When can I take actions with it? And which ones?

•  How will others see it?

•  How should I interact with it?


Other Issues with Digital Data

•  Data Archiving
  •  How and where to store it?
  •  What do we do about clouds, warehouses, and websites?
•  Data Permanence
  •  How long will it last?
•  Data Rendering
  •  What will we use to run it?
•  Regulatory or Legal Issues
  •  How to deal with Digital Rights Management and piracy?
  •  Who enforces the copyright?


Science and Technology Principles

What Is the Problem?

Information - There are real-world [resiliency] issues for which the availability and usefulness of information is problematic

How Do We Structure the Solution?

Utility - The capabilities being developed must address at least four information-related concerns: increasing amounts and diversity of data compounded by multiplicity of users; indeterminate nature of relevance coupled with multiplicity and complexity of [threats] or problems

What Is the Value of the Work?

Outcome - The improvements being provided must have real-world outcomes or implications for the [homeland security] enterprise

NOTE: Please replace words in [ ] with terms appropriate to your industry or affiliation.


What We Want/Need/Expect from Analytics

The Goal … A coherent story using complete, up-to-date information and with which we are comfortable in making decisions

But …

•  How coherent?

•  How complete?

•  How current?

•  How comfortable?

So …

•  Risk measures

•  Confidence limits


PART 2: Program Objectives and Design

•  Develops and delivers capability to discern, analyze and investigate, and predict multiple, large-scale cybersecurity threats to critical infrastructures
•  Addresses both DHS components’ investigative and protective missions and the requirements of the recent Executive Order and Presidential Policy Directive
•  Delivered in three stages tailored for successively more complex applications
  •  Standalone system addressing localized to nation-wide to global financial crimes and cyber attacks by individuals and criminal networks
  •  Networked capability allowing distributed toolsets and data sources to be readily applied to address large-scale, distributed attacks on multiple infrastructure sectors
  •  Public-Private Partnership using matching private sector funds to deliver a common cyber threat and situational awareness capability for multiple, linked infrastructures
•  Enhances system performance and ROI with each discrete implementation stage
  •  Relies on previously developed tools, frameworks, and systems as well as others currently under development to replace largely manual, case-specific methods
  •  Integrates and scales capabilities to provide real-time analysis of developing cyber infrastructure attacks at levels of tens of billions of events or transactions
  •  Enables interdiction of criminal activities or cyber attacks before they become widespread or severe as well as forecasting or prediction of potential cyber threats


EO-13636/PPD-21 Requirements

•  PPD-21 dated 2/12/13 calls for the following:
  •  Strategic Imperative 3 calls for implementing “An integration and analysis function to inform planning and operational decisions regarding Critical Infrastructures (CI)”
  •  DHS to: “Maintain national critical infrastructure centers that shall provide a situational awareness capability that includes integrated, actionable information about emerging trends, imminent threats, and the status of incidents that may impact critical infrastructures”
  •  DHS, “In conjunction with … other Federal Departments and agencies, to provide analysis, expertise and other technical assistance to CI owners and operators”
  •  DHS to: “Support the Attorney General and law enforcement agencies with their responsibilities to investigate and prosecute threats to and attacks against critical infrastructure”
•  Integrated Task Force (ITF) Incentives Study in process. Cyber Security Division leading this engagement for S&T, which includes integration of the R&D requirements and strategy


Op Context Chart – Financial Crimes Current (As is)

Version 12 (2-5-2013)

Various agencies and organizations are responsible for maintaining the integrity of the nation's financial infrastructure and payment systems. They constantly implement and evaluate prevention and response measures to guard against electronic crimes as well as other computer-related fraud. The financial industry estimates billions of dollars in annual losses associated with credit card fraud.

Process flow (ongoing collection/analysis): Financial crime occurs → (1) Receive raw financial crime data via massive Excel files → (2) Organize and analyze data across multiple details (e.g. location, time, identities, etc.) → (3) Determine responsible parties, collect evidence for prosecution → (4) Prosecute responsible parties. Time markers: T = days, T = weeks.

1.  Current detection of financial crimes comes from the victim or the company holding the victim’s access and financial information
  •  Financial crimes happen very quickly and are international in nature, as the same information can be sent around the world to other locations for action (e.g. ATM withdrawals simultaneously in US, UK, Germany, etc.)
  •  The communities responsible for large-scale financial crimes are known, but prosecuting the leads for these communities is difficult.
  •  EX: Over 1 week in Feb. 2013, one incident with 42,000 transactions across multiple countries led to a total loss of $39M.

2.  The raw financial data comes in different ways, exported from different systems, into massive Excel files
  •  Evidence is collected by location, amount, and account information
  •  Geolocation of events is difficult, as ATMs, banks, etc. are not properly tagged with location data
  •  Surveillance information at sites varies by country, city, town, etc.

3.  Determining responsible party(ies) is difficult and labor intensive
  •  The “middle man” is easier to identify using camera and surveillance systems at the site of the crime; however, the larger organization and the individuals therein giving the orders and disseminating the financial information are harder to identify for prosecution

4.  Prosecuting responsible parties is difficult because assembling the level of data needed for prosecution is complicated and labor intensive
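The "organize and analyze data across multiple details" step can be illustrated with a small sketch. Everything in it is hypothetical: the record layout, the `flag_multi_country` helper, the 30-minute window, and the toy transactions are assumptions made for illustration, not the method of any tool named in this deck. It flags accounts used in two or more countries within a short window, the pattern in the simultaneous ATM withdrawal example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Toy data, invented for illustration: (account_id, country, timestamp, amount_usd).
transactions = [
    ("acct-1", "US", datetime(2013, 2, 1, 10, 0), 400.0),
    ("acct-1", "UK", datetime(2013, 2, 1, 10, 5), 380.0),
    ("acct-1", "DE", datetime(2013, 2, 1, 10, 7), 420.0),
    ("acct-2", "US", datetime(2013, 2, 1, 9, 0), 60.0),
    ("acct-2", "US", datetime(2013, 2, 2, 9, 0), 80.0),
]


def flag_multi_country(records, window=timedelta(minutes=30), min_countries=2):
    """Return {account: sorted countries} for accounts seen in at least
    `min_countries` countries within any time span no longer than `window`."""
    by_account = defaultdict(list)
    for account, country, timestamp, _amount in records:
        by_account[account].append((timestamp, country))

    flagged = {}
    for account, events in by_account.items():
        events.sort()  # chronological order
        best = set()   # largest set of countries found in one window
        start = 0
        for end in range(len(events)):
            # Advance the left edge until the span fits inside the window.
            while events[end][0] - events[start][0] > window:
                start += 1
            countries = {country for _, country in events[start:end + 1]}
            if len(countries) > len(best):
                best = countries
        if len(best) >= min_countries:
            flagged[account] = sorted(best)
    return flagged


print(flag_multi_country(transactions))
```

On the toy data only `acct-1` is flagged, because its three withdrawals fall within seven minutes across three countries, while `acct-2` stays in one country. A real investigation would need reliable location tagging, which the chart notes is often missing.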


Op Context Chart – Financial Crimes Future (To be)

Process flow (ongoing collection/analysis; lanes: U.S., Contractors): Financial crime occurs → Receive raw financial crime data → (1) Anomaly detection within financial transactions (tools to help financial institutions identify events) → (2) Apply WireVis to integrate data into transactional database with visualization interface for analysis and data manipulation → Organize and analyze data across multiple details (e.g. location, time, identities, etc.) → (3) Determine responsible parties, collect evidence for prosecution → (4) Prosecute responsible parties. Time marker: T = minutes.

1.  Support initial anomaly detection activities of Federal Government and its major partners
  •  Currently receives cluttered raw data in varied forms, mostly via Excel files
  •  If appropriate, major partners could link to the WireVis system to immediately
  •  This would enable uniform data creation and sharing methods for the US and its major partners

2.  Apply the WireVis tool to enable faster organization, analysis, and understanding of financial crimes data (minutes vs. days)
  •  WireVis was developed to investigate money-laundering & fraud; can be applied to everything from risk analysis to financial business intelligence
  •  This tool was developed by UNCC with the support of S&T’s Office of University Programs and Bank of America
  •  Supports highly interactive exploration from a grand overview to particular cases

3.  Determining responsible party(ies) would remain the job of the US experts, but WireVis could help them sift through the large amounts of data in a more timely and effective manner to support conclusion finding

4.  Prosecuting responsible parties

Return on Investment
Annual capability and/or efficiency improvements:
•  Process more data in a more cost- and time-effective manner
•  Supports better understanding of criminal network strategies and practices, which could lead to more arrests and prosecutions
Millions of U.S. dollars saved by disrupting the communities responsible for large-scale financial crimes around the world


Impact on Operations

Financial Fraud Investigation (As-Is)
Steps: Collect Fraud Data → Extract Data & Upload (Transaction and IP Tags) → Analyze Data (Patterns-Trends) → Share & Disseminate
Timeline markers: 0 Day; Time = Days; Time = Weeks; Days+60

Financial Fraud Investigation (To Be)
Steps: Collect Fraud Data → Extract Data & Upload (Transaction and IP Tags) → Analyze Data (Patterns-Trends) → Visualize (Spatial/temporal) → Share & Disseminate → Store (Future pattern recognition)
**Additional Capability
Timeline markers: 0 Hour; 0.25 hours; 0.5 hours; 0.75 hours; 2.00 hours


Return on Investment (ROI)

•  Analyst workload impact
  •  One case that takes 60 days, using two analysts:
    •  As is: 480 analyst hours x $32.93/hr = $15,806
    •  To be: 4 analyst hours x $32.93/hr = $131.72
•  Using some of the FY 11 data on case loads:
  •  125 disaster fraud investigations opened, 1000+ ongoing
  •  Undisclosed # of mortgage fraud investigations
  •  Undisclosed # of Electronic crimes cases, including financial fraud
•  Very roughly 250 cases x $15K labor savings per case = ~$3.75M in analyst hours in first year

Assumes 13% rise in case load each year
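The labor-savings arithmetic on this slide can be reproduced directly. The sketch below uses only the slide's figures; the variable names and the `projected_savings` helper are illustrative, and treating the 13% case-load rise as year-over-year compounding of the first-year savings is an assumption, since the slide does not say how the rise is applied.

```python
HOURLY_RATE = 32.93          # analyst labor rate from the slide, USD per hour
AS_IS_HOURS = 480            # analyst hours per case today
TO_BE_HOURS = 4              # analyst hours per case with the tool
CASES_FIRST_YEAR = 250       # "very roughly 250 cases"
ROUNDED_SAVINGS = 15_000     # the slide rounds per-case savings to $15K
CASELOAD_GROWTH = 0.13       # assumed yearly rise in case load

as_is_cost = AS_IS_HOURS * HOURLY_RATE
to_be_cost = TO_BE_HOURS * HOURLY_RATE
savings_per_case = as_is_cost - to_be_cost

# First-year savings using the slide's rounded per-case figure.
first_year_savings = CASES_FIRST_YEAR * ROUNDED_SAVINGS


def projected_savings(year):
    """Savings in a given year (1-based), assuming the case load, and hence
    the savings, compounds at CASELOAD_GROWTH per year (an assumption)."""
    return first_year_savings * (1 + CASELOAD_GROWTH) ** (year - 1)


print(f"As is: ${as_is_cost:,.2f} per case")
print(f"To be: ${to_be_cost:,.2f} per case")
print(f"Savings: ${savings_per_case:,.2f} per case")
for year in (1, 2, 3):
    print(f"Year {year}: ${projected_savings(year):,.0f}")
```

The unrounded per-case saving is a little above the $15K the slide uses, so the $3.75M first-year estimate is slightly conservative on the slide's own numbers.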


Cybersecurity Foundations

•  VASA – Visual Analytics for Security Applications
  •  Joint Germany-US research program
  •  Three-year effort with 10 partner institutions
  •  Focus is on Critical Infrastructures
•  Understanding and Disrupting the Economics of Cybercrime
  •  DHS/S&T Cyber Security Division program funded through White House CNCI
  •  Carnegie Mellon University is prime
  •  Focus is twofold: identifying disincentives and understanding criminal behaviors
•  LINEBACkER – Line-speed Bio-inspired Analysis and Characterization for Event Recognition
  •  Another White House CNCI initiative
  •  Prime is Pacific Northwest National Laboratory
  •  Focus is distributed, early warning of possible cyber attacks


Visual and Data Analytics for Cybersecurity

[Figure: Heterogeneous Data (Diverse, Diffuse, Distributed); Analysis; Situational Awareness + Predictive Insights]

Analytic Tool | Description | Performer | Investment to Date | Status
Traffic Circle and Clique | Cyber threats analysis tools for network attacks | PNNL | Basic capability funding, $2.75M; US CERT and other implementation, $1.5M | Pilot deployment at US CERT
In-spire | Text analysis tool now being applied to cyber threat data | PNNL/U Ill UC | Basic capability, $3.5M; Cyber threat application, $200k | Deployed @NBIC
Precision Information Environment (PIE) | Situational awareness and decision-making for large-scale, multi-agency emergency response actions | PNNL | $2.25M | Deployed w/FEMA Region X
GREEN Suite | Near real-time analysis of threats, risks, vulnerabilities of Power Grid | PNNL | $3.5M S&T funding. Co funding from IC, DOE, and Bonneville Power | Deployed at Bonneville Power Admin
WireVis | Financial fraud analytical tool | UNCC | $1.75 S&T (COE) funding, $5M Bank of America (BofA) funding | Deployed at BofA


Technical Approach

Analytic Tool | Data Inputs | Data Outputs | Capacity (As Is) | Capacity (To Be)
WireVis | Structured, semi-structured financial transaction data | Heatmap (clustering), similarity-based comparison, temporal relationships | Query up to 10M data points/minute | Modify to work for more general analysis applications
CLIQUE | Structured | User-defined temporal views, anomalous behavior | Up to 100M records per query (laptop) | Higher throughput possible when used with other appliances (Netezza)
PIE | Structured, unstructured (text) | Event tracking, collaborative environment | Up to 400k data points | Up to 1M data points; add resource modeling and tasking, video & image
Green Suite | Multiple types of structured & unstructured data: geospatial (location-connection), physics (voltage-current), telecom (phone calls, text messages) | Link analysis, network analysis | Up to 1M nodes (desktop); up to 100M link queries/30 mins | Scale to 1B link queries/30 mins
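To compare the capacity figures in this table, it helps to put them on a common per-second basis. The sketch below is a unit conversion only, with hypothetical names (`per_second`, `rates`). The quantities measure different things (events, data points, link queries), so the resulting rates are not interchangeable; the 10B events/day figure is the US CERT projection cited under Operational Need.

```python
# Seconds in each reporting period used on the slides.
SECONDS_PER_PERIOD = {
    "minute": 60,
    "30 minutes": 30 * 60,
    "day": 24 * 60 * 60,
}


def per_second(count, period):
    """Convert a count per named period into an average count per second."""
    return count / SECONDS_PER_PERIOD[period]


rates = {
    "US CERT projected need (events)": per_second(10e9, "day"),
    "WireVis as is (data points)": per_second(10e6, "minute"),
    "Green Suite as is (link queries)": per_second(100e6, "30 minutes"),
    "Green Suite to be (link queries)": per_second(1e9, "30 minutes"),
}

for label, rate in rates.items():
    print(f"{label}: {rate:,.0f} per second")
```

Read this way, the Green Suite target is a tenfold increase in sustained link-query rate, and the projected US CERT load is on the order of a hundred thousand events per second.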


Program Timeline & Cost

Data Analysis for Cybersecurity

Timeline axis: 0, Yr 1, Yr 2, Yr 3, Yr 4, Yr 5. Funding markers: $6M, $7M, $8M, $3M.

Big Data Analytics for Cyber Security (Time: 2 yrs; Cost: $6M)
1)  Common integrated analytical platform
2)  Near real time processing of big data sets
3)  Collection, Processing, Analysis, Sharing

Homeland Security Data Analytics Network (Time: 2 yrs; Cost: $7M)
1)  Nationwide computational and analysis network for real time insight into the cyber threat environment
2)  Analogous to the Big Science networks in place for grand scientific challenges

Next Gen Data and Interoperability for Cyber-Disaster Management (Time: 3 yrs; Cost: $11M; Yr 1, $8M; Yr 2, $3M)
1)  Establish/maintain public-private partnership w/joint funding
2)  Nationwide computational and analysis network for real time insight into the cyber threat environment
3)  Address multiple, cascading emergency response scenarios for interdependent critical infrastructures
4)  Establish education complex with competitions and challenges
5)  Addresses economics of, insurance for and risks of cyber threats


CASe Phase I

Development: Deliver an advanced data analysis tool to DHS and one financial institution in support of DHS financial fraud and electronic crime missions.

[Figure: CASe combines PIE, CLIQUE, WireVis, Green Suite, and other identified data (e.g. insurance fraud, ATM locations, card transactions). Capability labels: network activity analysis tool; business processes tool for financial data; massive scale link analysis tool; situational awareness and collaboration.]

Example Use Case: Financial Crime Event

Financial Crime Event:
•  42,000 transactions over 1 week
•  Loss of $39M

Current Analysis:
•  4 weeks to organize & analyze giant spreadsheet data
•  On average, 80% of analysts' time is spent organizing, not analyzing data

Program Plan:
1.  Concept development/Requirements Analysis: Observe & document DHS and financial institution analytic processes & operational needs
2.  Design: AoA, Tech Foraging, Development strategy
3.  Development: Tailor/modify existing tools to meet different requirements for DHS and Financial Institution, increase application capacities
4.  Development testing: Test each application throughout development phase
5.  Integration: Integrate CLIQUE, WireVis, Green Suite and PIE
6.  Integration Testing
7.  Pilot Deployment: Deploy integrated suite for piloting at both DHS and commercial facility
8.  Finalize S/W and Documentation for customer acceptance and transition
9.  Transition


CASe Phase 1 Timeline

Phase 1: Data Analytics Deployment to DHS (USSS) and Commercial Bank

Time axis: T = 0M, 2M, 4M, 6M, 8M, 10M, 1 YR, 14M, 16M, 18M, 20M, 22M, 2 YR

Tasks and costs:
•  Concept Development ($300K)
•  Req Analysis ($450K): 1) Interviews 2) Review & document analytic processes
•  Design ($300K)
•  Development ($1.68M): 1) Tailoring S/W based on analyst/business processes (e.g. WireVis for USSS) 2) Increasing capacity for PIE 3) Additional data type handling for PIE
•  Development Testing ($540K)
•  Integration ($1.2M): WireVis, CLIQUE, GREEN Suite & PIE
•  Integration Testing ($660K)
•  Beta Deployment ($120K)
•  Final S/W and document production ($150K)
•  Transition

Milestones: Project Approved; PNNL IA Awarded (Mod); UNCC Grant Awarded; Strategy Approved (Go/No Go); Dev Review (Go/No Go); Integration Review (Go/No Go); Dev Review (Go/No Go)


•  CASe Phase 1: Data Analysis Capability for Financial Sector
  •  This program element delivers a standalone system addressing localized to nation-wide to global financial crimes and cyber attacks by individuals and criminal networks

Technical Approach

Phase | Activities | Outputs | Cost
Concept Development | Preliminary analysis; Initial transition strategy | Operational Needs statement | $300K
Requirements Analysis | Define KPP, interfaces, technical requirements | FRD; Tech Reqs Document; CONOPS (if required) | $450K
Design | Individual design features; Integrated design features | AOA Documentation; Sys Design Document | $900K
Development | Tailor PIE, WireVis, CLIQUE, Green Suite to desired reqs | Testable s/w modules & capabilities | $1680K
Development Testing | Analysis tool testing | Test plan/test report | $540K
Integration & Testing | Test integrated capability of 4 tools | Test plan/test report | $1860K
Pilot deployment | Deploy pilot | S/W fixes | $120K
Transition Prep | Final S/W builds; C&A Support | User/Admin guide | $150K


CASe Phase II

Development: Build a distributed, networked capability and incorporate additional analysis capabilities as required by respective infrastructures

Expand CASe’s analysis capabilities by incorporating other tools, namely:
•  An investigative tool for modeling attacker behavior and criminal supply chains, and
•  A modeling capability for understanding interdependencies of critical infrastructure

Develop a networked capability to access the CASe capability from distributed locations, effectively:
•  Capitalizing on larger available computing infrastructure
•  Increasing information sharing/access
•  Better understanding of the sector’s data analysis needs, requirements, and future challenges

[Figure: CASe combines PIE, CLIQUE, WireVis, Green Suite, and other data, and adds CMU attacker behavior & cyber crime supply chain models and VASA critical infrastructure interdependencies.]


CASe Phase 2 Timeline

Phase 2: Homeland Security Analysis Network

Time axis: T = 0M, 2M, 4M, 6M, 8M, 10M, 1 YR, 14M, 16M, 18M, 20M, 22M, 2 YR

Tasks and costs:
•  Concept Development ($350K)
•  Req Analysis ($245K): 1) Interviews 2) Review & document analytic processes
•  Design ($700K)
•  Acquisition Approach: BAA or fast track acq process; awards made
•  Development ($1.47M): 1) Based on BAA topic awards, development will be required for technologies to support distributed processing, storage retrieval 2) Tailoring CMU models 3) Tailoring VASA capabilities
•  Development Testing ($385K)
•  Integration ($2.52M): VASA capability and Carnegie Mellon Cybercrime models into CASe baseline
•  Integration Testing ($840K)
•  Beta Deployment ($280K)
•  Final S/W and document production ($140K)
•  Transition

Milestones: Project Approved; Strategy Approved (Go/No Go); Dev Review (Go/No Go); Integration Review (Go/No Go); Dev Review (Go/No Go)


•  Phase 2: Homeland Security Data Analysis Network
  •  This program element aims to deliver a networked capability allowing distributed toolsets and data sources to be readily applied to address large-scale, distributed attacks on multiple infrastructure sectors. The end state will be a network for cybersecurity threat analysis and information sharing, analogous to the Big Science research networks.

Technical Approach

Phase | Activities | Outputs | Cost
Concept Development | Preliminary analysis; Initial transition strategy | Operational Needs statement | $350K
Requirements Analysis | Define KPP, interfaces, technical requirements; Multiple sector analysis | FRD; Tech Reqs Document; CONOPS (if required) | $245K
Design | Individual design features; Integrated design features | AOA Documentation; Sys Design Document | $700K
Development | Technologies to support distributed processing, storage retrieval; Tailoring CMU models; Tailoring VASA capabilities | Testable s/w modules & capabilities | $1470K
Development Testing | Analysis tool testing | Test plan/test report | $385K
Integration & Testing | Test integrated capability of 2 additional tools plus testing of distributed model | Test plan/report | $3360K
Pilot deployment | Deploy pilot to 2 sectors (ISACs) | S/W fixes | $280K
Transition Prep | Final S/W builds; C&A Support | User/Admin guide | $140K
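As a consistency check, the Phase 1 and Phase 2 cost tables can be summed and compared with the program-level figures of $6M and $7M. The sketch below copies the cost column of each table in thousands of dollars; the dictionary and function names are illustrative.

```python
# Cost column of each Technical Approach table, in thousands of USD.
PHASE_1_COSTS_K = {
    "Concept Development": 300,
    "Requirements Analysis": 450,
    "Design": 900,
    "Development": 1680,
    "Development Testing": 540,
    "Integration & Testing": 1860,
    "Pilot deployment": 120,
    "Transition Prep": 150,
}

PHASE_2_COSTS_K = {
    "Concept Development": 350,
    "Requirements Analysis": 245,
    "Design": 700,
    "Development": 1470,
    "Development Testing": 385,
    "Integration & Testing": 3360,
    "Pilot deployment": 280,
    "Transition Prep": 140,
}


def total_k(costs):
    """Sum a phase's line items, in thousands of USD."""
    return sum(costs.values())


for name, costs in (("Phase 1", PHASE_1_COSTS_K), ("Phase 2", PHASE_2_COSTS_K)):
    print(f"{name}: ${total_k(costs):,}K")
```

Phase 1 sums to exactly $6,000K. Phase 2 sums to $6,930K, which rounds to the $7M shown on the program timeline.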


CASe Phase III

Development: Advance public-private partnerships by engaging a number of committees and councils with missions to improve and secure the banking and financial sector

“To continue to improve the resilience and availability of financial services, the Banking and Finance Sector will work through its public-private partnership to address the evolving nature of threats and the risks posed by the sector’s dependency upon other critical sectors.” – National Infrastructure Protection Plan, Banking and Finance Sector

Financial and Banking Information Infrastructure Committee (FBIIC)

Financial Services Sector Coordinating Council (FSSCC) for Critical Infrastructure Protection and Homeland Security (CIP/HLS)

Financial Services-Information Sharing and Analysis Center (FS-ISAC)

Others?

Sector & Global Dependencies:
•  The Department of the Treasury and the FBIIC have identified four important sector dependencies with the Banking and Financial Sector (BFS):
  1.  Energy
  2.  Information Technology
  3.  Transportation Systems
  4.  Communications
•  The BFS relies on an extensive and complex supply chain, often reaching to providers outside the U.S., including third-party providers.
•  The international nature of financial services markets and the cross-border interdependencies in financial infrastructure require close cooperative relationships with public-private sector organizations in major markets around the world.
•  These relationships will ensure a coordinated approach to financial infrastructure protection around the globe.


End State and Impact

•  Desired end state: Data analytics capability, to include deployment at multiple nodes across the US, will help meet the requirements for increased situational awareness, information sharing and analysis and improved decision support required in PPD-21

PPD 21 Directive: Provide a situational awareness capability that includes integrated, actionable information about emerging trends, imminent threats
Data Analysis for Cybersecurity will:
•  Through nationwide network and public-private partnership, enable consistent collection and analysis of cyber-risk data from a range of government, industry, commercial, and internal sources to gain a more complete understanding of threats, risks and exposures
•  Provide predictive insights into actual conditions within and across multiple IT environments, including insight that can identify anomalous behavior
•  Enable the storage, retrieval and analysis of massive historical data sets in real time to identify anomalous activity

PPD 21 Directive: In conjunction with the SSAs and other Federal Departments and agencies, provide analysis, expertise and other technical assistance to CI owners and operators
Data Analysis for Cybersecurity will:
•  Establish public-private partnership to support increased information sharing between public and private domains
•  Increase analytical capability at ISACs and other public-private nodes through technology and educational/competition component

PPD 21 Directive: Support the Attorney General and law enforcement agencies with their responsibilities to investigate and prosecute threats to and attacks against critical infrastructure
Data Analysis for Cybersecurity will:
•  Provide predictive insights that can better guide allocation of scarce investigative resources
•  Discover non-obvious relationships for both investigative and prosecutorial purposes

PPD 21 Directive: Integration and analysis function to inform planning and operational decisions regarding CI
Data Analysis for Cybersecurity will:
•  Allow modeling & simulation for multiple, complex and cascading emergency response scenarios for interdependent critical infrastructures
•  Provide economic risk and analysis


§  Technical approach

§  Sample analysis of who is addressing which parts of the digital data analysis lifecycle through their respective R&D programs:

§  Note: No agencies are currently examining data analysis for the unclassified cybersecurity domain across multiple steps in the analysis process

Technical Aspects

*Source: Harnessing the Power of Digital Data for Science and Society, January 2009, Interagency Working Group on Digital Data, National Science and Technology Council

Agency Programs Collect Curate Store Search Retrieve Analyse Visualize Share

DARPA - Anomaly Detection at Multiple Scales (ADAMS) X

DARPA - Video Image Retrieval and Analysis Tool (VIRAT) X X

DARPA - Cyber Insider Threat (CINDER) X

DOE - High Performance Storage System (HPSS) X

NASA Earth Observing System Data and Information System (EOSDIS) X X X X

NSF - BIGDATA X X X

CSD - Data Analysis for Cybersecurity X X X X X

Digital Data Life Cycle Steps*


§  Limited data available for analysis for Cybersecurity applications, however:

§  McKinsey Global Institute report found that up to $200B in savings could be realized in the US Health Care Market through implementing big data analytic approaches to develop or improve:

§  Comparative effectiveness research (CER)

§  Clinical decision support system

§  Predictive modeling

§  Clinical trial design

§  McKinsey report also estimated potential operating cost savings of 10-25% in the manufacturing sector by using sensor data–driven operations analytics. This has obvious analogies in cybersecurity, where massive data sets available from the sensor grid of the “Internet of Things” could provide near-real-time indications and warnings

Return on Investment

*Source: Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, June 2011


§  Potential Partnerships

§  Strategic:

§  Agency level partnerships with Department of Commerce (NIST), Department of Justice, and Intelligence Community per Executive Order and PPD-21

§  NITRD chartered Big Data Senior Steering Group. Current focus is climate change modeling, materials genome, and health records

§  National “Big Data Initiative” comprising six Federal departments and agencies committing more than $200 million to big data research projects

§  Tactical/Departmental Level:

•  CBP CISO, NPPD/CS&C/US CERT, DHS CIO, USSS

•  DHS component requirements, as expressed in the “Big Data” Steering Committee report

§  Commercialization plan: The desired end state is a public-private partnership in which the technologies developed and integrated will be used

Success and Transition


§  Acquisition Strategy

§  Type of performer (national lab, academic institution, private sector)

•  Initial efforts with National Laboratories and Center of Excellence schools

•  BAA would be the preferred, long-term solicitation vehicle; LRBAA Topic CSD.17 Open

§  For each program element, plan to release a BAA outlining its research requirements

§  Cost sharing will be in the form of component or customer participation in technology pilots, analytic and design reviews, requirements integration

§  Deliverables will be based on each contract and will vary by program element

Program Management


§  Technical risks to this project

§  Pace of data growth could outstrip traditional R&D timeline (Mitigation: Explore Cyber Fast Track (CFT)-like project; discussions underway between CSD, DHS OPO & DARPA)

§  Analyzing, measuring, and ranking the provenance of massive heterogeneous data sources

§  Policy risks

§  Information sharing policies could inhibit effectiveness of sharing process

§  Privacy concerns around the topic of data gathering and analysis

§  Working in concert with CSD Data Privacy Technology program

Technical & Policy Risks


•  Applies modern informatics and decision-making techniques to issues of cybersecurity and critical infrastructure protection

•  Develops and implements analytics capability for maintaining situational awareness of and managing widespread, catastrophic cyber emergencies

•  Addresses national-level requirements outlined in the recent cybersecurity Executive Order and Presidential Policy Directive

•  Structured as three-phase program, beginning with an agency mission-specific need and ending with a public-private partnership

•  Enables return on investment to be determined at conclusion of each phase

•  Relies on available technologies: a few, basic capabilities with known performance specifications

•  Will be pursued through BAAs

•  Looking for partners


Summary - CASe


•  Cyber Health

•  LINEBACkER

•  Anomaly Detection

•  Reputation-based Security

•  Cyber Resilience

–  Workshop: 17-18 November 2014

–  US-UK Collaboration

–  Book to be published Fall 2015

•  Cyber Identity

–  Workshop: 30 June – 1 July, Rutgers University

–  National Conversation


PART 3: Looking to the Future


•  Theme 1 – Securing Infrastructure from Cyber Disruptions

–  Situational Awareness for Resilient Cyber Infrastructures

–  Architecture and Design for Resilient Systems

–  Understanding the Unique Aspects of the Restoration/Recovery of Infrastructure and Social Networks from Cyber Attacks

•  Theme 2 – Modeling and Measuring Societal Resilience

–  Cyber Threats and the Perception of a Dependent Society

–  Dynamic Topologies of Responsibilities and Governance in a Cyber-disabled Era

–  Cyber Threats in the Context of a Challenged and Changing World Order

•  Theme 3 – Streaming Analytics for Effective Data Exploitation

–  Cyber Decision-making in the Presence of Noisy, Voluminous Data, Using Gold-Standard Analogies

–  Privacy-preserving Information Sharing

–  Modeling, Monitoring, and Recognizing Potentially Dangerous Changes to Cyber Situations

–  Cascading Impact: Multidimensional Analysis of Infrastructural, Societal, and Vulnerability Networks

Cyber Resilience


•  Theme 1 – Identity Proofing in the Era of Social Media and Data Breaches

•  Theme 2 – Provenance for the “Internet of Things”

•  Theme 3 – Metrics for Trust

Participation in the Workshop and National Conversation Town Meeting is Welcome


Cyber Identity


Questions or comments: [email protected]
