bridging the gap between privacy and big data ulf mattsson - protegrity sep 10

52
Bridging the Gap Between Privacy and Big Data Ulf Mattsson, CTO Protegrity ulf.mattsson AT protegrity.com

Upload: ulf-mattsson

Post on 15-Jan-2015

302 views

Category:

Technology


3 download

DESCRIPTION

Big Data systems like Hadoop provide analysis of massive amounts of data to open up “Big Answers”, identifying trends and new business opportunities. The massive scalability and economical storage also provides the opportunity to monetize collected data by selling it to a third party. However, the biggest issue with Big Data remains security. Like any other system, the data must be protected according to regulatory mandates, such as PCI, HIPAA and Privacy laws; from both external and internal threats – including privileged users. So how can we bridge the gap between access to vast amounts of data, and security of more and more types of data, in this rapidly evolving new environment? In this webinar, Ulf Mattsson explores the issues and provide solutions to bring together data insight and security in Big Data. With deep knowledge in advanced data security technologies, Ulf explains the best practices in order to safely unlock the power of Big Data.

TRANSCRIPT

Page 1: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Bridging the Gap Between Privacy and Big Data

Ulf Mattsson, CTO

Protegrity

ulf.mattsson AT protegrity.com

Page 2: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

2

20 years with IBM • Research & Development & Global Services

Inventor • Encryption, Tokenization & Intrusion

Prevention

Member of• PCI Security Standards Council (PCI SSC)

• American National Standards Institute (ANSI) X9

• Encryption & Tokenization

• International Federation for Information Processing

• IFIP WG 11.3 Data and Application Security

Ulf Mattsson, CTO Protegrity

Page 3: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Proven enterprise data security software and innovation leader

• Sole focus on the protection of data

Growth driven by risk management and compliance

• PCI (Payment Card Industry)

• PII (Personally Identifiable Information)

• PHI (Protected Health Information) – HIPAA

• State and Foreign Privacy Laws, Breach Notification Laws

Successful across many key industries

Introduction to Protegrity

3

Page 4: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Agenda

Big Data adoption rate

Security holes and threats to data

Privacy regulations

New data protection techniques

A data protection methodology

Best practices for protecting big data

4

Page 5: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

5

Bridging the Gap Between Privacy and Big Data

Page 6: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

6

DataMonetization

CustomerSupport

CustomerProfiles

Sales &Marketing

SocialMedia

BusinessImprovement

Big Data

Regulations& Breaches Increased

profits

Increased profits

Increased profits

Increased profits

Increased profits

Balancing Security and Data Insight

Page 7: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

BIG DATA ADOPTION AND HOLES

7

Page 8: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

8

Source: Gartner

Has Your Organization Already Invested in Big Data?

Page 9: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

In 2012, Gartner predicted Big Data to be majority adopted in 2 to 5 years.

In 2013, Gartner updated this prediction to 5 to 10 years…

Why has adoption been slower than expected?

Big Data Adoption Statistics

9

Are they investing?

Source: Gartner

Page 10: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Holes in Big Data…

10

Source: Gartner

Page 11: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Balancing Security and Data Insight

Tug of war between security and data insight

Big Data is designed for access

Privacy regulations require de-identification

Granular data-level protection

Traditional security don’t allow for seamless data use

11

Page 12: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Many Ways to Hack Big Data

Source: http://nosql.mypopescu.com/post/1473423255/apache-hadoop-and-hbase

12

HDFS(Hadoop Distributed File System)

MapReduce (Job Scheduling/Execution System)

Hbase (Column DB)

Pig (Data Flow) Hive (SQL) Sqoop

ETL Tools BI Reporting RDBMS

Avr

o (S

eria

lizat

ion)

Zoo

keep

er (

Coo

rdin

atio

n)

Hackers

PrivilegedUsers

UnvettedApplications

OrAd Hoc

Processes

Page 13: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Big Data and The Insider Threat

13

Page 14: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

BALANCING SECURITY AND DATA INSIGHT

14

Page 15: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Reduction of Pain with New Protection Techniques

15

1970 2000 2005 2010

High

Low

Pain& TCO

Strong EncryptionAES, 3DES

Format Preserving EncryptionDTP, FPE

Vault-based Tokenization

Vaultless Tokenization

Input Value: 3872 3789 1620 3675

!@#$%a^.,mhu7///&*B()_+!@

8278 2789 2990 2789

8278 2789 2990 2789

Format Preserving

Greatly reduced Key Management

No Vault

8278 2789 2990 2789

Page 16: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

HOW IS TOKENIZATION BRIDGING THE GAP?

17

Page 17: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

De-Identified Sensitive Data Field Real Data Tokenized / Pseudonymized

Name Joe Smith csu wusoj

Address 100 Main Street, Pleasantville, CA 476 srta coetse, cysieondusbak, CA

Date of Birth 12/25/1966 01/02/1966

Telephone 760-278-3389 760-389-2289

E-Mail Address [email protected] [email protected]

SSN 076-39-2778 076-28-3390

CC Number 3678 2289 3907 3378 3846 2290 3371 3378

Business URL www.surferdude.com www.sheyinctao.com

Fingerprint Encrypted

Photo Encrypted

X-Ray Encrypted

Healthcare Data – Primary Care Data

Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.

Protection methods can be equally applied to the actual healthcare data, but not needed with de-identification

18

Page 18: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Enterprise Flexibility in Token Format ControlsType of Data Input Token Comment

Credit Card 3872 3789 1620 3675 8278 2789 2990 2789 Numeric

Credit Card 3872 3789 1620 3675 3872 3789 2990 3675 Numeric, First 6, Last 4 digits exposed

Credit Card 3872 3789 1620 3675 3872 qN4e 5yPx 3675 Alpha-Numeric, Digits exposed

Account Num 29M2009ID 497HF390D Alpha-Numeric

Date 10/30/1955 12/25/2034 Date - multiple date formats

E-mail Address [email protected] [email protected] Numeric, delimiters in input preserved

SSN 075672278 or 075-67-2278 287382567 or 287-38-2567 Numeric, delimiters in input

Binary 0x010203 0x123296910112

Decimal 123.45 9842.56 Non length preserving

Alphanumeric Indicator

5105 1051 0510 5100 8278 2789 299A 2781 Position to place alpha is configurable

Invalid Luhn 5105 1051 0510 5100 8278 2789 2990 2782 Luhn check will fail

Multi-Merchant or Multi-Client ID

3872 3789 1620 3675ID 1: 8278 2789 2990 2789ID 2: 9302 8999 2662 6345

This supports delivery of a different token to different merchant or clients based on the same credit card number.

19

Page 19: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Research Brief

“Tokenization Gets Traction”Aberdeen has seen a steady increase in enterprise use of tokenization

Alternative to encryption

47% using tokenization for something other than cardholder data (PII and PHI)

Tokenization users had 50% fewer security-related incidents

20 Author: Derek Brink, VP and Research Fellow, IT Security and IT GRC

Page 20: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Protection Granularity: Encryption of Fields

21

Production SystemsEncryption • Reversible• Policy Control (authorized / Unauthorized Access)• Lacks Integration Transparency• Complex Key Management• Example !@#$%a^.,mhu7///&*B()_+!@

Page 21: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Protection Granularity: Encryption vs. Masking

22

Production Systems

Non-Production Systems

Encryption of fields• Reversible• Policy Control (authorized / Unauthorized Access)• Lacks Integration Transparency• Complex Key Management• Example

Masking of fields• Not reversible• No Policy, Everyone can access the data• Integrates Transparently• No Complex Key Management• Example 0389 3778 3652 0038

!@#$%a^.,mhu7///&*B()_+!@

Page 22: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Protection Granularity: Tokenization of Fields

23

Production Systems

Non-Production Systems

Vaultless Tokenization / Pseudonymization

• No Complex Key Management• Business Intelligence

• Reversible • Policy Control (Authorized / Unauthorized Access)

• Not Reversible• Integrates Transparently

Credit Card: 0389 3778 3652 0038

Page 23: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

24

1. Classify 2. Discovery 3. Protect 4. Enforce 5. Monitor

Five Point Data Protection Methodology

Page 24: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

25

1Classify

Determine what data is sensitive to your organization.

Page 25: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Financial Services

Healthcare and Pharmaceuticals

Infrastructure and Energy

Federal Government

Select US Regulations for Security and Privacy

26

Page 26: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Privacy Laws

54 International Privacy Laws

30 United States Privacy Laws

27

Page 27: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

1. Classify: Examples of Sensitive Data

28

Sensitive Information Compliance Regulation / Laws

Credit Card Numbers PCI DSS

Names HIPAA, State Privacy Laws

Address HIPAA, State Privacy Laws

Dates HIPAA, State Privacy Laws

Phone Numbers HIPAA, State Privacy Laws

Personal ID Numbers HIPAA, State Privacy Laws

Personally owned property numbers HIPAA, State Privacy Laws

Personal Characteristics HIPAA, State Privacy Laws

Asset Information HIPAA, State Privacy Laws

Page 28: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

PCI DSS Build and maintain a secure network.

1. Install and maintain a firewall configuration to protect data

2. Do not use vendor-supplied defaults for system passwords and other security parameters

Protect cardholder data. 3. Protect stored data4. Encrypt transmission of cardholder data and

sensitive information across public networks

Maintain a vulnerability management program.

5. Use and regularly update anti-virus software6. Develop and maintain secure systems and

applications

Implement strong access control measures.

7. Restrict access to data by business need-to-know

8. Assign a unique ID to each person with computer access

9. Restrict physical access to cardholder data

Regularly monitor and test networks.

10. Track and monitor all access to network resources and cardholder data

11. Regularly test security systems and processes

Maintain an information security policy.

12. Maintain a policy that addresses information security

29

Page 29: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Protection of cardholder data in memory

Clarification of key management dual control and split knowledge

Recommendations on making PCI DSS business-as-usual and best practices

Security policy and operational procedures added

Increased password strength

New requirements for point-of-sale terminal security

More robust requirements for penetration testing

PCI DSS 3.0

30

Page 30: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

HIPAA PHI: List of 18 Identifiers

1. Names

2. All geographical subdivisions smaller than a State

3. All elements of dates (except year) related to individual

4. Phone numbers

5. Fax numbers

6. Electronic mail addresses

7. Social Security numbers

8. Medical record numbers

9. Health plan beneficiary numbers

10. Account numbers

31

11. Certificate/license numbers

12. Vehicle identifiers and serial numbers

13. Device identifiers and serial numbers

14. Web Universal Resource Locators (URLs)

15. Internet Protocol (IP) address numbers

16. Biometric identifiers, including finger prints

17. Full face photographic images

18. Any other unique identifying number

Page 31: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Make business associates of covered entities directly liable for compliance

Strengthen the limitations on the use and disclosure of protected health information for marketing purposes

Prohibit the sale of protected health information without individual authorization.

Expand individuals’ rights to receive electronic copies of their health information

HIPAA Omnibus Final Rule – Some Examples

32

Page 32: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

33

2Discovery

Discover where the sensitive data is located and how it flows

Page 33: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

2. Discovery in a large enterprise with many systems

034

System

System

Corporate Firewall

System

System

System

System

System

034

System

System

System System

System

System

Page 34: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

2. Discovery: Determine the context to the Business

035

System

System

Corporate Firewall

System

System

035

System

System

System

Retail

Employees

Corporate IP

Healthcare

Page 35: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

2. Discover: Context to the Business and to Security

036

Stores & Ecommerce

Corporate Firewall

Applications

Research Databases

Data Protection Solution Requirements

File Server containing IP

File Server

Databases

Hadoop

Collecting transactions

Page 36: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

37

3Protect

Protect the sensitive data at rest and in transit.

Page 37: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Protection Beyond Kerberos

38

Volume Encryption

Field level data protection for existing and new data.

API enabled Field level data protection

API enabled Field level data protection

HDFS

(Hadoop Distributed File System)

MapReduce (Job Scheduling/Execution System)

Hbase (Column DB)

Pig (Data Flow) Hive (SQL) Sqoop

ETL Tools BI Reporting RDBMS

Page 38: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Volume Encryption

39

HDFS

MapReduce

Protected with Volume Encryption

Entire file is in the clear when analyzed

Page 39: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

File Encryption – Authorized User

40

HDFS

MapReduce

Protected with File Encryption

Entire file is in the clear when analyzed

Page 40: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

File Encryption – Non Authorized User

41

HDFS

MapReduce

Protected with File Encryption

Entire file is in unreadable when analyzed

Page 41: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Volume Encryption + Gateway Field Protection

42

HDFS

MapReduce

Kerberos AccessControl

Granular Field Level Protection

Protected with Volume Encryption

Data Protection File Gateway

Page 42: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Volume Encryption + Internal MapReduce Field Protection

43

HDFS

MapReduce

Kerberos Access Control

Granular Field Level Protection

Protected with Volume Encryption

Hadoop Staging

MapReduce

Analytics

Page 43: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

44

4Enforce

Policies are used to enforce rules about how sensitive data should be treated in the enterprise.

Page 44: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

A Data Security Policy

45

What is the sensitive data that needs to be protected. Data Element.

How you want to protect and present sensitive data. There are several methods for protecting sensitive data. Encryption, tokenization, monitoring, etc.

Who should have access to sensitive data and who should not. Security access control. Roles & Members.

When should sensitive data access be granted to those who have access. Day of week, time of day.

Where is the sensitive data stored? This will be where the policy is enforced. At the protector.

Audit authorized or un-authorized access to sensitive data. Optional audit of protect/unprotect.

What

Who

When

Where

How

Audit

Page 45: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Volume Encryption + Field Protection + Policy Enforcement

46

HDFSProtected with

Volume Encryption

MapReduce

Data Protection Policy

Page 46: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Volume Encryption + Field Protection + Policy Enforcement

47

HDFSProtected with

Volume Encryption

MapReduce

Data Protection Policy

Page 47: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

4. Authorized User Example

48

Presentation to requestorName: Joe SmithAddress: 100 Main Street, Pleasantville, CA

Selected data displayed (least privilege)

Protection at restName: csu wusojAddress: 476 srta coetse, cysieondusbak, CA

Data Scientist, Business Analyst

PolicyEnforcement

Does the requestor have the authority to access the protected data?

Authorized

Request

Response

Page 48: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

4. Un-Authorized User Example

49

Request

Response

Presentation to requestorName: csu wusojAddress: 476 srta coetse, cysieondusbak, CA

Protection at restName: csu wusojAddress: 476 srta coetse, cysieondusbak, CA

Privileged Used, DBA, System Administrators, Bad Guy

PolicyEnforcement

Does the requestor have the authority to access the protected data?

Not Authorized

Page 49: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

50

5Monitor

A critically important part of a security solution is the ongoing monitoring of any activity on sensitive data.

Page 50: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

Best Practices for Protecting Big Data

Start early

Granular protection

Select the optimal protection

Enterprise coverage

Protection against insider threat

Protect highly sensitive data in a way that is mostly transparent to the analysis process

Policy based protection

Record data access events

51

Page 51: Bridging the gap between privacy and big data   Ulf Mattsson - Protegrity Sep 10

How Protegrity Can Help

52

1

2

3

4

5

We can help you Classify the sensitive data

We can help you Discover where the sensitive data sits

We can help you Protect your sensitive data in a flexible way

We can help you Enforce policies that will enable business functions and preventing sensitive data from the wrong hands.

We can help you Monitor sensitive data to gain insights on abnormal behaviors.