isaca new delhi india privacy and big data

Bridging the Gap Between Privacy and Big Data Ulf Mattsson, CTO Protegrity ulf.mattsson AT protegrity.com

Upload: ulf-mattsson

Post on 12-May-2015

DESCRIPTION

big data,cloud,isaca,out,privacy laws

TRANSCRIPT

Page 1: Isaca new delhi india   privacy and big data

Bridging the Gap Between Privacy and Big Data

Ulf Mattsson , CTO

Protegrity

ulf.mattsson AT protegrity.com

Page 2: Isaca new delhi india   privacy and big data

Ulf Mattsson, CTO Protegrity

20 years with IBM

• Research & Development & Global Services

Inventor

• Encryption, Tokenization & Intrusion Prevention

Involvement

• PCI Security Standards Council (PCI SSC)

• American National Standards Institute (ANSI) X9: Encryption & Tokenization

• International Federation for Information Processing: IFIP WG 11.3 Data and Application Security

• ISACA New York Metro chapter

Page 3: Isaca new delhi india   privacy and big data

3

Page 4: Isaca new delhi india   privacy and big data

Agenda

1. What is Big Data & Cloud?

2. Risk & Drivers for Data Security

3. The Evolution of Data Security Methods

4. Data De-Identification

5. Off-Shoring & Outsourcing

6. Use Cases & Case Studies

4

Page 5: Isaca new delhi india   privacy and big data

Who is Protegrity?

Proven enterprise data protection software leader since the 90’s.

Business driven by compliance

• PCI (Payment Card Industry)

• PII (Personally Identifiable Information)

• PHI (Protected Health Information) – HIPAA

• State and Industry Privacy Laws

Servicing many Industries

• Retail, Hospitality, Travel and Transportation

• Financial Services, Insurance, Banking

• Healthcare

• Telecommunications, Media and Entertainment

• Manufacturing and Government

Page 6: Isaca new delhi india   privacy and big data

Big Data

Page 7: Isaca new delhi india   privacy and big data

What is Big Data?

Hadoop

• Designed to handle the emerging “4 V’s”

• Massively Parallel Processing (MPP)

• Elastic scale

• Usually Read-Only

• Allows for data insights on massive, heterogeneous data sets

• Includes an ecosystem of components:

Application Layers: Hive, Pig, Other, MapReduce

Storage Layers: HDFS, Physical Storage

Page 8: Isaca new delhi india   privacy and big data

Has Your Organization Already Invested in Big Data?

8

Source: Gartner

Page 9: Isaca new delhi india   privacy and big data

Cloud

9

Page 10: Isaca new delhi india   privacy and big data

Cloud Services

Services usually provided by a third party

• Can be virtual, public, private, or hybrid

Increasing adoption – up 12% from 2012*

Often an outsourced solution, sometimes cross-border

Allows for greater accessibility of data and low overhead

*Source: GigaOM

Page 11: Isaca new delhi india   privacy and big data

Cloud Services and Models

Source: NIST, CSA

Page 12: Isaca new delhi india   privacy and big data

Drivers for Data Security

Page 13: Isaca new delhi india   privacy and big data

Drivers for Data Security

Regulations & Laws

• Payment Card Industry Data Security Standard (PCI DSS)

• National Privacy Laws

• Cross-Border & Outsourcing Privacy Laws

Expanding Threat Landscape

• Hackers & APT

• Internal Threats & Rogue Privileged Users

• Excessive Privilege or Security Negligence

Sensitive Data Insight & Usability

• Unprotected Sensitive or Restricted Data is Unusable for Marketing, Monetization, Outsourcing, etc.

Vulnerabilities in Emerging Technologies

Page 14: Isaca new delhi india   privacy and big data

Regulations & Laws

PCI DSS

Page 15: Isaca new delhi india   privacy and big data

PCI Security Standards Council

Founded in 2006 by five major payment card brands (American Express, Discover, JCB, MasterCard and Visa)

Each card brand's enforcement program issues fines and fees and schedules deadlines

• Visa's Cardholder Information Security Program (CISP): http://www.visa.com/cisp

• MasterCard's Site Data Protection (SDP) program: http://www.mastercard.com/us/sdp/index.html

• Discover's Discover Information Security and Compliance (DISC) program: http://www.discovernetwork.com/fraudsecurity/disc.html

• American Express Data Security Operating Policy (DSOP): http://www.americanexpress.com/datasecurity

Page 16: Isaca new delhi india   privacy and big data

PCI DSS

Build and maintain a secure network.

1. Install and maintain a firewall configuration to protect data

2. Do not use vendor-supplied defaults for system passwords and other security parameters

Protect cardholder data.

3. Protect stored data

4. Encrypt transmission of cardholder data and sensitive information across public networks

Maintain a vulnerability management program.

5. Use and regularly update anti-virus software

6. Develop and maintain secure systems and applications

Implement strong access control measures.

7. Restrict access to data by business need-to-know

8. Assign a unique ID to each person with computer access

9. Restrict physical access to cardholder data

Regularly monitor and test networks.

10. Track and monitor all access to network resources and cardholder data

11. Regularly test security systems and processes

Maintain an information security policy.

12. Maintain a policy that addresses information security

Page 17: Isaca new delhi india   privacy and big data

PCI DSS 3.0

Protection of cardholder data in memory

Clarification of key management dual control and split knowledge

Recommendations on making PCI DSS business-as-usual and best practices

Security policy and operational procedures added

Increased password strength

New requirements for point-of-sale terminal security

More robust requirements for penetration testing

Page 18: Isaca new delhi india   privacy and big data

PCI DSS Cloud Guidelines

Relevant to all sensitive data that is outsourced to the cloud

1. Clients retain responsibility for the data they put in the cloud

2. Public-cloud providers often have multiple data centers, which may often be in multiple countries or regions

3. The client may not know the location of their data, or the data may exist in one or more of several locations at any particular time

4. A client may have little or no visibility into the controls

5. In a public-cloud environment, one client’s data is typically stored with data belonging to multiple other clients. This makes a public cloud an attractive target for attackers

Page 19: Isaca new delhi india   privacy and big data

Regulations & Laws

National Privacy Laws

Page 20: Isaca new delhi india   privacy and big data

National Privacy Laws - USA

Health Insurance Portability and Accountability Act – HIPAA

1. Names

2. All geographical subdivisions smaller than a State

3. All elements of dates (except year) related to an individual

4. Phone numbers

5. Fax numbers

6. Electronic mail addresses

7. Social Security numbers

8. Medical record numbers

9. Health plan beneficiary numbers

10. Account numbers

11. Certificate/license numbers

12. Vehicle identifiers and serial numbers

13. Device identifiers and serial numbers

14. Web Universal Resource Locators (URLs)

15. Internet Protocol (IP) address numbers

16. Biometric identifiers, including fingerprints

17. Full face photographic images

18. Any other unique identifying number
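
As a rough illustration of how the identifier list above can drive de-identification, the sketch below drops direct identifiers from a record, keeps only the state from an address, and keeps only the year from any date. The field names, the shape of the address value, and the helper name remove_identifiers are assumptions made for this example; it is a simplified reading of the list, not a complete compliance implementation.

```python
from datetime import date

# Hypothetical field names for illustration; map them to your own schema.
DIRECT_IDENTIFIERS = {
    "name", "phone", "fax", "email", "ssn", "medical_record_number",
    "health_plan_number", "account_number", "license_number",
    "vehicle_id", "device_id", "url", "ip_address", "biometric_id",
    "photo", "other_unique_id",
}


def remove_identifiers(record: dict) -> dict:
    """Return a copy of the record with the listed identifiers removed."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # identifiers 1 and 4-18: drop the field entirely
        if field == "address":
            # identifier 2: nothing smaller than a State survives
            out["state"] = value["state"]
        elif isinstance(value, date):
            # identifier 3: keep only the year of any date
            out[field + "_year"] = value.year
        else:
            out[field] = value  # non-identifying payload is kept as-is
    return out


sample = {
    "name": "Joe Smith",
    "address": {"street": "100 Main Street", "city": "Pleasantville", "state": "CA"},
    "date_of_birth": date(1966, 12, 25),
    "ssn": "076-39-2778",
    "diagnosis": "fracture",
}
deidentified = remove_identifiers(sample)
```

Removal is the bluntest option; later slides cover masking and tokenization, which keep a usable value in place of the identifier.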

Page 21: Isaca new delhi india   privacy and big data

Privacy Laws

54 International Privacy Laws

30 United States Privacy Laws

21

Page 22: Isaca new delhi india   privacy and big data

National Privacy Laws - India

Information Technology Act – 2000 (IT Act)

• Requires that the corporate body and Data Processor implement reasonable security practices and standards

• IS/ISO/IEC 27001 requirements recognized

Information Technology Act – 2008 (Amended IT Act)

• Damages for negligence and wrongful gain or loss

• Criminal punishment for disclosing Sensitive Personal Information (SPI)

India Privacy Law – 2011

• Expanded definition of SPI to passwords, financial data, health data, medical treatment records, and more

Right to Privacy Bill – 2013 (Proposed)

• Increased jail terms & fines for disclosure of SPI

• Addresses data handled for foreign clients

Page 23: Isaca new delhi india   privacy and big data

Regulations & Laws

Cross-Border & Outsourcing Laws

23

Page 24: Isaca new delhi india   privacy and big data

Cross-Border & Outsourcing Laws

The laws of the sending country apply to data sent across international borders, including outsourced operations

• i.e. National Privacy Laws

APEC Cross-Border Privacy Laws

• Non-binding privacy enforcement in Asia-Pacific region

Page 25: Isaca new delhi india   privacy and big data

Expanding Threat Landscape

Page 26: Isaca new delhi india   privacy and big data

26

Page 27: Isaca new delhi india   privacy and big data

Cyber Criminals Cost India USD 4 Billion

27

Source: Symantec 2013

Page 28: Isaca new delhi india   privacy and big data

28

Page 29: Isaca new delhi india   privacy and big data

29

http://www.ey.com/Publication/vwLUAssets/EY_-_2013_Global_Information_Security_Survey/$FILE/EY-GISS-Under-cyber-attack.pdf

Page 30: Isaca new delhi india   privacy and big data

Sensitive Data Insight & Usability

Page 31: Isaca new delhi india   privacy and big data

Vulnerabilities in Emerging Technologies

Page 32: Isaca new delhi india   privacy and big data

Holes in Big Data…

32

Source: Gartner

Page 33: Isaca new delhi india   privacy and big data

Many Ways to Hack Big Data

[Diagram: the Hadoop stack and its points of attack. Components shown: ETL Tools, BI Reporting, RDBMS; Pig (Data Flow), Hive (SQL), Sqoop; MapReduce (Job Scheduling/Execution System); HBase (Column DB); HDFS (Hadoop Distributed File System); Avro (Serialization); ZooKeeper (Coordination). Threats shown: Hackers, Unvetted Applications or Ad Hoc Processes, Privileged Users.]

Source: http://nosql.mypopescu.com/post/1473423255/apache-hadoop-and-hbase

Page 34: Isaca new delhi india   privacy and big data

The Insider Threat

34

Page 35: Isaca new delhi india   privacy and big data

Sensitive Data Insight & Usability

Big Data and Cloud environments are designed for access and deep insight into vast data pools

Data can be monetized not only by marketing analytics, but also through sale or use by a third party

The more accessible and usable the data is, the greater this ROI benefit can be

Security concerns and regulations are often viewed as opponents of data insight

Page 36: Isaca new delhi india   privacy and big data

Big Data Vulnerabilities and Concerns

Big Data (Hadoop) was designed for data access, not security

Security in a read-only environment introduces new challenges

Massive scalability and performance requirements

Sensitive data regulations create a barrier to usability, as data cannot be stored or transferred in the clear

Transparency and data insight are required for ROI on Big Data

Page 37: Isaca new delhi india   privacy and big data

Cloud Vulnerabilities and Concerns

Public cloud security is often not visible to the client, but the client is still responsible for security

Greater access to shared data sets by more users creates additional points of vulnerability

Data redundancy for high availability, often across multiple data centers, increases vulnerability

Virtualization can create numerous security issues

Transparency and data insight are required for ROI

How do you lock this?

Page 38: Isaca new delhi india   privacy and big data

Security Improving but We Are Losing Ground

38

Page 39: Isaca new delhi india   privacy and big data

Breach Discovery Methods

39

Verizon 2013 Data-breach-investigations-report

Page 40: Isaca new delhi india   privacy and big data

The Evolution of Data Security Methods

Page 41: Isaca new delhi india   privacy and big data

Evolution of Data Security Methods

[Timeline: methods ordered along a Time axis]

Coarse Grained Security

• Access Controls

• Volume Encryption

• File Encryption

Fine Grained Security

• Access Controls

• Field Encryption (AES & )

• Masking

• Tokenization

• Vaultless Tokenization

Page 42: Isaca new delhi india   privacy and big data

Use of Enabling Technologies

[Bar chart of seven technologies: Access controls, Database activity monitoring, Database encryption, Backup / Archive encryption, Data masking, Application-level encryption, Tokenization. One series is labeled "Evaluating". Bar values range from 1% to 91%.]

Page 43: Isaca new delhi india   privacy and big data

Access Control

Old and flawed: Minimal access levels so people can only carry out their jobs

[Chart: Risk (Low to High) plotted against Access Privilege Level (Low to High)]

Page 44: Isaca new delhi india   privacy and big data


Page 45: Isaca new delhi india   privacy and big data

Applying the protection profile to the content of data fields allows for a wider range of authority options

Page 46: Isaca new delhi india   privacy and big data

How the New Approach is Different

Old: Minimal access levels – Least Privilege to avoid high risks

New: Much greater flexibility and lower risk in data accessibility

[Chart: Risk (Low to High) plotted against Access Privilege Level (Low to High)]

Page 47: Isaca new delhi india   privacy and big data

Reduction of Pain with New Protection Techniques

[Chart: Pain & TCO (High to Low) over time (1970, 2000, 2005, 2010)]

Input Value: 3872 3789 1620 3675

Strong Encryption (AES, 3DES). Output: !@#$%a^.,mhu7///&*B()_+!@

Format Preserving Encryption (DTP, FPE). Output: 8278 2789 2990 2789

Vault-based Tokenization. Output: 8278 2789 2990 2789

Vaultless Tokenization. Output: 8278 2789 2990 2789

Callouts: Format Preserving; Greatly reduced Key Management; No Vault

Page 48: Isaca new delhi india   privacy and big data

Fine Grained Security: Encryption of Fields

Production Systems

Encryption of fields

• Reversible

• Policy Control (Authorized / Unauthorized Access)

• Lacks Integration Transparency

• Complex Key Management

• Example: !@#$%a^.,mhu7///&*B()_+!@

Non-Production Systems

Page 49: Isaca new delhi india   privacy and big data

Fine Grained Security: Masking of Fields

Production Systems

Non-Production Systems

Masking of fields

• Not reversible

• No Policy, Everyone can access the data

• Integrates Transparently

• No Complex Key Management

• Example: 0389 3778 3652 0038
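
A minimal sketch of the masking properties listed on this slide, assuming the simplest possible approach: every digit is replaced by a random digit and nothing is stored, so the result keeps the original format but cannot be reversed and needs no key. The function name mask_digits is hypothetical, and real masking products typically offer more options (for example, keeping the last four digits).

```python
import secrets

DIGITS = "0123456789"


def mask_digits(value: str) -> str:
    """Replace each digit with a random digit; keep spaces and dashes.

    No mapping from original to masked value is kept, so the operation
    is not reversible and there is no key to manage.
    """
    return "".join(
        secrets.choice(DIGITS) if ch.isdigit() else ch
        for ch in value
    )


masked = mask_digits("3872 3789 1620 3675")
# The output varies per run but always has the form "dddd dddd dddd dddd".
```

Because the output has the same length and layout as the input, test and development applications can usually consume it without changes.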

Page 50: Isaca new delhi india   privacy and big data

Fine Grained Security: Tokenization of Fields

Tokenization (Pseudonymization)

• No Complex Key Management

• Business Intelligence

• Example: 0389 3778 3652 0038

Production Systems

• Reversible

• Policy Control (Authorized / Unauthorized Access)

Non-Production Systems

• Not Reversible

• Integrates Transparently
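
The production-side properties above (reversible, with policy control) can be sketched with a basic vault-based tokenizer. This is an illustrative toy, not any vendor's implementation: the class name TokenVault, the in-memory dictionaries, and the boolean authorized flag are all assumptions standing in for a real token store and a real policy engine.

```python
import secrets

DIGITS = "0123456789"


class TokenVault:
    """Toy vault: a lookup table between clear values and random tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        if not any(ch.isdigit() for ch in value):
            raise ValueError("this sketch only tokenizes values containing digits")
        if value in self._token_by_value:
            return self._token_by_value[value]  # same value, same token
        while True:
            # Random digits in place of the original ones; format is preserved.
            token = "".join(
                secrets.choice(DIGITS) if ch.isdigit() else ch for ch in value
            )
            # Retry on a collision with an existing token or the clear value.
            if token != value and token not in self._value_by_token:
                break
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str, authorized: bool) -> str:
        # Stand-in for policy control: only authorized callers see clear text.
        if not authorized:
            raise PermissionError("caller is not authorized to see clear-text data")
        return self._value_by_token[token]


vault = TokenVault()
card = "3872 3789 1620 3675"
token = vault.tokenize(card)
```

Note that the vault grows with every new value and must be consulted on every lookup, which is the footprint and replication cost that the vault-based versus vaultless comparison refers to.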

Page 51: Isaca new delhi india   privacy and big data

Fine Grained Data Security Methods

Tokenization and Encryption are Different

Encryption (Cipher System) uses: Cryptographic algorithms, Cryptographic keys

Tokenization (Code System) uses: Code books, Index tokens

Source: McGraw-Hill Encyclopedia of Science & Technology

Page 52: Isaca new delhi india   privacy and big data

Fine Grained Data Security Methods

Vault-based vs. Vaultless Tokenization

Criterion | Vault-based Tokenization | Vaultless Tokenization

Footprint | Large, Expanding. | Small, Static.

High Availability, Disaster Recovery | Complex, expensive replication required. | No replication required.

Distribution | Practically impossible to distribute geographically. | Easy to deploy at different geographically distributed locations.

Reliability | Prone to collisions. | No collisions.

Performance, Latency, and Scalability | Will adversely impact performance & scalability. | Little or no latency. Fastest industry tokenization.
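
To make the "small, static footprint" idea concrete, here is a toy vaultless scheme, written under loud assumptions: it is not Protegrity's algorithm and it is not secure. It builds one fixed substitution table from a secret seed and maps each 4-digit block through it, chaining blocks so that identical blocks in different contexts yield different tokens. The names build_tables, tokenize and detokenize are hypothetical, and Python's random module is used only for readability; it is not a cryptographic generator.

```python
import random

BLOCK_DIGITS = 4


def build_tables(secret_seed: str):
    """Build a fixed random permutation of all 4-digit blocks and its inverse.

    The tables never grow, whatever the number of values tokenized.
    """
    rng = random.Random(secret_seed)  # NOT cryptographically secure
    size = 10 ** BLOCK_DIGITS
    forward = list(range(size))
    rng.shuffle(forward)
    inverse = [0] * size
    for plain, tok in enumerate(forward):
        inverse[tok] = plain
    return forward, inverse


def tokenize(digits: str, forward) -> str:
    assert digits.isdigit() and len(digits) % BLOCK_DIGITS == 0
    size, prev, out = len(forward), 0, []
    for i in range(0, len(digits), BLOCK_DIGITS):
        block = int(digits[i:i + BLOCK_DIGITS])
        tok = forward[(block + prev) % size]  # chain on the previous token block
        out.append(f"{tok:0{BLOCK_DIGITS}d}")
        prev = tok
    return "".join(out)


def detokenize(token: str, inverse) -> str:
    assert token.isdigit() and len(token) % BLOCK_DIGITS == 0
    size, prev, out = len(inverse), 0, []
    for i in range(0, len(token), BLOCK_DIGITS):
        tok = int(token[i:i + BLOCK_DIGITS])
        block = (inverse[tok] - prev) % size  # undo the chaining step
        out.append(f"{block:0{BLOCK_DIGITS}d}")
        prev = tok
    return "".join(out)


forward, inverse = build_tables("demo-secret")
pan = "3872378916203675"
pan_token = tokenize(pan, forward)
```

Because the mapping is a permutation, two different inputs can never produce the same token, and any site that holds the same tables can tokenize independently without replicating a vault.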

Page 53: Isaca new delhi india   privacy and big data

The Future of Tokenization

PCI DSS 3.0

• Split knowledge and dual control

PCI SSC Tokenization Task Force

• Tokenization and use of HSM

Card Brands – Visa, MC, AMEX …

• Tokens with control vectors

ANSI X9

• Tokenization and use of HSM

Page 54: Isaca new delhi india   privacy and big data

Security of Different Protection Methods

[Chart: Security Level (Low to High) for Format Preserving Encryption, Vaultless Data Tokenization, AES CBC Encryption Standard, and Basic Data Tokenization]

Page 55: Isaca new delhi india   privacy and big data

Speed of Different Protection Methods

[Chart: Transactions per second* (logarithmic axis from 100 to 10 000 000) for Format Preserving Encryption, Vaultless Data Tokenization, AES CBC Encryption Standard, and Vault-based Data Tokenization]

*: Speed will depend on the configuration

Page 56: Isaca new delhi india   privacy and big data

Risk Adjusted Data Protection

There is always a trade-off between security and usability.

[Table: Data Security Methods rated from Worst to Best on Performance, Storage, Security, and Transparency (ratings shown graphically). Methods: System without data protection; Monitoring + Blocking + Obfuscation; Data Type Preservation Encryption; Strong Encryption; Vaultless Tokenization; Hashing; Anonymisation.]

Page 57: Isaca new delhi india   privacy and big data

Data De-Identification

Page 58: Isaca new delhi india   privacy and big data

What is de-identification of identifiable data?

The solution to protecting identifiable data is to properly de-identify it.

Personally Identifiable Information | Health Information / Financial Information

Redact the information – remove it.

The identifiable portion of the record is de-identified with any number of protection methods, such as masking, tokenization, encryption, or redaction (removal).

The method used will depend on your use case and the reason that you are de-identifying the data.

Page 59: Isaca new delhi india   privacy and big data

Identifiable Sensitive Information

Field | Real Data | Tokenized / Pseudonymized

Name | Joe Smith | csu wusoj

Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA

Date of Birth | 12/25/1966 | 01/02/1966

Telephone | 760-278-3389 | 760-389-2289

E-Mail Address | [email protected] | [email protected]

SSN | 076-39-2778 | 937-28-3390

CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378

Business URL | www.surferdude.com | www.sheyinctao.com

Fingerprint | | Encrypted

Photo | | Encrypted

X-Ray | | Encrypted

Healthcare / Financial Services

Healthcare: Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.

Financial Services: Consumer products and activities

Protection methods can be equally applied to the actual healthcare data, but are not needed with de-identification

Page 60: Isaca new delhi india   privacy and big data

De-Identified Sensitive Data

Field | Real Data | Tokenized / Pseudonymized

Name | Joe Smith | csu wusoj

Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA

Date of Birth | 12/25/1966 | 01/02/1966

Telephone | 760-278-3389 | 760-389-2289

E-Mail Address | [email protected] | [email protected]

SSN | 076-39-2778 | 076-28-3390

CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378

Business URL | www.surferdude.com | www.sheyinctao.com

Fingerprint | | Encrypted

Photo | | Encrypted

X-Ray | | Encrypted

Healthcare / Financial Services

Healthcare: Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.

Financial Services: Consumer products and activities

Protection methods can be equally applied to the actual data, but are not needed with de-identification

Page 61: Isaca new delhi india   privacy and big data

How Should I Secure Different Data?

[Chart: Use Case (Simple to Complex) plotted against Type of Data (Structured to Unstructured)]

Use cases

• Simple – PCI: Card Holder Data

• PII: Personally Identifiable Information

• Complex – PHI: Protected Health Information

Methods by type of data

• Structured: Tokenization of Fields

• Unstructured: Encryption of Files

Page 62: Isaca new delhi india   privacy and big data

Research Brief

Tokenization Gets Traction

Aberdeen has seen a steady increase in enterprise use of tokenization, rather than encryption, for protecting sensitive data

Nearly half of the respondents (47%) are currently using tokenization for something other than cardholder data

Over the last 12 months, tokenization users had 50% fewer security-related incidents than tokenization non-users

Author: Derek Brink, VP and Research Fellow, IT Security and IT GRC

Page 63: Isaca new delhi india   privacy and big data

Vaultless Tokenization & Data Insight

The business intelligence exposed through Vaultless Tokenization can allow many users and processes to perform job functions on protected data

Extreme flexibility in data de-identification can allow responsible data monetization

Data remains secure throughout data flows, and can maintain a one-to-one relationship with the original data for analytic processes
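
The one-to-one relationship is what lets analytics run on protected data: if the same input always yields the same protected value, records can still be grouped and joined. The sketch below demonstrates that property with a keyed hash (HMAC-SHA256) as a stand-in, not with vaultless tokenization itself; unlike a token, the keyed hash cannot be reversed. The function names, the sample rows, and the truncation to 16 hex characters are assumptions for illustration.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonym(value: str, key: bytes) -> str:
    """Deterministic keyed pseudonym: same value and key give the same output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability; collisions remain very unlikely


def protect(rows, key: bytes):
    """Replace the identifier in each (identifier, amount) row before analysis."""
    return [(pseudonym(identifier, key), amount) for identifier, amount in rows]


def totals_by_pseudonym(protected_rows):
    """Aggregate on protected data only; no clear-text identifier is needed."""
    totals = defaultdict(int)
    for token, amount in protected_rows:
        totals[token] += amount
    return dict(totals)


key = b"demo-key-not-for-production"
rows = [("076-39-2778", 20), ("937-28-3390", 5), ("076-39-2778", 30)]
totals = totals_by_pseudonym(protect(rows, key))
```

The analyst sees two distinct customers and their totals, but never the underlying identifiers.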

Page 64: Isaca new delhi india   privacy and big data

Use Cases for Coarse & Fine Grained Security

Page 65: Isaca new delhi india   privacy and big data

Off-shoring & Outsourcing

Page 66: Isaca new delhi india   privacy and big data

Privacy Impacts BPO & Offshore Business Solutions

Business Process Outsourcing (BPO)

• Business Processes: e.g. Loans, Mortgages, Call Centre, Claims Processing, ERP, etc.

• Application Development: need to de-identify data for testing and development

Off-Shoring

• Same as outsourcing, but data is sent off-shore for business functions (such as call centers).

Laws governing your ability to send real data to 3rd parties are already restrictive, and becoming more so

Penalties for infringement are growing more severe

Risk of data breaches and data theft is increased

Page 67: Isaca new delhi india   privacy and big data

Examples

Major Bank in EU wants to centralise EDW operations in a single country and therefore send customer data from country A to country B. Privacy Laws in country A prohibit this.

Private Bank in Europe wants to offshore Finance Operations. Privacy Law prohibits transfer of citizen data to India.

Retail Bank in Scandinavia wants to offshore Customer Services. Privacy law prevents transfer of citizen data to the Far East.

Page 68: Isaca new delhi india   privacy and big data

Case Studies

Page 69: Isaca new delhi india   privacy and big data

Protegrity Use Case: UniCredit

CHALLENGES The primary challenge was to protect PII – names and addresses, phone and email, policy and account numbers, birth dates, etc. – to the satisfaction of EU Cross Border Data Security requirements. This included incoming source data from various European banking entities, and existing data within those systems, which would be consolidated at the Italian HQ.

Page 70: Isaca new delhi india   privacy and big data

Case Study - Large US Chain Store

Reduced cost

• 50 % shorter PCI audit

Quick deployment

• Minimal application changes

• 98 % application transparent

Top performance

• Performance better than encryption

Stronger security

69

Page 71: Isaca new delhi india   privacy and big data

Case Study: Large Chain Store

Why? Reduce compliance cost by 50%

• 50 million Credit Cards, 700 million daily transactions

• Performance Challenge: 30 days with Basic to 90 minutes with Vaultless Tokenization

• End-to-End Tokens: Started with the D/W and expanding to stores

• Lower maintenance cost – don’t have to apply all 12 requirements

• Better security – able to eliminate several business and daily reports

• Quick deployment

• Minimal application changes

• 98 % application transparent
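
For scale, the performance figures quoted in this case study (30 days with basic tokenization versus 90 minutes with vaultless tokenization) work out as follows. The calculation assumes both figures describe the same workload, which the slide implies but does not state.

```python
MINUTES_PER_DAY = 24 * 60

before_minutes = 30 * MINUTES_PER_DAY  # 30 days = 43,200 minutes
after_minutes = 90                     # 90 minutes

speedup = before_minutes / after_minutes
print(f"{speedup:.0f}x faster")  # prints "480x faster"
```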

Page 72: Isaca new delhi india   privacy and big data

Aadhaar/UID Big Data Use Case

Page 73: Isaca new delhi india   privacy and big data
Page 74: Isaca new delhi india   privacy and big data

Aadhaar Data Stores

Mongo cluster (all enrolment records/documents – demographics + photo)

• Shards: 1, 2, 3, 4, 5

• Low latency indexed read (Documents per sec), High latency random search (seconds per read)

Solr cluster (all enrolment records/documents – selected demographics only)

• Shards: 0, 2, 6, 9, a, d, f

• Low latency indexed read (Documents per sec), Low latency random search (Documents per sec)

MySQL (all UID generated records – demographics only, track & trace, enrolment status)

• UID master (sharded), Enrolment DB

• Low latency indexed read (milli-seconds per read), High latency random search (seconds per read)

HDFS (all raw packets)

• Data Node 1, Data Node 10, Data Node .., Data Node 20

• High read throughput (MB per sec), High latency read (seconds per read)

HBase (all enrolment biometric templates)

• Region Ser. 1, Region Ser. 10, Region Ser. .., Region Ser. 20

• High read throughput (MB per sec), Low-to-Medium latency read (milli-seconds per read)

NFS (all archived raw packets)

• LUN 1, LUN 2, LUN 3, LUN 4

• Moderate read throughput, High latency read (seconds per read)

Page 75: Isaca new delhi india   privacy and big data

Protegrity Summary

Proven enterprise data security software and innovation leader

• Sole focus on the protection of data

• Patented Technology, Continuing to Drive Innovation

Cross-industry applicability

• Retail, Hospitality, Travel and Transportation

• Financial Services, Insurance, Banking

• Healthcare

• Telecommunications, Media and Entertainment

• Manufacturing and Government

74

Page 76: Isaca new delhi india   privacy and big data

Please contact us for more information

[email protected]

[email protected]

[email protected]

www.protegrity.com