Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Hortonworks


DESCRIPTION

Olivier from Hortonworks will introduce Apache Argus, a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. Data security within Hadoop has evolved to support multiple use cases for data access, while also providing a framework for central administration of security policies and monitoring of user access. With the advent of Apache YARN and Argus, the Hadoop platform can now support a true secure data lake architecture.

TRANSCRIPT

Page 1:

Apache Argus Olivier RENAULT

Page 2:

Apache Argus: History

XASecure created in 2013

Hortonworks acquired XASecure – mid-May 2014

Hortonworks filed the Apache Argus proposal – mid-July 2014

The bits are available from:

-  hortonworks.com

-  http://argus.incubator.apache.org/

Page 3:

Security needs are changing

•  YARN unlocks the data lake
•  Multi-tenant: multiple applications for data access
•  Changing and complex compliance environment
•  ETL of non-sensitive data can yield sensitive data

Fall 2013: largely silo’d deployments with single-workload clusters
Summer 2014: 65% of clusters host multiple workloads

5 areas of security focus

Administration – Central management & consistent security
Authentication – Authenticate users and systems
Authorization – Provision access to data
Audit – Maintain a record of data access
Data Protection – Protect data at rest and in motion

Page 4:

Security in Hadoop

Authentication – Who am I / prove it?
Authorization – Restrict access to explicit data
Audit – Understand who did what
Data Protection – Encrypt data at rest & in motion

HDP 2.1
•  Authentication: Kerberos in native Apache Hadoop; HTTP/REST API secured with Apache Knox Gateway
•  Authorization: HDFS Permissions, HDFS ACL; Hive ATZ-NG
•  Audit: audit logs in HDFS & MR
•  Data Protection: wire encryption in Hadoop; open source initiatives; partner solutions

Argus – Centralized Security Administration (as-is, works with current authentication methods)
•  Authorization: HDFS, Hive and HBase; fine-grain access control; RBAC
•  Audit: centralized audit reporting; policy and access history
•  Data Protection: future integration

Page 5:

Central Security Administration

•  Delivers a ‘single pane of glass’ for the security administrator
•  Centralizes administration of security policy
•  Ensures consistent coverage across the entire Hadoop stack

Page 6:

Setup Authorization Policies


File-level access control, flexible definition

Control permissions

Page 7:

Monitor through Auditing


Page 8:

HDFS

Page 9:

What it means: HDFS API for Authorization

Today
•  HDFS authorization is performed by JavaAgent-based code injection into the NameNode

Tomorrow
•  Pluggable HDFS authorization is being added (HDFS-6826)
•  Argus will replace the JavaAgent-based code injection with a custom authorization plugin
•  Work being discussed currently

Page 10:

Hive

Page 11:

Hive Integration – Today

•  XA Secure/Argus uses multiple hooks in Hive:
   hive.security.authorization.manager=com.xasecure.authorization.hive.authorizer.XaSecureAuthorizer
   hive.semantic.analyzer.hook=com.xasecure.authorization.hive.hooks.XaSecureSemanticAnalyzerHook
   hive.exec.post.hooks=com.xasecure.authorization.hive.hooks.XaSecureHivePostExecuteRunHook
   –  Not all information necessary to make an authorization decision is available in the Hive authorizer hooks
•  Local Grant/Revoke permissions are not integrated with Argus
•  Storage-based authorization only looks at POSIX permissions

Page 12:

What it means: Tomorrow


•  New plug-in model in Hive to support external authorizers
•  All information necessary to make an authorization decision is provided to the authorizer plug-in
•  XASecure/Argus Hive agent registers a single hook with Hive for authorization:
   hive.security.authorization.manager=com.xasecure.authorization.hive.authorizer.XaSecureHiveAuthorizerFactory

Page 13:

Integrate Grant/Revoke - Tomorrow


•  Integrate Grant/Revoke permissions
•  New Hive plugin enables Argus to handle Grant/Revoke permissions
•  Argus will store Grant/Revoke policy and enforce it, with auditing
•  Option to disable Grant/Revoke
•  Groups/Roles mapped to Groups in Argus Admin

Page 14:

Storage Based Authorization - Tomorrow

•  In SBA, Hive uses HDFS permissions for allowing operations
•  HDFS permission check: Hive uses RPC to communicate with HDFS and validate permissions on HDFS folders
•  If Argus is enabled, Hive will use permissions based on Argus policies in HDFS
•  Argus can be used for storage-based and regular Hive authorization

Page 15:

HBase

Page 16:

What it means: HBase Integration

Today
–  HBase agents support table, CF and column-level permissions
–  Local permissions not integrated

Tomorrow
–  Integrate local grant/revoke permissions
–  New Argus/XA co-processor, no changes in HBase
–  hbase-site.xml:

<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>com.xasecure.authorization.hbase.XaSecureAuthorizationCoprocessor</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>com.xasecure.authorization.hbase.XaSecureAuthorizationCoprocessor</value>
</property>

Page 17:

About HBase Grant Revoke


•  Command-line operations – permissions supported:
   –  Admin (A) – Create (C) – Write (W) – Read (R)
•  Can be performed at table, CF, column level:

grant <user> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ]    # grants permissions
revoke <user> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ]   # revokes permissions

Page 18:

Storm

Page 19:

What it means?

•  Storm now supports ACLs for authorization
•  Argus provides administration for these ACLs, and also enables access auditing
•  The following permissions are supported:
   Submit Topology, Kill Topology, Rebalance, Activate, Deactivate, File Upload, File Download, Get Nimbus Conf,
   Get Cluster Info, Get Topology Conf, Get Topology, Get User Topology, Get Topology Info, Upload New Credential

Page 20:

KNOX

Page 21:

What it means?

•  Knox currently performs service-level authorization
   –  Allow group or user access to specific REST APIs (WebHDFS, WebHCat, JDBC over HTTP, etc.) – see the client-side sketch below
   –  Can also restrict based on IP address
   –  Permissions maintained in a file
•  Manage these permissions through the Argus Portal
   –  User experience similar to other components
•  Get access to auditing records in the Argus Portal
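To illustrate what Knox service-level authorization looks like from the client side, here is a minimal sketch, assuming a Knox gateway at knox.example.com:8443 with a topology named "default" that exposes WebHDFS; the host, port, topology name and credentials are illustrative placeholders, not values from this deck.

import requests

# Illustrative placeholders: gateway host/port, topology name and credentials
# depend on the actual Knox deployment.
KNOX_BASE = "https://knox.example.com:8443/gateway/default"

# The calling user (or their group) must be allowed for the WebHDFS service in
# Knox's authorization configuration; otherwise the gateway rejects the request
# before it reaches the cluster.
resp = requests.get(
    f"{KNOX_BASE}/webhdfs/v1/tmp",
    params={"op": "LISTSTATUS"},
    auth=("sam", "sam-password"),
    verify=False,  # demo only: Knox often ships with a self-signed certificate
)
print(resp.status_code)   # 403 when the user is not authorized for WebHDFS
print(resp.json())        # directory listing when the call is allowed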

Page 22:

REST APIs

Page 23:

What does it mean?

•  Currently, Argus policies can only be managed through the GUI
   –  Not a scalable model if there is a large number of policies

•  Champlain work to expose REST APIs for the policy manager

•  Users can create/update/delete policies through these APIs

Page 24:

REST APIs Available

•  Repository management

REST API              Request type   Request URL*
Get Repository        GET            service/public/api/repository/{id}
Create Repository     POST           service/public/api/repository
Update Repository     PUT            service/public/api/repository/{id}
Delete Repository     DELETE         service/public/api/repository/{id}
Search Repositories   GET            service/public/api/repository

Page 25:

REST APIs exposed in Champlain

•  Policy management

REST API          Request type   Request URL*
Get Policy        GET            service/public/api/policy/{id}
Create Policy     POST           service/public/api/policy
Update Policy     PUT            service/public/api/policy/{id}
Delete Policy     DELETE         service/public/api/policy/{id}
Search Policies   GET            service/public/api/policy
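As a sketch of how these policy endpoints could be scripted, assuming a policy manager reachable at an example base URL with basic-auth credentials; the JSON field names (policyName, repositoryName, resourceName, isEnabled) are illustrative placeholders rather than a payload documented in this deck.

import requests

# Illustrative placeholders: the admin URL, credentials and payload fields
# below are assumptions, not values taken from this deck.
BASE_URL = "http://argus-admin.example.com:6080"
AUTH = ("admin", "admin")

def create_policy(policy):
    """POST a new policy (service/public/api/policy)."""
    r = requests.post(f"{BASE_URL}/service/public/api/policy", json=policy, auth=AUTH)
    r.raise_for_status()
    return r.json()

def get_policy(policy_id):
    """GET an existing policy by id (service/public/api/policy/{id})."""
    r = requests.get(f"{BASE_URL}/service/public/api/policy/{policy_id}", auth=AUTH)
    r.raise_for_status()
    return r.json()

def delete_policy(policy_id):
    """DELETE a policy by id (service/public/api/policy/{id})."""
    r = requests.delete(f"{BASE_URL}/service/public/api/policy/{policy_id}", auth=AUTH)
    r.raise_for_status()

if __name__ == "__main__":
    # Hypothetical HDFS policy payload, for illustration only.
    created = create_policy({
        "policyName": "sales-read-only",
        "repositoryName": "hadoopdev",
        "resourceName": "/apps/sales",
        "isEnabled": True,
    })
    print(get_policy(created["id"]))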

Page 26:

Audit Log Storage in HDFS

Page 27:

What does it mean?

Today
•  Argus audit data is stored only in an RDBMS (MySQL)
•  Issue with scalability

Tomorrow
•  Option to write to an RDBMS (MySQL or Oracle) and/or HDFS
•  Addition of a Log4j file appender
   –  HDFS destination can be specified in the appender
   –  Customers/partners can add custom Log4j appenders
•  Extensible HDFS log format, available as JSON

Page 28:

Audit Logging to HDFS destination …

•  Argus audit logs to HDFS (a sketch of this pattern follows):
   –  Log events are written to a local log file
   –  The local log file is copied to the HDFS destination (when HDFS is available)
   –  The local log file and the HDFS file are rotated at a regular interval
•  Design being enhanced
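As an illustration of this spool-then-ship pattern (not the actual Argus implementation), a minimal sketch assuming the hdfs command-line client is available on the node; the local spool directory, HDFS destination and retry interval are hypothetical.

import os
import subprocess
import time

# Hypothetical paths for illustration; the real spool location and HDFS
# destination are part of the Argus audit configuration.
LOCAL_SPOOL_DIR = "/var/log/argus/audit-spool"
HDFS_DEST_DIR = "/apps/argus/audit"

def ship_spooled_logs():
    """Copy rotated local audit files to HDFS, then remove them locally."""
    for name in sorted(os.listdir(LOCAL_SPOOL_DIR)):
        local_path = os.path.join(LOCAL_SPOOL_DIR, name)
        # 'hdfs dfs -put' fails if HDFS is unreachable; the file then stays
        # spooled locally and is retried on the next cycle.
        result = subprocess.run(
            ["hdfs", "dfs", "-put", "-f", local_path, f"{HDFS_DEST_DIR}/{name}"]
        )
        if result.returncode == 0:
            os.remove(local_path)

if __name__ == "__main__":
    while True:
        ship_spooled_logs()
        time.sleep(60)  # retry/rotation interval (illustrative)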

Page 29:

Questions?