curb your insecurity with hdp - tips for a secure cluster

Curb Your Insecurity with HDP: Tips for a Secure Cluster (with Spark too). Ancil McBarnett, Senior Solutions Engineer – Security & Governance. Future of Data Meetup – New York, June 2nd, 2016




© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Hadoop Security in 4 Steps


Agenda

•  Introduction to Hadoop Security
   –  The 4 Steps to Hadoop Security

•  Authentication with Kerberos
   –  Integration with LDAP

•  Authorization with Apache Ranger
   –  Hive, HDFS, YARN

•  REST API Security with Apache Knox
   –  WebHDFS
   –  Hive

•  Encrypt the Data / Data Protection
   –  Transparent Data Encryption and KMS

Comprehensive Approach to Security

In order to protect any data system you must implement the following:

•  Administration: Central management and consistent security
   –  How do I set policy across the entire cluster?

•  Authentication: Authenticate users and systems
   –  Who am I/prove it?

•  Authorization: Provision access to data
   –  What can I do?

•  Audit: Maintain a record of data access
   –  What did I do?

•  Data Protection: Protect data at rest and in motion
   –  How can I encrypt at rest and over the wire?

HDP Security: Comprehensive, Complete, Extensible

Security in HDP is the most comprehensive, complete and extensible for Hadoop

•  Administration: Central management and consistent security
   –  Single administrative console to set policy across the entire cluster: Apache Ranger

•  Authentication: Authenticate users and systems
   –  Authentication for perimeter and cluster; integrates with existing Active Directory and LDAP solutions: Kerberos | Apache Knox

•  Authorization: Provision access to data
   –  Consistent authorization controls across all Apache components within HDP: Apache Ranger

•  Audit: Maintain a record of data access
   –  Record of data access events across all components that is consistent and accessible: Apache Ranger

•  Data Protection: Protect data at rest and in motion
   –  Encrypts data in motion and data at rest; refer to partner encryption solutions for broader needs: HDFS TDE with Ranger KMS

Security: Rings of Defense

•  Perimeter Level Security
   –  Network Security (e.g. Firewalls)
   –  Apache Knox (e.g. Gateways)

•  Authentication
   –  Kerberos

•  OS Security

•  Authorization
   –  MR ACLs
   –  HDFS Permissions
   –  HDFS ACLs
   –  HiveATZ-NG
   –  HBase ACLs
   –  Accumulo Label Security

Authentication with Kerberos

Security Without Kerberos

Configure Kerberos – Ambari Wizard

Security With Kerberos

Apache Ranger

Centralized Security with Ranger

Centralized Platform

•  Centralized platform to define, administer and manage security policies consistently

•  Define security policy once and apply it to all the applicable components across the stack

Fine-Grained Security Definition

•  Administer security for:
   –  Database
   –  Table
   –  Column
   –  LDAP Groups
   –  Specific Users

Deep Visibility

•  Administrators have complete visibility into the security administration process

HDFS File Security

Hive Database and Table Security

Authorization and Audit

Authorization: Fine grain access control

•  HDFS – Folder, File

•  Hive – Database, Table, Column

•  HBase – Table, Column Family, Column

•  Storm, Knox and more

Audit: Extensive user access auditing in HDFS, Hive and HBase

•  IP Address

•  Resource type/resource

•  Timestamp

•  Access granted or denied

Control access into system

Flexibility in defining policies
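A minimal sketch of the idea on this slide, not Ranger's actual API or policy format: a policy names a resource pattern, the users or groups it applies to, and the access types it grants, and every decision produces an audit record with the fields listed above. The names Policy, is_allowed and audit_record are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """Hypothetical, simplified stand-in for a resource-based access policy."""
    resource_pattern: str   # e.g. "/data/finance/*" (an HDFS folder)
    users: frozenset        # specific users the policy applies to
    groups: frozenset       # e.g. LDAP groups the policy applies to
    accesses: frozenset     # access types granted, e.g. {"read"}


def is_allowed(policies, user, user_groups, resource, access):
    """Allow if any policy matches the resource, the access type, and the user or one of their groups."""
    for policy in policies:
        if not fnmatchcase(resource, policy.resource_pattern):
            continue
        if access not in policy.accesses:
            continue
        if user in policy.users or policy.groups & set(user_groups):
            return True
    return False  # nothing matched: deny by default in this sketch


def audit_record(user, client_ip, resource_type, resource, access, allowed):
    """Build an audit entry carrying the fields named on the slide."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "ip_address": client_ip,
        "resource_type": resource_type,
        "resource": resource,
        "access": access,
        "result": "GRANTED" if allowed else "DENIED",
    }
```

In this toy model a policy is defined once and the same is_allowed check could be reused by every component that enforces it, which is the point of centralizing policy definition.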

REST API Security with Apache Knox

Authentication: API Security with Knox

Eliminates SSH “edge node”

Central API management

Central audit control

Service level authorization

SSO Integration—Siteminder and OAM

LDAP and AD integration

Incubated and led by Hortonworks, Apache Knox extends the reach of Hadoop REST API without Kerberos complexities

Integrated with existing systems to simplify identity maintenance

Single, simple point of access for a cluster

Central controls ensure consistency across one or more clusters

Kerberos Encapsulation

Single Hadoop access point

REST API hierarchy

Consolidated API calls

Multi-cluster support

Extend Hadoop API reach with Knox

[Figure: architecture diagram. Labels: Application Tier (App A, App B, App C, App N); Data Ingest; ETL; Falcon, Oozie, Sqoop, Flume; Admin/Operators; Data Operator; Business User; Hadoop Admin; Bastion Node; SSH; RPC Call; JDBC/ODBC; REST/HTTP; Load Balancer; Knox; Hadoop Cluster.]

Hadoop REST APIs

•  Useful for connecting to Hadoop from outside the cluster

•  When more client language flexibility is required
   –  i.e. Java binding not an option

•  Challenges
   –  Client must have knowledge of cluster topology
   –  Required to open ports (and in some cases, on every host) outside the cluster

Service    API
WebHDFS    Supports HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming.
WebHCat    Job control for MapReduce, Pig and Hive jobs, and HCatalog DDL commands.
Hive       Hive REST API operations
HBase      HBase REST API operations
Oozie      Job submission and management, and Oozie administration.
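As an illustration of the WebHDFS row above, here is a small sketch that only builds request URLs and sends nothing. The /webhdfs/v1 path and the op query parameter follow the public WebHDFS REST API; the host name and the helper name webhdfs_url are placeholders, and port 50070 is an assumed NameNode HTTP port that may differ on a given cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, scheme="http", **params):
    """Build a WebHDFS URL for an HDFS path and operation (no request is sent)."""
    query = urlencode({"op": op, **params})
    # quote() keeps "/" separators and escapes characters such as spaces.
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Listing a directory is an HTTP GET; creating one is an HTTP PUT.
list_url = webhdfs_url("namenode-host", "/tmp/data", "LISTSTATUS")
mkdir_url = webhdfs_url("namenode-host", "/tmp/new dir", "MKDIRS", permission="755")
```

Note that the client has to know the NameNode host and port, which is the "knowledge of cluster topology" challenge listed above.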


Hadoop REST API with Knox – Representative Examples

Service         Direct URL                             Knox URL
WebHDFS         http://namenode-host:50070/webhdfs     https://knox-host:8443/webhdfs
WebHCat         http://webhcat-host:50111/templeton    https://knox-host:8443/templeton
Oozie           http://ooziehost:11000/oozie           https://knox-host:8443/oozie
HBase/Stargate  http://hbasehost:60080                 https://knox-host:8443/hbase
Hive            http://hivehost:10001/cliservice       https://knox-host:8443/hive
YARN            http://yarn-host:yarn-port/ws          https://knox-host:8443/resourcemanager
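The mapping in this table can be sketched as a small URL rewrite. This is an illustration of the table only, not Knox's own rewrite logic: to_knox_url and KNOX_SERVICE_PATHS are hypothetical names, and HBase is left out because its direct URL has no path prefix to match on. Real Knox deployments usually place a gateway path and topology name in front of the service path, so the sketch takes that as an optional gateway_prefix argument.

```python
from urllib.parse import urlsplit

# Direct path prefix -> Knox service path, taken from the table above.
KNOX_SERVICE_PATHS = {
    "/webhdfs": "/webhdfs",
    "/templeton": "/templeton",
    "/oozie": "/oozie",
    "/cliservice": "/hive",
    "/ws": "/resourcemanager",
}


def to_knox_url(direct_url, knox_host="knox-host", knox_port=8443, gateway_prefix=""):
    """Rewrite a direct service URL into the single HTTPS entry point exposed by the gateway."""
    parts = urlsplit(direct_url)
    for direct_prefix, knox_path in KNOX_SERVICE_PATHS.items():
        if parts.path == direct_prefix or parts.path.startswith(direct_prefix + "/"):
            rest = parts.path[len(direct_prefix):]
            query = f"?{parts.query}" if parts.query else ""
            return f"https://{knox_host}:{knox_port}{gateway_prefix}{knox_path}{rest}{query}"
    raise ValueError(f"no Knox mapping for {direct_url!r}")
```

The caller only needs the gateway host and port; the internal host names and ports on the left side of the table stay hidden behind it.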

Data Protection

Security in Hadoop with HDP

HDP 2.4: Centralized Security Administration with Ranger

•  Authentication: Who am I/prove it?
   –  Kerberos
   –  API security with Apache Knox

•  Authorization: What can I do?
   –  Fine-grain access control with Apache Ranger

•  Audit: What did I do?
   –  Centralized audit reporting with Apache Ranger

•  Data Protection: Can data be encrypted at rest and over the wire?
   –  Wire encryption in Hadoop
   –  HDFS Encryption with Ranger KMS

Data Protection

HDP allows you to apply data protection policy at different layers across the Hadoop stack

Layer               What?                            How?
Storage and Access  Encrypt data while it is at rest  HDFS Transparent Data Encryption, partners, HBase encryption, OS level encryption
Transmission        Encrypt data as it moves          SSL, SASL; supported from HDP 2.1

Points of Communication

[Figure: points of communication between a client and the nodes of the Hadoop cluster. Labeled channels: WebHDFS, DataTransferProtocol (DataTransfer), RPC, JDBC/ODBC, M/R Shuffle.]
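One way to reason about these channels is a checklist that maps each one to the setting that encrypts it. The property names and values below are my understanding of the stock Hadoop and Hive settings and should be verified against the documentation for the version in use; the helper unencrypted_channels and the sample configuration are hypothetical.

```python
# Channel -> (config file, property, value that turns encryption on).
# Property names are assumptions to verify against your distribution's docs.
WIRE_ENCRYPTION_SETTINGS = {
    "RPC": ("core-site.xml", "hadoop.rpc.protection", "privacy"),
    "DataTransferProtocol": ("hdfs-site.xml", "dfs.encrypt.data.transfer", "true"),
    "WebHDFS": ("hdfs-site.xml", "dfs.http.policy", "HTTPS_ONLY"),
    "M/R Shuffle": ("mapred-site.xml", "mapreduce.shuffle.ssl.enabled", "true"),
    "JDBC/ODBC": ("hive-site.xml", "hive.server2.use.SSL", "true"),
}


def unencrypted_channels(current_config):
    """Return the channels whose property is missing or not set to the encrypting value."""
    return [
        channel
        for channel, (_file, prop, wanted) in WIRE_ENCRYPTION_SETTINGS.items()
        if current_config.get(prop) != wanted
    ]


# Example: only RPC and block data transfer are protected in this made-up config.
sample_config = {"hadoop.rpc.protection": "privacy", "dfs.encrypt.data.transfer": "true"}
remaining = unencrypted_channels(sample_config)
```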

Data Protection - HDFS Encryption

[Figure: HDFS encryption architecture. Labels: Data Access; Data Management; Security; Partners; YARN; HDFS; HDFS Client; Name Node; Encryption Zone; Encrypted Files; KeyProvider API (partner integration point); Key Management System (KMS); Open Source Key Management; Stateless Key Management.]

•  Hortonworks is collaborating with partners to deliver enterprise-scale key management and more choices to customers

•  Open source KMS with Ranger

•  Or partner with Voltage KMS
   –  Partner joint engineering resources
   –  Voltage Stateless Key Management integrated with KeyProvider API

Only HDP offers open source and commercial choices for key management

Demo: Transparent Data Encryption
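The demo itself is not captured in this transcript. As a rough sketch of what setting up an encryption zone typically involves, the helper below assembles the usual command sequence: create a key in the KMS, create an empty directory, turn it into an encryption zone, then list zones. It builds and prints the commands without running them; the key name, path and function name are placeholders, and the commands assume a user with the required HDFS and KMS privileges.

```python
import shlex


def encryption_zone_commands(key_name, zone_path):
    """Return the CLI steps for creating an HDFS encryption zone, as argument lists."""
    return [
        ["hadoop", "key", "create", key_name],             # create the zone key in the KMS
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],        # the zone directory must exist and be empty
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
        ["hdfs", "crypto", "-listZones"],                  # confirm the zone was created
    ]


for command in encryption_zone_commands("finance_key", "/data/finance"):
    print(shlex.join(command))
```

Files written under the zone path are then encrypted and decrypted transparently by the HDFS client, which is what "transparent" refers to in the slide title.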


Security in Spark?

•  Spark supports running in a Kerberized cluster
   –  Only Spark on YARN supports security (Kerberos support)
   –  From the command line, run kinit before submitting Spark jobs

•  Spark reads data from HDFS & ORC
   –  HDFS file permissions (& Ranger integration) applicable to Spark jobs

•  Spark submits job to YARN queue
   –  YARN queue ACL (& Ranger integration) applicable to Spark jobs

•  Wire Encryption
   –  Spark has some coverage, not all channels are covered

•  LDAP Authentication
   –  No authentication in Spark UI out of the box (OOB); supports a filter for hooking in LDAP
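A minimal sketch of the "kinit, then submit" flow from this slide, assuming a keytab-based login and Spark on YARN. It only builds the two commands; the principal, keytab path, queue and application file are placeholders, and the optional UI filter class is hypothetical (Spark's spark.ui.filters setting takes a servlet filter class name, which you would have to supply yourself).

```python
import shlex


def kerberized_spark_commands(principal, keytab, app_path, queue="default", ui_filter_class=None):
    """Return [kinit command, spark-submit command] as argument lists."""
    # Obtain a Kerberos ticket from a keytab before submitting.
    kinit = ["kinit", "-k", "-t", keytab, principal]
    # Submit to YARN; the YARN queue ACLs apply to the chosen queue.
    submit = ["spark-submit", "--master", "yarn", "--queue", queue]
    if ui_filter_class:
        # Hook an authentication filter into the Spark UI.
        submit += ["--conf", f"spark.ui.filters={ui_filter_class}"]
    submit.append(app_path)
    return [kinit, submit]


for command in kerberized_spark_commands(
    "etl@EXAMPLE.COM", "/etc/security/keytabs/etl.keytab", "job.py", queue="analytics"
):
    print(shlex.join(command))
```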

What makes Hadoop Summit Different?
–  Deep technical sessions chosen by the community
–  Business Track based on real-world implementations
–  Keynotes from Progressive Insurance, Ford, Macy’s, MD Anderson, GE, Capital One, …
–  Free hands-on labs
–  Networking events and 10 Year Celebration!
–  20% Off Code: 16SJext20x

Apache Hadoop, Spark, IoT, Streaming, Data Science

EVERYTHING DATA!