Atlas and Ranger EPAM Meetup

Next Generation of Hadoop Security & Governance: Apache Atlas + Ranger. Alex Zeltov – Solutions Engineer

Upload: alex-zeltov

Post on 13-Apr-2017

Category:

Software


TRANSCRIPT

Page 1: Atlas and ranger epam meetup

Next Generation of Hadoop Security & Governance

Apache Atlas + Ranger

Alex Zeltov – Solutions Engineer

Page 2: Atlas and ranger epam meetup

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Hortonworks Data Platform Architecture

Page 3: Atlas and ranger epam meetup

Apache Ranger + Atlas Overview

Page 4: Atlas and ranger epam meetup

Centralized Platform

• Centralized platform to define, administer and manage security policies consistently
• Define security policy once and apply it to all the applicable components across the stack

Fine-Grained Security Definition

• Administer security for:
– Database
– Table
– Column
– LDAP Groups
– Specific Users

Deep Visibility

• Administrators have complete visibility into the security administration process
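
As a rough illustration of what a fine-grained definition (database, table, column, LDAP groups, specific users) could look like, the sketch below builds a policy document shaped like the JSON commonly shown for Ranger resource-based Hive policies. The service name, policy name, and all database, table, column, group and user values are made up for illustration, and the field layout is an assumption to verify against the Ranger version in use. Nothing here contacts a Ranger server.

```python
import json


def build_hive_policy(service, name, database, table, columns,
                      groups, users, access_types):
    """Return a dict shaped like a Ranger resource-based policy for Hive.

    Field names follow the commonly documented policy JSON layout; treat
    them as an assumption and check them against your Ranger release.
    """
    return {
        "service": service,
        "name": name,
        "isEnabled": True,
        # Resource scope: database -> table -> column
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": list(columns)},
        },
        # Who gets which access on that scope
        "policyItems": [
            {
                "groups": list(groups),
                "users": list(users),
                "accesses": [
                    {"type": t, "isAllowed": True} for t in access_types
                ],
            }
        ],
    }


# All values below are hypothetical examples.
policy = build_hive_policy(
    service="cluster_hive",
    name="sales_customers_readonly",
    database="sales",
    table="customers",
    columns=["name", "city"],
    groups=["analysts"],        # e.g. an LDAP group
    users=["alice"],            # a specific user
    access_types=["select"],
)

print(json.dumps(policy, indent=2))
```

The point of the shape is that one document carries both the resource scope and the users and groups it applies to, which is what lets a policy be defined once and enforced consistently.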

Page 5: Atlas and ranger epam meetup

Atlas Data Governance

Organizations need data governance to understand their information and answer questions such as:

• What do we know about our information?
• Where did this data come from and how is it being used?
• Does this data adhere to company policies and rules?

Page 6: Atlas and ranger epam meetup

Background: DGI Community becomes Apache Atlas

Dec 2014: DGI group kickoff
Feb 2015: Prototype built
May 2015: Apache Atlas incubation
July 2015: HDP 2.3 Foundation GA release

First kickoff to GA in 7 months

Global FinancialCompany

* DGI: Data Governance Initiative

Key Benefits:

• Co-Dev = Built for real customer use cases

• Faster & Safer = Customers know business + HWX knows Hadoop

Page 7: Atlas and ranger epam meetup

Apache Atlas

REST API: Modern, flexible access to Atlas services, HDP components, UI & external tools

Search: SQL-like DSL (Domain Specific Language). Support for keyword, faceted and full-text searches

Data Lineage: Only product that captures lineage across Hadoop components at platform level.

Exchange: Leverage existing metadata / models by importing them from current tools. Export metadata to downstream systems
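
To make the REST API and DSL search points concrete, here is a minimal sketch that builds the URL for a DSL query against Atlas. It assumes the v2 search endpoint (`/api/atlas/v2/search/dsl`) found in newer Atlas releases; older releases expose a different path. The host name is hypothetical, and the query text is only an example. The code builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Hypothetical Atlas server; 21000 is the usual default HTTP port.
ATLAS_BASE_URL = "http://atlas.example.com:21000"


def dsl_search_url(query, limit=25, offset=0, base_url=ATLAS_BASE_URL):
    """Build the GET URL for an Atlas DSL search (v2 endpoint assumed)."""
    params = urlencode({"query": query, "limit": limit, "offset": offset})
    return f"{base_url}/api/atlas/v2/search/dsl?{params}"


# Example DSL query: Hive tables with a given name.
url = dsl_search_url('hive_table where name = "customers"')
print(url)
```

The resulting URL can be fetched with any HTTP client, supplying whatever authentication the cluster requires.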

Page 8: Atlas and ranger epam meetup

What is Metadata?

Technical Metadata

• Database Name
• Table Name
• Column Name
• Data Type

Business Metadata

• Business Name
• Business Definition
• Business Classification
• Sensitivity Tags

Operational Metadata

• Who (security access)
• What (job information)
• When (logs / audit trails)
• Where (location)

Page 9: Atlas and ranger epam meetup

Dynamic Access Policy

Apache Ranger + Atlas Integration

Page 10: Atlas and ranger epam meetup

Use cases drive design – high reliability

Atlas

Metastore
• Tags
• Assets
• Entities

Notification Framework

Kafka Topics
• Notification: metadata updates

Ranger

Atlas Client
• Subscribes to Topic
• Gets Metadata Updates

PDP

Resource Cache

Design properties

• Message durability
• Optimized for speed
• Event-driven updates
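
The event-driven flow on this slide (metadata changes are published as notifications, and a subscriber keeps a local resource cache current) can be sketched as follows. This is a toy, in-memory stand-in: the message format, operation names and resource names are invented for illustration and do not match the real Atlas notification schema. In a deployment the messages would arrive from a Kafka topic, not a Python list.

```python
import json
from collections import defaultdict


class ResourceTagCache:
    """Local cache of resource -> tags, updated from notification events."""

    def __init__(self):
        self._tags = defaultdict(set)

    def apply(self, raw_message):
        # Each message is a JSON document in a made-up format.
        event = json.loads(raw_message)
        resource = event["resource"]
        if event["operation"] == "TAG_ADD":
            self._tags[resource].add(event["tag"])
        elif event["operation"] == "TAG_DELETE":
            self._tags[resource].discard(event["tag"])

    def tags_for(self, resource):
        # Policy decisions read from the cache; no remote call is needed.
        return frozenset(self._tags.get(resource, ()))


# Simulated stream of notifications (hypothetical payloads).
messages = [
    json.dumps({"operation": "TAG_ADD", "resource": "sales.customers.ssn", "tag": "PII"}),
    json.dumps({"operation": "TAG_ADD", "resource": "sales.customers.ssn", "tag": "EXPIRES_ON"}),
    json.dumps({"operation": "TAG_DELETE", "resource": "sales.customers.ssn", "tag": "EXPIRES_ON"}),
]

cache = ResourceTagCache()
for message in messages:
    cache.apply(message)

print(sorted(cache.tags_for("sales.customers.ssn")))  # ['PII']
```

Because access checks read only the local cache, they stay fast, while a durable message log lets a subscriber that was offline catch up on the updates it missed.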

Page 11: Atlas and ranger epam meetup

Tag-based Access Policy Requirements

• Basic tag policy – PII example. Access and entitlements must be tag-based (ABAC) and scalable in implementation.

• Geo-based policy – Policy based on IP address; proxy IP substitution may be required. The rule enforcement must be geo-aware.

• Time-based policy – Timer for data access, decoupled from deletion of data.

• Prohibitions – Prevent combinations of Hive tables/columns that may pose a risk together.
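
The four requirements above can be illustrated with a small attribute-based check. This is a sketch under stated assumptions, not how Ranger implements its policy engine: the `Request` type, the tag names (`PII`, `EXPIRES_ON`), the `expiry_date` attribute, the group names and the network range are all chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from ipaddress import ip_address, ip_network


@dataclass
class Request:
    user_groups: set      # groups the requesting user belongs to
    resource_tags: dict   # tag name -> attribute dict for the resource
    client_ip: str
    today: date


def evaluate(request, allowed_pii_groups, allowed_networks):
    """Return (allowed, reason) for a single tagged resource."""
    # Time-based: deny once the access window has passed.
    expiry = request.resource_tags.get("EXPIRES_ON", {}).get("expiry_date")
    if expiry is not None and request.today > expiry:
        return False, "access window expired"
    # Geo-based: approximate location by client network.
    client = ip_address(request.client_ip)
    if not any(client in ip_network(net) for net in allowed_networks):
        return False, "client IP outside allowed networks"
    # Basic tag policy: PII needs membership in an authorized group.
    if "PII" in request.resource_tags and not (request.user_groups & allowed_pii_groups):
        return False, "PII requires an authorized group"
    return True, "allowed"


def violates_prohibition(tags_in_query, forbidden_combinations):
    """True if the query touches every tag of some forbidden combination."""
    return any(combo <= tags_in_query for combo in forbidden_combinations)


req = Request(
    user_groups={"analysts"},
    resource_tags={"PII": {}, "EXPIRES_ON": {"expiry_date": date(2017, 12, 31)}},
    client_ip="10.1.2.3",
    today=date(2017, 4, 13),
)
decision = evaluate(req, allowed_pii_groups={"compliance"}, allowed_networks=["10.0.0.0/8"])
print(decision)  # (False, 'PII requires an authorized group')
```

The decision depends only on tags and request attributes, never on a specific table or column name, which is what lets a single tag policy cover every resource that carries the tag.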

Page 12: Atlas and ranger epam meetup

Expanded Native Connectors: Dataset Lineage

Page 13: Atlas and ranger epam meetup

Expanded Native Connector: Dataset Lineage

• Sqoop
• Teradata Connector
• Apache Kafka
• Custom Activity Reporter
• Metadata Repository
• RDBMS
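
Dataset lineage can be thought of as a directed graph in which connectors report edges from sources to the datasets derived from them. The sketch below answers the question "where did this data come from?" with a breadth-first walk over such a graph. The dataset and process names are invented, and the edge list is a simplified stand-in for whatever a metadata repository actually stores.

```python
from collections import deque

# Hypothetical lineage edges: (upstream, downstream)
LINEAGE_EDGES = [
    ("rdbms.orders", "sqoop_import_orders"),
    ("sqoop_import_orders", "hive.staging_orders"),
    ("kafka.clickstream", "hive.staging_clicks"),
    ("hive.staging_orders", "hive.orders_report"),
    ("hive.staging_clicks", "hive.orders_report"),
]


def upstream(node, edges):
    """Return every dataset or process that feeds into `node`."""
    # Index edges by their downstream end for quick parent lookup.
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, []).append(src)

    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(upstream("hive.orders_report", LINEAGE_EDGES)))
```

Running the same walk over reversed edges would give downstream impact: every dataset affected by a change to a source.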

Page 14: Atlas and ranger epam meetup

UX prototype: Taxonomy Navigation

Breadcrumbs for taxonomy context path

Contents at taxonomy context

Page 15: Atlas and ranger epam meetup

DEMO