A Multi-Colored YARN

Vinod Kumar Vavilapalli
Apache Hadoop PMC, Co-founder of the YARN project, Hortonworks Inc


Post on 16-Apr-2017



2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

About.html

Apache Hadoop PMC, ASF Member

9 years of only Hadoop

– Finally the job-adverts asking for “10 years of Hadoop experience” have validity

‘Rewrote’ the Hadoop processing side

– Became Apache Hadoop YARN

With me today

– Billie Rinaldi: VP Apache Accumulo, Apache Slider PMC, ASF Member
– Jayush Luniya: Apache Ambari PMC
– Vadim Vaks: Kickass field guy (Sr. Solutions Architect)


Hadoop Compute Platform Today

Layers that enable applications and higher order frameworks

It’s all about data!

Still a single colored yarn

Apache Hadoop YARN is pretty good at jobs, queries, short running apps

– We will continue doing this

Admins and admin tools (Ambari) take care of statically provisioned services


Hadoop Compute Platform Today

[Diagram: Platform Services (Storage, Resource Management, Security, Management, Monitoring, Alerts, Governance) beneath MR, Tez, Spark, …]

Run everything in a single secure, multi-tenant, elastic Hadoop YARN cluster

– An ongoing journey

Adding new ‘stuff’ to this stack is an involved effort


Evolution of user focus

A need for reuse, composition, and to keep building ‘upwards’

Applications & services & more complex combinations: Assembly

[Diagram labels: IOT Applications, Apache Metron]


Emerging needs of the platform

[Diagram labels: IOT Applications, Apache Metron]

• Simplified deployment of an assembly
– Ready to go packages
– Discovery
– Resource/capacity planning

• Management / monitoring / metrics of assemblies!
– “Start / stop” my business app end-to-end
– “Tell me what’s happening with my business application”
– “I don’t care whether HBase RegionServer is down or not, is my assembly healthy?”

• Scale up/down the entire app!
– “I got more input coming in, I don’t care how you scale individual pieces, but do scale the entire machinery”
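The “is my assembly healthy?” question can be sketched as a roll-up of per-component state into one answer. This is a minimal illustration only: `ComponentStatus`, `assembly_health`, the three health labels, and the instance counts are all hypothetical and do not come from YARN, Slider, or Ambari.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComponentStatus:
    """Observed state of one component in an assembly (hypothetical model)."""
    name: str
    running: int   # instances currently up
    required: int  # minimum instances the assembly needs to function


def assembly_health(components: List[ComponentStatus]) -> str:
    """Roll component states up into a single assembly-level answer."""
    if all(c.running >= c.required for c in components):
        return "HEALTHY"
    if any(c.running == 0 and c.required > 0 for c in components):
        return "DOWN"
    return "DEGRADED"


iot = [
    ComponentStatus("kafka", running=3, required=2),
    ComponentStatus("storm", running=4, required=2),
    # One RegionServer may be down, but the assembly still meets its minimum.
    ComponentStatus("hbase-regionserver", running=2, required=2),
]
print(assembly_health(iot))
```

The point of the sketch is the shape of the answer: the user sees one status for the whole assembly, and individual component failures only matter once they push a component below what the assembly needs.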


Why on YARN?

Manual plumbing is very tiresome, not repeatable

Assemblies: similar to apps & services, but N x harder (because there are N services to grapple with)

Why not static allocations?

– Machines die
– Jobs (MapReduce, Spark) are tolerant of faults, but static services aren’t!
– Upfront capacity planning
– Cannot react to hardware or utilization changes without manual intervention
– Elasticity is a manual operation

This is fundamentally the same resource-management problem that YARN is built to address!


Why on YARN? Contd..

The Apache Hadoop ecosystem knows Data services the best – YARN is data-first!

Big Data use-cases don’t stop at Hadoop services and apps

– Hive for all data, summary in traditional on-demand DB for driving analysts
– Extracting results from HDP and hosting report servers, interactive UIs like Apache Zeppelin

Users don’t care about this separation

– Big Data is already a huge cluster on one side
– Asking for another infrastructure & needing separate management of this other stuff is burdensome
– Unified solution >> Silos


Hadoop Compute Platform Next

A colorful, multi-threaded yarn

For use-cases of various colors

– Today’s applications better
– Simplified long running applications
– Bring your app easily

https://www.flickr.com/photos/happyskrappy/15699919424


What is happening now?


Packaging

Containers

– Lightweight mechanism for packaging and resource isolation
– Popularized and made accessible by Docker
– Can replace VMs in some cases
– Or more accurately, VMs got used in places where they didn’t need to be

Native integration ++ in YARN

– Support for “Container Runtimes” in LCE (LinuxContainerExecutor): YARN-3611
– Process runtime
– Docker runtime
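As a rough sketch of how an application might ask for the Docker runtime instead of the default process runtime, the helper below builds the environment for a container launch. The two variable names follow the Docker-on-YARN documentation as I understand it and should be checked against the Hadoop version in use; `docker_launch_env` and the image name are hypothetical.

```python
from typing import Dict, Optional


def docker_launch_env(image: str,
                      base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of base_env that requests the Docker container runtime.

    Assumption: the NodeManager selects the runtime from these two
    environment variables; verify the names for your Hadoop release.
    """
    env = dict(base_env or {})
    env["YARN_CONTAINER_RUNTIME_TYPE"] = "docker"
    env["YARN_CONTAINER_RUNTIME_DOCKER_IMAGE"] = image
    return env


# Hypothetical image name, for illustration only.
env = docker_launch_env("example/webserver:1.0", {"APP_MODE": "prod"})
for key in sorted(env):
    print(f"{key}={env[key]}")
```

Leaving the runtime variables unset would keep the container on the process runtime, which is why the helper copies and extends the caller's environment rather than replacing it.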


APIs

Applications need simple APIs

Need to be deployable “easily”

Simple REST API layer fronting YARN

– https://issues.apache.org/jira/browse/YARN-4793
– [Umbrella] Simplified API layer for services and beyond

Spawn services & manage them
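To give a feel for what a simple API layer could accept, here is a sketch that builds a declarative service description as JSON. The slides do not define the payload: the field names (`number_of_containers`, `launch_command`, `resource`) are assumptions modeled loosely on service specs of this kind, and `component`, `service_spec`, the service name, and the launch script are hypothetical. Nothing is sent over the network.

```python
import json
from typing import Any, Dict, List


def component(name: str, containers: int, command: str,
              cpus: int, memory_mb: int) -> Dict[str, Any]:
    """Describe one component of a service (field names are assumptions)."""
    return {
        "name": name,
        "number_of_containers": containers,
        "launch_command": command,
        "resource": {"cpus": cpus, "memory": str(memory_mb)},
    }


def service_spec(name: str, components: List[Dict[str, Any]]) -> str:
    """Serialize a service description that a REST layer could accept."""
    if not components:
        raise ValueError("a service needs at least one component")
    return json.dumps({"name": name, "components": components}, indent=2)


spec = service_spec("holiday-web", [
    component("webserver", containers=2, command="./start-web.sh",
              cpus=1, memory_mb=1024),
])
print(spec)
```

The design idea is that the user states what should run and how many copies, and leaves container negotiation, placement, and restarts to the platform.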


Platform++

YARN itself is evolving to support services and complex apps

– https://issues.apache.org/jira/browse/YARN-4692
– [Umbrella] Simplified and first-class support for services in YARN

Scheduling

– Application priorities: YARN-1963
– Affinity / anti-affinity: YARN-1042
– Services as first-class citizens: Preemption, reservations etc
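Anti-affinity means that instances of the same component should not share a node, so one machine failure cannot take out several of them. The toy placement below illustrates the constraint only; it is not YARN's scheduler, and `place_with_anti_affinity`, the node names, and the slot counts are hypothetical.

```python
from typing import Dict, List


def place_with_anti_affinity(instances: int,
                             free_slots: Dict[str, int]) -> List[str]:
    """Pick a distinct node for each instance, preferring emptier nodes.

    Raises ValueError when there are fewer usable nodes than instances,
    because the constraint forbids doubling up on a node.
    """
    candidates = sorted(
        (node for node, free in free_slots.items() if free > 0),
        key=lambda node: (-free_slots[node], node),
    )
    if len(candidates) < instances:
        raise ValueError(
            f"need {instances} distinct nodes, only {len(candidates)} available"
        )
    return candidates[:instances]


print(place_with_anti_affinity(2, {"node1": 1, "node2": 4, "node3": 0}))
```

Affinity is the mirror image: instead of requiring distinct nodes, the placement would prefer nodes that already host a named peer.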


Platform++ Contd

Application & Services upgrades

– “Do an upgrade of my Spark / HBase apps with minimal impact to end-users”
– YARN-4726

Simplified discovery of services via DNS mechanisms: YARN-4757

YARN Federation – to infinity and beyond: YARN-2915

Easier container sizing models: Resource profiles: YARN-3926


Framework++

Platform is only as good as the tools

A native YARN framework

– https://issues.apache.org/jira/browse/YARN-4692
– [Umbrella] Native YARN framework layer for services and beyond

Slider supporting a DAG of apps:

– https://issues.apache.org/jira/browse/SLIDER-875
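A DAG of apps implies a start order in which every app comes up after the apps it depends on. The sketch below derives such an order with Python's standard `graphlib` (Python 3.9+). It does not reflect how Slider implements this; `start_order` and the dependency edges are hypothetical, with component names borrowed from the assemblies shown elsewhere in this deck.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each app maps to the set of apps that must be running before it starts.
# These edges are invented for illustration.
dependencies: Dict[str, Set[str]] = {
    "kafka": set(),
    "hbase": set(),
    "solr": set(),
    "storm": {"kafka", "hbase"},
    "webserver": {"hbase", "solr"},
}


def start_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one valid start order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = start_order(dependencies)
print(" -> ".join(order))
```

Stopping the assembly would walk the same order in reverse, so that dependents go down before the services they rely on.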


User facing and operational experience

Modern YARN web UI - YARN-3368

Enhanced shell interfaces

Metrics: Timeline Service V2 – YARN-2928

Application & Services monitoring, integration with other systems

First class support for YARN hosted services in Ambari

– https://issues.apache.org/jira/browse/AMBARI-17353


Use-cases.. Assemble!

[Diagram: Platform Services (Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance) supporting a Holiday Assembly (HBase, WebServer) and an IOT Assembly (Kafka, Storm, HBase, Solr) alongside MR, Tez, Spark, …]


Take away..

[Diagram labels: Beyond, 2.x, 1.x]


Thank You

(Rest of) The demo Team

• Gour Saha
• Sidhartha Seethana
• Varun Vasudev
• Shane Kumpf
• Jaimin Jetly
• Yusaku Sako
• Yu Liu