1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Managing Enterprise Hadoop Clusters with
Apache Ambari
Jayush Luniya @ Hortonworks Apache Ambari PMC
May 2016
Agenda
Ambari Overview
Ambari Features
Demo
Q&A
What’s Apache Ambari?
100% open-source platform for simplifying
Hadoop cluster management and use.
Highly extensible.
It’s a wild zoo out there! Gotta manage this efficiently.
Apache Ambari Themes
Operate Hadoop at Scale
• Deliver the core operational capabilities to provision, manage and monitor Hadoop clusters at scale.
Integrate with the Enterprise
• Robust API for integration with existing enterprise systems, such as Microsoft SCOM and Teradata Viewpoint.
Extend for the Ecosystem
• Provide an extensible platform for Customers, Partners and the Community (Stacks, Views).
Fast forward 5 years to today…
• Latest JIRA: AMBARI-16131
• 150+ Contributors
• 60+ Committers
• 16131 JIRAs filed
• 14254 JIRAs fixed. At 1.5 days per JIRA, that is roughly 90 person-years!
• Used by hundreds of companies
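The person-year estimate can be checked with quick arithmetic. The sketch below assumes roughly 240 working days per person-year; the slide does not state this figure, so treat it as an illustrative assumption.

```python
# Back-of-the-envelope check of the "~90 person years" estimate.
JIRAS_FIXED = 14254          # from the slide
DAYS_PER_JIRA = 1.5          # from the slide
WORKING_DAYS_PER_YEAR = 240  # assumption, not stated on the slide


def person_years(jiras, days_per_jira, working_days_per_year):
    """Convert a JIRA count into an approximate effort in person-years."""
    return jiras * days_per_jira / working_days_per_year


estimate = person_years(JIRAS_FIXED, DAYS_PER_JIRA, WORKING_DAYS_PER_YEAR)
print(f"{estimate:.1f} person-years")
```

A different working-days assumption shifts the result by a few person-years, which is consistent with the slide's rounded figure.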
Ambari – 3rd Biggest Project* @ Apache
* Based on total JIRAs filed on a project basis as of April 26, 2016
#2: Hadoop at ~32k, as it is split across multiple JIRA projects
[Figure: logos of the projects ranked #1, #3, #4 and #5]
Timeline
Release | Date | JIRAs
Ambari 1.5.* | Apr 2014 | 1218
Ambari 1.6.* | May 2014 | 908
Ambari 1.7.* | Dec 2014 | 1620
Ambari 2.0.* | April 2015 | 1804
Ambari 2.1.* | July 2015 | 2674
Ambari 2.2.* | Dec 2015 | 1542
Current GA Version: 2.2.2
Resolution of 9k+ JIRAs
Features delivered along the timeline: Ambari Stacks, Ambari Blueprints, Ambari Views, Alerts Framework, Metrics System, Rolling Upgrade, Kerberos Automation, Enhanced Dashboards, Smart Configs, Express Upgrade
Extensibility Features
Stacks
• To add new Services (ISV or otherwise) beyond the HDP stack
• To customize a Stack for customer-specific environments
Blueprints
• To use Ambari for automating cluster installations
• To share best practices on layout and cluster configuration
Views
• To extend and customize the Ambari Web UI
• To add new capabilities and customize existing capabilities
Stack Terminology
Term | Definition | Examples
STACK | Defines a set of Services, where to obtain the software packages, and how to manage the lifecycle. | HDP-2.3, HDP-2.2
SERVICE | Defines the Components that make up the service. | HDFS, NAGIOS, YARN
COMPONENT | The building blocks of a Service, which adhere to a certain lifecycle. | NAMENODE, DATANODE, OOZIE_SERVER
CATEGORY | The category of a Component. | MASTER, SLAVE, CLIENT
REPO | Repository metadata describing where the artifacts reside. | http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.3.0.0
Ambari Stack
• Stacks define Services + Repo
  – What is a stack, and where to get the bits
• Each service has a definition
  – What components are part of the Service
• Each service has defined lifecycle commands
  – start, stop, status, install, configure
• Lifecycle is controlled via command scripts
• Ability to define “custom” commands
[Diagram: Ambari Server with the Stack (Service Definitions in xml, Command Scripts in python), Ambari Agents, Repos]
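To make the lifecycle idea concrete, here is a minimal, self-contained sketch of how a command script could map lifecycle commands to methods. It does not use Ambari's own Python library; the base class `LifecycleScript`, the component `SampleServiceMaster` and the custom command `rebalance` are all hypothetical stand-ins chosen for illustration.

```python
# Stand-in for a stack command script; every class and method name here is hypothetical.
class LifecycleScript:
    """Dispatches a command name to the method of the same name."""

    LIFECYCLE_COMMANDS = ("install", "configure", "start", "stop", "status")
    CUSTOM_COMMANDS = ()  # subclasses may declare extra commands

    def execute(self, command):
        if command not in self.LIFECYCLE_COMMANDS + self.CUSTOM_COMMANDS:
            raise ValueError(f"unsupported command: {command}")
        return getattr(self, command)()


class SampleServiceMaster(LifecycleScript):
    """Hypothetical MASTER component that only tracks its state in memory."""

    CUSTOM_COMMANDS = ("rebalance",)

    def __init__(self):
        self.installed = False
        self.running = False

    def install(self):
        self.installed = True
        return "installed"

    def configure(self):
        return "configured"

    def start(self):
        self.configure()  # write configuration before starting
        self.running = True
        return "started"

    def stop(self):
        self.running = False
        return "stopped"

    def status(self):
        return "running" if self.running else "stopped"

    def rebalance(self):
        # Example of a "custom" command beyond the standard lifecycle.
        return "rebalanced"


master = SampleServiceMaster()
commands = ("install", "start", "status", "rebalance", "stop", "status")
results = [master.execute(c) for c in commands]
print(results)
```

In a real stack the service definition (XML) would name the script and the commands it supports, and the agent would invoke the script on the host; the sketch only shows the dispatch shape.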
Stacks Support Inheritance
HDP 2.1 Stack
§ Overrides any Service definitions, commands and configurations
§ Adds new Services specific to this Stack
HDP 2.0 Stack
§ Defines a set of Service definitions
§ Default service configurations and command scripts
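The inheritance rule (a child stack overrides what it redefines and adds what is new) can be sketched as a dictionary merge. This is an illustration only, not Ambari's resolution code; the data layout, the file names and the service `NEW_SERVICE` are hypothetical.

```python
import copy


def resolve_stack(parent, child):
    """Return the effective service definitions of a child stack.

    Entries in the child override the parent's; services that exist only
    in the child are added. The parent is left untouched.
    """
    effective = copy.deepcopy(parent)
    for service, definition in child.items():
        merged = effective.setdefault(service, {})
        for section, values in definition.items():
            merged.setdefault(section, {}).update(values)  # child wins
    return effective


# Hypothetical base stack: definitions, default configs and command scripts.
hdp_2_0 = {
    "HDFS": {
        "configs": {"dfs.replication": "3"},
        "commands": {"start": "hdfs_start.py"},
    },
}

# Hypothetical child stack: overrides one config and adds a new service.
hdp_2_1 = {
    "HDFS": {"configs": {"dfs.replication": "2"}},
    "NEW_SERVICE": {"commands": {"start": "new_service_start.py"}},
}

effective = resolve_stack(hdp_2_0, hdp_2_1)
print(sorted(effective))
```

Anything the child does not mention, such as the HDFS start script here, is inherited unchanged from the parent.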
Automated Cluster Deployment
• Deploy clusters of any scale with ease
• Two REST API calls are all it takes to provision a cluster
Who uses it?
• HDInsight (Microsoft Azure)
• Hortonworks QA
Example: Create a 100-node Cluster
1. POST /api/v1/blueprints/my-blueprint
{
  "configurations" : [
    {
      "hdfs-site" : {
        "dfs.datanode.data.dir" : "/hadoop/1,/hadoop/2,/hadoop/3"
      }
    }
  ],
  "host_groups" : [
    {
      "name" : "master-host",
      "components" : [
        { "name" : "NAMENODE" },
        { "name" : "RESOURCEMANAGER" },
        …
      ],
      "cardinality" : "1"
    },
    {
      "name" : "worker-host",
      "components" : [
        { "name" : "DATANODE" },
        { "name" : "NODEMANAGER" },
        …
      ],
      "cardinality" : "1+"
    }
  ],
  "Blueprints" : {
    "stack_name" : "HDP",
    "stack_version" : "2.0"
  }
}
2. POST /api/v1/clusters/my-cluster
{
  "blueprint" : "my-blueprint",
  "host_groups" : [
    {
      "name" : "master-host",
      "hosts" : [
        { "fqdn" : "master001.ambari.apache.org" }
      ]
    },
    {
      "name" : "worker-host",
      "hosts" : [
        { "fqdn" : "worker001.ambari.apache.org" },
        { "fqdn" : "worker002.ambari.apache.org" },
        …
        { "fqdn" : "worker099.ambari.apache.org" }
      ]
    }
  ]
}
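Writing out 99 worker entries by hand is tedious, so the cluster creation template is usually generated. The sketch below builds the template for one master and 99 workers and prepares the second POST without sending it. The server address `ambari.example.com:8080` is a hypothetical placeholder, authentication is omitted, and the helper names are invented for illustration; the `X-Requested-By` header is included because the Ambari REST API expects it on modifying requests.

```python
import json
import urllib.request

AMBARI_URL = "http://ambari.example.com:8080"  # hypothetical server address


def build_cluster_template(blueprint_name, master_host, worker_hosts):
    """Map concrete hosts onto the blueprint's two host groups."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": "master-host", "hosts": [{"fqdn": master_host}]},
            {"name": "worker-host", "hosts": [{"fqdn": h} for h in worker_hosts]},
        ],
    }


def build_post(path, payload):
    """Prepare (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        AMBARI_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )


# worker001 ... worker099, matching the host names used in the example.
workers = [f"worker{i:03d}.ambari.apache.org" for i in range(1, 100)]
template = build_cluster_template(
    "my-blueprint", "master001.ambari.apache.org", workers
)
request = build_post("/api/v1/clusters/my-cluster", template)
print(request.get_method(), request.full_url, len(workers) + 1, "hosts")
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` with credentials added; that step is left out so the sketch runs without a server.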
Cluster Replication
• Export blueprint from an existing cluster
• Import blueprint to replicate the cluster
GET /api/v1/clusters/my-cluster?format=blueprint
{
  "configurations" : [
    {
      "cluster-env" : {
        "user_group" : "hadoop"
      },
      "hdfs-site" : {
        "dfs.datanode.data.dir" : "/hadoop/1,/hadoop/2,/hadoop/3"
      }
    }
  ],
  "host_groups" : [
    {
      "name" : "master-host",
      "components" : [
        { "name" : "NAMENODE" },
        { "name" : "RESOURCEMANAGER" },
        …
      ],
      "cardinality" : "1"
    }
  ],
  "Blueprints" : {
    "stack_name" : "HDP",
    "stack_version" : "2.0"
  }
}
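Replicating a cluster then comes down to pairing the exported blueprint with a new set of hosts. The following sketch shows one way to do that pairing; the function names, the base URL and the replica host name are hypothetical, and no request is actually sent.

```python
def export_url(base_url, cluster_name):
    """URL that returns an existing cluster's layout as a blueprint."""
    return f"{base_url}/api/v1/clusters/{cluster_name}?format=blueprint"


def template_for_replica(exported_blueprint, blueprint_name, hosts_by_group):
    """Build a cluster creation template that reuses the exported host groups."""
    groups = [g["name"] for g in exported_blueprint["host_groups"]]
    missing = [g for g in groups if g not in hosts_by_group]
    if missing:
        raise ValueError(f"no hosts supplied for host groups: {missing}")
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": g, "hosts": [{"fqdn": h} for h in hosts_by_group[g]]}
            for g in groups
        ],
    }


# Trimmed-down stand-in for an exported blueprint.
exported = {
    "host_groups": [{"name": "master-host", "cardinality": "1"}],
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.0"},
}

url = export_url("http://ambari.example.com:8080", "my-cluster")
replica = template_for_replica(
    exported, "my-blueprint", {"master-host": ["replica-master001.example.com"]}
)
print(url)
print(replica)
```

Failing early when a host group has no hosts avoids posting a template that the server would reject.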
Blueprint Features
Ambari 2.0:
• High availability (HA) cluster deployments
• Adding hosts using blueprints (AMBARI-8458)
Ambari 2.1:
• Advanced cluster creation options (AMBARI-10750)
Ambari 2.2:
• Kerberized cluster deployments (AMBARI-13431)
• Stack advisor recommendations (AMBARI-13487)
Stack Upgrades
§ Rolling vs Express Upgrade modes
§ Side-by-Side Bits and Configs
Bits: /usr/hdp/2.2.0.0-2041, /usr/hdp/2.2.4.2-2, /usr/hdp/2.3.0.0-3000
Configs: /etc/hive/conf/ (initial), /etc/hive/conf/v0 (HDP 2.2.4.2), /etc/hive/conf/v1 (HDP 2.3)
2.2.0.0 → 2.2.4.2: minor jump
2.2.4.2 → 2.3.0.0: major jump
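The minor versus major distinction can be expressed as a comparison of version components. The sketch below assumes that a jump is "minor" when the first two components stay the same and "major" otherwise; that rule is inferred from the three versions on the slide and is not a statement of how Ambari itself classifies upgrades.

```python
def parse_version(version):
    """'2.2.4.2-2' -> (2, 2, 4, 2); the build suffix after '-' is ignored."""
    return tuple(int(part) for part in version.split("-")[0].split("."))


def classify_jump(current, target):
    """Label an upgrade as 'minor' or 'major' (assumed rule, see lead-in)."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt <= cur:
        raise ValueError("target version must be newer than current version")
    return "minor" if cur[:2] == tgt[:2] else "major"


def bits_dir(version):
    """Side-by-side install directory, following the paths shown on the slide."""
    return f"/usr/hdp/{version}"


print(classify_jump("2.2.0.0-2041", "2.2.4.2-2"), bits_dir("2.2.4.2-2"))
print(classify_jump("2.2.4.2-2", "2.3.0.0-3000"), bits_dir("2.3.0.0-3000"))
```

Because each version lives in its own directory, the old bits remain on disk while the new ones are installed next to them.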
Express vs Rolling Upgrade
Rolling Upgrade
• Services are up the entire time
• Upgrade one component at a time
• Robust and fault-tolerant
• Service checks performed frequently during the upgrade
Express Upgrade
• All services are brought down, upgraded and restarted
• Faster upgrade mode
• Planned service downtime
• Service checks performed less frequently during the upgrade
Stack Upgrade – Install Version
§ Install new version in parallel on all agents
§ No downtime
Stack Upgrade – Orchestration
§ Not necessarily “one-click” but fully guided
Stack Upgrade – Upgrade Catalog
• Upgrades are driven by upgrade catalogs defined in stack definitions.
• Defines upgrade groups and upgrade order
• Provides ability to modify configurations
  – Set, move, delete, transform
• Upgrade steps can be marked as skippable and retryable
• Supports executing custom scripts during upgrade
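The four configuration operations (set, move, delete, transform) can be illustrated with a small interpreter over a dictionary of configs. The change format used here is hypothetical and is not the upgrade catalog's actual schema; the property names are only sample data.

```python
def apply_change(configs, change):
    """Apply one configuration change in place (hypothetical change format)."""
    section = configs.setdefault(change["type"], {})
    op, key = change["op"], change["key"]
    if op == "set":
        section[key] = change["value"]
    elif op == "delete":
        section.pop(key, None)
    elif op == "move":
        if key in section:
            section[change["to"]] = section.pop(key)  # rename the property
    elif op == "transform":
        if key in section:
            section[key] = change["fn"](section[key])
    else:
        raise ValueError(f"unknown operation: {op}")


configs = {
    "hive-site": {
        "old.property.name": "value-a",
        "obsolete.property": "x",
        "heap.size": "1024",
    }
}

changes = [
    {"op": "set", "type": "hive-site", "key": "new.feature.enabled", "value": "true"},
    {"op": "move", "type": "hive-site", "key": "old.property.name", "to": "new.property.name"},
    {"op": "delete", "type": "hive-site", "key": "obsolete.property"},
    {"op": "transform", "type": "hive-site", "key": "heap.size", "fn": lambda v: v + "m"},
]

for change in changes:
    apply_change(configs, change)
print(configs["hive-site"])
```

Keeping each change small and declarative is what makes individual upgrade steps easy to skip or retry.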
Stack Downgrade
§ Can trigger downgrade at any stage of the stack upgrade
§ Cannot downgrade once stack upgrade has been finalized
Hadoop Configuration Challenges
• Too many configurations
  – Which ones are important?
• Too easy to mess up
  – What are valid/reasonable values?
  – What are the units?
  – Ok, what about dependencies?
• Gets harder with combinations of services, host assignments, enabled features, CPU/RAM/disks, etc.
  – Any recommendations? What am I doing wrong?
• Smart Configurations
Ambari Smart Configs UI
Customizable layout
- Tabs
- Sections
- Sub-sections
- Simple grid layout
(Advanced Tab contains remaining configurations)
New Widgets
- Sliders
  - Recommended
  - Minimum
  - Maximum
  - Increment Step
- Combos
  - Enumerated values
- Toggles
  - Binary options
- Spinners
  - Splits value into multiple controls. Time in milliseconds split into days, hours, minutes.
- Lists
  - Enumerated values
  - Single select
  - Multi select
Implemented
- HDFS
- YARN
- MapReduce
- Hive
- HBase
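The spinner behaviour described above (one millisecond value shown as separate day, hour and minute controls) is plain integer arithmetic. A minimal sketch, with hypothetical function names, assuming any sub-minute remainder is simply dropped:

```python
MS_PER_MINUTE = 60 * 1000
MINUTES_PER_DAY = 24 * 60


def split_millis(total_ms):
    """Split a duration in milliseconds into (days, hours, minutes)."""
    total_minutes = total_ms // MS_PER_MINUTE  # sub-minute remainder is dropped
    days, rest = divmod(total_minutes, MINUTES_PER_DAY)
    hours, minutes = divmod(rest, 60)
    return days, hours, minutes


def join_millis(days, hours, minutes):
    """Inverse of split_millis for whole-minute durations."""
    return ((days * 24 + hours) * 60 + minutes) * MS_PER_MINUTE


parts = split_millis(93_780_000)
print(parts)
print(join_millis(*parts))
```

The stored property stays in milliseconds; only the presentation changes, which is why the inverse function matters when the user edits one of the controls.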
Stack Driven Layouts
Stack has theme.json file
Layout
§ Tabs
§ Sections
§ Sub-sections
Placement
§ Configs placement in sub-sections
Widgets
§ Widget type
§ Optional Units
  § Bytes (B, KB, MB, GB, TB, PB)
  § Time (Millis, Seconds, Minutes, Hours, Days, Months, Years)
{
  "name": "default",
  "description": "Default theme for HBASE service",
  "configuration": {
    "layouts": [
      {
        "name": "default",
        "tabs": [
          {
            "name": "settings",
            "display-name": "Settings",
            "layout": {
              "tab-columns": "3",
              "tab-rows": "3",
              "sections": [ ... ]
            }
          }
        ]
      }
    ],
    "placement": {
      "configuration-layout": "default",
      "configs": [...]
    },
    "widgets": [
      {
        "config": "hbase-env/hbase_master_heapsize",
        "widget": {
          "type": "slider",
          "units": [ { "unit-name": "GB" } ]
        }
      },
      ...
    ]
  }
}
Config Metadata and Dependencies
Extended Metadata
§ Defined in property_value_attributes
§ Hold non-UI metadata about value range, increment, unit, etc.
Dependencies
§ Models bi-directional relationship between configs
§ Depends On (property_depends_on)
  § Answers “which configs do I depend on?”
§ Depended By (dependencies)
  § Answers “which configs are dependent on me?”
§ Ambari automatically updates dependencies
{
  "StackConfigurations": {
    "final": "false",
    "property_depends_on": [
      {
        "type": "yarn-site",
        "name": "yarn.nodemanager.resource.memory-mb"
      }
    ],
    "property_description": "The minimum allocation for every",
    "property_display_name": "Minimum Container Size (Memory)",
    "property_name": "yarn.scheduler.minimum-allocation-mb",
    "property_type": [],
    "property_value": "512",
    "property_value_attributes": {
      "type": "int",
      "maximum": "5120",
      "minimum": "0",
      "unit": "MB",
      "increment_step": "256"
    },
    "type": "yarn-site.xml"
  },
  "dependencies": [
    {
      "StackConfigurationDependency": {
        "dependency_name": "hive.tez.container.size",
        "property_name": "yarn.scheduler.minimum-allocation-mb"
      }
    },
    {
      "StackConfigurationDependency": {
        "dependency_name": "mapreduce.map.memory.mb",
        "property_name": "yarn.scheduler.minimum-allocation-mb"
      }
    },
    {
      "StackConfigurationDependency": {
        "dependency_name": "mapreduce.reduce.memory.mb",
        "property_name": "yarn.scheduler.minimum-allocation-mb"
      }
    }…
  ]
}
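The two directions of the relationship carry the same information: "depended by" is the inverse of "depends on". The sketch below derives one from the other using the property names from the example above; the function name and the plain-dictionary layout are hypothetical simplifications, not Ambari's data model.

```python
from collections import defaultdict


def depended_by(depends_on):
    """Invert a 'depends on' mapping into a 'depended by' mapping."""
    reverse = defaultdict(list)
    for prop, parents in depends_on.items():
        for parent in parents:
            reverse[parent].append(prop)
    return {parent: sorted(children) for parent, children in reverse.items()}


# "Which configs do I depend on?" for each property.
depends_on = {
    "yarn.scheduler.minimum-allocation-mb": ["yarn.nodemanager.resource.memory-mb"],
    "hive.tez.container.size": ["yarn.scheduler.minimum-allocation-mb"],
    "mapreduce.map.memory.mb": ["yarn.scheduler.minimum-allocation-mb"],
    "mapreduce.reduce.memory.mb": ["yarn.scheduler.minimum-allocation-mb"],
}

# "Which configs are dependent on me?"
reverse = depended_by(depends_on)
print(reverse["yarn.scheduler.minimum-allocation-mb"])
```

With the inverse mapping in hand, a change to one property immediately identifies which other properties need their recommended values recomputed.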
Ambari Metrics Service (AMS) - Goals
• Ability to collect metrics from Hadoop and other Stack services
• Ability to collect system-level metrics
• Ability to retain metrics at a high precision for a configurable time period
• Ability to automatically purge metrics after the retention period
• Provide an integration point for metrics collection and retention by an external system
• Trigger alerts based on metrics in Ambari
AMS Grafana
Ambari 2.2.2
• Powerful dashboard builder integrated with AMS
• Pre-built Grafana dashboards for host-level and service-level metrics
• User can build and save custom dashboards
Alert – Types
Type | Description | Status | Thresholds Configurable?
PORT | Watches a port based on a configuration property such as the URI. | OK, WARN, CRIT | Yes (seconds)
WEB | Watches an HTTP or HTTPS endpoint and determines connectivity and HTTP status code. | OK, WARN, CRIT | No
AGGREGATE | Aggregate of status for another alert definition. | OK, WARN, CRIT | Yes (percentage)
METRIC | Watches a metric or series of metrics in JMX and compares a mathematical result against a threshold. | OK, WARN, CRIT | Yes (variable)
SCRIPT | Uses a custom script to handle checking. | OK or CRIT | No
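How a threshold turns a measurement into a status can be sketched in a few lines. The functions below are illustrative only: the names, the "higher is worse" convention and the sample thresholds are assumptions, and the AGGREGATE example assumes the percentage refers to the share of child alerts that are not OK.

```python
def metric_state(value, warn, crit):
    """Map a measured value to OK / WARN / CRIT (higher values are worse)."""
    if value >= crit:
        return "CRIT"
    if value >= warn:
        return "WARN"
    return "OK"


def aggregate_state(child_states, warn_pct, crit_pct):
    """Status of an aggregate alert from the share of children that are not OK."""
    if not child_states:
        return "OK"
    not_ok = sum(1 for state in child_states if state != "OK")
    return metric_state(100.0 * not_ok / len(child_states), warn_pct, crit_pct)


# Hypothetical thresholds: warn at 70, critical at 90.
print([metric_state(v, warn=70, crit=90) for v in (50, 75, 95)])

# Two of ten child alerts are critical: 20% not OK, warn at 10%, critical at 30%.
print(aggregate_state(["OK"] * 8 + ["CRIT"] * 2, warn_pct=10, crit_pct=30))
```

PORT alerts follow the same pattern with response time in seconds as the measured value.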
UI – Current Alerts
Configured by default; managed via the web client
UI – Host Alerts
§ Automatically refreshes
§ Query alert history
UI – Customization & Instances
§ Status text, thresholds, and interval
Ambari Views
View Framework
§ Provides various applications accessible from the Ambari Web UI – interact with the cluster via a browser from a single place for all users (cluster operators, data analysts, developers, etc.)
Easy to develop
§ No need to understand Ambari core code – view development is just like creating any other web application
Easy to deploy
§ Packaged as a single jar file
§ Auto create / auto configure
CS Queue Manager for Cluster Operators
Capacity Scheduler Queue Manager
HDFS File Browser for General Users
HDFS File Browser
Job Analysis for Developers
Troubleshoot Tez Jobs
Troubleshoot / Improve Hive queries
Query Editors for Data Analysts
Create, edit, execute, and analyze Hive queries
Create, edit, and execute Pig scripts
Ambari Server in Views-Only mode
[Diagram: one Ambari Server manages its cluster and serves Views; a second Ambari Server in “Views-only” mode (aka “Stand-alone” mode) serves Views for a cluster not managed by Ambari]
• Use Views on existing clusters not managed by Ambari
• Can use Views against multiple clusters
Kerberos Automation
Ambari 2.0
• Ambari manages Kerberos principals and keytabs
• Works with an existing MIT KDC or Active Directory
• Once Kerberized, seamlessly handles:
  – Adding new hosts
  – Adding new components to existing hosts
  – Adding new services
  – Moving components to different hosts
Thank You!
Try Ambari
• Follow the Ambari Quick Start Guide: https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide
Learn more
• Visit the project website: http://ambari.apache.org/
Get Involved
• User Mailing List: user-[email protected]
• Developer Mailing List: dev-[email protected]
• Use JIRA to file bugs and improvement requests: https://issues.apache.org/jira/browse/AMBARI/
Jayush Luniya @ Hortonworks (Apache Ambari PMC)
Future Roadmap
• AMS Grafana Integration
• Ambari Management Packs
• Ambari Logsearch
• Patch Upgrades
• Multi Service Versions
• Multi Service Instances