Oozie High Availability (Hadoop Summit 2014 Meetup)

Posted on 03-Nov-2014

DESCRIPTION

by Robert Kanter (Cloudera)

TRANSCRIPT

1

Oozie High Availability (HA)

Robert Kanter

2

High Availability

• A system without unplanned downtime when partial failures occur

• Typically achieved by having redundancies and removing single points of failure

• Our Goals
  • Don’t change the API or usage patterns
  • Users don’t even have to know it’s HA

3

The HA Solution: Architectural Overview

4

The HA Solution: Database

• Oozie stores all state in a database (submitted jobs, workflow definitions, etc.)

• Instead of a failover model, we want to run many Oozie servers against the same database
  • Active-Active HA
  • Also provides horizontal scalability

• ZooKeeper for coordination
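To make the active-active idea concrete, here is a minimal sketch of several servers sharing one job store while a lock service keeps any job from being handled twice. It is not Oozie's implementation: `LockService`, `Server`, and the job names are hypothetical, and the in-memory lock table only stands in for the coordination that ZooKeeper would provide.

```python
from dataclasses import dataclass, field


class LockService:
    """In-memory stand-in for a distributed lock service (assumption)."""

    def __init__(self):
        self._owners = {}

    def try_acquire(self, job_id, server):
        # Grant the lock only if nobody currently holds it.
        if job_id in self._owners:
            return False
        self._owners[job_id] = server
        return True

    def release(self, job_id, server):
        # Only the current owner may release the lock.
        if self._owners.get(job_id) == server:
            del self._owners[job_id]


@dataclass
class Server:
    name: str
    locks: LockService
    processed: list = field(default_factory=list)

    def poll(self, database):
        """Advance every pending job this server manages to lock."""
        for job_id, status in list(database.items()):
            if status != "PENDING":
                continue
            if not self.locks.try_acquire(job_id, self.name):
                continue  # another server is already working on this job
            try:
                database[job_id] = "DONE"
                self.processed.append(job_id)
            finally:
                self.locks.release(job_id, self.name)


# One shared "database" and lock service, two servers (all names hypothetical).
database = {"job-1": "PENDING", "job-2": "PENDING", "job-3": "PENDING"}
locks = LockService()
server_a = Server("hostA", locks)
server_b = Server("hostB", locks)

# Simulate hostB already holding the lock on job-2 while hostA polls.
locks.try_acquire("job-2", "hostB")
server_a.poll(database)
locks.release("job-2", "hostB")
server_b.poll(database)

print(server_a.processed)
print(server_b.processed)
```

Because every server reads the same store and takes a lock before acting, adding a server adds capacity without any job being assigned to a particular host.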

5

The HA Solution: Database

6

The HA Solution: Access

• Users and client programs need a single address to connect (Web UI, REST/Java API, JobTracker callbacks, etc)

• Load Balancer, Virtual IP, or DNS round-robin can be used to provide a single entry point to the Oozie servers
  • Technically, this entry point also needs to be HA

7

The HA Solution: Access

8

The HA Solution: Log Streaming

• Oozie’s log files are not in the database
  • Each Oozie Server only has access to its own logs

• Jobs are not assigned to a specific Oozie server

• What if Oozie Server A wants to get logs for a job processed by Oozie Server B?
  • Oozie Server A can ask Oozie Server B for its logs

• Caveat: If an Oozie Server goes down, any logs from it will be unavailable until it is brought back up
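The log streaming behaviour, including the caveat about servers that are down, can be sketched roughly as follows. This is an illustration only: `fetch_logs`, `collate_logs`, and the `cluster` dictionary are hypothetical, and the real servers exchange logs over the network rather than through a shared Python structure.

```python
class ServerDown(Exception):
    """Raised when a peer server cannot be reached."""


def fetch_logs(server, job_id, cluster):
    # Hypothetical stand-in for asking one server for its own log lines.
    state = cluster[server]
    if not state["up"]:
        raise ServerDown(server)
    return [line for line in state["logs"] if job_id in line]


def collate_logs(job_id, cluster):
    """Gather a job's log lines from every reachable server."""
    lines, unavailable = [], []
    for server in sorted(cluster):
        try:
            lines.extend(fetch_logs(server, job_id, cluster))
        except ServerDown:
            unavailable.append(server)  # its logs stay missing until it is back
    # Each line starts with a timestamp, so a plain sort is chronological.
    lines.sort()
    return lines, unavailable


cluster = {
    "hostA": {"up": True, "logs": ["2013-09-29 16:46:22 SERVER[hostA] JOB[job-1] end"]},
    "hostB": {"up": False, "logs": ["2013-09-29 16:46:21 SERVER[hostB] JOB[job-1] run"]},
    "hostC": {"up": True, "logs": ["2013-09-29 16:46:20 SERVER[hostC] JOB[job-1] start",
                                   "2013-09-29 16:46:23 SERVER[hostC] JOB[job-2] start"]},
}

lines, unavailable = collate_logs("job-1", cluster)
for line in lines:
    print(line)
print("unavailable:", unavailable)
```

With hostB marked as down, the result contains only the lines from hostA and hostC, and hostB is reported as unavailable.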

9

The HA Solution: Log Streaming

10

How to Enable HA: Configuration and Security

11

How to Enable HA

• Set up a load balancer, ZooKeeper ensemble, HA database, and multiple identically configured Oozie servers

• Enable Oozie HA services:

<property>
  <name>oozie.services.ext</name>
  <value>
    org.apache.oozie.service.ZKLocksService,
    org.apache.oozie.service.ZKXLogStreamingService,
    org.apache.oozie.service.ZKJobsConcurrencyService
  </value>
</property>

12

How to Enable HA

• Point Oozie to ZooKeeper Ensemble:

<property>
  <name>oozie.zookeeper.connection.string</name>
  <value>ZK_HOST1:2181,ZK_HOST2:2181</value>
</property>

• Point environment variable for callbacks to load balancer:

export OOZIE_BASE_URL="http://loadbalancer:11000/oozie"
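Since every Oozie server must be configured identically, a small check script can help catch a server that is missing one of the settings above. The sketch below is an assumption-laden example, not an Oozie tool: `read_properties` and `missing_ha_settings` are hypothetical helper names, and the sample XML assumes the usual `<configuration>` wrapper of a Hadoop-style site file.

```python
import xml.etree.ElementTree as ET

# Services and property names taken from the slides above.
REQUIRED_SERVICES = {
    "org.apache.oozie.service.ZKLocksService",
    "org.apache.oozie.service.ZKXLogStreamingService",
    "org.apache.oozie.service.ZKJobsConcurrencyService",
}
ZK_PROPERTY = "oozie.zookeeper.connection.string"


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def missing_ha_settings(props):
    """List the HA settings from the slides that are absent from props."""
    problems = []
    raw = props.get("oozie.services.ext", "")
    configured = {item.strip() for item in raw.split(",") if item.strip()}
    for service in sorted(REQUIRED_SERVICES - configured):
        problems.append("missing service: " + service)
    if not props.get(ZK_PROPERTY):
        problems.append("missing " + ZK_PROPERTY)
    return problems


SAMPLE = """
<configuration>
  <property>
    <name>oozie.services.ext</name>
    <value>
      org.apache.oozie.service.ZKLocksService,
      org.apache.oozie.service.ZKXLogStreamingService,
      org.apache.oozie.service.ZKJobsConcurrencyService
    </value>
  </property>
  <property>
    <name>oozie.zookeeper.connection.string</name>
    <value>ZK_HOST1:2181,ZK_HOST2:2181</value>
  </property>
</configuration>
"""

print(missing_ha_settings(read_properties(SAMPLE)))
```

An empty list means both settings from the slides are present; the script does not check `OOZIE_BASE_URL`, which is set in the environment rather than in the XML.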

13

How to Enable HA: Security

• Extra step to configure Kerberos with Load Balancer:

<property>
  <name>oozie.authentication.kerberos.principal</name>
  <value>HTTP/loadbalancer@REALM</value>
</property>

• Note: this currently prevents clients from talking directly to any Oozie server

14

How to Enable HA: Security

• Enable Kerberos connection to ZooKeeper and ACLs:

<property>
  <name>oozie.zookeeper.secure</name>
  <value>true</value>
</property>

• ACLs prevent malicious users or programs from interfering with Oozie’s znodes

15

Using Oozie with HA

16

Using Oozie with HA

• New Oozie CLI/REST API command to list all servers

$ oozie admin -oozie http://loadbalancer:11000/oozie -servers
hostA : http://hostA:11000/oozie
hostB : http://hostB:11000/oozie
hostC : http://hostC:11000/oozie

• Log messages now include which server wrote them

2013-09-29 16:46:20,182 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostA] USER[root] GROUP[-] TOKEN[] APP[demo-wf] JOB[0000000-130925230553293-oozie-oozi-W] ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] [***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
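Both outputs on this slide are easy to consume from scripts. The following sketch parses the server list and pulls the `SERVER[...]`-style fields out of a log line; `parse_servers` and `parse_log_fields` are hypothetical helpers, and the formats are assumed to match the samples shown on the slide exactly.

```python
import re


def parse_servers(output):
    """Turn 'host : url' lines into a {host: url} dictionary."""
    servers = {}
    for line in output.splitlines():
        host, sep, url = line.partition(" : ")
        if sep:  # skip lines that do not follow the 'host : url' shape
            servers[host.strip()] = url.strip()
    return servers


# Matches upper-case tags such as SERVER[hostA] or TOKEN[].
FIELD = re.compile(r"([A-Z]+)\[([^\]]*)\]")


def parse_log_fields(line):
    """Return the NAME[value] fields of one log line as a dictionary."""
    return dict(FIELD.findall(line))


servers_output = """hostA : http://hostA:11000/oozie
hostB : http://hostB:11000/oozie
hostC : http://hostC:11000/oozie"""

log_line = (
    "2013-09-29 16:46:20,182 WARN org.apache.oozie.command.wf.ActionStartXCommand: "
    "SERVER[hostA] USER[root] GROUP[-] TOKEN[] APP[demo-wf] "
    "JOB[0000000-130925230553293-oozie-oozi-W] "
    "ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] "
    "[***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING"
)

servers = parse_servers(servers_output)
fields = parse_log_fields(log_line)
print(sorted(servers))
print(fields["SERVER"], fields["JOB"])
```

Looking up `fields["SERVER"]` in `servers` then gives the URL of the server that wrote the message.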

17

To Do: What’s Left

18

To Do

• HA support for SLAs and HCatalog integration

• Sharelib Purging with HA

• Log Streaming HA

• With Kerberos, Oozie servers can’t talk to each other
  • Breaks log streaming, sharelibupdate

• Other misc improvements
