Painless Build and Deploy for YARN Applications with Spring

© 2014 SpringOne 2GX. All rights reserved. Do not distribute without permission. Spring YARN By Janne Valkealahti

Upload: spring-io

Post on 20-Aug-2015


TRANSCRIPT


Agenda

• Hadoop YARN Intro
• Spring YARN Intro
• Concepts
• Demos
  – Simple Application
  – Multi Project
  – Testing
  – Spring Batch
  – Container Groups

http://github.com/SpringOne2GX-2014/JanneValkealahti-SpringYarn

Hadoop YARN

• Hadoop v1 vs. v2
• Is a Resource Scheduler
• Is not a Task Scheduler
• YARN != Hadoop v2
• MapReduce v2 is a YARN Application
• Big Investment – Re-use Outside of MapReduce


YARN Components

[Figure: YARN components. Labels: Client, Resource Manager, Node Manager (Appmaster, Container), Node Manager (Container)]

Spring YARN

• Is a Framework
• Run Spring Contexts on YARN
• Application Configuration
• No Boilerplate for Something Simple
• Extend to Create more Complex Applications
• Not Meant for Existing Apps


Demo: Simple Application – Single Project

Spring YARN Concepts - Client

• Access YARN Cluster
• Submit / Control Running Applications
• Launch Context for Appmaster
  – Config
  – Application Files (Localization)
  – Environment


Spring YARN Concepts - Appmaster

• Control the Running Application Instance
• Appmaster is a main() of the Application
• Lifecycle
• Controls and Launches YARN Containers
• Launch Context for Container
  – Config
  – Container Files (Localization)
  – Environment


Spring YARN Concepts - Container

• Heavy Lifting of the Application
• Run / Do Something and Exit
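
To make the three roles concrete, here is a minimal plain-Java sketch of how they relate: the client prepares a launch context and submits, the appmaster acts as the application's main() and launches containers, and each container does its work and exits. Every class, method, file name, and environment key below is hypothetical and chosen for illustration; none of them are Spring YARN or Hadoop types, and a real application would go through the YARN Resource Manager and Node Managers instead of direct method calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these names are hypothetical, not Spring YARN or Hadoop APIs.
final class YarnRolesSketch {

    /** What a launcher hands over: files to localize and environment variables. */
    static final class LaunchContext {
        final List<String> files = new ArrayList<>();
        final Map<String, String> environment = new LinkedHashMap<>();
    }

    /** Container: does the heavy lifting, then exits with a status code. */
    static int runContainer(LaunchContext ctx, List<String> log) {
        String index = ctx.environment.get("CONTAINER_INDEX");
        log.add("container-" + index + " running with files " + ctx.files);
        return 0; // 0 means the container finished normally
    }

    /** Appmaster: the application's main(); launches containers and tracks results. */
    static boolean runAppmaster(LaunchContext appmasterCtx, int containerCount, List<String> log) {
        log.add("appmaster started with files " + appmasterCtx.files);
        boolean allOk = true;
        for (int i = 0; i < containerCount; i++) {
            // The appmaster builds a separate launch context for each container.
            LaunchContext containerCtx = new LaunchContext();
            containerCtx.files.add("container.jar");
            containerCtx.environment.put("CONTAINER_INDEX", String.valueOf(i));
            allOk &= runContainer(containerCtx, log) == 0;
        }
        log.add("appmaster finished");
        return allOk;
    }

    /** Client: prepares the appmaster launch context and submits the application. */
    static List<String> submit(int containerCount) {
        List<String> log = new ArrayList<>();
        LaunchContext appmasterCtx = new LaunchContext();
        appmasterCtx.files.add("appmaster.jar");
        log.add("client submitted application");
        boolean ok = runAppmaster(appmasterCtx, containerCount, log);
        log.add(ok ? "application FINISHED" : "application FAILED");
        return log;
    }

    public static void main(String[] args) {
        for (String line : submit(2)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is the layering: there are two launch contexts, one built by the client for the appmaster and one built by the appmaster for each container.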


Project Setup

• Gradle vs. Maven
• Single vs. Multi Project Setup


Multi Project

• What Gets Packaged
• Client vs. Appmaster vs. Container
• Spring Profiles


Boot Application Model

• Spring Boot
• End to End
  – Project, Build, Launch, Control
• Heavily Based on Boot AutoConfiguration
  – @OnYarnAppmasterCondition
  – @OnYarnContainerCondition
  – @OnYarnClientCondition
• JavaConfig
• Yaml Config Files


Boot Application Model

• Container
  – @YarnComponent, @OnContainerStart
  – @YarnEnvironments, @YarnParameters


    @YarnComponent
    public class HelloPojo {

        @OnContainerStart
        public void myMethod() throws Exception {
            // do your stuff here
        }

    }

Demo: Simple Application – Multi Project

Testing with YARN

• Testing is Difficult
• Spring YARN to the Rescue
• Spring Test / Spring YARN Test
• @MiniYarnCluster / @MiniYarnClusterTest
• AbstractBootYarnClusterTests
• YARN Configuration from a Mini Cluster


Test - JUnit


    @MiniYarnClusterTest
    public class AppIT extends AbstractBootYarnClusterTests {

        @Test
        public void testApp() throws Exception {
            ApplicationInfo info = submitApplicationAndWait(ClientApplication.class, new String[0]);
            assertThat(info.getYarnApplicationState(), is(YarnApplicationState.FINISHED));
        }

    }

Demo: Testing Application

Spring Batch Partitioned Steps

• Run Partitioned Steps on YARN Containers
• HDFS File Input Splits and Colocation
  – StaticBlockSplitter, StaticLengthSplitter, SlopBlockSplitter
• Main Job Executed on Appmaster
• Re-start Failed Job
• Run Only Failed Steps
• Job Repository – In-memory / Database
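
The idea behind input splits can be sketched in plain Java: cut a file of known length into byte ranges, so that each range can be handed to a partitioned step. The sketch assumes that fixed-length splitting means every range has the same length except possibly the last; the slides do not spell out the exact behavior of StaticLengthSplitter, and the class and method names below are hypothetical, not part of Spring for Apache Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fixed-length splitting; not the library's StaticLengthSplitter.
final class FixedLengthSplitSketch {

    /** A byte range of a file: [start, start + length). */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return start + "+" + length;
        }
    }

    /** Cuts a file of fileLength bytes into ranges of at most splitLength bytes. */
    static List<Range> split(long fileLength, long splitLength) {
        if (splitLength <= 0) {
            throw new IllegalArgumentException("splitLength must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitLength) {
            // The last range is shorter when the file length is not a multiple of splitLength.
            ranges.add(new Range(start, Math.min(splitLength, fileLength - start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(split(2500, 1000)); // prints [0+1000, 1000+1000, 2000+500]
    }
}
```

Each resulting range carries a start offset and a length, which are the same two values a reader on the container side needs in order to process only its own part of the file.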


Spring Batch Input Splits - Appmaster


    @Bean
    @StepScope
    protected Partitioner partitioner(
            @Value(BatchSystemConstants.JP_SPEL_KEY_INPUTPATTERNS) String inputPatterns)
            throws IOException {
        SplitterPartitioner partitioner = new SplitterPartitioner();
        partitioner.setSplitter(splitter());
        partitioner.setInputPatterns(inputPatterns);
        return partitioner;
    }

    @Bean
    protected Splitter splitter() {
        return new StaticLengthSplitter(1000);
    }

Spring Batch Input Splits - Container


    @Bean
    @StepScope
    protected DataStoreItemReader<String> itemReader(
            @Value(SEC_SPEL_KEY_FILENAME) String fileName,
            @Value(SEC_SPEL_KEY_SPLITSTART) Long start,
            @Value(SEC_SPEL_KEY_SPLITLENGTH) Long length) {
        Split split = new GenericSplit(start, length, null);
        DataStoreItemReader<String> reader = new DataStoreItemReader<String>();
        reader.setDataStoreReader(
                new TextFileReader(configuration, new Path(fileName), null, split, null));
        reader.setLineDataMapper(new PassThroughLineDataMapper());
        return reader;
    }

Demo: Spring Batch

Demo – Spring Batch

[Figure: Spring Batch demo.
Appmaster: master1 - remoteStep1, master2 - remoteStep2.
Containers: remoteStep1:partition0, remoteStep1:partition1, remoteStep2:partition0, remoteStep2:partition1.
HDFS: /tmp/remoteStep1partition0, /tmp/remoteStep1partition1, /tmp/remoteStep2partition0, /tmp/remoteStep2partition1.]

Application Client

• Command-line vs. Shell
• Shell Based on Boot CLI
• No Universal Client
• Easy to Create Your Own Client
• Shell Makes Command Execution Faster


Container Grouping

• Provides Multiple Container Types
• Concepts of Grouping and Clustering
• Group Projected from a Container Pool
  – i.e. any, hosts, racks
• Group has a Target Projection State
• Create On-Demand
• Change Group Size
• Re-start Failed Containers
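
One way to picture a group with a target state is as a reconcile loop: the group records how many members it should have, and a reconcile step starts or stops containers until the running set matches. The sketch below is plain Java under that assumption; it ignores the host and rack projections, and all names in it are hypothetical and do not reflect the Spring YARN container-group API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a container group; not the Spring YARN container-group API.
final class ContainerGroupSketch {
    private final String name;
    private int targetSize; // desired number of running members
    private final List<String> running = new ArrayList<>();
    private int nextId = 0;

    ContainerGroupSketch(String name, int targetSize) {
        this.name = name;
        this.targetSize = targetSize;
    }

    /** Changes the group size; takes effect on the next reconcile(). */
    void setTargetSize(int targetSize) {
        this.targetSize = targetSize;
    }

    /** Records that a member died; reconcile() will start a replacement. */
    void containerFailed(String id) {
        running.remove(id);
    }

    /** Starts or stops members until the running set matches the target size. */
    List<String> reconcile() {
        List<String> actions = new ArrayList<>();
        while (running.size() < targetSize) {
            String id = name + "-" + nextId++;
            running.add(id);
            actions.add("start " + id);
        }
        while (running.size() > targetSize) {
            String id = running.remove(running.size() - 1); // stop the newest member first
            actions.add("stop " + id);
        }
        return actions;
    }

    List<String> running() {
        return new ArrayList<>(running);
    }

    public static void main(String[] args) {
        ContainerGroupSketch group = new ContainerGroupSketch("consumer", 2);
        System.out.println(group.reconcile()); // create on demand
        group.containerFailed("consumer-0");
        System.out.println(group.reconcile()); // restart a failed container
        group.setTargetSize(1);
        System.out.println(group.reconcile()); // change group size
    }
}
```

Creating on demand, resizing, and restarting failed containers then all reduce to the same operation: adjust the target or the running set, and reconcile.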


Web Application UI

• Spring Boot
• Spring MVC
• Embedded Servlet Container
• Template Engines


Demo: Container Groups - RabbitMQ

Demo – Container Groups

[Figure: Container Groups demo. Labels: Client, Appmaster, Container (Producer), Container (Consumer)]

Thank You, Questions?
