Painless Build and Deploy for YARN Applications with Spring
© 2014 SpringOne 2GX. All rights reserved. Do not distribute without permission.
Spring YARN
By Janne Valkealahti
Agenda
• Hadoop YARN Intro
• Spring YARN Intro
• Concepts
• Demos
  – Simple Application
  – Multi Project
  – Testing
  – Spring Batch
  – Container Groups
http://github.com/SpringOne2GX-2014/JanneValkealahti-SpringYarn
Hadoop YARN
• Hadoop v1 vs. v2
• Is a Resource Scheduler
• Is not a Task Scheduler
• YARN != Hadoop v2
• MapReduce v2 is a YARN Application
• Big Investment – Re-use Outside of MapReduce
Spring YARN
• Is a Framework
• Run Spring Contexts on YARN
• Application Configuration
• No Boilerplate for Something Simple
• Extend to Create more Complex Applications
• Not Meant for Existing Apps
Spring YARN Concepts - Client
• Access YARN Cluster
• Submit / Control Running Applications
• Launch Context for Appmaster
  – Config
  – Application Files (Localization)
  – Environment
Spring YARN Concepts - Appmaster
• Control the Running Application Instance
• Appmaster is a main() of the Application
• Lifecycle
• Controls and Launches YARN Containers
• Launch Context for Container
  – Config
  – Container Files (Localization)
  – Environment
Boot Application Model
• Spring Boot
• End to End
  – Project, Build, Launch, Control
• Heavily Based on Boot AutoConfiguration
  – @OnYarnAppmasterCondition
  – @OnYarnContainerCondition
  – @OnYarnClientCondition
• JavaConfig
• YAML Config Files
Boot Application Model
• Container
  – @YarnComponent, @OnContainerStart
  – @YarnEnvironments, @YarnParameters
@YarnComponent
public class HelloPojo {

    @OnContainerStart
    public void myMethod() throws Exception {
        // do your stuff here
    }
}
Testing with YARN
• Testing is Difficult
• Spring YARN to the Rescue
• Spring Test / Spring YARN Test
• @MiniYarnCluster / @MiniYarnClusterTest
• AbstractBootYarnClusterTests
• Yarn Configuration from a Mini Cluster
Test - JUnit
@MiniYarnClusterTest
public class AppIT extends AbstractBootYarnClusterTests {

    @Test
    public void testApp() throws Exception {
        ApplicationInfo info = submitApplicationAndWait(ClientApplication.class, new String[0]);
        assertThat(info.getYarnApplicationState(), is(YarnApplicationState.FINISHED));
    }
}
Spring Batch Partitioned Steps
• Run Partitioned Steps on YARN Containers
• HDFS File Input Splits and Colocation
  – StaticBlockSplitter, StaticLengthSplitter, SlopBlockSplitter
• Main Job Executed on Appmaster
• Re-start Failed Job
• Run Only Failed Steps
• Job Repository – In-memory / Database
Spring Batch Input Splits - Appmaster
@Bean
@StepScope
protected Partitioner partitioner(
        @Value(BatchSystemConstants.JP_SPEL_KEY_INPUTPATTERNS) String inputPatterns) throws IOException {
    SplitterPartitioner partitioner = new SplitterPartitioner();
    partitioner.setSplitter(splitter());
    partitioner.setInputPatterns(inputPatterns);
    return partitioner;
}

@Bean
protected Splitter splitter() {
    return new StaticLengthSplitter(1000);
}
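To make the idea of a fixed-length input split concrete, here is a minimal plain-Java sketch. It is not the StaticLengthSplitter implementation; it assumes the constructor argument is a split length in bytes and that a file is cut into consecutive ranges of at most that length. The class StaticLengthSplitSketch and its Range type are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: NOT the library's StaticLengthSplitter.
// Assumes a split is a (start, length) byte range, as used on the container side.
final class StaticLengthSplitSketch {

    /** One byte range of a file: [start, start + length). */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "Range(start=" + start + ", length=" + length + ")";
        }
    }

    /** Cuts a file of fileLength bytes into consecutive ranges of at most splitLength bytes. */
    static List<Range> split(long fileLength, long splitLength) {
        if (fileLength < 0 || splitLength <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitLength > 0");
        }
        List<Range> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitLength) {
            // The last range may be shorter than splitLength.
            ranges.add(new Range(start, Math.min(splitLength, fileLength - start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 2500-byte file with 1000-byte splits yields three ranges.
        for (Range r : split(2500, 1000)) {
            System.out.println(r);
        }
    }
}
```

Under these assumptions each range would become one partition, and a container would receive its range as the start and length values for its reader.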
Spring Batch Input Splits - Container
@Bean
@StepScope
protected DataStoreItemReader<String> itemReader(
        @Value(SEC_SPEL_KEY_FILENAME) String fileName,
        @Value(SEC_SPEL_KEY_SPLITSTART) Long start,
        @Value(SEC_SPEL_KEY_SPLITLENGTH) Long length) {
    Split split = new GenericSplit(start, length, null);
    DataStoreItemReader<String> reader = new DataStoreItemReader<String>();
    reader.setDataStoreReader(new TextFileReader(configuration, new Path(fileName), null, split, null));
    reader.setLineDataMapper(new PassThroughLineDataMapper());
    return reader;
}
Demo – Spring Batch
Appmaster
• master1 - remoteStep1
• master2 - remoteStep2
Containers
• remoteStep1:partition0
• remoteStep1:partition1
• remoteStep2:partition0
• remoteStep2:partition1
HDFS
• /tmp/remoteStep1partition0
• /tmp/remoteStep1partition1
• /tmp/remoteStep2partition0
• /tmp/remoteStep2partition1
Application Client
• Command-line vs. Shell
• Shell Based on Boot CLI
• No Universal Client
• Easy to Create Your Own Client
• Shell Makes Command Execution Faster
Container Grouping
• Provides Multiple Container Types
• Concepts of Grouping and Clustering
• Group Projected from a Container Pool
  – i.e. any, hosts, racks
• Group has a Target Projection State
• Create On-Demand
• Change Group Size
• Re-start Failed Containers
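The notion of a group with a target projection state can be sketched in plain Java as a small reconciliation model. This is an illustration under assumptions, not Spring YARN's container grouping API: the class ContainerGroupSketch and its methods are hypothetical, and the projection is reduced to a single target size (the "any" case, with no host or rack placement).

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only: NOT the Spring YARN container grouping API.
// Models a group whose desired state is simply a target number of containers.
final class ContainerGroupSketch {

    private final String name;
    private int targetSize;
    private final Set<String> running = new LinkedHashSet<>();

    ContainerGroupSketch(String name, int targetSize) {
        this.name = name;
        setTargetSize(targetSize);
    }

    /** Changes the group size; the next delta() reflects the new target. */
    void setTargetSize(int targetSize) {
        if (targetSize < 0) {
            throw new IllegalArgumentException("targetSize must be >= 0");
        }
        this.targetSize = targetSize;
    }

    void containerStarted(String containerId) {
        running.add(containerId);
    }

    /** A failed container leaves the group, so delta() asks for a replacement. */
    void containerFailed(String containerId) {
        running.remove(containerId);
    }

    /** Positive: containers to request. Negative: containers to release. Zero: on target. */
    int delta() {
        return targetSize - running.size();
    }

    public static void main(String[] args) {
        ContainerGroupSketch group = new ContainerGroupSketch("workers", 2);
        System.out.println(group.name + " delta after creation: " + group.delta());
        group.containerStarted("c1");
        group.containerStarted("c2");
        group.containerFailed("c1");
        System.out.println(group.name + " delta after one failure: " + group.delta());
        group.setTargetSize(0);
        System.out.println(group.name + " delta after ramp-down: " + group.delta());
    }
}
```

In this model, creating a group on demand, resizing it, and restarting failed containers all reduce to the same step: compare the running set with the target and act on the difference.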