Copyright © 2016 NTT DATA Corporation 5/9/2016 NTT DATA Corporation Akira Ajisaka Apache Hadoop 3, Current Status Apache: Big Data North America 2016

Post on 02-Feb-2018


  • Copyright 2016 NTT DATA Corporation

    5/9/2016
    NTT DATA Corporation
    Akira Ajisaka

    Apache Hadoop 3, Current Status

    Apache: Big Data North America 2016

  • 2 Copyright 2016 NTT DATA Corporation

    Self introduction

    - Akira Ajisaka (NTT DATA)
      - Apache Hadoop Committer & PMC member
        - 130+ commits in 2015
        - Working on usability and supportability
      - "Open-Source Professional Services" team
        - Has deployed and supported 10k+ nodes of Hadoop clusters overall for 7 years
        - Contributing to Apache Hadoop: 6th in the world together with NTT [1]
        - ASF Committers: Spark, Yetus, HTrace (incubating)

    [1] The Activities of Apache Hadoop Community 2014 http://ajisakaa.blogspot.com/2015/02/the-activities-of-apache-hadoop.html

  • 3 Copyright 2016 NTT DATA Corporation

    Agenda

    - What's Apache Hadoop 3?
    - Differences between Hadoop 3 and 2
      - New Features
      - Incompatible Changes
    - Current Status
    - Summary

  • Copyright 2016 NTT DATA Corporation 4

    What's Apache Hadoop 3?

  • 5 Copyright 2016 NTT DATA Corporation

    Disclaimer

    - Apache Hadoop 3 is not yet defined
      - Not released
      - Not yet decided which features are in or out
    - This presentation is not always correct
    - I'd like to introduce it as far as I know

  • 6 Copyright 2016 NTT DATA Corporation

    What's Apache Hadoop 3?

    [Figure: release timeline, 2009 to 2016]

    - Hadoop 1 (EOL), branch-1 (branch-0.20): 0.20.1, 0.20.205, 1.0.0, 1.1.0, 1.2.1 (stable)
      - Security
    - 0.21.0 (new append), 0.22.0
    - branch-0.23: 0.23.0 to 0.23.11 (final)
      - NameNode Federation, YARN
    - Hadoop 2, branch-2:
      - 2.0.0-alpha: NameNode HA
      - 2.1.0-beta: HDFS Snapshots, NFSv3 support, Windows
      - 2.2.0
      - 2.3.0: Heterogeneous storage, HDFS in-memory caching
      - 2.4.0: HDFS ACLs, HDFS Rolling Upgrades, Application History Server, RM Automatic Failover
      - 2.5.0
      - 2.6.0: YARN Rolling Upgrades, Transparent Encryption, Archival Storage
      - 2.7.0: Drop JDK6 support, Truncate API
    - Hadoop 3: trunk

    Hadoop 3 and 2 diverged in 2011 (5 years ago!)

  • 7 Copyright 2016 NTT DATA Corporation

    Discussions for releasing Apache Hadoop 3

    - 2014/6
      - Discussed releasing Hadoop 3 together with upgrading the JDK version
      - Decided to make 2.6 the last release that supports JDK6
    - 2015/3
      - Oracle JDK7 is EoL, so we need to upgrade
      - In the end, we put off the event for another 12 months
    - 2016/2
      - It's time to revisit Hadoop 3 release plans
      - Developers agreed on releasing Hadoop 3

  • 8 Copyright 2016 NTT DATA Corporation

    When will it be released?

    - Developer mailing list:
      - Releasing alphas through the summer
      - Freeze and stabilize for GA in Nov/Dec
    - I suppose Hadoop 3 GA will be released at the end of 2016
    - There are many tasks to do
      - I'll introduce them later in these slides

  • Copyright 2016 NTT DATA Corporation 9

    Difference between Hadoop 3 and 2

  • 10 Copyright 2016 NTT DATA Corporation

    YARN is unchanged

    - YARN: the biggest difference between Hadoop 1 & 2
    - Hadoop 3 still uses YARN

    [Figure: software stacks, listed bottom to top]
    - Hadoop 1: HDFS; MapReduce; Hive, Pig
    - Hadoop 2 & Hadoop 3: HDFS; YARN; MapReduce, Spark, Tez; Hive, Pig, Storm

  • 11 Copyright 2016 NTT DATA Corporation

    New features

    - 440 fixed issues only in Hadoop 3 (as of 5/8/2016)
      - https://s.apache.org/GpUq
    - HDFS erasure coding
    - Shell script rewrite
    - Task level native optimization
    - Derive heap size or mapreduce.*.memory.mb automatically
    - Support more than 2 NameNodes
    - and more

  • 12 Copyright 2016 NTT DATA Corporation

    Break compatibility

    - A major version bump is the chance to clean up the code
      - Deprecated APIs can be removed only when changing the major version
        - @Public and @Stable Java API
        - REST API
        - Metrics/JMX
        - CLI
        - Environment variables
      - Wire compatibility can be broken
        - A 2.X client cannot talk to a 3.X server and vice versa
      - Compatibility Guide: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html

  • Copyright 2016 NTT DATA Corporation 13

    New Features

  • 14 Copyright 2016 NTT DATA Corporation

    Erasure Coding (HDFS-7285)

    - Problem
      - Reduce costs of storage
      - Blocks are replicated to 3 DNs
        - 3x storage overhead is costly
    - Solution
      - Use Erasure Code

                    3-replication   (6,3)-Reed-Solomon
    Tolerates       2 failures      3 failures
    Disk Usage      3x              1.5x
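The figures in the comparison above follow from simple arithmetic. A minimal sketch, assuming the usual definitions (overhead = stored blocks / data blocks; an n-way replicated block survives n - 1 losses; a Reed-Solomon group survives as many losses as it has parity blocks). The function names are illustrative only and are not Hadoop APIs.

```python
def replication_profile(replicas: int) -> dict:
    """Storage overhead and tolerated failures for n-way replication."""
    return {"overhead": float(replicas), "tolerated_failures": replicas - 1}


def reed_solomon_profile(data_blocks: int, parity_blocks: int) -> dict:
    """Storage overhead and tolerated failures for a (data, parity) Reed-Solomon code."""
    overhead = (data_blocks + parity_blocks) / data_blocks
    return {"overhead": overhead, "tolerated_failures": parity_blocks}


three_replication = replication_profile(3)
rs_6_3 = reed_solomon_profile(6, 3)

print(three_replication)  # {'overhead': 3.0, 'tolerated_failures': 2}
print(rs_6_3)             # {'overhead': 1.5, 'tolerated_failures': 3}
```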

  • 15 Copyright 2016 NTT DATA Corporation

    Erasure Coding: Write files using (6,3)-Reed-Solomon

    - Write data to 9 DNs in parallel

    [Figure: the ECClient turns incoming data into 6 data blocks and 3 parity blocks and writes them to DN1 ... DN9]

  • 16 Copyright 2016 NTT DATA Corporation

    Erasure Coding: Read files

    - Read data from 6 DNs in parallel

    [Figure: the ECClient reading from DN1 ... DN9]

  • 17 Copyright 2016 NTT DATA Corporation

    Erasure Coding: Read files when DN fails

    - Read data from arbitrary 6 DNs in parallel

    [Figure: the ECClient reading from DN1 ... DN9 with a failed DN]
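To give a feel for why a read can still succeed after a DataNode fails, here is a deliberately simplified sketch. It is not Reed-Solomon: it uses a single XOR parity cell, which tolerates only one lost cell, whereas (6,3)-Reed-Solomon tolerates three. The cell contents and names are made up for illustration.

```python
from functools import reduce
from typing import List


def xor_cells(cells: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized cells."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*cells))


# Six data cells, as in the (6,3) layout, but protected by one XOR parity cell.
data_cells = [bytes([i] * 4) for i in range(1, 7)]
parity_cell = xor_cells(data_cells)

# Simulate losing the DataNode that holds data cell 2.
lost_index = 2
survivors = [cell for i, cell in enumerate(data_cells) if i != lost_index]

# XOR of the surviving data cells and the parity cell rebuilds the lost cell.
recovered = xor_cells(survivors + [parity_cell])
assert recovered == data_cells[lost_index]
```

Reed-Solomon generalizes this idea so that any 6 of the 9 blocks are enough to rebuild the data.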

  • 18 Copyright 2016 NTT DATA Corporation

    Erasure Coding: Current Status

    - Phase 1: striping layout
      - C = 64KB (default)
      - Works for small files
      - No data locality
      - Available on trunk
    - Phase 2: contiguous layout
      - C = 128MB (= HDFS block size)
      - Does not work for small files
      - Data locality
      - Now in progress (HDFS-8030)

    [Figure: incoming data is cut into cells of size C and spread over DataNode 1 ... DataNode 5]
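The role of the cell size C can be sketched as an offset calculation. This is an illustration only, assuming cells are placed round-robin over the 6 data blocks of a (6,3) group; the function and constant names are hypothetical, not HDFS APIs.

```python
CELL_SIZE = 64 * 1024  # C = 64KB, the default cell size quoted on the slide
DATA_BLOCKS = 6        # data blocks per group in (6,3)-Reed-Solomon


def locate(offset: int, cell_size: int = CELL_SIZE, data_blocks: int = DATA_BLOCKS):
    """Map a logical file offset to (stripe, data block index, offset in that block)."""
    cell_index = offset // cell_size        # which cell of the file
    stripe = cell_index // data_blocks      # which row of cells
    block_index = cell_index % data_blocks  # which data block receives the cell
    offset_in_block = stripe * cell_size + offset % cell_size
    return stripe, block_index, offset_in_block


print(locate(0))                       # (0, 0, 0)
print(locate(64 * 1024))               # (0, 1, 0)
print(locate(6 * 64 * 1024 + 10))      # (1, 0, 65546)
```

With C = 64KB, even a file of a few hundred KB is spread over several DataNodes, which is why striping helps small files but gives up data locality. With C = 128MB, a file smaller than one block stays on a single DataNode.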

  • 19 Copyright 2016 NTT DATA Corporation

    Shell Script Rewrite (HADOOP-9902)

    - Hadoop and Shell Script
      - Launching daemons
      - Hadoop CLI
    - Difficult to understand
      - What is the correct env var to set an option?
        - java classpath? java.library.path? GC options?
      - How do we add the option to the env var?
      - We have to read almost all the shell scripts!

  • 20 Copyright 2016 NTT DATA Corporation

    After rewriting the scripts ...

    - Easy to understand
    - Because shell API doc is available
      - Shelldoc maker generates docs from the scripts
      - Similar to JavaDoc

    Public API

    My documentation build (trunk): http://aajisaka.github.io/hadoop-project/hadoop-project-dist/hadoop-common/UnixShellAPI.html

  • 21 Copyright 2016 NTT DATA Corporation

    .hadoop-env and .hadooprc (HADOOP-11353, HADOOP-13045)

    - Very similar to .bashrc
      - Read the API doc
      - Create your own ~/.hadoopXX
        - .hadoop-env: hadoop-env.sh for each user
        - .hadooprc: called after shell env vars are configured
      - And that's all :)
    - ex.) Set additional classpath (.hadooprc)

        hadoop_add_classpath /path/to/my/jar

  • 22 Copyright 2016 NTT DATA Corporation

    --debug option is available

    - Useful for troubleshooting

    $ hadoop --debug version
    DEBUG: hadoop_parse_args: procesiong version
    DEBUG: hadoop_parse: asking caller to skip 1
    DEBUG: HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
    (snip)
    DEBUG: Applying the user's .hadooprc
    DEBUG: Initial CLASSPATH=/path/to/my/jar
    DEBUG: Initialize CLASSPATH
    (snip)
    DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/*

    CLASSPATH was overwritten!! (before HADOOP-13045)

  • 23 Copyright 2016 NTT DATA Corporation

    Many new features, bug fixes, improvements

    - 'hadoop distch' to change the ownership and permissions on many files via a MapReduce job
    - 'hadoop jnipath' to print java.library.path
    - 'hadoop --daemon' instead of hadoop-daemon.sh
      - ex.) hdfs --daemon status namenode
      - The return code for status is LSB-compatible
      - hadoop-daemon(s).sh are now deprecated
    - .out files are now appended (not overwritten)
      - Allows external log rotation
    - and many more
      - see https://issues.apache.org/jira/browse/HADOOP-9902

  • 24 Copyright 2016 NTT DATA Corporation

    Task level native optimization (MAPREDUCE-2841)

    - Add a native implementation of the map output collector
      - Sort, Spill and IFile serialization
    - Prerequisites
      - Built with -Pnative option
      - Custom writable types and comparators are not supported
    - Setting

    Tips: compact xml form will be supported in Hadoop 3 (HADOOP-6964)

  • 25 Copyright 2016 NTT DATA Corporation

    Benchmark

    - Release Note in the issue:
      - "For shuffle-intensive jobs this may provide speed-ups of 30% or more."
    - Benchmarked with 3 slaves (m3.xlarge)
      - CentOS 7.2
      - 3.0.0-SNAPSHOT (revision 5865fe2b)
    - A very shuffle-intensive wordcount job
      - Input: 2.6GB (compressed)
      - Shuffle: 14GB
      - Output: 10GB

  • 26 Copyright 2016 NTT DATA Corporation

    Benchmark result

    - Result: 506sec -> 383sec (25% speedup)
      - Average Map Time: 182sec -> 89sec
    - Map task log (native optimization enabled)

    INFO [main] org.apache.hadoop.mapred.nativetask.HadoopPlatform: Hadoop platform inited
    INFO [main] org.apache.hadoop.mapred.nativetask.NativeRuntime: Nativetask JNI library loaded.
    INFO [main] org.apache.hadoop.mapred.nativetask.handlers.NativeCollectorOnlyHandler: NativeTask Combiner is enabled, class = org.apache.hadoop.examples.WordCount$IntSumReducer
    INFO [main] org.apache.hadoop.mapred.nativetask.NativeBatchProcessor: NativeHandler: direct buffer size: 1048576
    INFO [main] org.apache.hadoop.mapred.nativetask.handlers.NativeCollectorOnlyHandler: [NativeCollectorOnlyHandler] combiner is not null
    INFO [main] org.apache.hadoop.mapred.nativetask.NativeBatchProcessor: NativeHandler: direct buffer size: 1048576
    INFO [main] org.apache.hadoop.mapred.nativetask.util.OutputUtil: nativetask.output.manager = org.apache.hadoop.mapred.nativetask.util.NativeTaskOutputFiles
    INFO [main] org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator: Native output collector can be successfully enabled!
    INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator

    - Tips: Backported to CDH 5.2.0 or later
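The benchmark timings above can be read two ways: as a reduction in elapsed time or as a throughput ratio. A small sketch using only the four numbers from the slide; the helper names are illustrative.

```python
def time_reduction(before_sec: float, after_sec: float) -> float:
    """Fraction of elapsed time saved by the optimization."""
    return (before_sec - after_sec) / before_sec


def throughput_ratio(before_sec: float, after_sec: float) -> float:
    """How many times faster the optimized run is."""
    return before_sec / after_sec


job = (506, 383)      # whole job, seconds
avg_map = (182, 89)   # average map task, seconds

print(f"job: {time_reduction(*job):.1%} less time, {throughput_ratio(*job):.2f}x")
# job: 24.3% less time, 1.32x
print(f"map: {time_reduction(*avg_map):.1%} less time, {throughput_ratio(*avg_map):.2f}x")
# map: 51.1% less time, 2.04x
```

The map phase roughly halves, while the whole job improves less because the reduce side is not covered by the native collector.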

  • 27 Copyright 2016 NTT DATA Corporation

    Derive heap size or mapreduce.*.memory.mb automatically (MAPREDUCE-5785)

    - In Hadoop 2, two similar properties must be set :(
      - mapreduce.{map,reduce}.memory.mb
        - The amount of memory to request from the scheduler for each task (ex. 2048)
      - mapreduce.{map,reduce}.java.opts
        - Java options for YARN containers (ex. -Xmx2G)
    - In Hadoop 3, either is enough
      - .java.opts is derived from .memory.mb and vice versa
        - .java.opts = .memory.mb * mapreduce.job.heap.memory-mb.ratio
        - .memory.mb = .java.opts / mapreduce.job.heap.memory-mb.ratio
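The two formulas above can be tried out with concrete numbers. A minimal sketch, assuming a ratio of 0.8 for mapreduce.job.heap.memory-mb.ratio and treating .java.opts as just a heap size in MB; the rounding choices and function names are assumptions for illustration, not Hadoop's actual code.

```python
import math

HEAP_RATIO = 0.8  # assumed value of mapreduce.job.heap.memory-mb.ratio


def heap_mb_from_container(memory_mb: int, ratio: float = HEAP_RATIO) -> int:
    """Heap size (MB) derived from mapreduce.{map,reduce}.memory.mb."""
    return int(memory_mb * ratio)


def container_mb_from_heap(heap_mb: int, ratio: float = HEAP_RATIO) -> int:
    """Container size (MB) derived from the -Xmx value in .java.opts."""
    return math.ceil(heap_mb / ratio)


print(f"-Xmx{heap_mb_from_container(2048)}m")  # -Xmx1638m
print(container_mb_from_heap(2048))            # 2560
```

Under this assumption, a 2048 MB container leaves about 20% of its memory for non-heap usage, and a 2 GB heap needs a 2560 MB container.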

  • 28 Copyright 2016 NTT DATA Corporation

    Support more than two NameNodes (HDFS-6440)

    - Hadoop 2 currently supports only 2 NameNodes
      - 1 active and 1 standby
    - Hadoop 3 supports 2 or more standby NameNodes
      - provides additional fault-tolerance
      - prevents multiple standby NNs from checkpointing at the same time
      - # of standbys should be small due to block reports
    - FYI: ResourceManager already supports multi-standby in branch-2

  • 29 Copyright 2016 NTT DATA Corporation

    Other new features

    - metrics2 sink plugin for Apache Kafka (HADOOP-10949)
    - .jhist default format is changed from json to binary (MAPREDUCE-6613)
    - Use FileOutputCommitter v2 by default (MAPREDUCE-6336)
    - Allow/disallow snapshots via WebHDFS (HDFS-9057)
    - Check and make checkpoint before stopping NameNode (HDFS-6353)
    - YARN TimelineServer v2 (YARN-2928)
    - YARN WebUI v2 (YARN-3368)
    - Dynamic subcommands (HADOOP-12930)
    - and many other changes

  • Copyright 2016 NTT DATA Corporation 30

    Incompatible Changes

  • 31 Copyright 2016 NTT DATA Corporation

    Incompatible changes

    - Many deprecated APIs will be removed
      - hftp/hsftp/s3 -> webhdfs/s3{n,a}
      - Metrics v1
      - org.apache.hadoop.Records
      - and more
    - Improved CLI output
      - 'mapred job -list' shows the job name as well
      - 'hadoop fs -du' shows the raw disk usage, and its output is aligned to be more unix-like
      - and more
    - Search 'Incompatible change' flag
      - https://s.apache.org/sMO4

  • 32 Copyright 2016 NTT DATA Corporation

    [Slide background: the STARTUP_MSG classpath line of a 3.0.0-SNAPSHOT build, listing more than a hundred jar files under /usr/local/hadoop/share/hadoop/{common,hdfs,mapreduce,yarn}, among them jetty-6.1.26, guava-11.0.2, jersey-core-1.9, log4j-1.2.17, jackson-core-asl-1.9.13, protobuf-java-2.5.0 and netty-3.6.2.Final]

    Bump up the versions of the libraries

    - Drop JDK7 support (HADOOP-11858)
      - Hello Lambda!
    - Dependency Hell
      - Tomcat
      - Jetty
      - Jersey
      - Guava
      - Log4J
      - Jackson
      - and many many many more ...

  • 33 Copyright 2016 NTT DATA Corporation

    Classpath isolation (HADOOP-11656)

    - Relaxing the "dependency hell"
      - Separate client and server jars
      - Client jar does not pull in any third-party dependencies
    - If the isolation is done ...
      - We can safely upgrade the libraries in server code
      - In branch-2, the upgrade is incompatible :(

  • Copyright 2016 NTT DATA Corporation 34

    Current Status

  • 35 Copyright 2016 NTT DATA Corporation

    Current Status

    - Most of the new features are already available
      - Erasure Coding
      - Shell Scripts Rewrite
      - Task level native optimization
    - Need more contribution
      - Bumping the library versions
      - Classpath isolation
      - Remove deprecated XXX
    - Discussion
      - Release from trunk or cut new branch-3
      - How should we make the releases: alpha, beta, or GA?

  • Copyright 2016 NTT DATA Corporation 36

    Summary

  • 37 Copyright 2016 NTT DATA Corporation

    Summary

    - Apache Hadoop 3 has
      - Many new features and code cleanups
      - Many remaining tasks
    - Need your help to release Hadoop 3 earlier!
      - Not only creating patches but also testing is welcome!
    - (Probably) Hadoop 3 GA will be released in 2016

  • 38 Copyright 2016 NTT DATA Corporation

    Comments?

    - I suppose there are many Apache Hadoop developers/users in this room :)
    - Tell me
      - if there are any required tasks not in this talk
      - if I am misunderstanding something
      - everything you want to tell
    - Discussion
      - Release from trunk or cut branch-3
      - When should we support JDK9?
      - What is alpha/beta/GA?
      - ...
    - I'd like to feed this back to the community


  • 40 Copyright 2016 NTT DATA Corporation

    Appendix

    - Native erasure coding support inside HDFS (Strata + Hadoop World New York 2015)
      - http://conferences.oreilly.com/strata/big-data-conference-ny-2015/public/schedule/detail/42957