Transcript
Page 1: ABOUT THIS TRAINING - s3-ap-southeast · PDF filewith writing real scripts on a Hadoop system using java, Scala ... might want to work at uses Hadoop in some way, including Amazon,

ABOUT THIS TRAINING:

The world of Hadoop and "Big Data" can be intimidating - hundreds of different technologies with cryptic names form the Hadoop ecosystem.

This comprehensive training has been designed by industry experts, considering current industry job requirements, to provide in-depth learning on big data and Hadoop modules.

This industry-oriented program is a combination of the training courses in Hadoop developer, Hadoop administrator, Hadoop testing, and analytics.

This Hadoop training will also prepare you for the “Big Data Certification of Cloudera - CCP and CCA”.

With this course, you'll not only understand what those systems are and how they fit together - but you'll go hands-on and learn how to use them to solve real business problems!

- So it's not just theory…

It's filled with hands-on activities and exercises backed with plenty of discussions of use cases of different clients across domains. We believe this will help you build your confidence in the technology and its usage.

You'll find a range of activities in this course for people at every level. If you're a project manager who just wants to learn the buzzwords, there are web UIs for many of the activities in the course that require no programming knowledge. If you're comfortable with command lines, we'll show you how to work with them too. And if you're a programmer, we'll challenge you with writing real scripts on a Hadoop system using Java, Scala, Pig Latin, and Python.

YOUR TAKEAWAY FROM TRAINING:

You'll walk away from this course with a real, deep understanding of Hadoop and its associated distributed systems, and you can apply Hadoop to real-world problems.

Plus a valuable completion certificate is waiting for you at the end!

WHAT YOU WILL LEARN IN THIS BIG DATA HADOOP ONLINE TRAINING COURSE?

1. Detailed understanding of Big Data analytics
2. Master fundamentals of Hadoop 2.8 and YARN for designing distributed systems that manage "big data" using Hadoop and related technologies for storing and analyzing data at scale.
3. Understand the architecture of HDFS and MapReduce for parallel storage and parallel processing.
4. Understand the configuration choices you should make for stability, reliability and optimized task scheduling on your distributed system.
5. Analyze relational data using Hive and MySQL (Connecting Hadoop to Other DBs)
6. Analyze non-relational data using HBase
7. Use Pig and Spark to create scripts to process data on a Hadoop cluster in more complex ways.
8. Understand how Hadoop clusters are managed by YARN, ZooKeeper and Hue.
9. Know how to schedule your Hadoop jobs using Oozie.
10. Collect data from a variety of sources to your Hadoop cluster using Sqoop and Flume
11. Master Hadoop administration activities like cluster managing, monitoring, administration and troubleshooting
12. Learn testing Hadoop applications using MRUnit and other automation tools.
13. Know basics of Spark, Spark RDD, GraphX, MLlib and writing Spark applications
14. Practice real-life projects using Hadoop and Apache Spark
15. Discussion on industry use cases of different clients across domains
16. Be equipped to clear Big Data Hadoop Certification. (CCP and CCA)


RECOMMENDED SKILLS PRIOR TO TAKING THIS COURSE

There is no prerequisite to take this Big Data training and to master Hadoop, but basics of UNIX, SQL and Java/Python would be good. At GraduIT, we provide a complimentary self-paced UNIX and Java course with our Big Data Hadoop training to brush up the required skills so that you are good on your Hadoop learning path.

WHO SHOULD TAKE THIS BIG DATA HADOOP TRAINING COURSE?

1. Programming Developers and System Administrators
2. Experienced working professionals, Project managers
3. Big Data Hadoop Developers eager to learn other verticals like Testing, Analytics, Administration
4. Business Intelligence, Data warehousing and Analytics Professionals
5. Graduates, undergraduates eager to learn the latest Big Data technology can take this Big Data Hadoop Certification online training

BIT ON HADOOP:

Hadoop enables you to BUILD AN INSIGHT-DRIVEN BUSINESS.

To be specific, Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.

Almost every large company you might want to work at uses Hadoop in some way, including Amazon, eBay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo! And it's not just technology companies that need Hadoop; even the New York Times uses Hadoop for processing images.


CURRICULUM:

Module 1: Understanding Big Data and Hadoop

• Introduction to Big Data And Hadoop
• Discussion on Big Data and its Sources and Challenges related to it
• Understanding attributes of Big Data and different data varieties
• Discussing Use Cases on “Opportunity for Business” in Big Data
• Discussion on different solutions for problems related to Big Data
• Comparison of Hadoop vs traditional systems Solutions
• Understanding Complete Solution architecture from Data Acquisition to Data Analysis for Business
• Discussion on technologies getting used for End to End solution for Big Data Analysis driven business
• Session on PYTHON, UNIX and JAVA (Self-paced learning videos)

Module 2: Understanding Hadoop from a Thousand-Feet View

• Bit on history of Hadoop and its evolution
• Understanding parallel processing and parallel storage architecture
• Relating MAPREDUCE and HDFS Architecture to above
• Overview of all technology stacks related to Hadoop Ecosystem.
• Discussion on Different Distributions of Hadoop, Different vendors of Hadoop
• Installation and set-up of Hadoop Cluster (Cloudera preferably)

Module 3: Understanding Parallel Storage solution with HDFS

• Understanding HDFS Architecture in detail
• Understanding Hadoop Master-Slave Architecture
• Understanding NameNode, DataNode, Secondary NameNode
• Discussion on Blocks and Data Replication
• Learning common HDFS commands and practicing them
• Discussion on Typical Production Cluster and its configurations for stability, reliability and Optimization
• Understanding Anatomy of file Read, Write operations in HDFS
• Discussion on Hadoop 2.x Cluster Architecture - Federation and High Availability
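
As a rough illustration of the "Blocks and Data Replication" topic above, the sketch below estimates how a single file is split into HDFS blocks and how much raw cluster storage its replicas consume. It is not part of the course material. It assumes the common Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3, and the function name hdfs_footprint is hypothetical.

```python
MB = 1024 * 1024


def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Estimate (block_count, replica_count, raw_bytes) for one file.

    Assumed defaults: 128 MB blocks and 3 replicas (typical for Hadoop 2.x;
    a real cluster may be configured differently).
    """
    if file_size_bytes == 0:
        return 0, 0, 0
    # Ceiling division: the last block may be only partially filled.
    block_count = -(-file_size_bytes // block_size_bytes)
    # Every block is stored `replication` times across DataNodes.
    replica_count = block_count * replication
    # A partial last block only occupies its actual data size, so raw
    # usage is simply the file size times the replication factor.
    raw_bytes = file_size_bytes * replication
    return block_count, replica_count, raw_bytes


blocks, replicas, raw = hdfs_footprint(1000 * MB)
print(blocks, replicas, raw // MB)  # 8 blocks, 24 block replicas, 3000 MB raw
```

Under these assumptions a 1000 MB file needs 8 blocks (seven full, one partial), which is also why the number of input splits for a MapReduce job often tracks the number of HDFS blocks.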


Module 4: Understanding Parallel Processing solution with MapReduce

• Understanding Hadoop 2.x MapReduce Architecture (YARN Architecture)
• Understanding Hadoop 2.x MapReduce Components
• Discussion on YARN MR Application Execution Flow
• Discussion on Anatomy of MapReduce Program – Work Flow
• Discussion on basic MapReduce API Concepts
• Writing MapReduce Driver, Mappers, and Reducers using JAVA
• Demo on MapReduce program execution
• Understanding Input Splits and its relationship with HDFS Blocks
• Discuss on Advanced Concepts:
o Distributed Cache
o Combiner and Partitioner
o Counters
o Map side and Reduce side Joins
o Use of Compression techniques (Snappy, LZO and Zip)
o Advanced Data types in MapReduce (Writable and WritableComparable)
• Hands-on exercises on MapReduce Program Execution
• Demo on Telecom Dataset
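
To make the mapper, partitioner and reducer roles listed in this module concrete, here is a minimal single-process simulation of a word-count job. It is only a sketch in plain Python: it does not use the Hadoop Java API that the module teaches, all function names are hypothetical, and the partitioner is a simple stand-in rather than Hadoop's own hash partitioner.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def partition(key, num_reducers):
    """Pick the reducer for a key (illustrative stand-in for a hash partitioner)."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    """Shuffle phase: group values by key inside each reducer's partition."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    return buckets


def reducer(key, values):
    """Reduce phase: sum all the counts collected for one word."""
    return key, sum(values)


def word_count(lines, num_reducers=2):
    """Driver: run map, shuffle and reduce in sequence on in-memory lines."""
    pairs = [pair for line in lines for pair in mapper(line)]
    result = {}
    for bucket in shuffle(pairs, num_reducers):
        for key in sorted(bucket):  # reducers receive their keys in sorted order
            word, total = reducer(key, bucket[key])
            result[word] = total
    return result


counts = word_count(["big data on hadoop", "hadoop stores big data"])
print(counts["hadoop"], counts["big"], counts["on"])  # 2 2 1
```

On a real cluster each phase runs in parallel across nodes, with one map task per input split; the simulation keeps only the data flow between the phases.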

Module 5: Learn to analyze relational data using Hive

• Understanding of Hive Framework & Components
• Discussion on relationship between Hive and MapReduce (When to choose What)
• Understanding and hands on
o Hive Data Types
o Hive DDL – Create/Show/Drop DataBase and Tables
o Hive DML – Load Files & Insert Data
o Hive SQL - Select, Filter, Join, Group By
• Understanding of Internal and External Tables
• Understanding of Programming structure in UDF, Partitions and Buckets
• Discussion on Limitations of Hive.
• Discussion on HIVE on SPARK.

Module 6: Learn Data Flow ETL Scripting Language “Pig”

• Introduction to Apache Pig
• Discussion on relationship between Hive, MapReduce and Pig
• Grunt
• Shell and Utility components
• Different data types in Pig
• Programming Structure in Pig
• Modes Of Execution in Pig
• Experiencing Pig Script by writing Evaluation, Filter, Load and Store functions
• UDFs in Pig
• Understand Integration of HBASE with Pig (After Completion of HBASE Module)

Module 7: Learn to analyze non-relational data using HBase

• Introduction to NoSQL Databases
• Understanding HBase non-relational distributed Database and its architecture
• When/Why to use HBase
• Understanding HBase Client API
• Know how to load Data into HBase
• Know how to query Data from HBase
• Discussion on relationship between Hive, MapReduce, Pig and HBase
• Overview of other non-relational databases like Cassandra and MongoDB, their advantages over RDBMS, and career path in NoSQL DBs.

Module 8: Learn how to Collect data from a variety of sources to your Hadoop cluster using Sqoop, Flume

• Understand Why and what is SQOOP
• Understand SQOOP Architecture
• Learn how to Import Data Using SQOOP to HDFS/HIVE/HBase From RDBMS
• Understand Why and what is Flume
• Understand Flume Architecture
• Learn how to efficiently collect, aggregate, and move large amounts of streaming data into the Hadoop Distributed File System (HDFS)

Module 9: Learn how to schedule Hadoop jobs using Oozie

• Introduction to WorkFlow Management and Oozie
• Learn Oozie Components And Workflow
• Oozie Commands
• Experience scheduling jobs using Oozie

Module 10: Discussion on Emerging Technologies in Big Data Landscape

• Discussion on:
o Emerging Trends In Big Data Landscape
o Introduction to SPARK – In Memory Parallel Processing Framework for Real Time/Near Real Time (NRT) operations
• Best practices for Hadoop Developer
• Discussion on Interview preparation

