Install Sqoop on Amazon EMR (Elastic MapReduce)
DESCRIPTION
Slides used in the video "Sqoop on EMR" - https://www.youtube.com/watch?v=3YJwDJOyDE0
TRANSCRIPT
Installing Sqoop on AWS Elastic MapReduce
By Rohit Ghatol, Director of Engineering @ Synerzip
http://www.linkedin.com/in/rohitghatol @rohitghatol http://rohitghatol.com
Software Stack
Amazon EMR
Apache Sqoop
Step 1 – Set Up S3 Buckets
S3 Buckets
S3 Bucket with Sqoop Scripts: synerzip-sqoop-scripts
• install-sqoop.sh
• sqoop-import-all.sh
• mysql-connector-java-5.1.33.tar.gz
• sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
S3 Bucket with EMR Logs: synerzip-emr-logs
• j-2SL51VFFUEVZT/
  • daemons
  • node
  • steps
S3 Bucket with Sqoop Imported Data: synerzip-imported-data
• User_Profile-12-12-12_10:10:10
  • part-m-00000
  • part-m-00001
  • part-m-00002
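The slides set these buckets up without showing commands. As a rough sketch only, the same layout could be created with the AWS CLI, assuming the CLI is installed and configured and that the two scripts and two tarballs sit in the current directory. The `run` helper, the `setup_buckets` function, and the `DRY_RUN` switch are illustrative names, not part of the original deck. The region is inferred from the `us-west-1` paths used elsewhere in the slides. S3 bucket names are global, so you would need to substitute your own.

```shell
#!/bin/bash
# Sketch: create the three buckets and upload the Sqoop artifacts.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
REGION="us-west-1"   # assumption: inferred from the us-west-1 paths in the slides

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_buckets() {
  local bucket file
  # One bucket each for scripts, EMR logs, and imported data.
  for bucket in synerzip-sqoop-scripts synerzip-emr-logs synerzip-imported-data; do
    run aws s3 mb "s3://${bucket}" --region "$REGION"
  done
  # Upload the scripts and tarballs listed for the scripts bucket.
  for file in install-sqoop.sh sqoop-import-all.sh \
              mysql-connector-java-5.1.33.tar.gz \
              sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz; do
    run aws s3 cp "$file" "s3://synerzip-sqoop-scripts/${file}"
  done
}

setup_buckets
```

Run it once as is to review the printed commands, then with `DRY_RUN=0` to execute them.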
install-sqoop.sh
#!/bin/bash
cd /home/hadoop
# Download the Sqoop distribution from S3 and unpack it
hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
tar -xzf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
# Download the MySQL JDBC driver from S3 and unpack it
hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/mysql-connector-java-5.1.33.tar.gz mysql-connector-java-5.1.33.tar.gz
tar -xzf mysql-connector-java-5.1.33.tar.gz
# Put the JDBC driver jar on Sqoop's classpath
cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar sqoop-1.4.4.bin__hadoop-2.0.4-alpha/lib/
sqoop-import-all.sh
#!/bin/bash
# Import the User_Profile table from MySQL into a timestamped S3 directory
cd /home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha/bin
./sqoop import \
  --connect jdbc:mysql://db.c5zzejm1gdnx.us-west-1.rds.amazonaws.com/test \
  --username root \
  --password password \
  --table User_Profile \
  --target-dir s3://synerzip-imported-data/User_Profile-`date +"%m-%d-%y_%T"`
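The import script appends a timestamp to the target directory, presumably so that each run writes to a fresh location (a Sqoop import normally fails if the target directory already exists). The sketch below isolates that naming step. `target_dir` is an illustrative function name, not something from the slides.

```shell
#!/bin/bash
# Sketch: build the timestamped S3 target directory used by the import.
# %m-%d-%y is month-day-year (two digits each); %T is HH:MM:SS.
target_dir() {
  local table="$1"
  echo "s3://synerzip-imported-data/${table}-$(date +"%m-%d-%y_%T")"
}

target_dir User_Profile
```

The result has the same shape as the directory shown in the imported-data bucket, for example `User_Profile-12-12-12_10:10:10`.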
Step 2 – MySQL Database
User_Profile Table
Step 3 – Start EMR Cluster
EMR Steps
Install Sqoop Step
• Jar location: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar
• Arguments: s3://synerzip-sqoop-scripts/install-sqoop.sh
Import Sqoop Step
• Jar location: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar
• Arguments: s3://synerzip-sqoop-scripts/sqoop-import-all.sh
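The slides add these two steps when starting the cluster. As a hedged alternative, the same steps could be submitted to an already running cluster with `aws emr add-steps`. This is a sketch, not the method shown in the video: the cluster ID is a placeholder, `run` and `add_script_step` are illustrative helpers, the `ActionOnFailure` value is an assumed choice, and the import script name follows the listing of the scripts bucket.

```shell
#!/bin/bash
# Sketch: submit the install and import scripts as script-runner steps.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
CLUSTER_ID="${CLUSTER_ID:-j-XXXXXXXXXXXXX}"   # placeholder: use your own cluster ID
RUNNER="s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Add one custom-JAR step that runs the given S3-hosted script.
add_script_step() {
  local name="$1" script="$2"
  run aws emr add-steps --cluster-id "$CLUSTER_ID" \
    --steps "Type=CUSTOM_JAR,Name=${name},ActionOnFailure=CANCEL_AND_WAIT,Jar=${RUNNER},Args=[${script}]"
}

add_script_step "Install-Sqoop" "s3://synerzip-sqoop-scripts/install-sqoop.sh"
add_script_step "Import-Sqoop"  "s3://synerzip-sqoop-scripts/sqoop-import-all.sh"
```

The install step has to come first, since the import script runs Sqoop from the directory that the install script unpacks.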
Step 4 – See Imported Data
part-m-00000
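The slides presumably look at the imported files in the S3 console. A minimal command-line sketch for the same check follows, assuming a configured AWS CLI. `run` and `show_import` are illustrative names, and the directory name is the example one from the imported-data bucket listing, so yours will carry a different timestamp.

```shell
#!/bin/bash
# Sketch: list an import directory and print its first part file.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

show_import() {
  local prefix="$1"
  # List the part-m-* files written by the import.
  run aws s3 ls "s3://synerzip-imported-data/${prefix}/"
  # Stream the first part file to stdout ("-" as the destination).
  run aws s3 cp "s3://synerzip-imported-data/${prefix}/part-m-00000" -
}

show_import "User_Profile-12-12-12_10:10:10"
```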