lecture 2 part 2

An introduction to the Hadoop Distributed File System (HDFS), MapReduce, HBase, and HCatalog. Salem Othman, Kent State University, Summer 2015.

Upload: jazan-university

Post on 17-Aug-2015

TRANSCRIPT

  1. What is Hadoop, what Hadoop is not, and Hadoop assumptions; racks, clusters, nodes, and commodity hardware; HDFS (Hadoop Distributed File System); using HDFS commands; MapReduce; higher-level languages over Hadoop: Pig and Hive; HBase overview; HCatalog.
  2. What is Hadoop, and what are its components? What is a commodity server (commodity hardware)? Why HDFS? What is the responsibility of the NameNode in HDFS? What is fault tolerance? What is the default replication factor in HDFS? What is the heartbeat in HDFS? What are the JobTracker and the TaskTracker? Why the MapReduce programming model? Where do we have data locality in MapReduce? Why do we need to use Pig and Hive? What is the difference between HBase and HCatalog?
  3. Download Oracle VM VirtualBox; install Oracle VM VirtualBox; run Oracle VM VirtualBox.
  4. Download Hortonworks; install Hortonworks; import Hortonworks inside Oracle VM VirtualBox; run Hortonworks.