![Page 2: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/2.jpg)
www.kerneltraining.com
MapReduce
Prerequisites for learning MapReduce
1. Hadoop framework
2. A distributed storage system such as HDFS
3. Parallel programming concepts
![Page 3: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/3.jpg)
Overview of the MapReduce workflow
Mapping Phase: Map Input List -> Mapper -> Map Output List
Reducing Phase: Reduce Input List -> Reducer -> Reduce Output List
![Page 4: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/4.jpg)
Overview of MapReduce Framework
Input File:
Delhi Mumbai Delhi
Bangalore Delhi Chennai
Mumbai Delhi Chennai
Map Phase (input record -> map output):
<1, Delhi Mumbai Delhi> -> <Delhi, 1><Mumbai, 1><Delhi, 1>
<2, Bangalore Delhi Chennai> -> <Bangalore, 1><Delhi, 1><Chennai, 1>
<3, Mumbai Delhi Chennai> -> <Mumbai, 1><Delhi, 1><Chennai, 1>
Shuffle/Sort:
<Delhi, 1><Delhi, 1><Delhi, 1><Delhi, 1><Bangalore, 1>
<Mumbai, 1><Mumbai, 1>
<Chennai, 1><Chennai, 1>
Reduce Phase (input):
<Delhi, (1,1,1,1)><Bangalore, 1>
<Mumbai, (1,1)><Chennai, (1,1)>
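The map and shuffle/sort steps of the word-count flow above can be sketched in plain Python. This is a minimal single-process illustration, not Hadoop code: the function names `map_phase` and `shuffle_sort` are hypothetical, and the groups are sorted alphabetically by key, so they appear in a different order than on the slide.

```python
from collections import defaultdict

INPUT_LINES = [
    "Delhi Mumbai Delhi",
    "Bangalore Delhi Chennai",
    "Mumbai Delhi Chennai",
]


def map_phase(records):
    """Emit a <word, 1> pair for every word of every <line number, line> record."""
    return [(word, 1) for _, line in records for word in line.split()]


def shuffle_sort(pairs):
    """Collect the values of identical keys, then order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


# Records are keyed by line number, as on the slide: <1, ...>, <2, ...>, <3, ...>
records = list(enumerate(INPUT_LINES, start=1))
map_output = map_phase(records)
reduce_input = shuffle_sort(map_output)

for key, values in reduce_input:
    print(key, values)
```

Each printed group corresponds to one reduce input pair such as <Delhi, (1,1,1,1)>.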
![Page 5: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/5.jpg)
Responsibilities for the various phases
Input -- Create ‘Input Splits’, create individual records -- Framework
Map -- User Defined Logic -- User
Shuffling -- Framework
Reduce -- User Defined Logic -- User
Input file:
Delhi Mumbai Delhi
Bangalore Delhi Chennai
Mumbai Delhi Chennai
Input records:
<1, Delhi Mumbai Delhi>
<2, Bangalore Delhi Chennai>
<3, Mumbai Delhi Chennai>
Map output:
<Delhi, 1><Mumbai, 1><Delhi, 1>
<Bangalore, 1><Delhi, 1><Chennai, 1>
<Mumbai, 1><Delhi, 1><Chennai, 1>
Shuffling:
<Delhi, 1><Delhi, 1><Delhi, 1><Delhi, 1>
<Bangalore, 1>
<Mumbai, 1><Mumbai, 1>
<Chennai, 1><Chennai, 1>
Reduce input:
<Delhi, (1,1,1,1)><Bangalore, 1>
<Mumbai, (1,1)><Chennai, (1,1)>
Reduce output:
<Delhi, 4><Bangalore, 1>
<Mumbai, 2><Chennai, 2>
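The split of responsibilities can be made concrete with a toy job runner: the "framework" part creates records, shuffles and calls the user code, while the user supplies only the map and reduce logic. This is a sketch under simplifying assumptions (one process, everything in memory); `run_job`, `word_mapper` and `sum_reducer` are hypothetical names, not Hadoop APIs.

```python
from itertools import groupby
from operator import itemgetter


def run_job(lines, mapper, reducer):
    """Framework side: build records, call the mapper, shuffle, call the reducer."""
    records = enumerate(lines, start=1)  # create individual <line number, line> records
    pairs = [kv for key, value in records for kv in mapper(key, value)]
    pairs.sort(key=itemgetter(0))  # shuffle/sort: bring identical keys together
    output = []
    for key, group in groupby(pairs, key=itemgetter(0)):
        output.extend(reducer(key, [value for _, value in group]))
    return output


def word_mapper(line_number, line):
    """User side: emit <word, 1> for each word in the line."""
    for word in line.split():
        yield word, 1


def sum_reducer(word, counts):
    """User side: add up the counts collected for one word."""
    yield word, sum(counts)


result = run_job(
    ["Delhi Mumbai Delhi", "Bangalore Delhi Chennai", "Mumbai Delhi Chennai"],
    word_mapper,
    sum_reducer,
)
print(result)
```

Swapping in a different mapper and reducer changes the job without touching `run_job`, which mirrors the framework/user division listed above.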
![Page 6: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/6.jpg)
Components of MapReduce
[Diagram: Driver and InputFormat; Blocks A, B and C with Input Splits 1 to 4; two Mapper Processes, each containing a Record Reader that reads an input split and passes <K,V> pairs to a Mapper; one Reduce Process. Other arrow labels: Defines, Calculates.]
![Page 7: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/7.jpg)
Components of MapReduce
[Diagram: the Driver, Blocks A, B and C, Input Splits 1 to 4 and the two Mapper Processes (Record Reader plus Mapper) as before, now with several Reduce Processes shown after the mappers.]
![Page 8: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/8.jpg)
Components of MapReduce
[Diagram: Driver, InputFormat, Blocks A, B and C, Input Splits 1 to 4, and two Mapper Processes (Record Reader plus Mapper). The mappers pass <K,V> pairs into the Shuffle, which passes <K,V> pairs to a Reducer inside each Reduce Process.]
![Page 9: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/9.jpg)
Components of MapReduce
[Diagram of the complete job: Driver, InputFormat and OutputFormat; Blocks A, B and C with Input Splits 1 to 4; two Mapper Processes, each with a Record Reader that reads an input split and passes <K,V> pairs to a Mapper; the Shuffle, which passes <K,V> pairs to the Reducer in each Reduce Process; a Writer per Reduce Process that writes the Output Data. Other arrow labels: Defines, Calculates.]
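To illustrate the Record Reader's role of turning raw input into <K,V> pairs, here is a minimal sketch that yields (byte offset, line) records, the keying that Hadoop's default text input uses. The function name `read_records` is hypothetical, and the whole input is treated as a single split for simplicity.

```python
def read_records(data: bytes):
    """Yield (byte offset, line text) pairs, one per line of the input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is the offset of the line's first byte; the value is the line
        # itself without its line terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


data = b"Delhi Mumbai Delhi\nBangalore Delhi Chennai\nMumbai Delhi Chennai\n"
records = list(read_records(data))
for key, value in records:
    print(key, value)
```

Each pair produced this way is what a Mapper would receive as its input key and value.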
![Page 10: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/10.jpg)
Deciding factors for choosing MapReduce
Questions to ask before choosing MapReduce:
• Are the input files independent of each other, so that each can be processed separately?
• Can the problem be broken into smaller tasks such that each task can be processed independently?
• Can the partial results of the smaller tasks be aggregated or consolidated?
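A quick way to check the last two questions is to see whether a computation gives the same answer when it runs on independent chunks whose partial results are then combined. The sketch below does this for a sum; the chunk size and the helper name `chunked` are arbitrary choices made for illustration.

```python
def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


values = list(range(1, 101))

# Each chunk is processed independently, as a separate task would be.
partial_sums = [sum(chunk) for chunk in chunked(values, 25)]

# The partial results are then consolidated into the final answer.
total = sum(partial_sums)

print(partial_sums, total)
```

A sum passes this test because the combined partial sums equal the sum over the whole input; a computation that needs to see all the data at once would not decompose this simply.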
![Page 11: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/11.jpg)
MapReduce Design Patterns
A design pattern is a template for solving a common, general data manipulation problem with MapReduce.
• Summarization Patterns
• Filtering Patterns
• Join Patterns
• Job Chaining Patterns
![Page 12: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/12.jpg)
Case Study – Summarization Pattern
• Goal: find each subscriber and the corresponding downloaded bytes from the provided airmobile sample logs. Each line holds the subscriber (substring 15,26) and the bytes downloaded (substring 87,97).
• The sample log files follow the format above, with one record per line. The Customer ID and Downloaded Bytes have to be extracted from each line for analysis.
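The extraction step can be sketched as a map function that slices the two fields out of each line. The offsets are taken from the text above and are assumed to be end-exclusive, as with a Java-style substring; they are not checked against the actual log layout. The function name `map_record` and the fixed-width test line are hypothetical and exist only to exercise the offsets.

```python
SUBSCRIBER_SLICE = slice(15, 26)  # "substring 15,26", assumed end-exclusive
BYTES_SLICE = slice(87, 97)       # "substring 87,97", assumed end-exclusive


def map_record(offset, line):
    """Return a (subscriber id, downloaded bytes) pair, or None for short lines."""
    if len(line) < BYTES_SLICE.stop:
        return None  # the line is too short to contain both fields
    subscriber = line[SUBSCRIBER_SLICE].strip()
    downloaded = float(line[BYTES_SLICE])
    return subscriber, downloaded


# Hypothetical fixed-width line: filler characters around the two fields.
line = "x" * 15 + "28052639".rjust(11) + "x" * 61 + "9.67263122" + "x" * 10

pair = map_record(0, line)
print(pair)
```

The input key (the offset) is ignored here because only the line content matters for this summarization.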
![Page 13: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/13.jpg)
Case Study – Summarization Pattern
(K1, V1) -- Input to the user-defined map function
• (0, subId=00001111911128052639towerid=11232w34532543456345623453456984756894756bytes=122112212212212219.6726312167218586E17)
• (121, subId=00001111911128052615towerid=11232w34532543456345623453456984756894756bytes=122112212212212216.9431647633139046E17)
• (242, subId=00001111911128052615towerid=11232w34532543456345623453456984756894756bytes=122112212212212214.7836041833447418E17)
![Page 14: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/14.jpg)
Case Study – Summarization Pattern
list(K2, V2) -- Output from the user-defined map function
• (28052627, 8.4621702216543)
• (28052639, 9.672631216721858)
• (28052627, 8.64072609693471)
![Page 15: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/15.jpg)
Case Study – Summarization Pattern
(K2, list(V2)) -- Input to the user-defined reduce function
• (“28052627”, (8.4621702216543, 8.64072609693471))
• (“28052639”, (9.672631216721858))
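A reduce function for this pattern receives one subscriber key with all of its values and collapses them into a single figure. The sketch below assumes the summary is a per-subscriber total (the slides do not state the aggregation), and `reduce_bytes` is a hypothetical name. The input pairs are the ones listed above.

```python
def reduce_bytes(subscriber, values):
    """Combine all values seen for one subscriber into a single total."""
    return subscriber, sum(values)


reduce_input = [
    ("28052627", [8.4621702216543, 8.64072609693471]),
    ("28052639", [9.672631216721858]),
]

# One call to the reduce function per key, as the framework would make.
totals = dict(reduce_bytes(key, values) for key, values in reduce_input)

for subscriber, total in totals.items():
    print(subscriber, total)
```

Replacing `sum` with `max`, `min` or a count would give other summarizations over the same grouped input.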
![Page 16: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/16.jpg)
Case Study – Summarization Pattern
Mapper Class
![Page 17: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/17.jpg)
Case Study – Summarization Pattern
Reducer Class
![Page 18: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/18.jpg)
Case Study – Summarization Pattern
Driver Class
![Page 19: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/19.jpg)
Case Study – Summarization Pattern
MapReduce Output
![Page 21: Mapreduce Introduction | Overview | Online Training | Basics](https://reader036.vdocument.in/reader036/viewer/2022062820/58a01d791a28ab9c418b6e45/html5/thumbnails/21.jpg)
THANK YOU for attending the demo of Hadoop MapReduce
www.kerneltraining.com
Email: [email protected] us: +91 8099776681