big data analysis for page ranking using map reduce concept

Big Data Analysis for Page Ranking using Map/Reduce. R.Renuka, R.Vidhya Priya, III B.Sc., IT, S.F.R.College for Women,

Upload: vidhya-kumar

Post on 18-Jul-2015


Category:

Data & Analytics



TRANSCRIPT

Page 1: Big Data Analysis for page ranking using map reduce concept

Big Data Analysis for Page Ranking using Map/Reduce

R.Renuka, R.Vidhya Priya, III B.Sc., IT, S.F.R.College for Women, Sivakasi.

Page 2: Big Data Analysis for page ranking using map reduce concept

Overview

Introduction

What is Big Data?

Why Big Data?

4 V’s of Big Data

Big Data Analytics Technologies

Map/Reduce

Applications

Case Study

Conclusion

Page 3: Big Data Analysis for page ranking using map reduce concept

Introduction

Data have outgrown the storage and processing capabilities of a single host.

Two fundamental challenges:
– how to store and work with voluminous data sizes, and
– how to understand data and turn it into a competitive advantage.

Page 4: Big Data Analysis for page ranking using map reduce concept

What is Big Data?

‘Big data’ is similar to ‘small data’, but bigger.

Bigger data, however, requires different approaches: techniques, tools and architectures.

The aim is to solve new problems, and to solve old problems in a better way.

Page 5: Big Data Analysis for page ranking using map reduce concept

The Blind men and the Elephant

Page 6: Big Data Analysis for page ranking using map reduce concept

Why Big Data?

Key enablers for the growth of “Big Data” are:

Increase of Processing Power

Increase of Storage Capacities

Availability of Data

Page 7: Big Data Analysis for page ranking using map reduce concept

4 V’s of Big Data

Page 8: Big Data Analysis for page ranking using map reduce concept

Big Data Analytics Technologies

Hadoop

PLATFORA

WibiData

PIG

Hive

MapReduce

NoSQL databases

Column-oriented databases

Page 9: Big Data Analysis for page ranking using map reduce concept

Hadoop

Hadoop is a distributed file system and data processing engine.

Hadoop has two components:
– The Hadoop Distributed File System (HDFS)
– The MapReduce programming model

Page 10: Big Data Analysis for page ranking using map reduce concept

Map / Reduce

A high-level, abstracted framework for distributed processing of large datasets

Provides fault tolerance and parallelization

Computation consists of two phases:
– Map
– Reduce

A master-slave architecture

Computation occurs on multiple slave nodes

The framework tries to provide data locality as much as possible.
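The two phases above can be sketched on a single machine. This is a minimal illustration, not Hadoop code: a thread pool stands in for the slave nodes, the input splits are made-up sample text, and the names `map_split`, `reduce_counts` and `word_count` are hypothetical. Fault tolerance and data locality are not modelled.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_split(lines):
    """Map phase for one input split: emit a (word, 1) pair per word."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def reduce_counts(word, counts):
    """Reduce phase: merge all counts emitted for one word."""
    return word, sum(counts)


def word_count(splits, workers=2):
    # Each split is mapped independently; threads stand in for slave nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_split, splits))

    # Shuffle: group intermediate values by key before reducing.
    grouped = defaultdict(list)
    for pairs in mapped:
        for word, count in pairs:
            grouped[word].append(count)

    return dict(reduce_counts(w, c) for w, c in grouped.items())


# Hypothetical sample input: two splits, one line each.
splits = [["big data is big"], ["data about data"]]
counts = word_count(splits)
```

Because each split is mapped without reference to any other split, the map calls can run in any order or in parallel; only the grouping step needs to see all intermediate pairs.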

Page 11: Big Data Analysis for page ranking using map reduce concept

MR model

Map
– Process a key/value pair to generate intermediate key/value pairs

Reduce
– Merge all intermediate values associated with the same key

Users implement an interface of two primary methods:
1. Map: (key1, val1) → (key2, val2)
2. Reduce: (key2, [val2]) → [val3]
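Since the deck's subject is page ranking, the two methods above can be illustrated with one PageRank iteration. This is a hedged sketch rather than the authors' implementation, which the transcript does not include: the three-page graph, the damping factor of 0.85 and the function names are all assumptions, and every page is assumed to have at least one outgoing link.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional value; an assumption, not taken from the slides


def pagerank_map(page, value):
    """Map: (key1, val1) -> intermediate (key2, val2) pairs."""
    rank, links = value
    yield page, ("links", links)  # pass the link structure through
    for target in links:
        # Each page splits its current rank evenly among its out-links.
        yield target, ("share", rank / len(links))


def pagerank_reduce(page, values, n_pages):
    """Reduce: (key2, [val2]) -> new (rank, links) for one page."""
    links, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            links = payload
        else:
            total += payload
    rank = (1 - DAMPING) / n_pages + DAMPING * total
    return rank, links


def run_iteration(graph):
    """Run map, group by key, then reduce, all in memory."""
    grouped = defaultdict(list)
    for page, value in graph.items():
        for key, val in pagerank_map(page, value):
            grouped[key].append(val)
    return {p: pagerank_reduce(p, vals, len(graph)) for p, vals in grouped.items()}


# Hypothetical graph: page -> (initial rank, outgoing links).
graph = {
    "A": (1 / 3, ["B", "C"]),
    "B": (1 / 3, ["C"]),
    "C": (1 / 3, ["A"]),
}
updated = run_iteration(graph)
```

Calling `run_iteration` repeatedly on its own output moves the ranks toward a fixed point. On a real cluster, each iteration would be a separate Map/Reduce job with the graph stored in a distributed file system.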

Page 12: Big Data Analysis for page ranking using map reduce concept

Applications

Page 13: Big Data Analysis for page ranking using map reduce concept

Homeland Security

Finance

Smarter Healthcare

Multi-channel Sales

Telecom

Manufacturing

Traffic Control

Trading Analytics

Fraud and Risk

Log Analysis

Search Quality

Retail

Page 14: Big Data Analysis for page ranking using map reduce concept

Case Study

Page 15: Big Data Analysis for page ranking using map reduce concept
Page 16: Big Data Analysis for page ranking using map reduce concept

Conclusion

Real-time big data isn’t just a process for storing petabytes or exabytes of data in a data warehouse. It’s about the ability to make better decisions and take meaningful actions at the right time.

Page 17: Big Data Analysis for page ranking using map reduce concept

Queries?

Page 18: Big Data Analysis for page ranking using map reduce concept