Big Data with Hadoop and Cloud Computing
Post on 14-Jan-2015
Big Data with Hadoop and Cloud Computing
Researcher’s Blog - http://clean-clouds.com
“Big Data Processing” relevant for Enterprises
• Big Data used to be discarded, or left un-analyzed and archived.
– Loss of information, insight, and opportunities to extract new value.
• How is Big Data beneficial?
– Energy companies – geophysical analysis.
– Science and medicine – empiricism is growing faster than experimentation.
– Disney – customer behavior patterns across its stores and theme parks.
• Pursuit of a “Competitive Advantage” is the driving factor for Enterprises:
– Data mining (log processing, click-stream analysis, similarity algorithms, etc.), financial
simulation (Monte Carlo simulation), file processing (e.g., resizing JPEGs), web indexing.
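Monte Carlo simulation, listed above as a typical financial workload, is a canonical embarrassingly parallel job: many independent random trials whose results are aggregated at the end. A minimal Python sketch (using the toy π-estimation case rather than a real financial model) shows the pattern:

```python
import random

def monte_carlo_pi(n_samples: int, seed: int = 42) -> float:
    """Estimate pi by sampling random points in the unit square
    and counting how many land inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

# Each batch of samples is independent, so the work shards
# trivially across MapReduce workers or EC2 instances.
estimate = monte_carlo_pi(100_000)
```

Because no trial depends on another, batches can be farmed out to any number of cloud instances and the partial counts summed afterwards.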
Cloud Computing ~ brings economy to Big Data Processing
• Big Data Processing can be implemented by HPC & Cloud.
1) An HPC implementation is very costly w.r.t. CAPEX & OPEX.
2) Cloud Computing is efficient because of its pay-per-use nature.
• MapReduce programming model is used for processing big data sets.
• Pig, Hive, Hadoop, … are used for Big Data Processing:
– Pig – SQL-like operations that apply to datasets.
– Hive – performs SQL-like data analysis on data.
– Hadoop – processes vast amounts of data (focal point here).
• Use EC2 instances to analyze “Big Data” in Amazon IaaS.
• Amazon Elastic MapReduce reduces complex set-up & management.
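The MapReduce programming model referenced above can be illustrated with a minimal single-process Python sketch (an illustration of the model only, not the Hadoop API): a map phase emits (key, value) pairs, a shuffle groups them by key, and a reduce phase aggregates each group:

```python
from collections import defaultdict

def map_phase(document: str):
    # Map: emit a (word, 1) pair for every word in the input split.
    for word in document.lower().split():
        yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values (here, sum the counts).
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data with hadoop", "big data processing"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
```

In Hadoop the same three stages run distributed: mappers work on separate input splits, the framework shuffles over the network, and reducers write the aggregated output.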
Cost Comparison of Alternatives
Use case: Analyze Next-Generation Sequencing data to understand the genetics of cancer (figures as per the Amazon EC2 cost comparison calculator).
• HPC
– 100 steady & 200 peak-load servers; 68.4 GB memory, 1690 GB storage
– Cost: $1,746,769
– CAPEX & OPEX; time-consuming set-up; management of Hadoop clusters
• Amazon IaaS
– 400 reserved + 600 on-demand Standard Extra Large instances; 15 GB RAM, 1690 GB storage
– Time-consuming set-up; management of Hadoop clusters
• Amazon Elastic MapReduce
– Elastic: 1000 Standard Extra Large instances; 15 GB RAM, 1690 GB storage
– Cost: $377,395
– Elastic, easy to use, reliable; auto turn-off of resources
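The saving from elastic, pay-per-use billing can be sketched with a back-of-the-envelope calculation. The hourly rate and busy-hour figures below are hypothetical placeholders for illustration, not actual AWS pricing:

```python
# Hypothetical figures for illustration only -- not real AWS prices.
HOURS_PER_YEAR = 24 * 365
ON_DEMAND_RATE = 0.68  # $/hour per instance (assumed)

def always_on_cost(instances: int) -> float:
    """Cluster provisioned for peak load and billed around the clock."""
    return instances * ON_DEMAND_RATE * HOURS_PER_YEAR

def elastic_cost(instances: int, busy_hours: int) -> float:
    """Elastic cluster that auto-turns-off outside busy hours."""
    return instances * ON_DEMAND_RATE * busy_hours

peak_cost = always_on_cost(1000)          # sized for peak, never turned off
auto_cost = elastic_cost(1000, 2000)      # same size, ~2000 busy hours/year
```

The comparison shows why auto turn-off matters: the same fleet billed only for its busy hours costs a fraction of an always-on cluster sized for peak load.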
Future Direction
• Current experiments & identified areas
– Social network analysis
– Managing data centers
– Collective Intelligence – algorithms and visualization techniques
– Predictive analytics
• Accelerators exploration
– Apache Whirr – cloud-neutral way to run services
– Apache Mahout – scalable machine learning library
– Cascading – distributed computing framework
– HAMA – define and execute fault-tolerant data processing workflows
• Exploration of a LAMP-like stack for Big Data aggregation, processing and analytics
Thank You