Building Hadoop with Chef

Building & Managing Hadoop with Chef. John Martin, Sr Director, Production Engineering

Upload: john-martin

Post on 15-Dec-2014

DESCRIPTION

Slides from my presentation at #ChefConf 2013 Big Data meets Configuration Management. Edmunds.com's first foray into Hadoop is a tale of challenges, discovery, and ultimately triumph. This is the story of how Edmunds.com leveraged Chef - and its community - to build a fully automated Hadoop cluster in the face of looming project deadlines.

TRANSCRIPT

Page 1: Building Hadoop with Chef

Building & Managing Hadoop with Chef

John Martin, Sr Director, Production Engineering

Page 2: Building Hadoop with Chef

Introduction

• Me, Me, Me

  • 10+ years in .com & JEE space

• Project Crew

  • Paul MacDougall

  • Greg Rokita

  • KC Braunschweig (former)

  • Ryan Holmes (former)

• Edmunds.com

  • Founded in 1966

  • Gopher site in 1994

  • HTTP site in 1995

Page 3: Building Hadoop with Chef

Edmunds.com Environment

• Nearing 3000 hosts

• Heavily virtualized (Xen, CloudStack, AWS)

• Tomcat with some WebLogic

• Coherence, Solr, Mongo

• Publishing built on ActiveMQ

• Newly launched DWH built around Hadoop + Netezza

Page 4: Building Hadoop with Chef

Why Chef?

• Explosive infrastructure growth

• Quick to bootstrap

• Easy integration with our tooling

• knife

• The Chef Community

Page 5: Building Hadoop with Chef

What’s Hadoop?

• Open framework for data-intensive distributed applications

• Reigning King of “Big Data”

• Many services

  • HDFS

  • MapReduce

  • HBase

  • ZooKeeper

• Designed to run on commodity hardware

Page 6: Building Hadoop with Chef

Edmunds Hadoop Environment

• Multiple clusters

• Roughly 200 TB in total

• 40+ nodes in production

• Maintained by Ops + Dev

• Dell R410

  • Six-core 2.40 GHz

  • 24 GB RAM

  • 4x 1 TB 7200 RPM drives

Page 7: Building Hadoop with Chef

How We Got Here

• First cluster was a Frankenstein

  • Part BMC

  • Part manual effort

  • Part Puppet

• Staff changes & knowledge loss

• Time for a clean slate!

Page 8: Building Hadoop with Chef

Building Hadoop with Chef

• True Dev + Ops effort

• Production built in 3 weeks

• Built with community cookbooks

• All services now administered with knife

• New nodes now cluster-ready within minutes
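
The slides do not show how a new node was brought into a cluster, so the following is only a rough sketch of what a knife-driven bootstrap could look like. The function name `bootstrap_worker`, the role `hadoop_worker`, and the idea that each cluster maps to a Chef environment are all assumptions made for illustration. Only the `knife bootstrap` subcommand and its `--environment`, `--run-list`, `--node-name`, and `--sudo` options come from knife itself.

```shell
#!/bin/sh
# Hypothetical sketch: bootstrap a new Hadoop worker with knife.
# The role name and the cluster-to-environment mapping are assumptions,
# not details taken from the presentation.

# Override KNIFE (e.g. KNIFE=echo) to preview the command without running it.
KNIFE=${KNIFE:-knife}

# Usage: bootstrap_worker HOST CLUSTER
bootstrap_worker() {
    host=${1:-}
    cluster=${2:-}
    if [ -z "$host" ] || [ -z "$cluster" ]; then
        echo "usage: bootstrap_worker HOST CLUSTER" >&2
        return 1
    fi
    # Install chef-client on the host, register it under its own hostname,
    # place it in the cluster's environment, and apply the worker role.
    "$KNIFE" bootstrap "$host" \
        --environment "$cluster" \
        --run-list "role[hadoop_worker]" \
        --node-name "$host" \
        --sudo
}
```

With `KNIFE=echo` set, calling `bootstrap_worker node01.example.com prod` prints the knife arguments instead of contacting the host, which is a cheap way to check the command before running it for real.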

Page 9: Building Hadoop with Chef

What We Gained

• First highly-visible Chef success story at Edmunds

• Cemented Chef as our CM solution

• Engaged us with the community

• Completely automated Hadoop infrastructure

• New suite of administrative scripts

  • knife-[start|stop]-all.sh $cluster

  • knife-[start|stop]-hbase.sh $cluster

  • knife-[start|stop]-mapred.sh $cluster

  • knife-[start|stop]-oozie.sh $cluster
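
The contents of the knife-[start|stop]-*.sh scripts are not shown in the slides. As a minimal sketch of how such a wrapper might work, the function below uses `knife ssh` to run a service command on every node matching a search query. The function name `hbase_service`, the role `hbase_master`, the service name `hbase-master`, and the use of the cluster name as a Chef environment are hypothetical and would differ per installation.

```shell
#!/bin/sh
# Hypothetical sketch of a knife-[start|stop]-hbase.sh style wrapper.
# Role, service, and environment names are assumptions for illustration,
# not the actual Edmunds.com scripts.

# Override KNIFE (e.g. KNIFE=echo) to preview the command without running it.
KNIFE=${KNIFE:-knife}

# Usage: hbase_service start|stop CLUSTER
hbase_service() {
    action=${1:-}
    cluster=${2:-}
    case "$action" in
        start|stop) ;;
        *)
            echo "usage: hbase_service start|stop CLUSTER" >&2
            return 1
            ;;
    esac
    if [ -z "$cluster" ]; then
        echo "usage: hbase_service start|stop CLUSTER" >&2
        return 1
    fi
    # Search the Chef server for the cluster's HBase master nodes and run
    # the service command on each match over SSH.
    query="chef_environment:${cluster} AND role:hbase_master"
    "$KNIFE" ssh "$query" "sudo service hbase-master ${action}"
}
```

A call such as `hbase_service stop prod` would then stop the service on every matching node in one step. An "all" variant could call similar functions for each service in the order the cluster requires.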

Page 10: Building Hadoop with Chef

Where Next?

• New cluster currently being built!

• Integration with Cloudera Manager

• Cluster replication

• Continue evangelism of Chef’s awesomeness

• Extend more of the toolchain around Chef

• See you around at the LA Chef UG!

Page 11: Building Hadoop with Chef

Thank you!

• email: [email protected]

• twitter: @tekbuddha