My talk at LVEE 2016

Using Hadoop stack to build a cloud VAT declarations revising service. Alex Chistyakov, Git in Sky. Grodno, LVEE 2016


Posted on 09-Jan-2017


Page 1: My talk at LVEE 2016

Using Hadoop stack to build a cloud VAT declarations revising service

Alex Chistyakov, Git in Sky

Grodno, LVEE 2016

Page 2: My talk at LVEE 2016

Who I am

● Hello, my name is Alex

● Principal Engineer @ Git in Sky

● Hadoop operations engineer

● Former Java developer (not only Java and not so “former” in fact)

Page 3: My talk at LVEE 2016

Who are you?

● Linux and OSS enthusiasts?

● Software developers?

● DevOps engineers?

● Big data guys?

Page 8: My talk at LVEE 2016

Well, what is this all about?

● Configuring a Hadoop/HBase cluster is easy

● 1) Buy a lot of hardware

● 2) Configure the bloody cluster!

● 3) ???

● 4) PROFIT!!!

Page 9: My talk at LVEE 2016

Big Data is hard!

● A customer wants a number of environments for different purposes (dev, testing, staging & production)

● DevOps culture requires repeatability!

● (Observe a beautiful snowflake to the right)

● Business wants to reduce costs
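
One way to read the repeatability requirement: every environment is derived from the same base definition, and only explicitly listed overrides differ. A minimal Python sketch of that idea follows. The setting names, host name, and values are hypothetical and are not taken from the talk.

```python
from copy import deepcopy

# Hypothetical base settings shared by every environment (illustrative values only).
BASE = {
    "hdfs_replication": 3,
    "datanode_count": 3,
    "namenode_host": "namenode.example.internal",
}

# Hypothetical per-environment overrides; anything not listed is inherited from BASE.
OVERRIDES = {
    "dev": {"hdfs_replication": 1, "datanode_count": 1},
    "testing": {"datanode_count": 2},
    "staging": {},
    "production": {"datanode_count": 12},
}


def build_environment(name: str) -> dict:
    """Return the full settings for one environment: BASE plus its overrides."""
    settings = deepcopy(BASE)
    settings.update(OVERRIDES[name])
    return settings


def drift(env_a: str, env_b: str) -> dict:
    """Return the keys whose values differ between two environments."""
    a, b = build_environment(env_a), build_environment(env_b)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


if __name__ == "__main__":
    for env in OVERRIDES:
        print(env, build_environment(env))
    print("staging vs production:", drift("staging", "production"))
```

Because the differences between environments live in one small table, a review of `OVERRIDES` shows exactly how staging deviates from production, which is hard to see when each cluster is configured by hand.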

Page 11: My talk at LVEE 2016

So, we need a detailed plan

● 1) Buy an enterprise subscription from Oracle

● ^ FAIL!

Page 14: My talk at LVEE 2016

So, we need a detailed plan

● 1) Read the manual on the product site

● 2) Configure everything manually

● ^ FAIL!

Page 18: My talk at LVEE 2016

So, we need a detailed plan

● 1) Take Cloudera distribution of Hadoop

● 2) Configure everything from a web interface

● 3) Don’t forget to buy an enterprise subscription

● 4) ^ MULTIPLE FAILS!!!

Page 19: My talk at LVEE 2016

A word on proprietary software

● Proprietary software is full of nasty bugs, period

Page 20: My talk at LVEE 2016

A word on open source software

● Open source software is awesome

Page 22: My talk at LVEE 2016

Software market in 2016

● It’s not “proprietary vs open source”

● It’s “open source vs open source”

Page 23: My talk at LVEE 2016

Open source vs open source

● Cloudera CDH vs vanilla Apache

Page 28: My talk at LVEE 2016

So, we need a detailed plan

● 1) Hire a DevOps engineer

● 2) Use Chef or something

● 3) Automate all the things

● 4) ???

● 5) PROFIT!!!

Page 29: My talk at LVEE 2016

100 reasons not to use Cloudera CDH

● Cloudera CDH obscures configuration

● Cloudera CDH generates textual configs from the DB

● Cloudera CDH is web-interface centric

● Cloudera CDH is a monolith with a vendor lock-in
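
For contrast with configs generated from a database behind a web interface, here is a minimal sketch of keeping settings as plain data under version control and rendering a Hadoop-style `*-site.xml` file from them. The `<configuration>`/`<property>` layout and the `fs.defaultFS` key are standard Hadoop. The function name `render_site_xml` and the host name are hypothetical, and this is not a description of how the Git in Sky Ansible roles are implemented.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties: dict) -> str:
    """Render a dict of settings as a Hadoop-style *-site.xml document."""
    root = ET.Element("configuration")
    # Sort keys so the same input always produces identical output.
    for name in sorted(properties):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(properties[name])
    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical value for illustration only.
    core_site = {"fs.defaultFS": "hdfs://namenode.example.internal:8020"}
    print(render_site_xml(core_site))
```

With this layout the text file is a pure function of reviewed input, so a change to the cluster shows up as an ordinary diff rather than as a click in a web interface.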

Page 30: My talk at LVEE 2016

Our own little open source product

● Based on Ansible (Ansible is like Chef but awesome)

● https://github.com/gitinsky/ansible-hadoop-stack-howto

● https://github.com/gitinsky/ansible-role-*

Page 33: My talk at LVEE 2016

Problems

● Lack of documentation

● Lack of manpower

● Nobody uses our product (except us)

Page 34: My talk at LVEE 2016

What about the VAT service thing?

● Forget it, it’s not that relevant

Page 35: My talk at LVEE 2016

Conclusions

● Open source software is awesome

● But Cloudera CDH is not

● We can make open source software better

Page 36: My talk at LVEE 2016

So long, and thanks for all the fish!

● Ask your questions please

● Alex Chistyakov, Principal Engineer @ Git in Sky

● http://gitinsky.com

● http://meetup.com/DevOps-40