osdc 2015: martin gerhard loschwitz - kristian köhntopp | 45 minutes of openstack hate

75
45 Minutes of Openstack Hate: A reality check Kristian Köhntopp, Cloud Architect Old Fart Martin Loschwitz, Cloud Architect Future Integration Representative SysEleven https://www.flickr.com/photos/pinksherbet/188842453 Pink Sherbet Photography (CC BY 2.0)

Upload: netways

Post on 15-Jul-2015

140 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

45 Minutes of Openstack Hate: A reality checkKristian Köhntopp, Cloud Architect Old Fart Martin Loschwitz, Cloud Architect Future Integration Representative

SysEleven

https://www.flickr.com/photos/pinksherbet/188842453 Pink Sherbet Photography (CC BY 2.0)

Page 2: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

2

Page 3: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

3

Page 4: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

https://uksysadmin.files.wordpress.com/2011/03/openstackwallpaper1.png

Page 5: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Widely supported by many marketing departmentshttps://www.flickr.com/photos/ahockley/8662640096 Aaron Hockley (cc-by-sa 2.0)

Page 6: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

6

http://www.zdnet.com/article/customers-reporting-interest-in-cloud-containers-linux-and-openstack-for-2015/ http://www.businesscloudnews.com/2015/04/09/red-hat-dell-redouble-openstack-private-cloud-efforts/

http://blogs.wsj.com/cio/2015/02/25/the-morning-download-h-p-deal-with-deutsche-bank-is-step-forward-for-openstack/

Page 7: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

What Openstack Vendors promise…https://www.flickr.com/photos/debarshiray/8237431709/sizes/l debarshiray (CC-BY-SA)

Page 8: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

How Openstack Admins imagine their workplace…(C) Kristian Köhntopp, 2006

Page 9: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

What is being delivered…(C) Hendrik Scholz, 2006

Page 10: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

What the admin job actually looks like…(C) Kristian Köhntopp, 2015

Page 11: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

What is it that we want to do?

11

Page 12: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

„Any VM, anywhere.“

12

Page 13: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

„Infrastructure as Code.“

13

Page 14: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Why would I need that?

• 24 Cores (48 Cores HT), 256G RAM, 2* 10GBit und 12* 3TB HDD including a solid BBU

• 10k EUR

• Applications that actually fill that box are rare. So we are cutting it up to sell off the parts.

14

Page 15: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

"Any VM, anywhere"15

CPU, RAM

StorageNetwork

OverlayUnderlay

Page 16: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage"Where the Internet lives", http://www.google.com/about/datacenters/gallery/#/tech/12

Page 17: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage requirements

• VM with Ephemeral Storage • Not a problem, right? Because storage can be an un-

RAID-ed local disk. • But then you got no migration.

• Mainentance sucks w/o Migration. For you and for your customers.

17

Page 18: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Lesson #1:Nope, local storage is not a legal

default.

18

Page 19: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage

• VM with Volume: All writes are always remote. • And redundant.

• Relevant metrics: • Bandwidth, IOPS and Latency • MB/sec, multithreaded-fsync()/sec und sequential-

fsync()/sec • Where do the limits come from?

19

Page 20: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage vs. requirements

• Scales easily: • Bandwidth, MT-IOPS

• Hard to scale: • Sequential-IOPS (b/c latency, target: 10k) • Default-Benchmark: "MySQL Slave on a volume" • "16 KB single-thread random-write on a datafile."

20

Page 21: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Lesson #2:A Single-Threaded Random-Write

Benchmark is a fine first evaluation.

21

Page 22: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage vs. actual Openstack delivery

• OpenStack Default is iSCSI und tgtd • »The problem is left as an exercise to the user.«

• Storage Vendors are totally in love with that approach. • https://wiki.openstack.org/wiki/CinderSupportMatrix

22

Page 23: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage: Ceph - the good

• Excellent bandwidth • Good Multithreaded-IOPS • Very robust, if you got spare memory and time for MTTR

23

Page 24: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage: Ceph - the bad

• CRUSH • Layout follows topology • Change topology, move data needlessly • You do need a background network for storage to

facilitate Rebalancing/Recovery/Cluster-Reorg

24

Page 25: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage: Ceph - the ugly

• Sequential IOPS fatally slow (200 IOPS). • MySQL slave on our Ceph-Volumes @ 200 Commit/s • Windows 8.1 boots from a Ceph-Volume in 15 Minutes

25

Page 26: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Lesson #3:Pure Play OpenStack will not work for production.

Corollary: OpenStack is not a functioning

Open Source Project.

26

Page 27: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Storage requirements vs. Multitenancy

• IOPS-Requirements need SSD to implement • IOPS Quotas require SSD to implement • SSD complicated - Preice vs. guranteed Performance • Caches complicated

27

"Magento Indexer"

Single-Threaded Database Updates

You can't put quota on what isn't there.

Small IOPS = large incremental steps, large variance.

Target price 1 EUR/GB atm.

Low performance variation more

important than huge peak perf.

Working Set > Cache

means you are working

the raw iron.

Page 28: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Lesson #4:Caching is as much part of the problem

as it is part of the solution.

28

Page 29: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

NetworkMercury Redstone Connector MR-1 (1960) https://www.flickr.com/photos/jurvetson/5691350527 Steve Jurvetson (CC-BY)

Page 30: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Network requirements

• Hosting setup: • Topology: Wait-Free, Oversubscription-Free • Redundancy: SPOF-Free • Multi-Tenancy: Isolation and Quotas

• Also: • Capable of carrying the storage traffic

30

Page 31: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

–Joe Random Alphauser

„We just rolled out a few dozen VMs with Hadoop and played Terasort.“

31

http://bradhedlund.com/2012/01/25/construct-a-leaf-spine-design-with-40g-or-10g-an-observation-in-scaling-the-fabric/

Page 32: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Notwork: Openstack default offering

• Open vSwitch, GRE-Ball, Broadcast-Problem, Chokepoint, SPOF

• Current releases are only marginally less braindead.

32

SPOF

Page 33: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Lesson #5:Ok, networking doesn't work, too.

33

Page 34: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Let's look around for a non-broken solution34

Page 35: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

OpenContrail: the good, …

• Open Source Project by Juniper • Uses existing hardware routing infrastructure

• scales, no SPOFs, no Chokepoints • based on MPLS, BGP, and other well understood

protocols • actually delivers stuff (as opposed to e.g. OpenDaylite)

35

Page 36: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

OpenContrail: the bad, …

• Juniper bought Contrail, doesn't understand how to market it

• little and outdated documentation, bad release management, very bad packaging

• funny support organisation

36

Page 37: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

OpenContrail: the bad, …

• During the OpenContrail build process, the scons-based build-process will

• download the full source of tools such as Bind and Curl • patch them wildly • put the resulting binaries into a .deb package

• which will of course conflict with half a dozen Ubuntu packages

37

Page 38: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

… and the ugly.https://www.flickr.com/photos/59145750@N03/5559004171 Mara Tr. (CC-BY 2.0)

Page 39: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Technology-Jenga

• OpenContrail’s "Stack" • vrouter (.ko), if-map, Sandesh (XML over Thrift!) • C++, Python, node.js, irond (Java) • redis, Cassandra, Zookeeper • xmpp, BGP, MPLS • Up Next in OpenContrail 2.2: Kafka • Put that into a job profile, try to find a candidate. Is he named

Pedro?

39

Page 40: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Lesson #6:Problem not limited to Contrail scope.

40

Page 41: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Let's look what else is in the box…https://www.flickr.com/photos/markjsebastian/4217877353/ Mark Sebastian (CC BY-SA 2.0)

Page 42: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

General requirements as of 2015

• Distributed Anything: • NTP, centralized Logging, centralized Monitoring • functional validation before components before join, clean

cluster startup • cluster comms are Kyle-Kingsbury-proof • CA, encrypted data in flight, optionally encrypted data at rest • User-Story regarding Live-Upgrades, with Canaries

42

Page 43: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

OpenStack delivers this…https://www.flickr.com/photos/z287marc/3189567558/ z287marc (CC BY 2.0)

Page 44: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

A wild hack session appears…https://www.flickr.com/photos/nauright/11341288174/ Romana Klee (CC BY-SA 2.0)

Page 45: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Not an isolated problem…

• Start a HEAT stack with a few networks and 20 VMs • … and the cluster switches off. • Nova does not communicate with Cinder, but has a

hard timeout. • "Are we stupid or is OpenStack stupid? Let's test a few

public clouds…"

45

Page 46: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Result…

• Phone ringing… • "Whatever you are doing,

could you please do something else?"

46

"Rotes Telefon", http://de.wikipedia.org/wiki/Datei:Jimmy_Carter_Library_and_Museum_99.JPG User:Piotrus (CC-BY 3.0)

Page 47: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Not an isolated problem…47

Implementation tested

on Devstack in VMware Fusion

on a MacBook Air in St. Oberholz?

Page 48: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Not an isolated problem…48

https://thespeechatimeforchoosing.files.wordpress.com/2012/12/facepalm.jpg

Keystone

Page 49: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Not an isolated problem…

• "Could you have a look, my Instances won't instantiate." • A longer debug session later it turns out that libvirts on

cloud17 is dysfunctional. • Compute-Nodes in OpenStack join the cluster simply by

being visible in AMQP (register as a consumer), no functional validation.

49

Page 50: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Not an isolated problem…50

Ceilometer

Page 51: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Not an isolated problem…

• During the Juno release cycle, Horizon got patched to be „less confusing for users“ (any GNOME devs around?)

• Result: Assigning floating IPs using Horizon became impossible when using OpenContrail

• Lesson learned: OpenStack wants to be versatile, but in fact, most development only ever happens for the „streamline“ setups

51

Page 52: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

What is the real problem?https://www.flickr.com/photos/jdhancock/8395113234 JD Hancock (CC BY 2.0)

Page 53: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

„Any VM, anywhere.“

53

Page 54: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

„As a hoster/enterprise/department, I want a virtualization platform

which does…“

54

Page 55: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Problemspec vs. Produktspec

• Requirements regarding • Tenant Isolation, • Billing Model, • Operations Model, • Development Model, • Scalability

55

Page 56: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Prototypes vs. Product"Prototype in the round file", https://www.flickr.com/photos/generated/3313311558 Jared Tarbell (CC-BY 2.0)

Page 57: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Stabilize the core with actual engineering vs.

Moar components

57

Page 58: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

“The Big Tent”https://www.flickr.com/photos/fmckinlay/2410261506 Fiona Shields (CC BY-NC-ND 2.0)

Page 59: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

"The problem with the Big Tent is that it is full of clowns."https://www.flickr.com/photos/fmckinlay/2410261506 Fiona Shields (CC BY-NC-ND 2.0)

Page 60: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

–Maik Zumstrull

„Instead of having a functional solution I can now choose between 13

differently deficient components.“

60

Page 61: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Democracy as a software architecture model

• Product definition == target specification • Who is our customer and what do they need? • What are the mandatory/desireable/optional

properties of the product to become part of the solution space instead of the problem space?

61

Page 62: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Democracy as a software architecture model

• Quality Assurance • "But OpenStack does have Blueprints, CI and code

review" • Without a target specification you have no idea about

requirements on scale, workload types, security, …

62

Page 63: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Who actually votes on your commit…

Page 64: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Democracy as a software architecture model

• Limited, learnable technology stack • One recommended and tested solution with a

documented best practice for deployment instead of a plugin architecture.

• Reuseable solutions for comparable problems in neighboring subprojects (“Architecture Refactoring”).

64

Page 65: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Monasca

Page 66: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

How to win at Hipsterbingo…

• »Monasca is an open-source multi-tenant, highly scalable, performant, fault-tolerant monitoring-as-a-service solution that integrates with OpenStack. It uses a REST API for high-speed metrics processing and querying and has a streaming alarm engine and notification engine.«

66

Bingo!

Page 67: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

How to win at Hipsterbingo…

• »Uses a number of underlying technologies:Apache Kafka, Apache Storm, Zookeeper, MySQL, Vagrant, Dropwizard, InfluxDB, Vertica.«

67

Page 68: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

Democracy as a software architecture model

• Useful vs. Hyped • “Docker is like OpenStack,

but from a Dev instead of an Ops perspective. Wouldn't it be great to unify the projects?”

68

https://twitter.com/martinisoft/status/527191803603468288

Page 69: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

And finally…69

Page 70: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

TL;DR please?

70

Page 71: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

TL;DR

• OpenStack right now: not a product, but a box of nuts and bolts. Significant assembly required.

• Not all parts are actually bad, but the quality varies. A lot.

• No other project has the momentum. • Most other projects do not have multi-tenancy, multi-node. • Many haven't even grasped the actual problem. Including Docker.

71

Page 72: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

TL;DR

• Deep in the Gartner hype cycle phase 2: • Users and hence real-world

requirements missing. • No unified view on acceptable

solutions. • Vendors think they can "win" and

"own" the market or products, pushing their embrace and extend agendas.

72

Page 73: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

TL;DR

• Apache-like open development processes are a proven model for taming open source and commoditisation.

• Compare Hadoop • Based on methods pioneered by Java

73

Page 74: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

74

?

Page 75: OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate

3. Openstack DACH Tag 11. Juni 2015

Berlin http://openstack-dach.org

75