OpenStack at Scale inside NetApp Manasi Prabhavalkar NetApp Inc. August 24, 2016

Upload: tesora

Post on 07-Apr-2017


Page 1: OpenStack at Scale Inside NetApp

OpenStack at Scale inside NetApp
Manasi Prabhavalkar
NetApp Inc.
August 24, 2016

Page 2: OpenStack at Scale Inside NetApp

About Me

Manasi Prabhavalkar (@manasip11)

Systems Architect for OpenStack in the Engineering Shared Infrastructure Services group, AKA Customer Zero

Master's in Computer Science @ NC State University

Bleeding edge of technology to serve as a platform for innovation inside NetApp

© 2016 NetApp, Inc. All rights reserved.

Page 3: OpenStack at Scale Inside NetApp

Timeline

Dates on the axis: Pre-2014, Aug 2014, Sept 2014, Aug 2015, Dec 2015, Jan 2016

Milestones: BEFORE OPENSTACK, OPENSTACK INTRODUCTION, AUTOMATION WITH PUPPET, GLOBALIZING OPENSTACK, AUTOMATING NDO UPGRADES, GLOBAL NDO UPGRADES, FUTURE STEPS

Page 4: OpenStack at Scale Inside NetApp

Global Engineering Cloud

Massively scalable shared virtual data center infrastructure

Internal Private Cloud (GEC): One stop portal, Multi-hypervisor

Key stats: 75,000 Total VM Capacity, 15% KVM and growing

FlexPod Datacenter: OpenStack RDO Mitaka, NetApp FAS and/or E-Series Storage, Cisco Nexus Networking, Cisco UCS Compute

Automation Deployed: Puppet Open Source, Jenkins, Git

Page 5: OpenStack at Scale Inside NetApp

Region Architecture

[Diagram: REGION ZERO hosts the shared Keystone and Horizon services behind load balancers (LB), backed by three GDB nodes. REGION-01, REGION-02, and REGION-03 each have a CONTROLLER (Glance, Nova, Neutron, Cinder, Ceilometer, Heat, RabbitMQ), a DB, MONGODB, and COMPUTE on Cisco UCS 5108 chassis with UCS B200 M3 blades.]
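The split between shared and per-region services on this slide can be pictured as a small data model. This is only a sketch: the service and region names come from the slide, while the `Region` and `Cloud` classes and their fields are hypothetical and do not describe NetApp's actual tooling.

```python
from dataclasses import dataclass, field

# Service names are taken from the slide; class and field names are hypothetical.
SHARED_SERVICES = ("Keystone", "Horizon")
REGION_SERVICES = ("Glance", "Nova", "Neutron", "Cinder", "Ceilometer", "Heat")


@dataclass
class Region:
    name: str
    controller_services: tuple = REGION_SERVICES
    message_queue: str = "RabbitMQ"
    compute_nodes: list = field(default_factory=list)


@dataclass
class Cloud:
    shared_services: tuple = SHARED_SERVICES
    regions: dict = field(default_factory=dict)

    def add_region(self, name):
        """Register a region with its own controller services."""
        region = Region(name)
        self.regions[name] = region
        return region

    def endpoints(self):
        """List (scope, service) pairs: shared services once, regional ones per region."""
        pairs = [("REGION ZERO", s) for s in self.shared_services]
        for region in self.regions.values():
            pairs += [(region.name, s) for s in region.controller_services]
        return pairs


cloud = Cloud()
for name in ("REGION-01", "REGION-02", "REGION-03"):
    cloud.add_region(name)

for scope, service in cloud.endpoints():
    print(scope, service)
```

Keeping Keystone and Horizon in one shared scope means adding a region only adds that region's controller services, which matches the layout drawn on the slide.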

Page 6: OpenStack at Scale Inside NetApp

Automation With Puppet

[Diagram: a FlexPod rack with two Cisco Nexus 9396PX switches, two FAS8040 controllers (A and B) with DS2246 disk shelves of 1200GB drives, two Cisco UCS 6248UP units, and Cisco UCS 5108 chassis with UCS B200 M4 blades.]

Provisioning steps shown on the slide:

FLEXCLONE AND ASSIGN BOOT LUN

ASSIGN SERVICE PROFILE

CREATE FLEXVOLS FOR CINDER AND NOVA STORAGE

FEED THE NODE TO PUPPET

CREATE VLANS FOR INSTANCES (VLAN ####)

PUPPET ROLES: WEB, LOADBALANCERS, KEYSTONE, GALERADB, CONTROLLER, COMPUTE, DATABASE, MONGODB

OpenStack Juno in Production in less than 90 minutes with 45 nodes
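One way to picture "feed the node to Puppet" is a classifier that gives each node exactly one role. The sketch below is illustrative only: the role names are read from the slide, but the hostname convention (`<role>-<number>`) and the functions `role_for` and `group_by_role` are assumptions, not the team's actual Puppet setup.

```python
# Role names come from the slide; the hostname convention is a made-up example.
PUPPET_ROLES = (
    "web", "loadbalancers", "keystone", "galeradb",
    "controller", "compute", "database", "mongodb",
)

# Provisioning steps as listed on the slide (the slide does not state their order).
PROVISIONING_STEPS = (
    "flexclone and assign boot LUN",
    "assign service profile",
    "create FlexVols for Cinder and Nova storage",
    "feed the node to Puppet",
    "create VLANs for instances",
)


def role_for(hostname):
    """Map a hostname such as 'compute-017' to a role by its prefix."""
    prefix = hostname.split("-", 1)[0]
    if prefix not in PUPPET_ROLES:
        raise ValueError(f"no Puppet role for host {hostname!r}")
    return prefix


def group_by_role(hostnames):
    """Group hostnames by the role their prefix maps to."""
    groups = {}
    for host in hostnames:
        groups.setdefault(role_for(host), []).append(host)
    return groups


nodes = ["keystone-01", "controller-01", "compute-01", "compute-02"]
print(group_by_role(nodes))
```

Failing loudly on an unknown prefix keeps a mistyped hostname from silently receiving no configuration.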

Page 7: OpenStack at Scale Inside NetApp

Automating Non-Disruptive Upgrades

Seamless user experience

1. Shared Services: Keystone & Horizon upgraded serially

2. Controller Services: upgraded serially across regions

3. Compute: Live migrate instances to other Compute nodes. Upgrade empty Compute node serially within region and parallel across regions

Zero service interruption

[Diagram: REGION ZERO with three Keystone and three Horizon instances; REGION-01 through REGION-N, each with a CONTROLLER (Glance, Nova, Neutron, Cinder, Ceilometer, Heat, RabbitMQ) and COMPUTE on Cisco UCS 5108 chassis with UCS B200 M3 blades.]
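The compute step on this slide (drain a node by live migration, upgrade it, and proceed serially within a region while regions run in parallel) can be sketched as follows. The functions `live_migrate_instances` and `upgrade_node` are hypothetical placeholders that only record events; a real version would call Nova and the team's automation, which the slides do not show.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

log = []
log_lock = threading.Lock()


def record(event):
    """Append an event to the shared log in a thread-safe way."""
    with log_lock:
        log.append(event)


def live_migrate_instances(region, node):
    # Placeholder: a real version would ask Nova to move instances off `node`.
    record((region, node, "drained"))


def upgrade_node(region, node):
    # Placeholder for the actual package upgrade and service restart.
    record((region, node, "upgraded"))


def upgrade_region(region, nodes):
    """Upgrade compute nodes one at a time within a single region."""
    for node in nodes:
        live_migrate_instances(region, node)
        upgrade_node(region, node)
    return region


def upgrade_all(regions):
    """Run the per-region serial upgrade in parallel across regions."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        futures = [pool.submit(upgrade_region, r, n) for r, n in regions.items()]
        return [f.result() for f in futures]


regions = {"REGION-01": ["c1", "c2"], "REGION-02": ["c3", "c4"]}
done = upgrade_all(regions)
print(done)
```

Because each region gets its own worker, events from different regions may interleave in `log`, but within one region a node is always drained and upgraded before the next node is touched.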

Page 8: OpenStack at Scale Inside NetApp

Global NDO Upgrades

SITE 1: BANGALORE: 100 VM Capacity, 14 Total Nodes, 1 Hour to Upgrade

SITE 2: CALIFORNIA: 600 VM Capacity, 22 Total Nodes, 1.5 Hours to Upgrade

SITE 3: RTP, NC: 2000 VM Capacity, 30 Total Nodes, 2 Hours to Upgrade

SITE 4: RTP, NC: 6000 VM Capacity, 86 Total Nodes, 4 Hours to Upgrade
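The per-site figures on this slide can be combined with a few lines of arithmetic. The sketch assumes each set of figures belongs to the site label that follows it in the transcript; the variable names are illustrative, and the slide does not say whether sites were upgraded at the same time, so no overall duration is computed.

```python
# (VM capacity, total nodes, hours to upgrade) per site, as read from the slide.
sites = {
    "SITE 1": (100, 14, 1.0),
    "SITE 2": (600, 22, 1.5),
    "SITE 3": (2000, 30, 2.0),
    "SITE 4": (6000, 86, 4.0),
}

total_vm_capacity = sum(vms for vms, _, _ in sites.values())
total_nodes = sum(nodes for _, nodes, _ in sites.values())

# Average upgrade throughput at each site, in nodes per hour.
nodes_per_hour = {name: nodes / hours for name, (_, nodes, hours) in sites.items()}

print(total_vm_capacity, total_nodes)
for name, rate in nodes_per_hour.items():
    print(name, round(rate, 1))
```

Under that pairing the four sites total 8,700 VMs of capacity on 152 nodes, and the largest site moves through the most nodes per hour.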

Page 9: OpenStack at Scale Inside NetApp

OpenStack Lessons

Global Engineering Cloud

Lessons Learned

OpenStack is maturing but documentation is key

Set Expectations: OpenStack is different from what we’ve supported in the past

NetApp Storage played a positive role in deployment and upgrades: Non-disruptive; Easy to scale; Fast instance creation using NetApp Cinder Driver (50% faster than generic NFS)

Advice for you

FlexPod provides a highly available, independently scalable and resilient platform

Monitoring for greater visibility in your OpenStack environment

Define an upgrade strategy that suits your architecture

Try to leverage automation tools and CI/CD platforms

Globally dispersed team: Refine and test automation in your local geography and then roll out globally; Educate, enable, and mentor your peers to upgrade based on their schedule

Page 10: OpenStack at Scale Inside NetApp

Where we’re going next

Global Engineering Cloud

Integration of other OpenStack projects

1. Ironic (Baremetal as a Service)

2. Trove (Database as a Service)

3. Manila (File Share as a Service)

4. Magnum (Container as a Service)

Page 11: OpenStack at Scale Inside NetApp

Key Takeaways

Have a good foundation that you can count on

Converged Infrastructure (FlexPod) provides a scalable, highly efficient platform

Set expectations, PLAN ahead, and DOCUMENT well!

Automation and non-disruptive upgrades were KEY ingredients for success

Our Global Engineering Cloud is backed by an OpenStack ecosystem that is highly available, upgradeable between releases, and provided at scale across geographical regions

Page 12: OpenStack at Scale Inside NetApp

Other collateral

Reference architectures

NEW Technical Report (FlexPod OSP8): http://nt-ap.com/1XN5Tgc

RHEL-OSP6 on FlexPod Deployment: http://bit.ly/1Q7b3Qb

RHEL-OSP6 on FlexPod Design: http://bit.ly/1LFCHEz

Page 13: OpenStack at Scale Inside NetApp

Thank You