Page 1: SARDINA… · Title: sdn-presentation-openstack-summit-uoe-eleanor-cloud-20180517 Created Date: 5/31/2018 7:09:59 PM

Sardina Systems Proprietary. Copyright (C) 2014 -- 2018. Sardina Systems. All rights reserved.

S A R D I N A

The road to OpenStack at (one of?) UK’s largest academic private cloud: University of Edinburgh

Jan Winter [email protected] +44 131 650 5009

Kenneth Tan [email protected] +44 798 941 7838

1. Who?

University of Edinburgh

one of the leading universities in the world, amongst top 6 in the UK

ranked #6 or #7 in Europe, depending on whom you ask

founded a little while ago … in 1582 (older than some countries!)

population: ca. 45k (undergraduate + postgraduate students, admin staff, academic staff)

Research Services, IT Infrastructure Division: part of University’s central Information Services

Responsible for operating university-wide IT infrastructure / services for research

Research Services: 10 people (yes, only! and I am 10%!)

2. Strategy and objective

Strategy

move service delivery from higher-cost channels to lower-cost channels

automation + self-service: lower cost, faster response, fewer errors

enable higher value HR, deliver higher value services

Cloud objective

provide flexible infrastructure of compute, storage, network, services for research

… enable higher value HR, deliver higher value services

3. Where is OpenStack in this picture?

Role of OpenStack

provide users with flexible, self-service provisioning of computing infrastructure

compute, storage, networking, applications portfolio

OpenStack should not result in a net increase in workload for the Research Services operations team

serve disparate user groups, varying and competing needs, long-lived and short-lived workloads

HPC/technical computing workload, e.g. CERN, particle physics

data science workload: hadoop, spark

cannot make assumptions about users’ level of technical expertise

more importantly …

need to meet organization mission

don’t throw money at the challenge

money does not grow on trees

4. The partner (not in crime!)

Sardina Systems

European cloud platform software vendor with presence in Slovakia, Romania, Russia, UK

technical team’s expertise in large scale, business and operations critical facilities in finance, defense, government, meteorology and automotive industries

OpenStack Foundation Corporate Sponsor

5. The solution

Route to the solution

problem definition and solution design

system architecture

hardware specification

project management: what’s happening when, and by whom

plan for production from day 0

build for production from day 0

it’s not a toy: high availability is not optional

monkey-capable no-magic scaling up process

not a guinea pig!

… to consider full lifecycle

deploy

operate

upgrade

6. The solution design

The solution design

management: 16 nodes

storage: 9 ceph nodes

compute: ca. 200+ nodes (design size; varies from 50+, can dynamically add using other servers in the farm)

network: 5 VLANs, 10 GbE

integrates with broader University IT services

7. Deploy phase

From design to deployment

the ingredients: what configuration is needed

ingredients ready: make sure the hardware is available (servers, storage, networking)

system configuration: single source of truth at all times
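
A minimal sketch of the single-source-of-truth idea, assuming a hypothetical inventory layout: one Python dict holds the role counts from the solution design (compute is set to an assumed starting size of 50), and every host name is derived from it instead of being maintained by hand. The `INVENTORY` structure, the host-name pattern and the `expand_inventory` helper are illustrative assumptions, not part of Foreman, SaltStack or FishOS Deployer.

```python
# Hypothetical inventory: the one place where the system is described.
# Role counts follow the solution design; "compute" is an assumed starting size.
INVENTORY = {
    "management": 16,
    "storage": 9,
    "compute": 50,
}


def expand_inventory(inventory):
    """Derive per-role host names from the role counts."""
    hosts = {}
    for role, count in inventory.items():
        hosts[role] = [f"{role}-{i:03d}" for i in range(1, count + 1)]
    return hosts


hosts = expand_inventory(INVENTORY)
for role, names in hosts.items():
    print(role, len(names), names[0], "...", names[-1])
```

Changing a count in `INVENTORY` and re-running is the only edit needed to grow a role, which is the property the slide asks for.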

Automate, Automate, Automate

system configuration = input for automation

deployment of entire system: fully automated

reduced project risk = increased operational confidence

base OS deployment with Foreman

post OS deployment configuration with SaltStack

full OpenStack + Ceph deployment with FishOS Deployer (SaltStack variant)

principle: standard enterprise tools, nothing fancy!

Deploy: 1, 2, 3 … showtime

1. input: configuration and inventory

2. run deployer

3. 2 cups of coffee

… done

8. Concept to reality

Deployer

pre-tested and proven off-site

ci/cd process

no knowledge of OpenStack necessary

Implementation: proof of solution

detailed planning: what is to be done when, what will be available when, who will do what by when?

time from zero-to-operation: 8 weeks (and 6 weeks waiting for storage hardware to arrive)

9. Upgrade phase

Run at all times

n to n+1 upgrade every 6 months (releases eol in 12 months)

2n upgrades in n years: more risky option

it’s not a toy!

system has to continue operation at all times
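
The cadence arithmetic above can be sketched as follows. This is a simplified model that assumes exactly one release every 6 months and end of life 12 months after release, as stated on the slide; the function names are hypothetical.

```python
RELEASE_INTERVAL_MONTHS = 6  # from the slide: one release every 6 months
EOL_AFTER_MONTHS = 12        # from the slide: a release reaches EOL after 12 months


def upgrades_needed(years):
    """Number of n -> n+1 upgrades needed to stay current for `years` years."""
    return years * 12 // RELEASE_INTERVAL_MONTHS


def months_until_eol(months_since_release):
    """Months of support left for a release of the given age (never negative)."""
    return max(0, EOL_AFTER_MONTHS - months_since_release)


for years in (1, 2, 5):
    print(years, "years ->", upgrades_needed(years), "upgrades")
```

Under these assumptions a release that is already one cycle old has only 6 months of support left, which is why the upgrade path has to be routine.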

Zero-downtime?

get the solution architecture right (from the deploy phase) … otherwise you find the problem 6 months later

monkey-capable fully automated upgrade process

10. The operate phase

Challenges

uptime + availability

resource management

infrastructure-as-code operation

idempotent: flexible to change at any time
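
A minimal sketch of what idempotent means in an infrastructure-as-code setting, assuming system state can be modelled as a flat dict of settings: applying the same desired state twice changes nothing the second time. The `apply_state` helper and the setting names are hypothetical and do not represent how SaltStack works internally.

```python
def apply_state(current, desired):
    """Return (new_state, changes), touching only the keys that differ."""
    changes = {k: v for k, v in desired.items() if current.get(k) != v}
    new_state = {**current, **changes}
    return new_state, changes


state = {"ntp": "enabled"}
desired = {"ntp": "enabled", "swap": "disabled"}

state, first_run = apply_state(state, desired)
state, second_run = apply_state(state, desired)

print("first run changed:", first_run)
print("second run changed:", second_run)
```

Because a repeat run is a no-op, the same configuration can be re-applied at any time, which is what makes changes safe to roll out on a running system.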

Reliability, Uptime, Availability

high availability: not a luxury

service consumers only interested in availability + reliability

Downtime

any time any service is not available to service consumer, whatever the cause may be

Downtime during operate phase

intrinsic — vs — extrinsic

downtime risks

Extrinsic downtime risk

dealt with via highly available solution architecture

What is OpenStack?

a series of intercommunicating services

HTTP, MQ, DB

The types of services

1. with data + configuration

2. configuration only

Safeguards: data

replication, replication, replication

2-node model or quorum/odd-node model
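
The difference between the two models can be illustrated with majority-quorum arithmetic. This sketch assumes simple majority voting and is not specific to any particular database or to Ceph; the function names are hypothetical.

```python
def quorum(nodes):
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can fail while a majority is still reachable."""
    return nodes - quorum(nodes)


for n in (2, 3, 4, 5):
    print(f"{n} nodes: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Under majority voting a 2-node cluster tolerates no failures and a 4-node cluster tolerates no more than a 3-node one, which is the usual argument for odd node counts; 2-node setups generally need a different mechanism, such as an active/standby arrangement.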

traditional high availability operations model

http services

http: it’s a web server! let’s treat it as such!

high availability for web servers
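
As an illustration of treating the API endpoints like any other web server, here is a minimal round-robin sketch that skips unhealthy backends. The backend names and the `health` table are hypothetical; a real deployment would typically delegate this to a load balancer with active health checks.

```python
from itertools import cycle


def make_balancer(backends, is_healthy):
    """Return a function that picks the next healthy backend, round-robin."""
    ring = cycle(backends)

    def next_backend():
        # Try each backend at most once per call before giving up.
        for _ in range(len(backends)):
            backend = next(ring)
            if is_healthy(backend):
                return backend
        raise RuntimeError("no healthy backend available")

    return next_backend


# Hypothetical health table: api-2 is currently down.
health = {"api-1": True, "api-2": False, "api-3": True}
pick = make_balancer(list(health), health.get)
picks = [pick() for _ in range(4)]
print(picks)
```

Requests keep flowing to the remaining backends while one is down, so a single failed API node is not visible to service consumers.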

What else? … mq

mq: put stuff in queue, take stuff from queue

replicate reader/writer

in short, for extrinsic downtime risks …

keep extra copy available at all times

Smarts …

it’s in the solution architecture

get it right, else it can come back to bite you!

Other requirements for production

usage accounting and billing

log + metrics management

monitoring

capacity planning

11. Problems encountered

Problems encountered

downtime due to running out of disk space, affecting ceilometer, mq

networking hardware problem (being investigated), resulting in packet loss

cinder problem encountered during Ocata upgrade
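
The disk-space incident is the kind of problem a basic monitoring check can flag early. A minimal sketch using Python's standard `shutil.disk_usage`; the 80% warning threshold and the checked path are assumptions, not values from the deployment.

```python
import shutil


def disk_usage_percent(path):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_disk(path, warn_at=80.0):
    """Return ("WARN" or "OK", percent used); warn_at is an assumed threshold."""
    pct = disk_usage_percent(path)
    return ("WARN" if pct >= warn_at else "OK", pct)


status, pct = check_disk("/")
print(f"/ is {pct:.1f}% full: {status}")
```

In practice such a check would run on the volumes used by the metrics and message-queue services and feed the monitoring system listed among the production requirements.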

12. Nearly 2 years later …

Worked well, growing

system expansions (storage and compute)

easily add compute nodes as necessary

growing by another ca. 2000 cores in coming weeks

system can be supported by just 1 person

13. Key lessons learnt?!

Key lessons learnt

1. didn’t spin own Linux distribution, so don’t spin own OpenStack either

2. not planning, planning, planning … but rather … experience + planning, experience + planning, and more experience + planning

3. make sure your vendor is able to support your operation when needed

S A R D I N A

Location: Ukraine

The road to OpenStack at (one of?) UK’s largest academic private cloud: University of Edinburgh

Jan Winter [email protected] +44 131 650 5009

Kenneth Tan [email protected] +44 798 941 7838

edinburgh.ac.uk | sardinasystems.com