Unveiling CERN Cloud Architecture - October, 2015

TRANSCRIPT

Page 1: Unveiling CERN Cloud Architecture - October, 2015
Page 2: Unveiling CERN Cloud Architecture - October, 2015

Unveiling CERN Cloud Architecture

OpenStack Design Summit – Tokyo, 2015

Belmiro Moreira [email protected] @belmiromoreira

Page 3: Unveiling CERN Cloud Architecture - October, 2015

What is CERN?

•  European Organization for Nuclear Research (Conseil Européen pour la Recherche Nucléaire)
•  Founded in 1954
•  21 member states; other countries contribute to experiments
•  Situated between Geneva and the Jura Mountains, straddling the Swiss-French border
•  CERN's mission is fundamental research

Page 4: Unveiling CERN Cloud Architecture - October, 2015

LHC - Large Hadron Collider

Page 5: Unveiling CERN Cloud Architecture - October, 2015

LHC and Experiments

CMS detector

https://www.google.com/maps/streetview/#cern

Page 6: Unveiling CERN Cloud Architecture - October, 2015

LHC and Experiments

Proton-lead collisions at ALICE detector

Page 7: Unveiling CERN Cloud Architecture - October, 2015

CERN Data Centres

Page 8: Unveiling CERN Cloud Architecture - October, 2015

OpenStack at CERN by numbers

•  ~5000 compute nodes (~130k cores): ~4800 KVM, ~200 Hyper-V
•  ~2400 images (~30 TB in use)
•  ~1800 volumes (~800 TB allocated)
•  ~2000 users
•  ~2300 projects
•  ~16000 VMs running

[Chart: number of VMs created (green) and VMs deleted (red) every 30 minutes]

Page 9: Unveiling CERN Cloud Architecture - October, 2015

OpenStack timeline at CERN

•  Upstream releases: Essex (5 Apr 2012), Folsom (27 Sep 2012), Grizzly (4 Apr 2013), Havana (17 Oct 2013), Icehouse (17 Apr 2014), Juno (16 Oct 2014), Kilo (30 Apr 2015), Liberty
•  CERN pre-production clouds: "Guppy" (Jun 2012), "Ibex" (Mar 2013), "Hamster" (Oct 2013)
•  CERN production infrastructure: Grizzly (Jul 2013), Havana (Feb 2014), Icehouse (Oct 2014), Juno (Apr 2015), Kilo (Oct 2015)

Page 10: Unveiling CERN Cloud Architecture - October, 2015

OpenStack timeline at CERN

•  Evolution of the number of VMs created since July 2013

[Chart: number of VMs running; number of VMs created (cumulative)]

Page 11: Unveiling CERN Cloud Architecture - October, 2015

Infrastructure Overview

•  One region, two data centres, 26 cells
•  HA architecture only on the Top Cell
•  Child cell control planes are usually VMs running in the shared infrastructure
•  Using nova-network with a custom CERN driver
•  2 hypervisor types (KVM, Hyper-V)
•  Scientific Linux CERN 6; CERN CentOS 7; Windows Server 2012 R2
•  2 Ceph instances
•  Keystone integrated with the CERN account/lifecycle system
•  Nova; Keystone; Glance; Cinder; Heat; Horizon; Ceilometer; Rally
•  Deployment using the OpenStack Puppet modules and RDO

Page 12: Unveiling CERN Cloud Architecture - October, 2015

Architecture Overview

[Diagram: a load balancer fronts the shared control plane (Keystone, Glance, Cinder, Heat, Ceilometer, Horizon and the Nova Top Cell), backed by the DB infrastructure; Nova compute cells and Ceph run in both the Geneva and Budapest data centres]

Page 13: Unveiling CERN Cloud Architecture - October, 2015

Why Cells?

•  Single endpoint to users
•  Scale transparently between data centres
•  Availability and resilience
•  Isolate different use cases

Page 14: Unveiling CERN Cloud Architecture - October, 2015

CellsV1 Limitations

•  Functionality limitations:
   •  Security groups
   •  Managing aggregates on the Top Cell
   •  Availability zone support
•  Limited cell scheduler functionality
•  Ceilometer integration

Page 15: Unveiling CERN Cloud Architecture - October, 2015

Nova Deployment at CERN

[Diagram: a load balancer fronts the nova-api nodes; the Top Cell controller runs nova-cells with clustered RabbitMQ and its own DB; each child cell controller runs nova-cells, nova-api, nova-scheduler, nova-conductor and nova-network with its own RabbitMQ and DB, and drives that cell's compute nodes running nova-compute]

Page 16: Unveiling CERN Cloud Architecture - October, 2015

Nova - Cells Control Plane

Top Cell controller:
•  Controller nodes run only on physical nodes
•  Clustered RabbitMQ with mirrored queues
•  "nova-api" nodes are VMs, deployed in the "common" (user-shared) infrastructure

Child cell controllers:
•  Only ONE controller node per cell: NO HA at the child cell level
•  Most are VMs running in other cells
•  If a child cell controller fails, it is replaced by another VM; user VMs are still available
•  ~200 compute nodes per cell

Page 17: Unveiling CERN Cloud Architecture - October, 2015

Nova - Cells Scheduling

•  Different cells have different use cases
   •  Hardware, location, network configuration, hypervisor type, ...
•  Cell capabilities: "datacentre", "hypervisor", "avzs"
   •  Example: capabilities=hypervisor=kvm,avzs=avz-a,datacentre=geneva
•  Scheduler filters use these capabilities (a sketch follows)
•  CERN cell filters available at:
   https://github.com/cernops/nova/tree/cern-2014.2.2-1/nova/cells/filters
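
As a rough illustration only (the BaseCellFilter/filter_all interface matches the Kilo-era nova.cells code, but the hint names and matching logic here are assumptions, not CERN's actual filters), a capability-matching cell filter could look like this:

    # Minimal sketch of a CellsV1 capability filter; the hint names and
    # the exact matching semantics are illustrative assumptions.
    from nova.cells import filters


    class CapabilityMatchFilter(filters.BaseCellFilter):
        """Pass only cells whose capabilities match the requested hints."""

        def filter_all(self, cells, filter_properties):
            hints = filter_properties.get('scheduler_hints') or {}
            wanted = {k: v for k, v in hints.items()
                      if k in ('datacentre', 'hypervisor', 'avzs')}
            if not wanted:
                return cells  # nothing requested, every cell passes
            return [cell for cell in cells
                    if all(str(cell.capabilities.get(k)) == str(v)
                           for k, v in wanted.items())]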

Page 18: Unveiling CERN Cloud Architecture - October, 2015

Nova - Cells Scheduling - Project Mapping

How do we map projects to cells? (see the parsing sketch after this list)
https://github.com/cernops/nova/blob/cern-2014.2.2-2/nova/cells/filters/target_cell_project.py

•  Default cells; dedicated cells
•  The target cell is selected considering the following "nova.conf" configuration:

   cells_default=cellA,cellB,cellC,cellD
   cells_projects=cellE:<project_uuid1>;<project_uuid2>,cellF:<project_uuid3>

•  "Disabling" a cell is removing it from the list...
   http://openstack-in-production.blogspot.fr/2015/10/scheduling-and-disabling-cells.html
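
To make the selection rule concrete, here is a minimal sketch of how such a configuration could be parsed and applied; the option format follows the slide, while the function names and parsing details are illustrative, not the actual target_cell_project.py logic:

    # Sketch: parse 'cellE:uuid1;uuid2,cellF:uuid3' into {uuid: cell}
    # and pick the target cells for a project. Illustrative only.
    def parse_cells_projects(raw):
        mapping = {}
        for entry in raw.split(','):
            cell, _, projects = entry.partition(':')
            for project_uuid in projects.split(';'):
                if project_uuid:
                    mapping[project_uuid.strip()] = cell.strip()
        return mapping


    def target_cells(project_id, cells_projects, cells_default):
        # Dedicated cell if the project has one, otherwise the defaults.
        cell = cells_projects.get(project_id)
        return [cell] if cell else list(cells_default)

With the configuration above, <project_uuid1> would always land in cellE, while any unlisted project is scheduled across cellA-cellD.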

Page 19: Unveiling CERN Cloud Architecture - October, 2015

Nova - Cells Scheduling - AVZs

•  The CellsV1 implementation is not aware of aggregates
•  How to have AVZs with cells?
   •  Create the aggregate/availability zone in the Top Cell
   •  Create "fake" nova-compute services to add nodes into the AVZ aggregates (sketched below)
   •  The cell scheduler uses "capabilities" to identify AVZs
   •  NO aggregates in the children cells
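
A heavily simplified sketch of the "fake service" trick, assuming a MySQL Top Cell DB and the Nova schema of that era; the exact columns and procedure are assumptions, not CERN's tooling, and schema details (created_at, deleted, ...) may differ:

    # Hypothetical sketch: make a child-cell host visible to the Top Cell
    # so it can be placed in an availability-zone aggregate.
    import MySQLdb  # assumes the Top Cell DB is MySQL

    conn = MySQLdb.connect(host='top-cell-db', db='nova',
                           user='nova', passwd='...')
    cur = conn.cursor()
    # "Fake" nova-compute service row for the child-cell host.
    cur.execute(
        "INSERT INTO services (host, `binary`, topic, report_count, disabled)"
        " VALUES (%s, 'nova-compute', 'compute', 0, 0)",
        ('p01234567890',))
    conn.commit()
    # The host can now be added to the AVZ aggregate, e.g.:
    #   nova aggregate-add-host avz-a p01234567890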

Page 20: Unveiling CERN Cloud Architecture - October, 2015

Nova - Legacy Child Cell configuration at CERN

•  Our first cell (2013)
•  A cell with >1000 compute nodes
   •  Any problem in the cell control plane had a huge impact
•  All availability zones behind this cell, using aggregates
   •  Aggregates dedicated to specific projects
•  Multiple hardware types
•  KVM and Hyper-V

Page 21: Unveiling CERN Cloud Architecture - October, 2015

Nova - Cell Division (from 1 to 9)

How to divide an existing cell?
•  Set up the new child cell controllers
•  Copy the existing DB to all new cells and delete all instance records that will not belong to the new cell
•  Move compute nodes to the new cells
•  Change the instances' "cells path" in the Top Cell DB (see the sketch below)
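
A minimal sketch of that last step, assuming CellsV1 conventions (the '!'-separated cell path and the instances.cell_name column) and a MySQL Top Cell DB; all values are illustrative:

    # Hypothetical sketch: repoint an instance from cell01 to cell05
    # in the Top Cell database after its compute node was moved.
    import MySQLdb

    conn = MySQLdb.connect(host='top-cell-db', db='nova',
                           user='nova', passwd='...')
    cur = conn.cursor()
    cur.execute(
        "UPDATE instances SET cell_name = %s WHERE uuid = %s",
        ('cern!cell05', 'INSTANCE_UUID'))
    conn.commit()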

Page 22: Unveiling CERN Cloud Architecture - October, 2015

Nova - Live Migration

•  Block live migration
   •  Compute nodes don't have shared storage
•  Not used for daily operations...
   •  Resource availability and network cluster constraints
   •  Only considered for pets
•  Planned for the SLC6 to CC7 migration
•  Planned for hardware end of life
•  How to orchestrate a large live-migration campaign? (a sketch follows)
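
One possible answer, sketched with the python-novaclient of that era (credentials and error handling omitted; an illustration, not CERN's tooling): drain a hypervisor by block-live-migrating its instances one at a time and letting the scheduler pick the destinations.

    # Sketch: drain one hypervisor via block live migration.
    from novaclient import client

    nova = client.Client('2', 'USER', 'PASSWORD', 'PROJECT',
                         'https://keystone.example.org:5000/v2.0')
    source = 'p05151234567890'  # hypothetical hypervisor name
    for server in nova.servers.list(search_opts={'host': source,
                                                 'all_tenants': 1}):
        # block_migration=True because compute nodes have no shared storage
        server.live_migrate(host=None, block_migration=True,
                            disk_over_commit=False)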

Page 23: Unveiling CERN Cloud Architecture - October, 2015

Nova - Live Migration

•  Block live migration with volumes attached is problematic...
   •  Attached Cinder volumes are block-migrated along with the instance
   •  They are copied, over the network, from themselves to themselves
   •  Can cause data corruption
•  https://bugs.launchpad.net/nova/+bug/1376615
•  https://bugzilla.redhat.com/show_bug.cgi?id=1203032
•  https://review.openstack.org/#/c/176768/

Page 24: Unveiling CERN Cloud Architecture - October, 2015

Nova - Kilo with SLC6

•  Kilo dropped support for Python 2.6
•  We still have ~800 compute nodes running on SLC6
•  We needed to build the Nova RPM for SLC6
•  Original recipe from GoDaddy!
   •  Create a venv using Python 2.7 from SCL
   •  Build the venv with Anvil
   •  Package the venv in an RPM

Page 25: Unveiling CERN Cloud Architecture - October, 2015

Nova - Network

CERN network configuration:
•  The network is divided into several "network clusters" (L3 networks), which have several "IP services" (L2 subnets)
•  Each compute node is associated with a "network cluster"
•  VMs running on a compute node can only have an IP from the "network cluster" associated with that compute node
•  https://etherpad.openstack.org/p/Network_Segmentation_Usecases

Page 26: Unveiling CERN Cloud Architecture - October, 2015

Nova - Network

•  Developed the CERN network driver (sketched below)
•  Creating a new VM:
   1.  Select the network cluster considering the compute node chosen to boot the instance
   2.  Select an address from that network cluster
   3.  Update the CERN network database
   4.  Wait for the central DNS refresh
•  The "fixed_ips" table contains IPv4, IPv6, MAC and network cluster
•  A new table maps "host" -> network cluster
•  Network constraints on some Nova operations
   •  Resize, live migration
•  https://github.com/cernops/nova/blob/cern-2014.2.2-2/nova/network/manager.py
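
A compact sketch of that allocation flow; every name here (get_network_cluster, pick_free_address, ...) is hypothetical shorthand for what the driver in manager.py does, not its actual API:

    # Hypothetical sketch of the CERN driver's IP allocation steps.
    def allocate_fixed_ip(host, cern_netdb, dns):
        cluster = cern_netdb.get_network_cluster(host)   # 1. host -> network cluster
        address = cern_netdb.pick_free_address(cluster)  # 2. free IP in that cluster
        cern_netdb.register(address, host=host)          # 3. update CERN network DB
        dns.wait_for_refresh(address)                    # 4. central DNS refresh
        return address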

Page 27: Unveiling CERN Cloud Architecture - October, 2015

Neutron is coming...

•  NOT in production; a testing/development instance
•  What we use/don't use from Neutron:
   •  No SDN or tunneling
   •  Only provider networks, no private/tenant networks
   •  Flat networking: VMs bridged directly to the real network
   •  No DHCP or DNS from Neutron; we already have our own infrastructure
   •  We don't use floating IPs
   •  Neutron API not exposed to users
•  Implemented API extensions and a Mechanism Driver for our use case (a skeleton follows)
•  https://github.com/cernops/neutron/commit/63f4e19c7423dcdc2b5a7573d0898ec9e799663b
•  How to migrate from nova-network to Neutron?
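
For reference, the Kilo-era ML2 mechanism driver interface that such a driver plugs into looks roughly like this skeleton; the class body and comments are illustrative, not CERN's driver:

    # Skeleton of an ML2 mechanism driver (Kilo-era interface); illustrative only.
    from neutron.plugins.ml2 import driver_api as api


    class ExampleCernMechanismDriver(api.MechanismDriver):

        def initialize(self):
            # Load network-cluster mappings, connect to the CERN network
            # database (hypothetical responsibilities).
            pass

        def create_port_precommit(self, context):
            # Called inside the DB transaction when a port is created;
            # e.g. register the port's IP/MAC in the CERN network database.
            pass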

Page 28: Unveiling CERN Cloud Architecture - October, 2015

Keystone Deployment at CERN

[Diagram: a load balancer fronts two sets of Keystone nodes, one exposed to users and one dedicated to Ceilometer; both use the service catalogue DB and are integrated with Active Directory]

Page 29: Unveiling CERN Cloud Architecture - October, 2015

Keystone

•  Keystone nodes are VMs
•  Integrated with CERN's Active Directory infrastructure
•  Project life cycle (a sketch of the expiry policy follows this list)
   •  ~200 arrivals/departures per month
   •  A CERN user subscribes to the "cloud service"
   •  A "personal project" is created with a limited quota
   •  "Shared projects" are created by request
   •  The "personal project" is disabled when the user leaves the Organization
   •  After 3 months resources are stopped; after 6 months resources are deleted (VMs, volumes, images, ...)
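
A tiny sketch of that expiry policy; the 3- and 6-month thresholds come from the slide, while the names and exact day counts are illustrative:

    # Sketch: decide what to do with a departed user's resources.
    from datetime import datetime, timedelta

    def lifecycle_action(departure_date, now=None):
        elapsed = (now or datetime.utcnow()) - departure_date
        if elapsed >= timedelta(days=180):   # ~6 months
            return 'delete'                  # delete VMs, volumes, images, ...
        if elapsed >= timedelta(days=90):    # ~3 months
            return 'stop'                    # stop resources, keep data
        return 'none'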

Page 30: Unveiling CERN Cloud Architecture - October, 2015

Glance Deployment at CERN

[Diagram: a load balancer fronts two sets of Glance nodes, each running glance-api and glance-registry against a shared DB; one set is exposed to users, the other only serves Ceilometer calls; the image store is Ceph Geneva]

Page 31: Unveiling CERN Cloud Architecture - October, 2015

Glance

•  Uses the Ceph backend in Geneva
•  Glance nodes are VMs
•  NO Glance image cache
•  Glance API and Glance Registry run on the same node
   •  Glance API only talks to the local Glance Registry
•  Two sets of nodes (API exposed to users; API for Ceilometer)
•  When will there be Glance quotas per project?
   •  Problematic in private clouds where users are not "charged" for storage

Page 32: Unveiling CERN Cloud Architecture - October, 2015

Cinder Deployment at CERN

[Diagram: a load balancer fronts the Cinder nodes running cinder-api, cinder-scheduler and cinder-volume, with RabbitMQ and a DB; the backends are Ceph Geneva, Ceph Budapest and NetApp]

Page 33: Unveiling CERN Cloud Architecture - October, 2015

Cinder

•  Ceph and NetApp backends
•  Extended list of available volume types (QoS, backend, location)
•  Cinder nodes are VMs
•  Active/Active?
   •  When a volume is created, a "cinder-volume" node is associated with it
   •  That node is responsible for the volume's operations
   •  Not easy to replace Cinder controller nodes: DB entries need to be changed manually
•  More about the CERN storage infrastructure for OpenStack:
   https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/ceph-at-cern-a-year-in-the-life-of-a-petabyte-scale-block-storage-service

Page 34: Unveiling CERN Cloud Architecture - October, 2015

Ceilometer Deployment at CERN

[Diagram: ceilometer-compute agents on the compute nodes (next to nova-compute) and a ceilometer-central-agent publish samples (RPC and UDP) through the cell RabbitMQ and a dedicated Ceilometer RabbitMQ; notification, polling and UDP collectors write to HBase, MySQL and MongoDB; the Ceilometer API serves Heat and the Ceilometer evaluator & notifier]

Page 35: Unveiling CERN Cloud Architecture - October, 2015

Ceilometer

•  The "ceilometer-compute-agent" queries "nova-api" for the instances hosted on its compute node
   •  This can be very demanding for "nova-api"
•  When using the default "instance_name_template", the "instance_name" in the Top Cell is different from the one in the child cell
   •  We need a "nova-api" per cell

[Chart: number of Nova API calls made by ceilometer-compute-agent per hour]

Page 36: Unveiling CERN Cloud Architecture - October, 2015

Ceilometer

•  Using a dedicated RabbitMQ cluster for Ceilometer
•  Initially we used the children cells' RabbitMQ. Not a good idea!
   •  Any failure/slowdown in the backend storage system can create a big queue...

[Chart: size of the "metering.sample" queue]

Page 37: Unveiling CERN Cloud Architecture - October, 2015

Rally

•  Probing/benchmarking the infrastructure every hour

Page 38: Unveiling CERN Cloud Architecture - October, 2015

Challenges

•  Capacity increase to 200k cores by Summer 2016
•  Live-migrate thousands of VMs
   •  Upgrade ~800 compute nodes from SLC6 to CC7
   •  Retire old servers
•  Move to Neutron
•  Identity federation with different scientific sites
•  Magnum and container possibilities

Page 39: Unveiling CERN Cloud Architecture - October, 2015

[email protected] @belmiromoreira

http://openstack-in-production.blogspot.com