20170926 cern cloud v4


Page 1: 20170926 cern cloud v4

Understanding the Universe through Clouds

Tim Bell, CERN
@noggin143
OpenStack UK Days, 26th September 2017

Page 2: 20170926 cern cloud v4


CERN: founded in 1954 by 12 European States

“Science for Peace”

Today: 22 Member States

Member States: Austria, Belgium, Bulgaria, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Israel, Italy, Netherlands, Norway, Poland, Portugal, Romania, Slovak Republic, Spain, Sweden, Switzerland and United Kingdom

Associate Member States: Pakistan, India, Ukraine, Turkey

States in accession to Membership: Cyprus, Serbia

Applications for Membership or Associate Membership: Brazil, Croatia, Lithuania, Russia, Slovenia

Observers to Council: India, Japan, Russia, United States of America; European Union, JINR and UNESCO

~ 2300 staff

~ 1400 other paid personnel

~ 12500 scientific users

Budget (2017) ~1000 MCHF


Page 3: 20170926 cern cloud v4

The Large Hadron Collider (LHC)

(Figure: experiment data rates into the data centre of ~700 MB/s, ~10 GB/s, >1 GB/s and >1 GB/s.)

Page 4: 20170926 cern cloud v4


Page 5: 20170926 cern cloud v4


Page 6: 20170926 cern cloud v4

The Worldwide LHC Computing Grid (WLCG)

An international collaboration to distribute and analyse LHC data. It integrates computer centres worldwide that provide computing and storage resources into a single infrastructure accessible by all LHC physicists.

Tier-0 (CERN and Hungary): data recording, reconstruction and distribution
Tier-1: permanent storage, re-processing, analysis
Tier-2: simulation, end-user analysis

>2 million jobs/day, ~750k CPU cores, 600 PB of storage, ~170 sites in 42 countries, 10-100 Gb links

Page 7: 20170926 cern cloud v4

LHCOne: an overlay network that allows national network providers to manage HEP traffic on their general purpose networks.

(Figure: LHCOne traffic across Asia, North America, South America and Europe, plotted by month.)

Page 8: 20170926 cern cloud v4

A big data problem

2016: 49.4 PB LHC data / 58 PB all experiments / 73 PB total

ALICE: 7.6 PB, ATLAS: 17.4 PB, CMS: 16.0 PB, LHCb: 8.5 PB

11 PB recorded in July alone; 180 PB on tape; 800 M files

Page 9: 20170926 cern cloud v4

Public Procurement Cycle

Step                                 Time (days)   Elapsed (days)
User expresses requirement                              0
Market survey prepared                    15           15
Market survey for possible vendors        30           45
Specifications prepared                   15           60
Vendor responses                          30           90
Test systems evaluated                    30          120
Offers adjudicated                        10          130
Finance committee                         30          160
Hardware delivered                        90          250
Burn-in and acceptance                    30          280
  (30 days typical, 380 worst case)
Total                                                 280+ days

Page 10: 20170926 cern cloud v4

OpenStack London July 2011 Vinopolis


Page 11: 20170926 cern cloud v4

CERN Tool Chain


Page 12: 20170926 cern cloud v4

CERN OpenStack Service Timeline

Legend: (*) Pilot, (?) Trial, Retired

ESSEX (5 April 2012): Nova (*), Glance (*), Horizon (*), Keystone (*)
FOLSOM (27 September 2012): Nova (*), Glance (*), Horizon (*), Keystone (*), Quantum, Cinder
GRIZZLY (4 April 2013): Nova, Glance, Horizon, Keystone, Quantum, Cinder, Ceilometer (*)
HAVANA (17 October 2013): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer (*), Heat
ICEHOUSE (17 April 2014): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat, Ironic, Trove
JUNO (16 October 2014): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat (*), Rally (*)
KILO (30 April 2015): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat, Rally, Manila
LIBERTY (15 October 2015): Nova, Glance, Horizon, Keystone, Neutron (*), Cinder, Ceilometer, Heat, Rally, EC2API, Magnum (*), Barbican (*)
MITAKA (7 April 2016): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat, Rally, EC2API, Magnum, Barbican, Ironic (?), Mistral (?), Manila (?)
NEWTON (6 October 2016): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat, Rally, EC2API, Magnum, Barbican, Ironic (?), Mistral (?), Manila (?)
OCATA (22 February 2017): Nova, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat, Rally, EC2API, Magnum, Barbican, Ironic (?), Mistral (?), Manila (*)
PIKE (28 August 2017)

CERN OpenStack milestones: Production in July 2013; Havana in February 2014; Icehouse in October 2014; Juno in March 2015; Kilo in September 2015; Liberty in September 2016; Mitaka in March 2017; Newton in June 2017.

Page 13: 20170926 cern cloud v4


Currently >8,000 hypervisors with 281K cores, running 33,000 VMs.
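As a rough illustration only (not from the slides), aggregate numbers of this kind can be pulled from the standard OpenStack CLI; these are generic commands rather than CERN-specific tooling:

$ openstack hypervisor stats show                 # aggregate vCPUs, RAM and running VMs across hypervisors
$ openstack hypervisor list | wc -l               # approximate hypervisor count
$ openstack server list --all-projects            # VM inventory (admin view, paginated)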

Page 14: 20170926 cern cloud v4

Ceph has grown from ~200 TB total to ~450 TB of RBD plus 50 TB of RGW, backing OpenStack Glance and Cinder.

Example: ~25 Puppet masters reading node configurations at up to 40 kHz iops.

Scale tests with Ceph Luminous reached 65 PB in a single block storage pool:
http://ceph.com/community/new-luminous-scalability/
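A minimal sketch, assuming a hypothetical volume name and size, of how this Ceph-backed block storage is consumed through Cinder with the standard CLI:

$ openstack volume create --size 100 data01       # create a 100 GB RBD-backed volume
$ openstack server add volume myvm data01         # attach it to an instance
$ openstack volume list --long                    # backend/type and attachment visible here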

Page 15: 20170926 cern cloud v4

Software Deployment

Deployment is based on CentOS and RDO:
- Upstream, only patched where necessary (e.g. nova/neutron for CERN networks)
- A few customisations
- Works well for us

Puppet for configuration management, introduced with the adoption of the Agile Infrastructure paradigm.

We submit upstream whenever possible: OpenStack, the OpenStack Puppet modules, RDO, ...

Updates are done service-by-service over several months:
- Running services on dedicated (virtual) servers helps (exception: ceilometer and nova on compute nodes)
- Aim to be around 6-9 months behind trunk

Upgrade testing is done with packstack and devstack; depending on the service, this ranges from simple DB upgrades to full shadow installations (see the sketch below).
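A minimal sketch of what one step of such a service-by-service RDO update could look like on a Nova controller; the package and unit names are standard RDO ones, but the procedure shown is illustrative, not CERN's actual runbook:

$ yum update openstack-nova\* python-nova         # pull the new RDO packages for one service
$ nova-manage db sync                             # apply any pending database migrations
$ systemctl restart openstack-nova-api openstack-nova-scheduler openstack-nova-conductor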

Page 16: 20170926 cern cloud v4

Community Experience

Open source collaboration sets the model for in-house teams.

External recognition by the community is highly rewarding for contributors.

Reviewing and being reviewed is a constant learning experience.

Good preparation for the job market for staff.

Working groups, such as the Scientific and Large Deployment teams, discuss a wide range of topics.

Effective knowledge transfer mechanisms, consistent with the CERN mission.

Dojos at CERN draw good attendance: Ceph, CentOS, Elastic, OpenStack CH, ...

Page 17: 20170926 cern cloud v4

Scaling Nova

- Top level cell runs the API service and the top cell scheduler
- ~50 child cells run the compute nodes, scheduler and conductor
- Decided not to use HA
- Cells v2 coming: default for all deployments (a command sketch follows below)
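For context, a hedged sketch of the standard nova-manage sub-commands involved once cells v2 becomes the default; shown generically, not as CERN's procedure:

$ nova-manage cell_v2 map_cell0                   # create the special cell0 for failed instances
$ nova-manage cell_v2 list_cells                  # show the configured cells
$ nova-manage cell_v2 discover_hosts --verbose    # map new compute hosts into their cell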

Page 18: 20170926 cern cloud v4

Rally


Page 19: 20170926 cern cloud v4

What's new? Magnum

Container Engine as a Service: Kubernetes, Docker Swarm, Mesos, ...

$ magnum cluster-create --name myswarmcluster --cluster-template swarm --node-count 100

$ magnum cluster-list

+------+----------------+------------+--------------+-----------------+

| uuid | name | node_count | master_count | status |

+------+----------------+------------+--------------+-----------------+

| .... | myswarmcluster | 100 | 1 | CREATE_COMPLETE |

+------+----------------+------------+--------------+-----------------+

$ $(magnum cluster-config myswarmcluster --dir magnum/myswarmcluster)

$ docker info / ps / ...

$ docker run --volume-driver cvmfs -v atlas.cern.ch:/cvmfs/atlas -it centos /bin/bash

[root@32f4cf39128d /]#


Page 20: 20170926 cern cloud v4

Scaling Magnum to 7M req/s

Rally drove the tests; 1000-node clusters (4000 cores).

Cluster size (nodes)   Concurrency   Deployment time (min)
2                      50            2.5
16                     10            4
32                     10            4
128                    5             5.5
512                    1             14
1000                   1             23
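A hedged illustration of how such a sweep might be driven with Rally; the task file name and its arguments are hypothetical, only the rally CLI itself is standard:

$ rally task start magnum-cluster-create.yaml --task-args '{"node_count": 128, "concurrency": 5}'
$ rally task report --out magnum-scale-report.html     # HTML report of durations and failures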

Page 21: 20170926 cern cloud v4

What's new? Mistral

Workflow-as-a-Service, used for multi-step actions triggered by users or events, with a Horizon dashboard for visualising results.

Examples:
- Multi-step project creation
- Scheduled snapshots of VMs
- Expiry of personal resources after 6 months

Code at https://gitlab.cern.ch/cloud-infrastructure/mistral-workflows

Some more complex cases are in the pipeline (a generic CLI sketch follows below).
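As a rough sketch only: running one of these workflows by hand would look something like the following; the workflow name and input are hypothetical, the openstack workflow commands come with python-mistralclient:

$ openstack workflow list                                 # workflows published to Mistral
$ openstack workflow execution create expire_personal_resources '{"max_age_days": 180}'
$ openstack workflow execution list                       # check execution status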

Page 22: 20170926 cern cloud v4


Page 23: 20170926 cern cloud v4

Automate provisioning

Automate routine procedures:
- Common place for workflows
- Clean web interface
- Scheduled jobs, cron-style
- Traceability and auditing
- Fine-grained access control
- ...

Procedures for:
- OpenStack project creation
- OpenStack quota changes
- Notifications of VM owners
- Usage and health reports
- ...

Example workflow for a compute node intervention:
- Disable compute node: disable the nova service, switch alarms off, update the Service-Now ticket
- Notifications: send e-mail to the VM owners
- Other tasks: post a new message to the broker, add a remote AT job, save the intervention details, send a calendar invitation

Page 24: 20170926 cern cloud v4

Manila: Overview

- File share project in OpenStack: provisioning of shared file systems to VMs ('Cinder for file shares')
- APIs for tenants to request shares, fulfilled by backend drivers and accessed from instances
- Support for a variety of NAS protocols: NFS, CIFS, MapR-FS, GlusterFS, CephFS, ...
- Supports the notion of share types, mapping features to backends

Flow: 1. the user requests a share from Manila; 2. Manila creates the share on a backend; 3. Manila provides the handle; 4. user instances access the share. A generic CLI sketch follows below.
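A minimal, hedged example of that request/access flow with the standard manila client; the share name, size and client subnet are made up:

$ manila create NFS 100 --name projshare          # 1./2. request a 100 GB NFS share
$ manila access-allow projshare ip 10.0.0.0/24    # allow client access from a subnet
$ manila show projshare                           # 3. the export location (handle) appears here
# 4. on the instance: mount the export path reported above
$ mount -t nfs <export_path> /mnt/projshare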

Page 25: 20170926 cern cloud v4


LHC Incident in April 2016


Page 26: 20170926 cern cloud v4

Manila testing: #fouinehammer

(Figure: scaling test with 1 to 500 nodes running 1 to 10k m-api pods, in front of the database, RabbitMQ, m-sched and the m-share driver.)
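If the API pods are run under Kubernetes, a sweep of this kind could be stepped with something like the following; the deployment name m-api is taken from the figure, and the label selector is an assumption, not documented by the slides:

$ kubectl scale deployment m-api --replicas=1000    # step the number of Manila API pods
$ kubectl get pods -l app=m-api | grep -c Running   # assumed label; count the pods that came up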

Page 27: 20170926 cern cloud v4

Commercial Clouds


Page 28: 20170926 cern cloud v4

Development areas going forward

- Spot market
- Cells v2
- Neutron scaling (no cells equivalent yet)
- Magnum rolling upgrades
- Collaborations with industry

Page 29: 20170926 cern cloud v4

Operations areas going forward

Further automate migrations:
- Around 5,000 VMs / year
- The first campaign in 2016 needed some additional scripting, such as pausing very active VMs
- Newton live migration covers most use cases (see the sketch below)

Software Defined Networking:
- Nova-network to Neutron migration to be completed
- In addition to the flat network in use currently
- Introduce higher-level functions such as LBaaS
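For illustration only, the basic live-migration primitive that such campaigns script around; the server and host names are placeholders:

$ nova live-migration vm-0042 target-hypervisor.cern.ch    # move a running VM off a host being drained
$ openstack server show vm-0042 -c OS-EXT-SRV-ATTR:host    # confirm where it landed (admin view)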

Page 30: 20170926 cern cloud v4

Future Challenges

(Figures: "Data estimates for 1st year of HL-LHC (PB)", raw and derived, and "CPU needs for 1st year of HL-LHC (kHS06)", per experiment for ALICE, ATLAS, CMS and LHCb; timeline from 2010 to ~2030 covering the first run, LS1, the second run, LS2, the third run, LS3 and HL-LHC, with FCC beyond.)

CPU: x60 from 2016
Data: raw 2016: 50 PB, 2027: 600 PB; derived (1 copy) 2016: 80 PB, 2027: 900 PB

The raw data volume from the LHC increases exponentially, and with it the processing and analysis load. Technology improving at ~20%/year will bring a factor of 6-10 over 10-11 years (compounding, 1.2^10 ≈ 6). Estimates of the resource needs at HL-LHC are a factor of 10 above what is realistic to expect from technology at reasonably constant cost.

Page 31: 20170926 cern cloud v4

Summary

- OpenStack has provided a strong base for scaling resources over the past 4 years without a significant increase in CERN staff
- Additional functionality on top of pure Infrastructure-as-a-Service is now coming to production
- Community and industry collaboration has been productive and inspirational for the CERN team
- Some big computing challenges lie ahead...

Page 32: 20170926 cern cloud v4

Further Information

Technical details on the CERN cloud: http://openstack-in-production.blogspot.fr
Custom CERN code: https://github.com/cernops
Scientific Working Group: https://wiki.openstack.org/wiki/Scientific_working_group
Helix Nebula: http://www.helix-nebula.eu/
http://cern.ch/IT

© CERN, CC-BY-SA 4.0

Page 33: 20170926 cern cloud v4

Backup

Page 34: 20170926 cern cloud v4


WLCG MoU signatures, 2017:
- 63 MoUs
- 167 sites; 42 countries

Page 35: 20170926 cern cloud v4

Partners

Contributors

Associates

Research


Page 36: 20170926 cern cloud v4


Page 37: 20170926 cern cloud v4

How do we monitor?

(Figure: data sources, including the data centres and WLCG, feed a transport layer based on Kafka, which feeds processing, storage/search and data access.)

Page 38: 20170926 cern cloud v4

Tuning

Many hypervisors are configured for compute optimisation:
- CPU passthrough, so the VM sees a CPU identical to the host
- Extended Page Tables, so memory page mapping is done in hardware
- Core pinning, so the scheduler keeps vCPUs on the underlying physical cores
- Huge pages, to improve memory page cache utilisation
- Flavors set to be NUMA aware

This gives improvements of up to 20% in performance. The impact is that these VMs cannot be live migrated, so service machines are not configured this way. A sketch of the corresponding flavor properties follows below.
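A minimal sketch, assuming a hypothetical flavor name, of how these optimisations are typically expressed as Nova flavor extra specs:

$ openstack flavor set m2.xlarge.numa \
    --property hw:cpu_policy=dedicated \
    --property hw:mem_page_size=large \
    --property hw:numa_nodes=2

Instances of such a flavor are pinned and NUMA-aligned, which is what makes them unsuitable for live migration as noted above.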

Page 39: 20170926 cern cloud v4

Provisioning services

Moving towards an elastic hybrid IaaS model:
- In-house resources at full occupation
- Elastic use of commercial and public clouds
- Assume "spot-market" style pricing

(Figure: OpenStack resource provisioning across more than one physical data centre delivers VMs, containers, and bare metal and HPC (LSF); consumers include HTCondor batch, public cloud, volunteer computing, IT and experiment services, end users and CI/CD, accessed via APIs, CLIs, GUIs and experiment pilot factories.)

Page 40: 20170926 cern cloud v4

Simulating Elasticity

Deliveries are around 1-2 times per year. Resources are either for batch compute (immediately needed, compute optimised) or for services (needed as projects request quota, supporting live migration with a generic CPU definition).

Elasticity is simulated by:
- Creating opportunistic batch projects running on resources that will be needed for services in the future
- Draining opportunistic batch as needed

The end result is high utilisation of 'spare' resources and a simulation of an elastic cloud. A generic CLI sketch follows below.
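As an illustration only (the project name and quota values are invented), an opportunistic batch project of this kind could be carved out and later drained with standard commands:

$ openstack project create --description "Opportunistic batch on spare capacity" batch-spare
$ openstack quota set --instances 2000 --cores 16000 --ram 64000000 batch-spare
# draining: shrink the quota and remove opportunistic VMs as the capacity is reclaimed for services
$ openstack quota set --cores 0 batch-spare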

Page 41: 20170926 cern cloud v4

Pick the interesting events

- 40 million events per second: fast, simple information; hardware trigger decides in a few microseconds
- 100 thousand per second: fast algorithms in a local computer farm; software trigger in <1 second
- A few hundred per second: recorded for study

(Figure: muon tracks and energy deposits in the detector.)