Tim Bell
CERN
@noggin143
OpenStack UK Days
26th September 2017
Understanding the Universe through Clouds
CERN: founded in 1954: 12 European States
“Science for Peace”
Today: 22 Member States
Member States: Austria, Belgium, Bulgaria, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Israel, Italy, Netherlands, Norway, Poland, Portugal, Romania, Slovak Republic, Spain, Sweden, Switzerland and United Kingdom
Associate Member States: Pakistan, India, Ukraine, Turkey
States in accession to Membership: Cyprus, Serbia
Applications for Membership or Associate Membership: Brazil, Croatia, Lithuania, Russia, Slovenia
Observers to Council: India, Japan, Russia, United States of America; European Union, JINR and UNESCO
~ 2300 staff
~ 1400 other paid personnel
~ 12500 scientific users
Budget (2017) ~1000 MCHF
The Large Hadron Collider (LHC)
[Diagram: detector data rates into the data centre - ~700 MB/s, ~10 GB/s, >1 GB/s, >1 GB/s]
The Worldwide LHC Computing Grid (WLCG)
An international collaboration to distribute and analyse LHC data
Integrates computer centres worldwide that provide computing and storage resources into a single infrastructure accessible by all LHC physicists
Tier-0 (CERN and Hungary): data recording, reconstruction and distribution
Tier-1: permanent storage, re-processing, analysis
Tier-2: simulation, end-user analysis
> 2 million jobs/day
~750k CPU cores
600 PB of storage
~170 sites, 42 countries
10-100 Gb links
LHCOne: overlay network
Allows national network providers to manage HEP traffic on the general purpose network
[Map: connectivity across Asia, North America, South America and Europe]
[Chart: monthly data volumes, January through December plus January to May of the following year, scale 0-70]
A big data problem
2016: 49.4 PB LHC data / 58 PB all experiments / 73 PB total
ALICE: 7.6 PB, ATLAS: 17.4 PB, CMS: 16.0 PB, LHCb: 8.5 PB
11 PB in July
180 PB on tape, 800 M files
Public Procurement Cycle

Step                                 Time (Days)                     Elapsed (Days)
User expresses requirement           -                               0
Market Survey prepared               15                              15
Market Survey for possible vendors   30                              45
Specifications prepared              15                              60
Vendor responses                     30                              90
Test systems evaluated               30                              120
Offers adjudicated                   10                              130
Finance committee                    30                              160
Hardware delivered                   90                              250
Burn in and acceptance               30 typical (380 worst case)     280
Total                                                                280+ days
OpenStack London July 2011 Vinopolis
CERN Tool Chain
CERN OpenStack Service Timeline
Legend: (*) Pilot, (?) Trial, Retired
ESSEX
Nova (*)
Glance (*)
Horizon (*)
Keystone (*)
FOLSOM
Nova (*)
Glance (*)
Horizon (*)
Keystone (*)
Quantum
Cinder
GRIZZLY
Nova
Glance
Horizon
Keystone
Quantum
Cinder
Ceilometer (*)
HAVANA
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer (*)
Heat
ICEHOUSE
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer
Heat
Ironic
Trove
JUNO
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer
Heat (*)
Rally (*)
Upstream release dates: Essex 5 April 2012, Folsom 27 September 2012, Grizzly 4 April 2013, Havana 17 October 2013, Icehouse 17 April 2014, Juno 16 October 2014
July 2013
CERN OpenStack
Production
February 2014
CERN OpenStack
Havana
October 2014
CERN OpenStack
Icehouse
March 2015
CERN OpenStack
Juno
LIBERTY
Nova
Glance
Horizon
Keystone
Neutron (*)
Cinder
Ceilometer
Heat
Rally
EC2API
Magnum (*)
Barbican (*)
September 2015
CERN OpenStack
Kilo
KILO
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer
Heat
Rally
Manila
September 2016
CERN OpenStack
Liberty
MITAKA
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer
Heat
Rally
EC2API
Magnum
Barbican
Ironic (?)
Mistral (?)
Manila (?)
March 2017
CERN OpenStack
Mitaka
NEWTON
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer
Heat
Rally
EC2API
Magnum
Barbican
Ironic (?)
Mistral (?)
Manila (?)
Upstream release dates: Kilo 30 April 2015, Liberty 15 October 2015, Mitaka 7 April 2016
OCATA
Nova
Glance
Horizon
Keystone
Neutron
Cinder
Ceilometer
Heat
Rally
EC2API
Magnum
Barbican
Ironic (?)
Mistral (?)
Manila (*)
22 Feb 2017
June 2017
CERN OpenStack
Newton
6 Oct 2016
PIKE
28 Aug 2017
Currently >8000 hypervisors, 281K cores running 33,000 VMs
From ~200 TB total to ~450 TB of RBD + 50 TB RGW
OpenStack Glance + Cinder
Example: ~25 puppet masters reading node configurations at up to 40 kHz
[Chart: IOPS]
• Scale tests with Ceph Luminous up to 65PB in a block storage pool
http://ceph.com/community/new-luminous-scalability/
Software Deployment
Deployment based on CentOS and RDO
- Upstream, only patched where necessary (e.g. nova/neutron for CERN networks)
- Some few customizations
- Works well for us
Puppet for config management
- Introduced with the adoption of the AI paradigm
We submit upstream whenever possible
- openstack, openstack-puppet, RDO, …
Updates done service-by-service over several months
- Running services on dedicated (virtual) servers helps (exception: ceilometer and nova on compute nodes)
- Aim to be around 6-9 months behind trunk
Upgrade testing done with packstack and devstack (see the sketch below)
- Depends on service: from simple DB upgrades to full shadow installations
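As a rough illustration of the upgrade-testing step above, a disposable test cloud can be stood up with packstack before replaying a service upgrade against it. This is only a hedged sketch; the answer-file name is a placeholder and real tests would mirror the production layout.
$ packstack --gen-answer-file=upgrade-test.txt   # generate a default answer file (hypothetical name)
$ packstack --answer-file=upgrade-test.txt       # install a throwaway all-in-one test cloud
$ nova-manage db sync                            # after installing target-release packages, validate the DB upgrade path (Nova example)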
Community Experience
Open source collaboration sets the model for in-house teams
External recognition by the community is highly rewarding for contributors
Reviewing and being reviewed is a constant learning experience
Productive for the job market for staff
Working groups, like the Scientific and Large Deployment teams, discuss a wide range of topics
Effective knowledge transfer mechanisms consistent with the CERN mission
Dojos at CERN bring good attendance: Ceph, CentOS, Elastic, OpenStack CH, …
Scaling Nova
Top level cell
- Runs the API service
- Top cell scheduler
~50 child cells run
- Compute nodes
- Scheduler
- Conductor
Decided not to use HA
Version 2 coming
- Default for all
Rally
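Rally is used to probe and benchmark the cloud (it also drove the Magnum scaling tests shown later). A hedged sketch of a simple boot-and-delete task follows; this is not CERN's actual task definition, and the flavor and image names are illustrative:
$ cat > boot-and-delete.json <<'EOF'
{
  "NovaServers.boot_and_delete_server": [
    {
      "args": {"flavor": {"name": "m1.small"}, "image": {"name": "CC7-base"}},
      "runner": {"type": "constant", "times": 100, "concurrency": 10}
    }
  ]
}
EOF
$ rally task start boot-and-delete.json
$ rally task report --out rally-report.html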
What’s new? Magnum
Container Engine as a Service
Kubernetes, Docker, Mesos…
$ magnum cluster-create --name myswarmcluster --cluster-template swarm --node-count 100
$ magnum cluster-list
+------+----------------+------------+--------------+-----------------+
| uuid | name | node_count | master_count | status |
+------+----------------+------------+--------------+-----------------+
| .... | myswarmcluster | 100 | 1 | CREATE_COMPLETE |
+------+----------------+------------+--------------+-----------------+
$ $(magnum cluster-config myswarmcluster --dir magnum/myswarmcluster)
$ docker info / ps / ...
$ docker run --volume-driver cvmfs -v atlas.cern.ch:/cvmfs/atlas -it centos /bin/bash
[root@32f4cf39128d /]#
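The Swarm example assumes a 'swarm' cluster template is already defined. For comparison, a Kubernetes template could be created roughly as follows; this is a hedged sketch, the image, keypair, network and flavor names are hypothetical, and exact flag names vary between magnum client versions:
$ magnum cluster-template-create --name kubernetes --coe kubernetes \
      --image fedora-atomic-latest --keypair mykey --external-network ext-net \
      --flavor m1.small --master-flavor m1.small --network-driver flannel
$ magnum cluster-create --name myk8scluster --cluster-template kubernetes --node-count 10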
Scaling Magnum to 7M req/s
Rally drove the tests
1000 node clusters (4000 cores)

Cluster Size (Nodes)   Concurrency   Deployment Time (min)
2                      50            2.5
16                     10            4
32                     10            4
128                    5             5.5
512                    1             14
1000                   1             23
What’s new? Mistral
Workflow-as-a-Service used for multi-step actions, triggered by users or events
Horizon dashboard for visualising results
Examples:
- Multi-step project creation
- Scheduled snapshot of VMs
- Expire personal resources after 6 months
Code at https://gitlab.cern.ch/cloud-infrastructure/mistral-workflows
Some more complex cases coming in the pipeline
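As a hedged illustration of how such workflows are triggered (the workflow and input names below are hypothetical, not the definitions in the mistral-workflows repository), an execution can be started on demand or attached to a cron trigger for scheduled runs such as nightly VM snapshots:
$ openstack workflow execution create instance_snapshot '{"instance": "my-vm"}'
$ openstack cron trigger create nightly_snapshot instance_snapshot '{"instance": "my-vm"}' --pattern "0 3 * * *"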
Automate provisioning
Automate routine procedures
- Common place for workflows
- Clean web interface
- Scheduled jobs, cron-style
- Traceability and auditing
- Fine-grained access control
- …
Procedures for
- OpenStack project creation
- OpenStack quota changes
- Notifications of VM owners
- Usage and health reports
- …
Example workflow: disable compute node (a hedged CLI sketch follows below)
- Disable nova-service
- Switch alarms OFF
- Update Service-Now ticket
Notifications
- Send e-mail to VM owners
Other tasks
- Post new message broker
- Add remote AT job
- Save intervention details
- Send calendar invitation
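The individual steps of the disable-compute-node workflow map onto ordinary CLI or API calls. A hedged sketch of the first step only (the host name and reason are placeholders; the real procedure runs through the automation tooling above rather than ad-hoc shell):
$ openstack compute service set --disable --disable-reason "planned intervention" \
      compute-node-42.cern.ch nova-compute
$ openstack server list --all-projects --host compute-node-42.cern.ch -f value -c Name   # VMs whose owners get the e-mail notification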
Manila: Overview
- File Share Project in OpenStack: provisioning of shared file systems to VMs ('Cinder for file shares')
- APIs for tenants to request shares: fulfilled by backend drivers, accessed from instances
- Support for a variety of NAS protocols: NFS, CIFS, MapR-FS, GlusterFS, CephFS, …
- Supports the notion of share types: map features to backends
[Diagram: (1) user instances request a share via the Manila API, (2) Manila creates the share on the backend, (3) a handle is provided, (4) the instances access the share]
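A hedged sketch of the same flow from the tenant side (the share type name, size and client network are illustrative):
$ manila create --name myshare --share-type cephfstype NFS 10    # 1. request a share
$ manila show myshare                                            # 3. obtain the export location handle once created
$ manila access-allow myshare ip 192.168.1.0/24                  # 4. grant the instances access, then mount the export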
LHC Incident in April 2016
Manila testing: #fouinehammer
[Diagram: test deployment with m-api (x3) + DB, m-sched, m-share + driver and RabbitMQ, scaled across 1 … 500 nodes and 1 … 10k pods]
Commercial Clouds
Development areas going forward
- Spot market
- Cells V2
- Neutron scaling – no Cells equivalent yet
- Magnum rolling upgrades
- Collaborations with industry
Operations areas going forward
Further automate migrations
- Around 5,000 VMs / year
- First campaign in 2016 needed some additional scripting, such as pausing very active VMs
- Newton live migration includes most use cases
Software Defined Networking
- Nova network to Neutron migration to be completed
- In addition to the flat network in use currently
- Introduce higher level functions such as LBaaS
Future Challenges
[Chart: data estimates for the 1st year of HL-LHC (PB), raw vs derived, per experiment (ALICE, ATLAS, CMS, LHCb), scale 0-1000]
[Chart: CPU needs for the 1st year of HL-LHC (kHS06), per experiment (ALICE, ATLAS, CMS, LHCb), scale 0-250,000]
[Timeline: First run, LS1, Second run, LS2, Third run, LS3, HL-LHC, FCC?, spanning 2010 to 2030+]
CPU: x60 from 2016
Data: Raw 2016: 50 PB, 2027: 600 PB
Derived (1 copy): 2016: 80 PB, 2027: 900 PB
Raw data volume for the LHC increases exponentially, and with it the processing and analysis load
Technology improving at ~20%/year will bring a factor of 6-10 in 10-11 years
Estimates of resource needs at HL-LHC are a factor of 10 above what is realistic to expect from technology at reasonably constant cost
Summary
- OpenStack has provided a strong base for scaling resources over the past 4 years without a significant increase in CERN staff
- Additional functionality on top of pure Infrastructure-as-a-Service is now coming to production
- Community and industry collaboration has been productive and inspirational for the CERN team
- Some big computing challenges up ahead…
Further Information
Technical details on the CERN cloud at
http://openstack-in-production.blogspot.fr
Custom CERN code is at https://github.com/cernops
Scientific Working Group at
https://wiki.openstack.org/wiki/Scientific_working_group
Helix Nebula details at http://www.helix-nebula.eu/
http://cern.ch/IT  © CERN  CC-BY-SA 4.0
Backup
WLCG MoU Signatures (2017):
- 63 MoUs
- 167 sites; 42 countries
Partners, Contributors, Associates, Research
How do we monitor?
[Diagram: data sources (data centres, WLCG) feed a Kafka transport layer into processing, storage/search and data access]
Tuning
Many hypervisors are configured for compute optimisation:
- CPU passthrough so the VM sees an identical CPU
- Extended Page Tables so memory page mapping is done in hardware
- Core pinning so the scheduler keeps vCPUs on the underlying physical cores
- Huge pages to improve memory page cache utilisation
- Flavors are set to be NUMA aware
Improvements of up to 20% in performance
Impact is that these VMs cannot be live migrated, so service machines are not configured this way
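Several of the optimisations above are expressed as Nova flavor extra specs. A hedged sketch with a hypothetical flavor name; the property keys are the standard specs for CPU pinning, huge pages and NUMA topology:
$ openstack flavor set m1.compute-optimised \
      --property hw:cpu_policy=dedicated \
      --property hw:mem_page_size=large \
      --property hw:numa_nodes=2
# CPU passthrough itself is a hypervisor-side setting, e.g. cpu_mode = host-passthrough in the [libvirt] section of nova.conf.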
Provisioning services
Moving towards an Elastic Hybrid IaaS model:
- In-house resources at full occupation
- Elastic use of commercial & public clouds
- Assume "spot-market" style pricing
[Diagram: end users, CI/CD, IT & experiment services and experiment pilot factories reach OpenStack resource provisioning (>1 physical data centre) via APIs, CLIs and GUIs; capacity is delivered as VMs, containers, bare metal and HPC (LSF), alongside HTCondor batch, volunteer computing and public cloud]
Simulating Elasticity
Deliveries are around 1-2 times per year
Resources are for
- Batch compute … immediately needed … compute optimised
- Services … needed as projects request quota … support live migration with generic CPU definition
Elasticity is simulated by
- Creating opportunistic batch projects running on resources available for services in the future (see the sketch below)
- Draining opportunistic batch as needed
End result is
- High utilisation of 'spare' resources
- Simulation of an elastic cloud
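A hedged sketch of carving out an opportunistic batch project on not-yet-allocated capacity (the project name and quota values are hypothetical):
$ openstack project create --description "Opportunistic batch on spare capacity" batch-opportunistic
$ openstack quota set --cores 2000 --ram 4096000 --instances 500 batch-opportunistic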
Pick the interesting events
- 40 million per second: fast, simple information; hardware trigger in a few microseconds
- 100 thousand per second: fast algorithms in local computer farm; software trigger in <1 second
- Few 100 per second: recorded for study
[Event display: muon tracks, energy deposits]