Page 1: OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Matteo Lanati, Leibniz Supercomputing Centre

Budgeting: the Ugly Duckling of Cloud Computing?

Dr. Matteo Lanati ([email protected])

25th October 2016

LRZ, Distributed Resources Group, Matteo Lanati

Outline

● Introduction
● Update on last year's activity
● Budgeting for OpenNebula
– Why it is needed
– Rationale and ideas
– Current status
● Next steps

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities

● Scope:
– Munich
– Bavaria
– Germany
– Europe
– Worldwide
● Provision of traditional IT services
● High performance systems

SuperMUC

Phase 1 (2012)
● Westmere/Sandy Bridge
● > 155,000 cores
● > 3.0 PFlop/s total peak performance

Phase 2 (2015)
● Haswell
● > 86,000 cores
● > 3.5 PFlop/s total peak performance

https://www.lrz.de/services/compute/supermuc/

LRZ Compute Cloud: OpenNebula setup

[Architecture diagram. Labels:]
● VMware high availability (8 cores / 32 GB RAM)
● SSH commands, monitoring probes
● Worker node 1 ... Worker node 88 (88 physical nodes, 736 cores / 7.5 TB RAM)
● Datastore, System store 1 ... System store 10
● NetApp NAS (300 TB)

Update on last year's activity

Our user base

● March 2015 – October 2015: 200 accounts
● October 2015 – October 2016: 250 new accounts

[Two pie charts of accounts by group. Labels, in extraction order:]
● LRZ 28%, Math / CS 28%, Other 15%, Mech. Eng. 12%
● Other 40%, LRZ 19%, Math / CS 15%, Bio 10%

Update on last year's activity

Resource usage: computation

March 2015 – October 2015
● Computation: ~1 million CPU-hours
● Storage: ~10 TB
● CPU-hours by group: Geo 396 K (35%), Other 219 K (20%), Math / CS 128 K (12%), Physics 108 K (10%), LRZ 88 K (8%), Mech. Eng. 64 K (6%)

October 2015 – October 2016
● Computation: ~3 million CPU-hours
● Storage: ~30 TB
● CPU-hours by group: Math / CS 1.2 M (41%), Geo 496 K (17%), Mech. Eng. 459 K (15%), Other 443 K (14%), LRZ 191 K (6%)

What budgeting means

Goal: efficient use of resources (i.e., few idle VMs)

Manage the lifetime of a group of VMs according to:
● Number of cores (Nc)
● RAM (Mem)
● Datastore space (Ds)
● IPs
● Time

Cost function

(A * Nc + B * Mem + C * Ds + D * IPs) * <running time>

What budgeting means

A concrete proposal for the cost factors

0.01 * Nc * <hours> + 0.001 * Mem * <hours> + 0.01 * Ds * <months> + 0.50 * IPpublic * <months> + 0.10 * IPprivate * <months>

Item                 Time period  Cost
Core                 Hour         0.01 €
GB of RAM            Hour         0.001 €
GB in image store    Month        0.01 €
Public IP            Month        0.50 €
Private (campus) IP  Month        0.10 €
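
The proposed cost factors can be turned into a small calculator. The following is only a sketch of the formula as stated on the slide: the `Usage` structure, the function name and the example VM are hypothetical, and the rates are copied from the table.

```python
from dataclasses import dataclass

# Proposed rates from the slide, in euros.
RATE_CORE_HOUR = 0.01         # per core, per hour
RATE_RAM_GB_HOUR = 0.001      # per GB of RAM, per hour
RATE_IMAGE_GB_MONTH = 0.01    # per GB in the image store, per month
RATE_PUBLIC_IP_MONTH = 0.50   # per public IP, per month
RATE_PRIVATE_IP_MONTH = 0.10  # per private (campus) IP, per month


@dataclass
class Usage:
    """Resources held by a VM or a group of VMs (hypothetical structure)."""
    cores: int
    ram_gb: float
    image_store_gb: float
    public_ips: int = 0
    private_ips: int = 0


def cost_eur(usage: Usage, hours: float, months: float) -> float:
    """Apply the proposed cost factors: hourly items plus monthly items."""
    hourly = (RATE_CORE_HOUR * usage.cores
              + RATE_RAM_GB_HOUR * usage.ram_gb) * hours
    monthly = (RATE_IMAGE_GB_MONTH * usage.image_store_gb
               + RATE_PUBLIC_IP_MONTH * usage.public_ips
               + RATE_PRIVATE_IP_MONTH * usage.private_ips) * months
    return hourly + monthly


# Example (made-up VM): 4 cores, 16 GB RAM, 50 GB image, 1 public IP,
# running 720 hours within one month.
example = Usage(cores=4, ram_gb=16, image_store_gb=50, public_ips=1)
print(round(cost_eur(example, hours=720, months=1), 2))  # 41.32
```

Hours and months are passed separately because the slide bills compute by the hour and storage and IPs by the month.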

Why budgeting

Use cases

● Computational bursts
– 200 to 400 cores for a few weeks to a few months
● Multitenancy inside a group / project
– Support students' training activities
– Important feature: avoid budget overflow
● Resource management and planning
– To help us decide how and in which direction to grow

Budgeting: the big plan

Hardware Classes
● Regular
– Paid for by LRZ
● Reserved
– Brought in by the user
– Exclusive access

User Classes
● Normal (uninterruptible)
– No guarantees on start time
● Privileged (golden)
– Immediate start

Usage optimisation
● Opportunistic
● Interruptible

Budgeting: possible implementation

Hardware Classes
● Regular
● Reserved
– Permission/ownership
– Scheduling requirements
– Scheduler

User Classes
● Selected in the template
● Possible customisation of the GUI

Budgeting: important features

● Prepaid model
– Avoid budget overflow
– Mitigation in case the budget is exceeded: undeploy VMs
● External implementation
– Separate budget management from the sysadmin view
– Easier to use the cost function to run a prediction model
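
One simple form of prediction for a prepaid budget is to estimate how long the remaining credit lasts at the current burn rate. The sketch below assumes the hourly rates from the proposed cost factors (0.01 € per core-hour, 0.001 € per GB of RAM per hour) and ignores the monthly items. The function names and the (cores, RAM) tuples are hypothetical, not part of the presented implementation.

```python
import math
from typing import List, Tuple

RATE_CORE_HOUR = 0.01     # euros per core per hour (proposed rate)
RATE_RAM_GB_HOUR = 0.001  # euros per GB of RAM per hour (proposed rate)


def hourly_burn_eur(vms: List[Tuple[int, float]]) -> float:
    """Euros spent per hour by a list of running VMs given as (cores, ram_gb)."""
    return sum(RATE_CORE_HOUR * cores + RATE_RAM_GB_HOUR * ram_gb
               for cores, ram_gb in vms)


def hours_until_exhausted(remaining_eur: float,
                          vms: List[Tuple[int, float]]) -> float:
    """Hours left before the prepaid budget runs out at the current burn rate."""
    burn = hourly_burn_eur(vms)
    if burn == 0:
        return math.inf  # nothing is running, so the budget is not consumed
    return max(remaining_eur, 0.0) / burn


# Example (made-up group): 100 euros left, two VMs with 4 cores / 16 GB
# and 8 cores / 32 GB.
running = [(4, 16.0), (8, 32.0)]
print(round(hours_until_exhausted(100.0, running)))  # about 595 hours
```

Keeping this logic outside the cloud manager matches the external implementation idea: the same function can be run on hypothetical VM sets before they are submitted.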

Budgeting: the implementation so far

[Workflow diagram. Labels:]
● VM submission, VM running, VM undeployed
● Hook script
● Cron jobs
● ONE DB
● Budget thresholds
● Budget consumption: <# cores> * <running time>
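
The periodic check suggested by the diagram can be sketched as follows. This is an illustration, not the LRZ code: the `VmRecord` structure, the 80% warning level and the returned action names are assumptions, and reading the ONE DB or actually undeploying VMs is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VmRecord:
    """One VM of a group, as a cron job might read it (hypothetical fields)."""
    vm_id: int
    cores: int
    running_hours: float


def consumed_core_hours(vms: List[VmRecord]) -> float:
    """Budget consumption as on the slide: <# cores> * <running time>."""
    return sum(vm.cores * vm.running_hours for vm in vms)


def check_budget(vms: List[VmRecord], budget_core_hours: float,
                 warn_fraction: float = 0.8) -> Tuple[str, List[int]]:
    """Compare consumption with the thresholds.

    Returns an action and the IDs of the VMs it applies to. The warning
    threshold (80% by default) is an assumption of this sketch.
    """
    used = consumed_core_hours(vms)
    if used >= budget_core_hours:
        # Budget exceeded: every VM of the group is a candidate for undeploy.
        return "undeploy", [vm.vm_id for vm in vms]
    if used >= warn_fraction * budget_core_hours:
        return "warn", []
    return "ok", []


# Example (made-up group): 4 cores for 10 h plus 8 cores for 5 h = 80 core-hours.
group = [VmRecord(1, 4, 10.0), VmRecord(2, 8, 5.0)]
print(check_budget(group, budget_core_hours=90))  # ('warn', [])
```

Returning the decision instead of acting on it keeps the budget logic testable without a running cloud.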

Next Steps

● Update to ONE 5.0.x
● Upgrade the hardware
● Focus on the security of VMs
– LRZ Security Scanner (LSS)
– Detect weak passwords
– Identify vulnerabilities

Thank you for your attention