Dude, Where's My Volume? (OpenStack Summit Vancouver 2015)
TRANSCRIPT
Today's Presenters:
▪ Neil Levine, Director of Product Management, Ceph, Red Hat
▪ Sean Cohen, Principal Product Manager, OpenStack, Red Hat
▪ Gorka Eguileor, Software Engineer, Cinder & Manila, Red Hat
Agenda
▪ OpenStack Disaster Recovery & Multi-Site
▪ Ceph & Multi-Site
▪ 4 use-cases
– Topologies
– Configuration
– Future Options
▪ Liberty Blueprints
OpenStack Disaster Recovery
▪ Different disaster recovery topologies and configurations come with different RPO/RTO levels:
▪ DR Configurations:
– Active - Cold standby
– Active - Hot standby
– Active - Active
▪ Site Topologies:
– Stretched Cluster
– One OpenStack Cluster
– Two OpenStack Clusters
OpenStack Disaster Recovery
▪ What does disaster recovery for OpenStack involve?
– Capturing the metadata relevant to the protected workloads/resources via the components' APIs.
– Ensuring that the required VM images are present at the target/destination cloud (limited to a single cluster).
– Replication of the workload data using storage replication, application-level replication, or backup/restore.
OpenStack Cinder
▪ Metadata database and volumes
▪ Topology:
– HA pairs, but within a single site
– No inherent multi-site/DR architecture
▪ APIs (see the example below):
– Volume Migration API
– Volume Backup API
– Volume Replication API
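The first two of these surface directly in the python-cinderclient CLI; a minimal sketch (IDs and host names are placeholders, and the replication API is driver/admin-level rather than a plain CLI call):

    # Migrate a volume to a back end on another host
    cinder migrate <volume-id> <destination-host>
    # Create a backup of a volume
    cinder backup-create --name my-backup <volume-id>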
OpenStack Glance
▪ Metadata database and images
▪ Topology:
– HA pairs, but within a single site
– No inherent multi-site/DR architecture
▪ APIs:
– glance-api
OpenStack Nova
▪ Metadata database and volumes
▪ Topology:
– HA pairs, but within a single site
– No inherent multi-site/DR architecture
▪ Cattle:
– Shouldn't be backing up ephemeral volumes…
– Put snapshots in Glance if you need them (see the example below).
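For the snapshots-in-Glance route, the stock nova CLI already does this; a minimal sketch (the server and snapshot names are placeholders):

    # Snapshot a running instance; the result is stored as a Glance image
    nova image-create my-server my-server-snap
    # Confirm it landed in Glance
    glance image-list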
Ceph RBD Overview
▪ Storage for Glance, Cinder and Nova
▪ RBD Exports:
– Incremental by default (see the sketch below)
▪ RBD Mirroring:
– a.k.a. Volume Replication
– Scheduled for 2016
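The incremental export path is built on RBD snapshots and diffs; a minimal sketch (pool, image, and snapshot names are placeholders):

    # Baseline: snapshot the image and export it in full
    rbd snap create volumes/vol1@base
    rbd export volumes/vol1@base /backup/vol1.img
    # Later: snapshot again and export only the changes since @base
    rbd snap create volumes/vol1@daily1
    rbd export-diff --from-snap base volumes/vol1@daily1 /backup/vol1.diff
    # Apply the diff to the copy on the backup cluster
    rbd import-diff /backup/vol1.diff volumes/vol1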
Ceph RGW Overview
▪ Swift-API (and S3) compatible object store
▪ Common storage platform with RBD
▪ Multi-Site v1: Active/Passive (today)
▪ Multi-Site v2: Active/Active (2016)
Can I run a single [OpenStack/Ceph] cluster?
▪ Not recommended
▪ OpenStack is not designed for high-latency links
– Possible for campus environments
▪ Ceph is not designed for high-latency links
– Possible for campus environments
– Pay attention to monitor placement and read-affinity settings (see the sketch below)
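One knob for read affinity in a stretched Ceph cluster is primary affinity, since RBD reads are served by the primary OSD; a minimal sketch (OSD IDs are placeholders and the weights are illustrative):

    # Discourage the remote site's OSDs from being chosen as primaries,
    # so most reads stay local to the main site
    # (older releases need mon_osd_allow_primary_affinity = true)
    ceph osd primary-affinity osd.4 0
    ceph osd primary-affinity osd.5 0

For monitor placement, keep an odd number of monitors with the quorum majority in one site, so a cut link cannot split the cluster evenly.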
Use-Case #1: User Ctrl-Z
▪ Only duplicate the backup storage cluster:
− 1 OpenStack cluster (i.e. one logical Cinder service)
− 2 Ceph clusters in different physical locations
▪ Undo accidental volume deletion
▪ Uses the Cinder Backup service:
− Easy configuration
− Fine granularity
▪ Backups controlled by the end user or the cloud admin
Use-Case #1: Single "Stretched" Topology
[Diagram: a single Cinder/Cinder-Backup service spanning Site A and Site B, with a separate Ceph RBD cluster in each site]
Use-Case #1: Cinder Backup
▪ Tightly coupled to Cinder Volume
▪ Multiple available backends: RBD, RGW/Swift, NFS…
− Incremental backups by default with RBD
▪ Backup metadata is required to restore volumes
▪ Usage: Horizon, CLI, cinder-client API (see the example below)
▪ Some limitations:
− Single backend
− Individual and manual process
− Only volumes in the "available" state
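A minimal end-to-end example with the cinder CLI (IDs are placeholders):

    # Back up a volume, then list and restore it after an accidental deletion
    cinder backup-create --name nightly <volume-id>
    cinder backup-list
    cinder backup-restore <backup-id>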
Use-Case #1: Cinder Backup's Future
▪ Next cycle:
− Decoupling from Cinder Volume
− Snapshot backups
− Scheduling
▪ In the meantime: script it (see the sketch below)
− Automatic backups of multiple volumes
− Control backup visibility from users
− Back up in-use volumes
− Limit how many backups to keep per volume
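A minimal cron-able sketch of such a script (the volume selection, naming, and in-use handling are illustrative; recent cinderclient releases accept --force for in-use volumes, older ones need a snapshot-based workaround):

    #!/bin/bash
    # Back up every volume, tagging backups with the date
    for vol in $(cinder list --all-tenants | awk '/available|in-use/ {print $2}'); do
        cinder backup-create --force --name "auto-${vol}-$(date +%F)" "${vol}"
    done
    # Pruning backups beyond N per volume would go here (cinder backup-delete)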
Use-Case #2: The Admin Warehouse
▪ One OpenStack cluster
o No OpenStack services in Site B
▪ Two Ceph clusters
▪ Less to deploy in Site B, but a longer recovery time
▪ Backups controlled by the admin, not the user
▪ Restore everything in the event of total data loss
▪ Equivalent to a tape backup
Use-Case #2: Topology
[Diagram: Cinder and Glance run only in Site A; MySQL dumps (cinder.sql, glance.sql) and rbd exports are shipped from the Site A Ceph RBD cluster to the Site B Ceph RBD cluster]
Use-Case #2: Configuration
▪ mysqldump --databases cinder glance
▪ Automated RBD export script:
o https://www.rapide.nl/blog/ceph_-_rbd_replication
▪ Limitations:
o No snapshot clean-up
o Ensure backups complete within a day
▪ Restore: reverse the streams (see the sketch below)
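Putting the two pieces together, a minimal sketch of a nightly warehouse run (host, pool, and snapshot names are placeholders; this is not the linked script):

    # Dump the OpenStack metadata databases
    mysqldump --databases cinder glance > /backup/openstack-$(date +%F).sql
    # Ship a volume's changes since yesterday's snapshot to Site B
    rbd snap create volumes/volume-1234@$(date +%F)
    rbd export-diff --from-snap <yesterday> volumes/volume-1234@$(date +%F) - | \
        ssh site-b rbd import-diff - volumes/volume-1234

Restoring reverses the direction: load the SQL dumps into MySQL and import the volume images back into the Site A cluster.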
Use-Case #3: The Failover Site
▪ Two OpenStack clusters, two Ceph clusters
▪ Backups controlled by the admin
▪ Active/Passive
▪ Use low-level tools to handle backups:
o MySQL Replication
o RBD Exports
Use-Case #3: Topology
[Diagram: full Cinder and Glance stacks in both sites; the Cinder and Glance MySQL databases are replicated from Site A to Site B, volume data is shipped with rbd exports, and the backup Ceph cluster uses the same fsid as the primary]
Use-Case #3: Configuration
▪ MySQL replication to Site B, but the replica is not included in the HA pair
▪ Unlike Active-Active configurations, consistency between the volume data and the databases is not guaranteed (see the sketch below).
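A minimal sketch of pointing the Site B replica at the Site A master (hosts, credentials, and binlog coordinates are placeholders):

    # On the Site B MySQL replica
    mysql -e "CHANGE MASTER TO MASTER_HOST='db-site-a', MASTER_USER='repl', \
              MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4;"
    mysql -e "START SLAVE;"
    mysql -e "SHOW SLAVE STATUS\G"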
Use-Case #4: Topology (Future)
[Diagram: full Cinder and Glance stacks in both sites; data is kept in sync with rbd mirroring (same fsid on the backup cluster), while cinder replication and glance replication handle the metadata]
Use-Case #4: Future Options
▪ Glance-Replicator
o Run Glance in the 2nd site and push image copies over (see the sketch below)
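glance-replicator ships with Glance; a minimal sketch of pushing images from Site A to Site B (the endpoints are placeholders, and auth-token options, which vary by release, are omitted):

    # Copy images and their metadata from the Site A Glance to the Site B Glance
    glance-replicator livecopy glance-site-a:9292 glance-site-b:9292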
What’s coming up in Liberty
Cinder - Volume Replication V2
▪ Replication between Cinders
o Currently we only have basic replication within a single Cinder deployment.
▪ Consistent data replication
o Align the consistency group (CG) design with the volume-replication spec: one CG could support different volume-types, where the volume-type decides which volume replication is going to be created and added to the CG (see the sketch below).
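As a rough illustration of how volume-types and CGs would combine (a sketch: the type names are placeholders and the replication_enabled extra-spec key is an assumption based on the in-progress spec, not a settled interface):

    # Define a volume-type that requests replication (spec key is assumed)
    cinder type-create replicated-gold
    cinder type-key replicated-gold set replication_enabled='<is> True'
    # Group volumes of that type into a consistency group
    cinder consisgroup-create --name app-cg replicated-gold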
Summary
▪ Today:
o Simple: Use-Case #1 - Ctrl-Z
o Medium: Use-Case #2 - Admin Warehouse
o Advanced: Use-Case #3 - Active/Passive Infrastructure
▪ Future:
o Use-Case #4 - Active/Passive OpenStack