
Dude, where’s my volume?

Vancouver, May 2015

Today's Presenters

Neil Levine, Director of Product Management, Ceph, Red Hat
Sean Cohen, Principal Product Manager, OpenStack, Red Hat
Gorka Eguileor, Software Engineer, Cinder & Manila, Red Hat


Agenda

▪ OpenStack Disaster Recovery & Multi-Site
▪ Ceph & Multi-Site
▪ 4 use-cases
  – Topologies
  – Configuration
  – Future Options
▪ Liberty Blueprints


Disaster recovery

OpenStack Disaster Recovery

▪ Different disaster recovery topologies and configurations come with different RPO/RTO levels:

  Site Topologies:
  – Stretched Cluster
  – One OpenStack Cluster
  – Two OpenStack Clusters

  DR Configurations:
  – Active - Cold standby
  – Active - Hot standby
  – Active - Active

OpenStack Disaster Recovery

▪ What does disaster recovery for OpenStack involve?
  – Capturing the metadata relevant for the protected workloads/resources via component APIs.
  – Ensuring that the required VM images are present at the target/destination cloud (limited to single cluster).
  – Replication of the workload data using storage replication, application-level replication, or backup/restore.


OPENSTACK  COMPONENTS  


OpenStack Cinder

▪ Metadata database and volumes
▪ Topology
  – HA pairs, but within a single site
  – No inherent multi-site/DR architecture
▪ APIs (examples below)
  – Volume Migration API
  – Volume Backup API
  – Volume Replication API
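For reference, the Migration and Backup APIs are exposed through the standard cinder CLI. A minimal sketch with placeholder IDs and host names (the Replication API at this point is driver-dependent and has no equivalent one-liner):

    # Move a volume to another backend host (Volume Migration API)
    cinder migrate <volume-id> <destination-host>

    # Create and list backups of a volume (Volume Backup API)
    cinder backup-create --name nightly <volume-id>
    cinder backup-list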


OpenStack Glance

▪ Metadata database and images
▪ Topology
  – HA pairs, but within a single site
  – No inherent multi-site/DR architecture
▪ APIs
  – glance-api


OpenStack Nova

▪ Metadata database and volume
▪ Topology
  – HA pairs, but within a single site
  – No inherent multi-site/DR architecture
▪ Cattle:
  – Shouldn't be backing up ephemeral volumes...
  – Put snapshots in Glance if you need them.


CEPH  COMPONENTS  


Ceph RBD Overview

▪ Storage for Glance, Cinder and Nova
▪ RBD Exports (see the example below)
  – Incremental by default
▪ RBD Mirroring
  – Aka Volume Replication
  – Scheduled for 2016
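As an illustration of how RBD exports work, a hedged sketch with placeholder pool, image and snapshot names; export-diff is what enables the incremental workflow:

    # Full export of an RBD image to a file
    rbd export volumes/volume-1234 /backups/volume-1234.img

    # Incremental export: only the changes between two snapshots
    rbd snap create volumes/volume-1234@snap2
    rbd export-diff --from-snap snap1 volumes/volume-1234@snap2 /backups/volume-1234.snap1-snap2.diff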

 


Ceph RGW Overview

▪ Swift-API (and S3) compatible object store
▪ Common storage platform with RBD
▪ Multi-Site v1: Active/Passive (today)
▪ Multi-Site v2: Active/Active (2016)


TOPOLOGIES  


Use-Case #1
Single "Stretched" Topology

Can I run a single [OpenStack/Ceph] cluster?

▪ Not Recommended
▪ OpenStack not designed for high-latency links
  – Possible for campus environments
▪ Ceph not designed for high-latency links
  – Possible for campus environments
  – Pay attention to monitor placement and read-affinity settings


Use-Case #1: User Ctrl-Z

▪ Only duplicate the backup storage cluster:
  − 1 OpenStack cluster (i.e. one logical Cinder service)
  − 2 Ceph clusters in different physical locations
▪ Undo accidental volume deletion
▪ Uses Cinder Backup service (see the example below):
  − Easy configuration
  − Fine granularity
▪ Backups controlled by end-user or cloud admin
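As an illustration, the end-user "undo" flow with the standard CLI (volume and backup IDs are placeholders):

    # Take a backup before a risky change
    cinder backup-create --name pre-change <volume-id>

    # If the volume is later deleted or corrupted, restore from the backup
    # (without a target volume, the restore creates a new volume)
    cinder backup-list
    cinder backup-restore <backup-id>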

Use-Case #1: Single "Stretched" Topology

[Diagram: Cinder and its Ceph RBD cluster at Site A; Cinder-Backup writing to a second Ceph RBD cluster at Site B]

Use-Case #1: Cinder Backup

▪ Tightly coupled to Cinder Volume
▪ Multiple available backends: RBD, RGW/Swift, NFS...
  − Incremental backups by default with RBD
▪ Backup metadata is required to restore volumes
▪ Usage: Horizon, CLI, cinder-client API
▪ Some limitations:
  − Single backend
  − Individual and manual process
  − Only available volumes


Use-Case #1: Cinder Backup's future

▪ Next cycle:
  − Decoupling from Cinder Volume
  − Snapshot backups
  − Scheduling
▪ In the meantime: a script (sketch below)
  − Automatic multiple backups
  − Control the visibility of backups to users
  − Backup in-use volumes
  − Limit how many backups per volume to keep
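A minimal sketch of such a wrapper script, assuming admin credentials are already sourced; this is not the script demoed below, and the naming scheme is only illustrative:

    #!/bin/bash
    # Back up every available volume and tag each backup with the date.
    DATE=$(date +%Y%m%d)
    for vol in $(cinder list --status available |
                 grep -oE '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}'); do
        echo "Backing up volume ${vol}"
        cinder backup-create --name "auto-${vol}-${DATE}" "${vol}"
    done
    # Old backups can then be pruned per volume with cinder backup-list / backup-delete.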


Use-­‐Case  #1:  Script  Demo  

https://www.youtube.com/watch?feature=player_embedded&v=2HpMz27kiss

Use-Case #2
The Admin Warehouse

Use-­‐Case  #2:  The  Admin  Warehouse  


▪ One OpenStack cluster
  o No OpenStack services in Site B
▪ Two Ceph clusters
▪ Less to deploy in Site B, longer recovery time
▪ Backups controlled by admin, not user
▪ Restore everything in event of total data loss
▪ Equivalent to a tape backup

Use-Case #2: Topology

[Diagram: Site A runs Cinder and Glance backed by Ceph RBD; MySQL dumps (cinder.sql, glance.sql) and rbd exports are shipped to the Ceph RBD cluster at Site B]

Use-Case #2: Configuration

▪ mysqldump --databases cinder glance
▪ Automated RBD export script:
  o https://www.rapide.nl/blog/ceph_-_rbd_replication
▪ Limitations:
  o No snapshot clean-up
  o Ensure backups complete in a day
▪ Restore (see the sketch below):
  o Reverse the streams
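A hedged sketch of that flow, with placeholder pool, path and host names (the linked script handles the RBD side more thoroughly, including incremental diffs):

    #!/bin/bash
    # Site A: dump the control-plane databases and export every volume image.
    mysqldump --databases cinder glance > /backups/openstack-$(date +%Y%m%d).sql
    for img in $(rbd ls volumes); do
        rbd export "volumes/${img}" "/backups/${img}.img"
    done

    # Ship the backup directory to Site B.
    rsync -a /backups/ backup-site-b:/backups/

    # Restore ("reverse the streams") on Site B:
    #   mysql < openstack-<date>.sql
    #   rbd import <image>.img volumes/<image>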


Use-­‐Case  #3  

The  Failover  Site  

Use-Case #3: The Failover Site

▪ Two OpenStack clusters, two Ceph clusters
▪ Backups controlled by admin
▪ Active/Passive
▪ Use low-level tools to handle backups
  o MySQL Replication (see the sketch below)
  o RBD Exports
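For the database side, a minimal sketch of classic MySQL master/slave replication covering the cinder and glance schemas; host names, credentials and log coordinates are placeholders:

    # Site A (master) my.cnf: server-id = 1, log_bin = mysql-bin
    # Site B (slave)  my.cnf: server-id = 2

    # On the master: create a replication user and note the binlog position
    mysql -e "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%' IDENTIFIED BY 'secret';"
    mysql -e "SHOW MASTER STATUS;"

    # On the slave: point at the master and start replicating
    mysql -e "CHANGE MASTER TO MASTER_HOST='site-a-db', MASTER_USER='repl', MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4;"
    mysql -e "START SLAVE;"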


Use-Case #3: Topology

[Diagram: Cinder and Glance at Site A replicate their MySQL databases to standby Cinder and Glance at Site B; volume data is shipped with rbd exports; the backup Ceph cluster uses the same fsid]

Use-Case #3: Configuration

▪ Replication, but the standby is not part of an HA pair
▪ Unlike Active-Active configurations, consistency between the data and the databases is not guaranteed.


Use-Case #4
OpenStack Live Disaster Recovery Site

Use-Case #4: Topology (Future)

[Diagram: Cinder and Glance at Site A paired with Cinder and Glance at Site B through cinder replication and glance replication; volume data replicated with rbd mirroring; the backup Ceph cluster uses the same fsid]

Use-Case #4: Future Options

▪ Glance-Replicator (example below)
  o Run Glance in the 2nd site and push image copies
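glance-replicator ships with Glance; a hedged sketch of pushing images from the primary endpoint to the second site (host names are placeholders, and admin tokens have to be supplied via its command-line options, see glance-replicator --help):

    # Compare the image sets on the two Glance endpoints
    glance-replicator compare primary-glance:9292 secondary-glance:9292

    # Copy image data and metadata present on the primary but missing on the secondary
    glance-replicator livecopy primary-glance:9292 secondary-glance:9292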


What's coming up in Liberty

Cinder - Volume Replication V2
▪ Replication between Cinders
  o Currently we have basic replication within a single Cinder deployment.
▪ Consistent data replication
  o Align the consistency group (CG) design with the volume-replication spec: one CG could support different volume types, where the volume type decides which volume replication is created and added to the CG.


Summary

▪ Today:
  o Simple:
    ▪ Use-Case #1 - Ctrl-Z
  o Medium:
    ▪ Use-Case #2 - Admin Warehouse
  o Advanced:
    ▪ Use-Case #3 - Active/Passive Infrastructure
▪ Future:
  o Use-Case #4 - Active/Passive OpenStack

Q&A


Thank you!