cinder enhancements-for-replication-using-stateless-snapshots

• [email protected]

Cinder Enhancements for Replication (and more) using Stateless Snapshots

Using Stateless Snapshots with Taskflow

Snapshots

• With Havana, all Cinder Volume Drivers support snapshots.

• But some vendors provide “stateless” volume snapshots:

– Taking the snapshot does not interfere with use of the Volume.

– The Volume remains fully readable and writeable

• Stateless/Low-overhead snapshots are useful for many other activities

– Replication, Migration, Fail-over, Archiving, Deploying Master Images, …

• What is proposed:

– A set of optional enhancements for Cinder Volume Drivers.

– A pattern of usage for Taskflows to take advantage of stateless snapshots.

Backup, Migration, Replication and pre-failover Preparation

• Multiple methods, but a common pattern with the same issues:

• Need for NDMP/OST-style direct appliance-to-appliance transfers.

– Volumes are big, transferring them twice is not acceptable

– Transferring them “through” the volume-manager is not acceptable either

• Low-cost snapshots enable “stateless” methods

• Volume Drivers must report their capabilities:

– can-snap-stateless, storage-assist, etc.

Handful of Stones, Many Birds

• Proposal: Volume Drivers to optionally implement:

– Snapshot Replication.

– Severing the tie to the Volume status.

– Reporting capabilities.

• Variety of ways that Taskflow could use those to:

– Backup, Migrate, Implement a variety of data protection strategies, Enhance automatic failover, Improve deployment of cloned images

– Implement sophisticated snap-retention policies

– https://review.openstack.org/#/c/53480/

– https://blueprints.launchpad.net/cinder/+spec/volume-backup-create-task-flow

When Cinder manages non-local storage

• This deployment was cited as one of two in the deep dive presentation.

• But the first implementation of backup does not work acceptably for these deployments.

[Diagram: Cinder with a vendor-specific Volume Manager; a Storage Backend with Storage Controller and iSCSI Target; Nova with a VM Instance (/dev/vda) reached through the Hypervisor and an iSCSI Initiator.]

Current Cinder Backup

1. Volume Driver fetches content

2. Volume Driver puts Backup Object as client for Object Storage

• Problem: doubles network traffic

– OK, compression reduces the second step.

– But even with 90% compression it would still be 1.1x just transferring the data.

• What we want is to do direct transfer (3 on the diagram), which would match other Cinder backend actions
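The 1.1x figure is the fetch (one full copy of the volume) plus the compressed put. A minimal sketch of that arithmetic follows; the function name and the "fraction removed" meaning of the compression ratio are assumptions made for illustration.

```python
def two_step_traffic(compression_ratio: float) -> float:
    """Network traffic of the current backup path, as a multiple of volume size.

    compression_ratio is the fraction of data removed by compression
    (0.9 means the backup object is 10% of the original size).
    """
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression_ratio must be in [0, 1)")
    fetch = 1.0                      # step 1: driver fetches the whole volume
    put = 1.0 - compression_ratio    # step 2: compressed object sent to Object Storage
    return fetch + put
```

With no compression the current path moves the data twice (2.0x). A direct transfer moves it once, so it can never exceed 1.0x.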

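[Diagram: Cinder, Storage Backend (Storage Controller, Block Target), Block Initiator, and Backup Target (Swift). Arrows 1 and 2 are the current fetch-then-put path; arrow 3 is the desired direct transfer.]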


Ongoing Use of Volume with Concurrent Backup/Replication/Etc.

• Existing Cinder pattern for volume migration could be applied:

– Use of override flag can enable doing long operations on an attached volume

– Allows clients to continue to use the volume while the backup/replication (or whatever) is in progress.

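[Sequence diagram: Client, Cinder Volume Manager, Storage Controller, Backup Target. The Volume Manager requests Replicate; the Storage Controller performs Snapshot / Replicate Snapshot / [Release Snapshot] toward the Backup Target (Snap Write, Ack) while Client I/O to the volume continues.]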

Agenda

• Optional Cinder Enhancements

– Track Status Independent of the Volume Status

– Snapshot Replication

– Volume Driver Attributes

• Taskflow Usage

Volume Status alone blocks 24/7 Volumes

• The problem is that the Volume Status is set to Backing Up

– Or Migrating, or Replicating, etc.

• Other Cinder actions are blocked by this:

– At most one backup/migration/whatever can be in progress at a time.

– You cannot reassign a volume while it is being backed up.

• Proposed Solution: Use a different Status variable

– Allow Backends to modify the Task state, rather than Volume state.

• Backend must declare itself to be “stateless” for this method.

• Progress is reported via the Task state just as it would have via the Volume state.
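A minimal sketch of the idea, assuming hypothetical `Volume` and `Task` records rather than Cinder's real models: a stateless driver leaves the Volume status untouched and reports progress on the task, while a stateful driver still marks the Volume.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    status: str = "in-use"


@dataclass
class Task:
    kind: str
    state: str = "PENDING"


def start_long_operation(volume: Volume, kind: str, driver_is_stateless: bool) -> Task:
    """Start a backup/migration/replication and decide where status is tracked."""
    task = Task(kind=kind, state="IN_PROGRESS")
    if not driver_is_stateless:
        # Stateful driver: the Volume itself is marked, blocking other actions.
        volume.status = kind
    return task
```

Because the stateless path never changes `volume.status`, several tasks can run against the same Volume at once.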

Impact of allowing Alternate Status

• First, it is optional

– It allows implementations that can do long-term actions without restricting access to the Volume to do so.

– Stateful implementations are not required to change their code.

• If taking a snapshot is expensive, you don’t want Cinder using this as a “shortcut”.

• This is safe. No reliance on end user knowing when to override.

• For “stateless” Volume Drivers:

– Cinder understands that launching long term methods (such as backup or replicate) has no impact on the Volume itself.

– The action is actually being performed on a low-cost snapshot.

States of a Taskflow Using a Cinder Volume (such as Backup)

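from taskflow import states

....

# FlowObjectsTransitions and Flow are assumed to come from the proposed taskflow integration.
transitions = FlowObjectsTransitions()

# Map each generic task state to the status reported for the backup.
transitions.add_transition(volume, states.IN_PROGRESS, "BACKING-UP")

transitions.add_transition(volume, states.ERROR, "FAILED")

transitions.add_transition(volume, states.SUCCESS, "SUCCESS")

backup_flow = Flow("backup_flow_api", transitions)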

[State diagram: In Progress (Backing-up), Error (Failed), Success (Success). Modules shown: cinder.backup.manager, cinder.backup.flows.backup_volume_flow.]



Agenda


– Track Status Independent of the Volume Status

– Snapshot Replication

– Volume Driver Attributes

• Taskflow Usage

Proposed Method: Replicate Snapshot

• Why?

– Why not?

– For many/most implementations snapshots can be migrated.

– Certain tasks are simpler with snapshots.

• Snapshots are not volatile.

• Method on an existing snapshot.

– Specifies a different backend as the target.

• Must be under the same Volume Driver.

• Snapshot formats are inherently vendor specific.

– Optionally suppresses incremental transfers, requiring a full copy from scratch.
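A toy sketch of what such a method could look like. The class, the `replicate_snapshot` signature, and the `base_id` / `full_copy` parameters are all hypothetical; the slides do not define them. Snapshots are modelled as dictionaries of block number to data.

```python
class ToyDriver:
    """Illustrative driver managing several backends; not a real Cinder driver."""

    def __init__(self, backend_names):
        # backend name -> {snapshot id -> {block number -> data}}
        self.backends = {name: {} for name in backend_names}

    def create_snapshot(self, backend, snap_id, blocks):
        self.backends[backend][snap_id] = dict(blocks)

    def replicate_snapshot(self, snap_id, source, target,
                           base_id=None, full_copy=False):
        """Copy a snapshot to another backend of this driver.

        Returns the number of blocks transferred. The source snapshot is kept.
        """
        if source not in self.backends or target not in self.backends:
            raise ValueError("both backends must be under this Volume Driver")
        blocks = self.backends[source][snap_id]
        # Use an earlier snapshot already on the target as the incremental base,
        # unless the caller asked for a full copy from scratch.
        base = None if full_copy else self.backends[target].get(base_id)
        if base is None:
            sent = dict(blocks)
        else:
            sent = {k: v for k, v in blocks.items() if base.get(k) != v}
        self.backends[target][snap_id] = dict(blocks)
        return len(sent)
```

The sketch ignores blocks deleted between snapshots; a real driver would use its own vendor-specific delta format.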


[Diagram: Cinder issues “Replicate Snapshot X to Backend Y”; Snapshot X is copied from its Storage Backend (Storage Controller) to Storage Backend Y (Storage Controller).]

Replicating Snapshots differs from Volume Migration

• Replicates snapshot rather than a volume.

• Original snapshot is not deleted.

• Volume Drivers may use incremental transfer techniques.

– Such as ZFS incremental snapshots.

• Snapshots have vendor specific formats

– So the method to replicate them is inherently vendor specific.

– This allows for vendor specific optimization beyond incremental snapshots:

• Compression.

• Multi-path transfer.

Periodic Incremental Snapshots approach Continuous Data Replication

• Replicate snapshot can provide Continuous Data Replication if

– The Volume Driver supports incremental snapshots.

– The snapshots are performed quickly enough.

– Old snapshots are cleaned up automatically.

• Difference between “snapshots” and “remote mirroring” is more a matter of degree than a fundamental difference.

Benefits of Snapshot Replication

• Several tasks where Snapshot Replication helps

– “Warm Standby”: pool of servers synchronized at snapshot frequency.

– Enhanced deployment of VM boot images from a common master.

– Disaster Recovery.

– “Backup” to other servers.

– Volume migration.

– Check-in/Check-out of Volumes from a central storage server as VM is deployed.

Replicated Snapshots are versatile

1. Restore a volume from a Snapshot where the snapshot was replicated.

– Fast restore of a volume, but not at the optimum location.

• Or:

1. Replicate the Snapshot to a preferred location

2. And clone it there.

[Diagram: Storage Backend holding Snapshot (Volume V, Snapshot V.s3) and Preferred Location (Volume V, Snapshot V.s3), with arrows numbered 1 to 3.]

Other Issues

• Where does Storage Backend come from?

– At least two methods:

• From a backend_id in the DB, as suggested in Avishay’s Volume Mirroring proposal.

• By querying the Volume Driver for a list of backends that it controls.

• Volume Driver and/or Backend is responsible for tracking dependencies created by any incremental snapshot feature.

– The delta snapshot must be made a full snapshot before the referenced prior snapshot can be deleted on a given server.
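The dependency rule in the last bullet can be sketched as follows. The `SnapshotStore` class and its method names are hypothetical; a real driver or backend would track this in its own metadata.

```python
class SnapshotStore:
    """Toy per-server snapshot catalogue with full and delta snapshots."""

    def __init__(self):
        # snapshot id -> {"base": id of prior snapshot or None, "blocks": dict}
        self.snaps = {}

    def add_full(self, sid, blocks):
        self.snaps[sid] = {"base": None, "blocks": dict(blocks)}

    def add_delta(self, sid, base, changed_blocks):
        self.snaps[sid] = {"base": base, "blocks": dict(changed_blocks)}

    def materialize(self, sid):
        """Return the complete block map by walking the chain of bases."""
        snap = self.snaps[sid]
        if snap["base"] is None:
            return dict(snap["blocks"])
        full = self.materialize(snap["base"])
        full.update(snap["blocks"])
        return full

    def delete(self, sid):
        """Delete a snapshot, first turning its dependents into full snapshots."""
        for other_id, other in self.snaps.items():
            if other["base"] == sid:
                other["blocks"] = self.materialize(other_id)
                other["base"] = None
        del self.snaps[sid]
```

After `delete`, no remaining snapshot references the removed one, so later restores still work.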

Agenda


– Track Status Independent of the Volume Status

– Snapshot Replication

– Volume Driver Attributes

• Taskflow Usage

Why Volume Driver Attributes

• We do not want to mandate that all snapshots be stateless.

– It’s relatively easy for copy-on-write systems, but not everyone is copy-on-write.

• My philosophy for building consensus on open-source and standards:

– They should be flexible enough to allow my competition to be stupid.

– Especially since they think what I’m doing is stupid.

• Volume Driver Attributes let vendor-neutral code decide what will work well and what will not.

– Taking a snapshot does not optimize replication if it requires making a copy of the data before making a copy of the data.

Proposed Attributes: Volume Driver Capabilities

• Problem: how to optimize long (bulk data intensive) operations on Cinder volumes.

– Vendor specific algorithms are needed.

– But do we want to require that every task be implemented by each vendor?

• Proposal: Have each Volume Driver advertise when it has certain optional capabilities.

– If the capability is advertised, vendor independent taskflow code can take advantage of it.

– One method can be useful for many taskflows.

• Publication of these attributes is optional

– If you don’t do X you don’t have to do anything to say you don’t do X.

– If you have no optional capabilities then you don’t have to say anything.

Suggested Implementation for Volume Driver attributes

cinder.volume.drivers.storwize_svc:

from cinder.volume import capabilities

class StorwizeSVCDriver(san.SanDriver):
    ...
    @capabilities.storage_assist
    def migrate_volume(self, ctxt, volume, host):
        ....

cinder.volume.manager:

from cinder import capabilities

class VolumeManager(manager.SchedulerDependentManager):
    ...
    @utils.require_driver_initialized
    def migrate_volume(self, ctxt, volume_id, host, force_host_copy=False):
        ...
        if capabilities.is_supported(self.driver.migrate_volume, 'storage_assist'):
            # Then check that destination host is the same backend.
        elif capabilities.is_supported(self.driver.migrate_volume, 'local'):
            # Then check that destination host is the same host.
        ...

• Suggestion: use Python capabilities (declared as decorators)

• Included in source code of Volume Driver

– Already used by some Volume Drivers

• Easily referenced in code

• https://review.openstack.org/#/c/54803/

• https://blueprints.launchpad.net/cinder/+spec/backend-activity
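One way a `capabilities` module with this usage could be written is sketched below. This is an assumption for illustration, not the code under review: the `_declare` helper, the `_capabilities` attribute, and `ExampleDriver` are invented names. Only `storage_assist`, `local`, and `is_supported` appear in the slides.

```python
def _declare(name):
    """Build a decorator that tags a driver method with a capability name."""
    def decorator(func):
        caps = set(getattr(func, "_capabilities", ()))
        caps.add(name)
        func._capabilities = frozenset(caps)
        return func
    return decorator


storage_assist = _declare("storage_assist")
local = _declare("local")


def is_supported(method, name):
    """True if the (bound or unbound) method advertises the capability."""
    # Attribute lookup on a bound method falls through to the wrapped function.
    return name in getattr(method, "_capabilities", frozenset())


class ExampleDriver:
    @storage_assist
    def migrate_volume(self, ctxt, volume, host):
        return True

    def create_snapshot(self, snapshot):
        return None
```

A driver that advertises nothing needs no changes: `is_supported` simply returns False for undecorated methods.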









Agenda


• Taskflow Usage

– Maintain Volume Pool / “Warm Standby”

– Optimized provisioning of master image

– Live Migration with Incremental Snapshots

– Apply policy for Snapshot Retention and Replication

– Check-in/Checkout Volume from Central Repository

Warm Standby – Before the Failover

1. Snapshot Volume V

2. Fully Replicate to new Backend.

Periodically/Continuously:

3. Take new snapshot.

4. Transfer incremental snapshot to standby Backend.

5. Apply incremental snapshot to make new full image versioned snapshot.

[Diagram: the Backend currently hosting Volume V takes Snapshots V.s1 and V.s2 (steps 1 and 3); the Backend selected as standby for Volume V receives V.s1 in full (step 2) and V.s2 incrementally (steps 4 and 5).]

Failing Over to the Warm Standby

1. Current host fails.

2. Clone new Volume V from Snapshot.

3. Select new standby target and repeat prior slide.
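The preparation steps and the failover can be sketched together. This is a toy model under stated assumptions: volumes and snapshots are dictionaries of block number to data, and the `WarmStandby` class and its method names are hypothetical.

```python
def diff(old, new):
    """Blocks that were added or changed between two snapshots."""
    return {k: v for k, v in new.items() if old.get(k) != v}


class WarmStandby:
    def __init__(self, volume):
        self.volume = volume                       # live Volume V on the current host
        self.last_snap = dict(volume)              # 1. snapshot Volume V
        self.standby_snap = dict(self.last_snap)   # 2. full replicate to standby

    def sync(self):
        """Run one periodic cycle; returns the number of blocks transferred."""
        new_snap = dict(self.volume)               # 3. take new snapshot
        delta = diff(self.last_snap, new_snap)     # 4. transfer only the increment
        self.standby_snap.update(delta)            # 5. apply it to get a full image
        self.last_snap = new_snap
        return len(delta)

    def fail_over(self):
        """Clone a new Volume V from the latest snapshot held by the standby."""
        return dict(self.standby_snap)
```

Writes made after the last `sync` are lost on failover, which is the gap between this approach and true mirroring. The sketch also ignores deleted blocks.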


Volume V

Snapshot V.s1


Volume V

1

2

Volume V

Adjacent to proposed Volume Mirroring solution

• Not fully overlapping, but frequently taken snapshots that are replicated incrementally begin to resemble Volume Mirroring.

– It cannot match Volume Mirroring with near-instant relay of transactions.

– But it consumes far fewer network resources, especially peak network resources.

– It is more flexible operationally. There is no need to set up one-to-one mirror relationships in Cinder.

• We can offer both solutions and let end users decide which is best for their needs.

Agenda


• Taskflow Usage

– Maintain Volume Pool / “Warm Standby”

– Optimized provisioning of master image

– Live Migration with Incremental Snapshots



Proposed New Taskflow: Provision boot volume with minimal overhead

• Optimize for common boot images that provide a common OS for many VMs.

• Creating two VMs from the same template should not require 2x the bandwidth.

[Diagram: Glance holds Volume Template VT; the Cinder Backend holds Volume V2 based on Template VT.]


Snapshot optimized Image Provisioning - 1

1. Use Glance to create reference image

– Already adapted for a specific deployment format.

2. Take snapshot of that volume.

3. Clone additional targets from that snapshot.

4. Repeat as more VMs from the same template are launched.
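A minimal sketch of the four steps, assuming a hypothetical `Backend` class that counts how many bytes it pulls from Glance. A real backend would clone copy-on-write; the sketch copies bytes only to stay self-contained.

```python
class Backend:
    """Toy Cinder backend that caches one snapshot per image template."""

    def __init__(self):
        self.template_snapshots = {}   # image id -> snapshot contents
        self.bytes_from_glance = 0

    def provision(self, image_id, image_bytes):
        """Return a new boot volume for the given template."""
        snap = self.template_snapshots.get(image_id)
        if snap is None:
            # 1. First launch: create the reference volume from the Glance image.
            self.bytes_from_glance += len(image_bytes)
            # 2. Snapshot that volume.
            snap = bytes(image_bytes)
            self.template_snapshots[image_id] = snap
        # 3./4. Clone each further volume from the local snapshot.
        return bytearray(snap)
```

Two volumes from the same template cost one image transfer, not two.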

[Diagram: Glance holds Volume Template VT; on the Cinder Backend, Volume V1 is based on Template VT, Snapshot V-Prime is taken from Volume V1 (2), and Volume V2 is cloned from that snapshot (3, 4).]

Agenda


• Taskflow Usage


– Optimized provisioning of master image

– Live Migration with Incremental Snapshots

– Apply policy for Snapshot Retention and Replication


Add live migration using incremental snapshots

• This is essentially how Hypervisors Live Migrate VMs

– Loop

• Make incremental snapshot

• If empty: break

• Send incremental snapshot to destination

– De-activate source volume

– Clone volume at destination from snapshots.
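The loop can be sketched as below. The function name and the `write_batches` argument (client writes that land while each increment is in flight) are assumptions used to simulate ongoing I/O; volumes are dictionaries of block number to data.

```python
def live_migrate(volume, write_batches):
    """Migrate a volume by sending incremental snapshots until nothing changes.

    Returns the destination volume and the number of increments sent.
    """
    destination = {}
    batches = list(write_batches)
    rounds = 0
    while True:
        snapshot = dict(volume)                      # make incremental snapshot
        delta = {k: v for k, v in snapshot.items()
                 if destination.get(k) != v}
        if not delta:                                # if empty: break
            break
        destination.update(delta)                    # send increment to destination
        rounds += 1
        if batches:
            volume.update(batches.pop(0))            # client I/O during the transfer
    # At this point the source would be de-activated and the destination
    # volume cloned from the received snapshots.
    return dict(destination), rounds
```

The loop converges only if writes slow down enough for an increment to come back empty; the finite `write_batches` list guarantees that here.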

Agenda


• Taskflow Usage




– Apply policy for Snapshot Retention and Replication

– Check-in/Checkout Volume

Possible Taskflow: Manage retention/replication of snapshots

• Set a policy for retention of snapshots

– Frequency for taking snapshots.

– Which snapshots to retain.

• Automatically replicate some snapshots to other backend targets.

• Back up some to Object storage.
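One possible shape for such a policy is sketched here. The field names (`keep_latest`, `replicate_every`, `archive_every`) and the selection rules are hypothetical; the slides name only the kinds of decisions a policy makes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPolicy:
    keep_latest: int       # how many of the newest snapshots stay local
    replicate_every: int   # replicate every Nth snapshot to another backend
    archive_every: int     # back up every Nth snapshot to Object storage


def apply_policy(snapshot_ids, policy):
    """Classify snapshots (ordered oldest to newest) into actions."""
    keep = list(snapshot_ids[-policy.keep_latest:]) if policy.keep_latest > 0 else []
    return {
        "keep": keep,
        "delete": [s for s in snapshot_ids if s not in keep],
        "replicate": [s for i, s in enumerate(snapshot_ids)
                      if i % policy.replicate_every == 0],
        "archive": [s for i, s in enumerate(snapshot_ids)
                    if i % policy.archive_every == 0],
    }
```

A snapshot can be both replicated and deleted locally: the replica on the other backend outlives the local copy.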

Agenda


• Taskflow Usage





– Check-in/Checkout Volume

Possible taskflow: Check-out and Check-in of Volumes

• Use Case: Persistent disk images for intermittent computer jobs.

– Example: non-continuous compute job needs disk image near it whenever it is launched.

– Example: VDI desktop needs access to persistent disk image.

• This is especially useful when this is a thin image that relies on the central image for blocks not altered or referenced yet.

– Periodically snapshot and post delta to the central repository.

– Check-in when done with final snapshot.

– Then delete the remote volume and change status in central repository to allow a new check-out.

Steps for Check-out/Check-In.

1. Snapshot the Volume being checked out.

2. Replicate the Snapshot to a Host-Adjacent (or co-located) backend.

3. De-activate the Volume on the Master.

4. Clone the Volume on the host-adjacent backend.

5. Periodically snapshot the Volume on the host-adjacent backend.

6. Replicate those Snapshots to the Master storage backend.

7. Snapshot a final time on the Storage Backend.

8. Replicate to the Master.

9. Remove volume on the host-adjacent backend.

10. Clone new volume on the master backend from the final snapshot.
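The check-out/check-in cycle can be sketched as a small state machine. The `CentralRepository` class and its method names are hypothetical, and volumes are modelled as dictionaries of block number to data.

```python
class CentralRepository:
    """Toy master backend that lends a volume to a host-adjacent backend."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.checked_out = False

    def check_out(self):
        if self.checked_out:
            raise RuntimeError("volume is already checked out")
        self.checked_out = True          # step 3: de-activate on the Master
        return dict(self.blocks)         # steps 1, 2, 4: snapshot, replicate, clone

    def post_delta(self, remote_volume):
        """Steps 5-6: snapshot remotely and replicate the changes back."""
        delta = {k: v for k, v in remote_volume.items()
                 if self.blocks.get(k) != v}
        self.blocks.update(delta)
        return len(delta)

    def check_in(self, remote_volume):
        self.post_delta(remote_volume)   # steps 7-8: final snapshot, replicate
        remote_volume.clear()            # step 9: remove the remote volume
        self.checked_out = False         # step 10: master volume usable again
```

The `checked_out` flag is what prevents a second check-out until the first one is returned.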

[Diagram: Master Storage Backend and Host-Adjacent Storage Backend, each holding Volume V and Snapshots (V.s1 through V.s5 overall), with arrows numbered 1 to 10 for the steps listed.]

Summary

• Taskflow can automate several Cinder-related tasks

– This logic can be vendor neutral

• But to do so efficiently it needs a handful of Cinder enhancements

– Optional separation from Volume Status for long-running activities.

– Snapshot Replication.

– Volume Driver Attributes.

• Wiki: https://wiki.openstack.org/wiki/CERSS

• Questions?

– [email protected], irc:caitlin56.

– [email protected], irc:vito-ordaz.
