Adapting Swift for Tape Storage or other high-latency media
October 27, 2015
Harald Seipp (IBM Systems – Presenter)
Slavisa Sarafijanovic (IBM Research)
Goal
Augment cloud object storage with a low-cost, cold storage tier for archive/backup use cases
Reduced cost
● significantly lower than disk
Reduced availability
● time-to-data on the order of minutes
[Figure: a client application uses the standard REST API of an OpenStack Swift cluster; objects are archived from highly available primary storage (HDD) to low-cost archival storage (high-latency media) and restored back.]
Main Idea
Single Object Storage namespace for Objects on
● Tape or
● Optical Disc or
● SMR or MAID Disk
integrated with a standard disk-based OpenStack Swift installation
Facts about Tape
Tape is 5x-10x cheaper than disk
Tape density scaling and cost are projected to be advantageous over disk for the next 10 years (see 220 TB cartridge demo)
Tape is a mature technology
Tape is already used in today’s cloud offerings
LTFS is a widely adopted standard
[Figure: a client application uses the standard REST API of an OpenStack Swift cluster; objects are archived from highly available primary storage (HDD) to low-cost archival storage (LTFS tape) and restored back.]
Shortcomings to be solved
Time-to-data
● Up to (single-digit) minutes
→ Does not play well with the time-out assumptions of the Swift infrastructure (application/load balancer)
Resource availability
● Few drives per 100s of cartridges
→ Random access (mounts/seeks) can lead to resource congestion
Addressing shortcomings
Swift API for archiving operations
● Support explicit bulk operations (to minimize tape mounts and seeks)
● Store/provide object state (“offline bit”) in a standardized way
● Provide additional error code (“in transit”) upon access of migrated object
Improved timeout management
Configurable Data Ring Auditing
● Support asynchronous tape data verification
Policy-based global cluster object distribution
● Assumption: related data (e.g. container) is likely to be accessed together
Discussed at Vancouver Summit
Reference: https://etherpad.openstack.org/p/liberty-swift-tape-storage
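The benefit of explicit bulk operations can be illustrated with a small sketch. It is not part of the proposal: the `RecallRequest` fields, the cartridge labels and the single-drive mount model are all assumptions made for illustration. The idea is to group pending recalls per cartridge and order each group by tape position, so that each cartridge is mounted once and read in one forward pass.

```python
from collections import defaultdict
from typing import NamedTuple


class RecallRequest(NamedTuple):
    """One object to be read back from tape (all fields are illustrative)."""
    object_name: str
    cartridge: str   # hypothetical cartridge label
    position: int    # hypothetical logical offset on the tape


def plan_recalls(requests):
    """Group requests per cartridge and order each group by tape position.

    Returns a list of (cartridge, [requests...]) pairs.
    """
    by_cartridge = defaultdict(list)
    for req in requests:
        by_cartridge[req.cartridge].append(req)
    return [
        (cartridge, sorted(group, key=lambda r: r.position))
        for cartridge, group in sorted(by_cartridge.items())
    ]


def count_mounts(cartridge_sequence):
    """Count mounts for one drive serving cartridges in the given order."""
    mounts = 0
    loaded = None
    for cartridge in cartridge_sequence:
        if cartridge != loaded:
            mounts += 1
            loaded = cartridge
    return mounts


requests = [
    RecallRequest("cont/a", "T1", 300),
    RecallRequest("cont/b", "T2", 10),
    RecallRequest("cont/c", "T1", 100),
    RecallRequest("cont/d", "T2", 5),
]
plan = plan_recalls(requests)
# Serving requests in arrival order alternates between the two cartridges.
naive_mounts = count_mounts(r.cartridge for r in requests)
# Serving the plan keeps each cartridge loaded until its group is done.
planned_mounts = count_mounts(c for c, group in plan for _ in group)
```

In this toy model the arrival order needs one mount per request, while the grouped plan needs one mount per cartridge.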
Swift archiving API through SwiftILM
Swift API ILM* extensions:
• Migrate (Disk → high-latency media)
• Recall (high-latency media → Disk)
• Query status
Implementation proposal:
• SwiftILM middleware
• Control path to ILM capable backend:
  • (1) Swift EA ←→ file attribute (async)
  • (2) Backend executable (sync/async)
[Figure: Swift with the Swift API and the Swift API ILM extensions on top of an ILM capable backend, exposed as a POSIX file system with a disk cache in front of tape, optical disc and MAID/SMR. The SwiftILM middleware reaches the backend via (1) the Swift EA/file attribute path and (2) a call to a backend executable.]
*Information Lifecycle Management
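The middleware idea can be sketched as a plain WSGI filter. This is a minimal illustration under stated assumptions, not the SwiftILM implementation: the class name `SwiftILMSketch`, the `backend` callable (standing in for control path (2), the backend executable) and the choice of response status codes are all hypothetical, and no Swift libraries are used.

```python
class SwiftILMSketch:
    """Toy WSGI filter: handles ILM query keywords, passes everything else on."""

    ILM_OPERATIONS = {"MIGRATE", "RECALL", "STATUS"}

    def __init__(self, app, backend):
        self.app = app          # next WSGI app in the pipeline
        self.backend = backend  # hypothetical callable(operation, path) -> str

    def __call__(self, environ, start_response):
        operation = environ.get("QUERY_STRING", "").upper()
        if operation not in self.ILM_OPERATIONS:
            # Not an ILM request: hand over to the regular pipeline.
            return self.app(environ, start_response)
        result = self.backend(operation, environ.get("PATH_INFO", ""))
        body = result.encode("utf-8")
        # Status codes are an assumption; the slides do not specify them.
        status_line = "200 OK" if operation == "STATUS" else "202 Accepted"
        start_response(status_line, [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]


def plain_app(environ, start_response):
    """Stand-in for the rest of the Swift proxy pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"regular Swift response"]


def fake_backend(operation, path):
    """Stand-in for the ILM capable backend executable."""
    return f"{operation} queued for {path}"


def call(app, method, path, query=""):
    """Invoke a WSGI app in-process and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": method, "PATH_INFO": path, "QUERY_STRING": query}
    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


app = SwiftILMSketch(plain_app, fake_backend)
migrate_reply = call(app, "POST", "/ACCT/CONT/OBJ", "MIGRATE")
normal_reply = call(app, "GET", "/ACCT/CONT/OBJ")
```

Requests without an ILM keyword pass through untouched, which is what allows such a filter to be added to an otherwise unmodified pipeline.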
SwiftILM API proposal
To migrate a single object, issue the following HTTP POST
http://SWIFT-URL/ACCT/CONT/OBJ?MIGRATE
● Similar GET/HEAD requests for RECALL and STATUS
Bulk operations on container level
http://SWIFT-URL/ACCT/CONT?MIGRATE
...or through regular expressions on Swift namespace
● Get back a request ID for efficient status tracking
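A client-side sketch of how these URLs could be built follows. It constructs requests without sending them. The helper name `ilm_request` is hypothetical, and the mapping of RECALL to GET and STATUS to HEAD is an assumption, since the slide only states POST for MIGRATE and "similar GET/HEAD requests" for the other two.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed keyword-to-method mapping; only POST for MIGRATE is stated.
ILM_METHODS = {"MIGRATE": "POST", "RECALL": "GET", "STATUS": "HEAD"}


def ilm_request(base_url, account, container, obj=None, operation="MIGRATE"):
    """Build (but do not send) a SwiftILM-style request.

    Omitting obj targets the whole container (bulk operation).
    """
    if operation not in ILM_METHODS:
        raise ValueError(f"unknown ILM operation: {operation}")
    parts = [base_url.rstrip("/"), quote(account, safe=""), quote(container, safe="")]
    if obj is not None:
        parts.append(quote(obj))  # keeps "/" so pseudo-folders survive
    url = "/".join(parts) + "?" + operation
    return Request(url, method=ILM_METHODS[operation])


single = ilm_request("http://SWIFT-URL", "ACCT", "CONT", "OBJ")
bulk = ilm_request("http://SWIFT-URL", "ACCT", "CONT")
status = ilm_request("http://SWIFT-URL", "ACCT", "CONT", "OBJ", operation="STATUS")
```

A real client would also add the authentication token header and read the request ID from the response; how that ID is returned is not specified in the slides.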
SwiftILM API proposal – advanced
(Optional) Setting ILM operations through SwiftILM API
● Migration/recall based on object age/size/type etc.
(Optional) Backend-specific additions
● e.g. to control placement to specific library/medium/pool
(Optional) Co-existence with Swift3
● enabling ILM for S3 protocol as well
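Migration based on object age, size and type could be expressed as a simple rule check. The sketch below is purely illustrative: `ObjectInfo`, `MigrationRule` and their fields are hypothetical names, and the slides do not define a rule format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ObjectInfo:
    name: str
    size_bytes: int
    age_days: float
    content_type: str


@dataclass
class MigrationRule:
    """Hypothetical rule: an object qualifies only if all criteria match."""
    min_age_days: float = 0.0
    min_size_bytes: int = 0
    content_type_prefixes: Optional[Tuple[str, ...]] = None  # None = any type

    def matches(self, obj):
        if obj.age_days < self.min_age_days:
            return False
        if obj.size_bytes < self.min_size_bytes:
            return False
        if self.content_type_prefixes is not None:
            # str.startswith accepts a tuple of candidate prefixes.
            return obj.content_type.startswith(self.content_type_prefixes)
        return True


# Example rule: videos older than 90 days and at least 1 MiB in size.
rule = MigrationRule(
    min_age_days=90,
    min_size_bytes=1 << 20,
    content_type_prefixes=("video/",),
)
objects = [
    ObjectInfo("old-video", 5 << 20, 200, "video/mp4"),
    ObjectInfo("new-video", 5 << 20, 3, "video/mp4"),
    ObjectInfo("old-note", 2048, 400, "text/plain"),
]
to_migrate = [o.name for o in objects if rule.matches(o)]
```

The same predicate could in principle drive recall decisions with inverted criteria.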
Add ILM to your existing Swift cluster
Take unmodified Swift
Configure ILM-based Data Ring
Add SwiftILM middleware
Add ILM-capable backend
[Figure: a client application talks to OpenStack Swift through the standard Swift API with SwiftILM extensions (REST). Behind the SwiftILM middleware, a standard disk data ring (replication or erasure code, scale-out) sits next to an ILM-based data ring (replication across nodes, scale-out) whose storage nodes each run an ILM capable backend with a disk cache in front of tape, optical disc and MAID/SMR.]
Join us at the Design Summit or IBM booth for further discussions!
[email protected]: hseipp
Twitter: @HaraldSeipp
http://www.research.ibm.com/labs/zurich/sto/tier_icetier.html