dcache delegated storage solutions · dcache as storage system provides a singlerooted namespace....

21
dCache -  delegated storage solutions Tigran Mkrtchyan for dCache Team ISGC 2016, Taiwan

Upload: others

Post on 21-Feb-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

dCache ­  delegated storage solutionsTigran Mkrtchyan for dCache Team

ISGC 2016, Taiwan

Page 2: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 2

dCache on one slide

Pools(Data Server)

Pools(Data Server)

Door

Message passing layer

JVM JVM JVM

Door(s)(clients entry point) Pool Manager

(requests scheduler)Name Space(MetaData Server)

Pools(Data Server)

DBMSdcap

ftphttpnfs

Page 3: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 3

Usage around the World● ~ 80 installations

● > 50% of WLCG storage

● biggest 22 PB

● Typical ~100x nodes

● Typical ~ 10^7 files

Page 4: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 4

dCache as Storage System● Provides a single­rooted namespace.● Metadata (namespace) and data locations are independent.● Aggregates multipe storage nodes into a single storage system.● Manages data movement, replication, integrity.● Provides data migration between multiple tiers of storage (DISK, 

SSD, TAPE).● Uniquely handles different Authentication mechanisms, like 

x509, Kerberos, login+password, auth tokens.● Provides access to the data via variety of access protocols 

(WebDAV,  NFSv4.1/pNFS, xxxFTP. DCAP, Xrootd, DCAP).

Page 5: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 5

dCache as Storage System● Provides a single­rooted namespace.● Metadata (namespace) and data locations are independent.● Aggregates multipe storage nodes into a single storage system.● Manages data movement, replication, integrity.● Provides data migration between multiple tiers of storage (DISK, 

SSD, TAPE).● Uniquely handles different Authentication mechanisms, like 

x509, Kerberos, login+password, auth tokens.● Provides access to the data via variety of access protocols 

(WebDAV,  NFSv4.1/pNFS, xxxFTP. DCAP, Xrootd, DCAP).

Page 6: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 6

dCache's data management● Automatic migration

● Tape/disk/disk● HotSpot detection● Permanent migration jobs● Checksumming on transfer

● Manual migration● Data replication

● multiple copies● same host/rack/site policy

Page 7: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 7

Software­defined storage (or did you listen Patrick carefully?)

● Abstraction of logical storage services and capabilities from the underlying physical storage systems

● Automation with policy­driven storage provisioning with service­level agreements replacing technology details.

● Commodity hardware with storage logic abstracted into a software layer.

Page 8: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Storage in dCache (what we have)

Block device

Pool service

● dCache provides high level service● Data replication and management core dCache service● Each pool attached to own disks

Block device

Pool service

Block device

Pool service

Block device

Pool service

Block device

Pool service

Replication/Migration

dCache services (Namespace, PoolSelection, Doors, Authn/Authz)

Page 9: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Storage in dCache (outsourcing, phase 1)

Block device

Pool service

● dCache provides high level service● Data replication and management core dCache service● Each pool has it own 'partition' on shared storage● Each 'partition' attached to it's own block device

Block device

Pool service

Block device

Pool service

Block device

Pool service

Block device

Pool service

Replication/Migration

dCache services (Namespace, PoolSelection, Doors, Authn/Authz)

Page 10: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Phase 1 (changing IO layer)

● Single data server owns the data● Single data server manages data

● flush to tape● restore from tape● removal● garbage collection

Page 11: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Replication/Migration

Storage in dCache (outsourcing, phase 2)

Block device

Pool service

● dCache provides high level service● All pool see all 'partition' on shared storage● Any pool can deliver data from any partition● Object store takes care about replication

Block device

Pool service

Block device

Pool service

Block device

Pool service

Block device

Pool service

dCache services (Namespace, PoolSelection, Doors, Authn/Authz)

Page 12: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Phase 2 (Changing core philosophy)

● All data managed by 'quorum'● group decision who interact with tape● group decision who/when file is removed● File location is always 'known'

Page 13: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Replication/MigrationReplication/Migration

Storage in dCache (outsourcing, phase 3)

Block device

● dCache provides high level service● dCache can move data between regular and OS pools

Block device Block device Block deviceBlock device

dCache services (Namespace, PoolSelection, Doors, Authn/Authz)

Replication/Migration

Pool service Pool service Pool service Pool servicePool service

Page 14: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

Phase 3 (mixed environment)

● Mixed setup● Islands of storage servers● Replication and data movement between 

islands 

Page 15: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 15

Why CEPH

● No specific hardware support● Runs on commodity hardware● Scalable to exabytes of data ● Deployed at sites as storage system for 

OpenStack● Provides Object, Block and File interfaces

Page 16: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 16

And not only CEPH

● Other object store can be adopted● DDN WOS

● Swift/S3/CDMI● Cluster file systems (as a side effect)

● Luster● GPFS● GlusterFS

Page 17: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 17

CEPH (extremely simplified)● OSD ~ a physical disk● CRUSH - determines how to store

and retrieve data by computing data storage locations.

● RADOS - distributes objects across the storage cluster and replicates objects

● librados - provides low-level access to the RADOS service.

OSD OSD OSD

CRUSH

RADOS

LIBRADOS

APP RDBCEPH

FS

Page 18: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 18

Current work

● Functional prototype only● Focus on stability first● RBD based

● striping● alterable content

● Object interface will be evaluated as well

Page 19: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 19

Roadmap● Phase 1

● running prototype is available today● some sites volunteer to help with testing

● cleaning up to make generally available● Phase 2/3

● depends on user demand● operational overhead, if any●  support overhead, if any

Page 20: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 20

Summary

● dCache is demanded storage system.● New technology provides required building 

blocks.● Combination on both makes us to 

concentrate on missing parts.● Working prototype available for testing. 

Page 21: dCache delegated storage solutions · dCache as Storage System Provides a singlerooted namespace. Metadata (namespace) and data locations are independent. Aggregates multipe storage

  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 21

Links

● https://www.dcache.org/● https://en.wikipedia.org/wiki/Software­def

ined_storage● http://ceph.com/