
LUSTRE/HSM BINDING IS THERE!

SEPTEMBER 17th, 2013

LAD'13 | Aurélien Degrémont <[email protected]>


AGENDA

Presentation

Architecture

Components

Examples

Project status


PRESENTATION


PRESENTATION (1/3)

A long-awaited project!

- This project started several years ago.
- It has passed through every company that has worked on Lustre.
- After many modifications and rewrites, it is finally there!

It has landed!

- Partially landed in Lustre 2.4.
- Fully included in Lustre 2.5, which will be available at the end of October 2013.


PRESENTATION (2/3)

Principle

HSM seamless integration

[Figure: groups of Lustre clients in front of an HSM backend]

Take the best of each world:

Lustre: a high-performance disk cache in front of the HSM

- Parallel filesystem
- High I/O performance
- POSIX access

HSM: long-term data storage

- Manages a large number of cheaper disks and tapes
- Huge storage capacity

Ideal for a center-wide Lustre filesystem.


PRESENTATION (3/3)

Features

- Migrate data to the HSM (Archive)
- Free disk space when needed (Release)
- Bring back data on cache miss (Restore)

- Policy management (migration, purge, soft removal, …)
- Import from an existing backend
- Disaster recovery (restore the Lustre filesystem from the backend)

New components

- Copy tool (backend-specific user-space daemon)
- Policy Engine (user-space daemon)
- Coordinator


ARCHITECTURE


ARCHITECTURE (1/2)

New components: Coordinator, Agent and Copy tool

The coordinator gathers archive requests and dispatches them to agents.

An agent is a Lustre client that runs a copytool to transfer data between Lustre and the HSM.

[Figure: the Lustre world (clients, OSSs, and the MDS hosting the Coordinator) linked to the HSM world (archiving tools) by "Agent" clients, each running a copy tool that speaks the HSM protocols]

ARCHITECTURE (2/2)

PolicyEngine manages Archive and Release policies

A user-space tool that communicates with the MDT and the Coordinator.

It watches filesystem changes.

It triggers actions such as archive, release, and removal in the backend.

[Figure: a client running the PolicyEngine, alongside the clients, the OSSs, and the MDS hosting the Coordinator]

COMPONENTS


COMPONENTS (1/4)

Copytool

- It is the interface between Lustre and the HSM.
- It reads and writes data between them. It is HSM-specific.
- It runs on a standard Lustre client (called an Agent).

Two copytools are already available:

- POSIX copytool. It can be used with any system supporting a POSIX interface. It is provided with Lustre.
- HPSS copytool (HPSS 7.3.2+). A CEA development which will be freely available to all HPSS sites.

More supported HSMs to come:

- DMF (SGI)
- OpenArchive (GRAU DATA)

[Figure: a Lustre client "Agent" running a copy tool that speaks the HSM protocols to the HSM world]

COMPONENTS (2/4)

Coordinator

An MDS thread which coordinates HSM-related actions:

- Centralizes HSM-related requests.
- Ignores duplicate requests.
- Controls the migration flow.
- Dispatches requests to copytools.
- Requests are saved and replayed if the MDT crashes.
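The behavior listed above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre code: the class `ToyCoordinator`, its method names, the round-robin dispatch, and the `max_in_flight` limit (a crude stand-in for flow control) are all hypothetical. Saving and replaying requests after an MDT crash is not modeled.

```python
from collections import deque


class ToyCoordinator:
    """Toy model of request centralization, deduplication, and dispatch.

    Hypothetical illustration only; the real Coordinator is an MDS thread.
    """

    def __init__(self, copytools, max_in_flight=2):
        self.copytools = list(copytools)    # hypothetical copytool names
        self.max_in_flight = max_in_flight  # crude stand-in for flow control
        self.pending = deque()              # queued (action, file_id) requests
        self.in_flight = {}                 # (action, file_id) -> copytool
        self._next = 0                      # round-robin cursor

    def submit(self, action, file_id):
        """Queue a request; return False if an identical one is already known."""
        key = (action, file_id)
        if key in self.in_flight or key in self.pending:
            return False
        self.pending.append(key)
        return True

    def dispatch(self):
        """Hand queued requests to copytools, up to the in-flight limit."""
        started = []
        while self.pending and len(self.in_flight) < self.max_in_flight:
            key = self.pending.popleft()
            tool = self.copytools[self._next % len(self.copytools)]
            self._next += 1
            self.in_flight[key] = tool
            started.append((key, tool))
        return started

    def complete(self, action, file_id):
        """Mark a request as finished so that another one can start."""
        self.in_flight.pop((action, file_id), None)
```

Submitting the same `("archive", "foo")` request twice queues it only once, and `dispatch()` never starts more than `max_in_flight` requests until `complete()` frees a slot.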

[Figure: requests reach the Coordinator on the MDS, which dispatches them to several copy tools]

COMPONENTS (3/4)

DNE compatible

The Distributed Namespace (DNE) feature, introduced in Lustre 2.4, is compatible with Lustre/HSM, with the following constraints:

- One Coordinator for each MDT. Each Coordinator only cares about the files of its own MDT.
- Every copytool connects to every Coordinator. There is no cluster-wide load balancing, though.

The implementation is currently suboptimal and is to be improved in the future.

[Figure: two MDTs, each with its own Coordinator, both connected to the same set of copy tools]

COMPONENTS (4/4)

Policy Engine: RobinHood

PolicyEngine is the specification; RobinHood is an implementation:

- It was first a user-space daemon for monitoring and purging large filesystems.
- CEA open-source development (http://robinhood.sf.net)
- Requires RobinHood 2.4.3+

Policies

- File class definitions, associated with policies
- Based on file attributes (path, size, owner, age, xattrs, …)
- Rules can be combined with boolean operators
- LRU-based migration/purge policies
- Entries can be white-listed


EXAMPLES


EXAMPLES (1/4)

Setup

Requirements:

- Standard Lustre v2.5 (so far, the current master branch), sources or RPMs
- RobinHood v2.4.3+ sources, from the RobinHood website (no RPMs available yet)

Simple configuration (theoretically, one Lustre node is enough)

[Figure: simple setup with a client, an OSS, and an MDS; an "Agent" client running the POSIX copy tool against an HSM with a POSIX interface; RobinHood with its MySQL database, using file accesses and lfs]

EXAMPLES (2/4)

Command line tools

Sysadmins and users can manage file system states:

Archive:

$ lfs hsm_archive /mnt/lustre/foo

$ lfs hsm_state /mnt/lustre/foo
/mnt/lustre/foo: (0x00000009) exists archived, archive_id:1

Release:

$ lfs hsm_release /mnt/lustre/foo

$ lfs hsm_state /mnt/lustre/foo
/mnt/lustre/foo: (0x0000000d) released exists archived, archive_id:1

Automatic restore:

$ md5sum /mnt/lustre/foo
ded5b0680e566aa024d47ac53e48cdac  /mnt/lustre/foo

$ lfs hsm_state /mnt/lustre/foo
/mnt/lustre/foo: (0x00000009) exists archived, archive_id:1
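The hexadecimal value printed by `lfs hsm_state` is a bitmask of the state names that follow it. As a hedged illustration, the sketch below decodes the two masks shown above. The bit values are an assumption: they are consistent with the 0x9 and 0xd outputs on this slide and with my recollection of Lustre's user-space header, but they should be checked against your Lustre version. The helper names `decode_hsm_state` and `parse_hsm_state_line` are hypothetical.

```python
import re

# Assumed bit values (consistent with the 0x9 and 0xd masks shown above);
# verify against the Lustre headers of your version before relying on them.
HSM_FLAGS = [
    (0x1, "exists"),
    (0x2, "dirty"),
    (0x4, "released"),
    (0x8, "archived"),
]


def decode_hsm_state(mask):
    """Return the names of the known flags set in an hsm_state bitmask."""
    return [name for bit, name in HSM_FLAGS if mask & bit]


def parse_hsm_state_line(line):
    """Extract (path, mask) from one line of `lfs hsm_state` output."""
    match = re.match(r"^(.*): \((0x[0-9a-fA-F]+)\)", line)
    if match is None:
        raise ValueError("unrecognized hsm_state line: %r" % line)
    return match.group(1), int(match.group(2), 16)


if __name__ == "__main__":
    sample = "/mnt/lustre/foo: (0x0000000d) released exists archived, archive_id:1"
    path, mask = parse_hsm_state_line(sample)
    print(path, decode_hsm_state(mask))
```

Read this way, the release step only adds the "released" bit (0x9 becomes 0xd), and reading the file with `md5sum` clears it again once the data has been restored.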


EXAMPLES (3/4)

Example RobinHood policy: Migration

Migrate files older than 12 hours, with a different behavior for small ones.

Filesets {
    FileClass small_files {
        definition { tree == "/mnt/lustre/project" and size < 1MB }
        migration_hints = "cos=12" ;
        ...
    }
}

Migration_Policies {
    ignore { size == 0 or xattr.user.no_copy == 1 }
    ignore { tree == "/mnt/lustre/logs" and name == "*.log" }

    policy migrate_small {
        target_fileclass = small_files;
        condition { last_mod > 6h or last_archive > 1d }
    }
    ...
    policy default {
        condition { last_mod > 12h }
        migration_hints = "cos=42" ;
    }
}
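To make the rule ordering concrete, here is a minimal Python sketch of how I read this policy: ignore rules first, then the policy attached to the matching file class, then the default policy. It is not RobinHood code. The `Entry` type, its field names, and the `migration_hints` function are hypothetical; treating `1MB` as 1024 × 1024 bytes, treating `last_mod > 6h` as "last modified more than 6 hours ago", and not falling through to the default policy for entries of `small_files` are assumptions. The parts elided with `...` in the slide are not modeled.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

HOUR = 3600
DAY = 24 * HOUR
MB = 1024 * 1024  # assumption: binary megabyte


@dataclass
class Entry:
    """Hypothetical file record used only for this illustration."""
    path: str
    size: int            # bytes
    mod_age: float       # seconds since last modification
    archive_age: float   # seconds since last archive
    xattrs: dict = field(default_factory=dict)


def in_tree(path, root):
    """True if path is root itself or lies below it."""
    return path == root or path.startswith(root.rstrip("/") + "/")


def migration_hints(e):
    """Return the hints string if the entry should be archived, else None."""
    name = e.path.rsplit("/", 1)[-1]
    # ignore { size == 0 or xattr.user.no_copy == 1 }
    if e.size == 0 or str(e.xattrs.get("user.no_copy")) == "1":
        return None
    # ignore { tree == "/mnt/lustre/logs" and name == "*.log" }
    if in_tree(e.path, "/mnt/lustre/logs") and fnmatchcase(name, "*.log"):
        return None
    # policy migrate_small, targeting the small_files class
    if in_tree(e.path, "/mnt/lustre/project") and e.size < MB:
        if e.mod_age > 6 * HOUR or e.archive_age > DAY:
            return "cos=12"
        return None
    # policy default
    if e.mod_age > 12 * HOUR:
        return "cos=42"
    return None
```

For instance, a 1000-byte file under `/mnt/lustre/project` modified 7 hours ago yields `"cos=12"`, while a large file elsewhere is only selected, with `"cos=42"`, once it has been unmodified for more than 12 hours.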


EXAMPLES (4/4)

Example RobinHood policy: Release

Release archived files when filesystem usage is above 90%, but ignore some files.

Purge_trigger {
    trigger_on = ost_usage;
    high_watermark_pct = 90%;
    low_watermark_pct = 80%;
}

Purge_Policies {
    ignore { size < 1KB or owner == "root" }

    policy purge_quickly {
        target_fileclass = class_foo;
        condition { last_access > 1min }
    }

    ...

    policy default {
        condition { last_access > 1h }
    }
}
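The watermark logic can be sketched as follows, under my reading of the configuration: a purge starts once usage exceeds the high watermark and aims to bring usage back down to the low watermark, releasing the least recently accessed eligible files first. This is an illustrative sketch, not RobinHood's implementation. The function names and the tuple layout are hypothetical, `1KB` is assumed to be 1024 bytes, and the `purge_quickly` policy is left out because the definition of `class_foo` is not shown.

```python
def bytes_to_release(used, total, high_pct=90.0, low_pct=80.0):
    """Return how many bytes to free, or 0 if usage is not above the high watermark."""
    if used * 100 <= high_pct * total:
        return 0
    return used - int(total * low_pct / 100)


def eligible(size, owner, idle_seconds):
    """Apply the ignore rule and the default policy condition from the example."""
    if size < 1024 or owner == "root":  # ignore { size < 1KB or owner == "root" }
        return False
    return idle_seconds > 3600          # condition { last_access > 1h }


def pick_release_candidates(files, needed):
    """Choose files to release, least recently accessed first.

    files: iterable of (path, size, owner, idle_seconds) tuples, where
    idle_seconds is the time since last access.
    """
    chosen, freed = [], 0
    candidates = [f for f in files if eligible(f[1], f[2], f[3])]
    for path, size, _owner, _idle in sorted(candidates, key=lambda f: f[3], reverse=True):
        if freed >= needed:
            break
        chosen.append(path)
        freed += size
    return chosen
```

With 95 units used out of 100, `bytes_to_release(95, 100)` asks for 15 units, the amount needed to return to the 80% low watermark. In a real deployment only files already archived in the HSM can be released.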


PROJECT STATUS

The client side landed in Lustre 2.4

- Only supports accesses from compute nodes
- No administrative tasks
- Does not support copytools

The full code has landed in the current master branch

- Thanks to Intel, the whole code is now landed
- ETA: end of October 2013
- Will be available in Lustre 2.5, which will be the next maintenance branch

Currently under test and being debugged


Thanks. Questions?