distributed file system

40
Comparison of different distributed file systems for use with Samba/CTDB @SambaXP’09 Henning Henkel April 23, 2009 H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 1 / 35

Upload: sivakumarkp

Post on 23-Nov-2014

152 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Distributed File System

Comparison of different distributed file systems foruse with Samba/CTDB

@SambaXP’09

Henning Henkel

April 23, 2009

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 1 / 35

Page 2: Distributed File System

Agenda

Agenda

1 Introduction

2 Theoretical background

3 Practical part

4 The results

5 Conclusion

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 2 / 35

Page 3: Distributed File System

Agenda

Agenda

1 Introduction

2 Theoretical background

3 Practical part

4 The results

5 Conclusion

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 2 / 35

Page 4: Distributed File System

Agenda

Agenda

1 Introduction

2 Theoretical background

3 Practical part

4 The results

5 Conclusion

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 2 / 35

Page 5: Distributed File System

Agenda

Agenda

1 Introduction

2 Theoretical background

3 Practical part

4 The results

5 Conclusion

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 2 / 35

Page 6: Distributed File System

Agenda

Agenda

1 Introduction

2 Theoretical background

3 Practical part

4 The results

5 Conclusion

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 2 / 35

Page 7: Distributed File System

Introduction What is this all about?

Introduction

Diploma study in Computer Networking at the FurtwangenUniversity (HFU) for applied scienceDiploma thesis at the science + computing ag in TübingenSupervising tutors:→ Prof. Dr. Christoph Reich (Furtwangen University)→ Dipl.-Phys. Daniel Kobras (science + computing ag)

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 3 / 35

Page 8: Distributed File System

Introduction What is this all about?

What were the goals of the diploma thesis?

In the context of the diploma thesis was tested . . .. . . which features should be provided by a distributed file systemto use it with Samba/CTDB. . . what the differences between IBM’s GPFS, RedHat’s GFS andSun’s Lustre are when used with Samba/CTDB

Not tested in the context of the diploma thesis are . . .. . . the fencing mechanisms provided by Samba/CTDB. . . the cluster management provided by Samba/CTDB

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 4 / 35

Page 9: Distributed File System

Theoretical background Dissociation

What is pCIFS?

In the Samba/CTDB contextParallel CIFS servers as a CTDB layer between CIFS Clients anddistributed file systemsOne Client is connected to only one CIFS Server.There is no need for modifications on the client side.

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 5 / 35

Page 10: Distributed File System

Theoretical background Dissociation

What is pCIFS?

In the lustre contextA set of parallel CIFS servers provied access to the lustre filesystem.One client can connect to multiple CIFS Servers.

Advantage: A single client might reach the maximum throughput.But there are also major disadvantages:

There is the need for a special CIFS client software.The client software is only for one specially picked file system.

I’m not aware of a product ready implementation.

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 6 / 35

Page 11: Distributed File System

Theoretical background Dissociation

What is a distributed file system?

Appl. A

Local OS 1

Application B

Local OS 2 Local OS 3

Appl. C

Local OS 4

Distributed system layer (middleware)

Computer 2Computer 1 Computer 3 Computer 4

Network

Figure: Distributed file systems are a middelware

Source: Distributed Systems - Principles and Paradigms, Tanenbaum 2007

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 7 / 35

Page 12: Distributed File System

Theoretical background Dissociation

Microsoft’s DFS

CIFS­Client1

CIFS­Client2

CIFS­Client3

CIFS­ClientN

CIFS­ClientN+1

Network

Network

CIFS­Server1

CIFS­Server2

CIFS­Server3

CIFS­ServerN

CIFS­ServerN+1

DFS

File A

File D

File Q File H

File A

File D

File Q File H

File X

File X

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 8 / 35

Page 13: Distributed File System

Theoretical background Accessing a distributed file system

Access without CTDB

Distributed Filesystem

CIFS­Client1

CIFS­Client2

CIFS­Client3

CIFS­ClientN

CIFS­ClientN+1

Network

Network

Cluster­Node1

HD1 HD2

Cluster­Node2

HD1 HD2

Cluster­Node3

HD1 HD2

Cluster­NodeN

HD1 HD2

Cluster­Node N+1

HD1 HD2

Samba­Server

FS­Client

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 9 / 35

Page 14: Distributed File System

Theoretical background Accessing a distributed file system

Access with CTDB

Distributed Filesystem

CIFS­Client1

CIFS­Client2

CIFS­Client3

CIFS­ClientN

CIFS­ClientN+1

Network

Network

Cluster­Node1

HD1 HD2

Cluster­Node2

HD1 HD2

Cluster­Node3

HD1 HD2

Cluster­NodeN

HD1 HD2

Cluster­Node N+1

HD1 HD2

Samba­Server1

FS­Client1

Samba­ServerN

FS­ClientN

Samba­ServerN+1

FS­ClientN+1

Clustered Trivial Database

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 10 / 35

Page 15: Distributed File System

Theoretical background Accessing a distributed file system

The test candidates

(FrauenhoferFS (FhGFS))IBM’s General Parallel File System (GPFS)Sun’s LustreRed Hat’s Global File System (GFS)

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 11 / 35

Page 16: Distributed File System

Practical part The test enviroment

FhGFS

Project at the Frauenhofer Instituts für Techno- undWirtschaftsmathematik (ITWM), Competence Center for HighPerformance Computing.It is a quite young distributed file system.Easy to install and configure.According to the specifcations of the producer it scales as good asSun’s Lustre and reaches a higher throughput.

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 12 / 35

Page 17: Distributed File System

Practical part The test enviroment

FhGFS

Figure: The FhGFS Architecture

Source: FraunhoferFS User Guide, online

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 13 / 35

Page 18: Distributed File System

Practical part The test enviroment

GPFS

available since 1998 for AIXfile management infrastructureproprietary softwaremost tested with Samba/CTDB by the Samba team

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 14 / 35

Page 19: Distributed File System

Practical part The test enviroment

GPFS

Figure: Accessing a NSD in GPFS

Source: GPFS cluster configurations, online

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 15 / 35

Page 20: Distributed File System

Practical part The test enviroment

GPFS - test assembly

GPFS

Storage Target 1

Storage Target 2

CIFS Client

GPFS Client

Samba CTDB

HD

HD

Netzwerk

Netzwerk

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 16 / 35

Page 21: Distributed File System

Practical part The test enviroment

GFS

developed as part of a thesis at the university of minnesotalicensed under GPL since 2004GFS2 as the future successor

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 17 / 35

Page 22: Distributed File System

Practical part The test enviroment

GFS

Figure: Global File System used with a SAN

Source: Red Hat Cluster Suite Overview: Red Hat Cluster Suite for Red Hat Enterprise Linux, online

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 18 / 35

Page 23: Distributed File System

Practical part The test enviroment

GFS - test assembly

CTDB

ISCSI Target2

ISCSI Target1

GFS

Node 1 Node 2

CIFS Client

HD HD

Netzwerk

Netzwerk

Netzwerk

Samba CTDB Samba CTDB

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 19 / 35

Page 24: Distributed File System

Practical part The test enviroment

Lustre

developed as part of a research project at the Carnegie MellonUniversity in 1999since October 2007 part of sun’s portfoliolicensed under GPL

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 20 / 35

Page 25: Distributed File System

Practical part The test enviroment

Lustre

Figure: The Lustre Clustre Architcture

Source: LUSTRE FILE SYSTEM. – Whitepaper, online

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 21 / 35

Page 26: Distributed File System

Practical part The test enviroment

Lustre - test assembly

Lustre

MDS

OSS 1

OSS 2

CIFS Client

Lustre Client

Samba CTDB

HD

HD

LNET

Netzwerk

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 22 / 35

Page 27: Distributed File System

The results

Provided file system features

Table: Features provided by the distributed file systems for CTDB

Locking unique FileId-Mappingfile system (Posix/BSD) Inode-Number (fsid/fsname)GFS yes/yes yes yes/yesGPFS yes/yes yes yes/yesLustre yes/yesa yes yesb/yesFhGFS -/- - -/-

aWith flock as mount optionbWith Lustre Version 1.6.2

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 23 / 35

Page 28: Distributed File System

The results PingPong

PingPong

Table: PingPong results - lock coherence

GFS GPFS LustreLocks/Sec Locks/Sec Locks/Sec

1 node 98 264.072 5.4612 nodes 98 2.249 3.655

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 24 / 35

Page 29: Distributed File System

The results PingPong

PingPong

Table: PingPong results - I/O coherence

GFS GPFS LustreLocks/Sec Locks/Sec Locks/Sec

1 node 97 117.142 5.1772 nodes 13 233 83

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 25 / 35

Page 30: Distributed File System

The results PingPong

PingPong

Table: PingPong results - mmap coherence

GFS GPFS LustreLocks/Sec Locks/Sec Locks/Sec

1 node 98 195.533 5.5592 nodes 31 242 124

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 26 / 35

Page 31: Distributed File System

The results bonnie++

bonnie++

Measured speed on distributed file systems is slower then on localdevicesMeasured speed on distributed file systems over Samba/CTDB isonce again slowerbonnie++ benchmark failed with lustre over Samba/CTDB

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 27 / 35

Page 32: Distributed File System

The results smbclient

smbclient

Table: Results - reading and writing with smbclient

Dateisystem Read (MiB/Sec) Write (MiB/Sec)GFS 11,78 9,49GPFS 1HD 16.53 58.51GPFS 2HD 32,61 61,88Lustre 1HD 81,45 39,85Lustre 2HD 67,57 39,18

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 28 / 35

Page 33: Distributed File System

The results Microsoft W2k3 - robocopy

Microsoft W2k3 - robocopy

Table: Windows 2003 Server as a client

Read Write Read to writeGFS 21,65 MiB/Sec 21,61 MiB/Sec 7,05 MiB/SecGPFS 2HD 14,2 MiB/Sec 35,81 MiB/Sec 5,18 MiB/SecLustre 1HD 22,77 MiB/Sec 20,61 MiB/Sec 5,83 MiB/SecLustre 2HD 23,75 MiB/Sec 20,63 MiB/Sec 2,85 MiB/Sec

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 29 / 35

Page 34: Distributed File System

The results IOZone

IOZone

Distributed file system access achieved nearly the theoreticalnetwork bandwithAccess over Samba/CTDB with iozone was limited to ca. 50MB/Sec reading and writing with cifs-kernel-modulWindows version of IOZone was not used due to a bug in lustre

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 30 / 35

Page 35: Distributed File System

The results dbench & smbtorture

dbench - writing 1 GiB files

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 320

20

40

60

80

100

120

140

GFS GPFS 1HD GPFS 2HD Lustre 1HD Lustre 2HD

Anzahl Threads

akku

mul

ierte

r Dur

chsa

tz a

ller T

hrea

ds in

 MB

/Sek

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 31 / 35

Page 36: Distributed File System

The results dbench & smbtorture

smbtorture - writing 1 GiB files

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 320

10

20

30

40

50

60

70

80

90

100

GFS GPFS 1HD GPFS 2HD Lustre 1HD Lustre 2HD

Anzahl Threads

.ak

kum

ulie

rter

 Dur

chsa

tz a

ller 

Thr

eads

 in M

B/S

ek

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 32 / 35

Page 37: Distributed File System

The results dbench & smbtorture

smbtorture & dbench altogether

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 320

20

40

60

80

100

120

140

GPFS smbtorture GPFS dbench Lustre smbtorture Lustre dbench

Anzahl Threads

akku

mul

ierte

r Dur

chsa

tz a

ller T

hrea

ds in

 MB

/Sek

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 33 / 35

Page 38: Distributed File System

Conclusion

Conclusion

The locks/sec are heavily depending on the distributed file system.With concurrent access the locks/sec drop.⇒ Higher latencyThere could be many reasons for more latency, like highernetwork latency, seek latencies, more managemet overhead withmore clients and so on...Throughput with Samba also depends on the cifs implementationof the clientAccording to my tests one client alone could not reach themaximum throughput with Samba/CTDB

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 34 / 35

Page 39: Distributed File System

Conclusion

Questions ?

Thanks for your attention!

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 35 / 35

Page 40: Distributed File System

Conclusion

Questions ? Thanks for your attention!

H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 35 / 35