Comparison of different distributed file systems for use with Samba/CTDB
@SambaXP’09
Henning Henkel
April 23, 2009
H. Henkel () Samba/CTDB & different distributed fs April 23, 2009 1 / 35
Agenda
1 Introduction
2 Theoretical background
3 Practical part
4 The results
5 Conclusion
Introduction What is this all about?
Introduction
Diploma study in Computer Networking at Furtwangen University of Applied Sciences (HFU)
Diploma thesis at science + computing ag in Tübingen
Supervising tutors:
→ Prof. Dr. Christoph Reich (Furtwangen University)
→ Dipl.-Phys. Daniel Kobras (science + computing ag)
What were the goals of the diploma thesis?
Tested in the context of the diploma thesis:
→ which features a distributed file system has to provide to be used with Samba/CTDB
→ what the differences between IBM’s GPFS, Red Hat’s GFS and Sun’s Lustre are when used with Samba/CTDB
Not tested in the context of the diploma thesis:
→ the fencing mechanisms provided by Samba/CTDB
→ the cluster management provided by Samba/CTDB
Theoretical background Dissociation
What is pCIFS?
In the Samba/CTDB context:
→ Parallel CIFS servers as a CTDB layer between CIFS clients and distributed file systems
→ One client is connected to only one CIFS server.
→ There is no need for modifications on the client side.
In the Lustre context:
→ A set of parallel CIFS servers provides access to the Lustre file system.
→ One client can connect to multiple CIFS servers.
Advantage: A single client might reach the maximum throughput.
But there are also major disadvantages:
→ Special CIFS client software is needed.
→ The client software works only with one specific file system.
I’m not aware of a production-ready implementation.
What is a distributed file system?
[Diagram: applications A, B and C run on a distributed system layer (middleware) that sits on top of local OS 1 to 4 on computers 1 to 4, which are connected by a network]
Figure: Distributed file systems are middleware
Source: Distributed Systems - Principles and Paradigms, Tanenbaum 2007
Microsoft’s DFS
[Diagram: CIFS clients 1 to N+1, a network, CIFS servers 1 to N+1, and DFS; files A, D, H, Q and X appear both on the servers and in DFS]
Theoretical background Accessing a distributed file system
Access without CTDB
[Diagram: CIFS clients 1 to N+1 connect over the network to a single Samba server with one file system client; the distributed file system consists of cluster nodes 1 to N+1, each with disks HD1 and HD2]
Access with CTDB
[Diagram: CIFS clients 1 to N+1 connect over the network to Samba servers 1 to N+1, each with its own file system client and joined by the Clustered Trivial Database (CTDB); the distributed file system consists of cluster nodes 1 to N+1, each with disks HD1 and HD2]
The test candidates
(FraunhoferFS (FhGFS))
IBM’s General Parallel File System (GPFS)
Sun’s Lustre
Red Hat’s Global File System (GFS)
Practical part The test environment
FhGFS
Project at the Fraunhofer-Institut für Techno- und Wirtschaftsmathematik (ITWM), Competence Center for High Performance Computing.
It is a quite young distributed file system.
Easy to install and configure.
According to the developers’ specifications, it scales as well as Sun’s Lustre and reaches a higher throughput.
Figure: The FhGFS Architecture
Source: FraunhoferFS User Guide, online
GPFS
available since 1998 for AIX
file management infrastructure
proprietary software
most tested with Samba/CTDB by the Samba team
Figure: Accessing an NSD in GPFS
Source: GPFS cluster configurations, online
GPFS - test assembly
[Diagram: a CIFS client; a node running Samba and CTDB together with the GPFS client; GPFS with storage targets 1 and 2, each with one hard disk (HD); the components are connected by two networks]
GFS
developed as part of a thesis at the University of Minnesota
licensed under the GPL since 2004
GFS2 as the future successor
Figure: Global File System used with a SAN
Source: Red Hat Cluster Suite Overview: Red Hat Cluster Suite for Red Hat Enterprise Linux, online
GFS - test assembly
[Diagram: a CIFS client; nodes 1 and 2, each running Samba and CTDB, sharing one GFS file system; iSCSI targets 1 and 2, each with one hard disk (HD); the components are connected by three networks]
Lustre
developed as part of a research project at Carnegie Mellon University in 1999
part of Sun’s portfolio since October 2007
licensed under the GPL
Figure: The Lustre Cluster Architecture
Source: LUSTRE FILE SYSTEM. – Whitepaper, online
Lustre - test assembly
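[Diagram: a CIFS client; a node running Samba and CTDB together with the Lustre client; Lustre with one MDS and with OSS 1 and OSS 2, each with one hard disk (HD); the components are connected by a network and by LNET]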
The results
Provided file system features
Table: Features provided by the distributed file systems for CTDB
File system   Locking (POSIX/BSD)   Unique inode number   FileId mapping (fsid/fsname)
GFS           yes/yes               yes                   yes/yes
GPFS          yes/yes               yes                   yes/yes
Lustre        yes/yes (a)           yes                   yes (b)/yes
FhGFS         -/-                   -                     -/-

(a) With flock as mount option
(b) With Lustre version 1.6.2
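The features in this table can be probed roughly from a single node. The following is a minimal sketch, not the method used in the thesis: the function name `probe_directory` and its result keys are made up for illustration. It only checks that POSIX (fcntl) and BSD (flock) lock calls succeed on a scratch file and reports the inode and device numbers seen by this node; it says nothing about whether locks are coherent or inode numbers are unique across cluster nodes.

```python
import fcntl
import os
import tempfile


def probe_directory(path):
    """Try POSIX and BSD locks on a scratch file in `path` and report its IDs.

    Hypothetical helper for illustration; single-node check only.
    """
    result = {}
    fd, name = tempfile.mkstemp(dir=path)
    try:
        os.write(fd, b"x")
        # POSIX byte-range lock on the first byte (fcntl-based).
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)
            result["posix_lock"] = True
        except OSError:
            result["posix_lock"] = False
        # BSD whole-file lock (flock-based).
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.flock(fd, fcntl.LOCK_UN)
            result["bsd_lock"] = True
        except OSError:
            result["bsd_lock"] = False
        # Inode and device number as reported to this node.
        st = os.fstat(fd)
        result["inode"] = st.st_ino
        result["device"] = st.st_dev
    finally:
        os.close(fd)
        os.unlink(name)
    return result


if __name__ == "__main__":
    # Replace the temp directory with a mount point of the file system under test.
    print(probe_directory(tempfile.gettempdir()))
```

Running the same probe on every node against the same directory and comparing the reported inode numbers of one shared file would be a natural next step.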
The results PingPong
PingPong
Table: PingPong results - lock coherence
          GFS         GPFS        Lustre
          Locks/sec   Locks/sec   Locks/sec
1 node    98          264,072     5,461
2 nodes   98          2,249       3,655
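To make the locks/sec figure concrete, here is a simplified sketch of a lock-cycling loop in the spirit of the PingPong test. It is not the actual tool: the function `lock_rate` and its parameters are assumptions for illustration, and the I/O and mmap coherence modes are not covered. Each iteration takes an exclusive POSIX lock on the next byte and then releases the byte held before, so a second process running the same loop on the same file forces the locks to bounce between the two.

```python
import fcntl
import os
import tempfile
import time


def lock_rate(path, num_locks=3, duration=0.2):
    """Cycle exclusive one-byte locks over `num_locks` offsets; return locks/sec.

    Hypothetical helper: a stand-in for illustration, not the real test program.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        count = 0
        i = 0
        # Start by holding the lock on byte 0.
        fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
        start = time.monotonic()
        while time.monotonic() - start < duration:
            nxt = (i + 1) % num_locks
            # Take the next byte first (blocks while another process holds it) ...
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, nxt, os.SEEK_SET)
            # ... then release the byte held so far.
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, i, os.SEEK_SET)
            i = nxt
            count += 1
        elapsed = time.monotonic() - start
        return count / elapsed
    finally:
        # Closing the descriptor drops any lock still held.
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as scratch:
        print(f"{lock_rate(scratch.name):.0f} locks/sec")
```

Started once, the loop measures the uncontended rate of the file system client; started on two nodes against the same file on the shared file system, it measures how fast lock ownership can move between nodes.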
Table: PingPong results - I/O coherence
          GFS         GPFS        Lustre
          Locks/sec   Locks/sec   Locks/sec
1 node    97          117,142     5,177
2 nodes   13          233         83
Table: PingPong results - mmap coherence
          GFS         GPFS        Lustre
          Locks/sec   Locks/sec   Locks/sec
1 node    98          195,533     5,559
2 nodes   31          242         124
The results bonnie++
bonnie++
Measured speed on distributed file systems is slower than on local devices
Measured speed on distributed file systems over Samba/CTDB is slower still
The bonnie++ benchmark failed with Lustre over Samba/CTDB
The results smbclient
smbclient
Table: Results - reading and writing with smbclient
File system   Read (MiB/s)   Write (MiB/s)
GFS           11.78          9.49
GPFS 1HD      16.53          58.51
GPFS 2HD      32.61          61.88
Lustre 1HD    81.45          39.85
Lustre 2HD    67.57          39.18
The results Microsoft W2k3 - robocopy
Microsoft W2k3 - robocopy
Table: Windows 2003 Server as a client
             Read          Write         Read to write
GFS          21.65 MiB/s   21.61 MiB/s   7.05 MiB/s
GPFS 2HD     14.2 MiB/s    35.81 MiB/s   5.18 MiB/s
Lustre 1HD   22.77 MiB/s   20.61 MiB/s   5.83 MiB/s
Lustre 2HD   23.75 MiB/s   20.63 MiB/s   2.85 MiB/s
The results IOZone
IOZone
Direct access to the distributed file systems achieved nearly the theoretical network bandwidth
Access over Samba/CTDB with IOZone was limited to approx. 50 MB/s for reading and writing with the CIFS kernel module
The Windows version of IOZone was not used due to a bug in Lustre
The results dbench & smbtorture
dbench - writing 1 GiB files
[Plot: accumulated throughput of all threads in MB/s (y-axis, 0 to 140) over the number of threads (x-axis, 1 to 32) for GFS, GPFS 1HD, GPFS 2HD, Lustre 1HD and Lustre 2HD]
smbtorture - writing 1 GiB files
[Plot: accumulated throughput of all threads in MB/s (y-axis, 0 to 100) over the number of threads (x-axis, 1 to 32) for GFS, GPFS 1HD, GPFS 2HD, Lustre 1HD and Lustre 2HD]
smbtorture & dbench altogether
[Plot: accumulated throughput of all threads in MB/s (y-axis, 0 to 140) over the number of threads (x-axis, 1 to 32) for GPFS smbtorture, GPFS dbench, Lustre smbtorture and Lustre dbench]
Conclusion
The locks/sec depend heavily on the distributed file system.
With concurrent access the locks/sec drop. ⇒ Higher latency
There could be many reasons for the higher latency, such as higher network latency, seek latencies, more management overhead with more clients, and so on.
Throughput with Samba also depends on the client’s CIFS implementation.
According to my tests, a single client could not reach the maximum throughput with Samba/CTDB.
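The step from fewer locks/sec to higher latency can be worked through with the lock-coherence figures from the PingPong results. The sketch below is illustrative only: it assumes the dotted figures in that table are thousands separators (264.072 read as 264,072 locks/sec) and treats the reciprocal of the rate as the mean time per lock operation. The names `LOCK_RATES`, `mean_lock_time_us` and `summarize` are made up for this example.

```python
# Lock-coherence rates in locks/sec from the PingPong results,
# keyed by file system and number of nodes (assumed reading of the table).
LOCK_RATES = {
    "GFS": {1: 98, 2: 98},
    "GPFS": {1: 264072, 2: 2249},
    "Lustre": {1: 5461, 2: 3655},
}


def mean_lock_time_us(rate):
    """Mean time per lock operation in microseconds for a locks/sec rate."""
    return 1_000_000 / rate


def summarize(rates):
    """Return (file system, us per lock on 1 node, on 2 nodes, slowdown factor)."""
    rows = []
    for fs, by_nodes in rates.items():
        one = mean_lock_time_us(by_nodes[1])
        two = mean_lock_time_us(by_nodes[2])
        rows.append((fs, one, two, two / one))
    return rows


if __name__ == "__main__":
    for fs, one, two, factor in summarize(LOCK_RATES):
        print(f"{fs:7s} 1 node: {one:10.1f} us  2 nodes: {two:10.1f} us  x{factor:.1f}")
```

Under these assumptions the mean time per lock on GPFS grows from about 3.8 µs on one node to about 445 µs on two nodes, a factor of roughly 117, while GFS stays at about 10 ms per lock in both cases.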
Questions ?
Thanks for your attention!