Clusters with GlusterFS


Marian Marinov - mm@yuhu.biz
System Architect - Siteground.com

Kosova Software Freedom Conference 2009

Clusters with GlusterFS

Prishtina 29-30.Aug.2009


Agenda

Cluster Filesystems

Some facts

Gluster Design

➢ kernel
➢ gluster engine
➢ protocols
➢ translators
➢ storage
➢ performance
➢ others
➢ schedulers

Some benchmarks


Cluster Filesystems


Facts

GlusterFS project started in August 2006

It is not an actual on-disk filesystem (it runs on top of existing ones)

Server only for Linux

Client running on Linux & FreeBSD

Very scalable

Very easy to install and maintain


GlusterFS Design


GFarm Design


Gluster Filesystem Design

In the kernel

➢ Requires FUSE
➢ FUSE as module
➢ GlusterFUSE

The engine

➢ Server & Client
➢ Transport Modules
➢ Translators
➢ Scheduler Modules

GlusterFS Design

The picture explained:

ClientX:

volume serverX - defines a name for a remote server
subvolumes brick0 - defines which of the volumes exported by the remote server we are interested in

some performance translators

volume unify - defines that we will use the unify cluster translator
subvolumes serverX serverY - defines which of the already connected storage volumes will be used
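
The client-side layout described above can be sketched with a small Python helper that renders volume stanzas as text. This is only an illustration under stated assumptions: the helper names, the host addresses, and the option names (`transport-type tcp/client`, `remote-host`, `remote-subvolume`, `scheduler`) are recalled from volume files of that era and are not taken from the slides, so check them against the documentation for your release. A real unify volume may also need further options (for example a namespace volume), which are left out here.

```python
def render_volume(name, vol_type, options=None, subvolumes=None):
    """Render one volume stanza as text (illustrative helper, not a GlusterFS API)."""
    lines = [f"volume {name}", f"  type {vol_type}"]
    for key, value in (options or {}).items():
        lines.append(f"  option {key} {value}")
    if subvolumes:
        lines.append("  subvolumes " + " ".join(subvolumes))
    lines.append("end-volume")
    return "\n".join(lines)


def render_client_volfile(servers, remote_brick="brick0"):
    """Build one protocol/client volume per server, then unify them."""
    stanzas = []
    for name, host in servers.items():
        stanzas.append(render_volume(
            name, "protocol/client",
            options={
                "transport-type": "tcp/client",    # assumed counterpart of tcp/server
                "remote-host": host,               # hypothetical address
                "remote-subvolume": remote_brick,  # which exported volume we want
            }))
    stanzas.append(render_volume(
        "unify", "cluster/unify",
        options={"scheduler": "rr"},               # assumed option name and value
        subvolumes=list(servers)))
    return "\n\n".join(stanzas)


# Hypothetical servers, using documentation-only IP addresses.
volfile = render_client_volfile({"serverX": "192.0.2.10", "serverY": "192.0.2.11"})
print(volfile)
```

The performance translators mentioned on the slide would sit as extra stanzas between the client volumes and the unify volume; they are omitted to keep the sketch short.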


Transport Modules:

For TCP/IP transport
transport-type tcp/server

For Infiniband SDP transport
transport-type ib-sdp/server

For Infiniband Verbs transport
transport-type ib-verbs/server


The idea – GNU/Hurd

Translators

➢ Performance
➢ Clustering
➢ Scheduling
➢ Storage
➢ Others


Performance translators

➢ Read Ahead
➢ Write Behind
➢ Threaded I/O
➢ IO-Cache
➢ Stat Pre-fetch - still not ported to the new versions
➢ Booster


Clustering translators

➢ Stripe
➢ Unify
➢ AFR


Scheduling translators

➢ Adaptive Least Usage (ALU)
➢ Non-uniform filesystem architecture (NUFA)
➢ Random
➢ Round-Robin
➢ Switch


Adaptive Least Usage (ALU)

➢ disk-usage
➢ read-usage
➢ write-usage
➢ open-files-usage
➢ disk-speed-usage
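
To illustrate the idea of ranking bricks by the criteria listed above, here is a toy sketch in Python. It is not the ALU algorithm itself, whose tuning options and thresholds are ignored here. The `Brick` fields, the sample numbers, and the simple rule of comparing criteria in priority order are all assumptions made for illustration; disk-speed-usage is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Brick:
    name: str
    disk_usage: float   # fraction of disk used, 0.0-1.0 (hypothetical metric)
    read_usage: float   # recent read load (hypothetical units)
    write_usage: float  # recent write load (hypothetical units)
    open_files: int     # number of currently open files


def pick_least_used(bricks, order=("disk_usage", "read_usage", "write_usage", "open_files")):
    """Return the brick lowest on the first criterion; ties fall through to the next."""
    return min(bricks, key=lambda b: tuple(getattr(b, field) for field in order))


bricks = [
    Brick("brick1", disk_usage=0.70, read_usage=5.0, write_usage=1.0, open_files=12),
    Brick("brick2", disk_usage=0.40, read_usage=9.0, write_usage=3.0, open_files=30),
    Brick("brick3", disk_usage=0.40, read_usage=2.0, write_usage=4.0, open_files=8),
]

chosen = pick_least_used(bricks)
print(chosen.name)
```

Changing `order` changes which criterion dominates, which is the only point the sketch tries to make.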


Non-uniform filesystem architecture (NUFA)

➢ local-volume-name
➢ limits.min-free-disk

Random

➢ limits.min-free-disk

Round-Robin

➢ limits.min-free-disk
➢ read-only-subvolumes
➢ refresh-interval
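
A minimal Python sketch of round-robin placement with a free-space limit, assuming that `limits.min-free-disk` means "skip a brick whose free space is below this percentage", as the option name suggests. The function, its inputs, and the sample values are hypothetical; `read-only-subvolumes` and `refresh-interval` are not modelled.

```python
from itertools import cycle


def round_robin(bricks, free_percent, min_free_disk=5.0):
    """Yield brick names in rotation, skipping bricks with too little free space.

    free_percent maps brick name -> percent of disk still free (hypothetical input).
    """
    rotation = cycle(bricks)
    while True:
        # Try each brick at most once per placement.
        for _ in range(len(bricks)):
            brick = next(rotation)
            if free_percent[brick] >= min_free_disk:
                yield brick
                break
        else:
            return  # every brick is below the limit; nothing can be scheduled


free = {"brick1": 40.0, "brick2": 3.0, "brick3": 25.0}
scheduler = round_robin(["brick1", "brick2", "brick3"], free, min_free_disk=5.0)
placements = [next(scheduler) for _ in range(4)]
print(placements)
```

Here brick2 is never chosen because it has only 3% free, so new files alternate between brick1 and brick3.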


Switch

➢ switch.case *jpg:brick1,brick2;*mp3:brick3;*:brick4,brick5
➢ switch.read-only-subvolumes brick7
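
The `switch.case` value above reads as a list of `pattern:bricks` rules separated by semicolons. The Python sketch below parses that string and looks up the candidate bricks for a file name. It assumes shell-style wildcard matching and that the first matching rule wins; neither is confirmed by the slides, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_switch_case(spec):
    """Parse 'pattern:brick,brick;pattern:brick' into a list of (pattern, [bricks])."""
    rules = []
    for rule in spec.split(";"):
        pattern, _, bricks = rule.partition(":")
        rules.append((pattern.strip(), [b.strip() for b in bricks.split(",")]))
    return rules


def candidate_bricks(filename, rules):
    """Return the bricks of the first rule whose pattern matches the file name."""
    for pattern, bricks in rules:
        if fnmatchcase(filename, pattern):
            return bricks
    return []


rules = parse_switch_case("*jpg:brick1,brick2;*mp3:brick3;*:brick4,brick5")
print(candidate_bricks("photo.jpg", rules))
print(candidate_bricks("song.mp3", rules))
print(candidate_bricks("notes.txt", rules))
```

How a single brick is picked when a rule lists several is not covered here; the sketch only returns the candidates.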


Other translators

➢ client
➢ server
➢ posix
➢ posix-locks
➢ bdb - very new
➢ rot-13
➢ trace


In the future

➢ Live addition/removal of nodes
➢ Automatic File Reordering
➢ Web GUI
➢ mod_glusterfs

Gluster Design

Benchmarks


Aggregated Read Throughput Benchmark

Multiple instances of the dd utility were executed simultaneously, with different block sizes, to read from the GlusterFS filesystem.

Block size   4KB          16KB         128KB        256KB        512KB        1024KB
Lustre       1,796 MB/s   5,782 MB/s   20,423 MB/s  21,582 MB/s  22,789 MB/s  23,731 MB/s
GlusterFS    11,415 MB/s  11,424 MB/s  11,427 MB/s  11,419 MB/s  11,411 MB/s  11,409 MB/s

Aggregated Write Throughput Benchmark

Multiple instances of the dd utility were executed simultaneously, with different block sizes, to write to the GlusterFS filesystem.

Block size   4KB          16KB         128KB        256KB        512KB        1024KB
Lustre       969 MB/s     1,613 MB/s   1,988 MB/s   1,989 MB/s   1,984 MB/s   1,983 MB/s
GlusterFS    1,886 MB/s   2,191 MB/s   2,237 MB/s   2,231 MB/s   2,236 MB/s   2,223 MB/s

Note: Higher means faster.
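
To make the two throughput tables easier to compare, the short Python sketch below divides the GlusterFS figure by the Lustre figure for each block size. The numbers are copied from the slide as given; the variable and function names are only illustrative.

```python
block_sizes = ["4KB", "16KB", "128KB", "256KB", "512KB", "1024KB"]

# Aggregated throughput in MB/s, as listed on the slide.
read_mbps = {
    "Lustre":    [1796, 5782, 20423, 21582, 22789, 23731],
    "GlusterFS": [11415, 11424, 11427, 11419, 11411, 11409],
}
write_mbps = {
    "Lustre":    [969, 1613, 1988, 1989, 1984, 1983],
    "GlusterFS": [1886, 2191, 2237, 2231, 2236, 2223],
}


def ratios(table):
    """GlusterFS throughput divided by Lustre throughput, per block size."""
    return [g / l for g, l in zip(table["GlusterFS"], table["Lustre"])]


read_ratios = ratios(read_mbps)
write_ratios = ratios(write_mbps)

for size, r, w in zip(block_sizes, read_ratios, write_ratios):
    print(f"{size:>6}  read x{r:.2f}  write x{w:.2f}")
```

A ratio above 1 means GlusterFS was faster. With these figures GlusterFS leads on writes at every block size, while on reads it leads only at the 4KB and 16KB block sizes.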


Apache Web Server Benchmark

Apache served 12039 files (595 MB) over HTTP. A wget client fetched the files recursively.

             Time
Lustre       Failed after downloading 33 MB out of 585 MB in 11 mins.
GlusterFS    3 mins 11 secs

Archive Creation

The tar utility created an archive of 12039 files (595 MB) served through GlusterFS.

             Time
Lustre       41 secs
GlusterFS    25 secs

Archive Extraction

             Time
Lustre       FAILED: No space left on device.
GlusterFS    43 secs

Note: Lower means faster.


Sources of Information

Project's site: http://www.gluster.com

Official GlusterFS documentation wiki: http://www.gluster.org/docs/index.php/GlusterFS

On IRC: irc.freenode.net #gluster

The mailing list: gluster-devel@nongnu.org


Questions?