
DPM Status & Roadmap Ricardo Rocha ( on behalf of the DPM team ) EMI INFSO-RI-261611.

Date post: 22-Dec-2015
Page 1

DPM Status & Roadmap

Ricardo Rocha (on behalf of the DPM team)

EMI INFSO-RI-261611

Page 2

DPM Overview

[Architecture diagram]
• CLIENT
  – File metadata ops → HEAD NODE: DPNS, DPM, SRM, HTTP, NFS
  – File access ops → DISK NODE(s): GRIDFTP, RFIO, HTTP, NFS, XROOT

Page 3

DPM Core

1.8.2, Testing, Roadmap

Page 4

DPM 1.8.2 – Highlights

• Improved scalability of all frontend daemons
  – Especially with many concurrent clients
  – By having a configurable number of threads
    • Fast/Slow in case of the dpm daemon
• Faster DPM drain
  – Disk server retirement, replacement, …
• Better balancing of data among disk nodes
  – By assigning different weights to each filesystem
• Log to syslog
• GLUE2 support
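
The per-filesystem weights mentioned on this slide can be pictured as weighted random selection. This is only a minimal sketch of that idea, not DPM's actual placement algorithm; the filesystem names and weight values are hypothetical.

```python
import random
from collections import Counter


def pick_filesystem(weights, rng=random):
    """Pick one filesystem name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Hypothetical filesystems; a weight of 0 means "never selected" here.
    weights = {"disk01:/data01": 3, "disk02:/data01": 1, "disk03:/data01": 0}
    rng = random.Random(42)
    counts = Counter(pick_filesystem(weights, rng) for _ in range(10_000))
    print(counts)
```

In this sketch a filesystem with three times the weight of another receives roughly three times as many new files, and a zero weight keeps a filesystem out of the selection entirely.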

Page 5

DPM Core – Testing Activity

• Improved validation & testing
  – Collaboration with ASGC for this purpose (thanks!)
  – Hammercloud tests running regularly
  – They started with a 400 core setup, we looked at the issues, now moving to 1000 cores to increase load
• Example run
  – http://hammercloud.cern.ch/atlas/10006472/test/
• To be used extensively for stress testing
  – Covering all components: DPM, RFIO, GRIDFTP, NFS, HTTP, …
• Results will benefit other sites too

Page 6

DPM Core – Testing

Example: GridFTP vs RFIO

[Plots: HC using GridFTP; HC using RFIO]

Thanks to ShuTing for the plots (preliminary results)

Page 7

DPM Core – Testing

• Big contribution from openlab student
  – Martin Hellmich, University of Edinburgh
• Detailed analysis of DPM internals
  – Detecting bottlenecks in specific transfer / access phases

Example… but we have a lot more results which we are now investigating

Page 8

DPM Core – Roadmap

• Package consolidation: EPEL compliance
• Fixes in multi-threaded clients
• Replace httpg with https on the SRM
• Improve dpm-replicate (dirs and FSs)
• GUIDs in DPM
• Synchronous GET requests
• Reports on usage information
• Quotas
• Accounting metrics
• HOT file replication

Releases: 1.8.3, 1.8.4, 1.8.5

Page 9

Beta Components

HTTP/DAV, NFS, Nagios, Puppet, Perfsuite, Catalog Sync, Contrib Tools

Page 10

Beta Components: Overview

• Faster releases
  – Monthly releases since June
• Separate yum repository
• Already in use by several sites
  – Including sites in the UK

https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Dev/Components

Page 11

Beta Components: PerfSuite

Overview

Page 12

Performance Suite

• Set of tools to easily trigger bunches of tests
  – With different configurations
• Common wrapper, many tests
• Existing suites
  – POSIX Transfers: RFIO, NFS
  – GET/PUT Transfers: HTTP, GSIFTP
  – ROOT
  – More coming…
• Used for most results presented later

https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Performance#Perfsuite

Page 13

Performance Suite

Sample Configuration

test_rfcp(c:5,s:{1M 2M 4M 8M 16M 32M 64M 128M 256M 512M 1G})x3

test_nfs(m:/mnt/nfs41,c:5,s:{1M 2M 4M 8M 16M 32M 64M 128M 256M 512M 1G})x3
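
To make the shape of these configuration lines concrete, here is a small parser sketch. It is not part of Perfsuite. It assumes that `s:{…}` is a list of file sizes, that the trailing `x3` is a repeat count, and that sizes use binary units; the meaning of the other keys (`c`, `m`) is left uninterpreted.

```python
import re

# Assumed shape: name(key:value,key:{v1 v2 ...})xN
SPEC_RE = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)x(?P<repeat>\d+)$")
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}  # assumption: binary units


def parse_size(text):
    """Convert a token such as '64M' into a number of bytes."""
    return int(text[:-1]) * UNITS[text[-1]]


def parse_spec(line):
    """Split one test line into (name, params, repeat)."""
    match = SPEC_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised test spec: {line!r}")
    params = {}
    # Split on commas that are not inside a {...} group.
    for part in re.split(r",(?![^{]*\})", match["args"]):
        key, _, value = part.partition(":")
        if value.startswith("{") and value.endswith("}"):
            params[key] = value[1:-1].split()  # space-separated list
        else:
            params[key] = value
    return match["name"], params, int(match["repeat"])


if __name__ == "__main__":
    name, params, repeat = parse_spec("test_nfs(m:/mnt/nfs41,c:5,s:{1M 2M 4M})x3")
    print(name, params, repeat, [parse_size(s) for s in params["s"]])
```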

Page 14

Beta Components: HTTP / DAV

Overview, Performance, Roadmap

Page 15

HTTP / DAV: Overview

https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/WebDAV

[Diagram: request flow between CLIENT, LFC, DPM HEAD and DPM DISK]
1. CLIENT → LFC: GET, answered with REDIRECT
2. CLIENT → DPM HEAD: GET / PUT, answered with REDIRECT
3. CLIENT → DPM DISK: GET / PUT, DATA transferred
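
The redirect chain on this slide (LFC, then DPM head node, then DPM disk node) can be traced with a few lines of client logic. The sketch below is illustrative only: the host names are hypothetical, and the `fetch` callable stands in for a real HTTP request so that the example runs without a network.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(url, fetch, max_requests=5):
    """Return the list of URLs visited until a non-redirect response.

    `fetch(url)` must return a (status_code, headers_dict) pair.
    At most `max_requests` requests are issued.
    """
    chain = [url]
    for _ in range(max_requests):
        status, headers = fetch(chain[-1])
        if status not in REDIRECT_CODES:
            return chain
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], headers["Location"]))
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    # Hypothetical hosts standing in for the LFC, the head node and a disk node.
    fake = {
        "https://lfc.example.org/grid/vo/file":
            (302, {"Location": "https://head.example.org/dpm/vo/file"}),
        "https://head.example.org/dpm/vo/file":
            (302, {"Location": "https://disk01.example.org/data/file"}),
        "https://disk01.example.org/data/file": (200, {}),
    }
    for hop in follow_redirects("https://lfc.example.org/grid/vo/file", fake.__getitem__):
        print(hop)
```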

Page 17

HTTP: Client Support

           curl   browser
OS         Any    Any
GUI        NO     YES
CLI        YES    NO
X509       YES    YES
Proxies    YES    Only IE so far
Redirect   YES    YES
PUT        YES    NO

• Recommendation: browser/curl for GET, curl for PUT
• Chrome Issue 9056 submitted for proxy support
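
As an illustration of the curl recommendation, the sketch below assembles a curl command line for a GET or PUT with an X509 proxy. The URL, proxy path and CA directory are assumptions chosen for the example, not values from the slides. The curl options used (`--location`, `--cert`, `--key`, `--capath`, `--upload-file`) are standard curl flags.

```python
import shlex


def curl_command(url, proxy_path, upload=None,
                 capath="/etc/grid-security/certificates"):
    """Build a curl argument list for an X509-authenticated GET or PUT."""
    cmd = [
        "curl",
        "--location",           # follow redirects (head node to disk node)
        "--cert", proxy_path,   # client certificate
        "--key", proxy_path,    # assumption: the proxy file also holds the key
        "--capath", capath,     # assumption: directory of trusted CA certificates
    ]
    if upload is not None:
        cmd += ["--upload-file", upload]  # send the local file with PUT
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Hypothetical server, path and proxy location.
    cmd = curl_command("https://dpm.example.org/vo/file",
                       "/tmp/x509up_u1000", upload="local.dat")
    print(shlex.join(cmd))
```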

Page 18

DAV: Client Support

           TrailMix      Cadaver   Davlib     Shared Folder   DavFS2   Nautilus   Dolphin
OS         Firefox < 4   *nix      Mac OS X   Windows         *nix     Gnome      KDE
GUI        YES           NO        YES        YES             N/A      YES        YES
CLI        NO            YES       NO         NO              N/A      NO         NO
X509       YES           YES       NO         YES             YES      NO         NO
Proxies    ?             NO        NO         YES             NO       NO         NO
Redirect   YES           NO        YES        Not PUT         NO       NO         YES

• Updated analysis based on initial one from dCache
• Recommendation: Cadaver for *nix, Windows explorer

Page 19

HTTP vs GridFTP: Multiple streams

• Not explicit in the HTTP protocol
• But needed for even higher performance
  – Especially in the WAN
• So we added it, with some semantics
  – Small wrapper around libcurl
  – PUT with ‘0 bytes’ && null content-range == end of write
• Submitted patch to libcurl to allow ssl session reuse among parallel requests
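
One way to picture a multi-stream PUT is to split the file into byte ranges, one per stream, each sent with its own Content-Range header. The sketch below only computes such a plan. It is not the team's libcurl wrapper, and the exact wire format that wrapper uses is not given in the slides. Following the semantics listed on the slide, the writer would finish with a 0-byte PUT without a Content-Range to signal the end of the write.

```python
def plan_streams(total_size, streams):
    """Split [0, total_size) into contiguous byte ranges, one per stream.

    Returns a list of (start, end_inclusive, content_range_header) tuples.
    """
    if total_size <= 0 or streams <= 0:
        raise ValueError("total_size and streams must be positive")
    streams = min(streams, total_size)       # never create empty ranges
    base, extra = divmod(total_size, streams)
    plan, start = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)  # spread the remainder
        end = start + length - 1
        plan.append((start, end, f"bytes {start}-{end}/{total_size}"))
        start = end + 1
    return plan


if __name__ == "__main__":
    for start, end, header in plan_streams(1_000_000, 4):
        print(f"PUT range {start}-{end}  Content-Range: {header}")
    print("PUT 0 bytes, no Content-Range  (end of write)")
```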

Page 20

HTTP vs GridFTP: 3rd Party Copies

• Implemented using WEBDAV COPY
• Requires proxy certificate delegation
  – Using gridsite delegation, with a small wrapper client
• Requires some common semantics to copy between SEs (to be agreed)
  – Common delegation portType location and port
  – No prefix in the URL ( just http://<server>/<sfn> )
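
For reference, a WebDAV COPY is an ordinary HTTP request with the method COPY and a Destination header (RFC 4918). The sketch below builds such a request without sending it. The server names and paths are hypothetical, and the proxy delegation step that the slide lists as a prerequisite is not shown.

```python
import urllib.request


def build_copy_request(source_url, destination_url, overwrite=True):
    """Build (but do not send) a WebDAV COPY request as defined in RFC 4918."""
    return urllib.request.Request(
        source_url,
        method="COPY",
        headers={
            "Destination": destination_url,          # where the copy should land
            "Overwrite": "T" if overwrite else "F",  # RFC 4918 Overwrite header
        },
    )


if __name__ == "__main__":
    # Hypothetical storage elements and file path.
    req = build_copy_request("https://se1.example.org/vo/file",
                             "https://se2.example.org/vo/file")
    print(req.get_method(), req.full_url)
    print(dict(req.header_items()))
```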

Page 21

HTTP vs GridFTP: 3rd Party Copies

Example of FTS usage

Page 22

HTTP / DAV: Performance

Ongoing Evaluation

• No difference detected in LAN with different number of streams
  – But early results do show a big difference on the WAN
• lcg-cp configured to use gridftp
• File registration & transfer times considered in both cases

• Xeon 4 Cores 2.27GHz
• 12 GB RAM
• 1 Gbit/s links

Page 23

HTTP / DAV: Issues & Roadmap

• Towards a first production release
  – Testing with large number of concurrent clients
  – Finish up the WAN performance tests
• And after that
  – Further testing of 3rd party copy with larger files
  – Finish validation against other implementations
  – Validate usage via ROOT
  – Improved GET on the LFC
  – PUT support on the LFC (?)

Page 24

Beta Components: NFS 4.1 / pNFS

Overview, Performance, Roadmap

Page 25

NFS 4.1/pNFS: Why?

• Industry standard (IBM, NetApp, EMC, …)
• Free clients (with free caching)
• Strong security (GSSAPI)
• Parallel data access
• Easier maintenance
• …
• But you know all this by now…

Page 26

NFS 4.1/pNFS: Overview

https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/NFS41

[Diagram: operation sequence between CLIENT, METADATA SERVER and DISK SERVER(s)]
1. OPEN (metadata server)
2. LAYOUTGET (metadata server)
3. GETDEVICEINFO (metadata server)
4. OPEN (disk server)
5. READ / WRITE (disk server)
6. CLOSE (disk server)
7. CLOSE (metadata server)

Page 27

NFS4.1 / pNFS: Client

• pNFS support in Linux kernel >= 2.6.38
• nfs-utils >= 1.2.3
• Latest Fedora and Debian Sid have it
• We provide packages for EL5
  – Enabled pNFS in the elrepo mainline kernel
  – nfs-utils and AFS module we package ourselves
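
The two version requirements above are easy to check programmatically. The following sketch compares version strings against the minimums quoted on the slide. The helper names are made up for this example, and how the installed versions are obtained (for instance from `uname -r` or the package manager) is left to the caller.

```python
import re

MIN_KERNEL = (2, 6, 38)      # from the slide: pNFS in kernel >= 2.6.38
MIN_NFS_UTILS = (1, 2, 3)    # from the slide: nfs-utils >= 1.2.3


def version_tuple(text):
    """Extract the leading dotted number, e.g. '2.6.38-8.el5' -> (2, 6, 38)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """True if the version string `found` is at least `minimum` (a tuple)."""
    return version_tuple(found) >= minimum


if __name__ == "__main__":
    # Example version strings, chosen for illustration only.
    for release in ("2.6.18-274.el5", "2.6.38-8.el5", "3.0.4-1.el5.elrepo"):
        print(release, meets_minimum(release, MIN_KERNEL))
    print("nfs-utils 1.2.3-7", meets_minimum("1.2.3-7", MIN_NFS_UTILS))
```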

Page 28

NFS4.1 / pNFS: Performance

Ongoing Evaluation

• IOZONE Results
• Server
  – Xeon 4 Cores 2.27GHz
  – 12 GB RAM
  – 1 Gbit/s links
• Client
  – Dual core
  – 2 GB RAM
  – 100 Mbit/s link

Page 29

NFS4.1 / pNFS: Performance

Ongoing Evaluation

• NFS vs RFIO
• Server
  – Xeon 4 Cores 2.27GHz
  – 12 GB RAM
  – 1 Gbit/s links
• Client
  – Dual core
  – 2 GB RAM
  – 100 Mbit/s link
• 8 KB block sizes

RFIO read misbehaving in this test… investigating

Page 30

NFS4.1 / pNFS: Issues & Roadmap

• Towards a first production release
  – Tests with a faster network link
  – Testing with a larger number of concurrent clients
  – WAN testing
  – Enable bigger block sizes
• And after that
  – X509 certificate support
    • Still not figured out… needs a strong focus
  – Further validation with other implementations

Page 31

Beta Components: Even more…

Puppet, Nagios, Contrib, Catalog Sync

Page 32

Even more components…

• Catalog Synchronization
  – Check Fabrizio’s talk next Monday (EGI Forum Lyon)
• DPM Admin contrib package
  – Contribution from GridPP
  – Now packaged and distributed with the DPM components
  – http://www.gridpp.ac.uk/wiki/DPM-admin-tools
• Nagios monitoring plugins for DPM
  – Available now
  – https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Monitoring
• Puppet templates
  – Available now in beta
  – https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Puppet
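
For readers unfamiliar with Nagios plugins, the core convention is a one-line status message plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below applies that convention to a free-space check. It is not one of the DPM plugins linked above; the thresholds and the idea of checking pool free space are assumptions made for the example.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_free_space(free_bytes, total_bytes, warn=0.20, crit=0.10):
    """Return (exit_code, message) following the Nagios plugin convention."""
    if total_bytes <= 0 or free_bytes < 0 or free_bytes > total_bytes:
        return UNKNOWN, "UNKNOWN - invalid pool figures"
    ratio = free_bytes / total_bytes
    if ratio < crit:
        code = CRITICAL
    elif ratio < warn:
        code = WARNING
    else:
        code = OK
    # Text after '|' is performance data in label=value[unit] form.
    return code, f"{LABELS[code]} - {ratio:.0%} free | free={free_bytes}B"


if __name__ == "__main__":
    # Hypothetical pool figures: 15 TB free out of 100 TB.
    code, message = check_free_space(15 * 10**12, 100 * 10**12)
    print(code, message)
```

A real plugin would print the message and then exit with the returned code so that Nagios can pick up the state.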

Page 33

Conclusion

• 1.8.2 fixes many scalability and performance issues
  – But we continue testing and improving
• Popular requests coming in next versions
  – Accounting, quotas, easier replication
• Beta components getting to production state
  – Standards compliant data access
  – Simplified setup, configuration, maintenance
  – Metadata consistency and synchronization
• And much more extensive testing
  – Performance test suites, regular large scale tests

