EGEE is a project funded by the European Union under contract IST-2003-508833
LFC and FiReMan Performance Measurements
Caitriana Nicholson / IT-GD, CERN / Glasgow University
Craig Munro / IT-GD, CERN / Brunel University
LCG Storage Management Workshop, CERN, 7th April 2005
LCG Storage Management Workshop, CERN IT-GD, 7th April 2005
Overview
• Aims of Testing
• Test Methodology & Setup
• LFC Performance Results
• FiReMan Performance Results
• Conclusions
Aims of Testing
• Data Challenges of 2004 exposed limitations in LCG Data Management tools
• LCG File Catalog developed to address problems with the RLS
• Suite of tests developed to check the functionality and performance of the LFC
• Comparison required of the performance of: EDG RLS; Globus RLS; LFC; FiReMan
Test Methodology
• Multi-threaded C client program written to test each type of operation (insert, query, delete, etc.), for example:
    ./create_files -d /grid/dteam/caitriana/insert/ -f $num_files -t $num_threads
• C programs wrapped by Perl scripts
• Typically, each operation performed several thousand times in the client program ($num_files=3000) and mean result returned
• Client program called several times from Perl script and mean result taken
• Any entries removed before next test run
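The procedure described above (many timed operations per run, a mean per run, then a mean over several runs with cleanup in between) could be sketched as follows. This is a minimal Python illustration, not the authors' C and Perl code: `time_operations`, `mean_over_runs` and `fake_insert` are hypothetical names, and a real test would call the catalogue client where the stand-in does nothing.

```python
import statistics
import threading
import time


def time_operations(operation, num_ops, num_threads):
    """Call `operation` num_ops times, split across num_threads threads.

    Returns (mean seconds per call, number of calls timed).
    """
    timings = []
    lock = threading.Lock()

    def worker(count):
        local = []
        for _ in range(count):
            start = time.perf_counter()
            operation()
            local.append(time.perf_counter() - start)
        with lock:
            timings.extend(local)

    # Spread the operations as evenly as possible over the threads.
    base, extra = divmod(num_ops, num_threads)
    threads = [
        threading.Thread(target=worker, args=(base + (1 if i < extra else 0),))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return statistics.mean(timings), len(timings)


def mean_over_runs(operation, num_ops, num_threads, num_runs, cleanup=lambda: None):
    """Repeat the timed run num_runs times and average the per-run means."""
    run_means = []
    for _ in range(num_runs):
        mean_time, _count = time_operations(operation, num_ops, num_threads)
        run_means.append(mean_time)
        cleanup()  # remove any entries before the next run
    return statistics.mean(run_means)


def fake_insert():
    """Hypothetical stand-in for a catalogue insert; does no real work."""
    pass
```

A call such as `mean_over_runs(fake_insert, 3000, 10, 5)` would mirror the `$num_files=3000` setting with 10 threads; the number of repeat runs (5 here) is an assumption, since the slides only say "several".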
LFC Test Setup
• Oracle DB on Xeon 2.4 GHz
• PIII 1 GHz, 512 MB server running 20 threads
• PIII 853 MHz, 128 MB client with configurable number of threads (single client tests)
• 10 x PIII 1 GHz, 512 MB clients with configurable number of threads (multiple clients)
• 100 Mb/s LAN
• Insecure LFC: comparison against published results for insecure catalogues; security overheads now being tested
• Quality of machines used should be noted when comparing LFC and RLS results! SPEC CINT2000 values:

                    LFC    Globus RLS    EDG RLS
  Server            420    810           420
  Single Client     420    220           420
  Multiple Clients  400    220           420
LFC Performance (i) - Inserts
• Mean insert time as number of entries increased up to 40M remains below 30 ms
• EDG mean insert time was ~40 ms with 500,000 entries
[Figure: mean insert time (ms) vs. number of entries in LFC, from 0 to 40,000,000 entries]
• Insert rate, with increasing number of client threads, for ~1M entries
• Rate increases with the number of client threads, reaching ~200 adds/sec at the server thread limit
• Globus RLS gave ~84 adds/sec when run with consistency
LFC Performance (ii) - Queries
• Rate of querying for a single LFN, increasing number of client threads, ~1M entries
• Not really comparable with Globus results: RLS does 1-to-1 lookup; LFC stat() returns system metadata, checks permissions…
• EDG RLS rate ~63 queries/sec for 1 thread; LFC with 1 thread gives ~90 queries/sec
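The slides quote some single-thread results as rates and others as times per operation. Assuming a single thread issues operations strictly one after another, the two are reciprocals, which the short sketch below makes explicit. The function names are illustrative only.

```python
def latency_ms(ops_per_second):
    """Mean time per operation in ms for one thread working serially."""
    return 1000.0 / ops_per_second


def rate_per_second(latency_in_ms):
    """Operations per second for one thread working serially."""
    return 1000.0 / latency_in_ms


# Single-thread query rates quoted in the slides.
print(round(latency_ms(63), 1))  # EDG RLS: 15.9 ms per query
print(round(latency_ms(90), 1))  # LFC:     11.1 ms per query
```

The same conversion applies in the other direction to figures given as times, for example a per-operation time of 16 ms corresponds to 62.5 operations per second for one thread.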
LFC Performance (ii) - Queries
• Time to list and stat all replicas of a file proportional to number of replicas
• Time to read a directory is directly proportional to directory size
LFC Performance (ii) - Queries
• Default buffer size in LFC is small (4 KB)
• Tuning the buffer size leads to improved performance for readdir(): time to read a directory of 100000 entries measured with varying buffer sizes
• If bulk queries are implemented, they should show similar behaviour
[Figure: reads per second vs. buffer size (KB), for buffer sizes from 0 to 1000 KB]
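One way to see why a larger buffer might help readdir() is to count how many buffer fills are needed to transfer a whole directory. The sketch below is a toy model under stated assumptions, not a description of the LFC protocol: it assumes every entry occupies a fixed number of bytes (100 here, a made-up figure) and that each buffer fill costs one client-server exchange.

```python
import math


def readdir_round_trips(num_entries, entry_bytes, buffer_bytes):
    """Buffer fills needed to read a directory, under the toy model above."""
    # At least one entry is assumed to fit in every buffer.
    entries_per_buffer = max(1, buffer_bytes // entry_bytes)
    return math.ceil(num_entries / entries_per_buffer)


ENTRY_BYTES = 100  # assumed size of one directory entry, for illustration only

# Directory of 100000 entries, as in the measurement described above.
print(readdir_round_trips(100000, ENTRY_BYTES, 4 * 1024))     # 4 KB default: 2500
print(readdir_round_trips(100000, ENTRY_BYTES, 1024 * 1024))  # 1 MB buffer: 10
```

Under these assumptions the number of exchanges falls roughly in inverse proportion to the buffer size, so the gain flattens out once the whole directory fits in a few buffers.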
LFC Performance (iii) – Delete Rates
• Delete rate, with increasing number of client threads, for ~1M entries
• With 1 thread, time per delete ~16 ms
• EDG RLS with 1 thread gave ~30 ms per delete
• No comparable results for Globus RLS
LFC Performance (iv) – chdir & transactions
• Performing a chdir() before many simultaneous operations in the same directory improves performance significantly
• Using transactions gave a loss of performance with a single client: extra time is being spent in the DB; requires further investigation
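The slides do not say why chdir() helps. One plausible explanation, offered here only as an assumption, is that a path relative to the working directory needs fewer component lookups than a full absolute path. The toy function below counts components under that assumption; `component_lookups` and the file name `file_0001` are hypothetical.

```python
def component_lookups(path, cwd=None):
    """Count the path components a naive resolver would have to look up.

    If `cwd` is given and `path` lies beneath it, only the part of the
    path below `cwd` is counted, as if the working directory had
    already been resolved once by chdir().
    """
    if cwd is not None:
        prefix = cwd.rstrip("/") + "/"
        if path.startswith(prefix):
            path = path[len(prefix):]
    return len([part for part in path.split("/") if part])


directory = "/grid/dteam/caitriana/insert"
entry = directory + "/file_0001"  # hypothetical file name

print(component_lookups(entry))             # absolute path: 5 components
print(component_lookups(entry, directory))  # after chdir(): 1 component
```

If this assumption holds, the saving grows with directory depth and is paid back on every operation in the same directory.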
LFC Performance (v) - Multiple Client Tests
• Tests run simultaneously on varying number of client machines
• Clients all running with 10 threads
• Insert and query rates measured with: single operations; transactions (100 operations per transaction)
• No reduction in operation rate as number of clients increases
[Figure panels: Single Operations; Transactions]
LFC Summary
• LFC has been tested and shown to be scalable to at least 40 million entries and 100 client threads
• Performance improved in comparison with the RLSs
• Stable: continuous running at high load for extended periods of time with no crashes; based on code which has been in production for > 4 years
• Tuning required to improve bulk performance
FiReMan Test Setup
• LFC and FiReMan tests performed on identical hardware
• Catalogue Server and Oracle DB running on same machine: dual Xeon 2.4 GHz with 2048 MB RAM
• PIII 800 MHz, 512 MB RAM client with configurable number of threads
• 100 Mb/s LAN
• Insecure FiReMan and LFC
Limitations of the Setup
• Limited time -> limited scope of tests
• Only one multi-threaded client used to test server. Difficult to explore all limits of server
• All tests performed over LAN; need to look at WAN (influence of round trip time)
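To give a feel for why round trip time matters, the sketch below estimates elapsed time for a single thread when every request pays one network round trip and each entry costs a fixed service time. Both the model and the numbers are assumptions chosen for illustration; none of them are measurements from these tests.

```python
import math


def elapsed_seconds(num_ops, bulk_size, rtt_s, service_s):
    """Estimated time for one thread to complete num_ops operations.

    Each request carries up to `bulk_size` operations and pays one
    round trip of `rtt_s` seconds; each operation also costs
    `service_s` seconds of server time.
    """
    requests = math.ceil(num_ops / bulk_size)
    return requests * rtt_s + num_ops * service_s


# Hypothetical figures: 1000 operations, 5 ms service time per operation.
for rtt in (0.001, 0.1):          # LAN-like and WAN-like round trip times
    for bulk in (1, 100):
        total = elapsed_seconds(1000, bulk, rtt, 0.005)
        print(f"rtt={rtt}s bulk={bulk}: {total:.2f} s")
```

In this model single operations are dominated by the round trip once it exceeds the service time, while bulk requests spread that cost over many entries.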
LFC Comparison
• LFC Tests repeated on identical hardware to give fair comparison
• LFC Tests performed with relative paths (chdir) and without transactions
[Figure: inserts per second vs. number of threads (1 to 100); series: LFC - New, LFC - Old]
[Figure: queries per second vs. number of threads (1 to 100); series: LFC - New, LFC - Old]
FiReMan Performance - Inserts
• Inserted ~1M entries in bulk with insert time ~5ms
• Insert Rate for different bulk sizes
[Figure: inserts per second vs. number of threads (1 to 50); series: Single, Bulk 1, Bulk 10, Bulk 100, Bulk 500, Bulk 1000, Bulk 5000]
FiReMan Performance - Insert
• Comparison with LFC:
[Figure: inserts per second vs. number of threads (1 to 100); series: Fireman - Single Entry, Fireman - Bulk 100, LFC]
FiReMan Performance - Queries
• Query Rate for an LFN
[Figure: entries returned per second vs. number of threads (5 to 50); series: Fireman Single, Fireman Bulk 1, Fireman Bulk 10, Fireman Bulk 100, Fireman Bulk 500, Fireman Bulk 1000, Fireman Bulk 5000]
FiReMan Performance - Queries
• Comparison with LFC:
[Figure: entries returned per second vs. number of threads (1 to 100); series: Fireman - Single Entry, Fireman - Bulk 100, LFC]
FiReMan Performance - Delete
• Rate at which LFNs can be deleted from the catalogue
[Figure: deletes per second vs. number of threads (1 to 50); series: Single, Bulk 1, Bulk 10, Bulk 100, Bulk 500, Bulk 1000, Bulk 5000]
FiReMan Performance - Delete
• Comparison with LFC:
[Figure: deletes per second vs. number of threads (1 to 50); series: Fireman Single, Fireman Bulk 100, LFC]
Conclusions
• Both LFC and FiReMan offer large improvements over RLS
• Still some issues remaining: scalability of FiReMan; bulk entry for LFC
• More work needed to understand performance and bottlenecks
• Need to test some real Use Cases
Questions?