13-14 June 2005, Geneva, CERN

TIER1 CNAF PRESENTATION
Outline:
- Hardware and software status of our CASTOR installation and management tools
- Usage of our installation by the LHC experiments
- Comments
MANPOWER

At present there are 4 people at TIER1 CNAF involved in administering our CASTOR installations and front-ends:
- Ricci Pier Paolo (50%; also active in SAN/NAS HA disk storage management and test, and Oracle administration) [email protected]
- Giuseppe (50%; also active in the ALICE experiment as Tier1 reference, SAN HA disk storage management and test, and managing the Grid front-end to our resources) [email protected]
- Elisabetta (50%; involved in Oracle and RLS development and administration, and SAN disk storage management and test) [email protected]

In addition, we have 1 CNAF FTE working with the development team at CERN (started March 2005):
- Lopresti Giuseppe [email protected]
HARDWARE STATUS
At present our CASTOR (1.7.1.5) system consists of:
- 1 STK L5500 silo, partitioned for 2 tape form factors
- About 2000 slots in LTO-2 form factor
- About 3500 slots in 9940B form factor
- 6 LTO-2 drives with 2 Gb/s FC interface
- 2 9940B drives with 2 Gb/s FC interface; 2 more have just been acquired and will be installed by the end of June
- Sun Blade v100 with 2 internal IDE disks in software RAID-0, running ACSLS 7.0
1300 LTO-2 Imation TAPES
650 9940B Imation TAPES
HARDWARE STATUS (2)
8 tapeservers: 1U Supermicro, 3 GHz, 2 GB RAM, with 1 Qlogic 2300 FC HBA, running the STK CSC Development Toolkit provided by CERN (under a licence agreement with STK): ssi, tpdaemon and rtcpd.

The 8 tapeservers are connected directly to the FC drive outputs:
- DRIVE LTO-2 0,0,10,0 -> tapesrv-0.cnaf.infn.it
- DRIVE LTO-2 0,0,10,1 -> tapesrv-1.cnaf.infn.it
- DRIVE LTO-2 0,0,10,2 -> tapesrv-2.cnaf.infn.it
- DRIVE LTO-2 0,0,10,3 -> tapesrv-3.cnaf.infn.it
- DRIVE LTO-2 0,0,10,4 -> tapesrv-4.cnaf.infn.it
- DRIVE LTO-2 0,0,10,5 -> tapesrv-5.cnaf.infn.it
- DRIVE 9940B 0,0,10,6 -> tapesrv-6.cnaf.infn.it
- DRIVE 9940B 0,0,10,7 -> tapesrv-7.cnaf.infn.it

2 more will be installed soon (tapesrv-8, tapesrv-9) with the 2 new 9940B drives. Using the 9940B drives has drastically reduced the error rate (we report only one 9940B tape marked RDONLY due to a SCSI error, and NEVER had "hung" drives in 6 months of activity).
HARDWARE STATUS (3)
castor.cnaf.infn.it (central machine): 1 IBM x345 2U machine, 2x3 GHz Intel Xeon, RAID-1, with dual power supply. O.S. Red Hat A.S. 3.0. Runs all central CASTOR 1.7.1.5 services (Nsdaemon, vmgrdaemon, Cupvdaemon, vdqmdaemon, msgdaemon) and the ORACLE client for the central database.

castor-4.cnaf.infn.it (ORACLE machine): 1 IBM x345, O.S. Red Hat A.S. 3.0, running ORACLE DATABASE 9i rel 2. 2 more x345 machines are in standby: they store all the backup information of the ORACLE db (.exp, .dbf) and can be used to replace the above machines if needed.

castor-1.cnaf.infn.it (monitoring machine): 1 DELL 1650, R.H. 7.2, running the CASTOR monitoring service (Cmon daemon) and the NAGIOS central service for monitoring and notification. It also hosts the rtstat and tpstat commands, which are usually run with the -S option against the tapeservers.
HARDWARE STATUS (4)
Stagers with diskserver: 1U Supermicro, 3 GHz, 2 GB RAM, with 1 Qlogic 2300 FC HBA accessing our SAN, running Cdbdaemon, stgdaemon and rfiod. 1 stager for EACH LHC experiment:
- disksrv-1.cnaf.infn.it ATLAS stager with 2 TB locally
- disksrv-2.cnaf.infn.it CMS stager with 3.2 TB locally
- disksrv-3.cnaf.infn.it LHCb stager with 3.2 TB locally
- disksrv-4.cnaf.infn.it ALICE stager with 3.2 TB locally
- disksrv-5.cnaf.infn.it TEST and PAMELA stager
- disksrv-6.cnaf.infn.it stager with 2 TB locally (archive purposes: LVD, ALICE TOF, CDF, VIRGO, AMS, BABAR, ARGO and other HEP experiments...)

Diskservers: 1U Supermicro, 3 GHz, 2 GB RAM, with 1 Qlogic 2300 FC HBA accessing our SAN, running rfiod. Red Hat 3.0 Cluster has been tested but is not used in production for rfiod.
HARDWARE STATUS (5)
Storage Element front-end for CASTOR
castorgrid.cr.cnaf.infn.it (DNS alias load balanced over 4 machines for WAN gridftp)

SRM v.1 is installed and in production on the above machines.
TIER1 INFN CNAF Storage

(Overview diagram; main components listed below.)
- Linux SL 3.0 clients (100-1000 nodes), accessing over WAN or TIER1 LAN via NFS, RFIO, GridFTP and others
- STK180 library with 100 LTO-1 tapes (10 TB native)
- STK L5500 robot (5500 slots), 6 IBM LTO-2 drives, 2 (4) STK 9940B drives
- PROCOM 3600 FC NAS2, 9000 GB
- PROCOM 3600 FC NAS3, 4700 GB
- NAS1, NAS4: 3ware IDE SAS, 1800+3200 GB
- AXUS BROWIE: about 2200 GB, 2 FC interfaces
- 2 Gadzoox Slingshot 4218 18-port FC switches
- STK BladeStore: about 25000 GB, 4 FC interfaces
- Infortrend A16F-R1A2-M1: 4 x 3200 GB SATA
- W2003 Server with LEGATO Networker (backup)
- CASTOR HSM servers (H.A.)
- Diskservers with Qlogic FC HBA 2340; IBM FastT900 (DS 4500), 3/4 x 50000 GB, 4 FC interfaces
- 2 Brocade Silkworm 3900 32-port FC switches
- Infortrend A16F-R1211-M2 + JBOD: 5 x 6400 GB SATA
- SAN 1 (200 TB, + 200 TB end of June), SAN 2 (40 TB)
- Totals: HSM (400 TB), NAS (20 TB)
CASTOR HSM

(Overview diagram; main components listed below.)
- STK L5500 silo, 2000 + 3500 slots
- 6 LTO-2 drives (20-30 MB/s)
- 2 9940B drives (25-30 MB/s)
- 1300 LTO-2 tapes (200 GB native)
- 650 9940B tapes (200 GB native)
- TOTAL CAPACITY with 200 GB tapes: 250 TB LTO-2 (400 TB max), 130 TB 9940B (700 TB max)
- Sun Blade v100 with 2 internal IDE disks in software RAID-1, running ACSLS 7.0, OS Solaris 9.0
- 1 CASTOR (CERN) central services server, RH AS 3.0
- 8 tapeservers, Linux RH AS 3.0, Qlogic 2300 HBA
- 6 stagers with diskserver, RH AS 3.0, 15 TB local staging area
- 1 ORACLE 9i rel 2 DB server, RH AS 3.0
- 8 or more rfio diskservers, RH AS 3.0, min 20 TB staging area (variable)
- Point-to-point FC 2 Gb/s connections; SAN 1 and SAN 2 links are fully redundant FC 2 Gb/s connections (dual-controller HW and Qlogic SANsurfer Path Failover SW)
- Access over WAN or TIER1 LAN

EXPERIMENT           Staging area (TB)   Tape pool (TB native)
ALICE                8                   12 (LTO-2)
ATLAS                6                   20 (MIXED)
CMS                  2                   1 (9940B)
LHCb                 18                  30 (LTO-2)
BABAR, AMS + oth.    2                   4 (9940B)
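The capacity figures above follow directly from the tape counts and the 200 GB native cartridge size. A quick check (using 1 TB = 1000 GB; the slide rounds the LTO-2 figure of 260 TB down to 250 TB):

```python
# Sanity-check the quoted tape capacities: cartridges (or slots) x 200 GB native.

def capacity_tb(n_tapes, gb_per_tape=200):
    """Total native capacity in TB for n_tapes cartridges."""
    return n_tapes * gb_per_tape / 1000.0

lto2_now = capacity_tb(1300)    # LTO-2 tapes currently owned
lto2_max = capacity_tb(2000)    # LTO-2 slots available in the silo
b9940_now = capacity_tb(650)    # 9940B tapes currently owned
b9940_max = capacity_tb(3500)   # 9940B slots available in the silo

print(lto2_now, lto2_max)    # 260.0 400.0
print(b9940_now, b9940_max)  # 130.0 700.0
```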
CASTOR Grid Storage Element
GridFTP access goes through the castorgrid SE, a DNS cname pointing to 3 servers, with DNS round-robin for load balancing. During LCG Service Challenge 2 we also introduced a load-average-based selection: every M minutes the IP of the most loaded server is replaced in the cname (see graph).
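The selection step described above can be sketched as follows. This is a hypothetical illustration (function name, IPs and load values are invented); the actual cname-update mechanism at CNAF is not described in the slides.

```python
# Hypothetical sketch of the load-based selection: every M minutes, exclude
# the most loaded gridftp server from the set of IPs published in the cname,
# leaving DNS round-robin to balance over the remaining members.

def select_cname_members(load_by_ip):
    """Return the sorted IPs to publish, dropping the most loaded server."""
    most_loaded = max(load_by_ip, key=load_by_ip.get)
    return sorted(ip for ip in load_by_ip if ip != most_loaded)

# Example: three gridftp servers with their current load averages.
loads = {"131.154.0.1": 0.7, "131.154.0.2": 4.2, "131.154.0.3": 1.1}
print(select_cname_members(loads))  # the server with load 4.2 is excluded
```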
NOTIFICATION (Nagios)
MONITORING (Nagios)

(Graphs shown: LHCb CASTOR tape pool; number of processes on a CMS disk SE; eth0 traffic through a CASTOR LCG SE.)
DISK ACCOUNTING

(Charts shown: pure disk space (TB); CASTOR disk space (TB).)
CASTOR USAGE
Access to the CASTOR system is:
1) via Grid, using our SE front-ends (from WAN)
2) via rfio, using the castor rpm and rfio commands installed on our WNs and UIs (from LAN)

Only 17% (65 TB / 380 TB) of the total HSM space was effectively used by the experiments over a 1.5-year period, because:
1) As TIER1 storage we offer "pure" disk as primary storage over SAN, which the experiments prefer (GSIftp, nfs, xrootd, bbftp, GPFS ...).
2) The lack of optimization for parallel stage-in operations (pre-stage), together with the reliability/performance problems that arose with LTO-2, gives generally very bad performance when reading from CASTOR, so experiments mostly ask for "pure" disk resources (next year's requests are NOT for tape HW).
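The pre-stage optimization missing in point 2 amounts to batching: if pending recall requests are grouped by cartridge before mounting, each tape is mounted once rather than once per file. The sketch below is a hypothetical illustration of that idea, not CASTOR's actual recall scheduler; all names and data are invented.

```python
from collections import defaultdict

def plan_recalls(requests):
    """Group (file, tape) recall requests by tape, so each cartridge is
    mounted once and all its files are read in a single mount session."""
    by_tape = defaultdict(list)
    for filename, tape in requests:
        by_tape[tape].append(filename)
    return dict(by_tape)

# Four file requests touching two tapes: 2 mounts instead of 4.
requests = [("f1", "LTO123"), ("f2", "LTO456"), ("f3", "LTO123"), ("f4", "LTO123")]
print(plan_recalls(requests))  # {'LTO123': ['f1', 'f3', 'f4'], 'LTO456': ['f2']}
```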
COMMENTS

As said, we have a lot of disk space to manage and no definitive solution yet (xrootd, gpfs, dcache to be tested, etc...).

1) CASTOR already has a working SRM interface. Is CASTOR-2 reliable and scalable enough to manage pure diskpool spaces? We think it should be designed for this use as well (as dcache and xrootd are).

2) The limits of the rfio protocol and of the new stager performance could seriously limit the potential performance scalability of a pure CASTOR diskpool (e.g. a single open() call needs many database queries). A single diskserver running rfiod can fulfil only a limited number of requests. At our site we have a limited number of diskservers with a large amount of space each (10-15 TB), and the rfiod limit caused access failures for jobs (we use rfiod for DIRECT access to the local filesystem outside CASTOR, e.g. for CMS).

SOLUTION TO FAILURES => possibility to use swap memory. SOLUTION TO PERFORMANCE => more RAM? Other? Can rfio be modified for our site-specific use?
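The bottleneck in point 2 can be shown with a back-of-the-envelope model: the same total job load spread over few large diskservers exceeds a per-server connection limit, while many small diskservers absorb it. The per-server limit and all numbers below are hypothetical, chosen only to illustrate the effect.

```python
# Toy model of the rfiod bottleneck: max_conn is an assumed per-diskserver
# cap on concurrent rfiod requests (the real limit depends on RAM and
# rfiod configuration, which the slides do not quantify).

def failed_jobs(n_jobs, n_diskservers, max_conn):
    """Jobs that cannot get an rfiod connection, assuming even load spread."""
    per_server = n_jobs / n_diskservers
    overflow = max(0.0, per_server - max_conn) * n_diskservers
    return int(overflow)

# Few large diskservers (our situation) vs. many small ones, same total space:
print(failed_jobs(1000, 4, 100))   # 600 jobs fail
print(failed_jobs(1000, 20, 100))  # 0 jobs fail
```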
COMMENTS (2)

1) We need the authorization method in CASTOR to be compatible with LDAP as well, not only with the password and group files.

2) It would also be useful to include rfstage (or something similar) in the official release.

3) HA: we are planning to use the 2 stand-by machines as HA for the CASTOR central services and a vdqm replica, plus an Oracle 9i rel 2 stand-by database (Dataguard) or RAC.
CONCLUSION

It is possible to set up collaborations with other groups (in order to expand the development team at CERN): TIER1 and LHC computing with IHEP.

THANK YOU FOR YOUR ATTENTION!