AliEn v2-17: Before the data taking
Patricia Méndez Lorenzo, CERN IT/GS-EIS
ALICE Offline Meeting (2-Sep-2009)
Alice Offline Meeting 2
Outlook
  A new AliEn version, v2-17, was released this summer
  It is now in production and distributed across all ALICE sites
  ALICE is entering data-taking mode, and this is the AliEn version that will be in use
  This talk presents the new version and the requirements the sites must meet before data taking
2/09/09
Generalities of AliEn v2-17
  Available in the central services at the end of July 2009
  Final distribution to all sites by mid-August
  The new version has the following general features:
    A large number of improvements, with many contributions
    More weight on the central services and fewer services running in the (local) VOBOXes
    New service requirements for all ALICE sites
AliEn v2-17: New features
  New gcc 4.1.2 available
  Distribution of a single gsoap (previously the AliEn distribution contained two)
  Third-party transfers for file distribution with xrd3cp (excluding T0-T1 transfers)
  New improvements in xrootd
  Reinforcement of the CREAM-CE module and full split of the gLite-WMS and CREAM-CE submission modes
AliEn v2-17: New features (cont.)
  More options to save files from the jobs
  New job status: DONE_WARNING
  Torrent improvements
  New and simplified transfer model
  Improved catalogue structure
  Deletion of orphan files
  SE and FTD deprecated on the sites
    The SE is defined in LDAP but no longer runs as a local service at the VOBOXes
  The usual bug fixes
  Let’s explain some of these features in detail
Job Output
  Once the job has finished, the user no longer needs to specify the output SE
  Default behaviour: the job output is stored in two working SEs close to the job (via the JA)
  Users can specify (via JDL) how many copies they want
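The default behaviour above can be sketched as follows. This is a minimal illustration and not AliEn code: the `StorageElement` record, its `distance` and `working` fields, and the `choose_output_ses` function are hypothetical names, and "close to the job" is modelled as a plain numeric distance.

```python
from dataclasses import dataclass


@dataclass
class StorageElement:
    name: str        # hypothetical SE label
    distance: float  # assumed closeness metric; smaller means closer to the job
    working: bool    # whether the SE is currently usable


def choose_output_ses(ses, copies=2):
    """Return the names of the `copies` closest working SEs (default: two)."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    candidates = sorted((se for se in ses if se.working), key=lambda se: se.distance)
    return [se.name for se in candidates[:copies]]


ses = [
    StorageElement("SE_A", 1.0, True),
    StorageElement("SE_B", 0.5, False),  # closest, but not working
    StorageElement("SE_C", 2.0, True),
    StorageElement("SE_D", 3.0, True),
]
print(choose_output_ses(ses))            # default: two copies
print(choose_output_ses(ses, copies=3))  # user asked for three copies
```

If fewer working SEs exist than copies requested, the sketch simply returns the shorter list; how the real system reacts in that case is not covered by this example.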
Job Output: Schema
  [Figure: job output diagram with the legend "Central services", "Site services" and "Non AliEn Services". Labels: Jobs, TaskQueue, Job Manager, Job Broker, Job optimizers (Splitting, Expired, Priorities, Merging, Zombies), File catalogue (LFN, GUID, Metadata), CE, CM, Packman, MonALISA, JA, Site A, Site B, xrootd.]
New job status: DONE_WARNING
  The job output could not be copied to any of the SEs
  SAVED_WARNING: output copied to SEs, but below the number of required copies
  DONE_WARNING: output registered in the catalogue, but below the number of required copies
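The rule for DONE_WARNING can be written as a small classification function. This is a sketch under assumptions: the function name is hypothetical, "DONE" is assumed to be the status when all required copies are registered, and the slide does not name the status used when no copy could be saved, so that case returns `None` here.

```python
def final_job_status(registered_copies, required_copies):
    """Classify a finished job by the number of output copies in the catalogue."""
    if required_copies < 1:
        raise ValueError("at least one copy must be required")
    if registered_copies >= required_copies:
        return "DONE"          # assumed name of the normal final status
    if registered_copies > 0:
        return "DONE_WARNING"  # registered, but below the required number
    return None                # status for this case is not named in the slide


print(final_job_status(2, 2))
print(final_job_status(1, 2))
```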
New transfer model
  No changes from the user's perspective
  FTD no longer runs in the VOBOX
    It was previously one of the local services running in the VOBOX
  Only one FTD per type of transfer (FTS or xrd3cp)
  Transfer status: INSERTED, WAITING, ASSIGNED, TRANSFERRING, DONE
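The transfer statuses can be modelled as a simple ordered sequence. This sketch assumes that a transfer passes through the statuses in the order listed and ignores any failure or retry statuses the real system may have; the function name `next_status` is hypothetical.

```python
TRANSFER_STATUSES = ["INSERTED", "WAITING", "ASSIGNED", "TRANSFERRING", "DONE"]


def next_status(status):
    """Return the status that follows `status`, or None once the transfer is DONE."""
    index = TRANSFER_STATUSES.index(status)  # raises ValueError for unknown statuses
    if index == len(TRANSFER_STATUSES) - 1:
        return None
    return TRANSFER_STATUSES[index + 1]


# Walk a transfer through its whole life cycle.
status = "INSERTED"
history = [status]
while next_status(status) is not None:
    status = next_status(status)
    history.append(status)
print(history)
```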
Transfer model: Schema
  [Figure: transfer model diagram with the legend "Central services", "Site services" and "Non AliEn Services". Labels: Transfers, TransferQueue, Transfer Manager, Transfer Broker, Transfer optimizers, Job Manager, Job Broker, Splitting, Expired, Priorities, Merging, Zombies, File catalogue (LFN, GUID, Metadata), FTD, CE, CM, Packman, MonALISA, Site A, Site B, fts, bbftp, xrdcp, SRM, xrootd.]
Orphan Files
  Until now, removing an LFN from the catalogue did not remove the file from the SE
  The removal feature is now implemented
  Deletion is asynchronous
    It uses the transfer model and FTD, with another daemon responsible for the removal
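Asynchronous deletion can be illustrated with a queue that decouples the catalogue operation from the physical removal. This is an in-memory sketch and not the AliEn implementation: the `Catalogue` class, the `run_removal_daemon` function and the dictionary standing in for the SEs are all hypothetical.

```python
from collections import deque


class Catalogue:
    def __init__(self):
        self.entries = {}                 # LFN -> list of (SE name, physical file name)
        self.pending_deletions = deque()  # replicas waiting for the removal daemon

    def remove_lfn(self, lfn):
        """Drop the LFN at once and queue its physical copies for later removal."""
        for replica in self.entries.pop(lfn):
            self.pending_deletions.append(replica)


def run_removal_daemon(catalogue, storage):
    """One pass of the hypothetical daemon: delete every queued replica."""
    removed = 0
    while catalogue.pending_deletions:
        se_name, pfn = catalogue.pending_deletions.popleft()
        storage[se_name].discard(pfn)
        removed += 1
    return removed


catalogue = Catalogue()
catalogue.entries["/alice/example/file1"] = [("SE_A", "pfn1"), ("SE_B", "pfn1")]
storage = {"SE_A": {"pfn1"}, "SE_B": {"pfn1"}}

catalogue.remove_lfn("/alice/example/file1")
still_on_se = "pfn1" in storage["SE_A"]  # the file survives until the daemon runs
removed = run_removal_daemon(catalogue, storage)
print(still_on_se, removed, storage)
```

The point of the design is that the user-facing removal returns immediately, while the slower storage operation happens later.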
Improvements in Torrent
  Torrent is the new software installation procedure chosen by ALICE
    Use of the shared software area has caused many problems in the past
  For each job, the required software is installed in the working area of the job
  If the torrent client (aria2c) cannot be used to download new packages, wget does the job instead
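The aria2c-to-wget fallback can be sketched as a function that picks the download command. This is an illustration under assumptions: "cannot be used" is interpreted here as "not found on the PATH", the function name and the example URL are placeholders, and the command is only built, not executed.

```python
import shutil


def download_command(url, dest_dir, which=shutil.which):
    """Build the download command, preferring aria2c and falling back to wget."""
    if which("aria2c"):
        return ["aria2c", "--dir=" + dest_dir, url]
    if which("wget"):
        return ["wget", "--directory-prefix=" + dest_dir, url]
    raise RuntimeError("neither aria2c nor wget is available")


# Simulate a worker node where only wget is installed.
def only_wget(name):
    return "/usr/bin/wget" if name == "wget" else None


print(download_command("http://example.org/package.tar.gz", "/tmp/job", which=only_wget))
```

Passing the lookup function as a parameter keeps the sketch testable on machines where neither tool is installed.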
Torrent Philosophy
  The JA lands on a virgin worker node
  A small script is sent from a centralized machine. The script:
    Finds a scratch area on a local disk
    Installs the latest version of AliEn (simple requirement: wget)
    Starts a Job Agent
  The Job Agent installs the required experiment software
  After finishing, it removes everything
    Or leaves it for the next agent... (still to be evaluated)
Torrent Schema
  [Figure: torrent distribution diagram. Labels: Tracker, Seeder, alitorrent.cern.ch, Site A, Site B, and six nodes each marked "Download & seed".]
WLCG submission modules
  Submission to the CREAM-CE and submission to the LCG-CE (via gLite-WMS) are now separated
  The query to the information system has changed
    LCG-CE: from the information system
    CREAM-CE: from the CREAM-DB
      This is a problem, although not a show-stopper
      The code has bugs that prevent proper publication through this model
      We are back to the information system as the source for the CREAM-CE as well
  Multi-cluster sites are now usable for ALICE
  CE ranking has been defined
  LDAP configuration has been improved
    Fewer conditions needed (implemented in the code)
AliEn v2-17: Deployment at the sites
  Once v2-17 was deployed at all sites, submission to WLCG sites began to fail
    In both WMS and CREAM submission modes
  The problem was related only to the LDAP configuration of those sites
    The new v2-17 version requires changes in some fields of the CE configuration in LDAP
    This has already been done for all sites
  The sites are now well configured and no major issues are currently reported
Requirements for the sites
  The new AliEn v2-17 version comes with some important requirements for the sites
    Deployment of SL5 on all WNs and VOBOXes (T0, T1, T2 sites)
      DEADLINE: mid-September 2009
    Deployment of a CREAM-CE at all sites (T0, T1, T2 sites)
      DEADLINE: mid-November 2009
  The main problem we continue to face is the unresponsiveness of sites when asked to upgrade
    This is an important issue that we must resolve as a collaboration
Status of the CREAM-CE
  Planned as the replacement for the current LCG-CE
  Submission procedures allowed by the CREAM-CE:
    Submission to the CREAM-CE via the gLite-WMS
    Direct submission via generic clients
  Since Nov 2008 the WLCG has encouraged sites to provide a CREAM-CE in parallel with the LCG-CE
The WLCG milestones regarding the CREAM-CE
  https://twiki.cern.ch/twiki/pub/LCG/MilestonesPlans/WLCG_High_Level_Milestones_20090623.pdf
  1) NOT ACHIEVED
  2) ACHIEVED ✔
  3) Today 23 sites are publishing CREAM in the prod-BDII. ~39 still missing
Implementation and Experiences in ALICE
  ALICE was the first LHC experiment to test the CREAM-CE and to provide the developers with continuous feedback
  The experience has been very positive
  ALICE requires the deployment of the CREAM-CE service at all sites that support the experiment
  Testing began in summer 2008, and since Feb 2009 CREAM-CE submission has been part of the standard AliEn distribution
  However, as of today, 13 sites provide a CREAM-CE FOR ALICE:
    CERN, KISTI, INFN-Torino, CNAF, RAL, FZK, Kolkata, IHEP, SARA, Subatech, Legnaro, SPbSU, Prague
  THE SAME NUMBER OF SITES FOR ALICE AS PRESENTED AT CHEP09! NO ADVANCES!!
ALICE results with CREAM-CE
  CREAM has been part of the AliEn distribution since Feb 2009
  2.8M jobs done since Feb08 using CREAM-CE resources
SL(C)5 testing
  Almost in parallel with the CREAM-CE testing, ALICE has been successfully testing SL5 resources at two sites:
    CERN (behind a CREAM-CE)
    RAL
  ALICE has extensively verified that the experiment is ready to run on these resources
WLCG-MB conclusions
  Final conclusions of the WLCG-MB of 4 August on the SL5 migration (summarized by I. Bird):
    Now is the time to push to complete the SL5 migration at all sites, including the Tier 2s
    The experiments are all ready and able to use SL5 resources
    Tier 1s and Tier 2s should plan this migration as rapidly as possible, so that the majority of resources are migrated asap
ALICE conclusions for SL(C)5
  At the TF meeting of 13/08/09, ALICE agreed to collaborate actively with its sites to ensure a timely migration
  The experiment has asked for the SL5 distribution in all official forums: WLCG-GDB, WLCG-MB and workshops
  Following the strategy explained for the CREAM-CE, ALICE cannot maintain a hybrid SL4/SL5 setup at the sites
    SL4 is deprecated and ALICE will no longer maintain it
    The DEADLINE established by the experiment is mid-September 2009
  SL5 VOBOX
    Maximum priority for WLCG-GD
    ALICE counts on a preliminary version by today
Summary
  AliEn v2-17
    Plenty of new improvements
    Simplified transfer mechanism
    Fewer AliEn services at the sites
      The new version strengthens the central services and lightens the local VOBOXes
  In addition: WLCG features
    CREAM-CE
      The experiment has been asking for the deployment of the CREAM-CE at all sites since summer 2008
      The double support we currently provide (WMS/LCG-CE and CREAM-CE) cannot scale during real data taking
      After more than one year of asking for this service, ALICE will deprecate the use of the WMS/LCG-CE in favour of the CREAM-CE by mid-November 2009
    SLC5
      Following the strategy explained for the CREAM-CE, ALICE cannot maintain a hybrid SL4/SL5 setup at the sites
      SL4 is deprecated and ALICE will no longer maintain it
      The DEADLINE established by the experiment is mid-September 2009