Alberto Masoni, last update: 01/07/2004 12:04
CERN
LCG: Status of the Project
LHC Computing Grid
Alberto Masoni
Commissione I, 22-6-2004
LHC Computing Grid Project
Goal of the project: to prepare, deploy and operate the computing environment for the experiments to analyse the data from the LHC detectors
• Phase 1 (2002-05): development of common applications, libraries, frameworks; prototyping of the environment; operation of a pilot computing service
• Phase 2 (2006-08): acquire, build and operate the LHC computing service
GRID TECHNOLOGY: THE EGEE PROJECT
• The DataGrid and DataTAG projects ended with a positive evaluation by the EU
• EGEE: Enabling Grids for E-science in Europe
• Started on 1 April 2004
• Funding: ~32 MEUR (~4 to INFN)
THE LCG PROJECT AND THE EGEE AND GRID3 (US) PROJECTS
• LCG must be able to use the resources for LHC on a worldwide scale
• Interoperability between grids is needed (e.g. LCG-GRID3, currently in progress)
• LCG and EGEE share responsibility for, and management of, the following areas:
  • Middleware: EGEE will start with the LCG middleware package, and its evolution will be common
  • Infrastructure operation: LCG will form the core for the development of the EGEE grid
  • Deployment of HEP applications: integration with the LHC experiments (EGEE-NA4)
  • Security: integration in progress
LCG-2 & gLite
gLite
• Focus on analysis; strongly influenced by the ARDA RTAG
• Starts with components from AliEn, EDG, VDT and other projects
• Aims at addressing advanced requirements, in particular from the biomedical community
• Prototyping with short development cycles for fast user feedback
• Aims at delivering components compatible with LCG-2 wherever appropriate
SA1 Preproduction Service
• Starts with the LCG-2 code base
• Home for new development/re-engineering required by the experience of the LHC data challenges
• Validation, in particular by the current ATLAS/CMS analysis tools
• Certification of promising selected components from gLite
  • Provided they satisfy Operations requirements
LCG-2
• Current base for production services
• Evolved with certified new or improved services from the preproduction service
Annotations on the slide:
• EGEE middleware prototype (available)
• EGEE production infrastructure
• LCG production infrastructure
Source: EGEE All Activities Meeting, 18-6-04
LCG-2 and Next Generation Middleware (LCG-GDB 15-6-04)
LCG-2: focus on production, large-scale data handling
• The service for the 2004 data challenges
• Provides experience on operating and managing a global grid service
• Strong development programme driven by data challenge experience
• Evolves to LCG-3 as components are progressively replaced with new middleware
Next generation middleware: focus on analysis
• Developed by the EGEE project in collaboration with VDT (US)
• LHC applications and users closely involved in prototyping & development (ARDA project)
• Short development cycles
• Completed components integrated in LCG-2
[Figure: 2004-2005 timeline showing LCG-2 (=EGEE-0) and LCG-3 (=EGEE-x?), with stages labelled "prototyping" and "product".]
Source: EGEE All Activities Meeting, 18-6-04
Some considerations
If carried out properly, this approach makes it possible:
• To guarantee the service for the experiments' Data Challenges
• To make appropriate use of both the software developed so far and the progress coming from prototype development (ARDA)
However, it requires a considerable coordination effort and a strong commitment from the people involved in development
• The two lines must not become "parallel convergences"
"Software is not a technical problem. It is mainly a sociological problem." (René Brun, LCG seminar, Nov. 2001)
ARDA
“Long they laboured in the regions of Eä, which are vast beyond the thought of Elves and Men, until in the time appointed was made Arda...”
- J.R.R. Tolkien, Valaquenta
ARDA in a nutshell
• Main activity: enable LHC analysis on the grid
  • Use the grid software as it matures (EGEE project)
    • ARDA should be the key player in the evolution from LCG-2 to the EGEE infrastructure
  • Provide early and continuous feedback (guarantee the software is what the experiments expect/need)
    • Use the experience and components of recent years, both from Grid projects (LCG, VDT, EDG) and from experiment middleware/tools (AliEn, Dirac, GAE, Octopus, Ganga, Dial, …)
  • Help in adapting/interfacing (direct help within the experiments)
  • Every experiment has different implementations of the standard services, but these are:
    • Used mainly in production environments
    • Used by few expert users
    • Subject to coordinated update and read actions
  • ARDA
    • Interface with the EGEE middleware
    • Verify such components in analysis environments (and help them evolve towards these):
      • Many users (robustness might be an issue)
      • Concurrent “read” actions (performance will be more and more an issue)
• One prototype per experiment
  • A Common Application Layer might emerge in future
  • The ARDA emphasis is on enabling each of the experiments to do its job
• Provide a forum for discussion
  • Comparison of results/experience/ideas
  • Interaction with other projects
  • …
Experiment interfaces
• Piergiorgio Cerello (ALICE)
• David Adams (ATLAS)
• Lucia Silvestris (CMS)
• Ulrik Egede (LHCb)
The experiment interfaces agree the work plan with the ARDA project leader and coordinate the activity on the experiment side (users)
LHCb
The LHCb system within ARDA uses GANGA as its principal component.
The LHCb/GANGA plans:
• Enable physicists (via GANGA) to analyse the data being produced during 2004 for their studies
  • It naturally matches the ARDA mandate
  • Have the prototype where the LHCb data will be the key (CERN, RAL, …)
• At the beginning, the emphasis will be on validating the tool, focusing on usability and on validation of the splitting and merging functionality for user jobs
Grid activity:
• Use of the gLite testbed (since May 18th)
• Test jobs from GANGA to gLite ☺
Other contributions:
• GANGA interface to Condor (job submission) and Condor DAGMan for splitting/merging and error recovery
• GANGA release management and software process
• LHCb metadata catalogue tests
  • Performance tests
  • Collaborators in Taiwan (ARDA + local DB know-how on Oracle)
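The splitting and merging functionality mentioned on this slide can be illustrated with a minimal sketch. This is not GANGA code: the function names `split_job`, `run_subjob` and `merge_results`, the file names and the per-file event counts are all hypothetical, chosen only to show the idea of dividing one user job into sub-jobs over its input files and recombining the partial outputs.

```python
from collections import Counter


def split_job(input_files, files_per_subjob):
    """Split a list of input files into consecutive chunks, one per sub-job."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    return [input_files[i:i + files_per_subjob]
            for i in range(0, len(input_files), files_per_subjob)]


def run_subjob(files, events_per_file):
    """Stand-in for a real analysis sub-job (hypothetical): report events per file."""
    return Counter({name: events_per_file[name] for name in files})


def merge_results(partial_results):
    """Merge the per-sub-job counters into a single result."""
    merged = Counter()
    for partial in partial_results:
        merged.update(partial)  # Counter.update adds counts key by key
    return merged


# Hypothetical input: five files with made-up event counts (100, 200, ..., 500).
events_per_file = {f"dst_{i}.root": 100 * (i + 1) for i in range(5)}
subjobs = split_job(sorted(events_per_file), files_per_subjob=2)
merged = merge_results(run_subjob(chunk, events_per_file) for chunk in subjobs)
total_events = sum(merged.values())
```

A validation of the kind described on the slide would check that the merged result of the sub-jobs equals the result of running the unsplit job, which is what the comparison of `merged` against the original counts amounts to here.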
CMS
The CMS system within ARDA is still under discussion.
Providing easy access to (and possibly sharing of) data for the CMS users is a key issue (data management):
• RefDB is the bookkeeping engine used to plan and steer the production across different phases (simulation, reconstruction and, to some degree, the analysis phase)
  • This service is under test
  • It contained all necessary information except the physical file location (RLS) and information related to the transfer management system (TMDB)
  • The actual mechanism to provide these data to analysis users is under discussion
  • Performance measurements are underway (similar philosophy as for the LHCb metadata catalogue measurements)
Exploratory/preparatory activity:
• ORCA job submission to gLite
• gLite file catalog
[Figure: RefDB in CMS DC04. Components: RefDB, McRunjob, T0 worker nodes, GDB castor pool, tapes, export buffers, transfer agent, RLS, TMDB. Flow labels: reconstruction instructions, reconstruction jobs, reconstructed data, summaries of successful jobs, updates, checks what has arrived.]
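The division of information described on this slide (RefDB for bookkeeping, RLS for physical file locations, TMDB for transfer state) can be pictured as a lookup across three catalogues. The sketch below is purely illustrative: the dictionaries, keys, dataset names and site names are hypothetical and do not reflect the real schemas or interfaces of RefDB, RLS or TMDB.

```python
def locate_dataset(dataset, refdb, rls, tmdb):
    """Return, for each logical file of a dataset, its replicas and transfer state."""
    located = []
    for lfn in refdb.get(dataset, []):
        located.append({
            "lfn": lfn,
            "replicas": rls.get(lfn, []),          # physical locations (RLS role)
            "transfer": tmdb.get(lfn, "unknown"),  # transfer state (TMDB role)
        })
    return located


# Hypothetical catalogue contents, for illustration only.
refdb = {"dataset_A": ["file_1", "file_2"]}
rls = {"file_1": ["site_X:/store/file_1"]}
tmdb = {"file_1": "done", "file_2": "queued"}

result = locate_dataset("dataset_A", refdb, rls, tmdb)
```

In this toy example `file_2` is known to the bookkeeping but has no replica yet, which is the kind of gap an analysis user would need the combined view to see.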
ATLAS
The ATLAS system within ARDA has been agreed
• ATLAS has a complex strategy for distributed analysis, addressing different areas with specific projects (www.usatlas.bnl.gov/ADA)
• The starting point is the DIAL analysis model system (high-level web services)
The AMI metadata catalog is a key component
• Robustness and performance tests from ARDA
In the start-up phase, ARDA provided help in developing ATLAS production tools
• Being finalised
Also:
• Submission to gLite (test jobs OK, Athena jobs still not possible)
• First skeleton of high-level services
• AMI tests
ALICE
[Figure: PROOF setup. A user session connects to the PROOF master server, which drives PROOF slaves at Site A, Site B and Site C through a TcpRouter.]
Strategy:
• ALICE/ARDA will evolve the ALICE analysis system (SuperComputing 2003)
Where to improve:
• Strong requests on networking (inbound connectivity)
• Heavily connected with the middleware services
• “Inflexible” configuration
• No chance to use PROOF on federated grids like LCG in AliEn
• User libraries distribution
Activity on PROOF:
• Robustness and error recovery
Grid activity:
• C++ access library on gLite
ARDA ALREADY IN PHASE-3 OF THE PRESENT DATA CHALLENGE
DEPLOYMENT AREA: The LHC Grid Service
MAIN OBJECTIVE: TO PROVIDE A PRODUCTION SERVICE FOR THE DATA CHALLENGES IN 2004
THE ORIGINAL PLANNING FORESAW:
• M-1.1, July 2003: start of the pilot service
• M-1.2, end of 2003: availability of the production system
The first milestone is two months late, essentially because of the delay of a grid middleware component (RGMA). The component has been replaced with another one, albeit with less functionality (MDS), thanks to the fundamental contribution of INFN
• Cf. the discussion at the last Commissione I meeting
• The grid middleware component is still missing, and it is not clear whether it will be available for the release to be used in the experiments' Data Challenges
OBJECTIVE: try to keep the second milestone, in order to guarantee the production service for the experiments' Data Challenges
"Qui si parrà la tua nobilitate" ("Here your worth will be shown")
Slide presented to Commissione I, 24/9/03
Sites in LCG-2/EGEE-0: status as of 4 June 2004
• Roma-1 (under maintenance on 4-6)
• 22 countries
• 58 sites (45 Europe, 2 US, 5 Canada, 5 Asia, 1 HP)
• Coming: New Zealand, China, other HP (Brazil, Singapore)
• 3800 CPUs
• All of INFN-GRID is now part of EGEE-0
What is missing? (functionally)
• A full storage element
  • dCache has had many problems
    • Packaged, to be deployed
  • Is dCache sufficient/the only solution?
• Demonstrated integration of Tier-1 MSSs
THE EXPERIMENTS' DATA CHALLENGES
ALICE
• Started in March
• Phase 1 (simulation) completed
• Interoperability of 3 grids
• Phase 2 (reconstruction) currently under way
• Phase 3 (analysis) planned with the EGEE/ARDA prototype
ATLAS
• About to start
• Interoperability LCG - GRID3 - NorduGrid
CMS
• Just concluded
• Emphasis on data transfer
LHCb
• Just started
• Uses Dirac and LCG
• Satisfactory performance
• Problems with data transfer
3 experiments out of 4 will have their Data Challenges running as of the next few days
PHASE 2 PLANNING
• Planning for phase 2 of the project (2006-2008) is in progress
• The experiments are revising their computing models and their estimates of the capacity required for the Tier-0, the Tier-1s and the Tier-2s
• A planning activity for the Tier-1s is starting
Phase 2 Planning Outline
• June 2003 - Establish editorial board for LCG TDR
• September 2004 - Consolidated Tier-1, Tier-2 Regional Centre Plan
  • Background for the draft MoU (October C-RRB)
  • Revised version of basic computing models
  • Revised estimates of overall Tier-1, Tier-2 resources
  • Current state of commitments of resources
    • Regional Centres ↔ Experiments
  • High-level plan for ramping up the Tier-1 and large Tier-2 centres
  • Prepared by planning team including experiments, LCG management and regional centre representatives
• October 2004 C-RRB - Draft MoU
• End 2004 - Initial computing models agreed by experiments
• April 2005 C-RRB - Final MoU
• End June 2005 - TDR
Agreed by POB, 3 June 04
CONCLUSIONS
• The European project EGEE has started
  • Continuity with EDG/EDT
  • Synergy with LCG in management, in middleware choices and in the infrastructure
  • Evolution through rapid releases and, at the same time, continuity of the service based on LCG-2
  • First gLite prototype available
• The LCG-2 service has started and is in use for the experiments' Data Challenges
  • The system has allowed (and allows) the use of large resources, distributed worldwide
  • The Italian contribution is the largest
    • CNAF is by far the most important contribution
    • Grid-IT (the other Italian centres taken together) has provided the equivalent of a Tier-1
• Much remains to be done
  • Critical point: Data Management
    • Problems with data transfer
    • Planning of storage resources (insufficient disk/CPU)
• Planning for phase 2 has begun