Enabling Grids for E-sciencE
www.eu-egee.org
Experience Supporting the Integration of LHC Experiments Computing Systems
with the LCG Middleware
Simone Campana – LCG Experiment Integration and Support
CERN-IT / INFN-CNAF
CHEP06 – 12-17 February 2006 – Mumbai (India)
Mandate of the LCG/EIS Team
EIS: Experiment Integration and Support Team
Help LHC experiments integrate their production environment with the Grid middleware and utilities.
Offer support during all steps of the integration process: understanding the middleware functionality, testing new prototype components, getting onto the LCG infrastructure.
One person dedicated to each LHC Experiment
Production is the main focus. Experiment support does not mean user support; experiment support does not mean GOC.
Main Tasks
Integration
- Middleware functionality and usage
- Functionality tests
- Customized distributions and missing tools
- Discuss requirements and bring them to the attention of the developers

Experiment and User Support
- Documentation: manuals, guides, FAQ
- First-line user support
- Monitoring of experiment-specific production systems

Provide infrastructure expertise
- Monitoring and managing services, both Grid and experiment-specific
- Solving site-related problems
- Service Challenge second-level support (on shift)
Tools… Tools… Tools…
Data Management
- Customized version of the LCG Data Management clients

Workload Management
- Monitoring of the job "standard error" and "standard output" (g-peek)
- Estimate of the job's normalized CPU and wall-clock time left

Information System
- C++ generic API (with LDAP and R-GMA backends)
- User-friendly querying tools

Generic framework for job submission
- Intensively used by GEANT4

Many others…

Several functionalities provided by the tools have been integrated in the middleware (see the g-peek functionality).
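The slides do not show how g-peek computes its estimate; the idea can be sketched as follows. The function name, the normalization convention and the reference rating below are illustrative assumptions, not the actual g-peek implementation:

```python
# Illustrative sketch of a g-peek-style "time left" estimate.
# Queue wall-clock limits in the information system are published
# normalized to a reference CPU rating; each real minute on a faster
# node consumes proportionally more of that normalized allowance, so
# the local wall-clock limit shrinks on fast nodes.

REFERENCE_SI2K = 1000  # assumed SpecInt2000 reference rating


def remaining_wallclock(queue_limit_min, elapsed_min, node_si2k):
    """Estimated wall-clock minutes left for a job on this node.

    queue_limit_min -- queue wall-time limit, normalized minutes
    elapsed_min     -- real wall-clock minutes already used
    node_si2k       -- SpecInt2000 rating of the worker node
    """
    # Convert the normalized limit to real minutes on this node.
    local_limit = queue_limit_min * REFERENCE_SI2K / node_si2k
    # Never report a negative remainder.
    return max(0.0, local_limit - elapsed_min)
```

For example, a 2880-normalized-minute limit on a node rated at 2000 SI2K corresponds to 1440 real minutes, so a job that has run for 600 minutes would have about 840 left under this convention.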
Monitoring Tools
[Screenshots: ATLAS SC3 service monitor; LHCb-specific Site Functional Tests]
Experiment Software Installation
[Diagram: experiment software installation tools (lcg-ManageSoftware, lcg-ManageVOTag, Tank&Spark, gssklog, lcg-asis) deployed across the UI, WN and CE]
VO-BOX
First prototype developed and packaged by EIS.
- Evaluation of the Globus GSI-enabled ssh server and its configuration
- Development of an ad-hoc proxy renewal server with the corresponding user-level tool
- Overall configuration of the node type, including the UI clients and gssklog
- Following up installation issues and further discussions on possible evolution
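The core job of such a proxy renewal server is simple: periodically scan the registered delegated proxies and renew any whose remaining lifetime falls below a safety margin. A minimal sketch of that logic, with the threshold value and function names as assumptions (the real service also interacts with a MyProxy store, omitted here):

```python
import datetime as dt

# Assumed safety margin: renew a proxy once fewer than 6 hours remain.
RENEW_THRESHOLD = dt.timedelta(hours=6)


def needs_renewal(expiry, now):
    """True if a proxy expiring at `expiry` should be renewed now."""
    return expiry - now < RENEW_THRESHOLD


def renewal_pass(proxies, now, renew):
    """Run one pass over registered proxies, renewing the stale ones.

    proxies -- mapping of user DN -> proxy expiry time
    renew   -- callback performing the actual renewal for one DN;
               returns the new expiry time
    """
    for dn, expiry in list(proxies.items()):
        if needs_renewal(expiry, now):
            proxies[dn] = renew(dn)
    return proxies
```

In a real service this pass would run in a loop (e.g. every few minutes), and the `renew` callback would contact the credential store on behalf of the VO-BOX.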
EIS on ALICE
EIS for Data Challenges 04 and 05:
- Offered support for the integration of the ALICE framework with LCG services: integration with existing LCG services, development of new tools
- Followed up the production exercise: provided solutions for site-specific problems, followed up services deployment at the sites
- Collected ALICE requirements for the middleware developers
EIS on ALICE
Development of ALICE-specific user-level tools:
- Integration of the MonALISA monitoring system with LCG; later the tools were generalized for other use cases
- FTS transfer-handling client, then integrated in the ALICE framework
- Publication of VO-specific services in the Information System, included as part of the VO-BOX middleware component
Some Results of the last PDC04
◘ Statistics after phase 1 (ended April 4, 2004):
➸ ALICE::CERN::LCG is the interface to LCG-2
➸ ALICE::Torino::LCG is the interface to GRID.IT
~1.3 million files, 26 TB data volume (S. Bagnasco, SC3 Detailed Planning Workshop, CERN, 13 June 2005)
EIS in ATLAS
Support in the development of the ATLAS framework:
- Data Management
- Workload Management

Operational support:
- Exclusion of problematic sites
- Follow-up of site configuration problems
- Understanding of failures and suggestion of solutions
[Figure: number of ATLAS jobs per day (0 to 8000), monthly from 6/25/2004 to 7/25/2005, spanning Data Challenge 2 and the Rome Production (the large event production for the Rome physics workshop), both EIS support activities]
Rome Production experience on LCG
Jobs were distributed to 45 different computing resources. The ratio, generally proportional to the size of the cluster, indicates an overall good job distribution.

No site in particular ran a large majority of the jobs. The site with the largest number of CPU resources (CERN) contributed about 11% of the ATLAS production; other major sites ran between 5% and 8% of the jobs each.

This is an achievement toward a more robust and fault-tolerant system, one that does not rely on a small number of large computing centers.
[Pie chart: the percentage of ATLAS jobs run at each LCG site: cern.ch 11%, cnaf.infn.it 7%, rl.ac.uk 7%, roma1.infn.it 5%, in2p3-cc.fr 5%, shef.ac.uk 5%, fzk.de 5%, nikhef.nl 5%, ft.uam.es 5%, others infn.it 5%, others 5%, lnl.infn.it 4%, ific.uv.es 4%, prague.cz 3%, sinica.edu.tw 3%, ba.infn.it 2%, mi.infn.it 2%, ihep.su 2%, sara.nl 2%, triumf.ca 2%, ox.ac.uk 2%, others ac.uk 2%, in2p3-cppm.fr 1%, others fr 1%, grnet.gr 1%, ifae.es 1%]
EIS in ATLAS
Service Challenge 3: support to the ATLAS Data Management System
- File Transfer Service (FTS) and LCG File Catalog (LFC)
- Prototype Data Location Interface (DLI) developed for the ATLAS WMS and DDM integration

Role in the technical coordination of the ATLAS Service Challenge activities:
- ensuring the readiness of the sites before and during the exercise
- following up issues with the different services

Testing: several new gLite components (WMS, g-pbox, FTS, …), in the context of the task force and in collaboration with ARDA

User support: analysis on LCG-produced data
EIS in CMS
LFC evaluation as a POOL file catalog
- Use case: local file catalog
- Performance tests
- Results: LFC and POOL_LFC interface issues discovered and fixed

LFC evaluation as a Data Location System
- Implementation of a Python API
- Performance tests
- Results: LFC was found to be a valid implementation of a DLS; performance issues discovered and fixed
EIS in CMS
Service Challenge 3
- Fake analysis job submission
- Analysis of job failures and related statistics
- Results: much better understanding of the stability of the LCG infrastructure when intensively used

Support
- Active in the solution of Grid-related problems for the MC production and user analysis (CRAB) activities
- CMS VO management
The CMS Analysis Jobs
[Plot of CMS analysis jobs, taken from the CMS Dashboard (ARDA)]
EIS in LHCb
EIS supported LHCb across many activities: Data Challenge 04, Service Challenge 3, the analysis exercise.

Operational support:
- chasing and tackling site- and middleware-related problems
- developing experiment-specific monitoring tools: a T1–T1 transfer monitor for SC3, VO-oriented plug-ins for SFT
EIS in LHCb
Integration of the LHCb framework and the LCG middleware
- Offering suggestions for an optimized middleware usage
- Development of user-level tools to query the information system and interact with SRM, LFC and the DLI
- Repackaging or customized versions of existing tools (lcg_utils and GFAL)

User support, especially for analysis users, using the GGUS portal

Testing of new components: CREAM CE, g-pbox, WMS, …
The LHCb Data Challenge
[Figure: number of jobs run versus time, for jobs run in LCG and in DIRAC-only sites. Annotated phases: DIRAC alone; LCG in action (1.8 × 10^6 events/day); LCG paused; LCG restarted; phase 1 completed (3–5 × 10^6 events/day). 187 M events produced; 61% efficiency for LCG]
WISDOM: research on malaria medical care, a major success in EGEE
- 1 million potential medicines tested in 1 week
- 1000 CPUs employed in EGEE/LCG

Support to the biomedical community and the WISDOM project
- First non-HEP VO supported by EIS
- Different needs, access patterns and user scenarios; a scattered and heterogeneous community

Main support activities for Biomed:
- Improvement of the job submission strategy
- Adaptation of the application to the Grid environment
- Operational support and user support

Biomedical Data Challenge in July–August 2005: ~70000 jobs run, 1 TB of data produced, the equivalent of ~70 CPU years computed.
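A quick cross-check of the data-challenge figures above: ~70 CPU-years spread over ~70000 jobs implies an average job length of roughly nine hours, a plausible duration for an in-silico docking computation.

```python
# Sanity-check the reported Biomed Data Challenge numbers:
# average job length = total CPU time / number of jobs.

JOBS = 70_000
CPU_YEARS = 70
HOURS_PER_YEAR = 365 * 24  # 8760

avg_job_hours = CPU_YEARS * HOURS_PER_YEAR / JOBS
print(round(avg_job_hours, 1))  # prints 8.8
```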
GEANT4
GEANT4: simulation of particle interactions with matter, used in HEP and nuclear experiments, medical, accelerator and space physics.

3 major productions on LCG:
- the first 2 hosted by the dteam and alice VOs, the third as a real VO
- aimed at testing new versions of the software

EIS support in the GEANT4 "Gridification" process:
- Development of tools for job submission and handling, then extended and generalized for other VOs
- Creation and administration of the GEANT VO
- Contact point for the EGEE ROC managers
- Operational support during production
Relief Projects of UNOSAT
Case study: Indian Ocean tsunami relief and development
- 29 December 2004: first map distributed online to field users
- January 2005: 200,000 tsunami maps downloaded in total

UNOSAT has a huge amount of data to be stored; a good amount of storage is provided by CERN. Running and storing data in LCG/EGEE can certainly assist UNOSAT in its purposes.

The collaboration with LCG started in summer 2005; the Gridification process is similar to the GEANT4 experience.
Summary
EIS provides help integrating VO-specific software environments with the Grid middleware:
- direct experiment support via contact persons
- special middleware distributions and documentation
- user support

Data Challenges, Service Challenges and distributed productions are no longer "sporadic" exercises. EIS follows up operational issues, maintains experiment-specific services and assists sites with configuration problems.

Overall a very interesting and productive experience; the LHC experiments and other VOs seem to find the EIS team very supportive.

Our mailing list: [email protected]
Web site: http://lcg.web.cern.ch/LCG/eis.htm