Planning LHCb computing infrastructure 22 May 2000 Slide 1
Planning LHCb computing infrastructure at CERN and at regional centres
F. Harris
Planning LHCb computing infrastructure 22 May 2000 Slide 2
Talk Outline
Reminder of LHCb distributed model
Requirements and planning for 2000-2005 (growth of regional centres)
EU GRID proposal status and LHCb planning
Large prototype proposal and LHCb possible uses
Some NEWS from LHCb (and other) activities
Planning LHCb computing infrastructure 22 May 2000 Slide 3
General Comments
A draft LHCb Technical Note for the computing model exists
New requirements estimates have been made (big changes in MC requirements)
Several presentations have been made to the LHC computing review in March and May (and at the May 8 LHCb meeting)
http://lhcb.cern.ch/computing/Steering/Reviews/LHCComputing2000/default.htm
Planning LHCb computing infrastructure 22 May 2000 Slide 4
Baseline Computing Model - Roles
To provide an equitable sharing of the total computing load we can envisage a scheme such as the following:
After 2005 the role of CERN (notionally 1/3) is to be the production centre for real data and to support physics analysis of real and simulated data by CERN-based physicists
The role of the regional centres (notionally 2/3) is to be production centres for simulation and to support physics analysis of real and simulated data by local physicists
Institutes with sufficient CPU capacity share the simulation load, with data archived at the nearest regional centre
[Diagram: baseline computing model. The Production Centre (generates raw data, reconstruction, production analysis, user analysis; CPU for production, mass storage for RAW, ESD, AOD and TAG) ships AOD and TAG data (real: 80 TB/yr, simulated: 120 TB/yr) to the Regional Centres (user analysis; CPU for analysis, mass storage for AOD and TAG), which in turn ship AOD and TAG data (8-12 TB/yr) to the Institutes (selected user analyses; CPU and data servers).]
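For a rough sense of the network load these flows imply, the sketch below (Python) converts the annual AOD/TAG volumes into average sustained rates. It assumes transfers are spread uniformly over the year and reads the quoted volumes as the flow on a single link; both are illustrative assumptions, not part of the model.

    # Back-of-envelope: average sustained bandwidth implied by the annual AOD/TAG flows.
    SECONDS_PER_YEAR = 365 * 24 * 3600

    def sustained_mbit_per_s(tb_per_year: float) -> float:
        """Convert an annual volume in TB into an average rate in Mbit/s."""
        return tb_per_year * 1e12 * 8 / SECONDS_PER_YEAR / 1e6

    # Production centre -> regional centre: AOD+TAG, real 80 TB/yr + simulated 120 TB/yr
    print(f"to regional centre: ~{sustained_mbit_per_s(80 + 120):.0f} Mbit/s")  # ~51 Mbit/s
    # Regional centre -> institute: AOD+TAG, 8-12 TB/yr
    print(f"to institute:       ~{sustained_mbit_per_s(12):.1f} Mbit/s")        # ~3 Mbit/s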
Planning LHCb computing infrastructure 22 May 2000 Slide 6
Physics : Plans for Simulation 2000-2005
In 2000 and 2001 we will produce 3 x 10^6 simulated events each year for detector optimisation studies in preparation for the detector TDRs (expected in 2001 and early 2002).
In 2002 and 2003 studies will be made of the high level trigger algorithms, for which we are required to produce 6 x 10^6 simulated events each year.
In 2004 and 2005 we will start to produce very large samples of simulated events, in particular background, for which samples of 10^7 events are required.
This ongoing physics production work will be used as far as is practicable for testing the development of the computing infrastructure.
Planning LHCb computing infrastructure 22 May 2000 Slide 7
Computing : MDC Tests of Infrastructure
2002 : MDC 1 - application tests of grid middleware and farm management software using a real simulation and analysis of 10^7 B channel decay events. Several regional facilities will participate: CERN, RAL, Lyon/CCIN2P3, Liverpool, INFN, ….
2003 : MDC 2 - participate in the exploitation of the large-scale Tier 0 prototype to be set up at CERN:
High Level Triggering – online environment, performance
Management of systems and applications
Reconstruction – design and performance optimisation
Analysis – study of chaotic data access patterns
STRESS TESTS of data models, algorithms and technology
2004 : MDC 3 - start to install the event filter farm at the experiment, to be ready for commissioning of detectors in 2004 and 2005
Planning LHCb computing infrastructure 22 May 2000 Slide 8
Cost of CPU, disk and tape
Moore’s Law evolution with time of the cost of CPU and storage. The scale in MSFr is for a facility sized to ATLAS requirements (> 3 x LHCb)
At today’s prices the total cost for LHCb (CERN and regional centres) would be ~60 MSFr
In 2004 the cost would be ~10-20 MSFr
After 2005 the maintenance cost is ~5 MSFr/year
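The 2004 figure follows from the same trend: if hardware prices halve roughly every two years (the decline implied by the unit-cost table on a later slide), the ~60 MSFr today's-price estimate scales down as in this small Python sketch (an illustrative extrapolation, not the costing used for the review).

    # Illustrative Moore's-law extrapolation of the total LHCb cost,
    # assuming hardware prices halve every two years.
    def cost_at(year: int, cost_2000_msfr: float = 60.0, halving_years: float = 2.0) -> float:
        return cost_2000_msfr * 0.5 ** ((year - 2000) / halving_years)

    print(f"Total cost at 2004 prices: ~{cost_at(2004):.0f} MSFr")  # ~15 MSFr, within the quoted 10-20 MSFr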
Planning LHCb computing infrastructure 22 May 2000 Slide 9
Growth in Requirements to Meet Simulation Needs
                                       2000     2001     2002     2003     2004     2005     2006     2007     2008     2009     2010
No of signal events                    1.0E+06  1.0E+06  2.0E+06  3.0E+06  5.0E+06  1.0E+07  1.0E+07  1.0E+07  1.0E+07  1.0E+07  1.0E+07
No of background events                1.0E+06  1.5E+06  2.0E+06  4.0E+06  1.0E+07  1.0E+09  1.0E+09  1.0E+09  1.0E+09  1.0E+09  1.0E+09
CPU for simulation of signal (SI95)    10000    10000    20000    30000    50000    100000   100000   100000   100000   100000   100000
CPU for background simulation (SI95)   16000    24000    32000    64000    160000   400000   400000   400000   400000   400000   400000
CPU user analysis (SI95)               2500     2500     5000     7500     12500    25000    25000    25000    25000    25000    25000
RAWmc data on disk (TB)                0.4      0.5      0.8      1.4      3        202      202      202      202      202      202
RAWmc data on tape (TB)                0.4      0.5      0.8      1.4      3        202      202      202      202      202      202
ESDmc data on disk (TB)                0.2      0.25     0.4      0.7      1.5      101      101      101      101      101      101
AODmc data on disk (TB)                0.06     0.1      0.1      0.3      0.5      30.5     39.4     42.1     42.9     43.2     43.3
TAGmc data on disk (TB)                0.002    0.0025   0.004    0.007    0.015    1.01     1.01     1.01     1.01     1.01     1.01
Unit Costs
                  2000   2001   2002   2003   2004   2005   2006   2007   2008   2009   2010
CPU cost / SI95   64.5   46.1   32.9   23.5   16.8   12.0   8.6    6.1    4.4    3.1    2.2
Disk cost / GB    16.1   11.5   8.2    5.9    4.2    3.0    2.1    1.5    1.1    0.8    0.6
Tape cost / GB    2.7    1.9    1.4    1.0    0.7    0.5    0.36   0.26   0.18   0.13   0.09
                              2000   2001   2002   2003   2004   2005   2006   2007   2008   2009   2010
CPU for signal (kSFr)         323    230    329    235    336    600    171    122    87     62     45
CPU for background (kSFr)     0      369    263    753    1613   2880   857    612    437    312    223
CPU for user analysis (kSFr)  65     92     66     59     84     150    69     49     35     25     18
RAWmc data on disk (kSFr)     3      6      2      4      7      597    129    92     66     47     33
RAWmc data on tape (kSFr)     0.2    0.2    0.2    0.3    0.4    20.2   14.4   10.3   7.4    5.3    3.8
ESDmc data on disk (kSFr)     3      3      1      2      3      299    64     46     33     23     17
AODmc data on disk (kSFr)     1      1      0      1      1      90     21     15     11     8      6
TAGmc data on disk (kSFr)     0.0    0.0    0.0    0.0    0.1    3.0    2.2    1.5    1.1    0.8    0.6
Investment per year (kSFr)    395    701    663    1053   2045   4639   1328   949    678    484    346
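Two regularities in these tables can be checked in a few lines; this is an inferred reading of how the numbers were built, offered as a consistency check rather than a statement of the official costing rules: unit costs fall by a near-constant factor of ~0.71 per year (halving roughly every two years), and for most years the investment entries equal the year's capacity increment bought at that year's unit price.

    # Consistency check of the cost tables above (inferred reading, not the official model).
    cpu_cost_per_si95 = {2000: 64.5, 2001: 46.1, 2002: 32.9, 2003: 23.5, 2004: 16.8, 2005: 12.0}
    cpu_signal_si95 = {2000: 10000, 2001: 10000, 2002: 20000, 2003: 30000, 2004: 50000, 2005: 100000}

    # (1) annual decline factor of the CPU unit cost (~0.71 each year)
    for y in range(2001, 2006):
        print(f"{y}: unit-cost ratio vs {y - 1} = {cpu_cost_per_si95[y] / cpu_cost_per_si95[y - 1]:.2f}")

    # (2) signal-CPU investment as (capacity added) x (unit cost), in kSFr
    for y in range(2002, 2006):
        added = cpu_signal_si95[y] - cpu_signal_si95[y - 1]
        print(f"{y}: {added * cpu_cost_per_si95[y] / 1000:.0f} kSFr")  # 329, 235, 336, 600 as in the table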
Planning LHCb computing infrastructure 22 May 2000 Slide 10
Cost / Regional Centre for Simulation
Assume there are 5 regional centres (UK, IN2P3, INFN, CERN + consortium of Nikhef, Russia, etc...)
Assume costs are shared equally
Investment per regional centre (kSFr):
                        2000   2001   2002   2003   2004   2005   2006   2007   2008   2009   2010
CPU for signal          64.5   46.1   65.9   47.0   67.2   120.0  34.3   24.5   17.5   12.5   8.9
CPU for background      0.0    73.8   52.7   150.5  322.6  576.0  171.4  122.4  87.5   62.5   44.6
CPU user analysis       12.9   18.4   13.2   11.8   16.8   30.0   13.7   9.8    7.0    5.0    3.6
RAWmc data on disk      0.6    1.2    0.5    0.7    1.3    119.4  25.7   18.4   13.1   9.4    6.7
RAWmc data on tape      0.0    0.0    0.0    0.1    0.1    4.0    2.9    2.1    1.5    1.1    0.8
ESDmc data on disk      0.6    0.6    0.2    0.4    0.7    59.7   12.9   9.2    6.6    4.7    3.3
AODmc data on disk      0.2    0.2    0.1    0.1    0.2    18.0   4.3    3.1    2.2    1.6    1.1
TAGmc data on disk      0.0    0.0    0.0    0.0    0.0    0.6    0.4    0.3    0.2    0.2    0.1
Investment per year     79     140    133    211    409    928    266    190    136    97     69

Requirements per regional centre:
                                          2000      2001      2002      2003      2004      2005      2006      2007      2008      2009      2010
No of signal events                       2.0E+05   2.0E+05   4.0E+05   6.0E+05   1.0E+06   2.0E+06   2.0E+06   2.0E+06   2.0E+06   2.0E+06   2.0E+06
No of background events                   2.0E+05   3.0E+05   4.0E+05   8.0E+05   2.0E+06   2.0E+08   2.0E+08   2.0E+08   2.0E+08   2.0E+08   2.0E+08
CPU for simulation of signal (SI95)       2000      2000      4000      6000      10000     20000     20000     20000     20000     20000     20000
CPU for simulation of background (SI95)   3200      4800      6400      12800     32000     80000     80000     80000     80000     80000     80000
CPU user analysis (SI95)                  500       500       1000      1500      2500      5000      5000      5000      5000      5000      5000
RAWmc data on disk (TB)                   0.08      0.1       0.16      0.28      0.6       40.4      40.4      40.4      40.4      40.4      40.4
RAWmc data on tape (TB)                   0.08      0.1       0.16      0.28      0.6       40.4      40.4      40.4      40.4      40.4      40.4
ESDmc data on disk (TB)                   0.04      0.05      0.08      0.14      0.3       20.2      20.2      20.2      20.2      20.2      20.2
AODmc data on disk (TB)                   0.012     0.0186    0.02958   0.05087   0.10526   6.09158   7.88747   8.42624   8.58787   8.63636   8.65091
TAGmc data on disk (TB)                   0.0004    0.0005    0.0008    0.0014    0.003     0.202     0.202     0.202     0.202     0.202     0.202
Planning LHCb computing infrastructure 22 May 2000 Slide 11
EU GRID proposal status (http://grid.web.cern.ch/grid/)
GRIDs: software to manage all aspects of distributed computing (security and authorisation, resource management, monitoring). Interface to high energy physics...
Proposal was submitted May 9
Main signatories (CERN, France, Italy, UK, Netherlands, ESA) + associate signatories (Spain, Czech Republic, Hungary, Portugal, Scandinavia, ...)
Project composed of Work Packages (to which countries provide effort)
LHCb involvement
Depends on country
Essentially comes via ‘Testbeds’ and ‘HEP applications’
Planning LHCb computing infrastructure 22 May 2000 Slide 12
EU Grid Work Packages
Middleware
Grid work scheduling - C. Vistoli (INFN)
Grid Data Management - B. Segal (IT)
Grid Application Monitoring - R. Middleton (RAL)
Fabric Management - T. Smith (IT)
Mass Storage Management - O. Barring (IT)
Infrastructure
Testbed and Demonstrators (LHCb in) - F. Etienne (Marseille)
Network Services - C. Michau (CNRS)
Applications
HEP (LHCb in) - H. Hoffmann (CERN)
Earth Observation - L. Fusco (ESA)
Biology - C. Michau (CNRS)
Management
Project Management - F. Gagliardi (IT)
Planning LHCb computing infrastructure 22 May 2000 Slide 13
Grid LHCb WP - Grid Testbed (DRAFT)
The MAP farm at Liverpool has 300 processors; it would take 4 months to generate the full sample of events
All data generated (~3 TB) would be transferred to RAL for archive (UK regional facility).
All AOD and TAG datasets dispatched from RAL to other regional centres, such as Lyon, CERN, INFN.
Physicists run jobs at the regional centre, or ship AOD and TAG data to their local institute and run jobs there. Also copy ESD for a fraction (~10%) of events for systematic studies (~100 GB).
The resulting data volumes to be shipped between facilities over 4 months would be as follows:
Liverpool to RAL 3 TB (RAW, ESD, AOD and TAG)
RAL to LYON/CERN/… 0.3 TB (AOD and TAG)
LYON to LHCb institute 0.3 TB (AOD and TAG)
RAL to LHCb institute 100 GB (ESD for systematic studies)
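As a back-of-envelope check of what these volumes mean for the network, the Python sketch below converts each transfer into the average sustained rate needed over the 4-month period (uniform transfer is assumed purely for illustration; real transfers would be burstier).

    # Average sustained rate needed to move each dataset within the 4-month window.
    FOUR_MONTHS_S = 4 * 30 * 24 * 3600  # ~1.0e7 seconds

    def mbit_per_s(gigabytes: float) -> float:
        return gigabytes * 1e9 * 8 / FOUR_MONTHS_S / 1e6

    for link, gb in [("Liverpool -> RAL (3 TB)", 3000.0),
                     ("RAL -> Lyon/CERN/... (0.3 TB)", 300.0),
                     ("Lyon -> LHCb institute (0.3 TB)", 300.0),
                     ("RAL -> LHCb institute (100 GB)", 100.0)]:
        print(f"{link:30s} ~ {mbit_per_s(gb):.2f} Mbit/s sustained")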
Planning LHCb computing infrastructure 22 May 2000 Slide 14
MILESTONES for the 3-year EU GRID project starting Jan 2001
Mx1 (June 2001): Coordination with the other WPs. Identification of use cases and minimal grid services required at every step of the project. Planning of the exploitation of the GRID steps.
Mx2 (Dec 2001): Development of use case programs. Interface with existing GRID services as planned in Mx1.
Mx3 (June 2002): Run #0 executed (distributed Monte Carlo production and reconstruction) and feedback provided to the other WPs.
Mx4 (Dec 2002): Run #1 executed (distributed analysis) and corresponding feedback to the other WPs. WP workshop.
Mx5 (June 2003): Run #2 executed, including additional GRID functionality.
Mx6 (Dec 2003): Run #3 extended to a larger user community.
Planning LHCb computing infrastructure 22 May 2000 Slide 15
‘Agreed’ LHCb resources going into EU GRID project over 3 years
Country : FTE equivalent / year
CERN : 1
France : 1
Italy : 1
UK : 1
Netherlands : 0.5
These people should work together… LHCb GRID CLUB!
This is for HEP applications WP - interfacing our physics software into the GRID and running it in testbed environments
Some effort may also go into the Testbed WP (not yet known whether LHCb countries have signed up for this)
Planning LHCb computing infrastructure 22 May 2000 Slide 16
Grid computing – LHCb planning
Now : Forming GRID technical working group with reps from regional facilities Liverpool(1), RAL(2), CERN(1), IN2P3(?), INFN(?), …
June 2000 : define simulation samples needed in coming years
July 2000 : Install Globus software in LHCb regional centres and start to study integration with LHCb production tools
End 2000 : define grid services for farm production
June 2001 : implementation of basic grid services for farm production provided by EU Grid project
Dec 2001 : MDC 1 - small production for test of software implementation (GEANT4)
June 2002 : MDC 2 - large production of signal/background sample for tests of world-wide analysis model
June 2003 : MDC 3 - stress/scalability test on large-scale Tier 0 facility, tests of Event Filter Farm, farm control/management, data throughput tests.
Planning LHCb computing infrastructure 22 May 2000 Slide 17
Prototype Computing Infrastructure
Aim to build a prototype production facility at CERN in 2003 (proposal coming out of the LHC computing review)
Scale of prototype limited by what is affordable - ~0.5 of the number of components of the ATLAS system
Cost ~20 MSFr
Joint project between the four experiments
Access to the facility for tests to be shared
Need to develop a distributed network of resources involving other regional centres and deploy data production software over the infrastructure for tests in 2003
Results of this prototype deployment used as basis for Computing MoU
Planning LHCb computing infrastructure 22 May 2000 Slide 18
Tests Using Tier 0 Prototype in 2003
We intend to make use of the Tier 0 prototype planned for construction in 2003 to make stress tests of both hardware and software
We will prepare realistic examples of two types of application:
Tests designed to gain experience with the online farm environment
Production tests of simulation, reconstruction, and analysis
Planning LHCb computing infrastructure 22 May 2000 Slide 19
Event Filter Farm Architecture
[Diagram: ~100 Readout Units (RU) feed a switch functioning as the readout network (readout network technology: GbE?). On the other side of the switch sit of order 100 sub-farms, each containing a Sub-Farm Controller (SFC), ~10 work CPUs and a Control PC (CPC); the sub-farm network technology is Ethernet. A controls network (Ethernet) links the Controls System to the sub-farms, and storage controller(s) feed Storage/CDR. Legend: SFC = Sub-Farm Controller, CPC = Control PC, CPU = work CPU.]
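Taking the approximate multiplicities in the diagram at face value (the '~' labels are order-of-magnitude figures, and one control PC per sub-farm is an assumption), the farm scale is roughly:

    # Indicative event filter farm scale from the diagram's approximate multiplicities.
    n_sub_farms = 100        # ~100 sub-farms, each headed by a Sub-Farm Controller (SFC)
    cpus_per_sub_farm = 10   # ~10 work CPUs behind each SFC

    print(f"Work CPUs  : ~{n_sub_farms * cpus_per_sub_farm}")  # ~1000
    print(f"Control PCs: ~{n_sub_farms}")                      # assuming one CPC per sub-farm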
Planning LHCb computing infrastructure 22 May 2000 Slide 20
Testing/Verification
[Diagram: the same event filter farm architecture (switch functioning as readout network, ~100 RUs, sub-farms of SFC + ~10 CPUs + CPC, controls network, Controls System, storage controller(s), Storage/CDR), annotated to show which parts are covered by small-scale lab tests + simulation, full-scale lab tests, and large/full-scale tests using the farm prototype.]
Planning LHCb computing infrastructure 22 May 2000 Slide 21
Scalability tests for simulation and reconstruction
Test writing of reconstructed + raw data at 200 Hz in the online farm environment
Test writing of reconstructed + simulated data in the offline Monte Carlo farm environment
Population of the event database from multiple input processes
Test efficiency of the event and detector data models
Access to conditions data from multiple reconstruction jobs
Online calibration strategies and distribution of results to multiple reconstruction jobs
Stress testing of reconstruction to identify hot spots, weak code, etc.
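To give a feel for the data rate of the 200 Hz online writing test, a small Python sketch follows; the per-event sizes used are illustrative assumptions, not numbers quoted in this talk.

    # Rough data rate of the online-farm writing test at 200 Hz.
    EVENT_RATE_HZ = 200   # writing rate quoted for the online farm test
    RAW_KB = 100          # assumed RAW event size (illustrative)
    RECO_KB = 100         # assumed reconstructed event size (illustrative)

    rate_bytes_s = EVENT_RATE_HZ * (RAW_KB + RECO_KB) * 1e3
    print(f"Sustained write rate: ~{rate_bytes_s / 1e6:.0f} MB/s")         # ~40 MB/s
    print(f"Volume per day      : ~{rate_bytes_s * 86400 / 1e12:.1f} TB")  # ~3.5 TB/day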
Planning LHCb computing infrastructure 22 May 2000 Slide 22
Scalability tests for analysis
Stress test of event database
Multiple concurrent accesses by “chaotic” analysis jobs
Optimisation of data model
Study data access patterns of multiple, independent, concurrent analysis jobs
Modify event and conditions data models as necessary
Determine data clustering strategies
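A minimal sketch of what a “chaotic” concurrent-access stress test could look like: many independent reader processes hit a shared store at random offsets while read latency is measured. This is a generic toy against a plain file (it does not use the actual LHCb event database), purely to illustrate the access pattern being studied.

    # Toy "chaotic access" stress test: independent readers fetch events at random
    # positions from a shared store; mean read latency per job is reported.
    import os
    import random
    import time
    from multiprocessing import Pool

    STORE = "toy_event_store.bin"
    EVENT_SIZE = 10_000   # assumed event size in bytes (illustrative)
    N_EVENTS = 10_000
    N_JOBS = 8            # concurrent "analysis jobs"
    READS_PER_JOB = 500

    def make_store():
        """Write a dummy store of fixed-size events."""
        with open(STORE, "wb") as f:
            for _ in range(N_EVENTS):
                f.write(os.urandom(EVENT_SIZE))

    def chaotic_reader(seed: int) -> float:
        """Read events at random offsets; return the mean latency per read in ms."""
        rng = random.Random(seed)
        start = time.time()
        with open(STORE, "rb") as f:
            for _ in range(READS_PER_JOB):
                f.seek(rng.randrange(N_EVENTS) * EVENT_SIZE)
                f.read(EVENT_SIZE)
        return (time.time() - start) / READS_PER_JOB * 1000

    if __name__ == "__main__":
        make_store()
        with Pool(N_JOBS) as pool:
            latencies = pool.map(chaotic_reader, range(N_JOBS))
        print("Mean read latency per job (ms):", [round(x, 3) for x in latencies])
        os.remove(STORE)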
Planning LHCb computing infrastructure 22 May 2000 Slide 23
Work required now for planning prototypes for 2003/4 (request from Resource panel of LHC review)
Plan for evolution to prototypes (Tier 0/1) - who will work on this from the institutes?
Hardware evolution
Spending profile
Organisation (sharing of responsibilities in collaboration/CERN/centres)
Description of Mock Data Challenges
Draft of proposal (hardware and software) for prototype construction? By end 2000
If a shared Tier-0 prototype, then a single proposal for the 4 experiments??
Planning LHCb computing infrastructure 22 May 2000 Slide 24
Some NEWS from LHCb RC activities (and other..)
LHCb/Italy currently preparing a case to be submitted to INFN in June (compatible with the planning shown in this talk)
Liverpool
Increased COMPASS nodes to 6 (3 TBytes of disk)
Bidding for a 1000-PC system with 800 MHz/processor and 70 GByte/processor
Globus should be fully installed soon
Collaborating with Cambridge Astronomy to test the Globus package
Other experiments and the GRID
CDF and BaBar planning to set up GRID prototypes soon…
GRID workshop in Sep (date and details to be confirmed)
Any other news?