The LHCb Computing Model
Philippe Charpentier, CERN
ICFA workshop on Grid activities, Sinaia, Romania, 13-18 October 2006
LHCb in brief
- Experiment dedicated to studying CP violation
  - The mechanism responsible for the dominance of matter over antimatter
  - Matter-antimatter difference studied using the b-quark (beauty)
  - High precision physics (tiny differences…)
- Single-arm spectrometer
  - Looks like a fixed-target experiment
  - Smallest of the 4 big LHC experiments, ~500 physicists
- Nevertheless, computing is also a challenge…
ICFA Workshop on Grid, Sinaia, Romania, 13-17 October 2006 LHCb Computing Model, PhC 3
LHCb data processing software
[Diagram: the LHCb data processing chain - Gauss (simulation), Boole (digitisation), Moore (trigger), Brunel (reconstruction) and DaVinci (analysis) - exchanging GenParts, MCParts, MCHits, Raw Data, (r)DST, DST and AOD, all built on the Gaudi framework and sharing the event model / physics event model and the Conditions Database.]
LHCb software stack
- Uses CMT for build and configuration (handling dependencies)
- LHCb projects:
  - Applications: Gauss (simulation), Boole (digitisation), Brunel (reconstruction), Moore (HLT), DaVinci (analysis)
  - Algorithms: Lbcom (common packages), Rec (reconstruction), Phys (physics), Online
  - Event model: LHCb
  - Software framework: Gaudi
- LCG Applications Area: POOL, ROOT, COOL
- LCG/external: external software (boost, xerces…) and middleware clients (lfc, gfal, …)
[Diagram: the LHCb software stack. Applications (Gauss, Boole, Brunel, Moore, Panoramix, DaVinci) sit on top of the component projects (Phys, Rec, Lbcom, Event Model, LHCb, Online), which sit on the Gaudi framework, which in turn relies on the LCG Applications Area software (SEAL, POOL, ROOT, COOL, CORAL, Geant4, GENSER, external libraries).]
LHCb Basic Computing principles
- Raw data shipped in real time to Tier-0
  - Registered in the Grid (File Catalog)
  - Raw data provenance kept in a Bookkeeping database (query-enabled)
  - Resilience enforced by a second copy at Tier-1s
  - Rate: ~2000 evts/s at 35 kB/evt, i.e. ~70 MB/s (see the short check below)
  - 4 main trigger sources (with little overlap): b-exclusive, dimuon, D*, b-inclusive
- All data processing up to final Tuple or histogram production is distributed
  - Not even possible to reconstruct all data at Tier-0…
- Part of the analysis is not data-related
  - Extracting physics parameters on CP violation (toy-MC, complex fitting procedures…)
  - Also using distributed computing resources
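A quick numerical check of the rate quoted above (a minimal sketch in Python; the two inputs are the figures from this slide):

```python
# Back-of-the-envelope check of the raw data rate quoted on this slide.
trigger_rate_hz = 2000        # events per second out of the trigger
raw_event_size_kb = 35        # kB per raw event (current estimate)

rate_mb_per_s = trigger_rate_hz * raw_event_size_kb / 1000.0
print(f"Raw data rate: {rate_mb_per_s:.0f} MB/s")   # ~70 MB/s, as stated
```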
Basic principles (cont’d)
- LHCb runs jobs where the data are; all data are placed explicitly
- Analysis made possible by reduction of datasets
  - many different channels of interest
  - very few events in each channel (from 10^2 to 10^6 events / year)
  - a physicist deals with at most 10^7 events
  - small and simple events
  - final dataset manageable on a physicist's desktop (100's of GBytes, see the illustration below)
- Calibration and alignment performed on a selected part of the data stream
  - Alignment and tracking calibration using dimuons (~200/s)
  - PID calibration using D* (~100/s)
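Purely illustrative arithmetic behind the "fits on a desktop" statement; the ~22 kB/event figure is my assumption (the ~110 kB stripped DST divided by the event-size reduction factor of 5 quoted on the analysis-requirements slide), not a number from this slide:

```python
# Illustration only: how ~10^7 selected events can fit on a physicist's desktop.
# The ~22 kB/event figure is an assumption (stripped DST ~110 kB divided by the
# event-size reduction factor of 5 quoted on the analysis slide).
n_events = 1e7
reduced_event_size_kb = 110 / 5        # ~22 kB/event after analysis-level reduction

dataset_gb = n_events * reduced_event_size_kb / 1e6
print(f"Final dataset: ~{dataset_gb:.0f} GB")      # a couple of hundred GB
```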
LHCb dataflow
[Dataflow diagram: raw data flows from the Online system to the Tier-0 MSS-SE and is copied to the Tier-1 MSS-SEs; reconstruction (Raw → rDST) and stripping (rDST + Raw → DST) run at Tier-0 and the Tier-1s; simulation runs at Tier-2s (and other sites), shipping Digi output to the Tier-1 MSS-SEs; analysis runs at the Tier-1s on the DSTs.]
Comments on the LHCb Distributed Computing
- Only the last part of the analysis is foreseen to be "interactive"
  - Either analysing ROOT trees or using GaudiPython/pyROOT
- User analysis at Tier-1s - why?
  - Analysis is very delicate and needs careful file placement
  - Tier-1s are easier to check, less prone (in principle) to outages
  - CPU requirements are very modest
- What is LHCb's concept of the Grid?
  - A set of computing resources working in a collaborative way
  - Provides computing resources for the collaboration as a whole
  - Recognition of contributions is independent of what type of jobs are run at a site
  - There are no noble and less noble tasks; all are needed to make the experiment a success
  - Resources are not made available for nationals only
- High availability of resources is the key issue
How to best achieve Distributed Computing?
- Data Management is paramount
  - It was almost completely absent from EDG: R&D took place but didn't deliver anything usable
  - (Too) few resources are allocated to it in EGEE
  - Successful packages were developed in close collaboration with the VOs
    - LFC, FTS: very close contacts with users
    - SRM v2.2 specification: done at the experiments' request and with their participation
- Infrastructure is vital
  - Resource management, 24x7 support coverage
  - Reliable and powerful networks (OPN)
- Resource sharing is a must
  - Less support needed
  - Best resource usage (fewer idle CPUs, empty tapes, unused networks…)
  - … but opportunistic resources should not be neglected…
How to best achieve Distributed Computing (cont’d)
- Workload Management
  - This received most of the development effort (EDG, EGEE)
  - Developments were not (and are still not) done in such close collaboration with users
    - Experiments participate in TCG meetings, but their experience is not sufficiently taken into account
    - Experiments had to develop their own solutions to implement what they needed
- A bit of history…
  - 2000-2004: EDG (R&D)
  - 2004: LCG ARDA RTAG - generated great hopes…
  - 2004-: EGEE WMS re-engineering - still not fully exposed to experiments and not at the expected level (although more stable)
- Analysis tasks require ~99% efficiency
  - In parallel, experiments developed their own solutions to cope with these inefficiencies: AliEn, DIRAC
  - These also allow them to deal with heterogeneous Grids… and take advantage of opportunistic resources
LHCb Distributed Computing software
- Integrated WMS and DMS: DIRAC
  - Presentations by Andrei and Andrew on Sunday
- Distributed analysis portal: GANGA
  - Presentation by Ulrik on Friday
  - Uses DIRAC W&DMS as back-end
- Main characteristics
  - Implements late job scheduling: an overlay network of pilot agents pulling work from a central task queue (a conceptual sketch follows below)
  - Allows LHCb policy to be enforced
  - Alleviates the level of support required from sites
  - LHCb services designed to be redundant and hence highly available (multiple instances with failover, VO-BOXes)
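To make the late-scheduling idea concrete, here is a deliberately simplified, hypothetical sketch of the pilot-agent / central-task-queue pattern: pilots are scheduled on worker nodes first, and only then pull a matching payload from the experiment's queue, so the binding of job to resource happens at the last moment. This is not DIRAC code; every name below is invented for the illustration.

```python
# Hypothetical sketch of late job scheduling with pilot agents and a central
# task queue (the names are illustrative, this is not the DIRAC API).
from queue import Queue

class CentralTaskQueue:
    """Holds the experiment's waiting payloads; owned by the VO, not the sites."""
    def __init__(self):
        self._tasks = Queue()

    def submit(self, payload):
        self._tasks.put(payload)

    def match(self, resource_description):
        # Real systems match payload requirements against the resource here;
        # this sketch simply hands out the next waiting payload, if any.
        return None if self._tasks.empty() else self._tasks.get()

def pilot_agent(site, queue):
    """A pilot job: it has already been scheduled on a worker node by the site
    batch system, and only now asks the VO what it should actually run."""
    payload = queue.match({"site": site})
    if payload is None:
        return f"{site}: no work, exiting quietly"
    return f"{site}: running '{payload}'"

queue = CentralTaskQueue()
queue.submit("reconstruct run 1234")
queue.submit("user analysis job 42")

for site in ["CERN", "CNAF", "RAL"]:
    print(pilot_agent(site, queue))
```

The pull model is what lets the experiment apply its own priorities centrally and shields users from individual site problems: an unlucky pilot simply exits without a payload.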
The LHCb Tier1s
- 6 Tier-1s: CNAF (IT, Bologna), GridKa (DE, Karlsruhe), IN2P3 (FR, Lyon), NIKHEF (NL, Amsterdam), PIC (ES, Barcelona), RAL (UK, Didcot)
- Contribute to reconstruction, stripping and analysis
- Keep copies on MSS of:
  - Raw (2 copies, shared)
  - locally produced rDST
  - DST (2 copies)
  - MC data (2 copies)
- Keep copies on disk of DST (7 copies)
LHCb Computing: a few numbers
- Event sizes on persistent medium (not in memory) and processing times: best estimates as of today
- Requirements for 2008: 4×10^6 seconds of beam

Event size (kB)        TDR estimate   Current estimate
RAW                    25             35
rDST                   25             20
DST                    100            110

Event processing       TDR estimate   Current estimate
(kSI2k.s)
Reconstruction         2.4            2.4
Stripping              0.2            0.2
Analysis               0.3            0.3

Breakdown of trigger rate (Hz): b-exclusive 200, dimuon 600, D* 300, b-inclusive 900 (total 2000)
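A short consistency check tying these numbers to the raw-data figures elsewhere in the talk (sketch only; all inputs appear on the slides):

```python
# Consistency check: trigger breakdown, total rate, events per year and raw volume.
beam_seconds_2008 = 4e6                    # assumed seconds of beam in 2008
raw_event_size_kb = 35                     # current RAW size estimate
trigger_rates_hz = {"b-exclusive": 200, "dimuon": 600, "D*": 300, "b-inclusive": 900}

total_rate_hz = sum(trigger_rates_hz.values())            # 2000 Hz
events_per_year = total_rate_hz * beam_seconds_2008       # 8x10^9 events
raw_tb_per_copy = events_per_year * raw_event_size_kb / 1e9

print(total_rate_hz, events_per_year, raw_tb_per_copy)
# 2000 Hz, 8e9 events and ~280 TB per copy; with the second copy kept at the
# Tier-1s this matches the ~560 TB of RAW on tape in the 2008 summary slide.
```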
Reconstruction requirements
• 2 passes per year:
  • 1 quasi real time pass over a ~100 day period (2.8 MSI2k)
  • re-processing over the 2 month shutdown period (4.3 MSI2k)
  • make use of the Filter Farm at the pit (2.2 MSI2k) - data shipped back to the pit

                      b-exclusive   dimuon     D*         b-inclusive   Total
Input fraction        0.1           0.3        0.15       0.45          1.0
Number of events      8×10^8        2.4×10^9   1.2×10^9   3.6×10^9      8×10^9
MSS storage (TB)      16            48         24         72            160
CPU (MSI2k.yr)        0.15          0.45       0.23       0.68          1.52
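The event counts and MSS (rDST) storage in this table follow directly from the trigger rates and event sizes quoted earlier; a minimal sketch of the arithmetic (the CPU column involves extra factors not spelled out on the slide, so it is not reproduced):

```python
# Per-stream reconstruction bookkeeping from numbers quoted earlier:
# 2000 Hz total trigger rate, 4x10^6 s of beam, rDST size 20 kB/event.
beam_seconds = 4e6
total_rate_hz = 2000
rdst_size_kb = 20
input_fractions = {"b-exclusive": 0.10, "dimuon": 0.30, "D*": 0.15, "b-inclusive": 0.45}

for stream, fraction in input_fractions.items():
    n_events = fraction * total_rate_hz * beam_seconds
    rdst_tb = n_events * rdst_size_kb / 1e9
    print(f"{stream:12s}  {n_events:.1e} events  {rdst_tb:.0f} TB of rDST")
# 8e8 events / 16 TB, 2.4e9 / 48 TB, 1.2e9 / 24 TB, 3.6e9 / 72 TB,
# i.e. the 'Number of events' and 'MSS storage' rows of the table above.
```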
Stripping requirements
• Stripping 4 times per year - 1 month of production outside of reconstruction
• Stripping has at least 4 output streams
• Only rDST + RAW stored for "non-b" channels, i.e. 55 kB/evt
• RAW + full DST for "b" channels, i.e. 110 kB/evt
• Output on disk SE at all Tier-1 centres

                               Exclusive-b   dimuon     D*         Inclusive-b   Total
Input fraction                 0.1           0.3        0.15       0.45          1.00
Reduction factor               10            5          5          100           9.57
Event yield per stripping      8×10^7        4.8×10^8   2.4×10^8   3.6×10^7      8.4×10^8
CPU (MSI2k.year)               0.02          0.06       0.03       0.02          0.11
Storage per stripping (TB)     9             26         13         4             52
TAG (TB)                       1             2          1          4             8
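The event yields and storage per stripping can likewise be reproduced from the input fractions, reduction factors and the 55/110 kB output sizes above; a sketch:

```python
# Stripping bookkeeping per stream: yield = input events / reduction factor,
# storage = yield x output event size (110 kB for b streams, 55 kB otherwise).
beam_seconds, total_rate_hz = 4e6, 2000
streams = {
    # name: (input fraction, reduction factor, output size in kB)
    "Exclusive-b": (0.10, 10, 110),
    "dimuon":      (0.30, 5, 55),
    "D*":          (0.15, 5, 55),
    "Inclusive-b": (0.45, 100, 110),
}

for name, (fraction, reduction, size_kb) in streams.items():
    yield_per_stripping = fraction * total_rate_hz * beam_seconds / reduction
    storage_tb = yield_per_stripping * size_kb / 1e9
    print(f"{name:12s}  {yield_per_stripping:.1e} events  {storage_tb:.0f} TB")
# Exclusive-b 8e7 / ~9 TB, dimuon 4.8e8 / ~26 TB, D* 2.4e8 / ~13 TB,
# Inclusive-b 3.6e7 / ~4 TB, matching the table above.
```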
Simulation requirements
- Studies to measure the performance of the detector & event selection in particular regions of phase space
- Use large statistics dimuon & D* samples for systematics - reduced Monte Carlo needs

           Application   Nos. of events   CPU time/evt (kSI2k.s)   Total CPU (MSI2k.year)
Signal     Gauss         8×10^8           75                       1.9
           Boole         8×10^8           1                        0.03
           Brunel        8×10^7           2.4                      0.01
Inclusive  Gauss         8×10^8           75                       1.9
           Boole         8×10^8           1                        0.03
           Brunel        8×10^7           2.4                      0.01
Total                                                              3.87
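The "Total CPU" column is simply events times per-event CPU time, converted to MSI2k-years; a quick sketch of the conversion (assuming ~3.15×10^7 seconds per year):

```python
# Simulation CPU in MSI2k.years: events x kSI2k.s per event / seconds per year.
SECONDS_PER_YEAR = 3.15e7

def msi2k_years(n_events, ksi2k_s_per_event):
    return n_events * ksi2k_s_per_event / 1000.0 / SECONDS_PER_YEAR   # kSI2k -> MSI2k

print(round(msi2k_years(8e8, 75), 2))    # Gauss:  ~1.9
print(round(msi2k_years(8e8, 1), 2))     # Boole:  ~0.03
print(round(msi2k_years(8e7, 2.4), 2))   # Brunel: ~0.01
```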
Simulation storage requirements
- Simulation still dominates LHCb CPU needs
- Current event size for Monte Carlo DST (with truth info) is ~400 kB/evt
- Total storage needs: 64 TB in 2008
- Output at CERN, and another 2 copies distributed over the Tier-1 centres

Output          Nos. of events   Storage/evt (kB)   Total storage (TB)
Signal DST      8×10^7           400                32
Signal TAG      8×10^7           1                  0.1
Inclusive DST   8×10^7           400                32
Inclusive TAG   8×10^7           1                  0.1
Total                                               64
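The storage table is a one-line multiplication; the sketch below also notes how it connects to the tape summary at the end of the talk:

```python
# Simulation storage: 8x10^7 stored events per sample at ~400 kB/event (+ ~1 kB TAG).
per_sample_tb = 8e7 * (400 + 1) / 1e9      # ~32 TB each for Signal and Inclusive
total_tb = 2 * per_sample_tb               # ~64 TB in 2008
print(per_sample_tb, total_tb)
# Two tape copies of this output come to ~128 TB, the simulation entry in the
# tape summary at the end of the talk.
```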
Analysis requirements
- User analysis accounted for in the model is predominantly batch: ~30k jobs/year
- Predominantly analysing ~10^6 events
- CPU of 0.3 kSI2k.s/evt
- Analysis needs grow linearly with the years in the early phase of the experiment

Nos. of physicists performing analysis        140
Nos. of analysis jobs per physicist/week      4
Event size reduction factor after analysis    5
Number of "active" Ntuples                    10
2008 CPU needs (MSI2k.years)                  0.31
2008 Disk storage (TB)                        80
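The 2008 CPU figure can be roughly reproduced from the parameters in this table, assuming (my reading of the slide) that the ~10^6 events are per analysis job:

```python
# Rough reproduction of the 2008 analysis CPU estimate.
physicists = 140
jobs_per_week = 4
weeks_per_year = 52
events_per_job = 1e6             # assumption: the ~10^6 events are per job
ksi2k_s_per_event = 0.3
SECONDS_PER_YEAR = 3.15e7

jobs_per_year = physicists * jobs_per_week * weeks_per_year           # ~29k
cpu = jobs_per_year * events_per_job * ksi2k_s_per_event / 1000 / SECONDS_PER_YEAR
print(jobs_per_year, round(cpu, 2))                                    # 29120, ~0.28
# Close to the ~30k jobs/year and 0.31 MSI2k.years quoted above; the slide's
# figure presumably includes efficiency factors.
```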
Summary (incl. efficiencies) for 2008
Data on disk in 2008 (TB)
RAW   rDST   Stripped   Simulation   Analysis
76    43     775        375          114

CPU needs in 2008 (MSI2k.yr)
Recons.   Stripping   Simulation   Analysis
1.4       0.5         4.6          0.5

Data on tape in 2008 (TB)
RAW   rDST   Stripped   Simulation   Analysis
560   320    483        128          -
Summary & evolution of requirements
[Charts: CPU evolution (MSI2k.years), disk evolution (TB) and tape evolution (TB) for 2007-2010, broken down by Tier-2s, Tier-1s, CERN T0 + T1 and (for CPU) the Online Farm.]
Conclusions
- LHCb has proposed a Computing Model adapted to its specific needs (number of events, event size, low number of physics candidates)
- Reconstruction, stripping and analysis resources located at Tier-1s (and possibly some Tier-2s with enough storage and CPU capacity)
- CPU requirements dominated by Monte Carlo, assigned to Tier-2s and opportunistic sites
  - With DIRAC, even idle desktops / laptops could be used ;-) LHCb@home?
- Requirements are modest compared to other experiments
- DIRAC is well suited and adapted to this computing model
  - Integrated WMS and DMS
- GANGA is increasingly used for submitting user analysis to the Grid
- LHCb's Computing should be ready when first data come
Hot news! Stop press!
16 October 22:29 CET
Test jobs running successfully at NIPNE!
16 October 21:32 CET