LCG-France
Vincent Breton, Eric Lançon and Fairouz Malek, CNRS-IN2P3 and LCG-France
ISGC Symposium, Taipei, March 27th 2007
V. Breton, E. Lançon and F. Malek, ISGC symposium, March 27th 2007, Taipei 2
LCG-France Project
• Goals
  - Set up, develop and maintain an LCG Tier-1 and an Analysis Facility (AF) at CC-IN2P3
  - Promote the creation and coordinate the integration of French Tier-2/Tier-3 sites into the LCG collaboration
• Funding
  - National funding for the Tier-1 and the AF
  - Tier-2s and Tier-3s funded by universities, local/regional governments, hosting laboratories, …
• Organization
  - Started in June 2004
  - Scientific and technical leaders appointed; management board (executive) and overview boards in place since then
LCG-France sites
LCG-France contribution (October 2005 - September 2006)
• EGEE « CPU accounting » per EGEE region
  - French contribution includes LCG-France sites
Sites' contribution to LCG-France (September 2005 - October 2006)
• « CPU accounting » per site for all EGEE Virtual Organizations
Tier-1 Contribution
• Planned contribution of the LCG-France Tier-1
  - % of required resources for all Tier-1s in 2008
  - Source: Comparison of New Requirements with Current Pledges – 24/10/2006

  Resource   CC-IN2P3   Other Tier-1s
  CPU        11%        89%
  Disk       11%        89%
  MSS        9%         91%
Tier-1 Contribution (cont.)
• Planned contribution of the LCG-France Tier-1
  - % of required resources in all Tier-1s in 2008 (to be finalized)

  Resource   Alice   Atlas   CMS   LHCb
  CPU        8%      11%     10%   27%
  Disk       8%      11%     11%   27%
  MSS        8%      11%     7%    27%
Tier-2s Contribution
Planned contribution of LCG-France Tier-2 sites (% of required resources in all Tier-2 sites in 2008)

  Resource   Alice   Atlas   CMS   LHCb
  CPU        8%      9%      3%    6%
  Storage    8%      5%      8%    0%
Tier-1 Planned Evolution
Increase rate over the period 2006-2010:
  - CPU: x 17
  - DISK: x 16
  - MSS: x 18
[Figure: Tier-1 & Analysis Facility Resource Deployment, 2006-2010. Series: DISK, MSS and CPU; axes in kSI2000 (0 to 12 000) and TeraBytes (0 to 25 000).]
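As a rough reading of these factors, the sketch below converts each overall 2006-2010 factor into an equivalent per-year factor. It assumes growth compounds evenly over the four yearly steps, which the slide does not state, and the helper name `annual_factor` is hypothetical.

```python
# Overall 2006-2010 increase factors quoted on the slide.
increase = {"CPU": 17, "DISK": 16, "MSS": 18}


def annual_factor(total_factor, steps=4):
    """Per-year factor, ASSUMING even compounding over `steps` yearly steps."""
    return total_factor ** (1.0 / steps)


for resource, factor in increase.items():
    print(f"{resource}: x{factor} overall, about x{annual_factor(factor):.2f} per year")
```

Under that assumption, each resource roughly doubles every year over the period.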
Tier-2s Planned Evolution
Planned CPU capacity [kSI2000]

  Site           2006   2007    2008    2009    2010
  Subatech       56     112     280     280     280
  LPC Clermont   200    300     300     300     300
  GRIF           283    660     1 920   3 440   3 440
  AF Lyon        44     1 115   2 996   4 261   7 556

Chart annotation: roughly equivalent to the planned Tier-1 CPU capacity the same year

Planned disk capacity [TB]

  Site           2006   2007   2008   2009    2010
  Subatech       15     32     78     78      78
  LPC Clermont   25     25     25     25      25
  GRIF           72     168    560    1 096   1 096
  AF Lyon        7      137    396    550     887

Chart annotation: 43% of the planned Tier-1 disk capacity the same year
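The charts on this slide stack the sites, so the quantity of interest is the per-year total. A minimal sketch of that sum, using only the per-site figures listed on the slide; the names `totals_per_year`, `cpu` and `disk` are illustrative.

```python
YEARS = [2006, 2007, 2008, 2009, 2010]

# Planned CPU capacity per site, in kSI2000 (values from the slide).
cpu = {
    "Subatech":     [56, 112, 280, 280, 280],
    "LPC Clermont": [200, 300, 300, 300, 300],
    "GRIF":         [283, 660, 1920, 3440, 3440],
    "AF Lyon":      [44, 1115, 2996, 4261, 7556],
}

# Planned disk capacity per site, in TB (values from the slide).
disk = {
    "Subatech":     [15, 32, 78, 78, 78],
    "LPC Clermont": [25, 25, 25, 25, 25],
    "GRIF":         [72, 168, 560, 1096, 1096],
    "AF Lyon":      [7, 137, 396, 550, 887],
}


def totals_per_year(table):
    """Sum the per-site values column by column (one column per year)."""
    return [sum(column) for column in zip(*table.values())]


cpu_totals = totals_per_year(cpu)
disk_totals = totals_per_year(disk)

for year, c, d in zip(YEARS, cpu_totals, disk_totals):
    print(f"{year}: CPU {c} kSI2000, disk {d} TB")
```

The 2010 totals (11 576 kSI2000 and 2 086 TB) fall within the chart axis ranges of 14 000 and 2 500.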
Tier-3s: Planned Evolution
LCG-France Tier-3s: Planned Disk Storage Capacity [TeraBytes]

  Site   2006   2007   2008   2009   2010
  CPPM   2      25     25     25     50
  LAPP   11     20     20     20     20

LCG-France Tier-3s: Planned CPU Capacity [kSI2000]

  Site   2006   2007   2008   2009   2010
  CPPM   83     200    200    200    400
  LAPP   126    315    315    315    315

Data for IPNL (Lyon) are not included
Tier-2/Tier-3 Activities
• Coordination of LCG-France Tier-2/Tier-3 technical activities officially set up in April 2006
  - Frédérique Chollet is leading the group
  - Collaboration tools in place: mailing list, wiki pages, regular video-conference meetings
• Activities
  - Very active in the Quattor working group
      Used by most of the LCG-France sites
  - Network-level and SRM-level data transfer tests from and to the Tier-1
      Including associated foreign sites (more on this later)
  - Meetings held with several potential hardware providers
  - Sharing of technical and commercial information (hardware evaluation results, commercial conditions, etc.)
Tier-2/Tier-3 Activities (cont.)
• In close contact with some foreign associated Tier-2s
  - Europe
      Belgium CMS Tier-2
      Romanian Federation ATLAS Tier-2
  - Asia
      IHEP China - ATLAS and CMS Tier-2
      ICEPP Japan - ATLAS Tier-2
Tier-1: site overview
[Diagram: grid services deployed at the Tier-1, each marked as a global, regional/federal or local service. Courtesy of Pierre Girard. Labels recovered from the diagram:]
  - Grid information system: site BDII, MonBox (4 sites)
  - Computing: Computing Elements, BQS, Anastasie, worker nodes (WN)
  - Storage: Storage Elements, SRM, GridFTP, dCache, HPSS, XFS
  - VO services: VO Boxes (LHC VOs), VOMS (4 VOs), central LFC (Biomed), local LFC (4 LHC VOs), FTS (4 LHC VOs)
Tier-1: site overview (cont.)
• Also operating several grid services for non-LHC VOs
Tier-1 Contribution in 2006
• CPU time contributed by the French Tier-1 in 2006
  - % of CPU time (grid and non-grid) used by the experiments in all the Tier-1s

Contribution of LCG-France Tier-1, January-December 2006

                                Alice   Atlas   CMS   LHCb
  CPU Time (% of All Tier-1s)   23%     9%      5%    14%

The CC-IN2P3 contribution to the global effort in 2006 was 10% of the total CPU used by the 4 experiments in all the Tier-1s.
Source: WLCG Accounting Report, Tier-1 Centres + CERN
Tier-1 Contribution in 2006 (cont.)
• CPU utilisation by LHC experiments at all the tier-1s and at CC-IN2P3
Source: http://www3.egee.cesga.es/gridsite/accounting/CESGA/tier1_view.html
[Pie charts: "All Tier-1s (does not include non-grid usage of some sites)" and "CC-IN2P3 (grid and non-grid)". Labels recovered from one chart: Alice 13%, Atlas 48%, CMS 12%, LHCb 27%.]
Tier-1: grid vs. non-grid usage
• Site usage (grid vs. non-grid) varies greatly from one experiment to another
  - Both in terms of consumed capacity and number of jobs

                                                                             Alice   Atlas   CMS   LHCb
  Capacity consumed by grid jobs (% of the experiment's total consumption)   70%     66%     41%   15%
  Grid jobs (% of the experiment's total number of jobs)                     62%     47%     59%   54%
Tier-1: efficiency (CPU time vs. wallclock)
[Figures: Efficiency (CPU Time vs. Wallclock Time), monthly from 2006-01 to 2006-12, one panel each for ALICE, ATLAS, CMS and LHCb. Series: efficiency of grid jobs and efficiency of non-grid jobs. Chart annotation: "Measurement error."]
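The efficiency plotted here is the ratio of CPU time to wallclock time. A minimal sketch of that ratio follows, with a check that flags values above 100%. The check assumes single-core jobs, so CPU time should not exceed wallclock time; that is an assumption made here, not a statement from the slides. The function names and sample numbers are hypothetical.

```python
def efficiency(cpu_seconds, wallclock_seconds):
    """CPU time divided by wallclock time, as a fraction (1.0 == 100%)."""
    if wallclock_seconds <= 0:
        raise ValueError("wallclock time must be positive")
    return cpu_seconds / wallclock_seconds


def looks_like_measurement_error(eff):
    # ASSUMPTION: single-core jobs, so an efficiency above 100% is suspect.
    return eff > 1.0


# Hypothetical monthly totals in seconds (not taken from the slides).
samples = {
    "2006-01": (7200.0, 9000.0),
    "2006-02": (9500.0, 9000.0),
}

for month, (cpu, wall) in samples.items():
    eff = efficiency(cpu, wall)
    flag = " (suspect)" if looks_like_measurement_error(eff) else ""
    print(f"{month}: {eff:.0%}{flag}")
```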
Tier-1: CPU planned vs. actual consumption
2006 - Planned vs. Consumed CPU Capacity [kH SI2000]

             Alice       Atlas       CMS         LHCb
  Planned    1 927 200   4 380 000   2 628 000   1 708 200
  Consumed   362 637     1 368 733   343 740     749 742
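The planned and consumed figures can be turned into two ratios per experiment: the fraction of the planned capacity actually consumed, and each experiment's share of the total consumption. The sketch below uses only the numbers on this slide; the variable names are illustrative.

```python
# Planned and consumed CPU capacity in 2006, in the slide's unit (kH SI2000).
planned = {"Alice": 1_927_200, "Atlas": 4_380_000, "CMS": 2_628_000, "LHCb": 1_708_200}
consumed = {"Alice": 362_637, "Atlas": 1_368_733, "CMS": 343_740, "LHCb": 749_742}

# Fraction of each experiment's planned capacity that was consumed.
used_fraction = {exp: consumed[exp] / planned[exp] for exp in planned}

# Each experiment's share of the total consumed capacity.
total_consumed = sum(consumed.values())
share = {exp: consumed[exp] / total_consumed for exp in consumed}

for exp in planned:
    print(f"{exp}: consumed {used_fraction[exp]:.0%} of planned, "
          f"{share[exp]:.0%} of total consumption")
```

The shares round to 13%, 48%, 12% and 27%, the same values as the pie-chart labels on the "Tier-1 Contribution in 2006 (cont.)" slide, which suggests (without confirming) that those labels describe CC-IN2P3.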
Tier-1: CPU capacity delivered
[Figure: Total Installed Computing Capacity, 2006. Monthly values from 2006-01 to 2006-12, in millions of hours SI2000, split into capacity available for LHC experiments, capacity available for non-LHC experiments, and unavailable capacity.]
Chart annotations:
  - Several service interruptions in August and September due to problems with the cooling or power infrastructure
  - Scheduled 4-day complete shutdown of the site for replacing some central electrical equipment
Tier-1: storage delivered
• Disk storage capacity
  - Delivered 34% (180 TB out of 520 TB planned)
  - More on this later
• Tape storage capacity
  - Installed capacity (as planned) of 535 TB (of which 73% was actually used)
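A short worked check of these storage figures, using only the numbers on the slide (variable names are illustrative). The disk ratio 180/520 is about 0.346, which the slide quotes as 34%; the tape usage corresponds to roughly 390 TB.

```python
disk_planned_tb = 520
disk_delivered_tb = 180
tape_installed_tb = 535
tape_used_fraction = 0.73

# Fraction of the planned disk capacity that was delivered.
disk_delivered_fraction = disk_delivered_tb / disk_planned_tb

# Tape capacity actually used, in TB.
tape_used_tb = tape_installed_tb * tape_used_fraction

print(f"Disk delivered: {disk_delivered_fraction:.1%} of planned")
print(f"Tape used: about {tape_used_tb:.0f} TB of {tape_installed_tb} TB")
```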
Tier-1: data transfer exercises
• CERN → CC-IN2P3 (disk), April 2006
  - Target: 200 MB/sec
• CERN → CC-IN2P3 (MSS), April 2006
  - Target: 75 MB/sec
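To give a sense of scale, the sketch below converts the two target rates into daily volumes if sustained for 24 hours. It pairs the targets with the exercises in the order they appear on the slide (200 MB/sec for disk, 75 MB/sec for MSS) and assumes decimal units (1 TB = 1 000 000 MB); both are assumptions, and `tb_per_day` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(mb_per_sec):
    """Daily volume for a sustained rate, ASSUMING 1 TB = 1_000_000 MB."""
    return mb_per_sec * SECONDS_PER_DAY / 1_000_000


# Target rates in MB/sec; the pairing with disk/MSS follows the slide order.
targets = {"disk": 200, "MSS": 75}

for destination, rate in targets.items():
    print(f"CERN -> CC-IN2P3 ({destination}): {rate} MB/sec "
          f"is about {tb_per_day(rate):.2f} TB/day")
```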
Tier-1: site availability
Source: http://lcg.web.cern.ch/LCG/MB/availability/site_reliability.pdf
Tier-1: capacity increase in 2006
• CPU
  - +265 worker nodes (IBM, dual-processor dual-core AMD Opteron 275, 2.2 GHz, 2 GB/core, 290 GB internal disk)
      Theoretical power: 1573 SI2000 per core
      Total: 1.6 M SI2000
      Observed power with typical applications is ~30% less than theoretical
• Disk storage
  - +400 TB of rack-mounted Sun Fire X4500 (aka Thumper)
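The CPU total can be reproduced from the node description: 265 nodes, each with two dual-core processors, at 1573 SI2000 per core. The sketch below does that arithmetic and applies the ~30% reduction quoted for typical applications. The product comes out at about 1.67 M SI2000, against the 1.6 M quoted on the slide. Variable names are illustrative.

```python
nodes = 265
processors_per_node = 2       # dual-processor
cores_per_processor = 2       # dual-core
si2000_per_core = 1573        # theoretical power per core
observed_reduction = 0.30     # "~30% less than theoretical"

cores = nodes * processors_per_node * cores_per_processor
theoretical_si2000 = cores * si2000_per_core
observed_si2000 = theoretical_si2000 * (1 - observed_reduction)

print(f"{cores} cores")
print(f"theoretical: {theoretical_si2000 / 1e6:.2f} M SI2000")
print(f"observed (approx.): {observed_si2000 / 1e6:.2f} M SI2000")
```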
Tier-1: capacity increase in 2006 (cont.)
• Tape storage
  - Call for tender for a new cartridge library
  - Selected Sun/StorageTek SL8500
      30 T10000 drives
      10 LTO-3 drives
  - Will progressively replace the current one
      6 silos
  - Installation started in January
      Expected to be finished by end of April 2007
Tier-1: capacity increase in 2006 (cont.)
• Databases
  - Reconfiguration of Oracle cluster
      Extensible hardware architecture
      +1 TB added to the dedicated SAN (2 TB total)
      +3 front-end database servers (5 total)
        2 of them will share the load of the LHC experiments
• International connectivity
  - Dedicated link CC-IN2P3 ↔ CERN, 10 Gbps
  - 2 x 1 Gbps links CC-IN2P3 ↔ Fermilab
Hardware procurement
• Procurement process (evaluation, publication, selection) is more or less under control
  - Delivery delays are not!
  - In 2006, we suffered delivery delays of several months for some equipment
  - A fraction of the equipment is still not delivered!
• Procurement of equipment is an issue
  - Several constraints: space in the machine room, budget constraints, delivery delays, requested availability, …
Facility Upgrade
• Major effort for upgrading the electrical and cooling infrastructure of the site
  - Currently reaching the limits of the installation
  - When the current works are finished (April 2007): from 500 kW to 1000 kW usable for computing equipment
[Figure: average total electrical power consumption per month, in kW (axis from 200 to 1 000 kW). Courtesy of Dominique Boutigny.]
Facility Upgrade (cont.)
• Scheduled 4-day complete shutdown of the site in December 2006 for replacing central electrical equipment
  - Vital services (network equipment, mail servers, web servers, Oracle, FTS, LFCs, VOMS, …) were kept alive by ad hoc means
      Extensive use of virtual machines
  - Other services were switched to partner sites
      CIC Portal was hosted by CNAF during the shutdown and switched back to CC-IN2P3 afterwards
      Failover procedure tested in real conditions
Plans for 2007
• Tier-1
  - Consolidate current grid services and integrate them into « normal » operations
      Work towards the stability desired not only by the experiments but also by the people operating the services
  - Increase network bandwidth with Tier-2s and backup link to other Tier-1s through FZK
  - Increase the pace of the new machine room building project planned for 2009
• Tier-2s/Tier-3s
  - Improve availability of the sites
  - Keep exercising the data transfer infrastructure
• All
  - Make sure site administrators understand the ways data will be accessed (!)
Conclusions
• Participating sites are very motivated to contribute to this project…
  - … but it is harder than most of us expected
• Ramp-up plans of sites are rather aggressive
  - Several constraints don't really make our life easier
• Operating the grid services in their current state is complex and requires (highly competent and motivated) people
For more information
• LCG-France website: http://lcg.in2p3.fr
• LCG-France T2-T3 technical coordination wiki page: http://lcg.in2p3.fr/wiki/index.php/T2-T3
• CC-IN2P3: http://cc.in2p3.fr
• LCG-France Tier-1 resource planning: https://edms.in2p3.fr/document/I-004736
• LCG-France Tier-2s resource planning: https://edms.in2p3.fr/document/I-008142
Acknowledgments
• Thanks to the people who contributed material to this talk
  - Most of the slides are taken from Fabio Hernandez's talk at the WLCG Collaboration Workshop (CERN, January 22nd 2007)
  - Special thanks to Eric Lançon and Fairouz Malek