NEC2013 Varna, M. Lokajicek
Tier 2 Prague
Institute of Physics AS CR
Status and Outlook
J. Chudoba, M. Elias, L. Fiala, J. Horky, T. Kouba, J. Kundrat, M. Lokajicek, J. Svec, P. Tylka
13 September 2013
Outline
• Institute of Physics AS CR (FZU)
• Computing Cluster
• Networking
• LHCONE
• Looking for new resources
– CESNET National Storage Facility
– IT4I supercomputing project
• Outlook
11 September 2013
Institute of Physics AS CR (FZU)
• Institute of Physics of the Academy of Sciences of the Czech Republic
• 2 locations in Prague, 1 in Olomouc
– In 2012: 786 employees (281 researchers + 78 doctoral students)
– 6 Divisions
  • Division of Elementary Particle Physics
  • Division of Condensed Matter Physics
  • Division of Solid State Physics
  • Division of Optics
  • Division of High Power Systems
  • ELI Beamlines Project Division
• Department of Networking and Computing Techniques (SAVT)
FZU - SAVT
• Institute's networking and computing service department
– Several server rooms
– Computing clusters
  • Golias – particle physics, Tier2
    – A few nodes in operation already before EDG
    – WLCG iMoU 4 July 2003 (interim)
    – New server room 1 November 2004
    – WLCG MoU from 28 April 2008
  • Dorje – solid state, condensed matter
  • Luna, Thsun, smaller group clusters
Main server room
• Main server room (in FZU, Na Slovance)
– 62 m2, ~20 racks, 350 kVA motor generator, 200 + 2 x 100 kVA UPS, 108 kW air cooling, 176 kW water cooling
– continuous changes
– hosts computing servers and central services
Cluster Golias
• Upgraded every year
– several (9) sub-clusters, each of identical HW
• 3800 cores, 30 700 HS06
• 2 PB disk space
• Tapes used only for local backups (125 LTO4, max 500 cassettes)
• Serving: ATLAS, ALICE, D0 (NOvA), Auger, STAR, …
• WLCG Tier2: Golias@FZU + xrootd servers@REZ (NPI)
Utilization
• Very high average utilization
– Several different projects, different tools for production
– D0 – production submitted locally by 1 user
– ATLAS – panda, ganga, local users; DPM
– ALICE – VO box; xrootd
[Figure: cluster utilization by project (D0, ATLAS, ALICE); axis marker at 3.5 k]
"RAW" Capacities
Year / experiment   HEPSPEC2006     %   TB disk              %
2009                     10 340               186
2010                     19 064   100         427            100
2011                     23 484   100       1 714            100
2012                     29 660   100       2 521            100
  D0                      9 993    34          35              1
  ATLAS                  12 127    41       1 880 (+16 MFF)   74
  ALICE                   7 540    25         606 (+100 Řež)  24
2013                     29 660   100       2 521            100
  D0                      9 993    34          35              1
  ATLAS                  12 127    41       1 880 (+16 MFF)   74
  ALICE                   7 540    25         606 (+140 Řež)  24
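The percentage columns for HEPSPEC2006 can be cross-checked from the per-experiment rows. The following is only a minimal sketch: the dictionary and function names are illustrative, and rounding to whole percent is an assumption about how the slide's figures were produced.

```python
# Raw 2012/2013 capacities per experiment, copied from the capacities table.
hs06 = {"D0": 9_993, "ATLAS": 12_127, "ALICE": 7_540}
disk_tb = {"D0": 35, "ATLAS": 1_880, "ALICE": 606}


def shares(capacity):
    """Return each entry's share of the total, rounded to whole percent (assumed rounding)."""
    total = sum(capacity.values())
    return {name: round(100 * value / total) for name, value in capacity.items()}


total_hs06 = sum(hs06.values())        # should match the 2012/2013 total row
total_disk_tb = sum(disk_tb.values())  # should match the 2012/2013 total row
hs06_shares = shares(hs06)

print(total_hs06, total_disk_tb, hs06_shares)
```

The per-experiment rows add up to the quoted totals, and the HEPSPEC2006 shares computed this way agree with the percentages in the table.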
2012 D0, ATLAS and ALICE usage
• ATLAS
– 2.2 M tasks
– 90 M HEPSPEC06 hours, 1.9 PB disk space
– Data transfer: 1.2 PB to farm, 0.9 PB from farm
– 2% contribution to ATLAS
• ALICE
– 2 M simulation tasks
– 60 M HEPSPEC06 hours
– Data transfer: 4.7 PB to farm and 0.5 PB from farm
– 5% contribution to ALICE task processing
– 140 TB disk space in INF (Tier3)
[Figure: 2012 data transfers inside the farm, monthly means to and from worker nodes in TB (incoming and outgoing), 2012-01 to 2012-11]
• D0
– 290 M tasks
– 90 M HEPSPEC06 hours
– 13% contribution to D0
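As a rough cross-check of the "very high average utilization" claim, the delivered HEPSPEC06 hours of the three experiments can be compared with the raw capacity. This is only a sketch under stated assumptions: it uses the 2012 raw capacity of 29 660 HS06 from the capacities table as if it had been available for the whole year, ignores all other users of the farm, and assumes the accounting figures are normalized the same way as the raw capacity. The variable names are illustrative.

```python
# Delivered HEPSPEC06 hours in 2012 (in millions), as quoted on this slide.
delivered_mhs06_hours = {"ATLAS": 90, "ALICE": 60, "D0": 90}

RAW_CAPACITY_HS06 = 29_660   # 2012 raw capacity (assumed constant over the year)
HOURS_IN_2012 = 366 * 24     # 2012 was a leap year

total_delivered = sum(delivered_mhs06_hours.values()) * 1e6
average_hs06_busy = total_delivered / HOURS_IN_2012      # mean HS06 in use
utilization = average_hs06_busy / RAW_CAPACITY_HS06      # fraction of raw capacity

print(f"average load: {average_hs06_busy:,.0f} HS06, {utilization:.0%} of raw capacity")
```

Under these assumptions the three experiments alone account for roughly nine tenths of the raw capacity averaged over the year.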
Network - CESNET, z. s. p. o.
• FZU Tier2 network connections
– 10 Gbps LHCONE (GEANT), 18 July 2013
– 10 Gbps KIT from 1 September 2013
– 1 Gbps FNAL, BNL, Taipei
– 10 Gbps to commodity network
– 1-10 Gbps to Tier3 collaborating institutes
http://netreport.cesnet.cz/netreport/hep-cesnet-experimental-facility2/
LHCONE - Network transition
• Link to KIT saturated at 1 Gbps E2E line
• LHCONE from 18 July 2013 over 10 Gbps infrastructure
• Also relieves the commodity network
ATLAS tests
• Testing upload speed of files > 1 GB to all Tier1 centres
• After LHCONE connection only 2 sites with < 5 MB/s
• Prague Tier2 ready for validation as T2D
LHCONE – trying to understand monitoring
[Figure: Prague – DESY monitoring: very asymmetric throughput; LHCONE line cut]
[Figure: DESY – Prague monitoring: LHCONE optical line cut at 4:00; one-way latency improved]
International contribution of the Prague centre to ATLAS + ALICE T2 centres in LCG
• http://accounting.egi.eu/
• Grid + local tasks
• Long-term decline until we received regular financing in 2008
• Original 3% target is not achievable with current financial resources
• Necessary to look for other resources
[Figure: Prague share of jobs (%) and CPU (%), 2005 to 2012]
Remote storage
• CESNET - Czech NREN + other services
• New project: National storage facility
• Three distributed HSM based storage sites
• Designed for research and science community
– 100 TB offered for both ATLAS and Auger experiments
– Implemented as remote Storage Element with dCache
– disk <-> tape migration
[Figure: FZU Tier-2 in Prague and CESNET storage site in Pilsen, 100 km apart; FZU <-> Pilsen 10 Gbit link with ~3.5 ms latency]
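The quoted link figures give a feel for how much data must be in flight to keep the FZU <-> Pilsen link busy. The sketch below computes the bandwidth-delay product; it assumes the ~3.5 ms figure is a round-trip time, which the slide does not state, and the function name is illustrative.

```python
def bandwidth_delay_product_bytes(bandwidth_bit_per_s, rtt_s):
    """Bytes that must be in flight to fill a link of the given bandwidth and round-trip time."""
    return bandwidth_bit_per_s * rtt_s / 8


LINK_BIT_PER_S = 10e9   # 10 Gbit link, from the slide
LATENCY_S = 3.5e-3      # ~3.5 ms, assumed here to be the round-trip time

bdp_bytes = bandwidth_delay_product_bytes(LINK_BIT_PER_S, LATENCY_S)
print(f"bandwidth-delay product: {bdp_bytes / 1e6:.3f} MB")
```

If the 3.5 ms were a one-way latency instead, the round-trip time and therefore the result would double.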
Remote storage
remote/local   Method   TTreeCache   events/s (%)   Bytes transferred (%)   CPU efficiency
local          rfio     ON           100%           117%                    98.9%
local          rfio     OFF           74%           100%                    72.7%
remote         dCap     ON            75%           101%                    73.5%
remote         dCap     OFF           46%           100%                    46.9%
• TTreeCache in ROOT helps a lot – both for local and for remote transfers
• Remote jobs with TTreeCache are faster than local ones without the cache
Influence of distributing a Tier-2 data storage on physics analysis
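The two conclusions can be read directly off the relative event rates in the table on this slide. The following sketch only restates that arithmetic; the dictionary layout and names are illustrative.

```python
# Relative event rates in percent of the best case, keyed by (location, TTreeCache).
events_rate = {
    ("local", "ON"): 100,
    ("local", "OFF"): 74,
    ("remote", "ON"): 75,
    ("remote", "OFF"): 46,
}


def cache_speedup(location):
    """Ratio of event rates with and without TTreeCache for one storage location."""
    return events_rate[(location, "ON")] / events_rate[(location, "OFF")]


local_speedup = cache_speedup("local")
remote_speedup = cache_speedup("remote")
remote_cached_beats_local_uncached = (
    events_rate[("remote", "ON")] > events_rate[("local", "OFF")]
)

print(f"local: x{local_speedup:.2f}, remote: x{remote_speedup:.2f}, "
      f"remote+cache > local without cache: {remote_cached_beats_local_uncached}")
```

The cache gain is larger for remote access than for local access, and the margin by which cached remote jobs beat uncached local ones is a single percentage point in these measurements.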
Outlook
• In 2015 after LHC start up
– Higher data production
– Flat financing not sufficient
– Computing can become an item of M&O A (Maintenance & Operations cat. A)
• Search for new financial resources or new unpaid capacities necessary
– CESNET
  • Crucial free delivery of network infrastructure
  • Unpaid external storage, how long?
– IT4I, Czech supercomputing project: search for computing capacities (free cycles), relying on other projects to find a way to use them
• 16th International workshop on Advanced Computing and Analysis Techniques in physics (ACAT)
• http://www.particle.cz/acat2014
• Topics
– Computing Technology for Physics Research
– Data Analysis - Algorithms and Tools
– Computations in Theoretical Physics: Techniques and Methods