
NEC2013, Varna – M. Lokajicek

Tier 2 Prague
Institute of Physics AS CR

Status and Outlook

J. Chudoba, M. Elias, L. Fiala, J. Horky, T. Kouba, J. Kundrat, M. Lokajicek, J. Svec, P. Tylka

13 September 2013


Outline

• Institute of Physics AS CR (FZU)
• Computing Cluster
• Networking
• LHCONE
• Looking for new resources
– CESNET National Storage Facility
– IT4I supercomputing project
• Outlook

11 September 2013


Institute of Physics AS CR (FZU)

• Institute of Physics of the Academy of Sciences of the Czech Republic

• 2 locations in Prague, 1 in Olomouc

– In 2012: 786 employees (281 researchers + 78 doctoral students)
– 6 divisions:

• Division of Elementary Particle Physics
• Division of Condensed Matter Physics
• Division of Solid State Physics
• Division of Optics
• Division of High Power Systems
• ELI Beamlines Project Division

• Department of Networking and Computing Techniques (SAVT)


FZU – SAVT

• Institute’s networking and computing service department
– Several server rooms
– Computing clusters
• Golias – particle physics, Tier2
– A few nodes already before EDG
– WLCG iMoU signed 4 July 2003 (interim)
– New server room from 1 November 2004
– WLCG MoU from 28 April 2008
• Dorje – solid state, condensed matter
• Luna, Thsun – smaller group clusters


Main server room

• Main server room (at FZU, Na Slovance)
– 62 m2, ~20 racks, 350 kVA motor generator, 200 + 2 x 100 kVA UPS, 108 kW air cooling, 176 kW water cooling
– continuous changes
– hosts computing servers and central services


Cluster Golias

• Upgraded every year; nine sub-clusters of identical HW
• 3 800 cores, 30 700 HS06
• 2 PB disk space
• Tapes used only for local backups (125 LTO4, max 500 cassettes)
• Serving: ATLAS, ALICE, D0 (NOvA), Auger, STAR, …
• WLCG Tier2: Golias@FZU + xrootd servers@REZ (NPI)


Utilization

• Very high average utilization
– Several different projects, different tools for production
– D0 – production submitted locally by one user
– ATLAS – PanDA, Ganga, local users; DPM
– ALICE – VO box; xrootd


[Utilization plots for D0, ATLAS and ALICE; peak around 3.5 k running jobs]

„RAW“ Capacities

Year / exp.   HEPSPEC2006     %    TB disk            %
2009               10 340          186
2010               19 064    100   427                100
2011               23 484    100   1 714              100
2012               29 660    100   2 521              100
  D0                9 993     34   35                 1
  ATLAS            12 127     41   1 880 (+16 MFF)    74
  ALICE             7 540     25   606 (+100 Řež)     24
2013               29 660    100   2 521              100
  D0                9 993     34   35                 1
  ATLAS            12 127     41   1 880 (+16 MFF)    74
  ALICE             7 540     25   606 (+140 Řež)     24
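The experiment shares quoted in the table can be cross-checked with a few lines of Python (the numbers are taken from the slide; the percentages are simply recomputed and rounded):

```python
# Recompute the 2012/2013 HEPSPEC2006 shares from the table above.
hs06 = {"D0": 9_993, "ATLAS": 12_127, "ALICE": 7_540}
total = 29_660  # total capacity in HS06

shares = {exp: round(100 * v / exp_total) for exp, v in hs06.items()
          for exp_total in (total,)}
print(shares)  # matches the slide: D0 34 %, ATLAS 41 %, ALICE 25 %
```

The three sub-cluster capacities also sum exactly to the 29 660 HS06 total.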


2012 D0, ATLAS and ALICE usage

• ATLAS
– 2.2 M tasks
– 90 M HEPSPEC06 hours, 1.9 PB disk space
– Data transfers: 1.2 PB to the farm, 0.9 PB from the farm
– 2% contribution to ATLAS
• ALICE
– 2 M simulation tasks
– 60 M HEPSPEC06 hours
– Data transfers: 4.7 PB to the farm, 0.5 PB from the farm
– 5% contribution to ALICE task processing
– 140 TB disk space at NPI (Tier3)

[Plot: 2012 data transfers inside the farm – monthly means to and from worker nodes in TB, incoming and outgoing, January–November 2012]


• D0
– 290 M tasks
– 90 M HEPSPEC06 hours
– 13% contribution to D0


Network – CESNET, z. s. p. o.

• FZU Tier2 network connections
– 10 Gbps LHCONE (GEANT), from 18 July 2013
– 10 Gbps to KIT, from 1 September 2013
– 1 Gbps to FNAL, BNL, Taipei
– 10 Gbps to the commodity network
– 1–10 Gbps to collaborating Tier3 institutes

http://netreport.cesnet.cz/netreport/hep-cesnet-experimental-facility2/


LHCONE – network transition

• Link to KIT was saturated on the 1 Gbps end-to-end line
• LHCONE connected from 18 July 2013 over the 10 Gbps infrastructure
• Also relieves the commodity network


ATLAS tests

• Testing upload speed of files > 1 GB to all Tier1 centres
• After the LHCONE connection, only 2 sites remain below 5 MB/s
• Prague Tier2 is ready for validation as a T2D


LHCONE – trying to understand monitoring

[perfSONAR throughput and latency plots]
• Prague – DESY: very asymmetric throughput
• DESY – Prague: LHCONE optical line cut at 4:00; one-way latency improved


International contribution of the Prague centre to the ATLAS + ALICE LCG Tier2 centres

• http://accounting.egi.eu/
• Grid + local tasks
• Long-term decline until we received regular financing in 2008
• The original 3% target is not achievable with current financial resources
• Necessary to look for other resources

[Plot: Prague share of Tier2 jobs (%) and CPU (%), 2005–2012]


Remote storage

• CESNET – Czech NREN + other services
• New project: National Storage Facility
– Three distributed HSM-based storage sites
– Designed for the research and science community
– 100 TB offered for both the ATLAS and Auger experiments
– Implemented as a remote Storage Element with dCache
– disk <-> tape migration
• FZU Tier-2 in Prague <-> CESNET storage site in Pilsen (~100 km): 10 Gbit link with ~3.5 ms latency
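The quoted link parameters determine how much data must be in flight for TCP to fill the path. A back-of-the-envelope sketch (my own arithmetic, not from the slides, assuming the ~3.5 ms figure is one-way latency):

```python
# Bandwidth-delay product for the FZU <-> Pilsen link.
# Assumption: quoted ~3.5 ms is one-way latency, so RTT ~ 7 ms.
bandwidth_bps = 10e9          # 10 Gbit/s link
rtt_s = 2 * 3.5e-3            # round-trip time in seconds
bdp_bytes = bandwidth_bps * rtt_s / 8

print(f"{bdp_bytes / 1e6:.2f} MB")  # ~8.75 MB of TCP window to keep the link full
```

A single default-tuned TCP stream would need socket buffers of roughly this size; with smaller windows, multiple parallel streams are needed to saturate the link.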


Remote storage


remote/local  Method  TTreeCache  events/s (%)  Bytes transferred  CPU efficiency
local         rfio    ON          100%          117%               98.9%
local         rfio    OFF          74%          100%               72.7%
remote        dCap    ON           75%          101%               73.5%
remote        dCap    OFF          46%          100%               46.9%

• TTreeCache in ROOT helps a lot – both for local and for remote transfers

• TTreeCached remote jobs are faster than local ones without the cache

Influence of distributing a Tier-2 data storage on physics analysis
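Why the cache matters over a wide-area link can be illustrated with a toy latency model (illustrative numbers of my own, not the measurement behind the table above): every uncached branch read pays one round trip, while TTreeCache coalesces many small reads into a few bulk requests.

```python
def read_time_s(n_requests, total_bytes, rtt_s, bandwidth_Bps):
    """Toy model: each request pays one round trip, plus serial transfer time."""
    return n_requests * rtt_s + total_bytes / bandwidth_Bps

rtt = 7e-3       # assumed ~7 ms round trip to the remote storage element
bw = 1.25e9      # 10 Gbit/s expressed in bytes per second
data = 2e9       # 2 GB of event data, illustrative

uncached = read_time_s(100_000, data, rtt, bw)  # many small branch reads
cached = read_time_s(100, data, rtt, bw)        # reads coalesced by the cache
print(f"uncached {uncached:.1f} s, cached {cached:.1f} s")
```

In this model the uncached job spends almost all its time waiting on round trips, which is consistent with the low CPU efficiency of the uncached dCap row in the table.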


Outlook

• In 2015, after the LHC start-up:
– Higher data production
– Flat financing not sufficient
– Computing may become an item of M&O A (Maintenance & Operations category A)
• Search for new financial resources or new unpaid capacities is necessary
– CESNET
• Free delivery of network infrastructure is crucial
• Unpaid external storage – for how long?
– IT4I, the Czech supercomputing project: search for computing capacities (free cycles), relying on other projects to find a way to use them


• 16th International Workshop on Advanced Computing and Analysis Techniques in Physics (ACAT)
• http://www.particle.cz/acat2014
• Topics:
– Computing Technology for Physics Research
– Data Analysis – Algorithms and Tools
– Computations in Theoretical Physics: Techniques and Methods


Backup
