The Worldwide LHC Computing Grid
Nils Høimyr, IT Department

Transcript
Page 1: The Worldwide LHC Computing Grid

The Worldwide LHC Computing Grid

Nils Høimyr, IT Department

Page 2

LHC accelerator and detectors

Ultra high vacuum, colder than outer space

LHC ring: 27 km circumference

CMS

ALICE

LHCb

ATLAS

Page 3

CERN IT Department, CH-1211 Genève 23, Switzerland – www.cern.ch/it

7000 tons, 150 million sensors generating data 40 million times per second

The ATLAS experiment

Page 4


A collision at LHC

January 2013 - The LHC Computing Grid - Nils Høimyr

Page 5

COLLISIONS

30/11/2016 OpenStack Switzerland Meetup 5

Collisions produce 1 PB/s

Page 6

Pick the interesting events
• 40 million per second: fast, simple information – hardware trigger in a few microseconds
• 100 thousand per second: fast algorithms in a local computer farm – software trigger in <1 second
• A few 100 per second: recorded for study


Muon tracks

Energy deposits

17 January 2017 Computing for the LHC

Page 7

Pick the interesting events: data size
• 40 million per second: fast, simple information – hardware trigger in a few microseconds
• 100 thousand per second: fast algorithms in computers – software trigger
• A few 100 per second: recorded for study

~1 Petabyte per second? We cannot afford to store it:
• A year's worth of LHC data at 1 PB/s would cost a few hundred trillion dollars/euros
• We have to filter in real time to keep only "interesting" data
• We keep 1 event in a million – yes, 99.9999% is thrown away

Result: >> 6 Gigabytes per second
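The trigger cascade above can be turned into rough reduction factors. This is a back-of-the-envelope sketch using only the rates quoted on the slides; the recorded rate of 300 events/s is an assumption standing in for "a few 100 per second".

```python
# Reduction factors implied by the trigger rates on the slide.
# Only the 40 MHz and 100 kHz figures come from the slide; the
# 300 events/s recorded rate is an assumed value for "a few 100/s".

collision_rate = 40_000_000   # collisions per second at the detector
after_hardware = 100_000      # events/s surviving the hardware trigger
after_software = 300          # events/s recorded for study (assumed)

hw_factor = collision_rate / after_hardware   # rejection by hardware trigger
sw_factor = after_hardware / after_software   # rejection by software trigger
overall = collision_rate / after_software     # end-to-end rejection

print(f"hardware trigger: 1 event kept in {hw_factor:.0f}")
print(f"software trigger: 1 event kept in {sw_factor:.0f}")
print(f"overall: 1 event kept in {overall:,.0f}")
```

These are order-of-magnitude figures: the end-to-end rejection of roughly 10^5 is consistent, to within an order of magnitude, with the slide's "1 event in a million".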

Page 8

CERN Data Centre

• Built in the 70s on the CERN site (Meyrin, Geneva), 3.5 MW for equipment
• Extension located at Wigner (Budapest), 2.7 MW for equipment
• Connected to the Geneva CC with 3x100 Gb links (24 ms RTT)
• Hardware generally based on commodity components
• 15,000 servers, providing 190,000 processor cores
• 80,000 disk drives, providing 250 PB of disk space
• 104 tape drives, providing 140 PB of tape storage

Page 9

OpenStack day CERN 27/05/19 [email protected] 9

Worldwide computing, 2017:
- 63 MoUs
- 167 sites in 42 countries

Page 10

When we started LHC computing (~2001):
- There were no internet-scale companies and no cloud computing – Google was just a search engine, and cloud providers such as Amazon Web Services did not yet exist
- We had to invent all of the tools from scratch: at CERN we had no tools to manage a data centre at the scale we thought was needed (no commercial or open-source tools existed)
- Initial tools were developed through the EU DataGrid project – Open Source from the beginning
- Grid ideas from computer science did not work in the real world at any reasonable scale; we (the EU, US and LHC grid projects) had to make them work at scale
- We had to invent trust networks to convince funding agencies to open their resources to federated users
- Our users were not convinced that any of this was needed ;-)


Evolution of Grids

[Timeline, 1994–2008: the EU DataGrid and the US projects GriPhyN, iVDGL and PPDG evolved through LCG 1/LCG 2, Grid3/OSG and EGEE 1/2/3, converging into WLCG; milestones marked include the Data Challenges, Service Challenges, cosmics and first physics.]

26 June 2009 Ian Bird, CERN

"Grid": a federated, distributed computing infrastructure – public national and international grid infrastructures, with WLCG for the LHC.

Page 11

Data - 2018

2018 totals: 88 PB – ATLAS: 24.7 PB, CMS: 43.6 PB, LHCb: 7.3 PB, ALICE: 12.4 PB (incl. parked b-physics data)

[Chart: data transfers, with the heavy-ion run marked; 14 PB transferred in August.]

[Chart: "CPU Delivered: HS06-hours/month" for ALICE, ATLAS, CMS and LHCb, monthly from January 2010 to March 2019; the y-axis runs from 0 to 8 billion HS06-hours. The CPU delivered corresponds to ~860 k cores running continuously.]
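The "~860 k cores continuous" figure can be cross-checked against the monthly HS06-hours totals in the chart. This is a hedged sanity check, not an official WLCG number: the per-core benchmark score is an assumption (WLCG-era cores typically scored roughly 10–12 HS06 each).

```python
# Convert a monthly HS06-hours total into an equivalent number of
# continuously running cores. HS06_PER_CORE is an assumed average
# benchmark score per core, not an official figure.

HOURS_PER_MONTH = 730          # ~365.25 * 24 / 12
HS06_PER_CORE = 12             # assumed average HS06 score per core

def equivalent_cores(hs06_hours_per_month: float) -> float:
    """Cores needed to deliver this many HS06-hours in one month."""
    sustained_hs06 = hs06_hours_per_month / HOURS_PER_MONTH
    return sustained_hs06 / HS06_PER_CORE

# ~7.5 billion HS06-hours in a month comes out near the slide's ~860 k cores
print(f"{equivalent_cores(7.5e9):,.0f} cores")
```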

Page 12

HPC use is challenging: HEP is engaging with DOE & NSF in the USA and (together with SKA) with PRACE and EuroHPC in Europe, and is participating in the BDEC2 workshops.

Heterogeneous computing
– The majority of today's HEP processing is performed on dedicated clusters of commodity processors ("x86")
– Recently: opportunistic use of many types of compute, in particular HPC systems and HLT farms
– In future this heterogeneity will expand; we must be able to make use of all types: non-x86 (especially GPU), HPC, clouds, HLT farms (including FPGA?)

Page 13

Heterogeneous compute requires:
– Common provisioning mechanisms, transparent to users
– Facilities able to control access (cost), appropriate use, etc.
– HPC, clouds and HLT farms will not have an (affordable) local storage service in the way we assume today; we must be able to deliver data to them while they are in active use

Deployed in a hybrid cloud mode:
• Procurers' data centres
• Commercial cloud service providers
• GEANT network and eduGAIN Federated Identity Management

Page 14

LHC Schedule

Run 3: ALICE and LHCb upgrades

Run 4: ATLAS and CMS upgrades

Page 15

The HL-LHC computing challenge

HL-LHC needs for ATLAS and CMS are above what the expected hardware technology evolution (15% to 20% per year at constant cost) and flat funding can provide

The main challenge is storage, but computing requirements grow 20-50x

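The gap between technology evolution and HL-LHC requirements can be made concrete with compound growth. A minimal sketch, assuming a ~10-year planning horizon (an assumption; the slide gives only the annual rates and the 20–50x requirement growth):

```python
# Hardware capacity growth at flat cost (15-20 %/yr, per the slide)
# compounded over an assumed 10-year horizon, versus the 20-50x
# growth in computing requirements expected for HL-LHC.

years = 10
for annual_gain in (0.15, 0.20):
    capacity_factor = (1 + annual_gain) ** years
    print(f"{annual_gain:.0%}/yr over {years} yr -> {capacity_factor:.1f}x capacity")
# ~4x to ~6x capacity at flat cost, far short of the 20-50x
# growth in requirements: hence the need for R&D and new approaches.
```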

ATLAS and CMS had to cope with monster pile-up: with L = 1.5 × 10^34 cm^-2 s^-1 and the 8b4e bunch structure, the pile-up reaches ~60 events per crossing (note: ATLAS and CMS were designed for ~20 events per crossing).

CMS: event from 2017 with 78 reconstructed vertices

ATLAS: simulation for HL-LHC with 200 vertices
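The ~60 events/crossing figure follows from the luminosity. A hedged sanity check: the inelastic pp cross-section (~80 mb) and the number of colliding bunch pairs under 8b4e (~1900) are assumed typical values, not numbers from the slide.

```python
# Mean pile-up = (interaction rate) / (bunch-crossing rate).
# sigma_inel and n_colliding are assumptions; luminosity and the
# LHC revolution frequency are standard.

luminosity = 1.5e34       # cm^-2 s^-1, from the slide
sigma_inel = 80e-27       # cm^2 (~80 mb inelastic pp cross-section, assumed)
f_rev = 11245             # LHC revolution frequency, Hz
n_colliding = 1900        # colliding bunch pairs with 8b4e (assumed)

interactions_per_s = luminosity * sigma_inel    # ~1.2e9 interactions/s
crossings_per_s = f_rev * n_colliding           # ~2.1e7 crossings/s
pileup = interactions_per_s / crossings_per_s
print(f"mean pile-up ~ {pileup:.0f} events/crossing")
```

With these assumptions the estimate lands in the mid-50s, consistent with the quoted ~60 events per crossing.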

Page 16

Yearly data volumes

- LHC (2016): 50 PB raw data
- LHC science data: ~200 PB
- HL-LHC (2026): ~600 PB raw data, ~1 EB physics data
- SKA Phase 1 (2023): ~300 PB/year science data
- SKA Phase 2 (mid-2020s): ~1 EB science data
- For comparison: Google searches: 98 PB; Facebook uploads: 180 PB; Google Internet archive: ~15 EB

Page 17

Many software challenges
● Improved algorithms, Machine Learning (ML)

– "ML" in the form of neural networks has been used for more than 20 years in HEP

– A lot of development in the IT industry in this area; scope for re-use and improvements

● Vectorisation, GPUs, FPGAs, other processor architectures

● Data Analysis model and software changes

● Visualisation

● Storage and preservation

Page 18


Grid vs Cloud

• "Cloud computing" everywhere
– Web-based solutions (http/https and REST)
– Virtualisation, containers...

• The Grid has mainly a scientific user base
– Complex applications running across multiple sites, but it works like a cluster batch system for the end user
– Mainly suitable for parallel computing and massive data processing

• Technologies are converging
– "Internal cloud" at CERN – OpenStack
– Xbatch – extending to external cloud providers
– CernVM – virtual machine running e.g. at Amazon
– Google Cloud Kubernetes Higgs Challenge
– "Volunteer cloud" – LHC@home 2.0

Page 19


Volunteer grid - LHC@home

• LHC volunteer computing
– Allows us to get additional computing resources for e.g. accelerator physics and theory simulations

• Based on BOINC – "Berkeley Open Infrastructure for Network Computing"
– A software platform for distributed computing using volunteered computer resources
– Uses a volunteer PC's unused CPU cycles to analyse scientific data
– Virtualisation support via CernVM
– Other well-known projects: SETI@home, Climateprediction.net, Einstein@Home

Page 20

You can help us!

• As a volunteer, you can help us by donating CPU cycles when your computer is idle
• Connect with us at http://cern.ch/lhcathome

Page 21

The Balance between Academic Freedom, Operations & Computer Security

http://cern.ch/security

Page 22

Open Data

Page 23

19 August 2015 CERN-ITU - Frédéric Hemmer

http://opendata.cern.ch

http://zenodo.org

http://cds.cern.ch

Page 24

Open Data – Open Knowledge

CERN & the LHC experiments have made the first steps towards Open Data (http://opendata.cern.ch/):
- Key drivers: educational outreach & reproducibility
- Increasingly required by funding agencies
- Paving the way for Open Knowledge as envisioned by DPHEP (http://dphep.org), the ICFA Study Group on Data Preservation and Long Term Analysis in High Energy Physics

CERN has released Zenodo, a platform for Open Data as a Service (http://zenodo.org)¹:
• Building on experience with digital libraries & extreme-scale data management
• Targeted at the long tail of science
• Citable through DOIs, including the associated software
• Has generated significant interest from open-data publishers such as Wiley, Ubiquity, F1000, eLife and PLOS

¹ Initially co-funded by the EC FP7 OpenAIRE series of projects

Page 25

Training

16 January 2015 IT Department Meeting 25

Page 26

CERN School of Computing

13 October 2016 CERN-NTNU Nils Høimyr 26

Page 27

• A science – industry partnership to drive R&D and innovation with over a decade of success

• Evaluate state-of-the-art technologies in a challenging environment and improve them

• Test in a research environment today what will be used in many business sectors tomorrow

• Train next generation of engineers/employees

• Disseminate results and outreach to new audiences


Page 28


IT at CERN – more than the Grid

• Physics computing – Grids (this talk!)
• Administrative information systems – financial and administrative management systems, e-business...
• Desktop and office computing – Windows, Linux and Web infrastructure for day-to-day use
• Engineering applications and databases – CAD/CAM/CAE (AutoCAD, Catia, Cadence, Ansys etc.) and a number of technical information systems based on Oracle and MySQL
• Controls systems – process control of accelerators, experiments and infrastructure
• Networks and telecom – European IP hub, security, telephony software...

More information: http://cern.ch/it

Page 29

Thank You!

Accelerating Science and Innovation
CERN/IT – Nils Høimyr

