Computing for ALICE at the LHC
Tom Dietel, University of Cape Town
for the ALICE Collaboration
Outline
Physics at the Large Hadron Collider
• Higgs – ATLAS & CMS
• Quark-Gluon Plasma – ALICE (+ ATLAS, CMS)

Computing for ALICE
• Present: processing LHC Run-1 (2010-13)
• Near future: Run-2 (2015-17)
• Long-term development: Run-3 (after 2018)
The Large Hadron Collider collides protons and lead ions at more than 99.99999% of the speed of light to study the most fundamental particles and their interactions.

[Map of the LHC ring with the four main experiments: ALICE, ATLAS, CMS, LHCb]
Search for the Higgs Boson
CHPC National Meeting, 4-6 Dec 2013
• a quantum field fills the universe
• the field gives mass to elementary particles: W/Z, quarks, leptons
• a new particle → the Higgs boson

Predicted in 1964 by
• Peter Higgs
• R. Brout and F. Englert
• G. S. Guralnik, C. R. Hagen, and T. W. B. Kibble
ATLAS Higgs Candidate
Discovery of the Higgs Boson at LHC
Spring 2010
• start of data taking

4 July 2012
• discovery of a new particle

March 2013
• it's a Higgs!

October 2013
• Nobel prize

Extremely rare: a few hundred Higgs bosons in a quadrillion (10^15) collisions
Mass of the Proton - the other 99%
The proton contains 3 quarks
• 2 up-quarks: m_u ≈ 2.5 MeV
• 1 down-quark: m_d ≈ 5 MeV

The proton is heavier than its 3 quarks
• 2u + 1d: mass ≈ 10 MeV
• m_p ≈ 938 MeV
• about 100 times heavier

Where does the mass come from?
• Quantum Chromodynamics
• Confinement: no free quarks
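The mass gap above is simple arithmetic; a quick check (a sketch, using the approximate quark masses quoted on the slide):

```python
# Rough check: the valence quarks account for only ~1% of the proton mass.
# All values in MeV; quark masses are the approximate slide values.
m_u = 2.5    # up-quark mass
m_d = 5.0    # down-quark mass
m_p = 938.0  # proton mass

quark_sum = 2 * m_u + m_d   # 2 up + 1 down
ratio = m_p / quark_sum

print(f"sum of quark masses: {quark_sum} MeV")
print(f"proton is about {ratio:.0f} times heavier")
```

The remaining ~99% of the mass comes from the QCD binding dynamics, which is the point of the slide.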
Quark-Gluon Plasma
Compression
• reduce distance between nucleons

Heating
• thermally create pions
• fill space between nucleons

• hadrons overlap
• quarks roam freely (deconfinement)
• Quark-Gluon Plasma
Heavy-Ion Physics
• Can the quarks inside protons and neutrons be freed?
• What happens to matter when it is heated to 100 000 times the temperature at the centre of the Sun?
• Why do protons and neutrons weigh 100 times more than the quarks they are made of?
→ collisions of heavy nuclei (Pb) at high energies
ALICE Event Display
CERN and South Africa
SA-CERN
• home to all CERN research in South Africa
• 5 universities + 1 national lab
• more than 60 scientists

ALICE
• heavy-ion physics
• quark-gluon plasma
• UCT, iThemba

ATLAS
• particle physics
• Higgs physics
• SUSY, BSM
• UCT, UKZN, UJ, Wits

ISOLDE
• rare-isotope facility
• nuclear and atomic physics
• UKZN, UWC, Wits, iThemba

Theory
• particle, heavy-ion and nuclear physics
• UCT, UJ, Wits
ALICE Data Flow
Event
• 1 readout of the detectors
• approx. 1 collision (but: pile-up, empty events)
• data block: 1 MB (pp) to 100 MB (Pb-Pb)
• independent → embarrassingly parallel processing

Storage
• disk buffer: short-term, random-access working copy
• long term ("tape"): backup

Reconstruction
• merge signals from the same particle
• determine particle properties (momentum, energy, species)

Simulation
• event generators: model of known physics, compare experiment / theory
• particle transport: model of the detectors, correct for detector effects

User Analysis
• extraction of physics results
• based on reconstructed data
• 100s of different analyses
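Because each event is an independent data block, event processing parallelises trivially. A toy sketch of this idea (hypothetical code, not the ALICE framework; the event structure and `reconstruct` function are invented for illustration):

```python
# Toy sketch: events are independent, so workers never need to
# exchange information -- "embarrassingly parallel" processing.
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    """Stand-in reconstruction: merge signals and derive summary
    properties (here just a count and a sum)."""
    signals = event["signals"]
    return {"id": event["id"],
            "n_signals": len(signals),
            "total_energy": sum(signals)}

# A batch of fake events (real ALICE events are 1 MB pp to 100 MB Pb-Pb).
events = [{"id": i, "signals": [1.0, 2.0, 3.0]} for i in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(reconstruct, events))  # one event per task

print(results[0])
```

In the real grid the same pattern is applied at a much larger scale: each job processes a disjoint set of events and the outputs are merged afterwards.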
Reconstruction – Bubble Chambers
Raw data production: 7 PB (pp @ 8 TeV) – Big Data!
ALICE Grid Computing
Tier-0
• CERN (+ Budapest)
• reco, sim, analysis
• 1 copy of raw data

Tier-1
• reco, sim, analysis
• 1 shared copy of raw data

Tier-2
• sim, analysis
• no access to raw data
ALICE Computing Resources
• CPU cores (total: 44 000): Tier-0 10 000 (23%), Tier-1 12 000 (27%), Tier-2 22 000 (50%)
• disk storage (total: 28.7 PB): Tier-0 8.1 PB (28%), Tier-1 7.8 PB (27%), Tier-2 12.8 PB (45%)
• tape storage: Tier-0 22.8 PB, Tier-1 13.1 PB
• network
• human resources
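The tier shares quoted above follow directly from the raw numbers; a minimal check:

```python
# Recompute the per-tier shares from the raw resource numbers above.
cpu = {"Tier-0": 10_000, "Tier-1": 12_000, "Tier-2": 22_000}  # cores
disk = {"Tier-0": 8.1, "Tier-1": 7.8, "Tier-2": 12.8}         # PB

total_cpu = sum(cpu.values())
total_disk = sum(disk.values())
for tier in cpu:
    print(tier,
          f"{100 * cpu[tier] / total_cpu:.0f}% of CPU,",
          f"{100 * disk[tier] / total_disk:.0f}% of disk")
print(f"totals: {total_cpu} cores, {total_disk:.1f} PB")
```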
ALICE GRID Sites
South African Tier-2 at CHPC
iQudu Cluster
• IBM e1350 cluster
• 160 nodes
  – 2 dual-core AMD Opteron @ 2.6 GHz
  – 16 GB RAM
• Ethernet + InfiniBand
• 100 TB storage (xroot)
• launched in 2007
  – high power consumption
  – aging hardware
• used by ALICE since October 2012
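A back-of-envelope tally of the cluster capacity implied by the node specs above (a sketch; only the numbers on the slide are used):

```python
# iQudu capacity from the slide's per-node specs.
nodes = 160
cpus_per_node = 2   # dual-socket
cores_per_cpu = 2   # dual-core Opterons
ram_per_node = 16   # GB

cores = nodes * cpus_per_node * cores_per_cpu
ram = nodes * ram_per_node
print(f"{cores} cores, {ram} GB RAM total")  # 640 cores, 2560 GB
```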
ALICE Computing at CHPC
Avg: 348 running jobs (≈1% of all ALICE jobs)
Completed Jobs at CHPC
[Plot: completed jobs per month at CHPC, ~20 000 jobs/month; annotations: start of grid @ CHPC, network switch failure]
CPU delivered 2012
Armenia 0.0%
Brazil 0.3%
CERN 22.3%
China 0.0%
Czech Republic 0.9%
France CEA + IN2P3 12.0%
Germany 13.2%
Greece 0.0%
Hungary 0.6%
India 0.6%
Italy INFN + Centro Fermi 15.1%
Japan 3.5%
Korea NRF + KISTI 2.7%
Netherlands 2.8%
Norway + Denmark + Finland + Sweden 3.3%
Poland 1.8%
Romania 4.2%
Russia 4.6%
Slovak Republic 1.1%
USA DOE + NSF 8.2%
Spain 0.8%
Ukraine KIPT + Kiev 0.4%
United Kingdom 1.5%
South Africa 0.3% (projection for 2013: ~1%)
Resources Sharing
CPU Requirements – RUN2
[Plot: CPU requirements for Run-2 – +60% increase]
Disk Requirements – RUN2
[Plot: disk requirements (PB) for 2013/14-2017, showing T0, T1s, T2s and their sum – a ×2.3 increase]
CHPC Upgrade

WLCG
• sign MoU in (April) 2014
• representation in WLCG

Replace grid cluster (iQudu)
• first quarter of 2014
• 2000 cores @ 3.2 GHz
• 900 TB storage
• ALICE + ATLAS

Additional human resources

Goal: Tier-1 (parallel session "CHPC Roadmap", Friday morning)
ALICE LS2 Upgrade
2018/19 (LHC 2nd Long Shutdown)
• 50 kHz Pb-Pb collisions

ALICE Hardware Upgrade
• Inner Tracking System (ITS)
• Time Projection Chamber (TPC)

Change of Strategy
• all data into online computing farm
• continuous readout of detectors
• massive online processing
ALICE Challenges for Run-3
• data rates
  – reduce 1 TB/s to 30 GB/s
  – data compression
  – use partial reconstruction
• overlapping events
  – process time-slices
  – major change in data model

Detector | Event Size (MB) | Bandwidth (GB/s)
TPC      | 20.0            | 1000
TRD      | 1.6             | 81.5
ITS      | 0.8             | 40
Others   | 0.5             | 25
Total    | 22.9            | 1146.5
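The table totals and the required online compression factor can be checked directly from the per-detector numbers (a sketch using only the values quoted above):

```python
# Sanity-check the Run-3 numbers: per-detector bandwidths, their total,
# and the compression factor needed to reach 30 GB/s to storage.
bandwidth = {"TPC": 1000.0, "TRD": 81.5, "ITS": 40.0, "Others": 25.0}  # GB/s
event_size = {"TPC": 20.0, "TRD": 1.6, "ITS": 0.8, "Others": 0.5}      # MB

total_in = sum(bandwidth.values())   # input rate into the online farm
total_out = 30.0                     # GB/s written to storage
total_event = sum(event_size.values())

print(f"input: {total_in} GB/s, event size: {total_event:.1f} MB")
print(f"required compression: ~{total_in / total_out:.0f}x")
```

This is where the ~1 TB/s → 30 GB/s reduction on the slide comes from: a compression factor of roughly 38 achieved by data compression plus partial reconstruction.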
Computing Working Groups
CWG1 Architecture
CWG2 Tools
CWG3 Dataflow
CWG4 Data Model
CWG5 Computing Platforms
CWG6 Calibration
CWG7 Reconstruction
CWG8 Physics Simulation
CWG9 QA, DQM
CWG10 Control, Configuration
CWG11 Software Lifecycle
CWG12 Computing Hardware
Summary

Present ALICE computing
• part of WLCG
  – more than 40 000 CPU cores
  – almost 30 PB of data
  – Big Data!
• South Africa – CHPC
  – 1% of ALICE resources

Near Future
• growth within current computing model
• upgrade of CHPC – towards Tier-1

Long-term Future
• major ALICE upgrade → extreme data rates
• new computing concepts → huge R&D effort
Backup
AliRoot
O2 Project

Organisation: O2 Steering Board – Institution Boards (Computing Board, Online Institution Board), Computing Working Groups, Project Leaders; existing projects: DAQ, HLT, Offline

Computing Working Groups
• CWG1 Architecture
• CWG2 Procedure & Tools
• CWG3 Data Flow
• CWG4 Data Model
• CWG5 Platforms
• CWG6 Calibration
• CWG7 Reconstruction
• CWG8 Simulation
• CWG9 QA, DQM, Visualization
• CWG10 Control
• CWG11 Software Lifecycle
• CWG12 Hardware
• CWG13 Software Framework
(plus placeholders for future CWGs)

~50 people active in 1-3 CWGs; service tasks
O2 Hardware System

Trigger detectors (FTP: L0/L1) and detector readout (ITS, TPC, TRD, TOF, Muon, EMC, PHO)
→ ~2500 links in total (2 × 10 or 40 Gb/s) into ~250 FLPs (First Level Processors)
→ farm network (10 Gb/s)
→ ~1250 EPNs (Event Processing Nodes)
→ storage network → data storage
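The FLP → EPN stage above amounts to timeframe building: each FLP ships its sub-timeframe for a given time slice, and an EPN can process the slice once contributions from all FLPs have arrived. A toy sketch of that idea (an assumed simplification, not the real O2 framework; all names are invented for illustration):

```python
# Toy timeframe building: an EPN buffers sub-timeframes per time slice
# and releases a slice for processing once every FLP has contributed.
from collections import defaultdict

N_FLPS = 4                        # real system: ~250 FLPs
epn_buffers = defaultdict(dict)   # slice id -> {flp id: payload}

def receive(slice_id, flp_id, data):
    """Collect one sub-timeframe; return the full slice when complete."""
    epn_buffers[slice_id][flp_id] = data
    if len(epn_buffers[slice_id]) == N_FLPS:
        return epn_buffers.pop(slice_id)   # complete timeframe
    return None                            # still waiting for FLPs

full = None
for flp in range(N_FLPS):
    full = receive(slice_id=0, flp_id=flp, data=f"payload-{flp}")
print(sorted(full))   # → [0, 1, 2, 3]: all FLP contributions present
```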
Dataflow Model