Data Intensive Computing in CMS Experiment

V. Karimäki (HIP)
Finnish DataGrid meeting, CSC, Otaniemi, 28 August 2000
Outline of the Talk

- LHC computing challenge
- Hardware challenge
- CMS software
- Database management system
- Regional Centres
- DataGrid WP 8 in CMS
- Summary
Challenge: Collision Rates
Challenges: Event Complexity

- Signal event is obscured by 20 overlapping uninteresting collisions in the same crossing
- Track reconstruction time at luminosity 10^34 cm^-2 s^-1 is several times that at 10^33
- Time does not scale from previous generations
Challenges: Geographical Dispersion

- Geographical dispersion: of people and resources
- Complexity: the detector and the LHC environment
- Scale: petabytes per year of data

1800 physicists, 150 institutes, 32 countries

Major challenges associated with:
- Coordinated use of distributed computing resources
- Remote software development and physics analysis
- Communication and collaboration at a distance

R&D: new forms of distributed systems
Challenges: Data Rates

Online system: a multi-level trigger filters out background and reduces the data volume.

- Level 1 (special hardware): 40 MHz (40 TB/sec) in, 75 kHz (75 GB/sec) out
- Level 2 (embedded processors): 75 kHz in, 5 kHz (5 GB/sec) out
- Level 3 (PCs): 5 kHz in, 100 Hz (100 MB/sec) out, to data recording & offline analysis
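The quoted byte rates are consistent with an event size of ~1 MByte (as quoted later on the Worldwide Computing Plan slide). A minimal sketch of the chain's rejection factors, under that assumption:

```python
# Trigger chain rates from the slide (events per second).
rates_hz = {"collisions": 40e6, "level1": 75e3, "level2": 5e3, "level3": 100}

# Assumed ~1 MByte per event, consistent with the quoted byte rates.
event_size_mb = 1.0

# Data rate surviving each stage, in MB/s (40 TB/s in, 100 MB/s out).
data_rate_mb_s = {k: v * event_size_mb for k, v in rates_hz.items()}

# Rejection factor achieved at each trigger level.
stages = list(rates_hz)
for prev, cur in zip(stages, stages[1:]):
    print(f"{cur}: keeps 1 event in {rates_hz[prev] / rates_hz[cur]:.0f}")
```

Overall the three levels reject all but 1 in 400,000 crossings.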
PetaByte Mass Storage

- Each silo has 6,000 slots, each of which can hold a 50 GB cartridge
- ==> theoretical capacity: 1.2 PetaBytes
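The slide quotes the total capacity but not the silo count; a quick check of the arithmetic (the four-silo figure is inferred, not stated on the slide):

```python
slots_per_silo = 6_000        # slots per tape silo (from the slide)
cartridge_gb = 50             # GB per cartridge

silo_tb = slots_per_silo * cartridge_gb / 1_000   # one silo: 300 TB
quoted_total_tb = 1.2 * 1_000                     # quoted 1.2 PB, in TB

# The 1.2 PB total implies four silos of this type.
n_silos = quoted_total_tb / silo_tb
print(silo_tb, n_silos)
```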
The New Supercomputer?
From http://now.cs.berkeley.edu (The Berkeley NOW project)
Event Parallel Processing System

About 250 PCs, with 500 Pentium processors, are currently installed for offline physics data processing.
Cost Evolution: CMS 1996 Versus 1999 Technology Tracking Team

Compare to 1999 Technology Tracking Team projections for 2005:
- CPU: unit cost will be close to early prediction
- Disk: will be more expensive (by ~2x) than early prediction
- Tape: currently zero to 10% annual cost decrease (potential problem)

[Chart: CMS 1996 cost estimates]
Data Challenge Plans in CMS

- Dec 2000: Level 1 Trigger TDR. First large-scale productions for trigger studies.
- Dec 2001: DAQ TDR. Continue High Level Trigger studies; production at Tier0 and Tier1s.
- Dec 2002: Software and Computing TDR. First large-scale Data Challenge (5%); use the full chain from online farms to production in Tier0, 1, 2 centers.
- Dec 2003: Physics TDR. Test physics performance; need to produce large amounts of data; verify technology choices by performing distributed analysis.
- Dec 2004: Second large-scale Data Challenge (20%). Final test of scalability of the fully distributed CMS computing system.
Hardware - CMS Computing

Total computing cost to 2006 inclusive: ~120 MCHF, roughly consistent with the canonical 1/3 : 2/3 rule.

- ~40 MCHF: central systems at CERN
- ~40 MCHF: ~5 Regional Centres, each ~20% of the central systems
- ~40 MCHF (?): universities, Tier2 centres, MC, etc.
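A quick consistency check of the split above (a sketch only; the "(?)" on the last item is the slide's own uncertainty, kept as a plain 40 MCHF here):

```python
central_mchf = 40        # central systems at CERN
n_regional = 5           # ~5 Regional Centres
regional_frac = 0.20     # each ~20% of the central systems

regional_mchf = n_regional * regional_frac * central_mchf  # 5 x 8 = 40 MCHF
other_mchf = 40          # universities, Tier2 centres, MC, etc.

total_mchf = central_mchf + regional_mchf + other_mchf
print(total_mchf)        # 120 MCHF, with CERN holding the canonical 1/3 share
```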
Computing Tasks - Software

Off-line computing:
- Detector simulation: OSCAR
- Physics simulation: CMKIN
- Calibration
- Event reconstruction and analysis: ORCA
- Event visualisation: IGUANA
CMS Software Milestones

CORE SOFTWARE
- End of Fortran development
- GEANT4 simulation of CMS (phases 1-4)
- Reconstruction/analysis framework (phases 1-4)
- Detector reconstruction (phases 1-4)
- Physics object reconstruction (phases 1-4)
- User analysis environment (phases 1-4)

DATABASE
- Use of ODBMS for test-beam
- Event storage/retrieval from ODBMS (phases 1-4)
- Data organisation/access strategy
- Filling ODBMS at 100 MB/s
- Simulation of data access patterns
- Integration of ODBMS and MSS
- Choice of vendor for ODBMS
- Installation of ODBMS and MSS

Phase legend: 1 = proof of concept, 2 = functional prototype, 3 = fully functional, 4 = production system.

[Timeline chart: milestone dates span Dec-97 to 2005; per-milestone dates not recoverable from the extraction]

We are well on schedule!
Worldwide Computing Plan

- Online System: one bunch crossing per 25 nsecs; 100 triggers per second; each event is ~1 MByte in size
- Detector to Online System: ~PBytes/sec
- Online System to Tier 0 (+1) (Offline Farm, CERN Computer Center, >20 TIPS): ~100 MBytes/sec
- Tier 0 to Tier 1 Regional Centers (Fermilab ~4 TIPS; France, Italy and UK Regional Centers): ~2.4 Gbits/sec, or ~622 Mbits/sec, or air freight
- Tier 1 to Tier 2 Centers (~1 TIPS each): ~622 Mbits/sec
- Tier 2 to Tier 3 institute servers (~0.25 TIPS each): 100 - 1000 Mbits/sec
- Tier 4: physicists' workstations

Physicists work on analysis "channels". Each institute has ~10 physicists working on one or more channels; data for these channels should be cached by the institute server (physics data cache).

1 TIPS = 25,000 SpecInt95; PC (1999) = 15 SpecInt95
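The TIPS unit can be put in perspective with the slide's own conversion factors (the PC-count figures below are derived, not quoted on the slide):

```python
specint95_per_tips = 25_000   # 1 TIPS = 25,000 SpecInt95 (slide definition)
pc_1999_specint95 = 15        # a 1999-vintage PC = 15 SpecInt95

pcs_per_tips = specint95_per_tips / pc_1999_specint95
cern_tips = 20                # CERN computer center: > 20 TIPS

print(round(pcs_per_tips))               # ~1667 1999-vintage PCs per TIPS
print(round(cern_tips * pcs_per_tips))   # > ~33,000 such PCs for the CERN center
```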
Computing at Regional Centers (model circa 2005)

- Tier0: CERN/CMS, 350k SI95, 350 TB disk, tape robot
- Tier1: FNAL/BNL, 70k SI95, 70 TB disk, tape robot (N x 622 Mb/s links to Tier0)
- Tier2 Center: 20k SI95, 20 TB disk, tape robot (622 Mb/s link to Tier1)
- Tier3: university working groups 1 ... N
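The capacities above can be cross-checked against the cost slide's claim that each Regional Centre is ~20% of the central systems (a consistency check, not stated on this slide):

```python
tier0_si95 = 350_000   # CERN/CMS central systems
tier1_si95 = 70_000    # one Regional Centre, e.g. FNAL/BNL
tier2_si95 = 20_000    # a Tier2 center

ratio = tier1_si95 / tier0_si95
print(ratio)   # 0.2: matches "each ~20% of central systems" from the cost slide
```

The same 5:1 scaling holds for disk (350 TB vs. 70 TB).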
Regional Centre Architecture (example by I. Gaines)

Services and infrastructure:
- Tape mass storage & disk servers; database servers; tapes
- Network from CERN; network from Tier 2 & simulation centers
- Physics software development; R&D systems and testbeds
- Info servers and code servers; web servers and telepresence servers
- Training, consulting, help desk

Workflows:
- Production reconstruction (Raw/Sim -> ESD): scheduled, predictable; run by experiment/physics groups
- Production analysis (ESD -> AOD, AOD -> DPD): scheduled; run by physics groups
- Individual analysis (AOD -> DPD and plots): chaotic; run from physicists' desktops

Connected to Tier 2 centres, local institutes, and CERN.
CMS Production 2000 - Grid WP 8

- CMKIN produces HEPEVT ntuples (physics generation)
- CMSIM turns these into signal Zebra files with HITS (detector simulation)
- The ORCA ooHit formatter writes the hits into an Objectivity Database
- ORCA digitization merges signal and minimum-bias (MB) events in the Objectivity Database
- HLT algorithms produce new reconstructed objects
- Catalog imports link the MC production, ORCA production and HLT group databases
- The Objectivity databases are mirrored (US, Russia, Italy, ...)
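The chain above can be sketched as an ordered pipeline (stage names are taken from the diagram; the role descriptions are paraphrased, not quoted):

```python
# Illustrative ordering of the WP 8 production chain described above.
pipeline = [
    ("CMKIN",                "physics generation",   "HEPEVT ntuples"),
    ("CMSIM",                "detector simulation",  "signal Zebra files with HITS"),
    ("ORCA ooHit formatter", "hit formatting",       "hits in Objectivity Database"),
    ("ORCA digitization",    "merge signal and MB",  "digis in Objectivity Database"),
    ("HLT algorithms",       "trigger studies",      "new reconstructed objects"),
]

for stage, role, product in pipeline:
    print(f"{stage}: {role} -> {product}")
```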
Summary

- Challenges: high rates, large data sets, complexity, worldwide dispersion, cost
- Solutions: event parallelism, commodity components, computing modelling, distributed computing, OO paradigm, OO database
- Planning: CMS is on schedule with its various milestones
- DataGrid WP 8: production of a large number of events in fall 2000