
INFN Computing for LHCb

Domenico Galli, Umberto Marconi, and Vincenzo Vagnoni

Genève, February 15, 2001


Changes in INFN Computing Plans

Before December 2000:
- Plans for the Tier-1 were different for the 4 LHC experiments.
- ALICE, ATLAS and CMS planned for a Tier-1 “distributed” among 2 or 3 sites.

After December 2000: INFN plans to build a Regional Computing Centre (RCC), located at CNAF (Bologna) in the first instance, which:
- acts as a unique “concentrated” Tier-1 for all 4 LHC experiments;
- acts as Tier-A for BaBar;
- supplies enough computing power and off-line storage for VIRGO.


INFN computing projects for LHC

Short term project: INFN-Grid.
- Started (and funded) on January 1, 2001. Term: 3 years.
- Funds: 1.5 M€ (2001) + 4 M€ (2002) + 4 M€ (2003).
- Aim: develop Grid software in collaboration with DATAGRID; develop a Grid test-bed; fund the hardware and consumables needed in Tier-n prototype testing.

Long term project: INFN-RCC.
- Will start before April 2001.
- Funds: about 22 M€ available soon.
- Aim: supply the computing resources for VIRGO and the Tier-A computing centre for BaBar; build the final Italian Tier-1 Regional Centre for LHC (30% in 2005, 60% in 2006, 100% in 2007).


Short Term: LHCb Funded Investments in 2001

Funded items:
- Racks (with cooling and power supply): 1 unit, 3 k€.
- Commodity mono- or bi-processor motherboards (100Base-TX network IF with BOOT-PROM, Intel Pentium III 800 MHz CPU, 256 MB RAM, 133 MHz bus): 15 units, 15 k€.
- NAS RAID-5 Raidzone OpenNAS RS15-R1200 (dual-processor Pentium III 800 MHz, 15 IDE 80 GB Ultra ATA/100 disks, dual 100 Mbps network IF configured in port trunking, dual redundant 300 W hot-swap power supplies): 1 unit, 30 k€.
- 56-port Ethernet switch: 1 unit, 3.5 k€.

Total: 51.5 k€ + 20% VAT.


Short Term: LHCb Further Investments in 2001

The availability of more funds later in 2001 is conditional on:
- The production of a credible computing model for 2001 (given the present WAN bandwidth).
- Evidence of a high load on the installed CPUs.
- The use of the produced MC by the Italian institutes for their own analyses.


2001 Farm for LHCb-Italy

LHCb-Italy plans to build a diskless (and swapless) farm with a concentrated file server (NAS):
- Cheaper in hardware, both in the initial purchase and in subsequent maintenance;
- Cheaper in software support;
- More robust in demanding computing environments: no moving parts in the PCs;
- Swapping and paging are not needed on the job processing nodes: their memory consumption is predictable, and having any of their memory swapped out would only degrade performance.

A possible disadvantage is the network load during simultaneous booting, but this can be completely overcome by using MTFTP (multicast TFTP), as sketched below.
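The simultaneous-boot point above relies on multicast delivery. The following minimal Python sketch illustrates only the multicast idea behind MTFTP — one transmitted boot image received by every node that has joined the group — not the actual MTFTP protocol, which adds open/acknowledge/close phases (see the PXE slides later). The group address, port, and end-of-transfer convention are illustrative assumptions.

import socket
import struct

MCAST_GROUP = "239.0.0.1"   # assumed administratively scoped multicast group
MCAST_PORT = 4015           # assumed port

def receive_boot_image() -> bytes:
    """Join the multicast group and collect the boot image broadcast by the server."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", MCAST_PORT))
    # ip_mreq: multicast group address + local interface (INADDR_ANY)
    mreq = struct.pack("4s4s", socket.inet_aton(MCAST_GROUP), socket.inet_aton("0.0.0.0"))
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    image = b""
    while True:
        chunk, _sender = s.recvfrom(2048)
        if not chunk:            # assumed end-of-transfer marker (empty datagram)
            break
        image += chunk
    return image

The key property is that the server transmits the image once, regardless of how many nodes boot at the same time.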


2001 Farm for LHCb-Italy (II)

[Diagram: farm architecture. Login nodes (each with CPU, DRAM, chipset, network interface on the PCI bus, and a local IDE swap disk) and diskless, swapless job execution nodes are connected through an Ethernet switch to the Network Attached Storage, which is built around a switched node backplane with per-node disk interfaces driving Ultra ATA disks.]


2001 Components (Motherboards, Racks, NAS, Switches)


Preliminary Farm Tests

NFS operation:
- NFS is needed to collect the MC output on RAID disk arrays (instead of spreading it among hundreds of non-redundant local disks).
- The MC output rate is low (a few kB/s) compared with the NFS performance (~7-10 MB/s on Fast Ethernet); see the estimate below.

Swap-less operation:
- The memory demand of the job execution nodes is stable.
- A swap area is useless on the job execution nodes (but is required on the login nodes).
- Monte Carlo performance does not depend on the presence of a swap area.

NFS functionality:
- Kernel version 2.2.18 or higher is required on the clients, in order to allow broken setuid on the NFS root.
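As a rough cross-check of the NFS figures above, the short Python sketch below compares the aggregate MC output rate with the measured NFS throughput. The farm size and the exact per-job rate are assumptions chosen to be consistent with the numbers quoted on the slide.

nodes = 100                  # assumed number of job execution nodes
per_job_rate_kb_s = 5        # "a few kB/s" of MC output per job (from the slide)
nfs_throughput_mb_s = 7      # lower end of the measured NFS throughput (from the slide)

aggregate_mb_s = nodes * per_job_rate_kb_s / 1024
print(f"aggregate MC output: {aggregate_mb_s:.2f} MB/s "
      f"({100 * aggregate_mb_s / nfs_throughput_mb_s:.0f}% of the NFS throughput)")
# -> about 0.5 MB/s, i.e. well below what a single NAS can sustain over NFS.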


Preliminary Farm Tests (II)

RAM errors:
- It seems that after the 1999 Taiwan earthquake the quality of the memory modules on the market dropped significantly.
- The Memtest86 program can detect cross-talk errors in RAM modules without ECC;
- The Badram kernel patch can work around them (marking bad memory locations as “permanently used”).

Switching nodes on and off remotely:
- Wake-on-LAN (from the Intel Wired-for-Management specification) allows computers to be switched on remotely, but does not allow them to be switched off (a minimal sketch of the mechanism is shown below).
- Wake-on-LAN is therefore not useful for rebooting a hung node.
- A different solution must be found (e.g. remote power switch control; products by BayTech, Western Telematic, National Instruments).
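To make the remote power-on mechanism concrete, here is a minimal Python sketch of a Wake-on-LAN sender. The magic packet format (6 bytes of 0xFF followed by the target MAC address repeated 16 times, sent as a UDP broadcast) is standard; the MAC address and broadcast address below are placeholders.

import socket

def wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send a Wake-on-LAN magic packet to switch on the node with the given MAC."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes long")
    payload = b"\xff" * 6 + mac_bytes * 16          # the magic packet
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, (broadcast, port))

wake_on_lan("00:50:8b:12:34:56")   # hypothetical MAC address of a farm node

As noted on the slide, this can only switch a node on; switching a hung node off still requires a remotely controlled power switch.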


Preliminary Farm Tests (III)

Network boot:
- Takes a few seconds more than booting from a local HD.
- Requires a BOOT-PROM on the network interface card.
- PXE (Preboot Execution Environment, from the Intel Wired-for-Management specifications) compliance is not mandatory, but strongly preferred.
- PXE allows a multiple boot-server configuration, providing load balancing and redundancy in a large installation.
- MTFTP (Multicast Trivial File Transfer Protocol) is not an IETF standard; PXE defines a proprietary implementation with widespread usage. It has 4 phases: listen for a matching MTFTP session already in progress; open a new MTFTP session; receive the MTFTP data with acknowledgement; close the MTFTP session.
- The Intel-Red Hat Linux implementation of PXE 2.1 exists and works; we tested it with a 3Com EtherLink 3C905C-TX-M NIC. A Linux implementation of PXE 32/64 is almost ready.


PXE Boot Sequence

1. The PXE client broadcasts a DHCP Discover to port 67, containing the “PXE Client” extension tags.
2. The DHCP/Proxy DHCP server sends an extended DHCP Offer to port 68, containing the PXE server extension tags + [other DHCP option tags] + the boot server list, the client IP address, and the multicast discovery IP address.
3. The client sends a DHCP Request to port 67, containing the “PXE Client” extension tags + [other DHCP option tags]; the DHCP/Proxy DHCP server replies with a DHCP Acknowledge to port 68.
4. The client sends a Boot Service Discover to port 67 or 4011, containing the “PXE Client” extension tags + [other DHCP option tags].
5. The boot server replies with a Boot Service Ack to the client's source port, containing the [PXE server extension tags], which carry the Network Bootstrap Program file name.
6. The client requests the Network Bootstrap Program download from TFTP port 69 or the MTFTP port (taken from the Boot Service Ack), and the M/TFTP server downloads the Network Bootstrap Program to the client's port.
7. The client executes the downloaded boot image.
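To make step 1 of the sequence above concrete, here is a minimal Python sketch that builds a DHCP Discover carrying the “PXEClient” vendor-class identifier (option 60). The fixed header follows the standard BOOTP/DHCP layout; real PXE clients append further vendor options (client architecture, UNDI version, etc.), which are omitted here, and the example MAC address is a placeholder.

import random
import struct

def build_pxe_dhcp_discover(mac: bytes) -> bytes:
    """Build a minimal DHCP Discover with the "PXEClient" vendor class (option 60)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,                        # op: BOOTREQUEST
        1,                        # htype: Ethernet
        6,                        # hlen: MAC address length
        0,                        # hops
        random.getrandbits(32),   # xid: transaction identifier
        0,                        # secs
        0x8000,                   # flags: ask the server for a broadcast reply
        b"\x00" * 4,              # ciaddr (the client has no IP address yet)
        b"\x00" * 4,              # yiaddr
        b"\x00" * 4,              # siaddr
        b"\x00" * 4,              # giaddr
        mac.ljust(16, b"\x00"),   # chaddr: client hardware address
        b"\x00" * 64,             # sname
        b"\x00" * 128,            # file
    )
    vendor_class = b"PXEClient"
    options = (
        b"\x63\x82\x53\x63"                                      # DHCP magic cookie
        + b"\x35\x01\x01"                                        # option 53: DHCP Discover
        + b"\x3c" + bytes([len(vendor_class)]) + vendor_class    # option 60: vendor class
        + b"\xff"                                                # end of options
    )
    return header + options

# Example (hypothetical node MAC address); the resulting packet would be
# broadcast over UDP to port 67, as in step 1 of the sequence above.
packet = build_pxe_dhcp_discover(bytes.fromhex("00508b123456"))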


Long Term: The Proposed INFN Unique Tier-1 RCC

The proposed unique INFN Tier-1 Regional Computing Centre for the 4 LHC experiments does not change LHCb-Italy's computing plans:
- From the beginning, LHCb-Italy planned to build a “concentrated” Tier-1, and already in 2001 it will place its computing resources in a single site (the Tier-1 prototype at CNAF).
- ALICE, ATLAS and CMS, on the contrary, had to revise their plans. They are now trying to move resources from Tier-1s to Tier-2s, but it is not clear whether this scheme will be practicable.


Final Tier-n Organization in Italy

ATLAS-Italy plan:
- The Tier-1 stores the ESD and performs reprocessing and the first analysis stage (AOD production). No MC production.
- The Tier-2s store the AOD and perform MC production + analysis.
- The Tier-3s execute analysis.

CMS-Italy plan:
- The Tier-1 stores the ESD and performs the collaboration-scheduled MC production + reprocessing.
- The Tier-2++ (Legnaro) performs locally requested MC production + reprocessing + analysis.
- The Tier-2s/Tier-3s perform locally requested MC production + analysis; the Tier-2/Tier-3 distinction is not based on function.


Final Tier-n Organization in Italy (II)

LHCb-Italy plan:
- LHCb distributes AOD (not ESD) to the Tier-1s; user analyses at the Tier-3s will access mainly AOD.
- LHCb-Italy believes that in the coming years most of the manpower will be absorbed by the Tier-1 construction.
- There are no pre-existing farms of reasonable size in the INFN institutes.
- Rather than building possible Tier-2s, it is preferable for LHCb-Italy to concentrate on the Tier-1.
- LHCb-Italy therefore plans to build: 1 Tier-1; 9 Tier-3s (one for each institute).


LHCb-Italy Tier-n Tasks

Tier-1:
- MC production (producing RAWmc and ESDmc) + production analysis (producing AOD) + MC reprocessing following changes in the reconstruction algorithms (producing ESDmc).
- User analysis (selection task) in collaboration with the Tier-3s.
- Storage of the RAWmc and ESDmc produced in the centre itself.
- Storage of all the AOD (real AOD produced at CERN, MC AOD produced in all the Tier-1 centres).

Tier-3s:
- User analysis (processing task, producing DPD) in collaboration with the Tier-1.
- Interactive analysis of the DPD.
- Storage of AOD selections.
- Storage of DPDs.


LHCb Requirements for the INFN Unique Tier-1

The INFN unique Tier-1 must place at LHCb's disposal the computing resources needed (in terms of CPU power, disk/tape storage, connectivity, etc.), with the requirements demanded by LHCb (operating system, experiment software, etc.).

Personnel at the Tier-1 should include:
- Qualified system administrators;
- Computer scientists motivated by an interest in computing methods;
- Physicists directly involved in the analysis, motivated by the scientific results.

The experiments should exert strong steering and control over the Tier-1 (more like a “board of directors” than a “user council”).


LHCb Tier-1 Capacity Needs

Processing Step                Output Data   CPU Power   Disk Storage   Active Tape    Archive Tape
                                             [SI95]      [TB]           Storage [TB]   Storage [TB]
Real Data                      AOD+TAG       —           40             80             0
Simulation/Reconstruction      RAW+ESD       110000      23             70             40
Production MC Analysis         AOD+TAG       8000        18             35             0
Calibration                    —             —           10             0              10
Disk Cache for Staging Data    —             —           15             0              0
User Analysis                  DPD           23000       5              0              5
Total                          —             141000      111            185            55
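A quick consistency check (values taken directly from the table above, with “—” counted as 0) confirming that the Total row is the column-wise sum of the per-step rows:

# (CPU [SI95], disk [TB], active tape [TB], archive tape [TB]) per processing step
rows = {
    "Real Data":                   (0,      40, 80,  0),
    "Simulation/Reconstruction":   (110000, 23, 70, 40),
    "Production MC Analysis":      (8000,   18, 35,  0),
    "Calibration":                 (0,      10,  0, 10),
    "Disk Cache for Staging Data": (0,      15,  0,  0),
    "User Analysis":               (23000,   5,  0,  5),
}
totals = [sum(column) for column in zip(*rows.values())]
assert totals == [141000, 111, 185, 55]   # matches the Total row of the table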


LHCb-Italy Required Personnel

Role                                             Tier-1 [FTE]   Tier-3 [FTE]
Support for R&D and general software tools       4              —
Support for experiment-specific LHCb software    2              0.5
System administration                            2              0.5


Final LHCb Tier-n Organization in Italy

Centre   #   Location                                  Resources
Tier-1   1   CNAF                                      140 kSI95, 110 TB disk, 185 TB active tape, 55 TB archive tape
Tier-2   0   —                                         —
Tier-3   9   Bologna, Cagliari, Ferrara, Firenze,      Average (10 physicists): 5 kSI95, 10 TB disk; may vary
             Frascati, Genova, Milano, Roma1, Roma2    depending on group size and analysis activity


RCC Project Committee

February 6, 2001: the INFN President appointed a Project Committee for the Regional Computing Centre:
- Paolo Capiluppi (CMS, Bologna)
- Domenico Galli (LHCb, Bologna)
- Alberto Masoni (ALICE, Cagliari)
- Mauro Morandin (BaBar, Padova)
- Laura Perini (ATLAS, Milano)
- Fulvio Ricci (VIRGO, Roma)
- Federico Ruggieri (CNAF, Bologna)

Other experts will be involved.


RCC Preliminary Time Schedule

- February 13, 2001: first meeting of the Project Committee.
- April 10, 2001: detailed working plan for the experimental phase and general working plan for the execution phase.
- July 2001: adjustment of the plan to the results of the LHC Computing Review.
- End of 2001: study of the technical, economic, and organizational problems; beginning of the experimental phase.
- End of 2002: revision on the basis of the experiments' commitments in the MoU.
- End of 2003: end of the experimental phase.

