Computing in the PHENIX experiment and our experience in Korea
Presented by H.J. Kim, Yonsei University
On behalf of the HEP Data Grid Working Group
The Second International Workshop on HEP Data Grid
CHEP, KNU Aug. 22-23, 2003
RHIC
Configuration: two concentric superconducting magnet rings (3.8 km circumference) with 6 interaction regions
Ion beams: Au + Au (or d + A), √s = 200 GeV per nucleon pair, luminosity = 2×10^26 cm^-2 s^-1
Polarized protons: p + p, √s = 500 GeV, luminosity = 1.4×10^31 cm^-2 s^-1
Experiments: PHENIX, STAR, PHOBOS, BRAHMS
PHENIX Experiment
4 spectrometer arms
12 Detector subsystems
350,000 detector channels
Event size typically ~90 kB
Event rate 1.2-1.6 kHz (2 kHz observed)
Typical data rate 100-130 MB/s
Expected duty cycle ~50% -> ~4 TB/day
We store the data in STK Tape Silos with HPSS
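As a rough consistency check of the quoted daily volume (assuming the typical ~100 MB/s rate and ~50% duty cycle above):
$0.1\ \mathrm{GB/s} \times 0.5 \times 86400\ \mathrm{s/day} \approx 4.3\ \mathrm{TB/day}$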
Physics Goals
Search for the Quark-Gluon Plasma; Hard Scattering Processes; Spin Physics
RCF
RHIC Computing Facility (RCF) provides computing facilities for four RHIC experiments (PHENIX, STAR, PHOBOS, BRAHMS).
RCF typically receives ~30 MB/s (a few TB/day) from the PHENIX counting house alone, over a Gigabit network. RCF therefore needs sophisticated data storage and data handling systems.
RCF has established an AFS cell for sharing files with remote institutions; NFS is the primary means through which data is made available to users at RCF.
A similar facility has been established at RIKEN (CC-J) as a regional computing center for PHENIX.
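For scale, a hedged back-of-the-envelope conversion of the rate above:
$30\ \mathrm{MB/s} \times 86400\ \mathrm{s/day} \approx 2.6\ \mathrm{TB/day}$,
consistent with the "few TB/day" figure.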
Grid Configuration concept in RCF
[Diagram: grid job requests are submitted over the Internet to a gatekeeper / job manager, which dispatches the jobs to an LSF Grid Cluster (LSFServer1, LSFServer2) with access to local disks (620 GB) and HPSS; the data path is labeled 30 MB/sec.]
PHENIX Computing Environment
Linux OS with the ROOT framework and PHOOL (PHenix Object Oriented Library), a C++ class library
[Data-flow diagram: raw data flows from the Counting House to HPSS and a large buffer disk; the Reconstruction Farm, with its local disks, reads calibrations & run info from the database and writes DST data to a big disk; mining & staging moves raw data and DSTs between HPSS and the buffer disks for analysis jobs, which also use a Tag DB.]
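As a minimal sketch of what an analysis job in this ROOT-based environment can look like: the file, tree, and branch names below are hypothetical, and this is plain ROOT rather than the actual PHOOL interface.

  // read_dst.C -- minimal ROOT macro; file/tree/branch names are hypothetical
  #include "TFile.h"
  #include "TTree.h"
  #include "TH1F.h"

  void read_dst()
  {
    // Open a (hypothetical) DST file staged from HPSS to local disk
    TFile *f = TFile::Open("dst_run12345.root");
    if (!f || f->IsZombie()) return;

    // Attach to a (hypothetical) event tree and one of its branches
    TTree *t = (TTree *)f->Get("dstTree");
    float pt = 0;
    t->SetBranchAddress("track_pt", &pt);

    // Fill a simple spectrum over all events
    TH1F *h = new TH1F("h_pt", "track p_{T};p_{T} (GeV/c);counts", 100, 0., 10.);
    const Long64_t n = t->GetEntries();
    for (Long64_t i = 0; i < n; ++i) {
      t->GetEntry(i);
      h->Fill(pt);
    }
    h->Draw();
  }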
PHENIX DAQ Room
A picture from the PHENIX Webcam
PHENIX Computing Resources
Locally in the countinghouse:
•~110 Linux machines of different ages (26 dual 2.4 GHz, many 900-1600 MHz machines), loosely based on RedHat 7.3
•Configuration, reinstallation, very low maintenance
•8TB local disk space
At the RHIC computing facility:
•40TB disk (PHENIX’s share)
•4 tape silos w/ 420 TB each (no danger of running out of space here)
•~320 fast dual CPU Linux machines (PHENIX’s share)
Run ‘03 highlights
•Replaced essentially all previous Solaris DAQ machines with Linux PCs
•Went to Gigabit for all the core DAQ machines
•Tripled the CPU power in the countinghouse to perform online calibrations and analysis
•Boosted our connectivity to the HPSS storage system to 2×1000 Mbit/s (190 MB/s observed)
•Data taken: ~¼ PB/year
Raw data in Run-II: Au-Au 100 TB, d-Au 100 TB (100 days), p-p 10 TB; p-p in Run-III: 35 TB
Micro-DST (MDST): 15% of raw data
Au-Au, d-Au: 15 TB; p-p: 1.5 TB (Run-II), 5 TB (Run-III)
PHENIX Upgrade Plans
Locally in the countinghouse:
•go to Gigabit (almost) everywhere (replace ATM)
•use compression
•replace older machines, reclaim floor space, gain another factor of 2 in CPU and disks
•get cheap commodity disk space
In general:
•Move to gcc 3.2, loosely based on some RedHat 8.x flavor
•We have already augmented Objectivity with convenience databases (Postgres, MySQL), which will play a larger role in the future (a minimal query sketch follows this list)
•Network restructuring
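A minimal sketch of how such a convenience database could be queried from ROOT, assuming a hypothetical PostgreSQL host, table, and column; this uses ROOT's generic TSQLServer interface, not any PHENIX-specific API.

  // query_db.C -- hypothetical calibration lookup via ROOT's TSQLServer
  #include "TSQLServer.h"
  #include "TSQLResult.h"
  #include "TSQLRow.h"
  #include <cstdio>

  void query_db()
  {
    // Connect to a hypothetical PostgreSQL server holding calibration constants
    TSQLServer *db = TSQLServer::Connect("pgsql://dbhost.example/calib",
                                         "reader", "password");
    if (!db) return;

    // Fetch one (hypothetical) calibration constant for a given run
    TSQLResult *res = db->Query(
        "SELECT gain FROM emcal_calib WHERE run = 92345");
    if (res) {
      if (TSQLRow *row = res->Next()) {
        printf("gain = %s\n", row->GetField(0));
        delete row;
      }
      delete res;
    }
    db->Close();
    delete db;
  }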
Network data transfer at CHEP, KNU
● Real-time data transfer between CHEP (Daegu, Korea) and CCJ (RIKEN, Japan): 2 TB of physics data has been transferred in a single bbftp session. Observed rates:
maximum ~200 GB/day, typical ~100 GB/day.
● ~200 GB/day between CCJ and RCF (BNL, USA) is known; a comparable speed is expected between CHEP and RCF.
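For reference, averaged over a full day these volumes correspond to roughly
$100\ \mathrm{GB/day} \approx 1.2\ \mathrm{MB/s} \approx 9\ \mathrm{Mbit/s}$ and $200\ \mathrm{GB/day} \approx 2.3\ \mathrm{MB/s} \approx 19\ \mathrm{Mbit/s}$.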
Mass storage at KNU for PHENIX
Mass storage (HSM at KNU):
2.4 TB assigned to PHENIX out of a total of 50 TB; a bit over 2 TB used efficiently by PHENIX so far.
Experience: reliable storage up to 2 TB (5 TB may be possible). Optimized usage is under study.
Computing nodes at KT, Daejeon
10 nodes (Intel Xeon 2.0 GHz CPUs, 2 GB memory each). Centralized management of the analysis software is possible by NFS-mounting the CHEP disk on the cluster, supported by the fast network between KT and CHEP.
CPU usage for the PHENIX analysis is always above 50%.
Experience: the network between CHEP and KT is satisfactory; no difference from computing on the local cluster is seen.
[Diagram: the KT cluster nodes (KT cluster0 through KT cluster9) NFS-mount scripts, the software library, and data from the CHEP side (CHEP17); HSM mass storage of 2.5 TB; network transfers of ~100 GB/day.]
Yonsei Computing Resources for PHENIX
Linux (RedHat 7.3) ROOT Framework
[Diagram: Pentium 4 Linux computers run reconstruction and analysis jobs; a database holds calibrations & run info and a Tag DB; raw data & DSTs are kept on a 1.3 TB big disk built with RAID tools for Linux; the PHENIX library is obtained from CHEP over a 100 Mbps link.]
RAID IDE Storage System R&D at Yonsei
● IDE RAID5 system with Linux
● 8 disks (1 for parity) × 180 GB = 1.3 TB of storage
● Total cost ~$3000 (interface card + hard disks + PC)
● Write ~100 MB/s, read ~150 MB/s
● No serious problems experienced so far
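As a quick check of the quoted capacity (in RAID5, one disk's worth of parity is distributed across the array):
$(8-1) \times 180\ \mathrm{GB} = 1260\ \mathrm{GB} \approx 1.3\ \mathrm{TB}$ of usable space.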
Summary
* The computing facilities of the PHENIX experiment and their current status are reported. They handle a large data volume (~1/4 PB/year) well, and a move toward the HEP Grid is under way.
* We investigated parameters relevant to HEP Data Grid computing for PHENIX at CHEP:
1. Network data transfer: real-time transfer speeds of ~100 GB/day between the major research facilities CHEP (Korea), CCJ (Japan), and RCF (USA)
2. Mass storage (2.5 TB, HSM at CHEP)
3. 10 computing nodes (at KT, Daejeon)
Data analysis is in progress (Run-II p-p MDST); the overall analysis experience has been satisfactory.
* IDE RAID storage R&D is under way at Yonsei Univ.