Date posted: 21-Dec-2015
2
Introduction
• During January Paul and I had a meeting at Brunel to make a first stab at estimates for data storage and transfer requirements for MICE.
• Based on some rough estimates (hopefully biased towards the worst case) of data size and rate, we aim to develop a plan to securely maintain all the data that is taken and to distribute it throughout the collaboration for processing and analysis.
• Most of this talk consists of slides that were presented by Paul at a MICE-UK meeting recently.
5
Data rate from Detectors
• Tracker can conceivably operate in several data-taking modes:
– Discriminators only: allows higher trigger rate, but results in lowest event size and data rate.
– Discriminators and ADC/TDC information: required for calibration and in the event of high RF background rate. Will probably result in a lower trigger rate being achieved, but due to much larger event size will produce the largest data rate.
• First estimate for the PID detectors indicates that the tracker produces the bulk of the data (when it writes out ADCs and TDCs).
6
Data Rate from the detector
Estimate:
Tracker data rate if full data readout is required: 14 MBytes/s (110 MBits/s) – all other data negligible
Data per hour 50 GigaBytes
Data per day 1 TeraByte
ISIS efficiency ~75% over an extended period
Running schedule: unknown – assume 4 months per year, as for the LHC.
Data per year 120 TeraBytes
Data for MICE <0.5 PetaBytes
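The chain of estimates above can be reproduced with a few lines of arithmetic. This is only a sketch using the slide's figures; the number of running years is an assumption (the slide does not state it), chosen to show how the total stays under 0.5 PetaBytes.

```python
# Rough data-volume estimate using the figures quoted on the slide.
TRACKER_RATE_MB_PER_S = 14     # full tracker readout, MBytes/s (slide)
ISIS_EFFICIENCY = 0.75         # ~75% over an extended period (slide)
RUNNING_DAYS_PER_YEAR = 120    # about 4 months of running (slide)
YEARS_OF_RUNNING = 4           # ASSUMPTION: not stated on the slide

rate_mbit_per_s = TRACKER_RATE_MB_PER_S * 8
gb_per_hour = TRACKER_RATE_MB_PER_S * 3600 / 1000
tb_per_day_peak = gb_per_hour * 24 / 1000
tb_per_day_effective = tb_per_day_peak * ISIS_EFFICIENCY
tb_per_year = tb_per_day_effective * RUNNING_DAYS_PER_YEAR
pb_total = tb_per_year * YEARS_OF_RUNNING / 1000

print(f"Link rate:     {rate_mbit_per_s} MBits/s")
print(f"Data per hour: {gb_per_hour:.1f} GigaBytes")
print(f"Data per day:  {tb_per_day_peak:.2f} TeraBytes at 100% uptime, "
      f"{tb_per_day_effective:.2f} at {ISIS_EFFICIENCY:.0%} efficiency")
print(f"Data per year: {tb_per_year:.0f} TeraBytes")
print(f"Data for MICE: {pb_total:.2f} PetaBytes over {YEARS_OF_RUNNING} years")
```

Rounding the daily figure up to 1 TeraByte gives the slide's 120 TeraBytes per year, a slightly more pessimistic number than the unrounded result.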
7
Data Rate from Monte Carlo
Unknown:
Assume Monte Carlo totals are the same as data.
Data per year 120 TeraBytes
Data for MICE <0.5 PetaBytes
Plan to use Monte Carlo pre-startup to test data flow rates
8
Hardware requirements for the hall
Local storage needed to buffer data before dispatch to Atlas and to allow running when link or centre is unavailable.
Data deleted when TWO copies exist.
Data per day 1 TeraByte
Weekend running – Friday to Monday ~ 3 TeraBytes
Disk server with 10 EIDE drives – each 500 GigaBytes.
2 × 5-disk RAID-5 arrays – striped across all 10 disks
8 TeraBytes
By the time it comes to purchase the disks, larger capacity disks will be available
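As a rough check of the buffer sizing, the sketch below computes usable RAID-5 capacity under the standard rule that each array gives up one disk's worth of space to parity. The function name is hypothetical. Read this way, ten 500 GigaByte drives give 4 TeraBytes usable, and 8 TeraBytes would correspond to 1 TeraByte drives; either covers the ~3 TeraByte weekend backlog.

```python
def raid5_usable_tb(n_arrays, disks_per_array, disk_tb):
    """Usable capacity in TeraBytes: one disk per RAID-5 array holds parity."""
    return n_arrays * (disks_per_array - 1) * disk_tb


WEEKEND_BACKLOG_TB = 3   # Friday to Monday (slide)
DATA_TB_PER_DAY = 1      # slide estimate

usable_500gb = raid5_usable_tb(n_arrays=2, disks_per_array=5, disk_tb=0.5)
usable_1tb = raid5_usable_tb(n_arrays=2, disks_per_array=5, disk_tb=1.0)

for label, usable in (("500 GB drives", usable_500gb), ("1 TB drives", usable_1tb)):
    days = usable / DATA_TB_PER_DAY
    print(f"{label}: {usable:.0f} TB usable, about {days:.0f} days of buffering")
```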
9
Hardware for link to Tier-0
Tracker data rate if full data readout is required: 14 MBytes/s (110 MBits/s)
Peak rate must allow for reverse traffic for remote operation of detector subsystems, and for transferring data delayed by unavailability of Atlas storage.
Gigabit link – gives approximately a factor 4 in safety (at 40% collisions become a problem).
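The factor-4 safety margin follows from treating only 40% of a 1 GigaBit link as usable. The sketch below also estimates how long a delayed weekend backlog would take to clear on the spare bandwidth; the helper name is hypothetical and reverse traffic is ignored, so treat the catch-up time as optimistic.

```python
LINK_MBIT_PER_S = 1000        # 1 GigaBit link (slide)
USABLE_FRACTION = 0.40        # collisions become a problem above ~40% (slide)
DATA_RATE_MBIT_PER_S = 110    # full tracker readout (slide)

usable_mbit_per_s = LINK_MBIT_PER_S * USABLE_FRACTION
safety_factor = usable_mbit_per_s / DATA_RATE_MBIT_PER_S
spare_mbit_per_s = usable_mbit_per_s - DATA_RATE_MBIT_PER_S


def catch_up_hours(backlog_tb, spare_rate_mbit_per_s):
    """Hours needed to ship a backlog while live data keeps flowing."""
    backlog_mbit = backlog_tb * 1e6 * 8   # 1 TB = 1e6 MBytes = 8e6 MBits
    return backlog_mbit / spare_rate_mbit_per_s / 3600


print(f"Safety factor: {safety_factor:.1f}")
print(f"3 TB backlog clears in about "
      f"{catch_up_hours(3, spare_mbit_per_s):.0f} hours")
```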
10
Hardware at Tier-0
Data per year 120 TeraBytes
Data for MICE <0.5 PetaBytes
400 TeraBytes tape for data – none for Monte Carlo.
120+30 TeraBytes of disk space (MC is divided between the four centres)
Ability to write tape data at 110 MBits/s (see data security)
×2 for Monte Carlo
11
Hardware links from Tier-0
Tracker data rate if full data readout is required: 14 MBytes/s (110 MBits/s)
Ability to copy the data to one remote site in real time
110 MBits per second off-site average rate
A higher rate is needed unless two Tier-1 sites can read 110 MBits per second and relay at the same rate.
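The dependence of the Tier-0 off-site rate on relaying can be sketched as follows. This assumes three Tier-1 sites and that each relaying site forwards the full stream to exactly one further site; the function name is hypothetical.

```python
def tier0_outbound_mbit(n_tier1, n_relaying, rate_mbit=110):
    """Off-site rate Tier-0 must sustain so every Tier-1 gets one copy.

    Each relaying Tier-1 passes the stream on to one other Tier-1, so
    Tier-0 only feeds the sites that no other site relays to.
    """
    if not 0 <= n_relaying <= n_tier1 - 1:
        raise ValueError("at most n_tier1 - 1 sites can relay")
    direct_streams = n_tier1 - n_relaying
    return direct_streams * rate_mbit


for relays in range(3):
    print(f"{relays} relaying Tier-1 sites: "
          f"{tier0_outbound_mbit(3, relays)} MBits/s from Tier-0")
```

With two relaying sites the data moves along a chain, and Tier-0 needs only the single 110 MBits/s stream quoted above.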
12
Hardware at Tier-1
Data per year 120 TeraBytes
Data for MICE <0.5 PetaBytes
120+30 TeraBytes of disk space (Fast storage)
400 TeraBytes near-line storage
Ability to receive data at 110 MBits/s and for two of them to relay at this rate.
13
Note on tier-1 data links
CERN can clearly handle the traffic.
Fermilab can handle the traffic.
Japan may be a problem; if so, transfer selectively processed data.
Not established if they will
14
Note on data security
MICE data will be considered safe when there are two distinct copies.
During the writing of a single file in the hall there is only 1 copy. Writing to Tier-0 creates a second copy.
Deletion of the hall copy is allowed when an extra external copy has been created. In normal running this will be a copy to a Tier-1 centre (the first copy will be rotated between Tier-1s). If this is not possible we will create immediate tape copies at the Tier-0 (alternative external sites are not considered to be helpful).
Bookkeeping decisions will influence data taking rules
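One possible reading of these rules, written as bookkeeping checks. The location labels and function names are hypothetical, and "extra external copy" is interpreted as a second copy outside the hall (a Tier-1 copy or a Tier-0 tape copy, in addition to the Tier-0 disk copy).

```python
HALL = "hall"
TIER0_DISK = "tier0-disk"
TIER0_TAPE = "tier0-tape"


def is_safe(copies):
    """Data is considered safe once two distinct copies exist."""
    return len(set(copies)) >= 2


def may_delete_hall_copy(copies):
    """The hall copy may go only when two copies exist outside the hall."""
    external = set(copies) - {HALL}
    return len(external) >= 2


# A file's life cycle: being written, copied to Tier-0, then copied onward.
stages = [
    {HALL},
    {HALL, TIER0_DISK},
    {HALL, TIER0_DISK, "tier1-cern"},   # normal running
    {HALL, TIER0_DISK, TIER0_TAPE},     # fallback when no Tier-1 is reachable
]
for copies in stages:
    print(sorted(copies), "safe:", is_safe(copies),
          "delete hall copy:", may_delete_hall_copy(copies))
```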
15
Data Flow in MICE
[Diagram]
MICE Hall: 4 TeraByte disk, 1 GigaBit Link
Atlas Tier-0: 1 GigaBit Link, 400 TB, 150 TB
Fermilab Tier-1: 110 MegaBit Link, 150 TB fast storage, 400 TB slow storage
CERN Tier-1: 110 MegaBit Link, 150 TB fast storage, 400 TB slow storage
KEK/Osaka Tier-1: 110 MegaBit Link, 150 TB fast storage, 400 TB slow storage
Between Tier-1 centres: 110 MegaBit relay (two relays shown)
Two copies of data required at all times
16
Data Processing
• Plan to use LCG for both Monte Carlo production and real data analysis.
• Have made a slow start: GRID hardware has been installed in the UK for MICE (Sheffield and Brunel) and a VO has been created.
• Post-Osaka, plan to ramp up GRID activities through G4MICE simulation studies.
• Larger productions and transfers of large volumes of Monte Carlo can be used to test data-taking systems.
17
Local Processing
• Some sort of farm will be required on the floor (or at least on the RAL site) in order to provide feedback for beam tuning and for online monitoring.
• An estimate from Tom Roberts is that, during beamline tuning, runs of approximately 10k events will be collected in a minute and will need to be reconstructed (to determine Twiss parameters).
• This reconstruction will need to be performed on a comparable time scale to allow quick feedback to the beamline settings.
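To get a feel for the size of the farm, the sketch below converts the 10k-events-per-minute figure into a number of CPU cores for a given per-event reconstruction time. The per-event times are purely illustrative assumptions (no measured value is available yet), and the function name is hypothetical.

```python
import math

EVENTS_PER_RUN = 10_000   # Tom Roberts' estimate (slide)
RUN_SECONDS = 60          # collected in about a minute (slide)


def cores_needed(seconds_per_event, events=EVENTS_PER_RUN, window_s=RUN_SECONDS):
    """Cores required to reconstruct a run in the time it took to collect."""
    return math.ceil(events * seconds_per_event / window_s)


required_rate = EVENTS_PER_RUN / RUN_SECONDS
print(f"Required throughput: {required_rate:.0f} events/s")

# ASSUMPTION: illustrative per-event reconstruction times, not measurements.
for t in (0.01, 0.05, 0.1):
    print(f"{t * 1000:.0f} ms per event -> {cores_needed(t)} cores")
```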
18
Next Steps
• Need to become more definite about event sizes, trigger rates and running periods to confirm estimates of total data storage and transfer rate requirements.
• Reconstruction will need to be tested (once at a sufficient level for online work) to estimate online processing needs.
• Paul has already started on a list of networking/storage requirements for the RAL site. Need to establish local contacts at FNAL, CERN and Japan for the Tier-1 centres.
• Alan has started the process of arranging computing resources at FNAL for MICE.