Date post: 28-Jul-2020
Nechaevskiy A.V. (SINP MSU), Korenkov V.V. (LIT JINR). Experience of using Data Grid simulation packages. Dubna, 2008
Page 1: Experience of Data Grid simulation packages using.grid2008.jinr.ru/pdf/nechaevsky.pdf · •Results of the LCG DataGrid simulation with the OptorSim. Tier-2s and Tier-1s are inter-connected

Nechaevskiy A.V. (SINP MSU), Korenkov V.V. (LIT JINR)

Experience of using Data Grid simulation packages

Dubna, 2008

Page 2

Content
• Operation of the LCG DataGrid
• Errors of the FTS services of the Grid
• Primary goals of Grid simulation systems
• The OptorSim and GridSim simulators
• Results of the LCG DataGrid simulation with OptorSim

Grid – solution for LHC experiments

• Tier-2s and Tier-1s are interconnected by the general-purpose research networks
• Any Tier-2 may access data at any Tier-1

[Figure: network diagram of Tier-1 centres (IN2P3, TRIUMF, ASCC, FNAL, BNL, Nordic, CNAF, SARA, PIC, RAL, GridKa) and Tier-2 sites]

Page 3

LHC experiments support

The main faults identified during the monitoring period were timeouts, program errors, application-specific errors, and user errors.

• SOURCE during PREPARATION phase: [REQUEST_TIMEOUT] failed to prepare source file in 180 seconds
• TRANSFER during TRANSFER phase: [TRANSFER_TIMEOUT] gridftp_copy_wait: Connection timed out
• The server sent an error response: 425 425 Can't open data connection. timed out() failed
• DESTINATION during PREPARATION phase: [CONNECTION] failed to contact on remote SRM [srm]. Givin' up after 3 tries

Detailed error descriptions: https://twiki.cern.ch/twiki/bin/view/LCG/TransferOperationsPopularErrors

Error descriptions used in FTS monitoring:
• Scope – where the error originated (SOURCE – source site, DESTINATION – destination site, TRANSFER – during transfer).
• Category – the error class (FILE-EXIST, NO-SPACE-LEFT, TRANSFER-TIMEOUT, etc.).
• Phase – the stage of the transfer life cycle at which the error occurred (ALLOCATION, TRANSFER-PREPARATION, TRANSFER, etc.).
• Message – the detailed description of the error. There is a list of more than 400 message patterns, which changes over time.
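
The scope, phase, category and message fields can be pulled out of an error line with a regular expression. The sketch below is illustrative only: the pattern is inferred from the sample messages on this slide, not from an FTS specification, and the names `FTS_ERROR` and `parse_fts_error` are hypothetical.

```python
import re

# Layout assumed from the sample messages:
#   "<SCOPE> during <PHASE> phase: [<CATEGORY>] <message>"
# This is an inference from the examples, not an official FTS format.
FTS_ERROR = re.compile(
    r"^(?P<scope>SOURCE|DESTINATION|TRANSFER) during "
    r"(?P<phase>[A-Z_-]+) phase: "
    r"\[(?P<category>[A-Z_-]+)\] "
    r"(?P<message>.+)$"
)


def parse_fts_error(line):
    """Split an error line into scope, phase, category and message.

    Returns None when the line does not follow the assumed layout.
    """
    match = FTS_ERROR.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("SOURCE during PREPARATION phase: [REQUEST_TIMEOUT] "
              "failed to prepare source file in 180 seconds")
    print(parse_fts_error(sample))
```

Lines that carry no scope prefix, such as the "425 Can't open data connection" response, return `None` and would need their own patterns. This is one reason a pattern list of several hundred entries is plausible.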

Page 4

Primary goals addressed by DataGrid simulation tools

• Simulation makes it possible to run various experiments on the system under investigation;
• Simulation makes it possible to predict and prevent a number of unexpected situations;
• Simulation makes it possible to determine the minimum equipment configuration for data transfer and data storage that meets the project's requirements;
• Simulation also makes it possible to check how the system works, to find its "bottlenecks", and more.

Grid simulators: SimGrid, OptorSim, GridSim

Page 5

Requirements for a grid simulator

A simulator must meet the following requirements:

• it simulates the operation of the DataGrid's basic elements (storage elements (SE), resource brokers (RB), replica catalogs (RC), network, users, sites);
• simulation time is much less than the real operating time of the DataGrid;
• it collects different kinds of statistics (for example, volume of data transfers, throughput, etc.);
• it simulates equipment failures, and its results are comparable to the real situation.

Page 6

OptorSim

• OptorSim makes it possible to evaluate various optimisation algorithms and replication strategies
• Implemented in Java
• Configuration files are used to set the simulation parameters
• The source code is available
• edg-wp2.web.cern.ch/edg-wp2/optimization/optorsim.html

Page 7

Implementation of the Replica Catalog in the LCG and in the OptorSim

LCG:
The LFC file catalogue stores information about all files and their replicas in the LCG. It is one of the critical services.

Logical File Name (LFN)
An alias created by a user to refer to some item of data, e.g. “lfn:cms/20030203/run2/track1”

Globally Unique Identifier (GUID)
A non-human-readable unique identifier for an item of data, e.g. “guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6”

Site URL (SURL) / Physical FN (PFN) / Site FN (SFN)
The location of an actual piece of data on a storage system, e.g. “srm://srm.cern.ch/castor/cern.ch/grid/cms/output10_1”

OptorSim:
• File information in OptorSim is stored in the Replica Catalogue (the same as in the LCG)
• The Replica Catalogue is a list of mappings from LFNs to their physical file names (LFN and PFN in the LCG)
• The Replica Manager manages data replication and registers files in the Replica Catalogue (in the LCG, file cataloguing is implemented in the LFC)
• The "best" placement of a replica is determined before the transfer. This lets sites copy files from different sources and so avoid overloading the resources.
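
The LFN-to-PFN mapping and the choice of a "best" replica can be illustrated with a small sketch. It is not OptorSim's or the LFC's actual API: the class `ReplicaCatalogue`, the function `best_replica`, the cost values, and the host `se.example.org` are all hypothetical, chosen only to show the idea.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue mapping each logical file name (LFN) to its physical copies (PFNs)."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, lfn, pfn):
        """Record that a physical copy of lfn exists at pfn."""
        self._replicas[lfn].add(pfn)

    def lookup(self, lfn):
        """Return all known PFNs for lfn in sorted order (empty list if unknown)."""
        return sorted(self._replicas.get(lfn, ()))


def best_replica(catalogue, lfn, cost):
    """Return the PFN with the lowest cost, or None if the LFN has no replicas."""
    pfns = catalogue.lookup(lfn)
    return min(pfns, key=cost) if pfns else None


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    lfn = "lfn:cms/20030203/run2/track1"
    catalogue.register(lfn, "srm://srm.cern.ch/castor/cern.ch/grid/cms/output10_1")
    # Hypothetical second copy on a made-up storage element.
    catalogue.register(lfn, "srm://se.example.org/grid/cms/output10_1")

    # Hypothetical access costs; a real optimiser would derive them
    # from network state or file access history.
    access_cost = {"srm.cern.ch": 5.0, "se.example.org": 2.0}
    print(best_replica(catalogue, lfn, lambda pfn: access_cost[pfn.split("/")[2]]))
```

Keeping the cost function separate from the catalogue means different replication strategies can be compared without changing how files are registered.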

Page 8

OptorSim's graphical interface

Statistics are available as tables, graphs, and diagrams

Page 9

GridSim

• GridSim makes it possible to simulate various classes of heterogeneous resources, users, applications and brokers
• Implemented in Java
• Configuration files are used to set the simulation parameters
• The source code is available
• Many examples of GridSim usage are available
• http://www.gridbus.org/gridsim/

Page 10

The simulation details

• The CERN-RDIG segment is a part of the global LCG structure
• The GEANT2 network carries the heavy data traffic between CERN, the RDIG sites and other participants
• The routers also carry external traffic, which is represented as background traffic in the simulation
• Four RDIG sites were considered: JINR, SINP (Moscow State University), IHEP and ITEP

Page 11

Simulation results

• Transferring 500-700 GB of data at a throughput of 6-12 Mb/s takes 12-14 hours, which is close to reality
• Data transfer volumes can vary from several gigabytes to hundreds of gigabytes per hour, but channel throughputs in OptorSim are fixed
• OptorSim provides no way to simulate equipment failures or other errors

[Figure: throughput of the CERN-JINR channel and the amount of data transferred on 02.02.2008]
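
The relation between volume, throughput and transfer time can be checked with simple arithmetic. The sketch below assumes that "Mb/s" on the slide means megabytes per second, that units are decimal (1 GB = 1000 MB), and that the rate is constant. None of these is stated in the source, and the function name `transfer_hours` is hypothetical.

```python
def transfer_hours(volume_gb, throughput_mb_per_s):
    """Hours needed to move volume_gb gigabytes at a constant rate.

    Assumes decimal units (1 GB = 1000 MB) and megabytes per second.
    """
    seconds = volume_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for volume_gb in (500, 700):
        for rate in (6, 12):
            hours = transfer_hours(volume_gb, rate)
            print(f"{volume_gb} GB at {rate} MB/s: {hours:.1f} h")
```

Under these assumptions 500 GB at 12 MB/s takes about 11.6 hours, so the quoted 12-14 hours corresponds to the upper end of the throughput range. If the rate were in megabits per second instead, the times would be eight times longer.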

Page 12

Conclusion

• The main errors of the LCG, including FTS errors, were considered

• The simulation toolkits provide no way to simulate the various kinds of errors in the Grid

• Simulation of the various kinds of errors in Grid networks is necessary
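
As a rough illustration of what error simulation could look like, the sketch below injects random failures into a sequence of transfers and retries each one up to three times, echoing the "3 tries" in the FTS messages. It is not a feature of OptorSim or GridSim. The failure probabilities are placeholders rather than measured rates, and the names `FAILURE_PROBABILITY`, `attempt_transfer` and `simulate` are hypothetical.

```python
import random

# Hypothetical per-attempt failure probabilities, keyed by FTS-style category.
# The values are placeholders for illustration, not measured rates.
FAILURE_PROBABILITY = {
    "REQUEST_TIMEOUT": 0.05,
    "TRANSFER_TIMEOUT": 0.03,
    "CONNECTION": 0.02,
}


def attempt_transfer(rng):
    """Return the category of the injected failure, or None on success."""
    for category, probability in FAILURE_PROBABILITY.items():
        if rng.random() < probability:
            return category
    return None


def simulate(n_transfers, max_tries=3, seed=0):
    """Run n_transfers, retrying each up to max_tries times, and count outcomes."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    counts = {"ok": 0, "failed": 0}
    errors = {category: 0 for category in FAILURE_PROBABILITY}
    for _ in range(n_transfers):
        for _ in range(max_tries):
            failure = attempt_transfer(rng)
            if failure is None:
                counts["ok"] += 1
                break
            errors[failure] += 1
        else:
            # Every try failed: give up on this transfer.
            counts["failed"] += 1
    return counts, errors


if __name__ == "__main__":
    counts, errors = simulate(10_000)
    print(counts)
    print(errors)
```

Replacing the placeholder probabilities with rates taken from real FTS monitoring would be the first step towards results that are comparable to the real situation.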

Page 13

Questions?

