Page 1: First experience with ORCA Analysis on Grid

CMS Padova

Padova, Thursday 22 April 2004

First experience with ORCA Analysis on Grid
a user point of view

Stefano Lacaprara
[email protected]
INFN and Padova University

Stefano Lacaprara – Padova, Padova, Thursday 22 April 2004 – First experience with ORCA Analysis on Grid – p.1/15

Page 2: Job

- ORCA 800
- Access to Digis formerly produced at LNL (tt2mu)
- Access to DST (tt2mu) transferred from CERN via CNAF
- Simple jobs: printout plus histograms
- Private library and executable
- Submission from PD UI
- No data discovery, jobs forced to go to LNL

Page 3: Job Preparation

- Code development on local machine (my own)
- Test of code running on locally produced data (SingleMuon, available in PD via RFIO)
- Copy of library, executable and .orcarc to the UI (gridit003)
- Job preparation script reusing private code (perl + bash) written long ago
- Changes to produce the JDL: trivial (with Federica's help!)
- Got GRID certificate (not so easy, even if fairly well documented)
- Get proxy, and submit to RB: CNAF, or CERN when CNAF is down: some magic (Federica!)
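The slide says that producing the JDL was a trivial change to the job preparation script. A minimal sketch of what such a generator might look like follows, in Python rather than the original perl + bash. The file names and the computing-element identifier are hypothetical, and using `other.GlueCEUniqueID` in `Requirements` to pin the job to one site is an assumption about how the jobs were forced to LNL, not something stated in the talk.

```python
def make_jdl(executable, input_files, output_files, ce_id=None):
    """Return the text of a simple JDL job description.

    ce_id, when given, pins the job to one computing element
    (hypothetical identifier; the attribute used is an assumption).
    """
    def jdl_list(items):
        # JDL lists are written as {"a", "b", ...}
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    lines = [
        'Executable = "%s";' % executable,
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        "InputSandbox = %s;" % jdl_list([executable] + list(input_files)),
        "OutputSandbox = %s;" % jdl_list(["job.out", "job.err"] + list(output_files)),
    ]
    if ce_id is not None:
        lines.append('Requirements = other.GlueCEUniqueID == "%s";' % ce_id)
    return "\n".join(lines) + "\n"


# Hypothetical file names and CE, for illustration only.
jdl = make_jdl(
    "run_orca.sh",
    ["orca_libs.tar.gz", ".orcarc"],
    ["histos.root"],
    ce_id="ce.example.org:2119/jobmanager-lsf-cms",
)
```

Writing `jdl` to a file gives something that could be handed to the submission command; without `ce_id` the choice of site is left to the resource broker.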

Page 4: Job Preparation (ii)

What the job does:
- Source script to set up environment (Marco)
- Create ORCA 800 area (scram project) on WN, using local ORCA installation (M)
- Copy (via input sandbox) tarball with lib(s) and exe
- Move libs and executable to proper places (some ORCA/scram expertise needed)
- Get .orcarc fully set up via sandbox
- Execute job
- Put root file in output sandbox
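The sequence of steps listed above can be sketched as a small generator for the wrapper script that runs on the worker node. This is only an illustration: the environment script name, the release string `ORCA_8_0_0`, the tarball layout (libraries and executable at its top level) and the destination directories are all assumptions, and the real wrapper was written in perl + bash.

```python
def wrapper_script(env_script, orca_version, tarball, lib_dir, bin_dir, exe):
    """Build the text of a bash wrapper that mirrors the steps on the slide.

    All arguments are placeholders chosen by the caller; lib_dir and
    bin_dir are relative to the scram project area.
    """
    steps = [
        "#!/bin/bash",
        "set -e",                                  # stop at the first failing step
        "source %s" % env_script,                  # set up the CMS environment
        "scram project ORCA %s" % orca_version,    # create the ORCA area on the WN
        "cd %s" % orca_version,
        "eval `scram runtime -sh`",                # pick up the runtime environment
        "tar xzf ../%s" % tarball,                 # tarball arrived via input sandbox
        "mv lib*.so %s/" % lib_dir,                # private libraries
        "mv %s %s/" % (exe, bin_dir),              # private executable
        "cp ../.orcarc .",                         # .orcarc arrived via input sandbox
        "%s/%s" % (bin_dir, exe),                  # run the analysis
        "cp *.root ..",                            # leave output where the sandbox expects it
    ]
    return "\n".join(steps) + "\n"


# Hypothetical values, for illustration only.
script = wrapper_script("cms_env.sh", "ORCA_8_0_0", "orca_libs.tar.gz",
                        "lib/arch", "bin/arch", "myAnalysis")
```

Keeping the steps in a list makes it easy to see that the order matters: the project area must exist before the private libraries are unpacked into it.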

Page 5: Job submission

- Single job directly via edg-job-submit
- Get job id from terminal (mouse cut and paste!)
- Get job status via edg-job-status using the "mouse-recorded" id
- Get job output sandbox when status is done, always via mouse
- For multiple submission (up to 100 jobs in parallel) used a perl script (written long ago for LSF, adapted)
- Save ids in a file
- Wrote a (rather complex) perl script to retrieve multiple job status, and the sandbox if all ok
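The mouse cut-and-paste step is the part most easily replaced by a script. The sketch below pulls job identifiers out of captured submission output and appends them to a bookkeeping file. It assumes that a job id is an `https://...` URL printed on a line of its own, possibly after a leading dash; the sample text is made up for the example and is not real `edg-job-submit` output.

```python
import re

# A job id is assumed to be an https URL alone on its line,
# optionally preceded by "- ".
JOB_ID_RE = re.compile(r"^[ \t]*(?:-[ \t]*)?(https://\S+)[ \t]*$", re.MULTILINE)


def extract_job_ids(text):
    """Return all job ids found in the captured command output."""
    return JOB_ID_RE.findall(text)


def save_job_ids(job_ids, path):
    """Append job ids to a file, one per line, for later status queries."""
    with open(path, "a") as handle:
        for job_id in job_ids:
            handle.write(job_id + "\n")


# Made-up stand-in for the text printed at submission time.
sample = """
Some banner text
 - https://rb.example.org:9000/AbC123xyz
Some trailing text
"""
ids = extract_job_ids(sample)
```

With the ids saved in a file, a loop over that file can drive the status and output-retrieval commands without anyone touching the mouse.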

Page 6: Data

The Real Mess!!!

- Digis available at LNL for a long time (PCP)
- Missing: MetaData with Digi (and SimHits) attached
- Missing: PoolCatalog with PFN of all file locations
- Stole (I mean really stolen!) full MetaData from CERN
- Produced Catalog from the stolen one, updated for LNL EVD and MetaData: partially via Pool commands (too slow and complex), largely via editor (and large use of RegEx)
- Put Catalog(s) in a defined place
- Set InputFileCatalogURL by hand to the proper catalog(s)
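The editor-and-RegEx step of adapting a catalog copied from another site might look like the sketch below, which rewrites the physical file name prefix of every entry. It assumes an XML-flavoured POOL file catalog in which each file has a `<pfn ... name="..."/>` element; the sample catalog, the GUID and both path prefixes are invented for illustration.

```python
import re


def relocate_pfns(catalog_xml, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in the name of every <pfn> element.

    Logical file names (<lfn>) are left untouched.
    """
    pattern = re.compile(r'(<pfn\b[^>]*\bname=")' + re.escape(old_prefix))
    # A function is used as replacement so new_prefix is inserted literally.
    return pattern.sub(lambda match: match.group(1) + new_prefix, catalog_xml)


# Invented catalog fragment; real catalogs hold one <File> per data file.
sample = """<POOLFILECATALOG>
  <File ID="0000-EXAMPLE-GUID">
    <physical>
      <pfn filetype="ROOT_All" name="rfio:/castor/example/cern/EVD0_Events.root"/>
    </physical>
    <logical>
      <lfn name="EVD0_Events.root"/>
    </logical>
  </File>
</POOLFILECATALOG>"""

local = relocate_pfns(sample, "rfio:/castor/example/cern/", "rfio:/data/example/lnl/")
```

Restricting the substitution to `<pfn>` elements is the main design point: a blind search-and-replace over the whole file could also alter logical names or GUIDs that happen to contain the same string.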

Page 7: Data (ii)

THE REAL MESS!!!

- DST available at LNL: pushed from CNAF
- Missing: MetaData with anything attached
- Missing: PoolCatalog with PFN of all file locations
- Full MetaData not available anywhere
- Deep Winter Mode Access: no run attached!
- Run FixColls (COBRA tool) directly on collection EVD, run per run (Marco)
- Get oid and put it (them) in .orcarc
- Done for a couple of runs ([…] events), resulting in a multi-line, very complex and error-prone entry in .orcarc
- Catalog built by hand (M) and set by hand in .orcarc

Page 8: Results

positive
- The machinery, however complex, can be forced to work
- Job submitted via grid to LNL
- Job execution (after some job debugging iterations)
- Job submission and execution overhead not dramatic (but no data discovery)
- Can get back the results

neutral
- No real Grid job!
- Job forced to run at LNL
- Data prepared by hand(s) (DC04 problem, not grid)

Page 9: Results (negative)

- Develop on a machine, move all tested code to a UI and then submit job from it
- A generic user machine must be allowed to submit to grid, i.e. to be a UI (in principle possible, via a set of rpm's + script, not tested)
- Interface to Grid services not friendly
- Output of edg-whatever designed to be human readable, not script readable (e.g. multi line...)
- What if I submit a job and lose the id? Grid-leak?
- Sometimes job submission failed, need an expert to see why (error message meaningless)
- Problem with RB unavailability: need an expert to switch to another one (must be automatic!!!)
- Must source script by hand to get CMS environment (VO==CMS)
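The request for automatic switching between resource brokers amounts to a simple failover loop. The sketch below shows the idea with the actual submission left as a caller-supplied function, since how a particular broker is selected on the command line is not described in the talk; the broker names are hypothetical.

```python
def submit_with_failover(brokers, submit):
    """Try each broker in turn; return (broker, job_id) from the first success.

    `submit` is any callable taking a broker name and returning a job id,
    raising an exception when that broker is unavailable.
    """
    errors = []
    for broker in brokers:
        try:
            return broker, submit(broker)
        except Exception as exc:  # a real script would catch something narrower
            errors.append("%s: %s" % (broker, exc))
    raise RuntimeError("all brokers failed: " + "; ".join(errors))


# Stand-in for the real submission, with one broker pretending to be down.
def fake_submit(broker):
    if broker == "rb.cnaf.example":
        raise IOError("broker down")
    return "job-1"


used, job_id = submit_with_failover(["rb.cnaf.example", "rb.cern.example"], fake_submit)
```

Collecting the per-broker error messages before giving up keeps the final failure informative, which addresses the complaint about meaningless errors at least on the user's side.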

Page 10: Results (negative) (ii)

- No comment on DATA availability and information flows: not much to do with grid, a lot with CMS computing model (?)
- An amazing lot of people, expertise, magic, stealing etc. to have something usable, and only to real expert
- Absolutely not for end-user/analyst
- Need work to deal with job ids, job status querying and sandbox recovery
- Developed ad-hoc script to handle multiple jobs
- Job return status mostly meaningless: crashed jobs ok, good jobs reported as bad

Page 11: Results (negative) (iii)

- Getting the output sandbox is a nightmare!!!!
- Must ask one by one when the job is declared to be over
- Only partial control over where to get back the results (default is tmp, can easily crash the UI, not scalable at all!!)
- I want the job to push back the output when finished
- I guarantee the availability of the UI
- I'm ready to lose all output if the UI is off-line, much better than having to retrieve all outputs one by one, move them to a decent place and eventually change the names (all by hand)
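Until jobs can push their output back, the move-and-rename chore described above can at least be scripted. The sketch below takes the directories where sandboxes were retrieved, moves one named output file from each into a single destination, and numbers the files by job. The file name `histos.root` and the naming scheme are assumptions made for the example.

```python
import os
import shutil


def collect_outputs(sandbox_dirs, dest_dir, filename="histos.root"):
    """Move `filename` from each sandbox directory into dest_dir.

    Files are renamed <stem>_<job number><ext>, numbering jobs from 1 in
    the order given. Sandboxes without the file are skipped, so a gap in
    the numbering marks a job that produced no output.
    """
    os.makedirs(dest_dir, exist_ok=True)
    stem, ext = os.path.splitext(filename)
    collected = []
    for index, sandbox in enumerate(sandbox_dirs, start=1):
        source = os.path.join(sandbox, filename)
        if not os.path.exists(source):
            continue
        target = os.path.join(dest_dir, "%s_%03d%s" % (stem, index, ext))
        shutil.move(source, target)
        collected.append(target)
    return collected
```

Called with the list of retrieved sandbox directories, it returns the new paths, which can be fed directly to whatever merges the histograms.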

Page 12: Future

- Most depends on DC04 data availability in a decent way
- Deep Winter Mode is not for users
- Can think of attaching runs at Tn if T0 will not do it
- Want to have a local catalog available and up to date with local PFN
- Data discovery cannot be done on a file basis
- No matter what the performance of RLS will be, my "typical" job will require […] files: not thinkable to search for all of them each time!!!!
- Current RLS implementation is similar to a filesystem w/o directories
- All files (can be […]) on /
- Idea of directories to sort files out since early […]

Page 13: Future (ii)

- Get DST (a full dataset) in a Tn
- Get all Full MetaData as well
- Produce (by Tn) a catalog with all PFN of MetaData and EVD: only once (e.g. from RLS)
- Publish the local catalog (Tn dependent) on RLS
- Generic user asks for DataSet/Owner
- Query the RLS for catalogs containing that D/O (may be in RLS MetaData): just one file (or a few)!!!
- Put the result of the query in .orcarc
- Use the result of the query to decide where to run
- Run the executable
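The proposed discovery flow, where the user asks for a Dataset/Owner and gets back a catalog and a site instead of thousands of files, can be sketched with an in-memory list standing in for what the sites would publish on RLS. The record layout, site entries, owner names and catalog URLs below are all hypothetical.

```python
def find_catalogs(published, dataset, owner):
    """Return the published catalog records matching a Dataset/Owner pair."""
    return [entry for entry in published
            if entry["dataset"] == dataset and entry["owner"] == owner]


def plan_job(published, dataset, owner):
    """Pick where to run and which catalog to put in .orcarc."""
    matches = find_catalogs(published, dataset, owner)
    if not matches:
        raise LookupError("no catalog published for %s/%s" % (dataset, owner))
    chosen = matches[0]  # simplest policy: first site that holds the data
    return {"site": chosen["site"], "catalog": chosen["catalog"]}


# Hypothetical records, one per catalog published by a Tn.
published = [
    {"site": "LNL", "dataset": "tt2mu", "owner": "ownerA",
     "catalog": "file:/catalogs/lnl_tt2mu.xml"},
    {"site": "CNAF", "dataset": "other", "owner": "ownerB",
     "catalog": "file:/catalogs/cnaf_other.xml"},
]
plan = plan_job(published, "tt2mu", "ownerA")
```

The lookup touches one record per site rather than one per data file, which is the scaling argument the slide makes.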

Page 14: Future (iii)

- What if (part of) a Dataset is in different locations?
- Can have RLS MetaData stating which events are available from a catalog, and also which type (AOD, DST, Digis, MC)
- In case of full dataset access, split jobs according to RLS metadata of catalogs for the user-required dataset/owner
- LNL catalog has events […], PIC […], CNAF […]
- Implement a sort of directory structure in RLS
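Splitting a full-dataset job according to which events each site's catalog holds can be sketched as follows. The event ranges are invented for the example (the numbers on the slide did not survive extraction), and describing each catalog by a single contiguous range of events is a simplifying assumption.

```python
def split_jobs(site_ranges, events_per_job):
    """Split each site's event range into jobs of at most events_per_job events.

    site_ranges maps a site name to an inclusive (first_event, last_event) pair.
    Returns a list of {"site", "first", "last"} dictionaries.
    """
    jobs = []
    for site, (first, last) in sorted(site_ranges.items()):
        start = first
        while start <= last:
            end = min(start + events_per_job - 1, last)
            jobs.append({"site": site, "first": start, "last": end})
            start = end + 1
    return jobs


# Invented ranges: LNL holds events 1-250, PIC holds events 251-400.
jobs = split_jobs({"LNL": (1, 250), "PIC": (251, 400)}, events_per_job=100)
```

Each resulting job carries its own site, so the data location decides where the job runs instead of the user forcing it by hand.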

Page 15: Future (iv)

- Short time scale (before Aachen Muon week?) test should be possible
- Basic tools already tested and more or less usable
- In case, can force running on a given Tn

Pros
- Allow user access to data via grid
- Use grid data discovery: should have reasonable performance (just a few files to be found), should even scale
- Can even cope with job splitting

DATA MUST BE REALLY AVAILABLE!

