PROOF developments
G. Ganis
CAF meeting, ALICE offline week, 11 July 2008
11/7/2008 G. Ganis, CAF, Alice offline week

Overview
- Recent and current developments focus mostly on:
  - Solving instabilities and improving error recovery
  - Improving the user interface
  - Resource control in multi-user environments
- CAF is one of the main sources of feedback to:
  - Understand problems
  - Spot missing functionality

Today’s Subjects
- Stability issues
  - New XrdProofd plug-in
  - Related issues
- New Log box
- Monitoring of the memory consumption
- Dataset management
- Scheduling developments

New XrdProofd plug-in (1)
- Addresses stability issues typically observed after a failure and the attempt to reset the session
  - We traced these back to deadlocks caused by concurrent actions that were not well protected
  - The new plug-in implements a redesigned interaction between components, significantly reducing locks
- The changes for the user are minimal
  - But the level of asynchronism introduced may confuse people looking at the process tables, as the processes are cleaned up with some delay

New XrdProofd plug-in (2)
- New features
  - Resilience to xrootd failures/glitches
    - Applications attempt to restore the connections for 10 mins
    - Solves the problem of restarting xrootd to change the configuration
  - Directive to define workers in the xrootd config file
    - Example: on CAF DEV the workers are defined with

        xpd.worker master lxb6043
        xpd.worker worker lxb60[41-42,44]

    - Gets rid of proof.conf
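The bracket notation in the worker directive can be read as a compact list of host names. The sketch below shows one plausible expansion, assuming the bracket holds comma-separated numbers or inclusive ranges appended to the prefix. `ExpandHostPattern` is a hypothetical helper written for illustration, not part of xrootd or PROOF, and the actual parser may behave differently (for example with zero-padded numbers).

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Expand a bracketed host pattern such as "lxb60[41-42,44]" into host names.
// ASSUMPTION: the bracket holds comma-separated items, each either a single
// number or an inclusive "first-last" range. Leading zeros are not preserved.
std::vector<std::string> ExpandHostPattern(const std::string &pattern)
{
   std::vector<std::string> hosts;
   const std::size_t open = pattern.find('[');
   if (open == std::string::npos) {   // no bracket: a plain host name
      hosts.push_back(pattern);
      return hosts;
   }
   const std::size_t close = pattern.find(']', open);
   if (close == std::string::npos)
      throw std::invalid_argument("missing ']' in host pattern: " + pattern);

   const std::string prefix = pattern.substr(0, open);
   const std::string suffix = pattern.substr(close + 1);
   const std::string items = pattern.substr(open + 1, close - open - 1);

   std::size_t start = 0;
   while (start <= items.size()) {
      std::size_t comma = items.find(',', start);
      if (comma == std::string::npos) comma = items.size();
      const std::string item = items.substr(start, comma - start);

      // "41-42" is a range, "44" is a single value (std::stoi throws on bad input)
      const std::size_t dash = item.find('-');
      const int first = std::stoi(item.substr(0, dash));
      const int last = (dash == std::string::npos) ? first
                                                   : std::stoi(item.substr(dash + 1));
      for (int n = first; n <= last; ++n)
         hosts.push_back(prefix + std::to_string(n) + suffix);

      start = comma + 1;
   }
   return hosts;
}
```

Under these assumptions, the worker line of the CAF DEV example would stand for three hosts: lxb6041, lxb6042 and lxb6044.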

Related Improvements
- Automatic shutdown of orphaned sessions
  - Gets rid of proofserv processes hanging around
- Improved notification in case of a worker death

New Log Dialog box
- Uses TProof::Mgr(master)->GetSessionLogs()
- Should work even if the session hangs
A. Kreshuk

Memory usage monitoring
- Worker: RAM vs. events processed
- Master: RAM vs. objects merged
- Should make it easy to spot memory leaks
- Additional analysis with another tool: TMemStat?
A. Kreshuk

Memory consumption monitoring
- Normal level
  - Workers monitor their memory usage and save the info in the log file
  - The client gets warned of high usage
  - The session may eventually be killed
- Advanced level
  - Possibility to save very detailed information in a dedicated tree (TProofStats), e.g. via an interface to Marian Ivanov’s memstat tool
  - To be run as a second pass when a problem shows up
- First version in SVN in the coming days
A. Kreshuk
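The "normal level" behaviour (log, warn the client, eventually kill the session) can be pictured as a simple threshold check. The following is a minimal sketch only: the type names, the function `CheckMemory` and the 80% / 95% thresholds are assumptions made for illustration and are not taken from PROOF.

```cpp
// Possible reactions of a worker to its own memory usage.
enum class MemAction { kLogOnly, kWarnClient, kKillSession };

// HYPOTHETICAL thresholds, expressed as fractions of the allowed memory.
// These values are illustrative and are not PROOF defaults.
struct MemPolicy {
   double warnFraction = 0.80;
   double killFraction = 0.95;
};

// Decide what to do for a given usage. The usage is always logged; the
// return value says whether something stronger than logging is needed.
MemAction CheckMemory(long usedKB, long limitKB, const MemPolicy &policy = MemPolicy())
{
   if (limitKB <= 0) return MemAction::kLogOnly;   // no limit configured
   const double fraction = static_cast<double>(usedKB) / static_cast<double>(limitKB);
   if (fraction >= policy.killFraction) return MemAction::kKillSession;
   if (fraction >= policy.warnFraction) return MemAction::kWarnClient;
   return MemAction::kLogOnly;
}
```

Keeping the policy in a separate structure would let a site tune the thresholds without touching the check itself.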

Dataset management (1)
- Hot topic for T2/T3
- Dataset: metadata about a set of files
  - TFileCollection: list of TFileInfo
  - TFileInfo
    - UUID, TUrls of the file
    - TFileInfoMeta: one per TTree with name, entries, …
- Datasets are identified by name
- Info may come from different places: catalogs, SQL databases, file systems
JFGO

Dataset manager (2)
- TProofDataSetManager: abstract interface describing the basic functionality
  - RegisterDataSet, GetDataSet, VerifyDataSet, …
  - VerifyDataSet opens the files, i.e. may trigger staging
- TProofDataSetManagerFile: implementation handling information via ROOT files
  - datasetname.root
  - Stored on the master in a dedicated subdirectory <DatasetDir>/group/user/dataset
JFGO
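The split between an abstract manager and a file-based implementation can be sketched with plain C++ classes. Everything below is a simplified stand-in: `DataSet`, `DataSetManager`, `InMemoryDataSetManager` and `DataSetPath` are hypothetical names, the signatures are not those of the ROOT classes, and the assumed path layout `<dir>/<group>/<user>/<name>.root` is only a guess that combines the two pieces of information on the slide.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a dataset: a name plus the URLs of its files.
struct DataSet {
   std::string name;
   std::vector<std::string> fileUrls;
};

// Location of a dataset file under the dataset directory.
// ASSUMPTION: "<dir>/<group>/<user>/<name>.root"; the real layout may differ.
std::string DataSetPath(const std::string &dir, const std::string &group,
                        const std::string &user, const std::string &name)
{
   return dir + "/" + group + "/" + user + "/" + name + ".root";
}

// Abstract interface (hypothetical, reduced to two operations).
class DataSetManager {
public:
   virtual ~DataSetManager() {}
   // Returns false if a dataset with the same path is already registered.
   virtual bool RegisterDataSet(const std::string &group, const std::string &user,
                                const DataSet &ds) = 0;
   // Returns a null pointer if the dataset is unknown.
   virtual const DataSet *GetDataSet(const std::string &group, const std::string &user,
                                     const std::string &name) const = 0;
};

// Toy implementation keeping everything in memory, keyed by the path that a
// file-based backend would use.
class InMemoryDataSetManager : public DataSetManager {
public:
   explicit InMemoryDataSetManager(const std::string &dir) : fDir(dir) {}

   bool RegisterDataSet(const std::string &group, const std::string &user,
                        const DataSet &ds) override
   {
      return fStore.emplace(DataSetPath(fDir, group, user, ds.name), ds).second;
   }

   const DataSet *GetDataSet(const std::string &group, const std::string &user,
                             const std::string &name) const override
   {
      auto it = fStore.find(DataSetPath(fDir, group, user, name));
      return it == fStore.end() ? nullptr : &it->second;
   }

private:
   std::string fDir;                        // dataset directory on the master
   std::map<std::string, DataSet> fStore;   // path -> dataset
};
```

A second backend (for instance SQL-based) would implement the same interface, so client code that only sees `DataSetManager` would not need to change.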

Dataset manager (3)
- TProofDataSetManagerFile is what is used on CAF
  - Users can register, scan, get
  - Verify is disallowed (to avoid staging overload)
    - It is run by a dedicated daemon (JFGO)
- Datasets can be processed by name
  - Provides a way to cache the information needed at the validation step, speeding this up considerably
- TProofDataSetManager can also be used locally to organize your datasets or chains
  - No need for a dedicated macro to create the chain (CreateESDchain)
JFGO

Dataset manager (4)
- ATLAS is very interested
  - They are oriented towards a MySQL backend and validity tokens for the dataset
  - Will provide TProofDataSetManagerSQL
- Other issues raised by ATLAS
  - Possibility to use multiple dataset sources concurrently, e.g. file-based and SQL-based
  - Problem of the datasets in federated clusters (multi-masters), which is challenging on the PROOF side too
JFGO

Scheduling developments
- Control resources and how they are used
- Improving efficiency
  - Assigning to a job those nodes that hold the data to be analyzed
- Implementing different scheduling policies
  - e.g. fair share, group priorities & quotas
- Efficient use even in case of congestion
J. Iwaszkiewicz

Scheduling developments (2)
- Assigning a set of workers to a job based on:
  - The dataset location
  - User priority (quota + historical usage)
    - Can be taken from an external source
  - The current load of the cluster
- Create (priority) queues for queries that cannot be started
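A priority queue for queries that cannot start immediately could look roughly like the sketch below. The names `PendingQuery`, `LowerPriority` and `QueryQueue` are hypothetical, and the ordering rule (higher priority first, earlier submission first on ties) is an assumption rather than the policy actually implemented in PROOF.

```cpp
#include <queue>
#include <string>
#include <vector>

// A query that could not be started and is waiting for resources.
struct PendingQuery {
   std::string user;
   double priority;     // larger value = served first (assumed convention)
   unsigned long seq;   // submission order, used to break ties
};

// Comparator for std::priority_queue: returns true when 'a' should be
// served after 'b'.
struct LowerPriority {
   bool operator()(const PendingQuery &a, const PendingQuery &b) const
   {
      if (a.priority != b.priority) return a.priority < b.priority;
      return a.seq > b.seq;   // on equal priority, earlier submission wins
   }
};

class QueryQueue {
public:
   // Put a query on hold until resources become available.
   void Hold(const std::string &user, double priority)
   {
      fQueue.push(PendingQuery{user, priority, fNext++});
   }

   bool Empty() const { return fQueue.empty(); }

   // Remove and return the next query to start; check Empty() first.
   PendingQuery Next()
   {
      PendingQuery q = fQueue.top();
      fQueue.pop();
      return q;
   }

private:
   std::priority_queue<PendingQuery, std::vector<PendingQuery>, LowerPriority> fQueue;
   unsigned long fNext = 0;
};
```

The sequence number keeps the queue fair among users with the same priority, which a bare priority value would not guarantee.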

Scheduling developments (3)
- An implementation exists with:
  - # of workers ≈ relativePriority * nFreeCPUs
  - Assign least loaded workers first
- Missing pieces
  - Dynamic worker setup (advanced prototype exists)
  - Worker nodes auto-registration
  - Improved load monitoring
  - Support for “put-on-hold” submission (prototype)
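The two rules of the existing implementation (number of workers ≈ relativePriority * nFreeCPUs, least loaded workers first) can be sketched as follows. `WorkerNode` and `AssignWorkers` are hypothetical names; rounding to the nearest integer and granting at least one worker are assumptions, since the slide only gives the approximate formula.

```cpp
#include <algorithm>
#include <cmath>
#include <string>
#include <vector>

struct WorkerNode {
   std::string host;
   double load;   // current load; smaller means less busy
};

// Pick the workers for one job.
// ASSUMPTIONS: the count is rounded to the nearest integer, clamped to the
// number of known nodes, and never smaller than one.
std::vector<std::string> AssignWorkers(std::vector<WorkerNode> nodes,
                                       double relativePriority, int nFreeCPUs)
{
   std::vector<std::string> assigned;
   if (nodes.empty()) return assigned;

   // # of workers ~ relativePriority * nFreeCPUs
   long n = std::lround(relativePriority * nFreeCPUs);
   const long maxN = static_cast<long>(nodes.size());
   n = std::max(1L, std::min(n, maxN));

   // Least loaded workers first; stable sort keeps the input order on ties
   std::stable_sort(nodes.begin(), nodes.end(),
                    [](const WorkerNode &a, const WorkerNode &b) { return a.load < b.load; });

   for (long i = 0; i < n; ++i)
      assigned.push_back(nodes[i].host);
   return assigned;
}
```

With four free CPUs and a relative priority of 0.5, this sketch would hand out the two least loaded nodes.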

Scheduling schema
[Diagram] Components: Client; PROOF master (with a "Start workers" step); Dataset Lookup; Scheduler (load, history, policy, …)
Numbered messages in the diagram:
  1: Job{dataset, …}
  2: dataset
  3: file locations
  4: Job info
  5: workers
  6: workers

Other developments
- PROOFLITE
  - Version of PROOF optimized for multicore machines, with workers started directly by the ROOT session (no daemon)
  - Useful to quickly test code in a real PROOF environment
  - Will be used to study I/O issues on multicore machines
  - Almost ready to go into the trunk
- PROOF / Condor integration
  - Possible ATLAS model for T3 farms not dedicated to PROOF
  - Condor provides a mechanism to give high priority to PROOF queries when required, by suspending/hibernating batch jobs

Questions?

Credits
- G.G., J. Iwaszkiewicz, A. Kreshuk, F. Rademakers
- M. Meoni, J.F. Grosse-Oetringhaus (ALICE)
- F. Furano, A. Peters (CERN/IT)
- A. Hanushevsky (SLAC)