Recent PROOF developments

Page 1: Recent PROOF developments

Recent PROOF developments

G. Ganis

PROOF workshop, 29 November 2007

Page 2: Recent PROOF developments

29/11/2007 G. Ganis, PROOF workshop 2007

The PROOF approach in a nutshell

[Diagram labels: catalog, storage, PROOF farm, scheduler, query, MASTER, PROOF job (data file list, myAna.C), files, feedbacks (merged), final outputs (merged)]

• farm perceived as an extension of the local PC
• same syntax as in a local session
• more dynamic use of resources
• real-time feedback
• automated splitting and merging

Page 3: Recent PROOF developments

Issues addressed by the developments

• User interface
  • Processing of generic jobs
  • Data set, software handling
• Performance and responsiveness
  • Load balancing within a query
  • Access to data
• Monitoring tools
  • Processing information at the end of a query
  • Memory usage
• Resource usage control
  • Enforce priorities
  • Improve responsiveness in multi-user environment
• Testing, tutorials, installation

Page 4: Recent PROOF developments

Dataset manager

• Metadata about a set of files stored in the sandbox on the master, in a dedicated subdirectory <DatasetDir>/group/user/dataset or <SandBox>/dataset
• Data-sets are identified by name
• Data-sets can be processed by name
  • No need to create the chain locally (i.e. on the client)

root[0] TProof *proof = TProof::Open("master");
root[1] TFileCollection fc("dum", "", "file.list");
root[2] proof->RegisterDataSet("MyDataSet", &fc);
root[3] proof->ShowDataSets();
Existing Datasets:
MyDataSet

root[] proof->Process("MyDataSet", "MySelector.C+");

J. Iwaszkiewicz + G. Bruckner + J.F. Grosse-Oetringhaus (more in Jan-Fiete's talk)

Page 5: Recent PROOF developments

Generic, non-data-driven analysis

• Implement the algorithm in a TSelector
• New TProof::Process(const char *selector, Long64_t times)

[Diagram: Selector phases along the time axis, sharing the output list]
Begin()
  • Create histos, …
  • Define output list
Process(): analysis, 1…N
Terminate()
  • Final analysis (fitting, …)

// Open the PROOF session
root[0] TProof *p = TProof::Open("master")
// Run 1000 times the analysis defined in the
// MonteCarlo.C TSelector
root[1] p->Process("MonteCarlo.C+", 1000)

L. Tran-Thanh

Page 6: Recent PROOF developments

Generic, non-data-driven analysis

• New packetizer TPacketizerUnit
  • Time-based packet sizes
  • Processing speed of each worker measured dynamically
• Included in ROOT 5.17/04
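
The idea of time-based packet sizes can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not the TPacketizerUnit code: the names WorkerStats, measuredRate and nextPacketSize, the target packet duration and the initial rate guess are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical per-worker bookkeeping; not the TPacketizerUnit data members.
struct WorkerStats {
    std::int64_t cyclesDone;   // cycles processed so far by this worker
    double secondsSpent;       // wall time spent processing them
};

// Measured processing speed in cycles per second; falls back to a guess
// when the worker has not reported anything yet.
double measuredRate(const WorkerStats &w, double initialGuess)
{
    if (w.cyclesDone <= 0 || w.secondsSpent <= 0.0) return initialGuess;
    return static_cast<double>(w.cyclesDone) / w.secondsSpent;
}

// Size of the next packet: enough cycles to keep the worker busy for about
// targetSeconds, never less than one and never more than what is left.
std::int64_t nextPacketSize(const WorkerStats &w, double targetSeconds,
                            std::int64_t cyclesLeft, double initialGuess = 1.0)
{
    assert(targetSeconds > 0.0 && cyclesLeft >= 0);
    if (cyclesLeft == 0) return 0;
    const std::int64_t wanted =
        static_cast<std::int64_t>(measuredRate(w, initialGuess) * targetSeconds);
    return std::max<std::int64_t>(1, std::min(wanted, cyclesLeft));
}
```

With this scheme a worker measured at 500 cycles/s and a 4 s target would receive packets of about 2000 cycles, while a slower worker receives proportionally smaller ones, so all workers report back at a similar pace.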

Page 7: Recent PROOF developments

Output file merging

• Large output objects (e.g. trees) create memory problems
• Solution:
  • save them in files on the workers
  • merge the files on the master using TFileMerger
• New class TProofFile defines the file and provides tools to handle the merging
  • Unique file names are created internally to avoid crashes
• Merging will happen on the master at the end of the query
• Final file is left in the sandbox on the master or saved where the client wishes
• Included in ROOT 5.17/04

L. Tran-Thanh

Page 8: Recent PROOF developments

Output file merging: example

void PythiaMC::SlaveBegin(TTree *)
{
   // Meta file object: to be added to the output list
   fProofFile = new TProofFile();
   fOutput->Add(fProofFile);

   // Output filename (any format understood by TFile::Open)
   TNamed *outf = (TNamed *) fInput->FindObject("PROOF_OUTPUTFILE");
   if (outf) fProofFile->SetOutputFileName(outf->GetTitle());

   // Open the file with a unique name
   fFile = fProofFile->OpenFile("RECREATE");

   // Create the tree and attach it to the file
   fTree = new TTree(…);
   fTree->SetDirectory(fFile);
   …
}

Bool_t PythiaMC::Process(Long64_t entry)
{
   fTree->Fill();
   return kTRUE;
}

void PythiaMC::SlaveTerminate()
{
   if (fFile) {
      fFile->cd();
      // Write here big objects
      fTree->Write();
      fFile->Close();
   }
}

Page 9: Recent PROOF developments

Software handling

• Package handling
  • Separated behaviour client / cluster for enabling
  • Real-time feedback during build
  • API to modify include / library paths on the workers
  • Use packages globally available on the cluster
• Load mechanism extended to single class / macro
• Selectors / macros / classes binaries cached
  • Decreases initialization time if selector did not change
  • Version check for binaries based also on SVN revision
• Support for multiple ROOT versions

root[] TProof *proof = TProof::Open("master")
root[] proof->Load("MyClass.C")

Page 10: Recent PROOF developments

Software handling

• Next steps
  • Package versioning (e.g. ESD-v1.12.103-new)
    • Directory structure including also the ROOT version
  • Filter the selector code into “client” and “cluster” parts
    • Clients should not be obliged to load tons of experiment libraries typically needed only for processing on the cluster

~$ pwd
<SandBox>/packages/ESD/1.12.103-new/root_v5.17.05-r20920/ESD

Page 11: Recent PROOF developments

Load balancing: improved packetizer

• Packetizer’s goal: optimize work distribution to process queries as fast as possible
• Standard TPacketizer’s strategy
  • first process local files, then try to process remote data
• End-of-query bottleneck

[Plot: active workers vs. processing time]

J. Iwaszkiewicz

Page 12: Recent PROOF developments

New strategy: TPacketizerAdaptive

• Predict processing time of local files for each worker
• Keep assigning remote files from the start of the query to workers expected to finish faster
• Processing time improved by up to 50%

[Plots (same scale): processing rate for all packets, NEW vs. OLD; remote packets indicated]
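
The strategy described above (predict when each worker will finish, hand remote files to those expected to finish first) can be sketched in plain C++. This is a simplified illustration, not the TPacketizerAdaptive implementation: the Worker struct, the function names and the byte-based time prediction are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of a worker; not the TPacketizerAdaptive internals.
struct Worker {
    double bytesLeft;        // bytes still assigned to this worker (initially its local files)
    double bytesPerSecond;   // measured processing rate
};

// Predicted time for a worker to finish the work it currently holds.
double predictedTime(const Worker &w)
{
    assert(w.bytesPerSecond > 0.0);
    return w.bytesLeft / w.bytesPerSecond;
}

// Index of the worker expected to finish first: the candidate to receive
// the next remote file. Ties go to the lowest index.
std::size_t pickWorkerForRemoteFile(const std::vector<Worker> &workers)
{
    assert(!workers.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < workers.size(); ++i) {
        if (predictedTime(workers[i]) < predictedTime(workers[best])) best = i;
    }
    return best;
}

// Assign one remote file: the chosen worker takes on its bytes, which
// pushes back its predicted finish time for the next decision.
std::size_t assignRemoteFile(std::vector<Worker> &workers, double remoteBytes)
{
    const std::size_t i = pickWorkerForRemoteFile(workers);
    workers[i].bytesLeft += remoteBytes;
    return i;
}
```

Because each assignment updates the prediction, remote files spread over the workers that would otherwise sit idle, which is the effect needed to avoid the end-of-query bottleneck.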

Page 13: Recent PROOF developments

Data access

• Tree cache enabled (+ asynchronous reading)
  • Expect improvements in the case of many users and non-local files
• Under study:
  • Exploit the large number of cores and relatively large amount of memory of new machines
  • Separate thread for unzipping the data
  • Use xrootd as dynamic pre-loader

Page 14: Recent PROOF developments

Monitoring the resource usage

• Per-query information
  • CPU time, wall time, bytes read, events, user, group
• Posted by the master via TVirtualMonitorWriter
  • E.g. MonAlisa, MySQL
• Used for monitoring or to correct priorities based on usage history (see M. Meoni’s talk)

Page 15: Recent PROOF developments

Memory consumption monitoring

• Workers monitor their memory usage and save the info in the log file
• New button in the dialog box to display the evolution of memory usage per node in real time
• Clients get warned of high usage
  • The session may eventually be killed
• Prototype being tested

A. Kreshuk

Page 16: Recent PROOF developments

Motivation for scheduling?

• Controlling resources and how they are used
• Improving efficiency
  • assigning to a job those nodes that have the data which needs to be analyzed
• Implementing different scheduling policies
  • e.g. fair share, group priorities & quotas
• Efficient use even in case of congestion

Page 17: Recent PROOF developments

PROOF specific requirements

• Interactive system
  • Jobs should be processed as soon as submitted
  • However, when max system throughput is reached, some jobs have to be postponed
• I/O bound jobs use more resources at the start and less at the end (file distribution)
• Try to process data at its location for performance
• User defines a dataset, not the #workers

Page 18: Recent PROOF developments

Enforcing experiment priority policies

• Based on group priority information defined in dedicated files
• Technology
  • “renice” low priority non-idle sessions
    • Priority = 20 - nice (-20 <= nice <= 19)
    • Limit max priority to avoid over-killing the system
• May be centrally controlled
  • Master updates the priorities and broadcasts them to the active workers
• Feedback mechanism (e.g. via a monitoring tool) allows the priorities to be adjusted (see M. Meoni’s talk)
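
The relation Priority = 20 - nice and the cap on the maximum priority can be written down as a small C++ sketch. The function names and the maxPriority parameter are hypothetical; the slide asks for a limit but gives no value, and this is not the PROOF code that calls renice.

```cpp
#include <algorithm>
#include <cassert>

// Bounds of the Unix nice value, as quoted on the slide.
const int kMinNice = -20;
const int kMaxNice = 19;

// Priority = 20 - nice, so nice 19 maps to priority 1 and nice -20 to 40.
int priorityFromNice(int nice)
{
    assert(nice >= kMinNice && nice <= kMaxNice);
    return 20 - nice;
}

// Inverse mapping, with a hypothetical cap on the priority a session may get.
// The result is clamped to the valid nice range.
int niceFromPriority(int priority, int maxPriority)
{
    const int capped = std::min(priority, maxPriority);
    const int nice = 20 - capped;
    return std::max(kMinNice, std::min(kMaxNice, nice));
}
```

For example, with an assumed cap of 25 a session asking for priority 40 would be reniced to -5 rather than -20, leaving headroom for the rest of the system.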

Page 19: Recent PROOF developments

Central Scheduler

• Assigning a set of workers for a job based on:
  • The data set location
  • User priority (quota + historical usage)
  • The current load of the cluster
• First implementation:
  • # of workers ≈ relativePriority * nFreeCPUs
  • Assign least loaded workers first
• Missing ingredients
  • Come&Go functionality for workers
    • Needed also by the Condor interface
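
The first implementation described above can be illustrated with a short C++ sketch. It is not the scheduler's actual code: the Node struct, the function names, the rounding and the "at least one worker" rule are assumptions of this example, with relativePriority taken as a fraction between 0 and 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical description of a worker node as seen by the scheduler.
struct Node {
    int id;
    double load;   // current load; lower means less busy
};

// Number of workers ≈ relativePriority * nFreeCPUs, rounded to nearest, with
// at least one worker whenever a CPU is free (an assumption of this sketch).
int workersForJob(double relativePriority, int nFreeCPUs)
{
    assert(relativePriority >= 0.0 && relativePriority <= 1.0 && nFreeCPUs >= 0);
    if (nFreeCPUs == 0) return 0;
    const int n = static_cast<int>(std::lround(relativePriority * nFreeCPUs));
    return std::max(1, std::min(n, nFreeCPUs));
}

// Pick the ids of the n least loaded nodes; ties keep the original order.
std::vector<int> pickLeastLoaded(std::vector<Node> nodes, std::size_t n)
{
    std::stable_sort(nodes.begin(), nodes.end(),
                     [](const Node &a, const Node &b) { return a.load < b.load; });
    if (n > nodes.size()) n = nodes.size();
    std::vector<int> ids;
    for (std::size_t i = 0; i < n; ++i) ids.push_back(nodes[i].id);
    return ids;
}
```

A job with relative priority 0.25 on a cluster with 16 free CPUs would thus get about 4 workers, taken from the least loaded nodes.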

Page 20: Recent PROOF developments

Central scheduling

Schematic view

[Diagram components: Client, PROOF master, Dataset Lookup, Scheduler (load, history, policy, …), Start workers]
1: Job {dataset, …}
2: dataset
3: file locations
4: Job info
5: workers
6: workers

Page 21: Recent PROOF developments

Tutorials, testing, installation

• Tutorials
  • Frame for PROOF examples: $ROOTSYS/tutorials/proof/runProof.C
  • Currently available:
    • « simple »: histogram filling with random entries
    • « h1-http »: H1 analysis reading data via HTTP
• Testing
  • Frame for PROOF tests: $ROOTSYS/test/stressProof.C
• Installation
  • Interactive script to simplify the installation of a small cluster: $ROOTSYS/etc/proof/utils/proofinstall.sh

Page 22: Recent PROOF developments

PROOF and SVN

• PROOF development branch
  • http://root.cern.ch/svn/root/branches/dev/proof
  • Synchronized daily with the main trunk
• PROOF tags
  • http://root.cern.ch/svn/root/branches/dev/proof-tags
  • Specific « snapshots » of the dev branch
  • Binaries installed on AFS at /afs/cern.ch/sw/lcg/contrib/proof/root

Page 23: Recent PROOF developments

Questions?

Credits
• G.G., J. Iwaszkiewicz, A. Kreshuk, F. Rademakers, L. Tran-Thanh (summer student ‘07)
• G. Bruckner, J.F. Grosse-Oetringhaus, M. Meoni, A. Peters (ALICE)
• F. Furano (CERN)
• A. Hanushevsky (SLAC)

