
ROOT I/O @ DESY Computing Seminar (2009-11-30)


Axel Naumann


Outline

- Motivation
- Basic ingredients of I/O
- X-Ray of a TTree
- Analysis Environments
- Optimizing a TTree
- TSelector & PROOF

2009-11-30 Axel Naumann • ROOT I/O @ DESY Computing Seminar 2


Motivation


- First data @ LHC!
- Reports of mis-designed TTrees
- Reports of mis-designed data transfer
- Bored cores
- Misleading recommendations, rumors, misunderstanding

Let’s explain how I/O and TTrees work!


Reflection, C++ Objects


Storing C++ Objects

Need to know:
- Type
- Members
- Location in memory
- How to create an object when reading

Provided by dictionary (rootcint / genreflex)


TNamed n("name","title");
file->WriteTObject(&n);


I/O's CPU Time

Serialization and zipping take time


C++ Objects versus Disk


- Disk stores series of bytes
- C++ objects are structured:
  - Data members
  - Base classes
  - Pointers
- ROOT I/O converts
  - Streaming or serialization


Zipping: CPU vs. Real Time

Example for reading:
- zipped file
  - 9.4 MB/s disk I/O
  - CPU unzips
  - corresponds to 34 MB/s data
- unzipped file
  - 25 MB/s disk I/O

Zipping can increase bandwidth!
Especially for concurrent disk access!
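The numbers in this example can be checked with a little arithmetic. The sketch below is plain C++, not ROOT code; the function names are made up for illustration, and it assumes the CPU unzips fast enough to keep up with the disk, as the example suggests.

```cpp
#include <cassert>

// Compression ratio (uncompressed size / compressed size) implied by a
// measured disk rate and the data rate it corresponds to after unzipping.
double implied_compression_ratio(double disk_mb_per_s, double data_mb_per_s) {
    assert(disk_mb_per_s > 0.0);
    return data_mb_per_s / disk_mb_per_s;
}

// Effective data rate seen by the analysis when reading a zipped file:
// the disk delivers compressed bytes, the CPU inflates them.
double effective_data_rate(double disk_mb_per_s, double compression_ratio) {
    assert(disk_mb_per_s >= 0.0 && compression_ratio > 0.0);
    return disk_mb_per_s * compression_ratio;
}
```

With the values above, 34 / 9.4 gives a compression ratio of about 3.6, so the zipped file delivers more data per second than the 25 MB/s read from the unzipped file, even though the disk itself moves fewer bytes.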


There is more than branches…


Views Of A TTree: C++ Access

- Branch / leaf structure
- Splitting: generate branches recursively according to C++ class layout; create sub-branches for
  - Data members (e.g. MyClass fMember)
  - Members of base classes (e.g. A: public B)
  - Containers (e.g. vector<C> fC): split elements! (C's members, not vector<C>'s members)
- Split level: where to stop splitting and put entire object into one branch instead


Splitting Across Collections

- Split vector<C>:
  D.fC.fM, D.fC.fN
- Or even vector<C*>:
  D.fC.fM, D.fC.fN
- Or even polymorphic, with split level > 100:
  D.fC.C.fM, D.fC.C.fN
  D.fC.C1.fC1
  D.fC.C2.fC2


class C {int fM, fN;};

class D0 {vector<C> fC;};

class D1 {vector<C*> fC;};

class C1: public C {int fC1;};
class C2: public C {float fC2;};
class D2 {vector<C*> fC;};


Obvious Data Considerations

- Don't store empty or useless data!
  - Can use //! to not store members
- Combine branches
  - Better store the jet algo name with the jets than one jet branch per algo
  - Consider vector<T*> with split level > 100
- Branch granularity is read selectivity
  - Always reading x,y,z,E? Don't split them!
  - Split xyzE saves a bit of disk space, though


Data Layout Considerations

- Allocating objects takes time
  - TClonesArray faster than vector<T>
  - vector<T> faster than vector<T*>
- Building objects takes time
  - Flat inheritance hierarchy
  - Reduce object containment: class A has member of class B, which has member of class C,…
  - STL platform dependent; need extra layer


Data References

- References are easy to get wrong
  - NO map (dead slow) or uuid etc (slow + big)
  - Better use indices
- Optimal: TRef / TRefArray
  - Good reason to inherit from TObject!
  - Extremely fast object dereferencing, embedded in ROOT I/O
  - Support for merged TTrees
  - Support for autoloading of branches


Non-Split Collections

Non-split storing of C:
1. object-wise
   Faster object retrieval
2. member-wise
   Faster member retrieval, better compression


class C {int fM, fN;};

class D0{vector<C> fC;};
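The two storage orders can be illustrated with a small sketch. This is plain C++ rather than ROOT's streamer code; the function names are hypothetical, and `C` is declared as a struct with public members only so the example can access them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct C { std::int32_t fM, fN; };  // same members as the slide's class C

// Object-wise: all members of element 0, then all members of element 1, ...
// Buffer layout: fM0 fN0 fM1 fN1 ...
std::vector<std::int32_t> store_object_wise(const std::vector<C>& v) {
    std::vector<std::int32_t> out;
    out.reserve(2 * v.size());
    for (const C& c : v) { out.push_back(c.fM); out.push_back(c.fN); }
    return out;
}

// Member-wise: fM of every element, then fN of every element.
// Buffer layout: fM0 fM1 ... fN0 fN1 ...
std::vector<std::int32_t> store_member_wise(const std::vector<C>& v) {
    std::vector<std::int32_t> out;
    out.reserve(2 * v.size());
    for (const C& c : v) out.push_back(c.fM);
    for (const C& c : v) out.push_back(c.fN);
    return out;
}

// Read a member-wise buffer back into objects.
std::vector<C> load_member_wise(const std::vector<std::int32_t>& buf) {
    assert(buf.size() % 2 == 0);  // one fM and one fN per element
    const std::size_t n = buf.size() / 2;
    std::vector<C> v(n);
    for (std::size_t i = 0; i < n; ++i) { v[i].fM = buf[i]; v[i].fN = buf[n + i]; }
    return v;
}
```

In the member-wise buffer, values of the same member sit next to each other, which is the usual reason such a layout compresses better and makes reading a single member cheaper.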


TTree's Data Layout

TTree::Fill() adds to baskets

Baskets: most important internal concept of TTrees!


class C {int fM; long fN;};

[Figure: objects (fM, fN) are filled into the baskets of branches C.fM and C.fN; the TTree header stores the file offset of each basket.]


TTree's Data Layout: Baskets

- Baskets concatenate collection elements (e.g. std::vector<C>) across TTree entries
- When basket full: zip, write to TFile, store file offset in TTree header
- Important parameters:
  - basket size
  - sizeof(element)
  - sizeof(element) * collection entries per tree entry
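A toy model may help to picture the fill-and-flush cycle. The sketch below is not ROOT's implementation: `ToyBranch`, `ToyFile`, `fill` and `flush` are invented names, zipping is left out, and the "file" is just a byte vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy file: a growing sequence of bytes.
using ToyFile = std::vector<std::uint8_t>;

// Toy model of one branch.
struct ToyBranch {
    std::size_t basket_capacity;            // basket size in bytes
    std::vector<std::uint8_t> basket;       // current in-memory basket
    std::vector<std::size_t> file_offsets;  // "header": where each written basket starts
};

// Write the current basket to the file and remember its offset.
void flush(ToyBranch& b, ToyFile& file) {
    if (b.basket.empty()) return;
    b.file_offsets.push_back(file.size());
    file.insert(file.end(), b.basket.begin(), b.basket.end());  // real I/O would zip first
    b.basket.clear();
}

// Append one tree entry's bytes; flush first if the basket would overflow.
void fill(ToyBranch& b, ToyFile& file, const std::vector<std::uint8_t>& entry_bytes) {
    assert(entry_bytes.size() <= b.basket_capacity);
    if (b.basket.size() + entry_bytes.size() > b.basket_capacity) flush(b, file);
    b.basket.insert(b.basket.end(), entry_bytes.begin(), entry_bytes.end());
}
```

Filling two such branches in the same loop shows how baskets of different branches end up interleaved in the file, and why the ratio of basket size to bytes per entry decides how many tree entries share one basket.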


1 versus 1M Branches

- Each branch has management overhead (baskets,…): 1 branch ideal!
- Each branch can be accessed independently, without reading anything else: 1M branches ideal!
- Reasonable number of branches: tens to a few hundred


Spin, little disk, spin!


Reading TTrees

Task: read (a subset of) all branches for a TTree entry number
- Get file offsets for requested branches
- Read necessary baskets from file
- Unzip baskets and fill objects

Plus: schema evolution, endianness,…


Reading TTrees

Reading baskets: what can happen?
- Only part of basket is needed
- Need to skip baskets of other branches
- Huge basket size, tiny contained values: basket might be written at end of file


Read Access Pattern

Traditional TTree has many small reads


I/O Performance Analysis

Monitor TTree reads with TTreePerfStats


TFile *f = TFile::Open("xyz.root");
TTree *T = (TTree*)f->Get("MyTree");

// Attach the performance monitor to the tree before reading
TTreePerfStats *ps = new TTreePerfStats("ioperf", T);

Long64_t n = T->GetEntries();
for (Long64_t i = 0; i < n; ++i) {
   T->GetEntry(i);
   DoSomething();   // placeholder for the analysis code
}
ps->SaveAs("perfstat.root");

New in v5.25/04!


Study TTreePerfStats

Visualizes read access:
- x: tree entry
- y: file offset
- y: real time

TFile f("perfstat.root");
ioperf->Draw();
ioperf->Print();


Reading Baskets

- Problem: many seeks
  - Reduces throughput from O(100) MB/s to O(1) MB/s (real time)
  - Disk cannot support >1 user
- Latency for each request
  - typical network, typical file: 120 ms round trip, 1M reads
  - one day waiting time!
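The waiting time quoted above follows from a simple multiplication. The sketch below is plain C++ with illustrative function names; it assumes reads are issued strictly one after another, so that every read pays a full round trip.

```cpp
#include <cassert>

// Time spent purely waiting when each request pays one network round trip
// and requests are issued sequentially (no batching, no overlap).
double total_wait_seconds(double round_trip_ms, double n_requests) {
    assert(round_trip_ms >= 0.0 && n_requests >= 0.0);
    return round_trip_ms / 1000.0 * n_requests;
}

double seconds_to_hours(double seconds) { return seconds / 3600.0; }
```

For 120 ms and 1M reads this gives 120000 s, a bit more than 33 hours, which is the order of the "one day" quoted above. Under the same assumption, collecting all reads into a single request would cost one round trip, 0.12 s of latency.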


Legi, Vidi, Vici!


Fewer Requests, Part 1

- Less sensitive to latency
  - better ask once for 1M baskets than 1M times for 1 basket
- NOT A SOLUTION: only one branch
  - no granularity
  - no parallelization
  - charging network with irrelevant data


Fewer Requests, Part 2

- Less sensitive to latency
  - better ask once for 1M baskets than 1M times for 1 basket
- NOT A SOLUTION: only one branch
- Better: sending a collection of requests
  - Storage (kernel / disk / disk server) can sort requests


TTreeCache

- Sends a collection of read requests before analysis needs the baskets
- Must predict baskets:
  - learns from previous entries
  - takes TEntryList into account
- Enabled per TTree


Improved in v5.25/04!

TFile *f = new TFile("xyz.root");
TTree *T = (TTree*)f->Get("Events");
T->SetCacheSize(30000000);   // 30 MB cache
T->AddBranchToCache("*");    // cache all branches


TTreeCache Usage

Without: analysis after transfer + latency:
[Figure: CPU and I/O timelines]

With TTreeCache, transfer and analysis of prior TTree entry in parallel:
[Figure: CPU and I/O timelines]


TTreeCache vs. Seeks

- TTreeCache sends collected read request (readv)
- Merges only adjacent baskets, reducing number of requests by almost nothing
- Disks hate seeking, love sequential reading
- Much cheaper to read 2MB than to read 1k at the beginning and 1k at the end


Read Padding

- Merges all read requests within a given distance by also requesting bytes in between
- Typical window: 2MB
- Dramatically reduces load on storage device, even local disk
- Dramatically increases throughput
- Must-use for concurrent storage access and / or network
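The merging rule can be sketched in a few lines. This is an illustration in plain C++, not ROOT's code: `ReadRequest` and `pad_reads` are hypothetical names, and the sketch assumes that "distance" means the gap between the end of one request and the start of the next.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct ReadRequest {
    std::uint64_t offset;  // start position in the file, in bytes
    std::uint64_t length;  // number of bytes requested
};

// Merge requests whose gap to the previous (merged) request is at most
// `window` bytes; the bytes in between are requested as well ("padding").
std::vector<ReadRequest> pad_reads(std::vector<ReadRequest> reqs, std::uint64_t window) {
    std::sort(reqs.begin(), reqs.end(),
              [](const ReadRequest& a, const ReadRequest& b) { return a.offset < b.offset; });
    std::vector<ReadRequest> merged;
    for (const ReadRequest& r : reqs) {
        assert(r.length > 0);
        if (!merged.empty()) {
            ReadRequest& last = merged.back();
            const std::uint64_t last_end = last.offset + last.length;
            // Overlapping or adjacent, or close enough to bridge the gap.
            if (r.offset <= last_end || r.offset - last_end <= window) {
                last.length = std::max(last_end, r.offset + r.length) - last.offset;
                continue;
            }
        }
        merged.push_back(r);
    }
    return merged;
}
```

With a window of 0 only overlapping or adjacent requests merge; a larger window trades some extra bytes for fewer, larger, sequential reads.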


Half Way

- Much more ordered reads
- Still lots of jumps because baskets are spread across the file


Problem: Basket Size

- Ideally, reading a TTree entry is one seek
  - All TTree entries' baskets consecutive
- In reality, most baskets are not full after filling a TTree entry
  - Baskets shared by several TTree entries
  - Need to seek to read all baskets


OptimizeBaskets, AutoFlush

Solution, enabled by default:
- Tweak basket size!
- Flush baskets at regular intervals!


New in v5.25/04!


TTree Optimizations: Results

- Studying ATLAS and CMS AOD files
- Results for ATLAS: factor 6 improvement!
  - That can be 1 hour instead of 6!
- Concurrent data access now possible


                 TTreeCache off                   30MB TTreeCache
Original File    658s real time, 166s CPU time    183s real time, 126s CPU time
Optimized File   117s real time, 102s CPU time    109s real time, 99s CPU time


We know how to process your data!


TSelector

- Everybody uses TTrees
- Obvious to create a common analysis framework
- Derive from TSelector
- Separates analysis into steps
  - Init(): "this is your tree!"
  - SlaveBegin(): "create your histogram!"
  - Process(): "analyze the event!"
  - Terminate(): "we're done, fit!"


Parallel Analysis

- Analyze several TTree entries in parallel, e.g. in a batch
- Typical steps:
  1. send code
  2. send split data
  3. analyze
  4. merge results
- Use the same TSelector also for parallel analysis!
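The split / analyze / merge steps can be sketched without any framework. The code below is plain C++ with hypothetical names (`analyze`, `merge`, `process_in_chunks`); it treats each entry as a single unsigned value and processes the chunks one after another, where a batch system would hand each chunk to a different worker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Histogram = std::vector<long>;  // bin counts

// "analyze": fill a private histogram from entries [begin, end).
Histogram analyze(const std::vector<unsigned>& entries, std::size_t begin, std::size_t end,
                  std::size_t n_bins) {
    assert(n_bins > 0 && begin <= end && end <= entries.size());
    Histogram h(n_bins, 0);
    for (std::size_t i = begin; i < end; ++i) ++h[entries[i] % n_bins];
    return h;
}

// "merge results": add partial histograms bin by bin.
Histogram merge(const std::vector<Histogram>& parts, std::size_t n_bins) {
    Histogram total(n_bins, 0);
    for (const Histogram& p : parts) {
        assert(p.size() == n_bins);
        for (std::size_t b = 0; b < n_bins; ++b) total[b] += p[b];
    }
    return total;
}

// "send split data", "analyze", "merge": one chunk per worker.
Histogram process_in_chunks(const std::vector<unsigned>& entries, std::size_t n_workers,
                            std::size_t n_bins) {
    assert(n_workers > 0 && n_bins > 0);
    const std::size_t chunk = (entries.size() + n_workers - 1) / n_workers;
    std::vector<Histogram> parts;
    for (std::size_t w = 0; w < n_workers; ++w) {
        const std::size_t begin = std::min(w * chunk, entries.size());
        const std::size_t end = std::min(begin + chunk, entries.size());
        parts.push_back(analyze(entries, begin, end, n_bins));
    }
    return merge(parts, n_bins);
}
```

Because the same `analyze` function serves one chunk or all entries, the merged result does not depend on the number of workers, which mirrors the point about reusing one TSelector.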


PROOF

Axel Naumann • ROOT @ NTNU Tech Screening 40

[Figure: the Client sends commands and scripts to the MASTER of the PROOF farm (Workers, Storage) and receives a list of output objects (histograms, …).]

2009-11-09


Creating a PROOF Session

In ROOT type:

TProof *p = TProof::Open("uberpc")

- Connects ROOT to the master machine on the PROOF cluster (here: "uberpc")
- TSelectors will be run in PROOF


PROOF Lite

[Figure: the Client sends commands and scripts to a multi-core desktop/laptop and receives a list of output objects (histograms, …).]


Creating a PROOF Lite Session

In ROOT type:

TProof *p = TProof::Open("lite")

- TSelectors will be run on all cores in parallel
- Converts your multi-core computer into a PROOF cluster!


PROOF Analysis

Example of local TChain analysis

// Create a chain of trees
root[0] TChain *c = new TChain("myTree");
root[1] c->Add("http://www.any.where/file1.root");
root[2] c->Add("http://www.any.where/file2.root");

// MySelector is a TSelector
root[3] c->Process("MySelector.C+");


PROOF Analysis

Same example with PROOF

// Create a chain of trees
root[0] TChain *c = new TChain("myTree");
root[1] c->Add("http://www.any.where/file1.root");
root[2] c->Add("http://www.any.where/file2.root");

// Start PROOF and tell the chain to use it
root[3] TProof::Open("lite");
root[4] c->SetProof();

// Process goes via PROOF
root[5] c->Process("MySelector.C+");


PROOF Is Interactive

- See results while they accumulate
  - Calculation wrong?
  - Forgot to fill histogram?
  - Restart now instead of in 8 hours


PROOF Is Quick

- Optimized for quick results, not batch system occupancy
- Send TTree entries to workers while running, based on their past performance
  - Reduces "tail"
- Allowed ALICE to see first collisions after two minutes instead of hours!
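Why handing out work while running shortens the "tail" can be seen in a small simulation. This is a simplified sketch, not PROOF's packetizer: it assumes fixed worker speeds and a fixed packet size, and the names `static_makespan` and `dynamic_makespan` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Static split: every worker gets the same number of entries up front.
// Returns the time at which the slowest worker finishes.
double static_makespan(double n_entries, const std::vector<double>& entries_per_s) {
    assert(!entries_per_s.empty());
    const double share = n_entries / static_cast<double>(entries_per_s.size());
    double worst = 0.0;
    for (double rate : entries_per_s) {
        assert(rate > 0.0);
        worst = std::max(worst, share / rate);
    }
    return worst;
}

// Dynamic split: entries are handed out in small packets to whichever worker
// becomes free first, so faster workers end up processing more packets.
double dynamic_makespan(double n_entries, double packet_size,
                        const std::vector<double>& entries_per_s) {
    assert(!entries_per_s.empty() && packet_size > 0.0);
    using Slot = std::pair<double, std::size_t>;  // (time worker becomes free, worker index)
    std::priority_queue<Slot, std::vector<Slot>, std::greater<Slot>> free_at;
    for (std::size_t w = 0; w < entries_per_s.size(); ++w) free_at.push({0.0, w});
    double finish = 0.0;
    for (double left = n_entries; left > 0.0; left -= packet_size) {
        const Slot s = free_at.top();
        free_at.pop();
        const double packet = std::min(packet_size, left);
        const double done = s.first + packet / entries_per_s[s.second];
        finish = std::max(finish, done);
        free_at.push({done, s.second});
    }
    return finish;
}
```

With one worker four times faster than the other, the static split has everyone waiting for the slow worker, while the packet-based split finishes close to the time given by the combined processing rate.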


PROOF Availability

- PROOF Lite is "just there" in ROOT >= 5.24
- Set up a local PROOF cluster, e.g. allow a batch cluster to also be used by PROOF!
- People who use it and the grid or traditional job-based batches love it
- But you already have it: PROOF@NAF!


We're still not done!


Upcoming Challenges

- Decrease CPU time of I/O
  - Parallel unzipping (CPU time / core)
  - Building objects in a smarter way
- Shorten merge time!
  - Merge in parallel to analysis
  - Easy for histograms etc, tricky for TTrees…


[Figure labels: I/O, CPU, Analysis, ZIP]


What To Take To Your Office

Many optimizations are enabled by default, except for these:

TFile *f = new TFile("xyz.root");
TTree *T = (TTree*)f->Get("Events");
T->SetBranchStatus("*", 0);           // disable all branches...
T->SetBranchStatus("MyBranch*", 1);   // ...then enable only the ones needed
T->SetCacheSize(30000000);            // 30 MB TTreeCache
T->AddBranchToCache("MyBranch*");


Summary

- I/O costs real time, CPU time
- Performance monitoring and optimizations are part of ROOT
- Default optimizers show huge benefit for network transfer and even local files!
- Build a good tree, see how it behaves
- Analyze with PROOF for quick results!
