
20-30 years of Computing in HEP

Date post: 02-Jan-2016
20-30 years of Computing in HEP International Workshop on Large Scale Computing Kolkata 8 February 2006 René Brun CERN
Transcript
Page 1: 20-30 years  of Computing in HEP

20-30 years of Computing in HEP

International Workshop on Large Scale Computing

Kolkata, 8 February 2006

René Brun, CERN

Page 2: 20-30 years  of Computing in HEP

Hardware & Software Evolution

Looking several years back

to understand how HEP software has been shaped

Page 3: 20-30 years  of Computing in HEP

René Brun, CERN Kolkata IWLSC workshop 3

Punched cards

1973

hbook

geant1

Program size limited by the size of the box and memory (32 Kb)

2000 cards per box

200000 cards in all racks

1500000 lines in ROOT !!

750 boxes

Page 4: 20-30 years  of Computing in HEP

Mainframes, workstations,..

1980

1982

1986

geant3

HTV

High level packages, Graphics

Mainframes, Workstations

Page 5: 20-30 years  of Computing in HEP

, OS, Desktops & Laptops

CDC SCOPE

IBM/OS/Wylbur

CRAY

Linux

VAX/VMS

Apollo Aegis

Unix workstations

MacOSX

Windows

Page 6: 20-30 years  of Computing in HEP

PROOF

Farms & GRID(s)

Page 7: 20-30 years  of Computing in HEP

The 3 technology laws

• Moore's Law: formulated by Gordon Moore of Intel in the early 70's - the processing power of a microchip doubles every 18 months; corollary, computers become faster and the price of a given level of computing power halves every 18 months. (well ! Not true anymore, see later)

• Gilder's Law: proposed by George Gilder, prolific author and prophet of the new technology age - the total bandwidth of communication systems triples every twelve months. New developments seem to confirm that bandwidth availability will continue to expand at a rate that supports Gilder's Law.

• Metcalfe's Law: attributed to Robert Metcalfe, originator of Ethernet and founder of 3COM: the value of a network is proportional to the square of the number of nodes; so, as a network grows, the value of being connected to it grows quadratically, while the cost per user remains the same or even reduces.

• But no laws about Software (well ! Murphy’s law)
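The three laws can be turned into simple growth factors. The sketch below takes the rates exactly as quoted on the slide (doubling every 18 months, tripling every 12 months, value equal to the square of the node count with the proportionality constant set to 1). The function names are illustrative only.

```cpp
#include <cmath>

// Growth factor after `years` if a quantity doubles every 18 months
// (Moore's law as quoted on the slide).
double mooreFactor(double years) { return std::pow(2.0, years / 1.5); }

// Growth factor after `years` if bandwidth triples every 12 months
// (Gilder's law as quoted on the slide).
double gilderFactor(double years) { return std::pow(3.0, years); }

// Relative network value for `nodes` nodes, taking value = nodes^2
// (Metcalfe's law; the proportionality constant is assumed to be 1).
double metcalfeValue(double nodes) { return nodes * nodes; }
```

Under these assumptions three years give a factor 4 for processing power and 27 for bandwidth, and doubling the number of nodes multiplies the network value by four.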

Page 8: 20-30 years  of Computing in HEP

Hardware & Compilers

(Figure; axis label: Relative units)

Page 9: 20-30 years  of Computing in HEP

Multi Core CPUs

http://www.intel.com/technology/computing/archinnov/platform2015/

Page 10: 20-30 years  of Computing in HEP

Program Size (lines of code)

One user

Public libraries

MS Windows

Experiment code base

Page 11: 20-30 years  of Computing in HEP

Program Size (RAM)

?

Page 12: 20-30 years  of Computing in HEP

Time to compile

C++

C

ADA

F77/90

Page 13: 20-30 years  of Computing in HEP

Files, Classes

Becoming a problem.

Need tools to manage dependencies

The same curve can be used to show the number of people in one large experiment

Page 14: 20-30 years  of Computing in HEP

Languages

Business

Cobol

AI languages

Lisp, Prolog

ADA Modula 2

Eiffel

Objective C

C++ Java

SQL

Machine code

Pascal C

Formula Translation

Algorithmic Languages

Page 15: 20-30 years  of Computing in HEP

Fortran to C++

fortran

Fortran+

zebra

C++

Java

complexity

power

Page 16: 20-30 years  of Computing in HEP

From static modules to plug-ins

• app.exe = (main.o) 1955
• app.exe = (main.o, x.o, y.o) 1965
• app.exe = (main.o, x.o, lib1.a, lib2.a) 1975
• app.exe = (main.o, x.o, lib1.a, lib2.so, lib3.so) 1985
• app.exe = (main.o, libs.so) + dyn libs.so 1995
• app.exe = (main.o, libs.so) + plug-in manager 2005
• BOOT + URLs + local caches (interp + comp) 2015 ??
• See my talk at CHEP
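The 2005 step (a plug-in manager on top of shared libraries) can be sketched as a registry that creates an implementation only when it is first requested. This is a minimal illustration and not the ROOT plug-in manager API: the names Painter, HistPainter and PluginManager are hypothetical, and an in-process factory stands in for loading a shared library.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface that every plug-in implements.
struct Painter {
    virtual ~Painter() = default;
    virtual std::string name() const = 0;
};

struct HistPainter : Painter {
    std::string name() const override { return "HistPainter"; }
};

// Minimal plug-in manager: maps a service name to a factory and creates
// the implementation on first request only. A real system would load a
// shared library at that point; here the factory is an in-process callable
// (an assumption made to keep the sketch self-contained).
class PluginManager {
public:
    using Factory = std::function<std::unique_ptr<Painter>()>;

    void declare(const std::string& service, Factory factory) {
        factories_[service] = std::move(factory);
    }

    // Returns the plug-in for `service`, instantiating it on first use.
    // Returns nullptr if no handler was declared for that service.
    Painter* get(const std::string& service) {
        auto loaded = loaded_.find(service);
        if (loaded != loaded_.end()) return loaded->second.get();
        auto factory = factories_.find(service);
        if (factory == factories_.end()) return nullptr;
        auto inserted = loaded_.emplace(service, factory->second());
        return inserted.first->second.get();
    }

    std::size_t loadedCount() const { return loaded_.size(); }

private:
    std::map<std::string, Factory> factories_;
    std::map<std::string, std::unique_ptr<Painter>> loaded_;
};
```

Declaring a handler costs nothing until get() is called, which is the property that lets an application link only against a small core and pull in the rest on demand.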

Page 17: 20-30 years  of Computing in HEP

Current ROOT structure & libs

Page 18: 20-30 years  of Computing in HEP

libGraf: … TGraph, TGaxis, TPave

libX11: drawline, drawtext

libCore: … I/O, TSystem …

libHist: … TH1, TH2 …

libHistPainter: … THistPainter, TPainter3DAlgorithms …

libGpad: … TPad, TFrame

h.Draw()

CINT

local mode

pm (Plug-in Manager)

Page 19: 20-30 years  of Computing in HEP

Compiled + interpreted code

• 1980: zcedex, mini command interpreter
• 1985: kuip/paw, command and macro interpreter
• 1986: Tk/Tcl includes a GUI
• 1984: comis, Fortran77 interpreter
• 1994: cint, a C & C++ interpreter
• 1998: python, OO on top of C++, Java
• 2002: ruby, better than python?
• 200x: BOOT (inter->code generation->compiler)

Page 20: 20-30 years  of Computing in HEP

Basic types and modules

• 1950: basic operators (trig functions part of the application)
• 1953: trig functions in a library
• 1954: fortran types (integer, real, hollerith). Subroutines communicate only via arguments.
• 1965: subroutines communicate via a blank common, then labeled common blocks.
• 1975: communication via a data structure management system
• 1980: derived types
• 1988: Object-Oriented programming: classes
• 1995: parametrized types, templates, STL
• 1996: Reflection/RTTI (Java)

Page 21: 20-30 years  of Computing in HEP

Programming models

• Procedural sequential
• Parallelism (MPI)
• Vectorisation
• Shared memory
• Multi-threading
• Client-server
  • Stateful
  • Stateless -> web
• Corba
• Distributed parallel computing (asynchronous)
• Messages. Signal/slots

Page 22: 20-30 years  of Computing in HEP

Problems with Fortran

• Abuse of common blocks.
• Unmanageable in large programs
• No data structures
• No generic machine independent I/O
• Systems like Hydra (1974), Zbook (1975), Bos (1977), Zebra (1983) designed to overcome these problems.

But most people loved it

Page 23: 20-30 years  of Computing in HEP

The Zebra system (1983)

• Zebra = Zbook + Hydra
• Main data structure management system used by PAW and Geant3 and also many collaborations.
• Powerful machine independent I/O
  • FZ: sequential
  • RZ: direct access (PAW ntuples)
• Nice data structure documentation system, including an interactive browser DZDOC.

Page 24: 20-30 years  of Computing in HEP

Zebra bank descriptor

Zebra style

iq(lvx+vtype) = type
q(lvx+vdz) = dz
ltk = lq(lvx-1)

C/C++ style

Vertex* vx;
vx->vtype = type;
vx->vdz = dz;
Track* tk = vx->tk;

I/O descriptor
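The contrast between the two styles can be made concrete with a small sketch. The Zebra-like half emulates a global store addressed by a bank base index plus named offsets; it is not the Zebra API, and the offset names and values are hypothetical. The C++ half expresses the same information as typed members.

```cpp
#include <cstddef>
#include <vector>

// Zebra-like style: one global store; a bank is addressed by its base
// index plus named offsets (names and values are assumptions).
namespace zebra_like {
constexpr std::size_t kVType = 1;  // offset of the "type" word
constexpr std::size_t kVDz = 2;    // offset of the "dz" word

inline void fillVertex(std::vector<double>& q, std::size_t lvx,
                       int type, double dz) {
    q[lvx + kVType] = type;  // plays the role of iq(lvx+vtype) = type
    q[lvx + kVDz] = dz;      // plays the role of q(lvx+vdz) = dz
}
}  // namespace zebra_like

// C++ style: the same information as named, typed members.
struct Track {};

struct Vertex {
    int vtype = 0;
    double vdz = 0.0;
    Track* tk = nullptr;  // link to the first track, like lq(lvx-1)
};

inline void fillVertex(Vertex* vx, int type, double dz) {
    vx->vtype = type;
    vx->vdz = dz;
}
```

In the first form nothing stops a wrong offset from overwriting a neighbouring bank, while in the second form the compiler checks member names and types.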

Page 25: 20-30 years  of Computing in HEP

Zebra DZDOC

lhdir

call fzout(lun,lhdir,..)

Zebra I/O could support very complex data structures

A Zebra bank is similar to a C++ data object

Page 26: 20-30 years  of Computing in HEP

Atlas DZDOC

Page 27: 20-30 years  of Computing in HEP

Zebra pros/cons

• Programming style archaic
• Easy to overwrite data structures
• Shared global store(s)
• Self-describing structures
• Concept of multi-heap (constants, histograms, event,..)
• Efficient garbage collection (division wipe)
• Built-in efficient and machine independent I/O
• Used by Geant3, PAW and many experiments
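The multi-heap idea and the "division wipe" can be illustrated with a bump allocator per division: banks are booked sequentially, and a wipe releases all of them at once by resetting one index. This is a sketch of the concept only, not Zebra's implementation; the class and method names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// A "division" modelled as a bump allocator over a fixed block of words.
class Division {
public:
    explicit Division(std::size_t words) : store_(words), top_(0) {}

    // Reserves `n` words and returns the index of the first one,
    // or npos if the division does not have enough room left.
    std::size_t book(std::size_t n) {
        if (n > store_.size() - top_) return npos;
        std::size_t at = top_;
        top_ += n;
        return at;
    }

    // Drops every bank in this division at once (the "wipe").
    void wipe() { top_ = 0; }

    std::size_t used() const { return top_; }
    double& word(std::size_t i) { return store_[i]; }

    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

private:
    std::vector<double> store_;
    std::size_t top_;
};
```

With one division for constants and one for the event, wiping the event division at the end of each event costs a single assignment and leaves the constants untouched.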

Page 28: 20-30 years  of Computing in HEP

Geant 1,2,3,……..4

• Geant1 1974
  • 2000 lines of Fortran 4
  • No physics, no geometry, only a bare framework
• Geant2 1975
  • 20000 lines of Fortran 4
  • Some physics for multiple scattering, energy loss, decays, framework for geometry and tracking
• Geant3 1980,81 1994 ------2006?
  • About 120000 lines of Fortran77 + zebra + paw
  • Electromagnetic physics
  • 4 hadronic packages (Tatina, Gheisha, Fluka, Calor)
  • Powerful geometry package including graphics
  • Hits/Digits framework
  • I/O subsystem (zebra) for all structures including geometry.
  • Used by many experiments. Still a reference!!!

Page 29: 20-30 years  of Computing in HEP

Fluka

• Originally developed by safety protection group at CERN (Stevenson + Aarnio + Ranft) 1985 ?
• Reengineered by A. Ferrari & co: Rubbia project 1990
• Probably the best for physics processes
• Simple geometry
• The reference for radiation/shielding
• Written in fortran77
• Interfaced with VMC (TFluka) and G4 (Flugg)

Page 30: 20-30 years  of Computing in HEP

Geant4

• Started in 1994
• Originally a flagship project for the move to C++
• A huge investment in manpower
• About 600000 lines of C++
• Validation process in Atlas, CMS and LHCb
• Physics processes getting better and better
• But still many limitations
  • Poor interpreter (small subset callable from python)
  • No I/O interface (geometry cannot yet be made persistent)
  • Batch style graphics

Page 31: 20-30 years  of Computing in HEP

The Virtual MC (1998)

User Code

VMC

Geometrical Modeller

G3, G3 transport

G4, G4 transport

FLUKA, FLUKA transport

Reconstruction

Visualisation

Generators

Page 32: 20-30 years  of Computing in HEP

Virtual Monte Carlo and ROOT Geometry

• The ROOT geometry package (TGeo) can be used in detector simulation, reconstruction, graphics, etc.
• TGeant3
  • Used in production – native GEANT3
  • New: TGeant3TGeo – interface to G3 using TGeo geometry
    • No modification required in the user code
    • Validated by Alice
    • Same speed or faster than TGeant3
• TGeant4
  • Used for Geant4 physics validation – G4 native geometry built after g3tog4 conversion
  • No interface yet between G4 and ROOT geometry
    • But Andrei Gheata actively working on it (expected this spring)
• TFluka
  • Old geometry interface using G4 geometry via FLUGG
  • Currently a fully validated geometry interface based on TGeo
  • Validated by the Fluka team
  • At least 2 times slower than TGeant3
• The VMC framework is currently used by Alice, Opera, Minos, NA48b, Hades, CBM and maybe STAR.

Page 33: 20-30 years  of Computing in HEP

PAW: a long saga

• First version (Jan 1985) by a committee
  • Must use GKS
  • GUI based on VT100 functionality
  • No ntuples
• June 1985: developers “abolish” the committee
  • Higz: GKS + X11
  • Row wise ntuples, then ColumnWiseNtuples (1986)
• Frozen in 1994, but still maintained by ROOT team

Page 34: 20-30 years  of Computing in HEP

Crisis: 1992 1999

Move to F90?

Attempts to use OODBMS

and commercial software

Page 35: 20-30 years  of Computing in HEP

Why not F90 after F77?

• In 1989,90,91 assumption was F90
• Some work invested in I/O with F90 (to support derived types). We could not solve this problem, because there was no formal way to parse the F90 module descriptors.
• In 1992 many forces pushing towards OO
  • CHEP Annecy, web, Next
• Crisis in Dec 1992 (at least in IT software group)
  • 1/3 in favour of f90
  • 1/3 in favour of commercial solutions
  • 1/3 in favour of C++

Page 36: 20-30 years  of Computing in HEP

1993,1994,1995

• ZOO, NextPaw, Geant3.5 proposals rejected
  • ZOO: Zebra in the OO world
  • NextPaw: Paw evolution ->C->C++
  • Geant3.5: Implement geometry package in C++
• Geant4 proposal (June 1994)
• RD45/Objectivity project (fall 1994)
• ROOT project starts (in NA49) (Jan 1995)

XCERNLIB

ROOT GEANT4

Objectivity

LHC++

LCG

1994

2006

Oracle

COOL

PAW GEANT3

Page 37: 20-30 years  of Computing in HEP

1996

• ROOT chooses the CINT interpreter
• We had been attracted by Java (Object base class, many common ideas).
• Work on object persistency based on the dictionary information (introspection).
• Design of ROOT Trees (split mode). Comparison with Objectivity
• LHC++ project starts (against ROOT)
  • see Jamie’s talk

Page 38: 20-30 years  of Computing in HEP

1997->2000

• Getting experience with OO (professional developers).
• Most users lost in f77->C++
• First signs of problems with Objectivity in Babar
• FNAL RUN II chooses ROOT
• But C++ seen as a temporary solution waiting for efficient Java at the horizon 2003.
• ROOT: automatic I/O based on dictionary, automatic schema evolution.

Page 39: 20-30 years  of Computing in HEP

Problems with commercial systems

• Licensing
• Deployment
• Vendor is late to follow with compilers & OS
• Difficult to request new functionality
• Difficult to get good people to do support and maintenance. Programmers want to develop code.

Page 40: 20-30 years  of Computing in HEP

Data Analysis Software

• 1960: Do it yourself
• 1968: SUMX
  • Histograms and data blocks described in input file. SUMX is the master.
• 1973: HBOOK
  • Histogram library. User controls the event loop and the selection.
• 1985: PAW
  • Interactive histograms/fitting. Ntuples
• 1995: ROOT
  • Same as PAW + persistency for C++ objects. C++ interpreter
• 2005: PROOF and GRID
  • Distributed analysis: client->Master->Workers (parallelism)
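The HBOOK model (a histogram library where the user owns the event loop and the selection) can be sketched as follows. This is not the HBOOK or ROOT API: Hist1D and analyse are hypothetical names, the binning is fixed, and the cut px*px+py*py > 1 is borrowed from the ntuple examples elsewhere in the talk.

```cpp
#include <cstddef>
#include <vector>

// Minimal fixed-binning 1D histogram with underflow and overflow.
class Hist1D {
public:
    Hist1D(std::size_t nbins, double xmin, double xmax)
        : bins_(nbins, 0.0), xmin_(xmin), xmax_(xmax),
          underflow_(0.0), overflow_(0.0) {}

    void fill(double x, double weight = 1.0) {
        if (x < xmin_) { underflow_ += weight; return; }
        if (x >= xmax_) { overflow_ += weight; return; }
        std::size_t i = static_cast<std::size_t>(
            (x - xmin_) / (xmax_ - xmin_) * bins_.size());
        if (i >= bins_.size()) i = bins_.size() - 1;  // guard against rounding
        bins_[i] += weight;
    }

    double content(std::size_t i) const { return bins_[i]; }
    double underflow() const { return underflow_; }
    double overflow() const { return overflow_; }

private:
    std::vector<double> bins_;
    double xmin_, xmax_, underflow_, overflow_;
};

// The user writes the event loop and the selection; the library only
// accumulates the entries that pass.
inline Hist1D analyse(const std::vector<double>& px,
                      const std::vector<double>& py) {
    Hist1D h(4, 0.0, 4.0);
    for (std::size_t i = 0; i < px.size() && i < py.size(); ++i) {
        double pt2 = px[i] * px[i] + py[i] * py[i];
        if (pt2 > 1.0) h.fill(pt2);  // selection: px*px+py*py > 1
    }
    return h;
}
```

In the SUMX model the roles are reversed: the program owns the loop and the user only describes histograms and cuts in an input file.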

Page 41: 20-30 years  of Computing in HEP

PAW

paw > set col 2

paw > hi/plot 1

paw > set col 4

paw > hi/plot 2 same

paw > ntuple/plot 100 px:py px*px+py*py>1

Simple adhoc interpreter

KUIP

Page 42: 20-30 years  of Computing in HEP

ROOT

root > h1.SetLineColor(2)

root > h2.SetLineColor(4)

root > h1.Draw()

root > h2.Draw("same")

root > t.Draw("px:py","px*px+py*py>1")

root > myobject.DoSomething(…)

C++ interpreter CINT

Page 43: 20-30 years  of Computing in HEP

Graphics & GUI evolution

• Plotters (eg GD3): Calcomp
• GKS times: screen is the memory
• PHIGS
• X11, GL: the winners
• From graphics attributes set in sequence to Objects
  • With PAW:
    • set color red
    • Now all primitives are red
  • With ROOT: attribute values do not depend on the order they are set => easier to write a graphics editor
• From Callbacks Messages->Signal&Slots
• Signal&Slots require an interpreter (see Qt and Root)
• Scriptable GUIs (a MUST)
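The signal/slot pattern itself is small enough to sketch. The version below connects typed callables at compile time, which is a deliberate simplification: the interpreter dependency mentioned on the slide arises when slots are connected by name at run time, which this sketch does not attempt. The Signal class is hypothetical and is neither the Qt nor the ROOT mechanism.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Minimal typed signal: slots are callables invoked in connection order.
template <typename... Args>
class Signal {
public:
    void connect(std::function<void(Args...)> slot) {
        slots_.push_back(std::move(slot));
    }

    // Calls every connected slot with the given arguments.
    void emit(Args... args) const {
        for (const auto& slot : slots_) slot(args...);
    }

private:
    std::vector<std::function<void(Args...)>> slots_;
};
```

Unlike a callback, the emitter does not know how many receivers exist or what they do, which is what makes a GUI built this way easy to rewire from a script.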

Page 44: 20-30 years  of Computing in HEP

Graphics and GUI systems

• Calcomp plotters 1955
• First graphics packages (CERN GD3 1970)
• HPLOT 1975……
• HPLOT -> HIGZ GD3 1978
• HPLOT -> HIGZ US Core system (fnal) 1981
• HPLOT -> HIGZ GKS 1983
• HPLOT -> HIGZ PHIGS 1985
• HPLOT -> HIGZ X11 1985
• PAW -> VT100, GKS, 1985
• PAW -> MOTIF 1991
• ROOT -> X11 1995
• ROOT -> Win32 1996, 2002
• ROOT -> Qt 2002, 2006
• ROOT -> GL 2002

Page 45: 20-30 years  of Computing in HEP

Graphics and GUI systems (cont)

• Most graphics/GUI systems that we have used have been based on International standards or de facto standards.

• All these systems had a limited life time
  • The CORE system : 5 years
  • GKS : 10 years
  • PHIGS : < 10 years
  • X11 : > 20 years
  • MOTIF : < 8 years
  • Qt : ??

• So far, no applications built directly on top of these systems were portable to the next generation. A new generation every 8, 10 years

Page 46: 20-30 years  of Computing in HEP

All items are clickable objects

from plotters

to objects

Page 47: 20-30 years  of Computing in HEP

Can take advantage of

graphics accelerators

Page 48: 20-30 years  of Computing in HEP

Extremely important

The ROOT GUI is fully scriptable

Page 49: 20-30 years  of Computing in HEP

ROOT GUI/graphics interoperability

ROOT GUI and Graphics embedded in a Windows MFC application

The ROOT GUI events can be passed to the MFC application event loop.

A ROOT canvas can be embedded in an MFC canvas

Page 50: 20-30 years  of Computing in HEP

ROOT GUI/graphics interoperability

ROOT GUI and Graphics embedded in a Qt application

The ROOT GUI events can be passed to the Qt application event loop.

A ROOT canvas can be embedded in a Qt canvas

Qt browser

Drag & Drop objects in the ROOT canvas

One can use ROOT context menus

Page 51: 20-30 years  of Computing in HEP

Interpreters & dictionaries

Compiled Library1

Compiled Library2

Dictionary1: Data & Functions

Dictionary2: Data & Functions

C++ interpreter

Python interpreter

*.h

SWIG, Boost

Compiled Library3

Dictionary3: Functions

rootcint

reflex

*.h

Page 52: 20-30 years  of Computing in HEP

Interpreter & Compiler integration

root > .x script.C
    execute file script.C

root > DoSomething(…);
    execute function DoSomething

root > .x script.C++
    compile file script.C and execute it

root > .x script.C+
    compile file script.C if file has been modified, then execute it

gROOT->ProcessLine(".L script.C+");
gROOT->ProcessLine("DoSomething(…)");
    same from compiled or interpreted code
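The difference between the plain, "+" and "++" forms comes down to a small decision rule, sketched below under the assumption that "modified" means the source is newer than the previously built library. The enum and function names are hypothetical, and the timestamps are plain integers so that the logic stays independent of any file system API.

```cpp
// How the script was requested: .x script.C, .x script.C+ or .x script.C++
enum class Mode { Interpreted, CompileIfModified, ForceCompile };

// What the system should do before executing the script.
enum class Action { Interpret, ReuseLibrary, Compile };

// Decision rule, assuming "modified" means "source newer than library".
// sourceMTime and libraryMTime are modification times in any monotonic unit;
// libraryMTime is ignored when libraryExists is false.
inline Action decide(Mode mode, bool libraryExists,
                     long sourceMTime, long libraryMTime) {
    if (mode == Mode::Interpreted) return Action::Interpret;
    if (mode == Mode::ForceCompile) return Action::Compile;
    if (!libraryExists || sourceMTime > libraryMTime) return Action::Compile;
    return Action::ReuseLibrary;
}
```

The "+" form is the one that makes compiled scripts practical in an interactive session, since an unchanged script costs no compilation time on the second run.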

Page 53: 20-30 years  of Computing in HEP

Possible Progress with Interpreters

• Eliminate the stub interface to call C/C++ functions.
  • This is already possible in CINT with C libraries.
  • It will be possible with C++ when a standard ABI will be available, otherwise compiler&linker dependent.
• If compiler is fast enough (eg C), use the interpreter only for organizing the top level.
• If next C++ provides introspection, one could eliminate
  • the header files parser
  • 95 per cent of the dictionary structure in memory

• A good argument to have the interpreted and compiled code being in the same language!

• But WHEN ???????

Page 54: 20-30 years  of Computing in HEP

Object Persistency with ROOT

Object Persistency with Objectivity

Page 55: 20-30 years  of Computing in HEP

Object Persistency

• Object Persistency has been a long snake for 10 years or more.

• Today general agreement to exploit HEP feature of having mainly read-only files and use RDBMS systems only where concurrent write access is required.

• A lot of work spent in ROOT to understand and design an efficient object streaming system (object-wise and member-wise).

• I/O system and query system must know each other.

Page 56: 20-30 years  of Computing in HEP

OODBMS (ie Objectivity)

• Hope:
  • Address one single object in a petabyte data base
  • Resolve all the object catalog issues
• Reality:
  • Licensing/installation/portability problems
  • 64 bits OID did not scale above 10 terabytes
  • Request for 128 bits OID never implemented
  • Locking problems when many users in read mode.
  • Central DB mismatch with GRID
  • No automatic schema evolution (big problem)
  • No interactivity

Page 57: 20-30 years  of Computing in HEP

OODBMS (ie Objectivity) (2)

• The OODBMS evangelists (and later RDBMS) passed many wrong messages
  • Commercial data bases will save manpower
  • Commercial data bases can be used for all types of data
  • Performance is OK
• Reality:
  • Probably more than 100 person-years invested in this exercise
  • To be compared to a few man years for ROOT I/O
  • Performance was not adequate (already spotted by ROOT/Objy comparisons early 1996)
  • Physics analysis requirements were totally ignored
  • Too much weight given to “experts in bookkeeping”

Page 58: 20-30 years  of Computing in HEP

ROOT I/O principles

• Two main I/O solutions
  • Unix-like file/directory structure with keyed objects
    • OK for histograms, geometries, mag field
  • Special Event data oriented Trees
    • With object streaming and splitting modes
    • Optimized for data analysis
• Design targeting performance and minimum file size
• Support for network files
• Exploit advantages of read-only files as much as possible
• Interface with RDBMS when locking required

Page 59: 20-30 years  of Computing in HEP

ROOT Trees

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

T.Fill()

T.GetEntry(6)

T

Memory
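The idea behind a Tree in split mode can be sketched as column-wise storage: each data member of the event goes to its own branch, Fill appends one entry to every branch, and GetEntry rebuilds one event in memory. The code below is a toy model and not the ROOT TTree API; Event, SplitTree and the member names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct Event {
    double px;
    double py;
    int ntracks;
};

// Toy "tree" in split mode: every data member lives in its own column.
class SplitTree {
public:
    // Appends one entry to every branch, like T.Fill() on the slide.
    void fill(const Event& e) {
        px_.push_back(e.px);
        py_.push_back(e.py);
        ntracks_.push_back(e.ntracks);
    }

    // Rebuilds one full entry in memory, like T.GetEntry(i) on the slide.
    Event getEntry(std::size_t i) const {
        return Event{px_[i], py_[i], ntracks_[i]};
    }

    // Reads a single branch; the other columns are never touched.
    const std::vector<double>& pxBranch() const { return px_; }

    std::size_t entries() const { return px_.size(); }

private:
    std::vector<double> px_;
    std::vector<double> py_;
    std::vector<int> ntracks_;
};
```

An analysis that plots only px reads one column instead of whole events, which is why this layout suits read-mostly analysis data.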

Page 60: 20-30 years  of Computing in HEP

Data Analysis on the GRID(s)

see Fons talk

Page 61: 20-30 years  of Computing in HEP

Some observations

Page 62: 20-30 years  of Computing in HEP

Experience with C++

• Very powerful but complex language.
• Easy to make a complex system with a lot of class dependencies. Changing one class forces a recompilation of many other classes.
• No garbage collector. Only one heap.
• ABI (Application Binary Interface) is not yet standardized: a mess on Linux/gcc (C is OK)
• No introspection: -> develop yours.
• Too much coupling between data and code.
• Templates defined statically at compilation time, ie difficult to use in an interactive environment.
• Slow compilation if abuse of templates and STL

Page 63: 20-30 years  of Computing in HEP

Missing features in C++

• Introspection
  • Not possible to compile a class from a dictionary
• Multi-heap (like Zebra divisions)

• Would require a garbage collector and a Handle type like in C++/CLI from MS

• Possibility to add one or more functions without recompiling the class, although this can be easily done in C.

• Dynamic creation of templated types

Page 64: 20-30 years  of Computing in HEP

Introspection systems

• Meta information describing all types and functions.

• Not necessary for languages like f77 having only basic types. I/O in f77 implemented via simple switch statements.

• Vital for languages supporting derived types for automatic I/O, inspectors, browsers and interpreters.

• CINT, Java, cint/root/reflex
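What a dictionary buys for a language with derived types can be shown with a hand-written one. The sketch below describes each member of a class by name, kind and byte offset, and a generic inspector then dumps any object using only that description. It is an illustration under stated assumptions: the dictionary here is written by hand, whereas the talk describes tools (rootcint, reflex) that generate the equivalent information from header files, and all type and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

enum class Kind { Int, Double };

// One dictionary entry: member name, basic type and byte offset.
struct MemberInfo {
    std::string name;
    Kind kind;
    std::size_t offset;
};

struct Vertex {
    int vtype;
    double vdz;
};

// Hand-written dictionary for Vertex (assumption: a generator would
// normally produce this from the header file).
inline std::vector<MemberInfo> vertexDictionary() {
    return {{"vtype", Kind::Int, offsetof(Vertex, vtype)},
            {"vdz", Kind::Double, offsetof(Vertex, vdz)}};
}

// Generic inspector: formats any object as text using only its dictionary.
inline std::string dump(const void* object,
                        const std::vector<MemberInfo>& dictionary) {
    const char* base = static_cast<const char*>(object);
    std::string out;
    for (const auto& member : dictionary) {
        out += member.name + "=";
        if (member.kind == Kind::Int) {
            int value;
            std::memcpy(&value, base + member.offset, sizeof value);
            out += std::to_string(value);
        } else {
            double value;
            std::memcpy(&value, base + member.offset, sizeof value);
            out += std::to_string(value);
        }
        out += ";";
    }
    return out;
}
```

The same walk over the dictionary can write members to a file instead of a string, which is the basis of automatic I/O, browsers and inspectors.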

Page 65: 20-30 years  of Computing in HEP

Why not Java or Python

• Java strong candidate in 1996->2000
• Why experiments moved to C++?
  • Speed, Geant4, ROOT ?

Java is more productive than C/C++. Use C/C++ only when speed or bare metal access is called for. Python/Ruby is more productive than Java and more pleasant to code in.

Microsoft view

Computer scientist view

Page 66: 20-30 years  of Computing in HEP

Main software problems seen by large experiments

• Move to C++ completed (well nearly!)
• Complex experiment framework
• Too many dependencies
• Difficult to install (SCRAM, CMT)
• Installation time far too long
• The wheel is reinvented many times
• Several unwanted features (eg Atlas Storegate)
• Coding conventions not followed
  • A code checker is essential
• Non documented classes and modules

Page 67: 20-30 years  of Computing in HEP

LHC software

                                   Alice     Atlas     CMS      ROOT
number of lines in header files    102282    698208    104923   153775
classes total                      1815      8000?     ???      1500
classes in dict                    1669      >4120     835      1422
lines in dict                      479849    ???       103057   698000
classes c++ lines                  577882    1524866   277923   857390
total lines classes+dict           1057731   ???       380980   1553390
total f77 lines                    736751    928574    ???      3000
directories                        540       19522     <500     958
comp time (minutes)                25        750       90       30
comp lines/s                       1196      50 (70)   71       863

Page 68: 20-30 years  of Computing in HEP

Observations

• A considerable amount of time is spent in installing software (up to one day for an expert).
• Porting to a new platform is non trivial.
• Dependency problems in case many packages must be installed.
• Only a small subset of the software is used.
• The installation may require a huge amount of disk space. Users are scared to download a new version.
• This is not fitting well with the GRID concept.
• The GRID should be used to simplify this process and not to make it more complex.

Page 69: 20-30 years  of Computing in HEP

Page 70: 20-30 years  of Computing in HEP

Page 71: 20-30 years  of Computing in HEP

%classes used

%functions used

Fraction of code really used in one program

Page 72: 20-30 years  of Computing in HEP

Consequences

• The fact that only a very small fraction of the total code base is used has important consequences.

• We must turn this apparent problem into a great feature.

• BOOT: a proposal to solve this problem. • see my talk at CHEP.

Page 73: 20-30 years  of Computing in HEP

Spare slides

Page 74: 20-30 years  of Computing in HEP

Tree Friends

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Public read

Public read

User Write

Entry # 8

Page 75: 20-30 years  of Computing in HEP

File types & Access in 5.06

Local File X.xml

RFIO Chirp

Castor Dcache Local File X.root

http rootd/xrootd

Oracle

SapDb

PgSQL

MySQL

TFile TKey/TTree TStreamerInfo

user

TSQLServer TSQLRow TSQLResult

TTreeSQL

Page 76: 20-30 years  of Computing in HEP

Typical trends with Experiments frameworks

• A few gurus design the framework
• In general adequate for batch processing (simulation and reconstruction).
• But too complex for the majority of users.
• Users find simpler individual solutions.
• Many users work in several experiments and want to use common software.
• Fights between groups.
• New management structure put in place.

Software is not a technical problem. There are many ways to implement an algorithm, a module, a complete system.

Sociology is an important issue in large collaborations.

Page 77: 20-30 years  of Computing in HEP

Experiment Frameworks: Starting point

Monolithic simulation

Analysis toolkit

Simulation toolkit

Monolithic reconstruction

user

PAW or ROOT used like PAW

Page 78: 20-30 years  of Computing in HEP

Experiment Frameworks: End point

Core framework with plug-in manager, persistency, dictionary, folders, graphics, GUI and general utilities

Simulation toolkit

Simulation & Reconstruction libraries hierarchy

User loads only what he needs

Page 79: 20-30 years  of Computing in HEP

Alice packages with > 10000 lines

398742 PDF fortran=398729,ansic=13
146414 PYTHIA6 fortran=140748,cpp=5413,ansic=153,pascal=100
128337 HLT cpp=127601,ansic=605,sh=100,csh=31
128103 ITS cpp=128010,sh=93
105763 MUON cpp=105673,sh=90
94548 DPMJET fortran=94267,cpp=281
72400 STEER cpp=72400
52443 HBTAN cpp=51260,fortran=1183
51489 TPC cpp=51479,sh=10
50932 PHOS cpp=50639,csh=293
46176 TRD cpp=46176
41998 ISAJET fortran=40483,cpp=1494,pascal=21
39407 RALICE cpp=29764,ansic=9355,sh=288
35916 EMCAL cpp=35410,fortran=383,csh=123
31820 ANALYSIS cpp=31820
27751 HERWIG fortran=27246,cpp=477,ansic=28
27025 FMD cpp=27021,sh=4
26667 TOF cpp=26667
24258 EVGEN cpp=24258
21588 HIJING fortran=21099,cpp=489
20562 JETAN cpp=19687,fortran=875
18344 RAW cpp=18344
15232 STRUCT cpp=15232
13142 PMD cpp=13142
12945 RICH cpp=12945
10966 FASTSIM cpp=10966
10944 MONITOR cpp=10944
10659 ZDC cpp=10659

1.5 million lines of code

Page 80: 20-30 years  of Computing in HEP

• Assumes BOOT already installed on your machine [email protected]

• Nothing else on the machine except the compiler (no ROOT, etc)

• Import a ROOT file containing histograms, Trees and other classes (usecase1.root)

• Browse contents of file
• Draw a histogram

BOOT: Use Case 1

Page 81: 20-30 years  of Computing in HEP

Use Case 1

Usecase1.root (2 Mbytes)

Contains references (URL) to classes in namespace ROOT

[email protected]

http://root.cern.ch/coderoot.root

This is a compressed ROOT file containing the full ROOT source tree automatically built from CVS (25 Mbytes)
+ ROOT classes dictionary DS generated by Reflex (5 Mbytes)
+ The full classes documentation objects generated by the source parser (5 Mbytes)

[email protected]

Local cache with the source of the classes really used
+ binaries for the classes or functions that are automatically generated from the interpreter (like ACLIC mechanism)

Page 82: 20-30 years  of Computing in HEP

code.root

usecase1.root

Use Case 1 pictures

