Professor
Andy Buckley, Hendrik Hoeth, Holger Schulz, Jan Eike von Seggern, Frank Siegert
December 6, 2010, CERN
Introduction
QCD is well understood where perturbation theory applies
“Soft” effects (underlying event (UE), hadronisation, . . . ) need to be modelled
Use Monte Carlo generators to do that
Models are often phenomenological ⇒ tuneable parameters (a priori unknown)
MC predictions are used to
estimate experimental efficiencies and uncertainties
test theories
⇒ generator tuning is essential to simulate events that look like real data
Professor 1 / 16
Typical tuneables
Intrinsic kT: a dirty little MC secret, important for the first 5 GeV of the boson p⊥ spectrum (peak)
Final-state radiation (FSR): assume universality → tune to e+e− data (event shapes). Parameters: αs, cutoff, starting-scale fudge factors; different shower evolutions (Q2, p⊥, . . . ) → different tunings
Hadronisation: model dependent! String or cluster constants, many parameters, separate heavy-quark fragmentation. Tune to (e+e−) identified-particle spectra
Initial-state radiation (ISR): similar to FSR, tune to hadron-collider data. Inter-jet data, e.g. Z p⊥ and dijet angular decorrelation; jet shapes are now also considered important. For PYTHIA, fitting jet shapes means more semi-dirty tricks: vary αs in the FSR of ISR particles! (Perugia 2010)
Underlying event (UE): tune to hadron-collider data, sensitive to the PDF choice. Parameters: beam-particle matter distribution, cutoff for multiple parton interactions (MPI)
Tuning through the ages (and at LHC)
Manual tunes: lots of time and manpower, or the tuning experience of a lifetime
Brute-force grid scans: tough in higher-dimensional parameter spaces
Genetic algorithms (GAMPI, Sami Kama): burn a LOT of CPU
Do it systematically:
Bin-wise interpolation of the MC generator response and χ2 minimisation (DELPHI 1995, Hamacher et al.); 2nd-order polynomials account for parameter correlations
“PROcedure For EStimating Systematic errORs”
Picks up the DELPHI idea, with much more functionality
Implemented as a Python package and a set of scripts
Actively being developed
arXiv:0907.2973, arXiv:0906.0075, arXiv:0902.4403
Tuning procedure in Professor (1D, 1Bin)
1 Random sampling: N parameter points in the n-dimensional space
2 Run the generator and fill histograms
3 For each bin: use the N points to fit an interpolation (2nd- or 3rd-order polynomial)
4 Construct the overall (now trivial) χ2 ≈ ∑bins (interpolation − data)2 / error2
5 Numerically minimise using pyMinuit or SciPy
[Sketch: bin interpolation as a function of parameter p, with the data bin value and the best-fit p marked]
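The five steps above can be sketched for a single bin and a single parameter. Everything here is a hypothetical toy (the "generator" function, sample range and data value are invented for illustration); a 1D polyfit stands in for Professor's multi-dimensional SVD fit:

```python
# Toy sketch of the bin-wise Professor procedure in 1D.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

def mc_bin(p):
    """Stand-in for the expensive generator response of one bin."""
    return 2.0 + 0.5 * p - 0.3 * p**2

# 1) Randomly sample N parameter points
N = 20
points = rng.uniform(-2.0, 2.0, N)

# 2)+3) "Run the generator" at each point, then fit a 2nd-order
#        polynomial interpolation f(b)(p) to the bin contents
values = np.array([mc_bin(p) for p in points])
f = np.poly1d(np.polyfit(points, values, deg=2))

# 4)+5) Build the now-trivial chi2 against a data bin and minimise it
data, err = 1.9, 0.1
chi2 = lambda p: (f(p) - data)**2 / err**2
best = minimize_scalar(chi2, bounds=(-2, 2), method="bounded")
print(best.x, chi2(best.x))
```

Since the interpolation is cheap to evaluate, the minimisation takes a fraction of a second even though each real generator run would take hours.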
Professor setup
This is how to set up the just-released version 1.1.0:
source ~abuckley/public/tutorialenv.sh
source /afs/cern.ch/sw/lcg/external/MCGenerators/professor/1.1.0/x86_64-slc5-gcc43-opt/setup.sh
This also sources some necessary tools for plotting from Rivet
Warm up
We skip the parameter sampling and Monte Carlo generation steps and download some previously produced files
How do we know that our produced Monte Carlo histograms cover the data?
We plot envelopes!
Go to the tutorial page, download the LEP tarball and do the envelopes exercise: http://projects.hepforge.org/professor/docs.sphinx/tutorial.html
In the mc folder there are many subdirectories, each containing a parameter file (used_params) and one file with histograms (out.aida)
[Figure: Transverse region Nchg density 〈d2Nchg/dηdφ〉 vs. p⊥ (leading track) (√s = 7 TeV): envelope (CL 100.0 %) of the MC runs compared to ATLAS data]
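The envelope check itself is simple. A minimal sketch, assuming each MC run's histogram has been read into an array of bin contents (the numbers below are invented):

```python
# Per-bin min/max envelope over MC runs, and a coverage check vs. data.
import numpy as np

mc_runs = np.array([
    [1.0, 2.0, 3.0],
    [0.8, 2.5, 2.7],
    [1.2, 1.9, 3.3],
])                               # rows = generator runs, columns = bins
data = np.array([1.1, 2.2, 3.0])  # reference data bin contents

lo = mc_runs.min(axis=0)         # lower edge of the envelope
hi = mc_runs.max(axis=0)         # upper edge of the envelope
covered = np.all((data >= lo) & (data <= hi))
print(covered)                   # True: every data bin is inside the envelope
```

If any data bin falls outside the envelope, no tune built from these runs can describe it, so the sampling ranges need to be widened.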
Interpolations
To build interpolations, we need to select which generator runs to use
There is only a requirement on the minimal number of runs to use, Nmin(n) = 1 + n + n(n + 1)/2 for interpolations using quadratic polynomials, with n the dimensionality of the parameter space
We have N > Nmin in this tutorial ⇒ we can build many interpolations
Task
Create 11 run combinations, one with all N available runs and 10 with N − 15 runs, using e.g. prof-runcombs --mcdir mc -c 0:1 -c 15:10; this creates a file runcombs.dat
Create the corresponding interpolations for all available observables using e.g. prof-interpolate --datadir . --runs runcombs.dat
Interactivity
Key feature of Professor:
1 we are parameterising a very expensive function
2 input to that parameterisation can be trivially parallelised
Can parallelise parameterisation (for many run combinations)
Optimisation, too
Parameterisation produces a fast, analytic “pseudo-generator”
⇒ Can get a good approximation of what a generator will do when run for many hours/days with particular params, in < 1 second!
Why not make an interactive MC simulator?
prof-I
Usage: prof-I --datadir .
Observables and Weights
This is what Professor minimises: χ2(p) = ∑O ∑b∈O wb (f(b)(p) − Rb)2 / Δb2
Slightly more art than science
Garbage in, garbage out
Use the weights wb to:
emphasize certain observables
emphasize certain bins of an observable
switch off single bins (e.g. the MinBias region for Jimmy/Herwig)
No MinBias physics in Jimmy/Herwig
Cannot get the first 3 bins or so right
Transition from MinBias- to UE-type physics
⇒ Exclude these bins from the Professor minimisation
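The weighted χ2 formula above is straightforward to write down; here is a toy sketch (all numbers invented) in which the first three MinBias-dominated bins are switched off with zero weights:

```python
# Weighted chi2 of one observable: sum over bins of w * (f - R)^2 / err^2.
import numpy as np

f_p   = np.array([3.1, 2.4, 1.9, 1.50, 1.20])  # interpolation f(b)(p)
ref   = np.array([5.0, 3.0, 2.0, 1.45, 1.25])  # reference data R_b
delta = np.array([0.2, 0.2, 0.1, 0.05, 0.05])  # error per bin
w     = np.array([0.0, 0.0, 0.0, 1.0, 1.0])    # first 3 bins excluded

chi2 = np.sum(w * (f_p - ref)**2 / delta**2)
print(chi2)   # only the last two bins contribute: 1.0 + 1.0 = 2.0
```

Without the zero weights, the badly modelled MinBias bins would dominate the χ2 and drag the tune away from the UE-sensitive region.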
[Figure: Nch (transverse) for min-bias: CDF data compared to the Jimmy/Herwig MC09 and AUET1 (LO*) tunes, with MC/data ratio vs. plead⊥ / GeV]
Tuning to data
Now that we have calculated some interpolations, we can move on to the tuning stage, prof-tune
prof-tune needs to know at least which observables to tune to and which interpolations to use
Example: prof-tune --datadir . --weights weights --runs runcombs.dat
For a first look, it should be sufficient to tune using the observables/weights file supplied in the tarball, weights
Go to the tutorial page and do the “Tuning to reference data” exercise
This should give you 11 slightly different tuning results, which we will look at more systematically in the next steps
To look at the obtained results, use e.g. prof-showminresults tunes/results.pkl
Histograms at the tuned parameter point
prof-tune writes out the histograms as calculated from the interpolations for each tuning result
We can use these to get a very good approximation of what the generator would do
Task
Use Rivet’s rivet-mkhtml to produce comparison plots for some of the AIDA files in ipolhistos/histos-tuneXYZ
Some tune param spreads
Due to oversampling we can create semi-independent interpolations that yield slightly different tuned parameter points.
We can project the obtained results onto the parameter axes to investigate the spread of the tuning results.
Go to the tutorial page and do the “Visualise result scatter” exercise
[Figure: χ2/Ndf vs. the tuned values of PARP(82), PARP(90) and PARP(71) for the different run combinations]
An informal picture of how well constrained a parameter is
We are happy if it looks like a vertical line
Sensitivities
How do we select which (existing) data to tune to?
Lots of thinking, reading and consultation of generator authors.
Analyse the sensitivity of observables to shifts in parameter space: “How much does the bin content change if I vary parameter i?”
[Sketch: bin interpolation f(b) evaluated at a point p and at a shifted point p + ε]
[Figure: Transverse region Nch density 〈d2Nch/dηdφ〉 vs. plead⊥ (√s = 7 TeV), p⊥ > 500 MeV, |η| < 2.5: ATLAS data (preliminary) compared to HERWIG MC09 and HERWIG AUET1 with MRST LO∗, CTEQ6L1 and CTEQ6.6 PDFs]
[Figure: relative sensitivities Srel_i vs. p⊥ (leading track) for the parameters PRRAD and PTJIM0, with the experimental uncertainty (EXP) shown for comparison]
Statistically-driven tune error bands: errors from run-combination sampling
[Figure: transverse region charged particle density, 〈ptrack⊥〉/GeV vs. p⊥(leading track)/GeV, with 68% and 95% CL belts; right panel compared to Jimmy pseudodata, 1M events]
⇒ turned the parameter spread into uncertainty belts. The most complete procedure for full systematics is in the Les Houches proceedings (arXiv:1003.1643). The full treatment requires asymmetric covariance sampling.
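The basic idea of turning the spread of tunes into CL belts can be sketched with percentiles. This is a toy with invented predictions (11 run-combination tunes, 4 bins), not the full asymmetric covariance treatment:

```python
# Central-interval bands per bin from the spread of tune predictions.
import numpy as np

rng = np.random.default_rng(1)
# toy: 11 run-combination tunes x 4 bins of interpolated predictions
preds = rng.normal(loc=[1.0, 2.0, 3.0, 4.0], scale=0.1, size=(11, 4))

band68 = np.percentile(preds, [16, 84], axis=0)      # CL = 68% belt
band95 = np.percentile(preds, [2.5, 97.5], axis=0)   # CL = 95% belt

# sanity check: the 95% belt encloses the 68% belt bin by bin
print(np.all(band95[0] <= band68[0]) and np.all(band68[1] <= band95[1]))
```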
Checking parameterisation: line-scans
Sample parameter points from a straight hyperline through the χ2 valley
Calculate the χ2 of the parameterisation and compare it with the “true” MC response
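A line scan only needs parameter points of the form p(t) = (1 − t) p_a + t p_b. A toy sketch with invented χ2 functions (the "interpolated" one deviates from the "true" one by a constant offset, standing in for real parameterisation error):

```python
# Line-scan sketch: compare parameterised and "true" chi2 along a line.
import numpy as np

p_a = np.array([0.0, 0.0])          # one end of the scan line
p_b = np.array([1.0, 2.0])          # other end of the scan line
line = [(1 - t) * p_a + t * p_b for t in np.linspace(0.0, 1.0, 11)]

true_chi2 = lambda p: (p[0] - 0.5)**2 + (p[1] - 1.0)**2         # toy "MC"
ipol_chi2 = lambda p: (p[0] - 0.5)**2 + (p[1] - 1.0)**2 + 0.01  # toy fit

max_dev = max(abs(true_chi2(p) - ipol_chi2(p)) for p in line)
print(max_dev)   # small, flat deviation: the parameterisation is trusted
```

A large or strongly varying deviation along the scan would signal that the polynomial parameterisation is not adequate in that region of parameter space.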
Eigentunes
Pick the extremal points of the χ2 contour hyper-ellipsoid as representative tunes, cf. Hessian PDF errors.
⇒ the obtained Eigentunes stay consistent and respect correlations
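For a quadratic Δχ2 around the best point p0 with Hessian H, the extremal points of the Δχ2 = T contour lie along the eigenvectors of H at p0 ± sqrt(T / (λi/2)) ei. A toy sketch with an invented 2-parameter Hessian:

```python
# Eigentune sketch: extremal points of a quadratic delta-chi2 contour.
import numpy as np

p0 = np.array([1.0, 2.0])            # best-fit (central tune) point
H  = np.array([[4.0, 1.0],
               [1.0, 2.0]])          # toy chi2 Hessian (positive definite)
T  = 1.0                             # chosen delta-chi2 tolerance

dchi2 = lambda p: 0.5 * (p - p0) @ H @ (p - p0)  # quadratic approximation

lam, vec = np.linalg.eigh(H)         # eigenvalues and eigen-directions
eigentunes = [p0 + s * np.sqrt(T / (0.5 * lam[i])) * vec[:, i]
              for i in range(len(lam)) for s in (+1, -1)]

# every eigentune sits exactly on the delta-chi2 = T contour
print(all(abs(dchi2(p) - T) < 1e-9 for p in eigentunes))
```

Because the shifts follow the eigen-directions, parameter correlations are respected: each eigentune pair varies a whole correlated combination of parameters, not a single one in isolation.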
[Sketch: rotated eigenbasis (p′1, p′2) in the (p1, p2) parameter plane. Figure: Transverse region Nchg density 〈d2Nchg/dηdφ〉 vs. p⊥ (leading track): ATLAS data compared to AMBT1 (central tune) and EigenTunes 1± and 3±, with MC/data ratio]
Backup
Intrinsic k⊥
[Figure: dσ/dp⊥(Z) in Z → e+e− events (pb/GeV) vs. p⊥(Z)/GeV: CDF data compared to MRST LO∗ with intrinsic k⊥ = 0.0, 0.5, 1.2 and 1.5 GeV, with MC/data ratio]
2nd order polynomial includes lowest-order correlations between parameters
MCb(p) ≈ f(b)(p) = α0(b) + ∑i βi(b) pi + ∑i≤j γij(b) pi pj
Now use N generator runs, i.e. N different parameter sets (x, y):

v = P c

with
v = (v1, v2, . . . , vN)ᵀ   (N values, i.e. N bin contents)
P = the N × 6 matrix with rows (1, xk, yk, xk², xk yk, yk²)   (N sampled parameter sets)
c = (α0, βx, βy, γxx, γxy, γyy)ᵀ   (coefficients)

Therefore: cb = I[P] v, where I is the pseudoinverse operator.
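The pseudoinverse solve can be sketched directly. The slides use scipy.linalg; numpy.linalg.pinv (also SVD-based) is used here for brevity, and the bin contents come from an invented quadratic so the recovered coefficients can be checked:

```python
# Build the design matrix P for 2 parameters (x, y) and solve c = pinv(P) v.
import numpy as np

rng = np.random.default_rng(2)
n_runs = 12                          # oversampled: more than 6 coefficients
x = rng.uniform(-1, 1, n_runs)
y = rng.uniform(-1, 1, n_runs)

# rows: (1, x, y, x^2, x*y, y^2) for each sampled parameter set
P = np.column_stack([np.ones(n_runs), x, y, x**2, x * y, y**2])

# toy bin contents from a known quadratic: v = 1 + 2x - y + 0.5 x^2
v = 1 + 2 * x - y + 0.5 * x**2

c = np.linalg.pinv(P) @ v            # SVD-based pseudoinverse solve
print(np.round(c, 6))                # recovers (alpha0, bx, by, gxx, gxy, gyy)
```

With oversampling (N larger than the number of coefficients) this is a least-squares fit, so statistical noise in the generator runs is averaged down rather than interpolated exactly.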
cb = I[P] v
Use Singular Value Decomposition (SVD), a generalised diagonalisation for all normal matrices M: M = U Σ V∗
Method available in scipy.linalg
Minimal number of runs = number of coefficients in cb:
Nmin(n) = 1 + n + n(n + 1)/2 + n(n + 1)(n + 2)/6, where the last term applies to cubic interpolations only
Oversampling by a factor of three has proven to be much better
Num params, n   Nmin (2nd order)   Nmin (3rd order)
1               3                  4
2               6                  10
4               15                 35
6               28                 84
8               45                 165
9               55                 220
10              66                 286
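The table above follows directly from counting polynomial coefficients; a minimal sketch that reproduces it:

```python
# Minimal run counts for quadratic and cubic interpolations in n parameters.
def nmin2(n):
    """Coefficients of a quadratic in n variables: 1 + n + n(n+1)/2."""
    return 1 + n + n * (n + 1) // 2

def nmin3(n):
    """Add the pure-cubic terms: n(n+1)(n+2)/6 more coefficients."""
    return nmin2(n) + n * (n + 1) * (n + 2) // 6

for n in (1, 2, 4, 6, 8, 9, 10):
    print(n, nmin2(n), nmin3(n))
```

With the recommended oversampling factor of three, a 10-parameter quadratic tune thus needs on the order of 200 generator runs.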