Professor
Andy Buckley, Hendrik Hoeth, Holger Schulz, Jan Eike von Seggern, Frank Siegert
December 6, 2010, CERN
Introduction
QCD is well understood where perturbation theory applies
“Soft” effects (underlying event (UE), hadronisation, . . . ) need to be modelled
Use Monte Carlo generators to do that
Models are often phenomenological ⇒ tuneable parameters (a priori unknown)
MC predictions are used to
estimate experimental efficiencies and uncertainties
test theories
⇒ generator tuning is essential to simulate events that look like real data
Professor 1 / 16
Typical tuneables
Intrinsic kT: a dirty little MC secret, important for the first 5 GeV of the boson p⊥ spectrum (peak)
Final-state radiation (FSR): assume universality → tune to e+e− data (event shapes). Parameters: αs, cutoff, starting-scale fudge factors; different shower evolutions (Q2, p⊥, . . . ) → different tunings
Hadronisation: model dependent! String or cluster constants, many parameters, separate heavy-quark fragmentation. Tune to (e+e−) identified-particle spectra
Initial-state radiation (ISR): similar to FSR, tune to hadron-collider data. Inter-jet data, e.g. Z p⊥ and dijet angular decorrelation; jet shapes are now also considered important. For PYTHIA, fitting jet shapes means more semi-dirty tricks: vary αs in the FSR of ISR particles! (Perugia 2010)
Underlying event (UE): tune to hadron-collider data, sensitive to the PDF choice. Parameters: beam-particle matter distribution, cutoff for multiple parton interactions (MPI)
Tuning through the ages (and at LHC)
Manual tunes: lots of time and manpower, or the tuning experience of a lifetime
Brute-force grid scans: tough in higher-dimensional parameter spaces
Genetic algorithms (GAMPI, Sami Kama): burn a LOT of CPU
Do it systematically:
Bin-wise interpolation of the MC generator response and χ2 minimisation (DELPHI 1995, Hamacher et al.); 2nd-order polynomials account for parameter correlations
“PROcedure For EStimating Systematic errORs”
Picks up the DELPHI idea, with much more functionality
Implemented as a Python package and a set of scripts
Actively being developed
arXiv:0907.2973, arXiv:0906.0075, arXiv:0902.4403
Tuning procedure in Professor (1D, 1Bin)
1 Random sampling: N parameter points in the n-dimensional space
2 Run the generator and fill histograms
3 For each bin: use the N points to fit an interpolation (2nd- or 3rd-order polynomial)
4 Construct the overall (now trivial) χ2 ≈ ∑bins (interpolation − data)2 / error2
5 Numerically minimise using pyMinuit or SciPy
[Sketch: bin interpolation as a function of parameter p, with the data bin value and the best-fit p marked]
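The five steps above can be sketched for a single bin and a single parameter. Everything here is a hypothetical toy (the "generator" function, sample range and data value are invented for illustration); a 1D polyfit stands in for Professor's multi-dimensional SVD fit:

```python
# Toy sketch of the bin-wise Professor procedure in 1D.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

def mc_bin(p):
    """Stand-in for the expensive generator response of one bin."""
    return 2.0 + 0.5 * p - 0.3 * p**2

# 1) Randomly sample N parameter points
N = 20
points = rng.uniform(-2.0, 2.0, N)

# 2)+3) "Run the generator" at each point, then fit a 2nd-order
#        polynomial interpolation f(b)(p) to the bin contents
values = np.array([mc_bin(p) for p in points])
f = np.poly1d(np.polyfit(points, values, deg=2))

# 4)+5) Build the now-trivial chi2 against a data bin and minimise it
data, err = 1.9, 0.1
chi2 = lambda p: (f(p) - data)**2 / err**2
best = minimize_scalar(chi2, bounds=(-2, 2), method="bounded")
print(best.x, chi2(best.x))
```

Since the interpolation is cheap to evaluate, the minimisation takes a fraction of a second even though each real generator run would take hours.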
Professor setup
This is how to set up the just-released version 1.1.0:
source ~abuckley/public/tutorialenv.sh
source /afs/cern.ch/sw/lcg/external/MCGenerators/professor/1.1.0/x86_64-slc5-gcc43-opt/setup.sh
This also sources some necessary tools for plotting from Rivet
Warm up
We skip the parameter sampling and Monte Carlo generation steps and download some previously produced files
How do we know that our produced Monte Carlo histograms cover the data?
We plot envelopes!
Go to the tutorial page, download the LEP tarball and do the envelopes exercise: http://projects.hepforge.org/professor/docs.sphinx/tutorial.html
In the mc folder there are many subdirectories, each containing a parameter file (used_params) and one file with histograms (out.aida)
[Figure: Transverse region Nchg density 〈d2Nchg/dηdφ〉 vs. p⊥ (leading track) (√s = 7 TeV): envelope (CL 100.0 %) of the MC runs compared to ATLAS data]
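The envelope check itself is simple. A minimal sketch, assuming each MC run's histogram has been read into an array of bin contents (the numbers below are invented):

```python
# Per-bin min/max envelope over MC runs, and a coverage check vs. data.
import numpy as np

mc_runs = np.array([
    [1.0, 2.0, 3.0],
    [0.8, 2.5, 2.7],
    [1.2, 1.9, 3.3],
])                               # rows = generator runs, columns = bins
data = np.array([1.1, 2.2, 3.0])  # reference data bin contents

lo = mc_runs.min(axis=0)         # lower edge of the envelope
hi = mc_runs.max(axis=0)         # upper edge of the envelope
covered = np.all((data >= lo) & (data <= hi))
print(covered)                   # True: every data bin is inside the envelope
```

If any data bin falls outside the envelope, no tune built from these runs can describe it, so the sampling ranges need to be widened.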
Interpolations
To build interpolations, we need to select which generator runs to use
There is only a requirement on the minimal number of runs to use, Nmin(n) = 1 + n + n(n + 1)/2 for interpolations using quadratic polynomials, with n the dimensionality of the parameter space
We have N > Nmin in this tutorial ⇒ we can build many interpolations
Task
Create 11 run combinations, one with all N available runs and 10 with N − 15 runs, using e.g. prof-runcombs --mcdir mc -c 0:1 -c 15:10; this creates a file runcombs.dat
Create the corresponding interpolations for all available observables using e.g. prof-interpolate --datadir . --runs runcombs.dat
Interactivity
Key feature of Professor:
1 we are parameterising a very expensive function
2 input to that parameterisation can be trivially parallelised
Can parallelise parameterisation (for many run combinations)
Optimisation, too
Parameterisation produces a fast, analytic “pseudo-generator”
⇒ Can get a good approximation of what a generator will do when run for many hours/days with particular params, in < 1 second!
Why not make an interactive MC simulator?
prof-I
Usage: prof-I --datadir .
Observables and Weights
This is what Professor minimises: χ2(p) = ∑O ∑b∈O wb (f(b)(p) − Rb)2 / Δb2
Slightly more art than science
Garbage in, garbage out
Use the weights wb to:
emphasize certain observables
emphasize certain bins of an observable
switch off single bins (e.g. the MinBias region for Jimmy/Herwig)
No MinBias physics in Jimmy/Herwig
Cannot get the first 3 bins or so right
Transition from MinBias- to UE-type physics
⇒ Exclude these bins from the Professor minimisation
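The weighted χ2 formula above is straightforward to write down; here is a toy sketch (all numbers invented) in which the first three MinBias-dominated bins are switched off with zero weights:

```python
# Weighted chi2 of one observable: sum over bins of w * (f - R)^2 / err^2.
import numpy as np

f_p   = np.array([3.1, 2.4, 1.9, 1.50, 1.20])  # interpolation f(b)(p)
ref   = np.array([5.0, 3.0, 2.0, 1.45, 1.25])  # reference data R_b
delta = np.array([0.2, 0.2, 0.1, 0.05, 0.05])  # error per bin
w     = np.array([0.0, 0.0, 0.0, 1.0, 1.0])    # first 3 bins excluded

chi2 = np.sum(w * (f_p - ref)**2 / delta**2)
print(chi2)   # only the last two bins contribute: 1.0 + 1.0 = 2.0
```

Without the zero weights, the badly modelled MinBias bins would dominate the χ2 and drag the tune away from the UE-sensitive region.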
[Figure: Nch (transverse) for min-bias: CDF data compared to the Jimmy/Herwig MC09 and AUET1 (LO*) tunes, with MC/data ratio vs. plead⊥ / GeV]
Tuning to data
Now that we have calculated some interpolations, we can move on to the tuning stage, prof-tune
prof-tune needs to know at least which observables to tune to and which interpolations to use
Example: prof-tune --datadir . --weights weights --runs runcombs.dat
For a first look, it should be sufficient to tune using the observables/weights file supplied in the tarball, weights
Go to the tutorial page and do the “Tuning to reference data” exercise
This should give you 11 slightly different tuning results, which we will look at more systematically in the next steps
To look at the obtained results, use e.g. prof-showminresults tunes/results.pkl
Histograms at the tuned parameter point
prof-tune writes out the histograms as calculated from the interpolations for each tuning result
We can use these to get a very good approximation of what the generator would do
Task
Use Rivet’s rivet-mkhtml to produce comparison plots for some of the AIDA files in ipolhistos/histos-tuneXYZ
Some tune param spreads
Due to oversampling we can create semi-independent interpolations that yield slightly different tuned parameter points.
We can project the obtained results onto the parameter axes to investigate the spread of the tuning results.
Go to the tutorial page and do the “Visualise result scatter” exercise
[Figure: χ2/Ndf vs. the tuned values of PARP(82), PARP(90) and PARP(71) for the different run combinations]
An informal picture of how well constrained a parameter is
We are happy if it looks like a vertical line
Sensitivities
How do we select which (existing) data to tune to?
Lots of thinking, reading and consultation of generator authors.
Analyse the sensitivity of observables to shifts in parameter space: “How much does the bin content change if I vary parameter i?”
[Sketch: bin interpolation f(b) evaluated at a point p and at a shifted point p + ε]
[Figure: Transverse region Nch density 〈d2Nch/dηdφ〉 vs. plead⊥ (√s = 7 TeV), p⊥ > 500 MeV, |η| < 2.5: ATLAS data (preliminary) compared to HERWIG MC09 and HERWIG AUET1 with MRST LO∗, CTEQ6L1 and CTEQ6.6 PDFs]
[Figure: relative sensitivities Srel_i vs. p⊥ (leading track) for the parameters PRRAD and PTJIM0, with the experimental uncertainty (EXP) shown for comparison]
Statistically-driven tune error bands: errors from run-combination sampling
[Figure: transverse region charged particle density, 〈ptrack⊥〉/GeV vs. p⊥(leading track)/GeV, with 68% and 95% CL belts; right panel compared to Jimmy pseudodata, 1M events]
⇒ turned the parameter spread into uncertainty belts. The most complete procedure for full systematics is in the Les Houches proceedings (arXiv:1003.1643). The full treatment requires asymmetric covariance sampling.
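The basic idea of turning the spread of tunes into CL belts can be sketched with percentiles. This is a toy with invented predictions (11 run-combination tunes, 4 bins), not the full asymmetric covariance treatment:

```python
# Central-interval bands per bin from the spread of tune predictions.
import numpy as np

rng = np.random.default_rng(1)
# toy: 11 run-combination tunes x 4 bins of interpolated predictions
preds = rng.normal(loc=[1.0, 2.0, 3.0, 4.0], scale=0.1, size=(11, 4))

band68 = np.percentile(preds, [16, 84], axis=0)      # CL = 68% belt
band95 = np.percentile(preds, [2.5, 97.5], axis=0)   # CL = 95% belt

# sanity check: the 95% belt encloses the 68% belt bin by bin
print(np.all(band95[0] <= band68[0]) and np.all(band68[1] <= band95[1]))
```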
Checking parameterisation: line-scans
Sample parameter points from a straight hyperline through the χ2 valley
Calculate the χ2 of the parameterisation and compare it with the “true” MC response
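A line scan only needs parameter points of the form p(t) = (1 − t) p_a + t p_b. A toy sketch with invented χ2 functions (the "interpolated" one deviates from the "true" one by a constant offset, standing in for real parameterisation error):

```python
# Line-scan sketch: compare parameterised and "true" chi2 along a line.
import numpy as np

p_a = np.array([0.0, 0.0])          # one end of the scan line
p_b = np.array([1.0, 2.0])          # other end of the scan line
line = [(1 - t) * p_a + t * p_b for t in np.linspace(0.0, 1.0, 11)]

true_chi2 = lambda p: (p[0] - 0.5)**2 + (p[1] - 1.0)**2         # toy "MC"
ipol_chi2 = lambda p: (p[0] - 0.5)**2 + (p[1] - 1.0)**2 + 0.01  # toy fit

max_dev = max(abs(true_chi2(p) - ipol_chi2(p)) for p in line)
print(max_dev)   # small, flat deviation: the parameterisation is trusted
```

A large or strongly varying deviation along the scan would signal that the polynomial parameterisation is not adequate in that region of parameter space.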
Eigentunes
Pick the extremal points of the χ2 contour hyper-ellipsoid as representative tunes, cf. Hessian PDF errors.
⇒ the obtained Eigentunes stay consistent and respect correlations
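For a quadratic Δχ2 around the best point p0 with Hessian H, the extremal points of the Δχ2 = T contour lie along the eigenvectors of H at p0 ± sqrt(T / (λi/2)) ei. A toy sketch with an invented 2-parameter Hessian:

```python
# Eigentune sketch: extremal points of a quadratic delta-chi2 contour.
import numpy as np

p0 = np.array([1.0, 2.0])            # best-fit (central tune) point
H  = np.array([[4.0, 1.0],
               [1.0, 2.0]])          # toy chi2 Hessian (positive definite)
T  = 1.0                             # chosen delta-chi2 tolerance

dchi2 = lambda p: 0.5 * (p - p0) @ H @ (p - p0)  # quadratic approximation

lam, vec = np.linalg.eigh(H)         # eigenvalues and eigen-directions
eigentunes = [p0 + s * np.sqrt(T / (0.5 * lam[i])) * vec[:, i]
              for i in range(len(lam)) for s in (+1, -1)]

# every eigentune sits exactly on the delta-chi2 = T contour
print(all(abs(dchi2(p) - T) < 1e-9 for p in eigentunes))
```

Because the shifts follow the eigen-directions, parameter correlations are respected: each eigentune pair varies a whole correlated combination of parameters, not a single one in isolation.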
[Sketch: rotated eigenbasis (p′1, p′2) in the (p1, p2) parameter plane. Figure: Transverse region Nchg density 〈d2Nchg/dηdφ〉 vs. p⊥ (leading track): ATLAS data compared to AMBT1 (central tune) and EigenTunes 1± and 3±, with MC/data ratio]
Backup
Intrinsic k⊥
[Figure: dσ/dp⊥(Z) in Z → e+e− events (pb/GeV) vs. p⊥(Z)/GeV: CDF data compared to MRST LO∗ with intrinsic k⊥ = 0.0, 0.5, 1.2 and 1.5 GeV, with MC/data ratio]
2nd order polynomial includes lowest-order correlations between parameters
MCb(p) ≈ f(b)(p) = α0(b) + ∑i βi(b) pi + ∑i≤j γij(b) pi pj
Now use N generator runs, i.e. N different parameter sets (x, y):

v = P c

with
v = (v1, v2, . . . , vN)ᵀ   (N values, i.e. N bin contents)
P = the N × 6 matrix with rows (1, xk, yk, xk², xk yk, yk²)   (N sampled parameter sets)
c = (α0, βx, βy, γxx, γxy, γyy)ᵀ   (coefficients)

Therefore: cb = I[P] v, where I is the pseudoinverse operator.
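The pseudoinverse solve can be sketched directly. The slides use scipy.linalg; numpy.linalg.pinv (also SVD-based) is used here for brevity, and the bin contents come from an invented quadratic so the recovered coefficients can be checked:

```python
# Build the design matrix P for 2 parameters (x, y) and solve c = pinv(P) v.
import numpy as np

rng = np.random.default_rng(2)
n_runs = 12                          # oversampled: more than 6 coefficients
x = rng.uniform(-1, 1, n_runs)
y = rng.uniform(-1, 1, n_runs)

# rows: (1, x, y, x^2, x*y, y^2) for each sampled parameter set
P = np.column_stack([np.ones(n_runs), x, y, x**2, x * y, y**2])

# toy bin contents from a known quadratic: v = 1 + 2x - y + 0.5 x^2
v = 1 + 2 * x - y + 0.5 * x**2

c = np.linalg.pinv(P) @ v            # SVD-based pseudoinverse solve
print(np.round(c, 6))                # recovers (alpha0, bx, by, gxx, gxy, gyy)
```

With oversampling (N larger than the number of coefficients) this is a least-squares fit, so statistical noise in the generator runs is averaged down rather than interpolated exactly.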
cb = I[P] v
Use Singular Value Decomposition (SVD), a generalised diagonalisation for all normal matrices M: M = U Σ V∗
Method available in scipy.linalg
Minimal number of runs = number of coefficients in cb:
Nmin(n) = 1 + n + n(n + 1)/2 + n(n + 1)(n + 2)/6, where the last term applies to cubic interpolations only
Oversampling by a factor of three has proven to be much better
Num params, n   Nmin (2nd order)   Nmin (3rd order)
1               3                  4
2               6                  10
4               15                 35
6               28                 84
8               45                 165
9               55                 220
10              66                 286
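The table above follows directly from counting polynomial coefficients; a minimal sketch that reproduces it:

```python
# Minimal run counts for quadratic and cubic interpolations in n parameters.
def nmin2(n):
    """Coefficients of a quadratic in n variables: 1 + n + n(n+1)/2."""
    return 1 + n + n * (n + 1) // 2

def nmin3(n):
    """Add the pure-cubic terms: n(n+1)(n+2)/6 more coefficients."""
    return nmin2(n) + n * (n + 1) * (n + 2) // 6

for n in (1, 2, 4, 6, 8, 9, 10):
    print(n, nmin2(n), nmin3(n))
```

With the recommended oversampling factor of three, a 10-parameter quadratic tune thus needs on the order of 200 generator runs.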