Heiner Tholen, University of Hamburg
MINIAOD
PAT Tutorial
June/July 2015
• what is miniaod?
  • event format / data tier
  • small event size
  • ... use other slides from Rizzi
• hands on
  • miniaod objects and collections
  • example
what is miniaod?
slides taken from https://indico.cern.ch/event/326181/contribution/4
by Giovanni Petrucciani
MiniAOD: overall idea
• Compact high-level data tier (30-50 kB/event), designed to cover the mainstream analyses and to replace the need for big group-level pattuples or ntuples
25.06.14 A. Rizzi (Pisa), G. Petrucciani (CERN) 3
[Figure: two workflow diagrams.
 Typical Run 1 workflow: RECO → AOD → group ntuples (CERN CMGTuple, B2G PATuple, MIT NTuple, ETH EDMNTuple, …) → analysis tuples (Analysis1 tuple, Analysis2 tuple, Analysis3 tuple, …, Analysis5 tuple).
 MiniAOD workflow: RECO → AOD → MiniAOD → analysis tuples (if needed?).
 Labels: general purpose ntuples: 30-100 kB/ev, tot. O(100) TB; analysis tuples: very analysis-specific, usually flat trees, ~1 TB; timescales: ~1 month, 1-2 days; MiniAOD: produced promptly, and again when new high level calibrations or recipes become available.]
MiniAOD event content
• High level physics objects (e.g. leptons, jets, …)
  – with detailed information, e.g. to allow retuning of IDs
  – with some loose preselection if needed to fit the budget
• All PF Candidates, in a smartly packed format
  – allow re-computing isolations, re-clustering of jets, jet substructure analysis, event interpretation, …
  – including track parameters, to also re-run b-tagging
• Trigger information: bits, and 4-vectors of objects
• MC truth: “interesting” genParticles, plus all status==1 genParticles (packed), plus GEN/LHE/PDF/PU info
• Other small footprint stuff (e.g. vertices, MET filter flags)
28.05.14 A. Rizzi (Pisa), G. Petrucciani (CERN) 5 5
MC information
• The full list of genParticles is too large to keep, so we follow a two-prong approach:
• prunedGenParticles: the interesting particles
  – select leptons, photons, EWK bosons, top quarks, high pT partons, heavy flavour quarks & hadrons, …
  – saved with full information, mother-daughter links, … (reconnecting them if intermediate particles are dropped)
• packedGenParticles: all status == 1 particles
  – useful e.g. to remake GenJets with different clustering
  – use a lossy compressed format like for PF candidates
  – include a link to the mother, or closest ancestor available in the prunedGenParticles collection (e.g. for flavour history)
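The "closest ancestor available" link can be pictured as a walk up the mother chain until a particle that survived the pruning is found. The following is only a toy sketch in plain Python, not the CMSSW implementation: the dictionary `mothers`, the set `kept` and the function name are hypothetical and chosen for illustration.

```python
def closest_kept_ancestor(index, mothers, kept):
    """Walk up the mother chain of particle `index` and return the first
    ancestor that is in `kept`, or None if no ancestor was kept."""
    current = mothers[index]
    while current is not None:
        if current in kept:
            return current
        current = mothers[current]
    return None


# Toy event (hypothetical indices): 0 = W, 1 = tau from the W,
# 2 = dropped intermediate particle, 3 = stable pion, 4 = particle without mother.
mothers = {0: None, 1: 0, 2: 1, 3: 2, 4: None}
kept = {0, 1}  # particles that survived the pruning

print(closest_kept_ancestor(3, mothers, kept))  # 1
```

The pion (index 3) skips the dropped particle 2 and points at the tau (index 1), which is the kind of link that makes flavour history studies possible on pruned collections.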
Cross-referencing
• High level physics objects in miniAOD contain references to the packed PF candidates corresponding to the original PFCandidates they came from:
  – useful e.g. for footprint removal in isolation, event interpretation (aka “top projection”), …
[Figure: E/G, Muon, Tau and Jet objects each point to one or more packed PF candidates.]
note: if a muon fails the PF id, it will point to a PF hadron!
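To illustrate what "footprint removal in isolation" means, here is a minimal sketch in plain Python. It assumes a hypothetical layout in which each lepton carries the indices of the PF candidates it was built from (`source_candidates`); the real miniAOD objects expose such references through their own accessors, which are not reproduced here.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def isolation(lepton, candidates, cone=0.4):
    """Sum the pt of all candidates in the cone, skipping the lepton's own footprint."""
    footprint = set(lepton["source_candidates"])
    total = 0.0
    for index, cand in enumerate(candidates):
        if index in footprint:
            continue  # this candidate is (part of) the lepton itself
        if delta_r(lepton["eta"], lepton["phi"], cand["eta"], cand["phi"]) < cone:
            total += cand["pt"]
    return total


# Toy input: candidate 0 is the muon itself, 1 is nearby, 2 is far away.
candidates = [
    {"pt": 30.0, "eta": 0.0, "phi": 0.0},
    {"pt": 2.0, "eta": 0.1, "phi": 0.1},
    {"pt": 5.0, "eta": 2.0, "phi": 1.0},
]
muon = {"pt": 30.0, "eta": 0.0, "phi": 0.0, "source_candidates": [0]}

print(isolation(muon, candidates))  # 2.0
```

Without the footprint check the muon's own 30 GeV would be counted in its isolation cone.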
Physics Objects
Electrons:
  – keep all gedGsfElectrons
  – detailed info for pT > 5 GeV
Muons:
  – keep all with pT > 5, or that pass some loose id (details)
  – all information saved
Taus:
  – keep those with pT > 20 & ‘decayModeFinding’ ID
  – save IDs & links to PFCands.
Photons:
  – keep those with pT > 14 & hadTowOverEm < 0.15
  – detailed info if r9 > 0.8 OR chargedHadronIso < 20 OR chargedHadronIso < 0.3 · pT
Jets (ak4PFchs, ak8PFchs):
  – keep those with pT > 10 GeV (pT > 100 for AK8 ones)
  – note: JEC are applied
  – keep daughters, id info, b-tag discriminators
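The photon and jet preselections listed above can be written as simple predicates. This is only a restatement of the slide's thresholds in plain Python, assuming all momenta and isolations are in GeV; the function names are hypothetical and this is not the code used in the miniAOD production.

```python
def keep_photon(pt, had_tow_over_em):
    """Photon is stored at all: pT > 14 and hadTowOverEm < 0.15."""
    return pt > 14.0 and had_tow_over_em < 0.15


def photon_detailed_info(pt, r9, charged_hadron_iso):
    """Detailed info is kept if any of the three conditions holds."""
    return r9 > 0.8 or charged_hadron_iso < 20.0 or charged_hadron_iso < 0.3 * pt


def keep_jet(pt, is_ak8):
    """Jets: pT > 10 GeV for AK4, pT > 100 GeV for AK8."""
    return pt > (100.0 if is_ak8 else 10.0)


print(keep_photon(20.0, 0.05))                 # True
print(photon_detailed_info(100.0, 0.5, 25.0))  # True (25 < 0.3 * 100)
print(photon_detailed_info(50.0, 0.5, 25.0))   # False
print(keep_jet(50.0, is_ak8=True))             # False
```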
Size decomposition
[Figure: event size broken down by collection (PFCands, Jet, Mu, E/G, Gen, Trig) for Single Muon Data and TTbar MC.]
notes: miniaod format
• miniaod is not yet backward compatible, i.e. a file made in 7_4_1 can only be read within 7_4_X versions
• if you're interested in how the compression works: https://github.com/cms-sw/cmssw/blob/CMSSW_7_4_X/DataFormats/PatCandidates/interface/PackedCandidate.h
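As a rough illustration of what lossy compression means here, the sketch below stores a value in 16 bits instead of 32 or 64 and reads it back. It uses Python's standard half-precision format and is only meant to show the general trade-off between size and precision; the actual packing of each variable is defined in the PackedCandidate header linked above and may differ in its details.

```python
import struct


def pack_half(value):
    """Store a float in 2 bytes (IEEE half precision, little endian)."""
    return struct.pack("<e", value)


def unpack_half(data):
    """Read a 2-byte half-precision float back into a Python float."""
    return struct.unpack("<e", data)[0]


pt = 45.678  # toy transverse momentum in GeV
packed = pack_half(pt)
restored = unpack_half(packed)
rel_error = abs(restored - pt) / pt

print(len(packed))       # 2 bytes instead of 4 or 8
print(rel_error < 1e-3)  # True: precision loss is at the per-mille level
```

Half precision keeps about three significant decimal digits, which is the kind of precision loss one accepts in exchange for a much smaller event size.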
documentation
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD2015

analyzing miniaod
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD2015#Analyzing_MiniAOD
summary: miniaod format
• MINIAOD is designed to save disk space
• the number of objects is reduced to the important ones
• less important objects are packed with lossy compression
• if you don't need specialized objects, you can work with MINIAOD samples directly (=> exercises)
• the names of the collections are changed compared to RECO, AOD, PAT
• questions?
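Since the collection names differ from RECO, AOD and PAT, a small lookup table can help when porting code. The labels below are the ones I recall from the WorkBookMiniAOD2015 twiki for 7_4_X; treat them as an assumption and check them against the twiki for your release. The dictionary keys and the helper function are hypothetical.

```python
# MiniAOD collection labels (as recalled from the WorkBookMiniAOD2015 twiki; verify for your release).
MINIAOD_COLLECTIONS = {
    "muons": "slimmedMuons",
    "electrons": "slimmedElectrons",
    "photons": "slimmedPhotons",
    "taus": "slimmedTaus",
    "jets_ak4": "slimmedJets",
    "jets_ak8": "slimmedJetsAK8",
    "met": "slimmedMETs",
    "pf_candidates": "packedPFCandidates",
    "gen_pruned": "prunedGenParticles",
    "gen_packed": "packedGenParticles",
    "vertices": "offlineSlimmedPrimaryVertices",
}


def collection_for(kind):
    """Return the MiniAOD label for an object kind, with a readable error otherwise."""
    try:
        return MINIAOD_COLLECTIONS[kind]
    except KeyError:
        raise KeyError("unknown object kind: %r" % kind)


print(collection_for("jets_ak8"))  # slimmedJetsAK8
```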
object disambiguation with (top) projections
object disambiguation
• yesterday: cross-object-cleaning
  • if you do nothing... => ambiguities
  • jets overlapping with leptons are completely removed
• other possibilities to disambiguate?
  • reclustering of jets without selected muons
  • subtract lepton p4 from jet p4 (in general: different result)
[Figure: a lepton next to a jet axis, before => after disambiguation.]
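The four-vector subtraction mentioned as one of the options can be sketched in a few lines. This is a toy example with made-up numbers and (px, py, pz, E) tuples in GeV, not CMSSW code. Reclustering without the lepton will in general give a different jet, because the remaining particles may be grouped differently once the lepton is gone.

```python
import math


def pt(p4):
    """Transverse momentum of a (px, py, pz, E) four-vector."""
    px, py, _pz, _e = p4
    return math.hypot(px, py)


def subtract(jet_p4, lepton_p4):
    """Component-wise subtraction of the lepton four-vector from the jet four-vector."""
    return tuple(j - l for j, l in zip(jet_p4, lepton_p4))


# Toy values: a 60 GeV jet that contains a 40 GeV muon.
jet = (60.0, 0.0, 10.0, 62.0)
muon = (40.0, 0.0, 5.0, 40.31)

cleaned = subtract(jet, muon)
print(pt(jet))      # 60.0
print(pt(cleaned))  # 20.0
```

With cross-object-cleaning this jet would be dropped entirely; with subtraction a 20 GeV hadronic remnant is kept.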
The exercise will deal with reclustering of jets.
The tool to accomplish this type of disambiguation is called "top projections". Its idea is explained in the next slides.
Slides taken from: https://indico.cern.ch/event/314115/session/7/contribution/25
by Sadia Khalil
Ranking of Event Reconstruction and Interpretation
Top projection
• In generic terms, a top projection has two inputs:
  - The objects we want to disambiguate ➡ Top Collection
  - The objects we have ➡ Bottom Collection
• Inputs of any type
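In code, the idea of a top projection can be sketched as "bottom collection minus everything already used by the top collection". The example below is plain Python with a hypothetical data layout (`id` on the bottom objects, `source_ids` on the top objects); it is an illustration of the concept, not the PF2PAT module.

```python
def top_projection(top, bottom):
    """Return the bottom objects that are not used by any object in the top collection."""
    used = set()
    for obj in top:
        used.update(obj["source_ids"])
    return [cand for cand in bottom if cand["id"] not in used]


# Toy input: three PF candidates, one of which was selected as an isolated muon.
pf_candidates = [
    {"id": 0, "type": "mu"},
    {"id": 1, "type": "h"},
    {"id": 2, "type": "gamma"},
]
selected_muons = [{"source_ids": [0]}]

pf_no_muon = top_projection(selected_muons, pf_candidates)
print([cand["id"] for cand in pf_no_muon])  # [1, 2]
```

Clustering jets from `pf_no_muon` instead of from all candidates means the selected muon can no longer end up inside a jet.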
Top projection in PF2PAT
[Figure: PF2PAT top-projection chain starting from particleFlow, with numbered steps (1) to (6) and the collections pfPileUp / pfNoPileUp, pfMuon / pfNoMuon, pfElectron / pfNoElectron, pfJet, pfTau / pfNoTau.]
- input source for PAT (2) (3) (4) (5)
PFBRECO = cms.Sequence(
    pfNoPileUpSequence +
    pfParticleSelectionSequence +
    pfPhotonSequence +
    pfMuonSequence +
    pfNoMuon +
    pfElectronSequence +
    pfNoElectron +
    pfJetSequence +
    pfNoJet +
    pfTauSequence +
    pfNoTau +
    pfMET
)
object disambiguation exercise
[Figure: pt(mu) / pt(jet) versus DeltaR(mu, leading jet). In one region of the plot the jet seems to consist of the muon alone.]
exercise:
- reproduce this plot
- make your own miniaod tuple with reclustered jets
- make this plot with your new jets
- check that the ambiguity is gone
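The two quantities on the axes of the plot can be computed as follows. This is a minimal sketch in plain Python with dictionaries standing in for the muon and jet objects; in the exercise you would read pt, eta and phi from the miniAOD collections instead. The helper names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def plot_point(muon, jets):
    """Return (DeltaR(mu, leading jet), pt(mu) / pt(jet)) for one event."""
    leading = max(jets, key=lambda jet: jet["pt"])
    dr = delta_r(muon["eta"], muon["phi"], leading["eta"], leading["phi"])
    return dr, muon["pt"] / leading["pt"]


# Toy event: the leading jet sits on top of the muon (note the phi wrap-around).
muon = {"pt": 40.0, "eta": 0.5, "phi": 3.1}
jets = [
    {"pt": 25.0, "eta": -1.0, "phi": 0.3},
    {"pt": 41.0, "eta": 0.5, "phi": -3.1},
]

dr, ratio = plot_point(muon, jets)
print(dr < 0.1, ratio > 0.9)  # True True
```

Events with small DeltaR and a ratio close to one are the ones where the jet seems to consist of the muon alone.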
exercise: simple W-tagging
W-tagging intro
• jet-tagging is getting more important
  • heavy BSM particles => boosted W, Z, H, t
  • decay products merged into single fat jets
• jet-substructure info
  • N-subjettiness (measure of number of subjets)
  • grooming algorithms (remove contribution from unwanted particles)
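To make "measure of number of subjets" concrete, here is a sketch of N-subjettiness in its commonly used form, tau_N = sum_k pt_k * min_j DeltaR(j, k) / (sum_k pt_k * R0), where j runs over N candidate subjet axes. The example assumes the axes are given; real implementations determine them from the jet constituents, and the values stored in miniAOD come from the official tools, not from this code. The function names and the toy jet are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, r0=0.8):
    """tau_N for the given axes: pt-weighted distance of each constituent to its nearest axis."""
    d0 = sum(c["pt"] for c in constituents) * r0
    total = 0.0
    for c in constituents:
        nearest = min(delta_r(c["eta"], c["phi"], eta, phi) for eta, phi in axes)
        total += c["pt"] * nearest
    return total / d0


# Toy two-prong jet: two equally hard constituents, 0.4 apart in phi.
constituents = [
    {"pt": 50.0, "eta": 0.0, "phi": -0.2},
    {"pt": 50.0, "eta": 0.0, "phi": 0.2},
]

tau1 = n_subjettiness(constituents, axes=[(0.0, 0.0)])
tau2 = n_subjettiness(constituents, axes=[(0.0, -0.2), (0.0, 0.2)])
print(tau1, tau2)
```

For this two-prong jet tau2 is much smaller than tau1, so the ratio tau2/tau1 is small; that is why a cut on tau2/tau1 selects W-like jets.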
From the miniaod-twiki:
// access the N-subjettiness variables
double tau1 = jet.userFloat("NjettinessAK8:tau1");
double tau2 = jet.userFloat("NjettinessAK8:tau2");
double tau3 = jet.userFloat("NjettinessAK8:tau3");

double softdrop_mass = jet.userFloat("ak8PFJetsCHSSoftDropMass"); // access to soft drop mass
double trimmed_mass = jet.userFloat("ak8PFJetsCHSTrimmedMass"); // access to trimmed mass
double pruned_mass = jet.userFloat("ak8PFJetsCHSPrunedMass"); // access to pruned mass
double filtered_mass = jet.userFloat("ak8PFJetsCHSFilteredMass"); // access to filtered mass

// simple W-tagger: two-prong-like substructure and a minimum groomed mass
bool mySimpleWTagger = (tau2/tau1) < 0.6 && softdrop_mass > 50.0;
exercise:
- find W's in genParticle collection and match them to ak8 jets => signal / background
- plot N-subjettiness for signal and background
- write your own W-tagger
- what's the performance of your tagger?
- improve your tagger! =)
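One way to quantify the performance of a tagger is its efficiency on signal jets and its mistag rate on background jets. The sketch below assumes the jets have already been labelled by the gen-matching step (`is_w`) and uses made-up numbers; the cut values are the ones from the simple tagger above, and all names are hypothetical.

```python
def simple_w_tagger(jet, tau21_max=0.6, mass_min=50.0):
    """Same cuts as the simple tagger: tau2/tau1 < 0.6 and soft drop mass > 50."""
    if jet["tau1"] <= 0.0:
        return False  # avoid dividing by zero for degenerate jets
    return (jet["tau2"] / jet["tau1"]) < tau21_max and jet["softdrop_mass"] > mass_min


def performance(jets, tagger=simple_w_tagger):
    """Return (signal efficiency, background mistag rate) for labelled jets."""
    signal = [jet for jet in jets if jet["is_w"]]
    background = [jet for jet in jets if not jet["is_w"]]
    if not signal or not background:
        raise ValueError("need at least one signal and one background jet")
    efficiency = sum(tagger(jet) for jet in signal) / len(signal)
    mistag = sum(tagger(jet) for jet in background) / len(background)
    return efficiency, mistag


# Toy jets with made-up values.
jets = [
    {"is_w": True, "tau1": 0.40, "tau2": 0.10, "softdrop_mass": 80.0},
    {"is_w": True, "tau1": 0.30, "tau2": 0.25, "softdrop_mass": 78.0},
    {"is_w": False, "tau1": 0.50, "tau2": 0.45, "softdrop_mass": 20.0},
    {"is_w": False, "tau1": 0.40, "tau2": 0.20, "softdrop_mass": 60.0},
]

efficiency, mistag = performance(jets)
print(efficiency, mistag)  # 0.5 0.5
```

Scanning `tau21_max` and `mass_min` and plotting efficiency against mistag rate gives a simple way to compare different versions of your tagger.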
summary: miniaod
• MINIAOD is a new standard event format in CMS
• designed to suit most analyses' needs => you might need to check whether it fits yours
• "top projections" help to disambiguate objects
• jet-tagging with substructure information
• questions?