Identification of Jets with Heavy Flavors at DØ
Joseph Zennamo SUNY at Buffalo
Erice School on Subnuclear Physics
June 25th, 2013
Motivation
J. Zennamo, SUNY at Buffalo
• The ability to identify heavy flavor jets is of paramount importance to any collider experiment’s physics program
• The ability to model the responses of these identification tools is also of great importance
• The backgrounds associated with a light jet faking a heavy flavor jet can quickly swamp many searches
• I will present a novel approach to modeling these backgrounds
[Figure label: Higgs Boson]
The Tevatron
[Figure: the Tevatron accelerator complex, with labels DØ, CDF, Booster, Anti-proton source, Main Injector, Linear accelerator, Fixed target area]
The DØ Detector
• General purpose hadron collider experiment
• Central tracking system is of particular importance for heavy flavor jet identification
• Allows for precise reconstruction of the primary interaction vertex and secondary vertices
• Enables an accurate determination of the impact parameter
Pseudorapidity: η = −ln[tan(θ/2)]
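As a quick numerical check of the pseudorapidity definition above, the following Python sketch evaluates η for a few polar angles. The function name `pseudorapidity` is illustrative and is not taken from any DØ software.

```python
import math


def pseudorapidity(theta):
    """Return eta = -ln(tan(theta / 2)) for a polar angle theta in radians.

    theta is measured from the beam axis and must lie strictly between
    0 and pi, because eta diverges along the beam line.
    """
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must be strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


# A direction perpendicular to the beam gives eta close to 0;
# smaller angles to the beam give larger eta.
for degrees in (90, 45, 10):
    print(degrees, round(pseudorapidity(math.radians(degrees)), 3))
```

Angles above 90 degrees give negative η, so the detector coverage is usually quoted in |η|.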
Jet Structure
• Jets are cascades of particles coming from initial state quarks or gluons
• These appear in our detector as tracks and energy deposits
• This track information can be utilized to help us determine which particle initiated the shower
Heavy Flavor Jet Identification
• B and C hadrons have a relatively long lifetime (~1 ps) and large mass
• This allows the hadron to travel a measurable distance (~100-500 µm) before decaying
• These properties are utilized to create a discriminant which allows us to enrich the sample in heavy flavor jets
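To connect the ~1 ps lifetime to the quoted flight distance, the sketch below converts a proper lifetime into the proper decay length cτ. The optional `boost` argument (βγ = p/m) is an addition for illustration only: it scales cτ to a mean lab-frame flight distance, and the function name and any value passed to it are assumptions, not DØ measurements.

```python
C_UM_PER_PS = 299.792458  # speed of light in micrometres per picosecond


def decay_length_um(lifetime_ps, boost=1.0):
    """Mean decay length in micrometres: boost * c * tau.

    lifetime_ps: proper lifetime in picoseconds.
    boost: relativistic factor beta*gamma = p/m (assumed, default 1.0,
           which returns the proper decay length c*tau).
    """
    if lifetime_ps < 0 or boost < 0:
        raise ValueError("lifetime and boost must be non-negative")
    return boost * C_UM_PER_PS * lifetime_ps


# A 1 ps lifetime corresponds to c*tau of roughly 300 micrometres.
print(round(decay_length_um(1.0), 1))
```

A lifetime of 1 ps gives cτ of about 300 µm, which sits inside the ~100-500 µm range quoted above.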
The MVAbl Algorithm
• We can combine a great number of variables to help better distinguish the various flavors of jets
• Six random forests are trained with a variety of variables
• One is based on impact parameter information, and the other five are based on secondary vertex information
• These are then combined with a single neural network
Modeling Heavy Flavor Response, System8 Method
• The efficiency of selecting a heavy flavor jet must be corrected for data to MC differences
• First select a dijet data sample which is enriched with heavy flavor jets
• A system of eight equations can be built to give the efficiency for selecting a b-jet in data with the corresponding systematic uncertainties
• Results are parameterized as a function of jet pT and pseudorapidity
Corrected Heavy Flavor Selection Efficiencies
• The correction factors derived from the heavy flavor enriched sample give us data-driven “tag rate functions”
Misidentification Rates
• While the misidentification rate for the tagging procedure is on the order of 1%, light jets are so numerous that they make up a significant portion of our samples
• The previous method, known as the NT method, relied heavily on MC inputs
• DØ has a novel approach to extract the fake rates directly from data
• SystemN method
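The effect of a ~1% misidentification rate on a sample dominated by light jets can be illustrated with simple arithmetic. In the sketch below the jet counts and the b-tagging efficiency are hypothetical round numbers chosen only to show the scaling. They are not DØ results, and the function name is illustrative.

```python
def tagged_light_fraction(n_b, n_light, eff_b, mistag_rate):
    """Fraction of tagged jets that are really light jets.

    n_b, n_light: numbers of b and light jets before tagging.
    eff_b: probability to tag a true b jet.
    mistag_rate: probability to tag a light jet (the fake rate).
    """
    tagged_b = eff_b * n_b
    tagged_light = mistag_rate * n_light
    total = tagged_b + tagged_light
    if total <= 0:
        raise ValueError("no tagged jets")
    return tagged_light / total


# Hypothetical sample: 100 light jets for every b jet, 50% b efficiency,
# 1% fake rate. Two thirds of the tagged jets are then light jets.
fraction = tagged_light_fraction(n_b=1.0, n_light=100.0,
                                 eff_b=0.5, mistag_rate=0.01)
print(round(fraction, 3))
```

With these assumed inputs the tagged sample is still mostly light jets, which is why the fake rate has to be modeled accurately.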
[Figure labels: Jet, Secondary Vertex, Primary Vertex]
New SystemN method
• Using the heavy flavor efficiencies measured previously we can build a system of N equations
• Where N-1 is the number of selected operating points
• The number of b, c, and light jets in the event must also be measured
• To determine this a separate procedure is used to extract the flavor composition directly from the data
[Equations: the SystemN system of equations, with operating points indexed 1 to 5]
Sample Composition Fits
• First a sample is created with the requirement that a ‘good’ secondary vertex can be formed
• From here we take the secondary vertex mass distribution from data and fit it with MC templates for b, c, and light jets
[Figure: secondary vertex mass fit. x-axis: Secondary Vertex Mass M_SV [GeV], 0 to 6; y-axis: Events / 0.38; curves: Data, Fitted Total, b jets, c jets, light jets; DØ, 35 GeV < pT < 45 GeV, 1.1 < |η| < 1.5]
• These fits are parameterized in terms of jet pT and pseudorapidity
• Template shapes are corrected to data before the fitting, but any differences are taken as a systematic uncertainty
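The template fit described on this slide can be sketched as a linear least-squares problem: the data histogram is modeled as a sum of b, c, and light templates, each scaled by an unknown yield. The code below is a minimal illustration under stated assumptions. It uses unweighted least squares, which is not necessarily the fit procedure used at DØ, and the six-bin template shapes and yields are toy numbers invented for the example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_templates(data, templates):
    """Return least-squares yields, one per unit-area template."""
    k = len(templates)
    # Normal equations: (T^T T) yields = T^T data
    a = [[sum(ti * tj for ti, tj in zip(templates[i], templates[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(t * d for t, d in zip(templates[i], data)) for i in range(k)]
    return solve_linear(a, b)


# Toy unit-area templates in six secondary vertex mass bins (hypothetical).
T_B = [0.05, 0.15, 0.30, 0.30, 0.15, 0.05]
T_C = [0.20, 0.45, 0.25, 0.08, 0.02, 0.00]
T_L = [0.60, 0.30, 0.08, 0.02, 0.00, 0.00]

# Pseudo-data built from known yields so the fit can be checked.
true_yields = [500.0, 300.0, 200.0]
data = [sum(y * t[i] for y, t in zip(true_yields, (T_B, T_C, T_L)))
        for i in range(6)]

yields = fit_templates(data, [T_B, T_C, T_L])
fractions = [y / sum(yields) for y in yields]
print([round(f, 3) for f in fractions])
```

Because the pseudo-data are an exact mixture of the templates, the fit returns the input yields. With real data the per-bin statistical uncertainties would enter the fit, and the fit would be repeated in each bin of jet pT and pseudorapidity.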
Resulting Fake Rates and Scale Factors
• Once we have the sample composition the last remaining unknown is the light jet tagging efficiency
• This is then extracted directly from data and can be compared to the MC predicted values
• A correction factor is created by taking the ratio of the rate in data to MC, with corresponding uncertainties
• This is used to model the light jet responses
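The last two steps can be sketched for a single operating point. The example assumes the number of tagged jets is the sum of each flavor's jet count times its tagging efficiency, which is a simplified reading of the SystemN idea and not the full DØ system of equations. All numerical inputs are hypothetical, and the uncertainty on the ratio assumes uncorrelated data and MC errors.

```python
import math


def light_jet_efficiency(n_tagged, n_b, n_c, n_light, eff_b, eff_c):
    """Solve n_tagged = eff_b*n_b + eff_c*n_c + eff_l*n_light for eff_l."""
    if n_light <= 0:
        raise ValueError("n_light must be positive")
    return (n_tagged - eff_b * n_b - eff_c * n_c) / n_light


def scale_factor(eff_data, err_data, eff_mc, err_mc):
    """Data/MC ratio with errors combined in quadrature.

    Assumes the data and MC uncertainties are uncorrelated.
    """
    sf = eff_data / eff_mc
    rel_err = math.hypot(err_data / eff_data, err_mc / eff_mc)
    return sf, sf * rel_err


# Hypothetical inputs: composition from the template fit, heavy flavor
# efficiencies from the System8 step, and the observed tagged-jet count.
eff_l = light_jet_efficiency(n_tagged=1870.0, n_b=1000.0, n_c=2000.0,
                             n_light=97000.0, eff_b=0.60, eff_c=0.15)

# Hypothetical uncertainties and MC prediction for the same operating point.
sf, sf_err = scale_factor(eff_l, 0.001, 0.008, 0.0004)
print(round(eff_l, 4), round(sf, 3), round(sf_err, 3))
```

In this toy case the extracted light jet efficiency is 1%, and the resulting scale factor would be applied to the MC light jet tag rate in the corresponding jet pT and pseudorapidity bin.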
Comparison to old method
• If we compare our data-driven fake rates to those found with the previous method we can see a large deviation at high jet pT
• This discrepancy grows if we require a stricter cut on our algorithm
• This discrepancy indicates that the previous method for determining the fake rates significantly underestimated their contributions at high pT
Conclusions
• Robust heavy flavor tagging algorithms are essential to any collider experiment’s physics program
• Additionally, it is important to accurately model the responses to these algorithms in a data-driven fashion
• I have presented a novel method for determining the light jet response to any tagging algorithm
• Further, this new method yields light jet efficiencies which are significantly different from those estimated with the previous methods
• All of these advances will be published shortly in a NIM article
BACKUPS
System8
Intermediate Random Forests