
CONVOLUTIONAL NEURAL NETWORKS FOR PILE UP ID IN ATLAS
MURTAZA SAFDARI (murtazas@stanford.edu)

MOTIVATION
• ATLAS is a physics detector on the LHC looking at proton-proton collisions.
• It sees collimated streams of particles, called jets, in its equipment.
• Jets are crucial to studying any particle physics process.
• The detector records many fake jets, called Pile Up (PU) jets [1], due to particles crossing over from different interaction points.
• Goal: to develop a classifier that discriminates between real (HS) and PU jets better than the current standard [2] using CNNs.

DATASET
The dataset consists of ∼4×10^5 detector-level jets which contain:
• The true jet pT and (η, φ) coordinates
• The pT and (η, φ) coordinates for the clusters in a jet
• The pT and (η, φ) coordinates for tracks leading into a jet, separately for HS and PU tracks
• The jet Rpt: the sum of the pT of tracks from the PV divided by the pT of the jet
Data split: 80% training, 10% CV, 10% test. Only central jets with |η| < 0.8 are taken for uniform detector response, and only jets with pT ∈ [20, 30] GeV are considered to wash out any pT dependence. Images are formed using the cluster pTs, HS track pTs, and PU track pTs binned in the η-φ plane.
[Figure: Averaged image of HS jets in the (η, φ) plane.]
[Figure: Absolute difference in the averaged HS and PU jets.]

BASELINE
The ATLAS standard for discriminating between HS and PU jets in the central region is the Jet Vertex Tagger (JVT) [2]. The jet Rpt variable serves as a good proxy for the JVT and is the baseline against which network performance is measured.
In addition to jet Rpt alone, a baseline neural network has also been trained using jet Rpt and pT as input features. This is theoretically a more challenging baseline, as it uses pT information to improve predictions.
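As a hedged illustration of that baseline, a small Keras feed-forward network on the two scalar features jet Rpt and jet pT might look like the sketch below; the layer widths, optimizer, and training call are assumptions, since the poster only names the input features.

```python
import tensorflow as tf

# Minimal sketch of a baseline NN on two scalar inputs per jet: [Rpt, pT].
baseline = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(jet is HS)
])
baseline.compile(optimizer="adam",
                 loss="binary_crossentropy",
                 metrics=["accuracy"])
# baseline.fit(x_train, y_train, validation_data=(x_cv, y_cv), epochs=20)
```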

RESULTS
Several different approaches were taken to network modeling. Sequential models were made with a couple of convolutional layers. Wide, inception-inspired models were also made to combine the convolutional capacities of different kernels. Models with the jet pT passed as an auxiliary input were also experimented with. The results are presented below (a sketch of the parallel-convolution model follows the table).

The model accuracies:

Model                                                                        Accuracy
ATLAS standard proxy, Rpt                                                    0.5005
Baseline NN using jet Rpt and pT                                             0.6994
Pseudo CNN with full-sized kernels and angular regularization                0.7013
Sequential CNN with 3x3 "Same" Conv2D followed by full-sized Conv2D          0.7025
Sequential CNN with downscaling to 5x5 image followed by full-sized Conv2D   0.7029
CNN with parallel convolutions of 3x3, 5x5, 10x10 filters                    0.7036
CNN with parallel convolutions and auxiliary input of jet pT                 0.7072
Sequential CNN with auxiliary input of jet pT                                0.7073
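To make the "parallel convolutions with auxiliary jet pT" idea concrete, here is a hedged sketch using the Keras functional API. The image shape (10x10x3), filter counts, pooling, and dense head are assumptions; the poster specifies only the 3x3 / 5x5 / 10x10 kernel mix and the auxiliary pT input.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of a wide, inception-inspired model with an auxiliary scalar input.
image_in = layers.Input(shape=(10, 10, 3), name="jet_image")  # clusters, HS tracks, PU tracks
pt_in = layers.Input(shape=(1,), name="jet_pt")               # auxiliary scalar input

branches = []
for k in (3, 5, 10):                                          # parallel kernel sizes
    b = layers.Conv2D(16, kernel_size=k, padding="same", activation="relu")(image_in)
    b = layers.GlobalAveragePooling2D()(b)
    branches.append(b)

x = layers.concatenate(branches + [pt_in])                    # merge branches with aux pT
x = layers.Dense(64, activation="relu")(x)
out = layers.Dense(1, activation="sigmoid", name="is_hs")(x)

model = tf.keras.Model(inputs=[image_in, pt_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Each branch sees the same sparse jet image through a different receptive field, which is the physical intuition the poster gives for combining kernels.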

REFERENCES
[1] Menke, Sven. "Pile-Up in Jets in ATLAS." Talk given at BOOST 2013, Flagstaff, AZ.
[2] The ATLAS Collaboration. "Tagging and suppression of pileup jets with the ATLAS detector." ATLAS-CONF-2014-018.

DISCUSSION AND FUTURE STEPS
• Trained CNNs outperform the baseline Rpt discriminant by 20-30% in PU efficiency.
• Much of the physics analysis at ATLAS happens in the central region; these results have the potential to massively impact ATLAS Pile Up ID procedures.
• It is interesting to note the effectiveness of CNNs at a classification job intractable by human eyes alone.
• Accuracies suggest that models with jet pT passed as an auxiliary input perform best.
• However, learning on jet pT makes the trained models sensitive to the pT scale of the data, rendering them non-generalizable.
• Consequently, the best network is the wide, inception-inspired model, which learns from different convolutions. This makes physical sense given the sparse nature of the input images.
• A detailed study of the learned weights is required to understand how and why these networks outperform the current standard.
• A formal proposal to ATLAS needs to be made following a more thorough analysis.

EVALUATION METRIC AND LOSS
In addition to accuracy as a metric used to gauge the performance of a discriminator in ATLAS, we also use Receiver Operating Characteristic (ROC) curves.

Cross-entropy loss is used because it tries to accumulate the probability distribution on the true labels, making the output of the network a good discriminator, as opposed to margin losses, which settle once a margin is achieved.
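As a minimal sketch of this evaluation step, assuming a trained model and a held-out test set x_test, y_test (with y = 1 for HS jets and y = 0 for PU jets), the ROC curve and its area can be computed with scikit-learn:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# The network output is used directly as the HS/PU discriminant.
scores = model.predict(x_test).ravel()
fpr, tpr, thresholds = roc_curve(y_test, scores)
print("ROC AUC:", roc_auc_score(y_test, scores))
# At each threshold, tpr is the HS efficiency and fpr the fraction of PU jets kept,
# so the curve shows the trade-off that a single accuracy number does not capture.
```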
