
Searching for Single Top Using Decision Trees

Searching for Single Top Using Decision Trees G. Watts (UW) For the DØ Collaboration 5/13/2005 – APSNW Particles I
Transcript
Page 1: Searching for Single Top Using Decision Trees

Searching for Single Top Using Decision Trees

G. Watts (UW), for the DØ Collaboration

5/13/2005 – APSNW Particles I

Page 2: Searching for Single Top Using Decision Trees

Gordon Watts (UW) APSNW Meeting May 13, 2005 2

Single Top Challenges

Overwhelming Background!

Straight Cuts (and counting experiments)
– Difficulty taking advantage of correlations

Multivariate Cuts (and shape fitting)
– Designed to take advantage of correlations and irreducible backgrounds

Page 3: Searching for Single Top Using Decision Trees

Asymmetries in t-Channel Production

[Figure; surviving labels: "b", "Pair Production"]

Lots of variables give small separation

(Use ME, phase space, etc.)

Page 4: Searching for Single Top Using Decision Trees

Combine Variables!

Multivariate Likelihood Fit
– 7 variables means 7 dimensions…

Neural Network
– Many inputs and a single output
– Trained on signal and background samples
– Well understood and mostly accepted in HEP

Decision Tree
– Many inputs and a single output
– Trained on signal and background samples
– Used mostly in life sciences & business (MiniBooNE: physics/0408124)

Page 5: Searching for Single Top Using Decision Trees

Decision Tree

[Diagram labels: Trained Decision Tree; (Binned Likelihood Fit); (Limit)]

Page 6: Searching for Single Top Using Decision Trees

Internals of a Trained Tree

Every event belongs to a single leaf node!

“Rooted Binary Tree”

“You can see a decision tree”
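The "every event belongs to a single leaf" property follows from the tree layout itself. A minimal Python sketch, assuming a hypothetical `Node` layout and made-up cut values and purities (none of these come from the analysis):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a rooted binary tree (hypothetical layout, for illustration)."""
    variable: Optional[str] = None   # name of the cut variable; None marks a leaf
    cut: float = 0.0                 # events with value < cut go left
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    purity: float = 0.0              # only meaningful on leaves


def find_leaf(node: Node, event: dict) -> Node:
    """Follow the cuts from the root until a leaf is reached."""
    while node.variable is not None:
        node = node.left if event[node.variable] < node.cut else node.right
    return node


# Toy tree: cut values and purities are invented, not taken from the talk.
tree = Node("HT", 200.0,
            left=Node(purity=0.1),
            right=Node("jet_pt", 50.0,
                       left=Node(purity=0.4),
                       right=Node(purity=0.8)))

leaf = find_leaf(tree, {"HT": 250.0, "jet_pt": 65.0})
print(leaf.purity)
```

Each node sends an event either left or right, never both, so the walk ends at exactly one leaf.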

Page 7: Searching for Single Top Using Decision Trees

Training

Determine a branch point
– Calculate Gini Improvement as a function of an interesting variable (HT in this case)
– Choose the largest improvement as the cut point
– Repeat for all interesting variables: HT, Jet pT, Angular Variables, etc.

Best improvement is this node’s decision.
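The branch-point search can be sketched as a scan over candidate cuts, one variable at a time. This is an illustrative sketch, not the analysis code: the event layout (dicts with `weight` and `is_signal` keys), the midpoint choice of candidate cuts, and the function names are all assumptions. The Gini index uses the definition from the Gini slide.

```python
def gini(ws, wb):
    """Gini index P(1-P)(Ws+Wb), which simplifies to Ws*Wb/(Ws+Wb)."""
    total = ws + wb
    return 0.0 if total == 0 else ws * wb / total


def weights(sample):
    """Summed signal and background weights of a list of events."""
    ws = sum(e["weight"] for e in sample if e["is_signal"])
    wb = sum(e["weight"] for e in sample if not e["is_signal"])
    return ws, wb


def best_cut(events, variable):
    """Scan candidate cuts on one variable; return (improvement, cut)."""
    parent = gini(*weights(events))
    values = sorted({e[variable] for e in events})
    best = (0.0, None)
    # Assumption: candidate cuts sit halfway between neighbouring values.
    for lo, hi in zip(values, values[1:]):
        cut = 0.5 * (lo + hi)
        below = [e for e in events if e[variable] < cut]
        above = [e for e in events if e[variable] >= cut]
        improvement = parent - gini(*weights(below)) - gini(*weights(above))
        if improvement > best[0]:
            best = (improvement, cut)
    return best


def best_branch(events, variables):
    """Repeat the scan for every variable and keep the largest improvement."""
    return max(((*best_cut(events, v), v) for v in variables),
               key=lambda r: r[0])


# Toy events with invented values.
events = [
    {"HT": 120.0, "jet_pt": 30.0, "weight": 1.0, "is_signal": False},
    {"HT": 150.0, "jet_pt": 55.0, "weight": 1.0, "is_signal": False},
    {"HT": 260.0, "jet_pt": 35.0, "weight": 1.0, "is_signal": True},
    {"HT": 300.0, "jet_pt": 60.0, "weight": 1.0, "is_signal": True},
]
improvement, cut, variable = best_branch(events, ["HT", "jet_pt"])
print(variable, cut, improvement)
```

In this toy sample HT separates signal from background perfectly, so the node's decision is a cut on HT.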

Page 8: Searching for Single Top Using Decision Trees

Gini

Process requires a variable to optimize separation.

Ws – Weight of Signal Events

Wb – Weight of Background Events

Purity: $P = \frac{W_s}{W_s + W_b}$

Gini: $G = P\,(1 - P)\,(W_s + W_b)$

G is zero for pure background or signal!
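A short Python check of the two definitions above, with made-up weights, shows the zero at both extremes. The function names are illustrative, and an empty sample (Ws + Wb = 0) is not handled here.

```python
def purity(ws: float, wb: float) -> float:
    """P = Ws / (Ws + Wb); undefined for an empty sample."""
    return ws / (ws + wb)


def gini(ws: float, wb: float) -> float:
    """G = P (1 - P) (Ws + Wb)."""
    p = purity(ws, wb)
    return p * (1.0 - p) * (ws + wb)


print(gini(10.0, 0.0))   # pure signal
print(gini(0.0, 10.0))   # pure background
print(gini(5.0, 5.0))    # even mixture
```

For a fixed total weight, G is largest for an even mixture, which is why reducing it corresponds to better separation.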

Page 9: Searching for Single Top Using Decision Trees

Gini Improvement

[Diagram: Data (S) split into S1 and S2]

For each node

GI = G(S) – G(S1) – G(S2)

Repeat the process for each subdivision of data
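As a worked example with invented numbers, take a node S with Ws = 40 and Wb = 60, split into S1 (Ws = 30, Wb = 10) and S2 (Ws = 10, Wb = 50). Using G = P(1 − P)(Ws + Wb) = Ws Wb / (Ws + Wb):

```latex
\begin{align*}
G(S)   &= \frac{W_s W_b}{W_s + W_b} = \frac{40 \cdot 60}{100} = 24 \\
G(S_1) &= \frac{30 \cdot 10}{40} = 7.5 \\
G(S_2) &= \frac{10 \cdot 50}{60} \approx 8.33 \\
GI     &= G(S) - G(S_1) - G(S_2) \approx 24 - 7.5 - 8.33 = 8.17
\end{align*}
```

A positive GI means the two daughters together are purer than the parent.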

Page 10: Searching for Single Top Using Decision Trees

And Cut…

Determine the purity of each leaf: $P = \frac{W_s}{W_s + W_b}$

Stop process and generate a leaf.
– We used statistical sample error (# of events)

Use Tree as Estimator of Purity
– Each event belongs to a unique leaf
– The leaf’s purity is the estimator for the event
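The stop-and-estimate steps can be sketched as a recursive grower. The slide only says the stopping criterion was based on the number of events, so the `min_events` rule below is an assumption standing in for it. The single-variable events, the nested-dict tree, and all names are likewise hypothetical.

```python
def weights(events):
    """Summed signal and background weights; events are (x, weight, is_signal)."""
    ws = sum(w for _, w, sig in events if sig)
    wb = sum(w for _, w, sig in events if not sig)
    return ws, wb


def gini(events):
    ws, wb = weights(events)
    return ws * wb / (ws + wb) if ws + wb > 0 else 0.0


def grow(events, min_events=2):
    """Grow a tree on one variable x; returns nested dicts."""
    ws, wb = weights(events)
    leaf = {"purity": ws / (ws + wb), "n": len(events)}
    # Assumed stopping rule: too few events to split again reliably.
    if len(events) < 2 * min_events:
        return leaf
    best_gain, best_cut = 0.0, None
    xs = sorted({x for x, _, _ in events})
    for lo, hi in zip(xs, xs[1:]):
        cut = 0.5 * (lo + hi)
        below = [e for e in events if e[0] < cut]
        above = [e for e in events if e[0] >= cut]
        if len(below) < min_events or len(above) < min_events:
            continue
        gain = gini(events) - gini(below) - gini(above)
        if gain > best_gain:
            best_gain, best_cut = gain, cut
    if best_cut is None:
        return leaf
    return {"cut": best_cut,
            "below": grow([e for e in events if e[0] < best_cut], min_events),
            "above": grow([e for e in events if e[0] >= best_cut], min_events)}


def estimate(tree, x):
    """Purity of the unique leaf that contains x."""
    while "cut" in tree:
        tree = tree["below"] if x < tree["cut"] else tree["above"]
    return tree["purity"]


# Toy sample with invented values and unit weights.
sample = [(1.0, 1.0, False), (2.0, 1.0, False), (3.0, 1.0, True),
          (4.0, 1.0, False), (5.0, 1.0, True), (6.0, 1.0, True)]
tree = grow(sample, min_events=2)
print(estimate(tree, 1.5), estimate(tree, 3.2), estimate(tree, 5.5))
```

The estimator takes one value per leaf, so its output is a step function of the inputs rather than a continuous one.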

Page 11: Searching for Single Top Using Decision Trees

DT in the Single Top Search

Two DTs
– DT(Wbb): trained on signal and Wbb as background
– DT(tt l+jets): trained on signal and tt lepton + jets as background

2D histogram used in binned likelihood fit

This part is identical to a NN-based analysis

Separate DT for muon & electron

Backgrounds: W+Jets, QCD, top Pair Production Fake Leptons
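Filling the 2D histogram from the two tree outputs can be sketched in plain Python. The bin count, the assumption that both outputs lie in [0, 1] (they are leaf purities), and the sample values are all illustrative and not taken from the analysis.

```python
def fill_2d(pairs, weights, nbins=4):
    """Bin (dt_wbb, dt_tt) output pairs, each in [0, 1], into an nbins x nbins grid."""
    hist = [[0.0] * nbins for _ in range(nbins)]
    for (x, y), w in zip(pairs, weights):
        i = min(int(x * nbins), nbins - 1)   # clamp x == 1.0 into the last bin
        j = min(int(y * nbins), nbins - 1)   # same clamp for y
        hist[i][j] += w
    return hist


# Made-up discriminant outputs for three events.
pairs = [(0.10, 0.90), (0.95, 0.80), (1.00, 1.00)]
hist = fill_2d(pairs, weights=[1.0, 0.5, 2.0])
print(hist[0][3], hist[3][3])
```

A binned likelihood fit would then compare such histograms for data, signal, and each background bin by bin.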

Page 12: Searching for Single Top Using Decision Trees

Results

Expected Limits
– s-channel: 4.5 pb (NN: 4.5)
– t-channel: 6.4 pb (NN: 5.8)

Actual Limits
– s-channel: 8.3 pb (NN: 6.4)
– t-channel: 8.1 pb (NN: 5.0)

Expected Results Close to NN

Page 13: Searching for Single Top Using Decision Trees

Future of the Analysis

Use a Single Decision Tree
– Train it against all backgrounds

Pruning
– Train until each leaf has only a single event
– Recombine leaves (pruning) using statistical estimator

Boosting
– Combine multiple trees, each weighted
– Train trees on event samples that have mis-classified event weights enhanced
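One common way to enhance mis-classified event weights is an AdaBoost-style update. The sketch below assumes that scheme; the slide does not say which boosting algorithm was planned, and `beta` and the function name are hypothetical.

```python
import math


def boost_step(weights, misclassified, beta=1.0):
    """One AdaBoost-style reweighting step (assumed scheme, for illustration).

    weights       -- current event weights
    misclassified -- parallel list of bools from the tree just trained
    beta          -- hypothetical strength parameter
    Returns (alpha, new_weights); alpha is the weight given to this tree.
    """
    total = sum(weights)
    err = sum(w for w, bad in zip(weights, misclassified) if bad) / total
    if not 0.0 < err < 0.5:
        raise ValueError("tree must beat chance and must not be perfect")
    alpha = beta * math.log((1.0 - err) / err)
    # Enhance only the events the tree got wrong.
    boosted = [w * math.exp(alpha) if bad else w
               for w, bad in zip(weights, misclassified)]
    norm = total / sum(boosted)          # keep the total weight unchanged
    return alpha, [w * norm for w in boosted]


alpha, new_w = boost_step([1.0, 1.0, 1.0, 1.0], [True, False, False, False])
print(alpha, new_w)
```

The next tree is trained on `new_w`, and the final discriminant is a sum of the individual tree outputs weighted by their `alpha` values.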

Page 14: Searching for Single Top Using Decision Trees

References & Introduction

MiniBooNE Paper: physics/0408124

Recent Advances in Predictive (Machine) Learning

Jerome H. Friedman, Conf. Proceedings

I have these and others linked on my web page

http://d0.phys.washington.edu/~gwatts/research/conferences

Page 15: Searching for Single Top Using Decision Trees

Conclusions

• Decision Trees are good…
  – Model is obvious in form of 2d binary tree.
  – Not as sensitive to outliers in input data as other methods
  – Easily accommodate integer inputs (NJets) or missing variable inputs.
  – Easy to implement (several months to go from scratch to working code)
• Decision Trees aren’t so good…
  – Well understood input variables are a must
    • Similar for Neural Networks, of course.
  – Minor changes in the input events can make for major changes in tree layout and results.
  – Estimator is not a continuous function
• Don’t have to deal with hidden nodes
  – Separate training of background or other issues

