Sergey Kotov DPG meeting, March 28, 2006 · Sergey Kotov MPI für Physik, München DPG meeting,...

Multivariate analysis of H → bb̄ in associated production of H with tt̄-pair using full simulation of ATLAS detector

Sergey Kotov

MPI für Physik, München

DPG meeting, March 28, 2006


1 Channel overview

2 General reconstruction strategy

3 Building and training of the neural network

4 Analysis results

5 Conclusions and plans


Low mass SM Higgs boson overview

LEP2 experimental bounds on Higgs mass

precision measurements of EW observables: $m_H = 117^{+67}_{-45}$ GeV

direct searches: mH > 114 GeV

Signature channels for low mass SM Higgs

H → τ+τ− in vector boson fusion

H → γγ in gluon fusion

H → WW∗ → lνlν in vector boson fusion

H → ZZ∗ → 4l in gluon fusion

H → bb̄ in H associated production with tt̄


Channel description

Features of tt̄H, H → bb̄ channel

Complex final state

  - 6 jets: 4 b-jets and 2 light jets
  - 1 high-pt lepton (trigger)
  - missing energy from neutrino
  - additional jets from ISR/FSR

Large backgrounds

  - combinatorial from mis-pairing of jets
  - irreducible from tt̄bb̄ events
  - reducible from tt̄ + jets events

Full reconstruction of event and very good b-tagging are needed

[Figure: diagrams for the tt̄H, H → bb̄ signal, the tt̄bb̄ background and the tt̄jj background]

Expected number of events at LHC

| Process | σLO, pb | BR | LHC events for 30 fb−1 | LHC events for 100 fb−1 | MC generator | FastSim sample | FullSim sample |
|---|---|---|---|---|---|---|---|
| tt̄H → (blν)(bjj)(bb̄) | 0.52 | 0.20 | 3.15k | 10.5k | Pythia | 1M | 42k |
| tt̄bb̄ → (blν)(bjj)bb̄ (QCD) | 8.1 (a) | 0.29 | 70.5k | 235k | AcerMC | 1.8M | 92k |
| tt̄bb̄ → (blν)(bjj)bb̄ (EW) | 0.9 | 0.29 | 7.8k | 26k | AcerMC | 200k | - |
| tt̄ → (blν)(bjj) + jets | 500 | 0.29 | 4.3M | 14.5M | Pythia | 4M | 327k |

(a) Strongly depends upon factorization scale (up to a factor of 2). Here, µ0 = (mt + mH)/2
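The event counts in the table follow from N = σ × BR × L. A minimal sketch of that arithmetic, assuming the cross sections are in pb and using the hypothetical helper name `expected_events`; the results agree with the table up to the rounding of the quoted inputs:

```python
# Expected event yield: N = sigma * BR * L.
# Cross sections are in pb, luminosities in fb^-1 (1 pb = 1000 fb).
PB_TO_FB = 1000.0


def expected_events(sigma_pb, branching_ratio, lumi_fb_inv):
    """Return the expected number of events for an integrated luminosity."""
    return sigma_pb * PB_TO_FB * branching_ratio * lumi_fb_inv


# (sigma_LO in pb, BR) as quoted in the table of expected events
processes = {
    "ttH": (0.52, 0.20),
    "ttbb (QCD)": (8.1, 0.29),
    "ttbb (EW)": (0.9, 0.29),
    "tt + jets": (500.0, 0.29),
}

for name, (sigma, br) in processes.items():
    n30 = expected_events(sigma, br, 30.0)
    n100 = expected_events(sigma, br, 100.0)
    print(f"{name:12s} {n30:12.0f} {n100:12.0f}")
```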


Event reconstruction: preselection

[Figure: decay chain of the tt̄H event: H → bb̄, t → bW+ with W+ → l+ν, t̄ → b̄W− with W− → qq̄′]

Preselection cuts

≥ 1 isolated leptons

  - pt > 20 GeV and |η| < 2.7
  - Et < 10 GeV within the isolation cone of ∆R = 0.4
  - e-Id: EM cluster has a matched track in ID and the cluster shape is consistent with e-hypothesis
  - µ-Id: the combined fit of muon track has good quality

≥ 4 b-jets

  - pt > 15 GeV and |η| < 3
  - standard ATLAS b-tagging cut: jetWeight > 3

≥ 2 light jets

  - pt > 15 GeV and |η| < 3
  - b-tagging cut (anti-b-tag): jetWeight < 0.1
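The kinematic part of these cuts can be sketched as follows. This is only an illustration: the dictionary-based event model and the field names (`pt`, `eta`, `iso_et`, `b_weight`) are assumptions, not the ATLAS software interface, and the electron and muon identification criteria are not modelled.

```python
# Hypothetical event model: every object is a dict with pt (GeV) and eta;
# leptons also carry iso_et (GeV), jets carry b_weight (the b-tagging weight).

def select_leptons(leptons):
    """Isolated leptons: pt > 20 GeV, |eta| < 2.7, isolation Et < 10 GeV."""
    return [l for l in leptons
            if l["pt"] > 20.0 and abs(l["eta"]) < 2.7 and l["iso_et"] < 10.0]


def select_b_jets(jets):
    """b-jets: pt > 15 GeV, |eta| < 3, jetWeight > 3."""
    return [j for j in jets
            if j["pt"] > 15.0 and abs(j["eta"]) < 3.0 and j["b_weight"] > 3.0]


def select_light_jets(jets):
    """Light jets: pt > 15 GeV, |eta| < 3, anti-b-tag jetWeight < 0.1."""
    return [j for j in jets
            if j["pt"] > 15.0 and abs(j["eta"]) < 3.0 and j["b_weight"] < 0.1]


def passes_preselection(leptons, jets):
    """Require >= 1 isolated lepton, >= 4 b-jets and >= 2 light jets."""
    return (len(select_leptons(leptons)) >= 1
            and len(select_b_jets(jets)) >= 4
            and len(select_light_jets(jets)) >= 2)
```

Jets with a weight between 0.1 and 3 fall into neither category in this sketch, which mirrors the two separate cuts quoted on the slide.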

Preselection efficiencies

| Particle | Kinematical acceptance, % | Reconstruction efficiency, % |
|---|---|---|
| e | 82.8 | 66.0 |
| µ | 82.3 | 70.2 |
| b-jet | 93.4 | 42.9 |
| light jet | 48.6 | 52.2 |

15 GeV pt cut on light jets required by anti-b-tagging algorithm considerably decreases kinematical acceptance

b-tagging algorithm has 60% efficiency, but there are fewer than expected reconstructed jets to tag


Reconstruction efficiencies and resolutions: leptons

[Figure, electrons: reconstructed pt (matched, fakes); efficiency and fake rate vs pt, ε = 66.0%; ∆pt/pt resolution vs pt. pt axis in MeV, 0 to 200 × 10³]

[Figure, muons: reconstructed pt (matched, fakes); efficiency and fake rate vs pt, ε = 70.2%; ∆pt/pt resolution vs pt. pt axis in MeV, 0 to 200 × 10³]

due to high jet activity the efficiencies are somewhat lower than expected

the pt resolutions are ∼ 4% for electrons and ∼ 5% for muons


Reconstruction efficiencies and resolutions: jets

[Figure, b-jets: reconstructed pt (matched, fakes); efficiency and fake rate vs pt, ε = 42.9%; ∆pt/pt resolution vs pt. pt axis in MeV, 0 to 300 × 10³]

[Figure, light jets: reconstructed pt (matched); efficiency vs pt, ε = 52.2%; ∆pt/pt resolution vs pt. pt axis in MeV, 0 to 250 × 10³]

in high jet multiplicity events overlapping of jets considerably deteriorates jet energy calibration and resolution


Event reconstruction: making combinations

Making 4 b-jet + 2 light jets + 1 lepton combinations and selecting the best one

use events which pass preselection criteria (1 lepton, 4 b-jets, 2 light jets)

determine pν from pl and pmiss using the mW constraint (if this fails, use the approximation $p_z^{\nu} = p_z^{l}$)

reconstruct “leptonic” W → lν from the lepton and the neutrino

reconstruct “hadronic” W → jj from jj combinations with |mjj − mW| < 35 GeV (the jet 4-momenta are rescaled to get the nominal W mass)

permute over all combinations of reconstructed Wlep, Whad, and 4 b-jets

calculate the evaluation parameter for each combination

from each event select the combination with the highest value of this parameter

plot invariant mass distributions from these best combinations and look for a Higgs peak
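The neutrino step in the list above can be sketched as a quadratic equation for the longitudinal neutrino momentum. This is a minimal illustration, assuming a massless lepton and neutrino, identifying the missing transverse momentum with the neutrino, and taking mW = 80.4 GeV; the function name `neutrino_pz` is hypothetical. When the quadratic has no real solution, the sketch falls back to the approximation quoted on the slide.

```python
import math

M_W = 80.4  # GeV, assumed nominal W mass


def neutrino_pz(lep, met_x, met_y, m_w=M_W):
    """Solve m(l, nu) = m_w for the neutrino pz.

    lep is (px, py, pz) of a massless lepton in GeV; (met_x, met_y) is the
    missing transverse momentum, taken as the neutrino px, py.
    Returns both solutions, or [pz_lepton] if the constraint fails.
    """
    px, py, pz = lep
    e_lep = math.sqrt(px * px + py * py + pz * pz)
    pt2_lep = px * px + py * py
    pt2_nu = met_x * met_x + met_y * met_y
    # From m_w^2 = 2 (E_l E_nu - p_l . p_nu):  E_l E_nu = mu + pz_l pz_nu
    mu = 0.5 * m_w * m_w + px * met_x + py * met_y
    disc = mu * mu - pt2_lep * pt2_nu
    if disc < 0.0:
        # No real solution: use the approximation pz(nu) = pz(l)
        return [pz]
    root = e_lep * math.sqrt(disc)
    return [(mu * pz + root) / pt2_lep, (mu * pz - root) / pt2_lep]
```

The slides do not say how the two-fold ambiguity is resolved; one option is to keep both solutions as separate combinations and let the evaluation parameter decide.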

Various evaluation parameters of tt̄-pair reconstruction

ATLAS TDR: $\Delta m_{t\bar{t}} = \sqrt{(m_{bl\nu} - m_t)^2 + (m_{bjj} - m_t)^2}$

tt̄-pair likelihood in ATL-PHYS-2003-024 analysis

this analysis uses neural network evaluation parameter
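As an illustration of how one evaluation parameter picks the best combination, here is a sketch based on the TDR quantity ∆mtt̄. The nominal top mass of 175 GeV and the dictionary keys `m_blnu` and `m_bjj` are assumptions. Note that ∆mtt̄ is a distance, so the best combination is the one with the smallest value, whereas for a likelihood or an ANN output the largest value would be selected.

```python
import math

M_TOP = 175.0  # GeV, assumed nominal top mass


def delta_m_tt(m_blnu, m_bjj, m_top=M_TOP):
    """Distance of the two reconstructed top masses from the nominal one."""
    return math.sqrt((m_blnu - m_top) ** 2 + (m_bjj - m_top) ** 2)


def best_combination(combinations, m_top=M_TOP):
    """Return the combination with the smallest delta_m_tt.

    Each combination is a dict holding the reconstructed masses (GeV)
    under the hypothetical keys 'm_blnu' and 'm_bjj'.
    """
    return min(combinations,
               key=lambda c: delta_m_tt(c["m_blnu"], c["m_bjj"], m_top))


candidates = [
    {"id": 0, "m_blnu": 150.0, "m_bjj": 210.0},
    {"id": 1, "m_blnu": 178.0, "m_bjj": 171.0},
]
print(best_combination(candidates)["id"])
```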


ANN variables: Full simulation vs Fast simulation signal samples

[Figure: normalized distributions of the ANN input variables for fast and full simulation, matched and not-matched combinations. Panels: Delta2tMass (∆mtt̄, MeV), jjMass (mjj, MeV), jjDeltaRd (∆Rjj), bWhDeltaRd (∆R between b and Whad), bWlDeltaRd (∆R between b and Wlep), ttHMass (mtt̄H, MeV)]

most powerful discriminating variables are ∆mtt̄ and ∆Rjj


The neural network structure and performance

Due to the limited size of the full simulation sample, the fast simulation sample was used to train the ANN

ANN variables

TDR’s evaluator, $\Delta m_{t\bar{t}} = \sqrt{(m_{bl\nu} - m_t)^2 + (m_{bjj} - m_t)^2}$

invariant mass of two light jets from Whad

invariant mass of tt̄-H system

∆R between two light jets from Whad

∆R between b-jet and Whad from the same t-quark

∆R between b-jet and Wlep from the same t-quark

∆R between tt̄ system and Higgs

[Figure: network diagram with inputs bWlDeltaRdNN, jjMassNN, jjDeltaRdNN, bWhDeltaRdNN, Delta2tMassNN, HttMassNN, HttDeltaRdNN and output ttHTruthKineTagNN]

[Figure: ANN output value for fast simulation and for full simulation, shown for four categories: not-matched tt̄ and H; matched tt̄, not-matched H; not-matched tt̄, matched H; matched tt̄ and H]
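Several of the ANN inputs listed on this slide are ∆R separations. A minimal sketch of the standard definition, ∆R = √(∆η² + ∆φ²) with ∆φ wrapped into (−π, π]; the function names are illustrative:

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

The wrapping matters for objects on either side of φ = ±π: without it, two nearly collinear jets at φ = 3.0 and φ = −3.0 would appear to be far apart.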


Reconstructed tt̄-pair invariant mass distributions

[Figure: reconstructed top mass distributions, events vs mass (100 to 260 GeV), with Gaussian fits]

tt̄H signal (364 entries; all and matched combinations shown):
  - mbjj: fitted mean 173.6 ± 1.2 GeV, sigma 16.1 ± 1.3 GeV
  - mblν: fitted mean 171.9 ± 1.0 GeV, sigma 14.5 ± 1.0 GeV

tt̄bb̄ background (267 entries):
  - mbjj: fitted mean 171.7 ± 1.4 GeV, sigma 16.4 ± 1.6 GeV
  - mblν: fitted mean 172.1 ± 1.5 GeV, sigma 17.3 ± 1.9 GeV

there’s a small shift of ∼3 GeV in the reconstructed mt

the width of the reconstructed mt is ∼16 GeV


Reconstructed Higgs invariant mass distributions

[Figure: reconstructed mbb distributions, events vs mbb (50 to 300 GeV)]

tt̄H signal (364 entries; all and matched combinations shown): fitted mean 119.2 ± 3.2 GeV, sigma 23.8 ± 5.0 GeV, N = 152/71

tt̄bb̄ background (267 entries): N = 60/0

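Efficiencies

| Selection step | tt̄H sample ε, % | tt̄H sample events | tt̄bb̄ sample ε, % | tt̄bb̄ sample events |
|---|---|---|---|---|
| All events | 100 | 42882 | 100 | 96053 |
| Wlep | 55.5 | 23810 | 58.3 | 56038 |
| Whad | 32.3 | 13834 | 37.9 | 36455 |
| 4 b-jets | 0.84 | 362 | 0.28 | 267 |
| mH window | 0.35 | 152 | 0.06 | 60 |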
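The efficiencies in this table are cumulative, each being the number of surviving events divided by the initial sample size. A short sketch that reproduces the tt̄H column from the event counts (the label "all events" for the first row and the function name are assumptions):

```python
def cumulative_efficiencies(cut_flow):
    """Return (step, efficiency in %) relative to the first entry."""
    total = cut_flow[0][1]
    return [(name, 100.0 * n / total) for name, n in cut_flow]


# Event counts for the ttH full simulation sample, as quoted in the table
tth_cut_flow = [
    ("all events", 42882),
    ("Wlep", 23810),
    ("Whad", 13834),
    ("4 b-jets", 362),
    ("mH window", 152),
]

for step, eff in cumulative_efficiencies(tth_cut_flow):
    print(f"{step:12s} {eff:6.2f} %")
```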

the shape of the irreducible background is reasonably flat

the reconstructed Higgs mass is close to the nominal with the width of ∼24 GeV


Expected signal after 30 fb−1 of luminosity

[Figure: mbb distributions, events per 15 GeV vs mbb (0 to 300 GeV), for L = 30 fb−1 and mH = 120 GeV. Left panel, "Signal and backgrounds": tt̄H, tt̄H matched, tt̄bb̄, tt̄jj, with S = 11/√61 = 1.5. Right panel: "Real" data]

Signal significance estimate

| | tt̄H | tt̄bb̄ | tt̄jj |
|---|---|---|---|
| Events in mH ± 30 GeV window | 152 | 60 | 1 |
| Final efficiency, % | 0.35 | 0.06 | 0.0003 |
| Events normalized to 30 (100) fb−1 | 11 (38) | 48 (160) | 13 (45) |

Signal significance: 1.5 (2.7)
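A simple way to cross-check numbers like these is the S/√B estimate, sketched below with the rounded event counts from the table. This is an assumption about the definition, not necessarily the exact procedure behind the slide: with these rounded inputs the estimate gives about 1.4 and 2.7, so the quoted 1.5 for 30 fb−1 may reflect unrounded inputs or a slightly different definition.

```python
import math


def significance(signal, backgrounds):
    """Simple S / sqrt(B) estimate, B being the sum of all backgrounds."""
    return signal / math.sqrt(sum(backgrounds))


# Rounded event counts in the mass window: ttH signal, then (ttbb, ttjj)
s30 = significance(11.0, [48.0, 13.0])     # 30 fb^-1
s100 = significance(38.0, [160.0, 45.0])   # 100 fb^-1
print(f"S/sqrt(B): {s30:.2f} (30 fb^-1), {s100:.2f} (100 fb^-1)")
```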

it’s hard to extract the signal, unless the background shape is well known from MC


Conclusions and plans

signal significance for tt̄H, H → bb̄ channel in this study comes out quite low: S = 1.5 for 30 fb−1 of integrated luminosity (ATLAS TDR had S = 3)

it would be difficult to extract the H → bb̄ signal from data without a good understanding of the background shape

ANN gives a small improvement in signal significance over the standard TDR evaluator (∼4%)

still a lot of things can be done to improve the signal significance

  - smarter jet reconstruction algorithms are needed to deal with overlapping of jets in high jet multiplicity events (smaller jet cone size, TopoCluster jets)
  - b-jet reconstruction efficiency can be increased by loosening the b-tagging cut, with the downside of reduced suppression against tt̄jj background → room for optimisation
  - with more statistics, the neural network can be retrained on full simulation data


Distributions of the ANN variables in signal sample: Fast simulation

[Figure: normalized distributions for matched combinations, not-matched combinations and MC truth. Panels: bWlMass, bWlDeltaRd, jjMass, jjDeltaRd, bWhMass, bWhDeltaRd, Delta2tMass, ttMass, ttDeltaRd, ttSumPt, ttHMass, ttHDeltaRd]


Neural network basics

Multilayer Perceptron

[Figure: network diagram. Input layer: combination input variables. Hidden layer. Output layer: probability of being a signal combination]

Hidden layer: $h_j = f\left(\sum_i w_{ij} x_i + \theta_j\right)$, with $f(x) = \frac{1}{1 + e^{-x}}$

Output layer: $O = \sum_j w_j h_j + \theta_o$

Error: $E = \frac{1}{N} \sum (O - T)^2$

The ROOT built-in class TMultiLayerPerceptron is used as the neural network (1 hidden layer with 10 nodes)

11500 matched and 12500 non-matched combinations were used to train the neural network
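The formulas on this slide can be illustrated with a plain-Python forward pass. This is only a sketch of the equations for a network with one hidden layer, not the TMultiLayerPerceptron implementation; the weights are placeholders rather than trained values, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Activation function f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_output(x, w_hidden, theta_hidden, w_out, theta_out):
    """Forward pass: h_j = f(sum_i w_ij x_i + theta_j), O = sum_j w_j h_j + theta_o.

    w_hidden[j] holds the input weights of hidden node j.
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(w_j, x)) + theta_j)
        for w_j, theta_j in zip(w_hidden, theta_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + theta_out


def mean_squared_error(outputs, targets):
    """E = (1/N) * sum (O - T)^2 over the training combinations."""
    n = len(outputs)
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n


# 7 input variables and 10 hidden nodes; placeholder (untrained) weights
n_inputs, n_hidden = 7, 10
w_hidden = [[0.0] * n_inputs for _ in range(n_hidden)]
theta_hidden = [0.0] * n_hidden
w_out = [0.1] * n_hidden
theta_out = 0.2

x = [1.0] * n_inputs
print(mlp_output(x, w_hidden, theta_hidden, w_out, theta_out))
```

In training, the target T would be 1 for matched and 0 for not-matched combinations (an assumption consistent with the output being read as a signal probability), and the weights would be adjusted to minimise E.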

