Bill Atwood, Nov. 2002, GLAST

Classification PSF Analysis

• A New Analysis Tool: Insightful Miner
• Classification Trees
• From Cuts to Classification Trees: Recasting of the GLAST PSF Analysis
• Energy Dependencies
• Present status of GLAST PSFs

A Data Mining Tool

A Miner Analysis Program!

Miner Details

What is a Data Miner?

o A graphical user programming environment

o An ensemble of Data Manipulation Tools

o A Set of Data Modelling Tools

o A “widget” scripting language

o An interface to databases

Why use a Data Miner?

o Fast and Easy prototyping of Analysis

o Encourages “exploration”

o Allows a more “Global” View of Analysis

INPUT OUTPUT

A Properties Browser to set parameters

A Traditional “CUT”

Classification Trees

Root

Branch 1

Branch 2

Given a “categorical variable”, split the data into two pieces using the “best” independent continuous variable

Example: VTX.Type = 1 if the “vertex” direction is best

2 if the “best-track” direction is best

Use “Entropy” to decide which independent variable to use:

$\mathrm{Entropy}_i = -\sum_k p_{ik} \log(p_{ik})$

where k runs over the categories, i labels the ith node, and p_ik is the fraction of the events in node i that belong to category k

(There are other criteria)

Continue the process, treating each branch as a new “root.” Terminate according to the statistics in the last node and/or the change in Entropy
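The entropy-based split described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Insightful Miner's implementation: the function names, the midpoint threshold rule, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of one node: -sum_k p_k * log(p_k) over the categories present."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Scan thresholds on one continuous variable.

    Returns (threshold, weighted entropy of the two branches); the threshold
    is None if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, entropy(labels))
    for j in range(1, n):
        lo, hi = pairs[j - 1][0], pairs[j][0]
        if lo == hi:
            continue  # equal values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:j]]
        right = [lab for _, lab in pairs[j:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best

# Toy data: a made-up continuous variable and a VTX.Type-like label (1 or 2).
x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
vtx_type = [1, 1, 1, 2, 2, 2]
threshold, score = best_split(x, vtx_type)
```

To choose among several independent variables, one would run `best_split` on each and keep the variable with the lowest weighted entropy, then repeat on each branch until a stopping rule is met.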

Example: Classification Tree from Miner

Classification Trees

Why use Classification Trees?

1. Simplicity of method – recursive application of a decision making rule

2. Easily captures non-linear behavior in predictors as well as interactions among them

3. Not limited to just 2 categories

There are numerous texts on this subject.

In the following analysis Classification Trees will be used to:

Separate out the good “vertex” events

Predict how “good” an event really is

GLAST PSF Analysis

This portion of the code:

o Reads in the data

o Culls out bad data

o Adds new columns for analysis

o Makes Global Cuts

o Splits the data into 2 pieces: Thin Radiators and Thick Radiators

(TKR.1.z0 > 250)

( ACD.DOCA > 350 & Energy > .5*MC.Energy)
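A minimal sketch of the cut-and-split step, written in plain Python rather than as a Miner network. It assumes the ACD.DOCA and Energy expression is the global cut and that TKR.1.z0 > 250 is the thin/thick split; the slide does not say which side of 250 is "thin", so the two pieces are left unnamed here. The event values are invented for illustration.

```python
# Hypothetical event records; the keys mirror the column names on the slide.
events = [
    {"TKR.1.z0": 300.0, "ACD.DOCA": 400.0, "Energy": 90.0, "MC.Energy": 100.0},
    {"TKR.1.z0": 120.0, "ACD.DOCA": 500.0, "Energy": 80.0, "MC.Energy": 100.0},
    {"TKR.1.z0": 310.0, "ACD.DOCA": 100.0, "Energy": 95.0, "MC.Energy": 100.0},
    {"TKR.1.z0": 280.0, "ACD.DOCA": 450.0, "Energy": 30.0, "MC.Energy": 100.0},
]

def passes_global_cut(ev):
    """Assumed global cut: ACD.DOCA > 350 & Energy > .5*MC.Energy."""
    return ev["ACD.DOCA"] > 350 and ev["Energy"] > 0.5 * ev["MC.Energy"]

# Apply the global cut, then divide the survivors on TKR.1.z0 > 250.
kept = [ev for ev in events if passes_global_cut(ev)]
z0_above = [ev for ev in kept if ev["TKR.1.z0"] > 250]
z0_below = [ev for ev in kept if not ev["TKR.1.z0"] > 250]
```

Every event that survives the global cut lands in exactly one of the two pieces, so each piece can be analyzed on its own afterwards.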

The VTX Classification Tree

Relative amounts of Categories

Relative amount of Data

CPA: To Vertex or not to Vertex?

Probability is not continuous: it is essentially binned by the finite number of leaves (ending nodes)

There is a “gap” at 0.5; use that to determine which solution to use

Do the Vertex Split!

Use 2-Track Solution

Use 1-Track Solution

The data are now divided into 2 subsets according to the Probability that the 2-Track (“vertex”) solution is best.

No data have been eliminated: failed vertex solutions are tried again as 1-Track events
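The split on the tree's probability can be sketched as follows. The column name `Prob.VTX`, the event records, and the choice to send a probability of exactly 0.5 to the 1-Track subset are assumptions made for the example; the slides only state that the gap at 0.5 separates the two solutions.

```python
def vertex_split(events, prob_key="Prob.VTX", gap=0.5):
    """Divide events by the tree's probability that the 2-Track solution is best.

    No event is dropped: anything not sent to the 2-Track subset is kept
    in the 1-Track subset.
    """
    two_track = [ev for ev in events if ev[prob_key] > gap]
    one_track = [ev for ev in events if not ev[prob_key] > gap]
    return two_track, one_track

# Made-up events; a real tree would give only a few distinct leaf probabilities.
events = [
    {"id": 1, "Prob.VTX": 0.91},
    {"id": 2, "Prob.VTX": 0.12},
    {"id": 3, "Prob.VTX": 0.67},
    {"id": 4, "Prob.VTX": 0.33},
]
two_track, one_track = vertex_split(events)
```

Because the leaf probabilities are discrete and none sits near 0.5, the exact handling of the boundary value should not matter in practice.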

Predictor created by Classification Tree

Rename probability column

From “Thin”

Split

Bin the PSF

Continuous Variable → Categorical Variable

Target Class: Class #1 – MS PSF Limited Bin
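Turning a continuous variable into a categorical one amounts to assigning each value to a bin. The sketch below shows one way to do that; the bin edges and input values are hypothetical, since the slides do not give the edges used, and it assumes class 1 is the bin with the smallest values.

```python
from bisect import bisect_right

def psf_class(value, edges):
    """Return a 1-based class number; class 1 holds values below the first edge."""
    return bisect_right(edges, value) + 1

edges = [1.0, 2.0, 4.0]                # hypothetical bin edges
errors = [0.4, 1.0, 1.7, 3.9, 12.0]    # made-up continuous values
classes = [psf_class(e, edges) for e in errors]
```

A classification tree can then be trained with the class number as its target, with class 1 as the class of interest.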

2 Track Classification Tree

1 Track Classification Tree

Combining Results

Example PSFs at FoM Max

100 MeV: PSF-68 = 2.7°, 95/68 = 2.65

1000 MeV: PSF-68 = 0.35°, 95/68 = 2.3

10000 MeV: PSF-68 = 0.1°, 95/68 = 2.9
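For readers who want to reproduce numbers of this kind, here is a minimal sketch of how a containment radius and the 95/68 ratio could be computed from a list of angular errors. It assumes PSF-68 means the angle containing 68% of the events and that 95/68 is the ratio of the 95% to the 68% containment angle; the input values are made up and do not correspond to the results quoted above.

```python
import math

def containment_radius(errors, fraction):
    """Smallest listed error such that at least `fraction` of events lie within it."""
    ordered = sorted(errors)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]

# Made-up angular errors in degrees, for illustration only.
errors = [0.1 * k for k in range(1, 101)]
psf68 = containment_radius(errors, 0.68)
psf95 = containment_radius(errors, 0.95)
ratio_95_68 = psf95 / psf68
```

A ratio well above what a Gaussian-like distribution would give indicates long tails in the PSF, which is why the ratio is quoted alongside PSF-68.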

Before and After Trees

PSF: 2.1°

95%/68%: 2.34

Aeff: 1387 cm²

Using Classification Trees

Before and After Trees

[Figure: two plots versus VTX Angle (0.0 to 0.6). One shows PSF68 together with the PSF95/PSF68 ratio; the other shows PSF68 together with Aeff.]

Best results obtained using the “cuts” to achieve a good PSF

PSF: 2.1°

95%/68%: 2.34

Aeff: 1387 cm²

Using Classification Trees