Provide tools for the statistical comparison of distributions equivalent reference distributions ...

Post on 13-Jan-2016


Provide tools for the statistical comparison of distributions:

• equivalent reference distributions

• experimental measurements

• data from reference sources

• functions deriving from theoretical calculations or fits

Detector monitoring

Simulation validation

Reconstruction vs. expectation

Regression testing

Physics analysis

Data analysis in HEP

Qualitative evaluation

Quantitative evaluation

GoF statistical toolkit

A project to develop a statistical comparison system

Comparison of distributions

Goodness of fit testing

• Unified Software Development Process, specifically tailored to the project
– practical guidance and tools from the RUP
– both rigorous and lightweight
– mapping onto ISO 15504

• Guidance from ISO 15504

• Incremental and iterative life cycle model

Software process guidelines

SPIRAL APPROACH

• The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a long time scale
– to be extensible, to accommodate future evolutions of the requirements

• Component-based approach
– to facilitate re-use and integration in different frameworks

• AIDA
– adopt a (HEP) standard
– no dependence on any specific analysis tool

Architectural guidelines

The algorithms are specialised on the kind of distribution (binned/unbinned)

Every algorithm has been rigorously tested!

Documentation available:

http://www.ge.infn.it/geant4/analysis/HEPstatistics/

• Applies to binned distributions

• It can also be useful for unbinned distributions, but the data must first be grouped into classes

• Cannot be applied if the theoretical (expected) frequency in any class is < 5

– When a class falls below this minimum, one could merge contiguous classes until the minimum theoretical frequency is reached

– Otherwise one could use Yates' formula

Chi-squared test
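The class-merging rule above can be sketched in a few lines. This is a minimal illustration, not the toolkit's implementation: the function names `merge_low_bins` and `pearson_chi2`, the left-to-right merging order and the sample counts are all assumptions made for the example.

```python
def merge_low_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins, left to right, until every merged bin
    has an expected count of at least min_expected (assumed rule)."""
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # Fold any leftover tail into the last merged bin.
    if acc_exp > 0 or acc_obs > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the first and last two bins are below the minimum.
observed = [1, 3, 10, 12, 4, 2]
expected = [2.0, 3.0, 9.0, 11.0, 4.0, 3.0]
obs_m, exp_m = merge_low_bins(observed, expected)
chi2 = pearson_chi2(obs_m, exp_m)
print(obs_m, exp_m, chi2)
```

Merging reduces the number of classes, so the number of degrees of freedom used to interpret the statistic has to be reduced accordingly.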

[Figure: original distributions and the corresponding empirical distribution functions; vertical axis from 0 to 1]

• Kolmogorov-Smirnov test: $D_{mn} = \sup_x |F_m(x) - G_n(x)|$

• Goodman approximation of KS test: $\chi^2 = 4 D_{mn}^2 \, \frac{mn}{m+n}$

• Kuiper test: $D^* = \max_x [F_0(x) - F_T(x)] + \max_x [F_T(x) - F_0(x)]$
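A minimal sketch of the supremum statistics for two unbinned samples follows. It assumes the two-sample form throughout (the Kuiper statistic is computed between the two empirical distribution functions, as an analogue of the F_0 / F_T form); the function names are hypothetical and are not the toolkit's API.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def supremum_statistics(sample_a, sample_b):
    """Return (Kolmogorov-Smirnov D, Kuiper D*) for two samples."""
    f, g = ecdf(sample_a), ecdf(sample_b)
    # Both EDFs are right-continuous step functions, so the extreme
    # differences are attained at the pooled sample points.
    points = sorted(set(sample_a) | set(sample_b))
    diffs = [f(x) - g(x) for x in points]
    d_plus = max(max(diffs), 0.0)
    d_minus = max(-min(diffs), 0.0)
    return max(d_plus, d_minus), d_plus + d_minus


def goodman_chi2(d_mn, m, n):
    """Goodman approximation: 4 D^2 mn/(m+n), compared to chi2 with 2 dof."""
    return 4.0 * d_mn ** 2 * m * n / (m + n)


a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
ks, kuiper = supremum_statistics(a, b)
chi2 = goodman_chi2(ks, len(a), len(b))
print(ks, kuiper, chi2)
```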

More sophisticated algorithms: unbinned distributions

SUPREMUM STATISTICS

• Cramer-von Mises test: $\omega^2 = \int [F_0(x) - F_T(x)]^2 \, dF(x)$

• Anderson-Darling test: $A^2 = \int \frac{[F_0(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
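For a sample compared with a fully specified theoretical distribution, both integrals reduce to finite sums over the ordered data. The sketch below uses the standard computational forms of the n-scaled statistics (n times the integrals), which is an assumption about normalisation; the function names and the uniform reference distribution are chosen only for illustration.

```python
import math


def cramer_von_mises(sample, cdf):
    """n * omega^2 = 1/(12n) + sum_i [(2i-1)/(2n) - F_T(x_(i))]^2."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - cdf(x)) ** 2 for i, x in enumerate(xs, start=1)
    )


def anderson_darling(sample, cdf):
    """A^2 = -n - (1/n) sum_i (2i-1) [ln F_T(x_(i)) + ln(1 - F_T(x_(n+1-i)))].

    Requires 0 < F_T(x) < 1 at every sample point.
    """
    xs = sorted(sample)
    n = len(xs)
    total = sum(
        (2 * i - 1) * (math.log(cdf(xs[i - 1])) + math.log(1.0 - cdf(xs[n - i])))
        for i in range(1, n + 1)
    )
    return -n - total / n


def uniform_cdf(x):
    """Theoretical reference: uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


sample = [0.9, 0.1, 0.5]
w2 = cramer_von_mises(sample, uniform_cdf)
a2 = anderson_darling(sample, uniform_cdf)
print(w2, a2)
```

The 1 / (F_T [1 - F_T]) weight in the Anderson-Darling integrand shows up here as the two logarithms, which grow large near the tails of the reference distribution.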

These algorithms are so powerful that we decided to implement their equivalents for binned distributions:

• Fisz-Cramer-von Mises test: $t = \frac{n_1 n_2}{(n_1 + n_2)^2} \sum_i [F_1(x_i) - F_2(x_i)]^2$

• k-sample Anderson-Darling test: $A_K^2 = \frac{n-1}{n^2 (k-1)} \sum_i \frac{1}{n_i} \sum_k h_k \frac{(n F_{ik} - n_i H_k)^2}{H_k (n - H_k) - n h_k / 4}$
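The Fisz-Cramer-von Mises statistic is simple to evaluate for two histograms. The sketch below assumes both histograms share the same binning and that the cumulative fractions are evaluated at the upper edge of each bin; the function name is hypothetical and no p-value is computed.

```python
from itertools import accumulate


def fisz_cramer_von_mises(counts_1, counts_2):
    """t = n1*n2 / (n1+n2)^2 * sum_i [F1(x_i) - F2(x_i)]^2 for binned data."""
    if len(counts_1) != len(counts_2):
        raise ValueError("histograms must share the same binning")
    n1, n2 = sum(counts_1), sum(counts_2)
    # Cumulative fractions at the upper edge of each bin.
    cdf_1 = [c / n1 for c in accumulate(counts_1)]
    cdf_2 = [c / n2 for c in accumulate(counts_2)]
    squared = sum((a - b) ** 2 for a, b in zip(cdf_1, cdf_2))
    return n1 * n2 / (n1 + n2) ** 2 * squared


t = fisz_cramer_von_mises([2, 2], [0, 4])
print(t)
```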

More powerful algorithms: unbinned distributions

binned distributions

TESTS CONTAINING A WEIGHTING FUNCTION

χ² loses information in a test for unbinned distributions by grouping the data into cells.

Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires n^{4/5} observations compared to n observations for χ² to attain the same power.

Cramer-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov’s, since they make a comparison of the two distributions all along the range of x, rather than looking for a marked difference at one point

In terms of power:

χ² < supremum statistics tests < tests containing a weight function

Is χ² the most powerful algorithm?

The user is completely shielded from both statistical and computing complexity.

USER

EXTRACTS THE ALGORITHM WRITING ONE LINE OF CODE

STATISTICAL TOOLKIT

RESULT

User's point of view

• Simple user layer

• Only deal with AIDA objects and choice of comparison algorithm
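The idea of a thin user layer, where the user only supplies two distributions and picks an algorithm, can be illustrated with a small sketch. Everything here is hypothetical: `Comparator`, `ComparisonResult` and `max_cdf_distance` are invented names for the illustration, plain lists of bin counts stand in for AIDA objects, and none of this is the toolkit's actual interface.

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import Callable, Sequence


@dataclass
class ComparisonResult:
    """Hypothetical result object: algorithm name and distance."""
    algorithm: str
    distance: float


def max_cdf_distance(counts_a: Sequence[float], counts_b: Sequence[float]) -> float:
    """Largest difference between the cumulative fractions of two histograms."""
    n_a, n_b = sum(counts_a), sum(counts_b)
    cdf_a = [c / n_a for c in accumulate(counts_a)]
    cdf_b = [c / n_b for c in accumulate(counts_b)]
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))


class Comparator:
    """Hypothetical user layer: holds the chosen algorithm, hides the rest."""

    def __init__(self, algorithm: Callable[[Sequence[float], Sequence[float]], float]):
        self._algorithm = algorithm

    def compare(self, a: Sequence[float], b: Sequence[float]) -> ComparisonResult:
        return ComparisonResult(self._algorithm.__name__, self._algorithm(a, b))


# The user-facing part is a single line: choose the algorithm and compare.
result = Comparator(max_cdf_distance).compare([2, 2], [0, 4])
print(result)
```

Keeping the algorithm as a pluggable argument means that adding a new test does not change the code the user writes.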

Examples of practical applications

χ²_{N-S} = 0.267, ν = 28, p = 1

χ²_{N-L} = 1.315, ν = 28, p = 1

χ²_{N-S} = 0.532, ν = 28, p = 1

χ²_{N-L} = 1.928, ν = 28, p = 1

χ²_{N-S} = 0.373, ν = 28, p = 1

χ²_{N-L} = 5.882, ν = 28, p = 1

Geant4 simulations are statistically comparable with reference data (NIST database, http://www.nist.gov)

Legend: NIST, Geant4 Standard, Geant4 LowE

Chi-squared test

Microscopic validation of physics

χ² not appropriate (< 5 entries in some bins; physical information would be lost if rebinned)

Anderson-Darling

A_c (95%) = 0.752

Test beam at Bessy

Bepi-Colombo mission

[Figure: X-ray fluorescence spectrum in Iceland basalt (E_IN = 6.5 keV); axes: Energy (keV) vs. counts]

Very complex distributions

Experimental measurements are comparable with Geant4 simulations

Kolmogorov-Smirnov: D_{EXP-GEANT4} = 0.11, p = n.s. (not significant)

Goodman approximation of Kolmogorov-Smirnov: χ²_{EXP-GEANT4} = 3.8, ν = 2, p = n.s.
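The "n.s." verdict for the Goodman approximation can be verified directly, since for ν = 2 the chi-squared tail probability is a plain exponential. The 0.05 significance level used in the comparison below is the conventional choice and is an assumption here, not a value stated in the slides.

```latex
p = P(\chi^2_{\nu=2} \ge 3.8) = e^{-3.8/2} = e^{-1.9} \approx 0.15 > 0.05
```

A p-value of about 0.15 does not reject the hypothesis that the experimental and simulated distributions are compatible.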

Medical applications: hadron therapy

Experimental measurements are comparable with Geant4 simulations

Future developments

• Real-life distributions are not strictly limited to one dimension.

• For this reason the algorithms contained in the GoF Toolkit are going to be generalised to the case of higher-dimensional distributions.

• This is a big step forward in statistics and in physics data analysis as well.

Work in progress (I)

• The user will be able to compare their distributions with some theoretical reference distributions, such as uniform, Gaussian, Weibull, gamma, …

• Data handling: filtering

• Treatment of errors (uncertainties)

Work in progress (II)

• The GoF Toolkit is downloadable from the web: www.ge.infn.it/geant4/analysis/HEPstatistics/index.html

• Recent developments
– added new algorithms, improved design, improved documentation
– user examples, unit and system tests
– detailed statistical documentation

Status

• This is a new, up-to-date, easy-to-handle and powerful tool for statistical comparison in particle physics.

• It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP.

• AIDA interfaces allow its integration in any other data analysis tool.

Applications in: HEP, astrophysics, medical physics, …

Conclusions