Post on 13-Jan-2016
Provide tools for the statistical comparison of distributions:
• equivalent reference distributions
• experimental measurements
• data from reference sources
• functions deriving from theoretical calculations or fits
Use cases:
• Detector monitoring
• Simulation validation
• Reconstruction vs. expectation
• Regression testing
• Physics analysis
Data analysis in HEP
Qualitative evaluation
Quantitative evaluation
GoF statistical toolkit
A project to develop a statistical comparison system
Comparison of distributions
Goodness of fit testing
• Unified Software Development Process, specifically tailored to the project
– practical guidance and tools from the RUP
– both rigorous and lightweight
– mapping onto ISO 15504
• Guidance from ISO 15504
• Incremental and iterative life cycle model
Software process guidelines
SPIRAL APPROACH
• The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a long time scale
– to be extensible, to accommodate future evolutions of the requirements
• Component-based approach
– to facilitate re-use and integration in different frameworks
• AIDA
– adopt a (HEP) standard
– no dependence on any specific analysis tool
Architectural guidelines
The algorithms are specialised according to the kind of distribution (binned/unbinned)
Every algorithm has been rigorously tested!
Documentation available:
http://www.ge.infn.it/geant4/analysis/HEPstatistics/
• Applies to binned distributions
• It can also be useful for unbinned distributions, but the data must first be grouped into classes
• Cannot be applied if the theoretical (expected) frequency in any class is < 5
– When a class falls below this threshold, one can merge contiguous classes until the minimum theoretical frequency is reached
– Otherwise one can use Yates' formula
Chi-squared test
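The merging rule for sparse classes can be sketched in a few lines of Python. This is only an illustration, not the toolkit's implementation: the function names, the left-to-right merging order, and the choice of degrees of freedom (number of merged classes minus one, assuming the expected frequencies are normalised to the observed total and no parameters were fitted) are all assumptions.

```python
def merge_sparse_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins until every merged bin has expected >= min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # A sparse tail that never reached the threshold joins the last merged bin.
    if acc_obs > 0.0 or acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi_squared(observed, expected):
    """Pearson chi-squared statistic and degrees of freedom after merging."""
    obs, exp = merge_sparse_bins(observed, expected)
    statistic = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    dof = len(obs) - 1  # assumption: same normalisation, no fitted parameters
    return statistic, dof


statistic, dof = chi_squared([3, 4, 10, 12, 2], [2, 4, 11, 10, 4])
```

In this made-up example the first two bins and the last two bins are merged, leaving three classes and two degrees of freedom.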
[Figure: panels titled "Original distributions" and "Empirical distribution function"; axis scale from 0 to 1]
• Kolmogorov-Smirnov test: $D_{mn} = \sup_x |F_m(x) - G_n(x)|$
• Goodman approximation of KS test: $\chi^2 = 4 D_{mn}^2 \, \frac{mn}{m+n}$
• Kuiper test: $D^* = \max(F_0(x) - F_T(x)) + \max(F_T(x) - F_0(x))$
More sophisticated algorithms
unbinned distributions
SUPREMUM STATISTICS
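A minimal Python sketch of the supremum statistics follows. It assumes two unbinned samples and applies the Kuiper definition to the two empirical distribution functions rather than to an empirical and a theoretical one. The function names are illustrative and are not the toolkit's API. The Goodman p-value uses the fact that a chi-squared variable with 2 degrees of freedom has survival function exp(-x/2).

```python
import math
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical distribution function of a sorted sample, evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def supremum_statistics(sample_a, sample_b):
    """Return (Kolmogorov-Smirnov D, Kuiper D*) for two unbinned samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    d_plus = d_minus = 0.0
    for x in a + b:  # the EDFs only change at observed points
        diff = ecdf(a, x) - ecdf(b, x)
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return max(d_plus, d_minus), d_plus + d_minus


def goodman_chi2(d_mn, m, n):
    """Goodman approximation: chi2 = 4 D^2 m n / (m + n), with 2 degrees of freedom."""
    return 4.0 * d_mn ** 2 * m * n / (m + n)


def goodman_p_value(chi2):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


ks, kuiper = supremum_statistics([2.0, 3.0], [1.0, 4.0])
```

For these toy samples the two distribution functions cross, so the Kuiper statistic (sum of the largest deviations in each direction) is larger than the Kolmogorov-Smirnov distance.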
• Cramer-von Mises test: $\omega^2 = \int [F_0(x) - F_T(x)]^2 \, dF_T(x)$
• Anderson-Darling test: $A^2 = \int \frac{[F_0(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
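For an unbinned sample compared with a theoretical distribution, both integrals reduce to sums over the ordered observations. The sketch below uses the standard computational forms, which return n times the integrals (W² = n ω² and the usual A²). The function names and the uniform reference CDF are illustrative assumptions, and the sample values must lie strictly inside the support of the reference distribution so that the logarithms are defined.

```python
import math


def cramer_von_mises(sample, cdf):
    """W^2 = 1/(12 n) + sum_i [F_T(x_(i)) - (2i - 1)/(2n)]^2."""
    xs = sorted(sample)
    n = len(xs)
    total = sum((cdf(x) - (2 * i - 1) / (2 * n)) ** 2
                for i, x in enumerate(xs, start=1))
    return 1.0 / (12 * n) + total


def anderson_darling(sample, cdf):
    """A^2 = -n - (1/n) sum_i (2i - 1) [ln F_T(x_(i)) + ln(1 - F_T(x_(n+1-i)))]."""
    xs = sorted(sample)
    n = len(xs)
    total = sum((2 * i - 1) * (math.log(cdf(xs[i - 1]))
                               + math.log(1.0 - cdf(xs[n - i])))
                for i in range(1, n + 1))
    return -n - total / n


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used here as the reference."""
    return min(max(x, 0.0), 1.0)


w2 = cramer_von_mises([0.25, 0.5, 0.75], uniform_cdf)
a2 = anderson_darling([0.25, 0.5, 0.75], uniform_cdf)
```

The 1/[F(1 - F)] weight in the Anderson-Darling sum is what gives extra emphasis to the tails of the distribution.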
These algorithms are so powerful that we decided to implement their equivalents for binned distributions:
• Fisz-Cramer-von Mises test: $t = \frac{n_1 n_2}{(n_1 + n_2)^2} \sum_i [F_1(x_i) - F_2(x_i)]^2$
• k-sample Anderson-Darling test: $A_k^2 = \frac{n - 1}{n^2 (k - 1)} \sum_i \frac{1}{n_i} \sum_j h_j \frac{(n F_{ij} - n_i H_j)^2}{H_j (n - H_j) - n h_j / 4}$
More powerful algorithms
unbinned distributions
binned distributions
TESTS CONTAINING A WEIGHTING FUNCTION
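The Fisz-Cramer-von Mises statistic for two binned distributions can be sketched as follows. The sketch assumes both histograms share the same binning and evaluates each cumulative distribution at the upper edge of every bin. The function name and these conventions are assumptions, not the toolkit's code.

```python
from itertools import accumulate


def fisz_cramer_von_mises(counts_1, counts_2):
    """t = n1 n2 / (n1 + n2)^2 * sum_i [F1(x_i) - F2(x_i)]^2 for two histograms."""
    if len(counts_1) != len(counts_2):
        raise ValueError("histograms must share the same binning")
    n1, n2 = sum(counts_1), sum(counts_2)
    # Cumulative sums give the binned distribution functions up to a factor 1/n.
    total = sum((c1 / n1 - c2 / n2) ** 2
                for c1, c2 in zip(accumulate(counts_1), accumulate(counts_2)))
    return n1 * n2 / (n1 + n2) ** 2 * total


t = fisz_cramer_von_mises([1, 1, 0, 0], [0, 0, 1, 1])
```

Unlike a supremum statistic, every bin contributes to the sum, so differences spread over the whole range accumulate.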
χ² loses information in a test for unbinned distributions by grouping the data into cells.
Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires n^(4/5) observations, compared to n observations for χ², to attain the same power.
Cramer-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov’s, since they make a comparison of the two distributions all along the range of x, rather than looking for a marked difference at one point
In terms of power:
χ² < supremum statistics tests < tests containing a weight function
Is χ² the most powerful algorithm?
The user is completely shieldedshielded from both statistical and computing complexity.
USER
EXTRACTS THE ALGORITHM WRITING ONE LINE OF CODE
STATISTICAL TOOLKIT
RESULT
User's point of view
• Simple user layer
• Only deal with AIDA objects and choice of comparison algorithm
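The idea of a thin user layer, where the user only supplies two distributions and names an algorithm, can be illustrated with a small Python facade. Everything here is hypothetical: the `compare` function, the algorithm names and the plain-list inputs are stand-ins for illustration and do not reproduce the toolkit's interface, which works on AIDA objects.

```python
from bisect import bisect_right


def _kolmogorov_smirnov(a, b):
    """Largest distance between the two empirical distribution functions."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)


def _chi_squared(observed, expected):
    """Pearson chi-squared between observed and expected bin contents."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical registry: the statistics stay hidden behind one entry point.
ALGORITHMS = {
    "kolmogorov-smirnov": _kolmogorov_smirnov,
    "chi-squared": _chi_squared,
}


def compare(first, second, algorithm):
    """Run the named comparison algorithm on two distributions."""
    try:
        statistic = ALGORITHMS[algorithm]
    except KeyError:
        raise ValueError(f"unknown algorithm: {algorithm!r}") from None
    return statistic(first, second)


# The user's side is a single line: pick the data and the algorithm.
distance = compare([1.0, 2.0], [3.0, 4.0], "kolmogorov-smirnov")
```

Adding a new test in this sketch means registering one more function, without touching the user-facing call.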
Examples of practical applications
χ²(N-S) = 0.267, ν = 28, p = 1
χ²(N-L) = 1.315, ν = 28, p = 1
χ²(N-S) = 0.532, ν = 28, p = 1
χ²(N-L) = 1.928, ν = 28, p = 1
χ²(N-S) = 0.373, ν = 28, p = 1
χ²(N-L) = 5.882, ν = 28, p = 1
Geant4 simulations are statistically comparable with reference data (NIST database, http://www.nist.gov)
[Figure legend: NIST, Geant4 Standard, Geant4 LowE]
Chi-squared test
Microscopic validation of physics
χ² not appropriate (< 5 entries in some bins; physical information would be lost if rebinned)
Anderson-Darling
A_c (95%) = 0.752
Test beam at Bessy
Bepi-Colombo mission
[Figure: counts vs. energy (keV)]
X-ray fluorescence spectrum in Iceland basalt (E_IN = 6.5 keV)
Very complex distributions
Experimental measurements are comparable with Geant4 simulations
Kolmogorov-Smirnov: D(EXP-GEANT4) = 0.11, p = n.s.
Goodman approximation of Kolmogorov-Smirnov: χ²(EXP-GEANT4) = 3.8, ν = 2, p = n.s.
Medical applications: hadron therapy
Experimental measurements are comparable with Geant4 simulations
Future developments
• Real-life distributions are not strictly limited to one dimension.
• For this reason the algorithms contained in the GoF Toolkit are going to be generalised to the case of higher-dimensional distributions.
• This is a big step forward in statistics and in physics data analysis as well.
Work in progress (I)
• Users will be able to compare their distributions with theoretical reference distributions, such as:
– uniform
– Gaussian
– Weibull
– gamma, …
• Data handling: filtering
• Treatment of errors (uncertainties)
Work in progress (II)
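A comparison against a theoretical reference distribution can be sketched with a one-sample Kolmogorov-Smirnov distance. The CDF helpers, their parameter names and the choice of this particular test are assumptions made for illustration. They are not the planned toolkit interface.

```python
import math


def uniform_cdf(x, low=0.0, high=1.0):
    """CDF of the uniform distribution on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def gaussian_cdf(x, mean=0.0, sigma=1.0):
    """CDF of the Gaussian distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def weibull_cdf(x, scale=1.0, shape=1.0):
    """CDF of the two-parameter Weibull distribution."""
    return 0.0 if x <= 0.0 else 1.0 - math.exp(-((x / scale) ** shape))


def ks_one_sample(sample, cdf):
    """Largest distance between the sample EDF and a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The EDF jumps from (i - 1)/n to i/n at x; check both sides of the step.
        distance = max(distance, i / n - f, f - (i - 1) / n)
    return distance


d_uniform = ks_one_sample([0.1, 0.4, 0.7], uniform_cdf)
```

Any other reference distribution can be plugged in by passing its CDF as the second argument.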
• The GoF Toolkit is downloadable from the web: www.ge.infn.it/geant4/analysis/HEPstatistics/index.html
• Recent developments
– added new algorithms, improved design, improved documentation
– user examples, unit and system tests
– detailed statistical documentation
Status
• This is a new, up-to-date, easy-to-handle and powerful tool for statistical comparison in particle physics.
• It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP.
• AIDA interfaces allow its integration in any other data analysis tool.
Applications in: HEP, astrophysics, medical physics, …
Conclusions