Szapudi's talk – False Detection Rate
Simultaneous hypothesis testing - Setting the statistical significance
Detections of Non-Gaussianity in CMB observations
•Brief review of detections/methods used to set the statistical significance.
•Application of FDR to individual statistical methods.
•Application of FDR to a combination of statistical methods?
•What is the appropriate method for assessing the statistical significance of localized detections?
CMB: Non-Gaussianity tests
Non-Gaussianity tests
Real space
Spherical Harmonic space
Wavelet space
Blind tests – The alternative to the null hypothesis is not specified.
Non-Gaussianity tests: Real Space
Courtesy of WMAP Science Team
Based on the temperature fluctuation observed at each pixel i, ΔT(i)
Tests (detections):
•Genus (Park 2003).
•N-point correlation function (Eriksen et al. 2003, Eriksen et al. 2004).
•Minkowski functionals and length of skeleton (Eriksen et al. 2004).
•Extrema (Larson & Wandelt 2004)
•2-point correlation function of maxima and minima (Tojeiro et al. 2005, Larson & Wandelt 2005)
Non-Gaussianity Tests: Spherical Harmonic Space
$\Delta T(\theta,\psi) = \sum_{l}\sum_{m} a_{lm}\, Y_{lm}(\theta,\psi)$
Based on the complex coefficients $a_{lm}$
Tests (detections):
•Power spectrum distribution (Eriksen et al. 2003, Hansen et al. 2004).
•Correlations between adjacent multipoles (Prunet et al. 2004).
Non-Gaussianity Tests: Wavelet Space
Based on wavelet coefficients calculated at each pixel i, at a given scale R, wv(i,R).
Tests (detections):
•Kurtosis – Spherical Mexican Hat Wavelet (Vielva et al. 2003, Mukherjee & Wang 2004, McEwen et al. 2004, Liu & Zhang 2005, Cruz et al. 2006).
•Skewness – Real Morlet Wavelet (McEwen et al. 2004, Liu & Zhang 2005).
•Number, area and volume of spots – SMHW (Cruz et al. 2004, Cruz et al. 2006).
•Higher Criticism – SMHW (Cayon et al. 2005, Cruz et al. 2006).
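The skewness and kurtosis statistics listed above can be sketched as follows. This is a minimal illustration, not the code used in the cited papers: the function name `skewness_kurtosis` is hypothetical, the input is assumed to be the wavelet coefficients wv(i,R) of the unmasked pixels at one scale R, and kurtosis is assumed to mean excess kurtosis (zero for a Gaussian).

```python
import numpy as np

def skewness_kurtosis(coeffs):
    """Return (skewness, excess kurtosis) of a 1-D array of coefficients.

    Hypothetical helper: `coeffs` stands for wv(i, R) at a single scale R.
    """
    w = np.asarray(coeffs, dtype=float)
    d = w - w.mean()                 # deviations from the mean
    sigma = d.std()                  # population standard deviation (ddof=0)
    skew = np.mean(d**3) / sigma**3  # third standardized moment
    kurt = np.mean(d**4) / sigma**4 - 3.0  # fourth standardized moment minus 3
    return skew, kurt
```

Evaluating the same function on each simulated map at each scale gives the reference distribution against which the data values are compared.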
Testing the null hypothesis through Monte Carlo simulations
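Simulating a CMB map:
1) At each pixel i (θ,ψ):
$\Delta T(\theta,\psi) = \sum_{l}\sum_{m} a_{lm}\, Y_{lm}(\theta,\psi)$, with $\mathrm{Re}(a_{lm}), \mathrm{Im}(a_{lm}) \sim N(0, B_l C_l/2)$
$B_l$ – Antenna; $C_l$ – Cosmology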
2) Add noise, with dispersion given by the simulated experiment.
Mask – Galaxy plus point sources (pixels inside the mask are set to zero).
Monopole and dipole removal.
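The simulation steps above can be sketched in NumPy as follows. This is a rough sketch under stated assumptions, not the pipeline used for the talk. The names `draw_alm` and `add_noise_and_mask` are hypothetical. The variance B_l C_l/2 is taken literally from the slide (whether the beam enters squared depends on how B_l is defined). The m = 0 coefficient is drawn real with variance B_l C_l, as required for a real-valued map. The spherical-harmonic synthesis from a_lm to pixels and the dipole removal are left to a dedicated library and are not shown.

```python
import numpy as np

def draw_alm(cl, bl, rng):
    """Draw Gaussian a_lm for m = 0..l at every multipole l.

    cl, bl: sequences indexed by l (power spectrum and beam factor).
    Returns a list whose entry l is a complex array of length l + 1.
    """
    alm = []
    for l in range(len(cl)):
        sigma = np.sqrt(bl[l] * cl[l] / 2.0)   # N(0, B_l C_l / 2) as on the slide
        a = rng.normal(0.0, sigma, l + 1) + 1j * rng.normal(0.0, sigma, l + 1)
        a[0] = rng.normal(0.0, np.sqrt(2.0) * sigma)  # m = 0 is purely real
        alm.append(a)
    return alm

def add_noise_and_mask(tmap, noise_sigma, mask, rng):
    """Add Gaussian pixel noise, zero the masked pixels, remove the monopole.

    mask: boolean array, True where a pixel is excluded (Galaxy, point sources).
    noise_sigma: scalar or per-pixel noise dispersion (assumed input).
    """
    noisy = tmap + rng.normal(0.0, 1.0, tmap.shape) * noise_sigma
    noisy = np.where(mask, 0.0, noisy)    # zeros at the pixels in the mask
    keep = ~mask
    noisy[keep] -= noisy[keep].mean()     # monopole removal outside the mask
    return noisy
```

Passing a seeded generator such as `np.random.default_rng(0)` as `rng` keeps a set of simulations reproducible.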
Testing the null hypothesis through Monte Carlo simulations
• Map ΔT(i) – Test statistic: over the whole map, or different combinations of pixels (N-point corr.)
• a_lm – Test statistic: several multipoles, or several combinations of multipoles (bispectrum)
• Wavelet coeff. wv(i,R) at scale R – Test statistic: over the whole map at each scale, and/or above/below threshold (spots)
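Whatever the test statistic, the Monte Carlo comparison reduces to ranking the data value among the simulated values. A minimal sketch, assuming the statistic has already been evaluated on the data and on each simulation (the function name `upper_tail_fraction` is hypothetical):

```python
import numpy as np

def upper_tail_fraction(stat_data, stat_sims):
    """Fraction of simulations whose statistic is >= the data value.

    A small fraction means the data value is rarely reached under the
    null hypothesis (Gaussian simulations).
    """
    sims = np.asarray(stat_sims, dtype=float)
    return np.count_nonzero(sims >= stat_data) / sims.size
```

With n simulations the smallest non-zero fraction that can be resolved is 1/n, which limits the significance level that can be quoted.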
Statistical Hypothesis Testing
• Testing the null hypothesis with a single configuration (confidence level).
• Simultaneous hypotheses testing:
- χ² example (McEwen et al. 2004)
$\chi^2 = \sum_{i=1}^{N_{\mathrm{stat}}} \sum_{j=1}^{N_{\mathrm{stat}}} (S_i - \bar{S}_i)\,(C^{-1})_{ij}\,(S_j - \bar{S}_j)$
SMHW kurtosis – scale 9 above the 99% confidence level.
SMHW Skewness and kurtosis – 24 statistics. Detection at the 99.9% significance level.
- Hypothesis test, Larson & Wandelt, astro-ph/0505046 (maximum risk of false detection at the same level as the claimed significance)
- Conservative significance level, based on the marginal distribution of all statistics. Ex.: above the 95.3% significance level.
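The χ² combination of several statistics can be sketched as below. This is an illustration under assumptions, not the McEwen et al. implementation: the mean and the covariance matrix C are assumed to be estimated from the same Monte Carlo simulations that provide the reference χ² distribution, and the function names are hypothetical.

```python
import numpy as np

def chi2_statistic(s, mean, cov_inv):
    """chi^2 = sum_ij (S_i - mean_i) (C^-1)_ij (S_j - mean_j)."""
    d = np.asarray(s, dtype=float) - mean
    return float(d @ cov_inv @ d)

def chi2_significance(s_data, s_sims):
    """Return (chi2 of the data, fraction of simulations with smaller chi2).

    s_sims: array of shape (n_sims, n_stat), one row of statistics per
    simulated map; s_data: the n_stat statistics measured on the data.
    """
    sims = np.asarray(s_sims, dtype=float)
    mean = sims.mean(axis=0)
    cov = np.atleast_2d(np.cov(sims, rowvar=False))  # n_stat x n_stat
    cov_inv = np.linalg.inv(cov)
    chi2_data = chi2_statistic(s_data, mean, cov_inv)
    chi2_sims = np.array([chi2_statistic(s, mean, cov_inv) for s in sims])
    return chi2_data, float(np.mean(chi2_sims < chi2_data))
```

The returned fraction plays the role of the quoted significance level. Because it is read off the simulated χ² values, it does not rely on the statistic following an analytic χ² distribution.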
Statistical Hypothesis Testing (FDR)
• Simultaneous hypotheses testing – False Detection Rate
- Control of the fraction of false discoveries (detections) over the total number of discoveries.
- No assumption on the Gaussianity of the error distribution.
- Correlations between statistics can be taken into account (?).
FDR: α × 100% of discoveries may be mistakes.
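A minimal sketch of an FDR selection, assuming the standard Benjamini-Hochberg step-up rule with the factor c_N = 1 for uncorrelated tests and c_N = Σ 1/i (i = 1..N) for correlated ones. The slides do not spell out the procedure, so this choice and the function name `fdr_detections` are assumptions. The inputs are the p-values of the N individual tests, for instance one per wavelet scale.

```python
import numpy as np

def fdr_detections(p_values, alpha, correlated=False):
    """Boolean array marking the tests declared detections at FDR level alpha.

    Sort the p-values, find the largest k with p_(k) <= k * alpha / (c_N * N),
    and declare the k smallest p-values detections.
    """
    p = np.asarray(p_values, dtype=float)
    n = p.size
    # c_N = 1 without correlations, harmonic sum with correlations
    c_n = np.sum(1.0 / np.arange(1, n + 1)) if correlated else 1.0
    order = np.argsort(p)
    thresholds = np.arange(1, n + 1) * alpha / (c_n * n)
    passed = np.nonzero(p[order] <= thresholds)[0]
    detected = np.zeros(n, dtype=bool)
    if passed.size:
        detected[order[: passed[-1] + 1]] = True   # all tests up to the last pass
    return detected
```

For example, `fdr_detections([0.001, 0.02, 0.04, 0.5], alpha=0.05)` flags the first two tests, while the same call with `correlated=True` flags only the first, because the thresholds shrink by the factor c_N.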
False Discovery Rate – Ex. Wavelet Space
[Figure panels; 16 tests per statistical test]
No correlations, α=0.05, detection scales=9,8,7
Correlations, α=0.1, detection scales=9,8,7
No correlations, α=0.1, detection scales=9,8
Correlations, α=0.2, detection scales=9
Figs from Cruz et al. 2006
False Discovery Rate – Ex. Wavelet Space
Statistical test    | No correlations, CN=1     | Correlations, CN=3.38 (16 scales)
                    | α     | scales (detect.)  | α     | scales (detect.)
Kurtosis            | 0.05  | 9,8,7             | 0.1   | 9,8,7
Max                 | 0.05  | 9,10              | 0.2   | 9,10
Higher Criticism    | 0.1   | 9,8               | 0.2   | 9
Area above 3σ       | 0.05  | 8,9               | 0.2   | 8,9
Area above 3.5σ     | 0.05  | 9,8               | 0.05  | 9,8
Area above 4σ       | 0.05  | 9,8,10            | 0.2   | 9,8,10
Area above 4.5σ     | 0.1   | 9,8               | 0.2   | 9
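The two CN values in the table headings are consistent with the usual FDR threshold rule, written out below as a reading aid. The rule itself is an assumption (the standard Benjamini-Hochberg form, which the slides do not state); $p_{(k)}$ denotes the k-th smallest p-value among the N = 16 scales.

```latex
p_{(k)} \le \frac{k\,\alpha}{c_N\,N}, \qquad
c_N = 1 \ \text{(no correlations)}, \qquad
c_N = \sum_{i=1}^{N} \frac{1}{i} \ \text{(correlations)}, \qquad
c_{16} = \sum_{i=1}^{16} \frac{1}{i} \approx 3.38
```

At fixed α the thresholds with correlations are therefore about 3.38 times stricter, which is why a larger α is generally needed in the right-hand columns to recover the same scales.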
Acknowledgement – M. Cruz
What is the appropriate method for setting the statistical significance?
• Only considering detections based on SMHW: 112 statistical tests.
- Different statistical methods.
- Several scales.
• Is it possible to assess the statistical significance of all detections together?
• The pixels behind some of the detections are localized.