Date post: 03-Sep-2019
Page 1: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh

Correction for multiple comparisons

Cyril Pernet, PhD
SBIRC/SINAPSE, University of Edinburgh

Page 2:

Overview

- Multiple comparisons correction procedures
- Levels of inference (set, cluster, voxel)
- Circularity issues

Page 3:

Multiple comparison correction

Avoiding false positives

Page 4:

What Problem?

4-Dimensional Data
- 1,000 multivariate observations, each with > 100,000 elements
- 100,000 time series, each with 1,000 observations

Massively Univariate Approach
- 100,000 hypothesis tests

Massive MCP (multiple comparisons problem)!

[Figure: sequence of observations numbered 1, 2, 3, ..., 1,000]

(From Tom Nichols' intro)

Page 5:

What Problem?

- A typical brain has ~130,000 voxels.
- At p = .05, we expect 6,500 false positives!
- At a more conservative value like p = .001, we still expect 130 false positives.

Using an extent threshold k without correction is not enough, as false positives can also cluster by chance.
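The counts on this slide are simply the number of tests multiplied by the per-test alpha. The short sketch below reproduces them and also shows how quickly the chance of at least one false positive approaches 1. It assumes that every null hypothesis is true and, for the second function, that the tests are independent (which real voxels are not); the function names are illustrative only.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha):
    """Chance of one or more false positives, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    n_voxels = 130_000  # "typical brain" size quoted on the slide
    for alpha in (0.05, 0.001):
        print(alpha,
              expected_false_positives(n_voxels, alpha),
              prob_at_least_one(n_voxels, alpha))
```

Even at p = .001 the probability of at least one false positive is essentially 1, which is why an uncorrected threshold cannot control the family-wise error.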

Page 6:

What Problem?

Bennett et al., 2009

- Task: make a decision about the emotions shown in pictures
- Design: blocks of 12 s activation/rest
- Analysis: standard data processing with SPM
- Subject: a dead salmon!

Page 7:

What Problem?

The cluster was 81 mm³! After multiple comparison correction, all false activations were removed.

Page 8:

Solutions for MCP

Height Threshold
- Familywise Error Rate (FWER): chance of any false positives; controlled by Bonferroni and Random Field methods
- False Discovery Rate (FDR): proportion of false positives among rejected tests

Bayes Statistics
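As a hedged formalisation of the two error rates named on this slide, let $V$ be the number of false positives and $D$ the number of rejected tests. Both symbols are notation introduced here for illustration, not taken from the slides.

$$\mathrm{FWER} = P(V \geq 1), \qquad \mathrm{FDR} = E\left[\frac{V}{\max(D, 1)}\right]$$

When every null hypothesis is true, any rejection is a false positive, so the two quantities coincide; they differ only when some real signal is present.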

Page 9:

From single univariate to massive univariate

Univariate stat                                      | Functional neuroimaging
1 observed data                                      | Many voxels
1 statistical value                                  | Family of statistical values
Type 1 error rate (chance to be wrong rejecting H0)  | Family-wise error rate
Null hypothesis                                      | Family-wise null hypothesis

Page 10:

Height Threshold

- Choose locations where a test statistic Z (T, F, ...) is large, i.e. threshold the image of Z at a height z.
- The problem is how to choose this threshold z so as to exclude false positives with a high probability (e.g. 0.95).

To control the family-wise error, one must take into account the number of tests.

Page 11:

Bonferroni

- 10,000 Z-scores; alpha = 5%
- alpha corrected = .000005; z-score = 4.42

[Figure: 100 × 100 voxel image]

Page 12:

Bonferroni

- 10,000 Z-scores; alpha = 5%
- 2D homogeneous smoothing: 100 independent observations
- alpha corrected = .0005; z-score = 3.29

[Figure: 100 × 100 voxel image]
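The two Bonferroni thresholds quoted on these slides (4.42 for 10,000 independent tests, 3.29 for 100) can be reproduced with a few lines of Python. This is a minimal sketch that assumes a one-sided test on standard normal Z-scores; the function name `bonferroni_z` is illustrative.

```python
from statistics import NormalDist


def bonferroni_z(alpha, n_independent_tests):
    """One-sided Z threshold after dividing alpha by the number of tests."""
    alpha_corrected = alpha / n_independent_tests
    # inv_cdf gives the Z value below which a fraction (1 - alpha_corrected) lies
    return NormalDist().inv_cdf(1.0 - alpha_corrected)


if __name__ == "__main__":
    print(round(bonferroni_z(0.05, 10_000), 2))  # 10,000 independent voxels
    print(round(bonferroni_z(0.05, 100), 2))     # 100 independent observations
```

The threshold depends only on the number of independent observations, which is why smoothing (fewer independent values) lowers it.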

Page 13:

Solutions for MCP

- An important feature of neuroimaging data is that we have a family of stat values with topological features (Bonferroni, for instance, treats the tests as independent).
- Why consider the data as a smooth lattice? (Chumbley et al., 2009, NeuroImage 44)
  - fMRI/PET are projection methods of data points onto the whole space; MEEG forms continuous functions in time and is smoothed by the scalp (space).
  - Neural activity propagates locally through intrinsic/lateral connections and is distributed via extrinsic connections; hemodynamic correlates are initiated by diffusing signals (e.g. NO).

Page 14:

Random Field Theory

- 10,000 Z-scores; alpha = 5%
- Gaussian kernel smoothing: how many independent observations?

[Figure: 100 × 100 voxel image]

Page 15:

Random Field Theory

RFT relies on theoretical results for smooth statistical maps (hence the need for smoothing), which make it possible to find a threshold in a data set where the number of independent variables is hard to determine. It uses the expected Euler characteristic (EC density).

1. Estimate the smoothness = number of resels (resolution elements) = f(nb voxels, FWHM)
2. Compute the expected Euler characteristic = number of clusters above the threshold
3. Calculate the threshold
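Step 1 can be illustrated with the simplest form of the resel count: the number of voxels divided by the volume of one resel, where a resel is a block whose side equals the FWHM in each dimension. The sketch below assumes a rectangular search region and a known, stationary FWHM expressed in voxels; the 10-voxel FWHM in the usage example is a hypothetical value chosen so that a 100 × 100 image yields 100 resels.

```python
from math import prod


def resel_count(shape_voxels, fwhm_voxels):
    """Approximate number of resels: voxel count / (product of FWHMs in voxels)."""
    n_voxels = prod(shape_voxels)
    resel_size = prod(fwhm_voxels)  # voxels per resolution element
    return n_voxels / resel_size


if __name__ == "__main__":
    # 100 x 100 image smoothed with a (hypothetical) 10-voxel FWHM in x and y
    print(resel_count((100, 100), (10, 10)))
```

In practice the smoothness is estimated from the data rather than assumed, so this only shows how the two inputs combine.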

Page 16:

Random Field Theory

- The Euler characteristic can be seen as the number of blobs in an image after thresholding (at the p value that you select in SPM).
- At high thresholds, EC = 0 or 1 per resel: $E[EC] \approx p_{FWE}$

$$E[EC] = R \, (4 \ln 2) \, (2\pi)^{-3/2} \, Z_t \, e^{-\frac{1}{2} Z_t^2}$$

for a 2D image, where $R$ is the number of resels and $Z_t$ is the Z threshold; the expression is more complicated in 3D.

Page 17:

Random Field Theory

For 100 resels, the equation gives E[EC] = 0.049 for a threshold Z of 3.8, i.e. the probability of getting one or more blobs where Z is greater than 3.8 is 0.049.

[Figure: 100 × 100 voxel image]

If the resel size is much larger than the voxel size, then E[EC] depends only on the number of resels; otherwise it also depends on the volume, surface and diameter of the search area (i.e. shape and volume matter).
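The value quoted for 100 resels can be checked numerically with the 2D formula E[EC] = R (4 ln 2) (2π)^(-3/2) Z exp(-Z²/2). The sketch below evaluates it and then searches for the threshold at which E[EC] reaches 0.05. The bisection search and its bounds are illustrative choices made here, not the method used by SPM.

```python
import math


def expected_ec_2d(n_resels, z):
    """Expected Euler characteristic of a thresholded 2D Gaussian field."""
    return (n_resels * (4.0 * math.log(2.0)) * (2.0 * math.pi) ** -1.5
            * z * math.exp(-0.5 * z * z))


def rft_threshold_2d(n_resels, alpha=0.05, lo=2.0, hi=10.0, tol=1e-6):
    """Bisection for the Z at which E[EC] = alpha (E[EC] decreases for Z > 1)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_ec_2d(n_resels, mid) > alpha:
            lo = mid  # still too many expected blobs: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(expected_ec_2d(100, 3.8), 3))
    print(round(rft_threshold_2d(100), 2))
```

The resulting threshold lies between the Bonferroni value for 100 independent tests and the one for 10,000 tests, which is the point of the comparison across these slides.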

Page 18:

False Discovery Rate

Whereas the family-wise approach corrects for any false positive, the FDR approach aims at correcting among the positive results only.

1. Run an analysis with alpha = x%
2. Sort the resulting positive data
3. Threshold to remove the false positives
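The slide does not name a specific algorithm. A common way to implement the sort-then-threshold idea is the Benjamini-Hochberg step-up procedure, sketched below as an assumption about what is meant. It sorts all p values, finds the largest rank k for which p(k) ≤ k·q/m, and rejects the k smallest. The function name and the example p values are illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank  # largest rank that passes its own threshold
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

In this made-up example five tests pass an uncorrected p < .05, but only the two smallest p values survive the FDR threshold.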

Page 19:

False Discovery Rate

[Figure: three panels: Signal + Noise; FWE correction; FDR correction]

Page 20:

False Discovery Rate takes the spatial structure into account

- Under H0 the number of voxels per cluster is known, which gives an uncorrected p value for each cluster
- Apply FDR on the clusters (volume-wise correction)
- Assumes that the volume of each cluster is independent of the number of clusters

Page 21:

Levels of inference

Voxel, cluster and set

Page 22:

Levels of inference

3 levels of inference can be considered:
- Voxel level (probability associated with each voxel)
- Cluster level (probability associated with a set of voxels)
- Set level (probability associated with a set of clusters)

The 3 levels are nested and based on a single probability of obtaining c or more clusters (set level) with k or more voxels (cluster level) above a threshold u (voxel level): $P_w(u, k, c)$

Page 23:

Levels of inference

- Set level: we can reject H0 for an omnibus test, i.e. there are some significant clusters of activation in the brain.
- Cluster level: we can reject H0 for an area of size k, i.e. a cluster of 'activated' voxels is likely to be true for a given spatial extent.
- Voxel level: we can reject H0 at each voxel, i.e. a voxel is 'activated' if it exceeds a given threshold.

Page 24:

Levels of inference

Each level of inference is valid, but the inferences are different. For example, a set-level inference might be enough to check that subjects activated regions selected a priori for a connectivity analysis; cluster-level inference might be good enough if hypotheses are about the use of different brain areas between groups.

Both voxel and cluster levels need to address the multiple comparison problem. If the activated region is predicted in advance, the use of corrected p values is unnecessary and inappropriately conservative; a correction for the number of predicted regions (Bonferroni) is enough.

Page 25:

Level of inference

[Figure: statistical maps comparing uncorrected (bad), after FDR correction, and RFT]

Using p = .001 creates an excursion set:
- Probability of clusters of that size
- Probability of a peak of that height

Page 26:

Circularity issues in fMRI

Page 27:

Definition

- Refers to the problem of selecting data for analysis
- How data (usually areas) are selected, analysed and sorted is key to avoiding circularity

- Put forward by Vul et al., 2009, Perspectives on Psychological Science, 4
- Better explained in Kriegeskorte et al., 2009, Nat. Neuroscience, 12

Page 28:

Circularity

Double dipping problem: "data are first analyzed to select a subset and then the subset is reanalyzed to obtain the results. In this context, assumptions and hypotheses determine the selection criterion and selection can, in turn, distort the results."

- Take a group of subjects and measure RTs, then take 2 subgroups from the same subjects and redo some analysis? This increases the difference.
- Take fMRI data and get activated areas, extract ROIs and redo some analyses?

Page 29:

Circularity

Selection and tests must be independent; non-independence creates spurious effects.
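A small simulation can make the spurious effect concrete. The sketch below generates pure noise (no voxel has a true effect), selects the voxels with the largest mean, and then estimates the effect either on the same data (circular) or on held-out subjects (independent). All sizes, the seed and the function name are arbitrary choices for illustration.

```python
import random
from statistics import mean


def simulate_double_dipping(n_subjects=20, n_voxels=2000, n_selected=50, seed=0):
    """Return (circular_estimate, independent_estimate) for pure-noise data."""
    rng = random.Random(seed)
    # data[s][v]: value of voxel v in subject s, standard normal noise only
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_voxels)]
            for _ in range(n_subjects)]

    def voxel_means(subjects):
        return [mean(data[s][v] for s in subjects) for v in range(n_voxels)]

    def top_voxels(means):
        ranked = sorted(range(n_voxels), key=lambda v: means[v], reverse=True)
        return ranked[:n_selected]

    subjects = list(range(n_subjects))
    half = n_subjects // 2

    # Circular analysis: select and test on the same subjects
    means_all = voxel_means(subjects)
    circular = mean(means_all[v] for v in top_voxels(means_all))

    # Independent analysis: select on the first half, test on the second half
    means_first = voxel_means(subjects[:half])
    means_second = voxel_means(subjects[half:])
    independent = mean(means_second[v] for v in top_voxels(means_first))
    return circular, independent


if __name__ == "__main__":
    print(simulate_double_dipping())
```

The circular estimate is clearly positive even though the data contain no signal, whereas the held-out estimate stays close to zero.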

Page 30:

Circularity

Independence of the selection and tests:
1. Anatomical ROI, analysis of fMRI
2. SPM: the minimal requirement is orthogonality of the contrasts (e.g. find regions using A+B>0, C = [1 1], and test A vs B, C = [1 -1]), but if $N_A$ and $N_B$ are different there is still a bias when testing A-B (across subjects, independence is ensured by $C_{selection}^T (X^T X)^{-1} C_{test} = 0$)
3. Select using a subset of data, test with another one
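The condition in point 2 can be checked by hand for the simplest design, sketched here under the assumption that X has one indicator column per condition, so that X^T X is diagonal with entries N_A and N_B. The function name is illustrative and the design is a deliberately minimal stand-in for a real SPM design matrix.

```python
def contrast_covariance(n_a, n_b, c_selection=(1, 1), c_test=(1, -1)):
    """C_selection^T (X^T X)^-1 C_test for a two-column indicator design."""
    # With one indicator column per condition, (X^T X)^-1 = diag(1/n_a, 1/n_b)
    inv_diag = (1.0 / n_a, 1.0 / n_b)
    return sum(cs * d * ct for cs, d, ct in zip(c_selection, inv_diag, c_test))


if __name__ == "__main__":
    print(contrast_covariance(20, 20))  # balanced: contrasts are independent
    print(contrast_covariance(20, 10))  # unbalanced: selection biases the test
```

For this design the quantity reduces to 1/N_A - 1/N_B, so orthogonal contrast vectors give independent selection and test only when the two conditions have the same number of observations.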

Page 31:

Enough for today ☺

Thanks for your attention

