Statistical Methodology 21 (2014) 59–87
General tests of independence based on empirical processes indexed by functions

Salim Bouzebda∗

Laboratoire de Mathématiques Appliquées de Compiègne, Université de Technologie de Compiègne, France
Article info

Article history: Received 29 July 2013; Received in revised form 4 March 2014; Accepted 15 March 2014.

Keywords: Empirical process; Exchangeability; Multivariate empirical copula processes; Tests of independence; Gaussian approximation; Contiguous alternatives; Möbius decomposition; Half-spaces; Cramér–von Mises statistic.

Abstract
The present paper is mainly concerned with statistical tests of independence between random vectors. We develop an approach based on general empirical processes indexed by a particular class of functions. We prove two abstract approximation theorems that include some existing results as particular cases. Finally, we characterize the limiting behavior of the Möbius transformation of empirical processes indexed by functions under contiguous sequences of alternatives.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction and main results
One of the classical and important problems in statistics is that of testing independence between two or more components of a random vector. The traditional approach is based on Pearson's correlation coefficient, but its lack of robustness to outliers and departures from normality eventually led researchers to consider alternative nonparametric procedures. To overcome this problem, rank tests of independence relying on linear rank statistics – those of Savage, Spearman and van der Waerden in particular – were proposed. Another way to test independence is to use functionals of empirical processes, which is the approach that we develop in the present work.
We first set out some notation and the basic definitions which will be used throughout the paper. Consider a random sample $X_1, \ldots, X_n$ of independent draws from a probability measure $P$ on an
∗ Correspondence to: UTC Rue Roger Couttolenc CS 60319, 60203 Compiègne Cedex, France. Tel.: +33 0611224000. E-mail addresses: [email protected], [email protected].
http://dx.doi.org/10.1016/j.stamet.2014.03.001
1572-3127/© 2014 Elsevier B.V. All rights reserved.
arbitrary sample space X. We define the empirical measure to be
\[
\mathbb{P}_n = n^{-1} \sum_{i=1}^{n} \delta_{X_i},
\]
where $\delta_x$ is the measure that assigns mass 1 at $x$ and zero elsewhere. Let $f : \mathcal{X} \to \mathbb{R}$ be a measurable function. In the modern theory of empirical processes it is customary to identify $P$ and $\mathbb{P}_n$ with the mappings given by
\[
f \mapsto Pf = \int_{\mathcal{X}} f \, dP, \quad \text{and} \quad f \mapsto \mathbb{P}_n f = \int_{\mathcal{X}} f \, d\mathbb{P}_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i).
\]
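Concretely, $\mathbb{P}_n$ acts on a function by averaging it over the sample. A minimal NumPy sketch (the names `empirical_measure` and `Pn` are ours, not from the paper):

```python
import numpy as np

def empirical_measure(sample):
    """Return the mapping f -> P_n f = (1/n) * sum_{i=1}^n f(X_i)."""
    sample = np.asarray(sample)
    return lambda f: float(np.mean(f(sample)))

rng = np.random.default_rng(0)
X = rng.normal(size=1000)          # a sample from P = N(0, 1)
Pn = empirical_measure(X)

# P_n f for f(x) = x^2 estimates Pf = E[X^2] = 1
second_moment = Pn(lambda x: x ** 2)
```

The empirical process $\mathbb{G}_n f$ defined next is then just $\sqrt{n}$ times the difference between this sample average and the population value $Pf$.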
For any class $\mathcal{F}$ of measurable functions $f : \mathcal{X} \to \mathbb{R}$, an empirical process
\[
\{\mathbb{G}_n f = \sqrt{n}\,(\mathbb{P}_n f - Pf) : f \in \mathcal{F}\}
\]
can be defined. Throughout this paper, it will be assumed that $\mathcal{F} \subset L^2(\mathcal{X}, dP)$, which in turn implies that the finite-dimensional distributions of the sequence of random functions $\{\mathbb{G}_n(f) : f \in \mathcal{F}\}$ converge weakly, as $n \to \infty$, to the finite-dimensional distributions of a mean zero Gaussian random function $\{\mathbb{B}(f) : f \in \mathcal{F}\}$ with the same covariance function as $\{\mathbb{G}_n(f) : f \in \mathcal{F}\}$, that is,
\[
\langle f, g \rangle = \operatorname{cov}(\mathbb{B}(f), \mathbb{B}(g)) = \mathbb{E}(f(X)g(X)) - \mathbb{E}(f(X))\,\mathbb{E}(g(X)), \quad \text{for } f, g \in \mathcal{F}.
\]
In the context of the present article the process $\{\mathbb{B}(f) : f \in \mathcal{F}\}$ will always admit a version which is almost surely bounded and continuous with respect to the intrinsic semi-metric
\[
d_P(f, g) = \left\{\mathbb{E}(f(X) - g(X))^2\right\}^{1/2}, \quad \text{for } f, g \in \mathcal{F}.
\]
We call the process $\{\mathbb{B}(f) : f \in \mathcal{F}\}$ a $P$-Brownian bridge indexed by $\mathcal{F}$. We can assume without loss of generality that $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_p$. We will consider the following particular class of functions
\[
\mathcal{F} = \left\{ f \in \mathcal{F} : f = \prod_{j=1}^{p} f_j \ \text{such that} \ f_j : \mathcal{X}_j \to \mathbb{R} \right\}. \tag{1.1}
\]
Let us introduce, for $j = 1, \ldots, p$,
\[
\mathcal{F}_j = \left\{ g : \mathcal{X}_j \to \mathbb{R} : g \in L^2(\mathcal{X}_j, dP_j) \right\}. \tag{1.2}
\]
We will use the following notation
\[
\| \cdot \|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} | \cdot |.
\]
Let $P$ be the joint probability for $X = (X_1, \ldots, X_p)$ and $P^{(j)}$ be the marginal probability for $X_j$. Throughout this paper, it will be assumed tacitly that $\mathcal{F}$ and $\mathcal{F}_j$, defined in (1.1) and (1.2), for $j = 1, \ldots, p$, respectively, are $P$-Donsker classes of functions and
\[
\|P^{(j)}\|_{\mathcal{F}_j} < \infty, \quad \text{for } j = 1, \ldots, p;
\]
see Remark 2.4 below for more details. Indeed, these assumptions allow us to apply Theorem 3.8.1 of van der Vaart and Wellner [68] to the process $\{\mathbb{A}_n(f) : f \in \mathcal{F}\}$ defined in (1.3) below. Notice that $X_1, \ldots, X_p$ are independent if and only if
\[
H_0 : P = \prod_{j=1}^{p} P^{(j)},
\]
which implies that
\[
H_0 : Pf = \prod_{j=1}^{p} P^{(j)} f_j, \quad \text{for all } f \in \mathcal{F}.
\]
For any index set $T$, let us denote by $\ell^\infty(T)$ the collection of all bounded functions $f : T \to \mathbb{R}$. The general independence process in $\ell^\infty(\mathcal{F})$ is denoted by $\{\mathbb{A}_n(f) : f \in \mathcal{F}\}$, and defined to be
\[
\mathbb{A}_n(f) := \sqrt{n}\left( \mathbb{P}_n f - \prod_{j=1}^{p} \mathbb{P}_n^{(j)} f_j \right). \tag{1.3}
\]
Notice that the process $\{\mathbb{A}_n(f) : f \in \mathcal{F}\}$ measures the difference between the empirical probability measure $\mathbb{P}_n$ of $P$, associated with $X$, and the product of the empirical probability measures $\mathbb{P}_n^{(j)}$ associated with the components of the random vector $(X_1, \ldots, X_p)$. Let us introduce
Ip = {A ⊂ {1, . . . , p} : |A| > 1},
where we denote by $|A|$ the cardinality of the set $A$. If the Möbius transformation (refer to Rota [55] and Spitzer [65]) is applied, the following equivalent criterion is obtained:
X1, . . . , Xp are independent ⇐⇒ MA(f ) = 0 for all f ∈ F and A ∈ Ip,
where
\[
M_A(f) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, P\left( \prod_{j=1}^{p} f_j^B \right) \prod_{k \in A \setminus B} P^{(k)} f_k. \tag{1.4}
\]
Here, we use the notation
\[
f_i^B = \begin{cases} f_i, & \text{if } i \in B, \\ 1, & \text{otherwise.} \end{cases}
\]
For each subset $A \in I_p$, the general Möbius independence process in $\ell^\infty(\mathcal{F})$ is defined to be
\[
\mathbb{M}_{A,n}(f) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, \mathbb{P}_n\left( \prod_{j=1}^{p} f_j^B \right) \prod_{k \in A \setminus B} \mathbb{P}_n^{(k)} f_k. \tag{1.5}
\]
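For a product function $f = \prod_j f_j$, the statistic $\mathbb{M}_{A,n}(f)$ of (1.5) is a finite signed sum over the subsets of $A$ and is straightforward to compute. A hedged Python sketch (function names are ours; real scalar coordinates are assumed for simplicity):

```python
import numpy as np
from itertools import combinations

def mobius_stat(X, fs, A):
    """Compute M_{A,n}(f) of (1.5) for f = prod_j f_j and data X of shape (n, p).

    Each subset B of A enters with sign (-1)^{|A|-|B|}; P_n acts on the
    product of the f_j over j in B, and each k in A \\ B contributes its
    empirical marginal mean P_n^{(k)} f_k."""
    n, p = X.shape
    vals = [fs[j](X[:, j]) for j in range(p)]      # columns f_j(X_{j;i})
    marg = [v.mean() for v in vals]                # P_n^{(j)} f_j
    total = 0.0
    for r in range(len(A) + 1):
        for B in combinations(A, r):
            prod = np.ones(n)
            for j in B:
                prod = prod * vals[j]
            term = prod.mean()                     # P_n(prod_{j in B} f_j)
            for k in set(A) - set(B):
                term *= marg[k]
            total += (-1) ** (len(A) - len(B)) * term
    return total

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 3))                     # independent coordinates
m = mobius_stat(X, [lambda x: x] * 3, A=(0, 1))    # small under independence
```

For $|A| = 2$ and identity functions, the sum collapses to the empirical covariance $\mathbb{P}_n(f_0 f_1) - (\mathbb{P}_n^{(0)} f_0)(\mathbb{P}_n^{(1)} f_1)$, which is one way to sanity-check the subset bookkeeping.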
A rejection region for an independence test is constructed, for example, by combining Kolmogorov test statistics for all subsets:
\[
\bigcup_{A \in I_p} \left\{ \|\sqrt{n}\, \mathbb{M}_{A,n}(f)\|_{\mathcal{F}} > m_A \right\}, \tag{1.6}
\]
for some critical values $m_A$ chosen to achieve an asymptotic preassigned global significance level $\alpha$; see Section 7 for further discussion. Notice that the statistical test may be performed by using the $2^p - p - 1$ Cramér–von Mises statistics obtained from the Möbius decomposition of the general independence process $\{\mathbb{A}_n(f) : f \in \mathcal{F}\}$, which are given by
\[
\left\{ \int_{\mathcal{X}} \left( \sqrt{n}\, \mathbb{M}_{A,n}(f(x)) \right)^2 d\mathbb{P}_n(x) : A \in I_p \right\}. \tag{1.7}
\]
This kind of approach has been used for particular classes of functions by Dugué [25], Deheuvels [23], Ghoudi et al. [31], Genest and Rémillard [30], Bilodeau and Lafaye de Micheaux [9], Beran et al. [5], Deheuvels [24], Genest et al. [29], Kojadinovic and Holmes [42] and Kojadinovic and Yan [43]. Blum et al.'s [10] seminal paper, dealing with testing the null hypothesis of independence between the $d \geq 2$ components of a multivariate vector with continuous distribution, is also to be mentioned here. We will show how to obtain their results from our general setting. Very recently, Sejdinovic et al. [60] provided a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, the distances between embeddings of distributions in reproducing kernel Hilbert spaces, in the spirit of the machine learning literature. In particular, the authors show in Section 7.2 of [60] how to test the independence hypothesis, including the statistic of Gretton et al. [33] and Gretton et al. [34], which is a $V$-statistic estimate of the Hilbert–Schmidt independence
criterion; we may refer to Sejdinovic et al. [60] for definitions and details. In Theorem 33 of Sejdinovic et al. [60], it is proved that the proposed statistic converges to a weighted sum of chi-squared variables. We will denote by $1\{\cdot\}$ the indicator function of the set $\{\cdot\}$. Let us denote by ''$\leadsto$'' weak convergence, which is to be understood in the sense used in the monograph by van der Vaart and Wellner [68], in particular their Definition 1.3.3.
The main result to be proved here may now be stated precisely as follows.
Theorem 1.1. If $X_1, \ldots, X_p$ are independent, then, for each $A \in I_p$, as $n \to \infty$, we have
\[
\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F} \right\} \leadsto \{\mathbb{B}_A(f) : f \in \mathcal{F}\} \quad \text{in } \ell^\infty(\mathcal{F}).
\]
The processes $\left\{ \mathbb{B}_A(f) : f \in \mathcal{F} \right\}$, $A \in I_p$, are independent zero mean Gaussian processes with covariance function given, for $A, A' \in I_p$ and $f, g \in \mathcal{F}$, by
\[
\mathbb{E}(\mathbb{B}_A(f)\mathbb{B}_{A'}(g)) = 1\{A = A'\} \prod_{k \in A} \left\{ P_k f_k g_k - (P_k f_k)(P_k g_k) \right\}.
\]
The proof of Theorem 1.1 is given in Section 9.
Remark 1.2. Theorem 1.1 gives the main motivation, which comes from the fact that, under the mutual independence of $X_1, \ldots, X_p$ (equivalent to having $\{M_A(f) = 0 : f \in \mathcal{F}\}$ for $A \in I_p$), the empirical process $\{\mathbb{A}_n(f) : f \in \mathcal{F}\}$ defined in (1.3) can be decomposed, using the Möbius transform, into $2^p - p - 1$ sub-processes $\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F} \right\}$, $A \in I_p$, that converge jointly to tight centered mutually independent Gaussian processes $\left\{ \mathbb{B}_A(f) : f \in \mathcal{F} \right\}$, $A \in I_p$. In other words, it is possible to test independently whether there are dependence relationships within each subset of coordinates of $X$. Genest and Rémillard [30] investigated how these processes could be combined to obtain a global statistic for testing independence, leading to a potentially more powerful test; see [42] for details. We may refer to Genest et al. [29] for a comparison of the statistical tests based on the processes given in (1.3) with tests involving different combinations of the Cramér–von Mises statistics derived from the Möbius decomposition, in the framework of the copula processes described in Section 4 and Example 6.2 below.
Theorem 1.1 will be established by applying the functional delta method. Notice that the functional delta method allows one to separate the probabilistic part (weak convergence of the empirical processes $\{\mathbb{A}_n(f) : f \in \mathcal{F}\}$) from the analytical part (compact differentiability) of the problem, thereby considerably simplifying the proof of the weak convergence of $\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F} \right\}$, for $A \in I_p$. The results of the present work and the techniques used to obtain them are of interest on various accounts. The main theoretical reason is that this paper provides the first derivation, under mild conditions, of the central limit theorems for the general Möbius independence process $\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F} \right\}$, for $A \in I_p$, and its exchangeable bootstrap, which will be of separate interest. The present paper greatly extends some previous results, as we will see below.
The rest of this paper is organized as follows. In the forthcoming section, we give the bootstrapped version of Theorem 1.1. In Section 3, we discuss how to use our results to study the independence of half-space processes. In a similar way, in Section 4, we consider the empirical copula processes. In Section 5, we investigate the independence empirical characteristic processes. In Section 6, a result on the large-sample behavior of the process $\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F} \right\}$ under contiguous sequences of alternatives is derived, which is of its own interest. In Section 7, we provide some practical recommendations for the statistical tests. Some concluding remarks and possible future developments are mentioned in Section 8. Finally, all the mathematical developments are deferred to Section 9.
2. Exchangeable bootstrap of the general Möbius independence process
In this section, we introduce a general notion of bootstrapped empirical measure, constructed from an exchangeably weighted sample. The interest in considering a general bootstrap rather than particular cases lies in the fact that we generally need a more flexible modeling to accommodate the problem at
hand in practice. The general resampling scheme was first proposed by Rubin [56] and extensively studied by Bickel and Freedman [6], who suggested the name ''weighted bootstrap'', e.g., the Bayesian bootstrap when the vector of weights
\[
(W_{n1}, \ldots, W_{nn}) = (D_{n1}, \ldots, D_{nn})
\]
is equal in distribution to the vector of $n$ spacings of $n - 1$ ordered uniform $(0, 1)$ random variables, that is,
\[
(D_{n1}, \ldots, D_{nn}) \sim \text{Dirichlet}(n; 1, \ldots, 1).
\]
The interested reader may refer to Lo [46]. These resampling plans led to interest in a unified approach, generically designated as general weighted resampling, which was first proposed by Mason and Newton [47] and, amongst others, extended by Præstgaard and Wellner [52]. A substantial body of literature, reviewed in [4], gives conditions that the bootstrap must satisfy in order to provide desirable distributional approximations. In [7], the performance of different kinds of bootstrap procedures is investigated through asymptotic results and small sample simulation studies. In this section, we consider a general weighted resampling approach which was first proposed by Mason and Newton [47] and was extended by Haeusler and Mason [35], Præstgaard and Wellner [52] as well as by Janssen and Pauls [38]. In their landmark paper, Giné and Zinn [32] proved that the weak convergence of the original empirical process is equivalent to convergence of its bootstrap version. According to Præstgaard and Wellner [52], ''This result completely settles the question about the validity of Efron's bootstrap in a wide range of situations''. For more references on the subject, we may refer to Cheng and Huang [16], Bouzebda and Cherfi [12], Bouzebda [11] and Bouzebda and Limnios [13]. In the sequel, the transpose of a vector $x$ will be denoted by $x^\top$. Let $W_n = (W_{n1}, \ldots, W_{nn})^\top$ be an exchangeable vector of nonnegative weights which sum to $n$ and
\[
\mathcal{W} = \{W_{ni}, \ i = 1, \ldots, n, \ n = 1, 2, \ldots\}
\]
is a triangular array defined on the probability space $(\mathcal{Z}, \mathcal{E}, P_W)$. Throughout the paper, we assume that the bootstrap weights $W_{ni}$ are independent of the data $X_i$, thus
\[
P_{XW} = P \times P_W.
\]
The bootstrap weights Wni’s are assumed to belong to the class of exchangeable bootstrap weightsintroduced by Mason and Newton [47] as well as by Præstgaard and Wellner [52]. The interestedreader may refer to Billingsley [8], Aldous [1] and Kallenberg [40] for excellent general coverage ofthe theory of exchangeability. In our analysis, we shall assume the following quite general conditionsof Præstgaard and Wellner [52].
W.1 The vector $W_n = (W_{n1}, \ldots, W_{nn})^\top$ is exchangeable for all $n = 1, 2, \ldots$, i.e., for any permutation $\pi = (\pi_1, \ldots, \pi_n)$ of $(1, \ldots, n)$, the joint distribution of
\[
\pi(W_n) = (W_{n\pi_1}, \ldots, W_{n\pi_n})^\top
\]
is the same as that of $W_n$;

W.2 $W_{ni} \geq 0$ for all $n, i$ and $\sum_{i=1}^{n} W_{ni} = n$;
W.3
\[
\limsup_{n \to \infty} \|W_{n1}\|_{2,1} \leq \kappa < \infty,
\]
where
\[
\|W_{n1}\|_{2,1} = \int_0^\infty \sqrt{P_W(W_{n1} \geq u)}\, du;
\]
W.4
\[
\lim_{\lambda \to \infty} \limsup_{n \to \infty} \sup_{t \geq \lambda} t^2 P_W(W_{n1} > t) = 0;
\]
W.5
\[
\frac{1}{n} \sum_{i=1}^{n} (W_{ni} - 1)^2 \xrightarrow{P_W} \varrho^2 > 0.
\]
The bootstrap weights corresponding to Efron's nonparametric bootstrap, i.e., when
\[
W_n \sim \text{Mult}(n; n^{-1}, \ldots, n^{-1}),
\]
satisfy conditions W.1–W.5. In general, conditions W.3–W.5 are easily satisfied under some moment conditions on $W_{ni}$; see [52, Lemma 3.1]. In addition to Efron's nonparametric bootstrap, the sampling schemes that satisfy conditions W.1–W.5 include the Bayesian bootstrap, multiplier bootstrap, double bootstrap, and urn bootstrap. This list is sufficiently long to indicate that conditions W.1–W.5 are not unduly restrictive. Notice that the value of $\varrho$ in W.5 is independent of $n$ and depends on the resampling method, e.g., $\varrho = 1$ for the nonparametric bootstrap and the Bayesian bootstrap, whereas $\varrho = \sqrt{2}$ for the double bootstrap. A more precise discussion of this general formulation of the bootstrap can be found in [52], [68, §3.6.2, p. 353], [70], [45, §10, p. 179] and [16]. We define the bootstrapped empirical measure to be
\[
\mathbb{P}_n^W = n^{-1} \sum_{i=1}^{n} W_{ni}\, \delta_{X_i}.
\]
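The weight conditions are easy to check by simulation for the two standard schemes. A small sketch assuming NumPy (variable names are ours); W.2 holds exactly, and the empirical version of W.5 should be close to $\varrho^2 = 1$ for both:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Efron's nonparametric bootstrap: W_n ~ Mult(n; 1/n, ..., 1/n)
W_efron = rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)

# Bayesian bootstrap: n times a Dirichlet(1, ..., 1) vector, i.e. the
# rescaled spacings of n - 1 ordered uniforms; the weights sum to n
W_bayes = n * rng.dirichlet(np.ones(n))

for W in (W_efron, W_bayes):
    assert W.min() >= 0.0                    # condition W.2 (nonnegativity)
    assert abs(W.sum() - n) < 1e-6           # condition W.2 (sum to n)
    rho2_hat = np.mean((W - 1.0) ** 2)       # empirical version of W.5
```

Replacing the multinomial draw by another exchangeable scheme (double bootstrap, urn bootstrap, multipliers) only changes the value of $\varrho^2$ the last line concentrates around.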
For each subset $A \in I_p$, the bootstrapped Möbius independence general process in $\ell^\infty(\mathcal{F})$ is defined as
\[
\mathbb{M}_{A,n}^W(f) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, \mathbb{P}_n^W\left( \prod_{j=1}^{p} f_j^B \right) \prod_{k \in A \setminus B} \mathbb{P}_n^{(k),W} f_k. \tag{2.1}
\]
We denote by ''$\overset{P}{\underset{W}{\leadsto}}$'' weak convergence conditional on the data, in probability, as defined by Kosorok [45]; that is,
\[
\beta_n^W \overset{P}{\underset{W}{\leadsto}} \varrho \mathbb{B},
\]
if
\[
\sup_{h \in BL_1(\ell^\infty([0,1]^d))} \left| \mathbb{E}_W h(\beta_n^W) - \mathbb{E} h(\varrho \mathbb{B}) \right| \xrightarrow{P} 0,
\]
and
\[
\mathbb{E}_W h(\beta_n^W)^\star - \mathbb{E}_W h(\beta_n^W)_\star \xrightarrow{P} 0 \quad \text{for every } h \in BL_1(\ell^\infty([0,1]^d)),
\]
where
\[
BL_1\left( \ell^\infty(\mathcal{F}) \right) = \left\{ f : \ell^\infty(\mathcal{F}) \to \mathbb{R}, \ \|f\|_\infty \leq 1, \ |f(l_1) - f(l_2)| \leq \sup_{x \in \mathcal{X}} |l_1(x) - l_2(x)|, \ \forall l_1, l_2 \in \ell^\infty(\mathcal{F}) \right\}
\]
is the class of all uniformly bounded functions that are Lipschitz continuous with constant smaller than 1, $\mathbb{E}_W$ denotes the conditional expectation with respect to the weights $W$ given the data $X_1, \ldots, X_n$, and
\[
\|f\|_\infty = \sup_{x \in \mathcal{X}} |f(x)|.
\]
Moreover, $h(\beta_n^W)^\star$ and $h(\beta_n^W)_\star$ denote measurable majorants and minorants with respect to the joint data, including the weights $W_n$. In the sequel, we assume that $\mathcal{F} \in M(P)$, i.e., $\mathcal{F}$ possesses enough
measurability for randomization with i.i.d. multipliers to be possible, i.e., $\mathbb{P}_n$ can be randomized; in other words, we can replace $(\delta_{X_i} - P)$ by $(W_{ni} - 1)\delta_{X_i}$. It is known that $\mathcal{F} \in M(P)$, e.g., if $\mathcal{F}$ is countable, or if the $\mathbb{P}_n$ are stochastically separable in $\mathcal{F}$, or if $\mathcal{F}$ is image admissible Suslin; see [32, pp. 853–854].
The second main result may be stated in the following theorem.
Theorem 2.1. If $X_1, \ldots, X_p$ are independent, then, as $n \to \infty$,
\[
\left\{ \sqrt{n}\, (\mathbb{M}_{A,n}^W - \mathbb{M}_{A,n})(f) : f \in \mathcal{F} \right\} \overset{P}{\underset{W}{\leadsto}} \left\{ \varrho \mathbb{B}_A^*(f) : f \in \mathcal{F} \right\}, \quad \text{for } A \in I_p.
\]
The processes $\left\{ \mathbb{B}_A^*(f) : f \in \mathcal{F} \right\}$, $A \in I_p$, are independent copies of $\left\{ \mathbb{B}_A(f) : f \in \mathcal{F} \right\}$, $A \in I_p$.
The proof of Theorem 2.1 is given in Section 9.
Remark 2.2. It is noteworthy to point out here that, in a variety of statistical problems, the bootstrap provides a simple method for circumventing technical difficulties due to intractable distribution theory, and it has become a powerful tool for setting confidence intervals and critical values of tests for composite hypotheses. Here, we have considered a more general setting for the bootstrap scheme that includes different resampling procedures, as already mentioned in the introduction of this section. Notice that Sen [61,62] investigated permutation tests of independence by using $U$-statistics. The proofs therein make use of the combinatorial central limit theorem of Hoeffding [36], extended by Motoo [48]. In the papers by Horváth and Shao [37], Haeusler and Mason [35] and Einmahl and Mason [26], the authors consider the empirical process based on randomly permuted independent observations. More precisely, let the random vector $R = (R(1), R(2), \ldots, R(n))$ be uniformly distributed on the set of all permutations of $\{1, 2, \ldots, n\}$, and be independent of $\{X_1, X_2, \ldots, X_n\}$. The permutation empirical measure is defined by
\[
\mathbb{P}_n^R = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_{R(i)}}.
\]
We can define the permutation empirical process, $\{\mathbb{A}_n^R(f) : f \in \mathcal{F}\}$, by
\[
\mathbb{A}_n^R(f) = \sqrt{n} \left( \mathbb{P}_n^R f - \mathbb{P}_n f \right),
\]
where $\mathcal{F}$ is a class of measurable functions satisfying some conditions. The limiting law of the process $\{\mathbb{A}_n^R(f) : f \in \mathcal{F}\}$ may be found in Theorem 3.7.1 of van der Vaart and Wellner [68]. Given the preceding formulation, one can define, for each subset $A \in I_p$, the permutation Möbius independence general process in $\ell^\infty(\mathcal{F})$ as
\[
\mathbb{M}_{A,n}^R(f) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, \mathbb{P}_n^R\left( \prod_{j=1}^{p} f_j^B \right) \prod_{k \in A \setminus B} \mathbb{P}_n^{(k),R} f_k, \tag{2.2}
\]
where
\[
\mathbb{P}_n^{(k),R} = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_{k,R(i)}}.
\]
Using arguments similar to those used in the proof of Theorem 2.1, replacing Theorem 2.2 of Præstgaard and Wellner [52] by Theorem 3.7.1 of van der Vaart and Wellner [68], the limiting law of $\{\sqrt{n}(\mathbb{M}_{A,n}^R - \mathbb{M}_{A,n})(f) : f \in \mathcal{F}\}$ can be obtained under the conditions of Theorem 3.7.1 of van der Vaart and Wellner [68], appropriately modified; this goes well beyond the scope of the present paper. The interested reader may refer to Einmahl and Mason [26] for more discussion and details on permutation and exchangeable processes. The paper by Janssen and Pauls [38] presented a unified approach to conditional and unconditional linear resampling statistics, and provided a connection with the exchangeable bootstrap.
Inspired by the works of Schensted [57] and Baik et al. [2], García and González-López [28] recently proposed a new class of nonparametric tests of independence between two continuous random variables by using permutation techniques, i.e., a test based on the size of the longest increasing subsequence of a permutation of $\{1, 2, \ldots, n\}$ (see Definition 2.1 of García and González-López [28]). We refer to the last reference for more details and discussion.
Remark 2.3. We mention that an appropriate choice of the bootstrap weights $W_{ni}$ leads to a smaller limit variance, that is, $\varrho^2$ smaller than 1. Typical examples are the multivariate hypergeometric bootstrap, refer to [52, Example 3.4], and the subsample bootstrap, [50, Remark 2.2(3)].
Remark 2.4. Given a pseudometric space (T , d), the ϵ-covering number N(ϵ, T , d) is defined as
N(ϵ, T , d) = min{m : there exists a covering of T bym balls of d-radius ≤ ϵ}.
In what follows, $\mathcal{F}$ denotes a class of functions $f : \mathcal{X} \to \mathbb{R}$ and $F$ is a measurable envelope function with
\[
F(x) \geq \sup_{f \in \mathcal{F}} |f(x)|.
\]
Given a positive measure µ on (X, R), we define
N2(ϵ, F , µ) = N(ϵ, F , ∥ · ∥L2(µ)).
The covering numbers $N_2(\epsilon, \mathcal{F}, \mu)$ of many classes of functions $\mathcal{F}$ are bounded uniformly in $\mu$. If $\mathcal{F}$ is a VC-subgraph class [51, Proposition 11.25], then there are finite constants $A$ and $v$ such that, for each probability measure $\mu$ with $\mu F^2 < \infty$,
\[
N_2(\epsilon, \mathcal{F}, \mu) \leq A \left( \frac{\mu(F^2)^{1/2}}{\epsilon} \right)^v.
\]
The uniform entropy integral with respect to the $L_2$-norm is defined as
\[
J(\delta, \mathcal{F}, L_2) = \int_0^\delta \sup_Q \sqrt{\log N_2\left( \epsilon \left[ Q F^2 \right]^{1/2}, \mathcal{F}, Q \right)} \, d\epsilon,
\]
where the supremum is taken over all finitely discrete probability measures $Q$ with
\[
\|F\|_{Q,2} = \left( \int_{\mathcal{X}} F^2 \, dQ \right)^{1/2} > 0.
\]
We say that F has uniformly integrable entropy with respect to L2-norm if
J(∞, F , L2) <∞. (2.3)
Let $\mathcal{F}$ be an appropriately measurable class of measurable functions with $J(1, \mathcal{F}, L_2) < \infty$. If $PF^2 < \infty$, then $\mathcal{F}$ is $P$-Donsker.
Example 2.5. The set $\mathcal{F}$ of all indicator functions $1\{(-\infty, t]\}$ of cells in $\mathbb{R}$ satisfies
\[
N_2(\epsilon, \mathcal{F}, P) \leq \frac{2}{\epsilon^2}
\]
for any probability measure $P$ and $\epsilon \leq 1$. Notice that
\[
\int_0^1 \sqrt{\log \frac{1}{\epsilon}} \, d\epsilon \leq \int_0^\infty u^{1/2} \exp(-u) \, du \leq 1.
\]
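This bound is easy to verify numerically: substituting $\epsilon = e^{-u}$ turns the left integral into $\Gamma(3/2) = \sqrt{\pi}/2 \approx 0.886$. A quick check (ours, not from the paper):

```python
import math

# Substituting eps = exp(-u) gives
#   int_0^1 sqrt(log(1/eps)) d eps = int_0^infty u^{1/2} e^{-u} du = Gamma(3/2)
exact = math.gamma(1.5)                      # sqrt(pi)/2 ~ 0.886 <= 1

# Midpoint-rule approximation of the left-hand side, with eps = (k + 0.5) / m
m = 200_000
approx = sum(math.sqrt(math.log(m / (k + 0.5))) for k in range(m)) / m
```

The mild logarithmic singularity at $\epsilon = 0$ is integrable, so the crude midpoint rule already agrees with the closed form to several decimals.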
For more details and discussion on this example, refer to Example 2.5.4 of van der Vaart and Wellner [68] and [45, p. 157]. The covering numbers of the class of cells $(-\infty, t]$ in higher dimensions satisfy a similar bound, but with a higher power of $(1/\epsilon)$; see Theorem 9.19 of Kosorok [45].
Example 2.6 (Classes of Functions that are Lipschitz in a Parameter, Section 2.7.4 in [68]). Let $\mathcal{F}$ be the class of functions $x \mapsto f(t, x)$ that are Lipschitz in the index parameter $t \in T$. Suppose that
\[
|f(t_1, x) - f(t_2, x)| \leq d(t_1, t_2)\, \kappa(x)
\]
for some metric $d$ on the index set $T$, some function $\kappa(\cdot)$ defined on the sample space $\mathcal{X}$, and all $x$. According to Theorem 2.7.11 of van der Vaart and Wellner [68] and Lemma 9.18 of Kosorok [45], it follows, for any norm $\| \cdot \|_{\mathcal{F}}$ on $\mathcal{F}$, that
\[
N(\epsilon \|F\|_{\mathcal{F}}, \mathcal{F}, \| \cdot \|_{\mathcal{F}}) \leq N(\epsilon/2, T, d).
\]
Hence, if $(T, d)$ satisfies $J(\infty, T, d) < \infty$, then the conclusion holds for $\mathcal{F}$. Let us consider as an example the classes of functions that are smooth up to order $\alpha$, defined as follows; see Section 2.7.1 of van der Vaart and Wellner [68] and Section 2 of van der Vaart [67]. For $0 < \alpha < \infty$, let $\lfloor \alpha \rfloor$ be the greatest integer strictly smaller than $\alpha$. For any vector $k = (k_1, \ldots, k_d)$ of $d$ integers, define the differential operator
\[
D^k = \frac{\partial^{k_\cdot}}{\partial x_1^{k_1} \cdots \partial x_d^{k_d}}, \quad \text{where} \quad k_\cdot = \sum_{i=1}^{d} k_i.
\]
Then, for a function $f : \mathcal{X} \to \mathbb{R}$, let
\[
\|f\|_\alpha = \max_{k_\cdot \leq \lfloor \alpha \rfloor} \sup_x |D^k f(x)| + \max_{k_\cdot = \lfloor \alpha \rfloor} \sup_{x \neq y} \frac{|D^k f(x) - D^k f(y)|}{\|x - y\|^{\alpha - \lfloor \alpha \rfloor}},
\]
where the suprema are taken over all $x, y$ in the interior of $\mathcal{X}$ with $x \neq y$. Let $C_M^\alpha(\mathcal{X})$ be the set of all continuous functions $f : \mathcal{X} \to \mathbb{R}$ with
\[
\|f\|_\alpha \leq M.
\]
Note that for $\alpha \leq 1$ this class consists of bounded functions $f$ that satisfy a Lipschitz condition. Kolmogorov and Tihomirov [44] computed the entropy of the classes $C_M^\alpha(\mathcal{X})$ for the uniform norm. As a consequence of their results, van der Vaart [67] showed that there exists a constant $K$, depending only on $\alpha$, $d$ and the diameter of $\mathcal{X}$, such that for every measure $\gamma$ and every $\epsilon > 0$,
\[
\log N_{[\,]}\left( \epsilon M \gamma(\mathcal{X}), C_M^\alpha(\mathcal{X}), L_2(\gamma) \right) \leq K \left( \frac{1}{\epsilon} \right)^{d/\alpha},
\]
where $N_{[\,]}$ is the bracketing number; refer to Definition 2.1.6 of van der Vaart and Wellner [68], and we refer to Theorem 2.7.1 of van der Vaart and Wellner [68] for a variant of the last inequality. By Lemma 9.18 of Kosorok [45], we have
\[
\log N\left( \epsilon M \gamma(\mathcal{X}), C_M^\alpha(\mathcal{X}), L_2(\gamma) \right) \leq K \left( \frac{1}{2\epsilon} \right)^{d/\alpha}.
\]
The following sections concern statistical applications of our main results. Even though the list is by no means exhaustive, it is sufficient to point out how to apply our results in concrete situations; these stand as archetypes for a variety of statistical tests that can be investigated in a similar way.
3. Half-spaces and independence
Following Beran et al. [5], let $|\cdot|$ and $\langle \cdot, \cdot \rangle$ denote, respectively, the Euclidean norm and inner product on the Cartesian spaces $\mathbb{R}^{d_j}$. For $j = 1, \ldots, p$, let
\[
S^{d_j} = \{x_j \in \mathbb{R}^{d_j} : |x_j| = 1\}
\]
be the unit sphere in $\mathbb{R}^{d_j}$. For every $(s_j, t_j) \in S^{d_j} \times \mathbb{R}$, define the half-space
\[
H(s_j, t_j) = \left\{ x_j \in \mathbb{R}^{d_j} : \langle s_j, x_j \rangle \leq t_j \right\}.
\]
The collection of half-spaces in $\mathbb{R}^{d_j}$, which separate probabilities (Cramér and Wold [17]), is denoted
\[
\mathcal{F}_1^{(d_j)} := \left\{ H(s_j, t_j) : (s_j, t_j) \in S^{d_j} \times \mathbb{R} \right\}.
\]
The following basic characterization of independence follows from characteristic functions: $X_1, \ldots, X_p$ are independent if and only if
\[
P\, 1\left( \prod_{j=1}^{p} H(s_j, t_j) \right) = \prod_{j=1}^{p} P^{(j)} 1(H(s_j, t_j)),
\]
for all
\[
(H(s_j, t_j))_{j=1}^{p} \in \mathcal{F}_1^{(d_1)} \times \cdots \times \mathcal{F}_1^{(d_p)}.
\]
Let $\ell^\infty(\mathcal{F}_1)$, where
\[
\mathcal{F}_1 = \mathcal{F}_1^{(d_1)} \times \cdots \times \mathcal{F}_1^{(d_p)},
\]
be the set of all bounded functions on $\mathcal{F}_1$, metrized with the supremum norm $\| \cdot \|_{\mathcal{F}_1}$. The $\sigma$-algebra in $\ell^\infty(\mathcal{F}_1)$ is that generated by the open sets, i.e., the Borel $\sigma$-algebra. The independence half-space process in $\ell^\infty(\mathcal{F}_1)$, investigated by Beran et al. [5], is defined as
\[
\sqrt{n} \left( \mathbb{P}_n 1\left( \prod_{j=1}^{p} H(s_j, t_j) \right) - \prod_{j=1}^{p} \mathbb{P}_n^{(j)} 1(H(s_j, t_j)) \right).
\]
However, if the Möbius transformation is applied, the equivalent criterion follows: $X_1, \ldots, X_p$ are independent if and only if
\[
M_A\left( (s_j, t_j)_{j=1}^{p} \right) = 0
\]
for all $(H(s_j, t_j))_{j=1}^{p} \in \mathcal{F}_1$ and all $A \in I_p$, where
\[
M_A\left( (s_j, t_j)_{j=1}^{p} \right) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, P\, 1\left( \prod_{j=1}^{p} H^B(s_j, t_j) \right) \prod_{k \in A \setminus B} P^{(k)} 1(H(s_k, t_k)). \tag{3.1}
\]
Here, the notation
\[
H^B(s_j, t_j) := \begin{cases} H(s_j, t_j), & \text{if } j \in B, \\ \mathbb{R}^{d_j}, & \text{if } j \notin B, \end{cases}
\]
is used. The class of functions to be considered here is given by
\[
\mathcal{F}_1 = \left\{ f = 1\left\{ \cdot \in \prod_{j=1}^{p} H(s_j, t_j) \right\} : \prod_{j=1}^{p} H(s_j, t_j) \in \mathcal{F}_1^{(d_1)} \times \cdots \times \mathcal{F}_1^{(d_p)} \right\}.
\]
Corollary 3.1. If $X_1, \ldots, X_p$ are independent, then, for each $A \in I_p$, as $n \to \infty$, we have
\[
\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F}_1 \right\} \leadsto \left\{ \mathbb{B}_{A,1}(f) : f \in \mathcal{F}_1 \right\} \quad \text{in } \ell^\infty(\mathcal{F}_1).
\]
The processes $\left\{ \mathbb{B}_{A,1}(f) : f \in \mathcal{F}_1 \right\}$ are independent zero mean Gaussian processes with covariance function given, for $A, A' \in I_p$ and $f, g \in \mathcal{F}_1$ indexed by the half-spaces $H(s_k, t_k)$ and $H(s_k', t_k')$, by
\begin{align*}
\mathbb{E}(\mathbb{B}_{A,1}(f)\mathbb{B}_{A',1}(g)) &= 1\{A = A'\} \prod_{k \in A} \left\{ P_k f_k g_k - (P_k f_k)(P_k g_k) \right\} \\
&= 1\{A = A'\} \prod_{k \in A} \left\{ P_k\left( H(s_k, t_k) \cap H(s_k', t_k') \right) - P_k(H(s_k, t_k))\, P_k(H(s_k', t_k')) \right\}.
\end{align*}
Corollary 3.1 is exactly Theorem 1 of Beran et al. [5], and is an easy consequence of Theorem 1.1.
Corollary 3.2. If $X_1, \ldots, X_p$ are independent, then, as $n \to \infty$,
\[
\left\{ \sqrt{n}\, (\mathbb{M}_{A,n}^W - \mathbb{M}_{A,n})(f) : f \in \mathcal{F}_1 \right\} \overset{P}{\underset{W}{\leadsto}} \left\{ \varrho \mathbb{B}_{A,1}^*(f) : f \in \mathcal{F}_1 \right\}, \quad \text{for } A \in I_p.
\]
The processes $\left\{ \mathbb{B}_{A,1}^*(f) : f \in \mathcal{F}_1 \right\}$, $A \in I_p$, are independent copies of $\left\{ \mathbb{B}_{A,1}(f) : f \in \mathcal{F}_1 \right\}$, $A \in I_p$.
In the case when
Wn ∼ Mult(n; n−1, . . . , n−1),
Corollary 3.2 reduces to Theorem 4 of Beran et al. [5].
4. Empirical copula processes
Consider a random vector $X = (X_1, \ldots, X_d)$ with joint cumulative distribution function [df] $F(\cdot)$ and marginal df.s $F_1(\cdot), \ldots, F_d(\cdot)$. The characterization theorem of Sklar [63] implies that there exists a copula function $C(\cdot)$ such that
\[
F(x) = F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d)), \quad \text{for all } x_1, \ldots, x_d \in \mathbb{R}. \tag{4.1}
\]
By definition, the copula function $C(\cdot)$ is a $d$-variate cumulative distribution function on the unit cube $[0, 1]^d$, the margins of which are standard uniform distributions on the interval $[0, 1]$. If the marginal df.s $F_1(\cdot), \ldots, F_d(\cdot)$ are continuous, then the function $C(\cdot)$ is unique and
\[
C(\mathbf{u}) = C(u_1, \ldots, u_d) = F(F_1^{-1}(u_1), \ldots, F_d^{-1}(u_d)), \tag{4.2}
\]
where, for $j = 1, \ldots, d$,
\[
F_j^{-1}(u) = \inf\{x : F_j(x) \geq u\} \ \text{with } u \in (0, 1], \quad \text{and} \quad F_j^{-1}(0) = \lim_{t \downarrow 0} F_j^{-1}(t) = F_j^{\leftarrow}(0+),
\]
is the quantile function of $F_j(\cdot)$. If not stated otherwise, we assume that the $F_j(\cdot)$, for $j = 1, \ldots, d$, are continuous functions. In the monographs by Nelsen [49] and Joe [39] the reader may find detailed ingredients of the modeling theory as well as surveys of the commonly used copulas. For in-depth and overview historical notes we refer to Schweizer [58]. We can refer also to Sklar [64], where the author sketches the proof of (4.1), develops some of its consequences, and surveys some of the work on copulas. Let $\mathbf{X}_k = (X_{1;k}, \ldots, X_{d;k})$, $k = 1, \ldots, n$, be i.i.d. random vectors with a $d$-dimensional
continuous df $F(\cdot)$ whose $j$th marginal, for $j = 1, \ldots, d$, and associated copula are denoted by $F_j(\cdot)$ and $C(\cdot)$, respectively. The joint and marginal empirical df.s are given, respectively, by
\[
F_n(x) = \frac{1}{n} \sum_{k=1}^{n} 1\left\{ X_{1;k} \leq x_1, \ldots, X_{d;k} \leq x_d \right\} = \frac{1}{n} \sum_{k=1}^{n} \prod_{j=1}^{d} 1\left\{ X_{j;k} \leq x_j \right\},
\]
\[
F_{nj}(x_j) = \frac{1}{n} \sum_{k=1}^{n} 1\left\{ X_{j;k} \leq x_j \right\}.
\]
Nonparametric estimation of copulas goes back to Deheuvels [22], who proposed, in order to test for independence, the following empirical copula estimator
\[
C_n(\mathbf{u}) = F_n(F_{n1}^{-1}(u_1), \ldots, F_{nd}^{-1}(u_d)),
\]
where, for $j = 1, \ldots, d$,
\[
F_{nj}^{-1}(u_j) = \inf\{x : F_{nj}(x) \geq u_j\};
\]
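Since $F_{nj}(X_{j;k})$ is just the rank of $X_{j;k}$ divided by $n$, the Deheuvels estimator can be evaluated from the matrix of normalized ranks. A hedged NumPy sketch assuming continuous margins, so that there are no ties (names ours):

```python
import numpy as np

def empirical_copula(X):
    """Deheuvels' empirical copula: C_n(u) = (1/n) #{k : R_{jk}/n <= u_j for all j},
    where R_{jk} is the rank of X_{j;k} within its column."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1   # ranks 1..n per column
    U = ranks / n                                           # F_{nj}(X_{j;k})
    def Cn(u):
        u = np.asarray(u, dtype=float)
        return float(np.mean(np.all(U <= u, axis=1)))
    return Cn

rng = np.random.default_rng(4)
X = rng.normal(size=(3000, 2))          # independent coordinates
Cn = empirical_copula(X)
# Under independence, C(u) = u_1 * u_2, so C_n(0.5, 0.5) should be near 0.25
value = Cn([0.5, 0.5])
```

The rank representation makes the estimator invariant under strictly increasing transformations of each margin, which is the essential property exploited by the copula processes below.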
we may refer to van der Vaart and Wellner [69] for more details and also to Segers [59] forrecent references. We would like to test the mutual independence of p continuous random vectorsX1, . . . , Xp of dimensions d1, . . . , dp respectively. For this reason, we follow notation of Kojadinovicand Holmes [42] which is necessary for the statement of the results of this section. Let S = {1, . . . , p}and let
d = d1 + · · · + dp
be the dimension of the random vector (X1, . . . , Xp). Furthermore, define the integers b1, . . . , bp as
bj =j
k=1
dk, ∀j ∈ S,
with the convention that $b_0 = 0$. Obviously, $b_j = b_{j-1} + d_j$ for all $j \in S$. These integers will be used to name the components of the random vectors $X_1, \ldots, X_p$: for any $k \in S$, the $d_k$ components of the random vector $X_k$ will be denoted by $X_{b_{k-1}+1}, X_{b_{k-1}+2}, \ldots, X_{b_k}$ respectively. The copula of the random vector $(X_1, \ldots, X_p) = (X_1, \ldots, X_d)$ will be denoted by $C(\cdot)$. Moreover, given a vector $\mathbf{u} \in [0, 1]^d$ and a subset $B$ of $S$, the vector $\mathbf{u}^{\{B\}} \in [0, 1]^d$ is defined, for any $i \in \{1, \ldots, d\}$, by
\[
u_i^{\{B\}} = \begin{cases} u_i, & \text{if } i \in \bigcup_{j \in B} \{b_{j-1} + 1, \ldots, b_j\}, \\ 1, & \text{otherwise.} \end{cases}
\]
For any $k \in S$, the marginal copula of $X_k$ is then given by $C(\mathbf{u}^{\{k\}})$, $\mathbf{u} \in [0, 1]^d$, and mutual independence among $X_1, \ldots, X_p$ occurs when
\[
C(\mathbf{u}) = \prod_{k=1}^{p} C\left( \mathbf{u}^{\{k\}} \right), \quad \text{for } \mathbf{u} \in [0, 1]^d.
\]
As we continue, we shall assume that we have at hand $n$ independent copies of the random vector $(X_1, \ldots, X_p) = (X_1, \ldots, X_d)$, denoted by $(X_{1,1}, \ldots, X_{1,d}), \ldots, (X_{n,1}, \ldots, X_{n,d})$. A natural extension of the independence copula process
\[
\gamma_n(\mathbf{u}) = n^{1/2} \left( C_n(\mathbf{u}) - \prod_{j=1}^{d} G_{nj}G_{nj}^{-1}(u_j) \right), \quad \text{for } \mathbf{u} \in [0, 1]^d, \tag{4.3}
\]
based on this random sample is then
\[
\gamma_n^*(\mathbf{u}) = n^{1/2} \left( C_n(\mathbf{u}) - \prod_{k=1}^{p} C_n\left( \mathbf{u}^{\{k\}} \right) \right), \quad \text{for } \mathbf{u} \in [0, 1]^d, \tag{4.4}
\]
where
\[
C_n(\mathbf{u}) = \frac{1}{n} \sum_{k=1}^{n} 1\left\{ F_{n1}(X_{1;k}) \leq u_1, \ldots, F_{nd}(X_{d;k}) \leq u_d \right\} = \mathbb{P}_n f,
\]
with
\[
f = 1\left\{ F_{n1}(\cdot) \leq u_1, \ldots, F_{nd}(\cdot) \leq u_d \right\} = \prod_{j=1}^{p} 1\left\{ F_n^{\{j\}}(\cdot) \leq \mathbf{u}^{\{j\}} \right\} = \prod_{j=1}^{p} f_j, \quad \text{for } \mathbf{u} \in [0, 1]^d.
\]
The class of functions to be considered here is given by
\[
\mathcal{F}_2 = \left\{ f = 1\left\{ F_{n1}(\cdot) \leq u_1, \ldots, F_{nd}(\cdot) \leq u_d \right\} : \mathbf{u} \in [0, 1]^d \right\}.
\]
For all $f \in \mathcal{F}_2$ and $A \in I_p$, recall
\[
M_A(f) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, P\left( \prod_{j=1}^{p} f_j^B \right) \prod_{k \in A \setminus B} P^{(k)} f_k. \tag{4.5}
\]
For each subset $A \in I_p$, the Möbius independence copula process is defined, in $\ell^\infty(\mathcal{F}_2)$, to be
\[
\mathbb{M}_{A,n}(f) := \sum_{B \subseteq A} (-1)^{|A| - |B|} \, \mathbb{P}_n\left( \prod_{j=1}^{p} f_j^B \right) \prod_{k \in A \setminus B} \mathbb{P}_n^{(k)} f_k. \tag{4.6}
\]
The following corollary is an immediate consequence of Theorem 1.1, and corresponds to Theorem 8 of Kojadinovic and Holmes [42]. Notice that Quessy [53] investigated a similar problem using a methodology and proofs different from those proposed by the last-mentioned authors.
Corollary 4.1. Suppose that the copula function $C(\cdot)$ has continuous partial derivatives. If $X_1, \ldots, X_p$ are independent, then, for each $A \in I_p$, as $n \to \infty$, we have
\[
\left\{ \sqrt{n}\, \mathbb{M}_{A,n}(f) : f \in \mathcal{F}_2 \right\} \leadsto \left\{ \mathbb{B}_{A,2}(f) : f \in \mathcal{F}_2 \right\} \quad \text{in } \ell^\infty(\mathcal{F}_2).
\]
The processes $\left\{ \mathbb{B}_{A,2}(f) : f \in \mathcal{F}_2 \right\}$ are independent zero mean Gaussian processes with covariance function given, for $A, A' \in I_p$ and $f, g \in \mathcal{F}_2$, by
\begin{align*}
\mathbb{E}(\mathbb{B}_{A,2}(f)\mathbb{B}_{A',2}(g)) &= 1\{A = A'\} \prod_{k \in A} \left\{ P_k f_k g_k - (P_k f_k)(P_k g_k) \right\} \\
&= 1\{A = A'\} \prod_{k \in A} \left\{ C\left( \mathbf{u}^{\{k\}} \wedge \mathbf{v}^{\{k\}} \right) - C\left( \mathbf{u}^{\{k\}} \right) C\left( \mathbf{v}^{\{k\}} \right) \right\}.
\end{align*}
Corollary 4.2. Suppose that $C(\cdot)$ has continuous partial derivatives. If $X_1, \ldots, X_p$ are independent, then, as $n \to \infty$,
\[
\left\{ \sqrt{n}\, (\mathbb{M}_{A,n}^W - \mathbb{M}_{A,n})(f) : f \in \mathcal{F}_2 \right\} \overset{P}{\underset{W}{\leadsto}} \left\{ \varrho \mathbb{B}_{A,2}^*(f) : f \in \mathcal{F}_2 \right\}, \quad \text{for } A \in I_p.
\]
The processes $\left\{ \mathbb{B}_{A,2}^*(f) : f \in \mathcal{F}_2 \right\}$, $A \in I_p$, are independent copies of $\left\{ \mathbb{B}_{A,2}(f) : f \in \mathcal{F}_2 \right\}$, $A \in I_p$.
Again Corollary 4.2 extends Proposition 18 of Kojadinovic and Holmes [42], corresponding to thechoice
Wn ∼ Mult(n; n−1, . . . , n−1).
Remark 4.3. It is possible to consider the following class of functions:
\[ \mathcal{F}_{2;\varphi} = \big\{ f = \varphi(\mathbf{u})\,\mathbf{1}\{ F_{n1}(\cdot) \le u_1, \ldots, F_{nd}(\cdot) \le u_d \} : \mathbf{u} \in [0,1]^d \big\}, \]
with the special form of the weight functions
\[ \varphi(\mathbf{u}) = \prod_{i=1}^{d} \varphi_i(u_i) := \prod_{i=1}^{d} u_i^{\gamma_i}, \quad \text{for } \gamma_i > -1,\ i = 1, \ldots, d. \]
Then the weighted empirical copula process, in its Möbius combinatorial form, may be characterized by using Theorem 1.1.
5. Empirical characteristic processes
Another basic characterization of independence follows from characteristic functions: X_1, \ldots, X_p are independent if and only if
\[ H_0 : \Psi(\mathbf{t}) = \prod_{k=1}^{p} \Psi_k\big( \mathbf{t}^{\{k\}} \big), \quad \text{for } \mathbf{t} \in \mathbb{R}^d, \tag{5.1} \]
where
\[ \Psi(\mathbf{t}) = \phi(\mathbf{t}) + i\varphi(\mathbf{t}), \quad \text{for } \mathbf{t} = (\mathbf{t}_1, \ldots, \mathbf{t}_p) = (t_1, \ldots, t_d) \in \mathbb{R}^d, \]
is the characteristic function of X = (X_1, \ldots, X_p), and
\[ \Psi^{(j)}\big( \mathbf{t}^{\{j\}} \big) = \Psi\big( 0, \ldots, 0, \mathbf{t}^{\{j\}}, 0, \ldots, 0 \big) = \Psi\big( \mathbf{0}, \mathbf{t}^{\{j\}}, \mathbf{0} \big) \]
is the jth marginal characteristic function of X_j, 1 ≤ j ≤ p. Test statistics based on the characteristic function were considered by Kankainen and Ushakov [41], Csörgő [19,18,21,20], Bilodeau and Lafaye de Micheaux [9] and Bakirov et al. [3]; the interested reader may refer to those papers for further details. Let
\[ \Psi_n(\mathbf{t}) = \frac{1}{n}\sum_{j=1}^{n} \exp\big( i\big\langle \mathbf{t}, \big( X_j^{\{1\}}, \ldots, X_j^{\{p\}} \big) \big\rangle \big) = \int_{\mathbb{R}^d} \exp\{ i\langle \mathbf{t}, \mathbf{u} \rangle \}\, dF_n(\mathbf{u}), \quad \text{for } \mathbf{t} \in \mathbb{R}^d, \]
denote the empirical characteristic function associated with X_1, \ldots, X_n, and let
\[ \Psi_{nj}\big( \mathbf{t}^{\{j\}} \big) = \frac{1}{n}\sum_{k=1}^{n} \exp\big( i\big\langle \mathbf{t}^{\{j\}}, X_k^{\{j\}} \big\rangle \big) = \Psi_n\big( \mathbf{0}, \mathbf{t}^{\{j\}}, \mathbf{0} \big) \tag{5.2} \]
be its jth empirical marginal characteristic function. The empirical characteristic process is defined, for \mathbf{t} \in \mathbb{R}^d, by
\[ \Gamma_n(\mathbf{t}) = n^{1/2}\Big( \Psi_n(\mathbf{t}) - \prod_{j=1}^{p} \Psi_n^{(j)}\big( \mathbf{t}^{\{j\}} \big) \Big). \tag{5.3} \]
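Numerically, Γ_n(\mathbf{t}) in (5.3) only needs the joint and marginal empirical characteristic functions. A small illustrative sketch (names are ours; `blocks` encodes the partition of the d coordinates into the p marginal vectors):

```python
import numpy as np

def char_process(x, t, blocks):
    """Empirical characteristic process (5.3):
    Gamma_n(t) = n^{1/2} * ( Psi_n(t) - prod_j Psi_nj(t^{j}) )."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    n = x.shape[0]
    psi_joint = np.mean(np.exp(1j * (x @ t)))           # Psi_n(t)
    psi_marg = 1.0 + 0.0j
    for blk in blocks:                                   # product of Psi_nj(t^{j})
        cols = list(blk)
        psi_marg *= np.mean(np.exp(1j * (x[:, cols] @ t[cols])))
    return np.sqrt(n) * (psi_joint - psi_marg)
```

Two sanity checks: Γ_n(\mathbf{0}) = 0, and with a single block covering all coordinates the product equals Ψ_n(\mathbf{t}), so Γ_n vanishes identically.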
It is easy to see that
\[ \Gamma_n(\mathbf{t}) = n^{1/2}\Big( \Psi_n(\mathbf{t}) - \prod_{j=1}^{p} \Psi_n^{(j)}\big( \mathbf{t}^{\{j\}} \big) \Big) = \sqrt{n}\Big( \mathbb{P}_n f - \prod_{j=1}^{p} \mathbb{P}_n^{(j)} f_j \Big), \]
where
\[ f = \exp\{ i\langle \mathbf{t}, (\cdot, \ldots, \cdot) \rangle \} = \prod_{j=1}^{p} \exp\big( i\big\langle \mathbf{t}^{\{j\}}, \cdot \big\rangle \big) = \prod_{j=1}^{p} f_j. \]
The class of functions to be considered here is given by
\[ \mathcal{F}_3 = \big\{ f = \exp\{ i\langle \mathbf{t}, (\cdot, \ldots, \cdot) \rangle \} : \mathbf{t} \in \mathbb{R}^d \big\}. \]
From now on, we assume that the marginal vectors are normally distributed, without assuming joint normality of these marginal vectors. Recall that, in the independence case, i.e., when H_0 holds, for A ∈ I_p,
\[ \mathbb{M}_A(f) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \Psi( \mathbf{t}_B ) \prod_{k\in A\setminus B} \Psi\big( \mathbf{t}^{\{k\}} \big) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, P\Big( \prod_{j=1}^{p} f_j^{B} \Big) \prod_{k\in A\setminus B} P^{(k)} f_k = 0, \quad \text{for } f \in \mathcal{F}_3, \]
and its empirical counterpart is given by
\[ \mathbb{M}_{n,A}(f) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \Psi_n( \mathbf{t}_B ) \prod_{k\in A\setminus B} \Psi^{(k)}_n\big( \mathbf{t}^{\{k\}} \big) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{P}_n\Big( \prod_{j=1}^{p} f_j^{B} \Big) \prod_{k\in A\setminus B} \mathbb{P}^{(k)}_n f_k, \quad \text{for } f \in \mathcal{F}_3. \]
Here, the notation
\[ ( \mathbf{t}_B )^{(j)} := \begin{cases} \mathbf{t}^{(j)} & \text{if } j \in B, \\ \mathbf{0} & \text{if } j \notin B, \end{cases} \]
is used. The following corollary shows that, under the independence assumption, the empirical processes arising from the Möbius decomposition of the independence empirical characteristic process (5.3) are asymptotically mutually independent.
Corollary 5.1. If X_1, \ldots, X_p are independent, then, for each A ∈ I_p, as n → ∞, we have
\[ \big\{ \sqrt{n}\,\mathbb{M}_{A,n}(f) : f \in \mathcal{F}_3 \big\} \rightsquigarrow \big\{ \mathbb{B}_{A,3}(f) : f \in \mathcal{F}_3 \big\}, \quad \text{in } \ell^\infty(\mathcal{F}_3). \]
The processes \{ \mathbb{B}_{A,3}(f) : f \in \mathcal{F}_3 \} are independent zero-mean Gaussian processes with covariance function given, for A, A' ∈ I_p and f, g ∈ \mathcal{F}_3, by
\[ \mathbb{E}\big( \mathbb{B}_{A,3}(f)\,\mathbb{B}_{A',3}(g) \big) = \mathbf{1}\{ A = A' \} \prod_{k\in A} \big\{ P_k f_k g_k - ( P_k f_k )( P_k g_k ) \big\} = \mathbf{1}\{ A = A' \} \prod_{k\in A} \big\{ \Psi_k\big( \mathbf{t}^{\{k\}} - \mathbf{s}^{\{k\}} \big) - \Psi_k\big( \mathbf{t}^{\{k\}} \big)\,\Psi_k\big( \mathbf{s}^{\{k\}} \big) \big\}. \]
Corollary 5.1 corresponds to Theorem 2.1 of Bilodeau and Lafaye de Micheaux [9].
Corollary 5.2. If X_1, \ldots, X_p are independent, then, as n → ∞,
\[ \big\{ \sqrt{n}\big( \mathbb{M}^{W}_{A,n} - \mathbb{M}_{A,n} \big)(f) : f \in \mathcal{F}_3 \big\} \overset{P}{\underset{W}{\rightsquigarrow}} \big\{ \varrho\,\mathbb{B}^{*}_{A,3}(f) : f \in \mathcal{F}_3 \big\}, \quad \text{for } A \in I_p. \]
The processes \{ \mathbb{B}^{*}_{A,3}(f) : f \in \mathcal{F}_3,\ A \in I_p \} are independent copies of \{ \mathbb{B}_{A,3}(f) : f \in \mathcal{F}_3,\ A \in I_p \}.
This result establishes the validity of the bootstrap, which is not considered in [9]. As in Remark 4.3, we have the following.
Remark 5.3. It is possible to consider the following class of functions:
\[ \mathcal{F}_{3;\varphi} = \Big\{ f(\mathbf{t}) = \varphi(\mathbf{t}) \prod_{j=1}^{p} \exp\big( i\big\langle \mathbf{t}^{\{j\}}, \cdot \big\rangle \big) : \mathbf{t} \in \mathbb{R}^d \Big\}, \]
with the special form of the weight functions
\[ \varphi(\mathbf{t}) = \prod_{i=1}^{d} \varphi_i(t_i), \]
satisfying some regularity conditions; for further details, refer, for example, to Section 3 of Kankainen and Ushakov [41]. Then the weighted empirical characteristic process, in its Möbius combinatorial form, may be characterized by using Theorem 1.1.
6. Asymptotic behavior of $\sqrt{n}\,\mathbb{M}_{A,n}(f)$ under contiguous alternatives
Asymptotic behavior under contiguous alternatives supposes that one wants to test the null hypothesis
\[ H_0 : P = P_0 = \prod_{j=1}^{p} P^{(j)}, \]
against the sequence of alternatives
\[ H_n : P = Q_n, \]
where we assume, as in [68, §3.10.1, p. 406], that Q_n converges to P_0 in the sense that
\[ \int \Big( \sqrt{n}\big( dQ_n^{1/2} - dP_0^{1/2} \big) - \frac{1}{2}\, h\, dP_0^{1/2} \Big)^{2} \to 0, \tag{6.1} \]
for some measurable function h : \mathcal{X} \to \mathbb{R}.
By definition, the sequence of probability measures Q_n(\cdot) is contiguous with respect to the sequence of probability measures P_n(\cdot) if, for every sequence of measurable sets A_n,
\[ \lim_{n\to\infty} P_n(A_n) = 0 \quad \text{implies} \quad \lim_{n\to\infty} Q_n(A_n) = 0. \]
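A standard textbook illustration of (6.1), not taken from the paper, is the Gaussian location shift. With P_0 = N(0, 1) and Q_n = N(\delta/\sqrt{n}, 1), a Taylor expansion of the square-root densities gives

```latex
\[
  q_n^{1/2}(x)
  = p_0^{1/2}(x)\,\exp\!\Big( \frac{\delta x}{2\sqrt{n}} - \frac{\delta^{2}}{4n} \Big)
  = p_0^{1/2}(x)\Big( 1 + \frac{\delta x}{2\sqrt{n}} + o\big( n^{-1/2} \big) \Big),
\]
```

so that \sqrt{n}\,( dQ_n^{1/2} - dP_0^{1/2} ) \to \tfrac{1}{2}\,\delta x\, dP_0^{1/2} in L_2; condition (6.1) therefore holds with h(x) = \delta x, the score function of the location model at 0.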
The necessary and sufficient condition for contiguity that is of interest to us is given by (6.1). The main result of this section may be stated in the following theorem.
Theorem 6.1. Assume that condition (6.1) holds and
\[ \| Q_n f^2 \|_{\mathcal{F}} = O(1). \tag{6.2} \]
Under H_n, we have, as n → ∞,
\[ \big\{ \sqrt{n}\,\mathbb{M}_{A,n}(f) : f \in \mathcal{F} \big\} \rightsquigarrow \big\{ \widetilde{\mathbb{B}}_A(f) = \mathbb{B}_A(f) + \mu_A(h, f) : f \in \mathcal{F} \big\}, \quad \text{for } A \in I_p, \]
where
\[ \mu_A(h, f) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, P_0\Big( \prod_{j=1}^{p} f_j^{B}\, h \Big) \prod_{k\in A\setminus B} P^{(k)} f_k. \]
The processes \{ \widetilde{\mathbb{B}}_A(f) : f \in \mathcal{F} \}, A ∈ I_p, are independent Gaussian processes with mean, for A ∈ I_p and f ∈ \mathcal{F},
\[ \mathbb{E}\big( \widetilde{\mathbb{B}}_A(f) \big) = \mu_A(h, f), \]
and covariance function given, for A, A' ∈ I_p and f, g ∈ \mathcal{F}, by
\[ \mathrm{Cov}\big( \widetilde{\mathbb{B}}_A(f), \widetilde{\mathbb{B}}_{A'}(g) \big) = \mathbf{1}\{ A = A' \} \prod_{k\in A} \big\{ P_k f_k g_k - ( P_k f_k )( P_k g_k ) \big\}. \]
The proof of Theorem 6.1 is given in Section 9.
Let us define
\[ \mathbb{A}_n(f) := \sqrt{n}\,( \mathbb{P}_n f - Q_n f ) - \sqrt{n}\Big( \prod_{j=1}^{p} \mathbb{P}_n^{(j)} f_j - \prod_{j=1}^{p} Q_n^{(j)} f_j \Big). \]
If condition (6.2) is satisfied, then by [68, Theorem 3.10.12, p. 407], we have
\[ \big\| \sqrt{n}\,( Q_n - P_0 ) f - P_0 f h \big\|_{\mathcal{F}} \to 0, \quad \text{as } n \to \infty. \]
Indeed, from the proof of [68, Theorem 3.10.12, p. 407], under conditions (6.1) and (6.2), we have, as n → ∞,
\[ \sqrt{n}\,( Q_n f - P_0 f ) - P_0 f h = \frac{1}{2}\int f h\, dP_0^{1/2}\big( dQ_n^{1/2} - dP_0^{1/2} \big) + \int f \Big( \sqrt{n}\big( dQ_n^{1/2} - dP_0^{1/2} \big) - \frac{1}{2}\, h\, dP_0^{1/2} \Big)\big( dQ_n^{1/2} + dP_0^{1/2} \big) \to 0, \]
uniformly in f ∈ \mathcal{F}. This, in turn, implies, under H_n, that
\[ \big\{ \mathbb{A}_n(f) : f \in \mathcal{F} \big\} \rightsquigarrow \Big\{ \mathbb{A}(f) = \mathbb{B}(f) - \sum_{i=1}^{p} \mathbb{B}(f_i) \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j : f \in \mathcal{F} \Big\}, \]
provided that conditions (6.1) and (6.2) hold.
Example 6.2. In this example we recall some results from Genest et al. [29] concerning the empirical copula processes under sequences of contiguous alternatives. We may also refer to Bouzebda and Zari [14], where the authors studied the behavior of the weighted quadratic functionals of the multivariate empirical copula processes under sequences of contiguous alternatives. Let Θ ⊂ \mathbb{R} be a closed interval and F = \{ C_θ : θ ∈ Θ \} be a given family of copulas that are monotone in θ with respect to the concordance ordering and for which θ_0 ∈ Θ corresponds to independence. Assume that C_θ(\cdot) is absolutely continuous with density c_θ(\cdot). More specifically, let us assume that F = \{ C_θ : θ ∈ Θ \} is a family of copulas fulfilling the following conditions.
(i) C_θ(\cdot) is monotone in θ, i.e., if θ_1 < θ_2 then C_{θ_1}(\mathbf{u}) < C_{θ_2}(\mathbf{u}) for all \mathbf{u} ∈ (0,1)^d and all θ_1, θ_2 ∈ Θ;
(ii)
\[ C_{\theta_0}(\mathbf{u}) = \prod_{i=1}^{d} u_i \]
for all \mathbf{u} ∈ (0,1]^d and some θ_0 ∈ Θ;
(iii) for every \mathbf{u} ∈ (0,1)^d, the following identity holds:
\[ \dot{C}_{\theta_0}(\mathbf{u}) = \lim_{\theta\to\theta_0} \frac{\partial C_{\theta}(\mathbf{u})}{\partial \theta} = \int_{0}^{u_1} \cdots \int_{0}^{u_d} \dot{c}_{\theta_0}(\mathbf{s})\, ds_1 \cdots ds_d, \tag{6.3} \]
where \dot{c}_{\theta} = \partial c_{\theta} / \partial \theta.
In this setting, one wants to test the null hypothesis
\[ H_0 : \theta = \theta_0, \]
against the sequence of alternatives
\[ H_n : \theta = \theta_n := \theta_0 + \frac{\delta_n}{\sqrt{n}}, \]
for n sufficiently large, where δ_n → δ ∈ \mathbb{R}. Suppose that, under the alternative hypothesis H_n, the random sample \{ X_i = (X_{i1}, \ldots, X_{id}) : i = 1, \ldots, n \} has a joint distribution Q_n(\cdot) associated with C_{θ_n}(\cdot), and denote by P_n(\cdot) the joint distribution of the same sample under independence. The necessary and sufficient condition for contiguity that is of interest to us is given by van der Vaart and Wellner [68], where it is supposed that
\[ \lim_{n\to\infty} \int_{(0,1)^d} \Big( n^{1/2}\big( c_{\theta_n}(\mathbf{u}) - 1 \big) - \frac{\delta}{2}\, \dot{c}_{\theta_0}(\mathbf{u}) \Big)^{2} du_1 \cdots du_d = 0, \tag{6.4} \]
where the density c_θ(\cdot) admits a square-integrable right derivative \dot{c}_θ(\cdot) at θ = θ_0. Under these assumptions, Genest et al. [29] establish the weak convergence of \{ γ_n(\mathbf{u}) : \mathbf{u} ∈ [0,1]^d, n ≥ 1 \}, and their results may be summarized as follows: under H_n, as n → ∞,
\[ \big\{ \sqrt{n}\,\mathbb{M}_{A,n}(f) : f \in \mathcal{F}_2 \big\} \rightsquigarrow \big\{ \mathbb{B}_{A,2}(f) + \delta\,\mathbb{M}_A\big( \dot{C}_{\theta_0}(\mathbf{u}) \big) : f \in \mathcal{F}_2 \big\}. \tag{6.5} \]
The weighted version of the last result is characterized by Bouzebda and Zari [14]. Indeed, we have, under H_n, as n → ∞,
\[ \big\{ \sqrt{n}\,\mathbb{M}_{A,n}(f) : f \in \mathcal{F}_{2;\varphi} \big\} \rightsquigarrow \Big\{ \mathbb{B}_{A,2}(f) + \delta\,\mathbb{M}_A\Big( \prod_{i=1}^{d} u_i^{\gamma_i}\, \dot{C}_{\theta_0}(\mathbf{u}) \Big) : f \in \mathcal{F}_{2;\varphi} \Big\}. \]
Finally, we also mention that the drift term \mathbb{M}_A( \dot{C}_{\theta_0}(\mathbf{u}) ) identified in (6.5) can be computed explicitly for many families of copulas, as is done in [29, Section 3], e.g., the multivariate equicorrelated normal copula, the one-parameter multivariate Farlie–Gumbel–Morgenstern copula and the Archimedean copulas satisfying the conditions of Proposition 3.2 of Genest et al. [29]. In the same reference, conditions (6.3) and (6.4) are checked, e.g., for Frank's family of d-variate copulas, Clayton's d-variate family of copulas and the Gumbel–Hougaard family of copulas.
Remark 6.3. From Theorem 6.1, the power against a sequence of alternatives Q_n satisfying (6.1) equals
\[ P_{Q_n}\big( \| \sqrt{n}\,\mathbb{M}_{A,n}(f) \|_{\mathcal{F}} > m_A,\ A \in I_p \big) \to P\big( \| \mathbb{B}_A(f) + \mu_A(h, f) \|_{\mathcal{F}} > m_A,\ A \in I_p \big), \]
or
\[ P_{Q_n}\Big( \int_{\mathcal{X}} \big( \sqrt{n}\,\mathbb{M}_{A,n}(f(\mathbf{x})) \big)^{2}\, d\mathbb{P}_n(\mathbf{x}) > m'_A,\ A \in I_p \Big) \to P\Big( \int_{\mathcal{X}} \big\{ \mathbb{B}_A(f) + \mu_A(h, f) \big\}^{2}\, dP(\mathbf{x}) > m'_A,\ A \in I_p \Big). \]
Unfortunately, these last expressions can rarely be evaluated explicitly.
7. Practical computation aspects of the statistical tests
It is well known that Theorem 2.1 can be used easily through routine bootstrap sampling, which we describe briefly as follows. Let N be a large integer, and let
\[ \mathbf{W}^{(\ell)}_n = \big( W^{(\ell)}_{n1}, \ldots, W^{(\ell)}_{nn} \big)^{\top}, \quad \text{for } \ell = 1, \ldots, N, \]
be exchangeable vectors of nonnegative weights satisfying the preceding conditions (W.1)–(W.5) and independent of X_1, \ldots, X_n. Moreover, for any ℓ = 1, \ldots, N, let
\[ \mathbb{M}^{W^{(\ell)}}_{A,n}(f) := \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{P}^{W^{(\ell)}}_n\Big( \prod_{j=1}^{p} f_j^{B} \Big) \prod_{k\in A\setminus B} \mathbb{P}^{(k),W^{(\ell)}}_n f_k, \quad \text{for } A \in I_p,\ f \in \mathcal{F}. \]
Now, according to Theorem 2.1, we readily obtain that, for A ∈ I_p, as n → ∞,
\[ \Big\{ \Big( \sqrt{n}\,\mathbb{M}_{A,n}(f),\ \sqrt{n}\big( \mathbb{M}^{W^{(1)}}_{A,n} - \mathbb{M}_{A,n} \big)(f),\ \ldots,\ \sqrt{n}\big( \mathbb{M}^{W^{(N)}}_{A,n} - \mathbb{M}_{A,n} \big)(f) \Big) : f \in \mathcal{F} \Big\} \rightsquigarrow \Big\{ \Big( \mathbb{B}_A(f),\ \varrho\,\mathbb{B}^{(1)}_A(f),\ \ldots,\ \varrho\,\mathbb{B}^{(N)}_A(f) \Big) : f \in \mathcal{F} \Big\}, \quad \text{in } \ell^\infty(\mathcal{F})^{\otimes(N+1)}, \]
where \{ \mathbb{B}^{(1)}_A(f) : f \in \mathcal{F} \}, \ldots, \{ \mathbb{B}^{(N)}_A(f) : f \in \mathcal{F} \} are independent copies of \{ \mathbb{B}_A(f) : f \in \mathcal{F} \}. In order to approximate the limiting distribution of \{ \sqrt{n}\,\mathbb{M}_{A,n}(f) : f \in \mathcal{F} \}, one can use the empirical distribution of \{ \sqrt{n}( \mathbb{M}^{W^{(\ell)}}_{A,n} - \mathbb{M}_{A,n} )(f) : f \in \mathcal{F} \}, for ℓ = 1, \ldots, N, with N large enough. In the above-mentioned examples, given in (1.6) and (1.7), the statistics can be written as functionals of \{ \sqrt{n}\,\mathbb{M}_{A,n}(f) : f \in \mathcal{F} \}, and their asymptotic behavior can then be deduced from the weak convergence properties of \{ \sqrt{n}( \mathbb{M}^{W}_{A,n} - \mathbb{M}_{A,n} )(f) : f \in \mathcal{F} \}.
To be more precise, if we are interested in performing a statistical test based on a smooth functional, for A ∈ I_p,
\[ S_{A,n} := \phi\big( \sqrt{n}\,\mathbb{M}_{A,n}(f) \big), \]
with the convention that large values of S_{A,n} lead to the rejection of the null hypothesis, H_0 say, then, under some regularity conditions, a valid approximation to the P-value of the test based on S_{A,n}, for N large enough, is given by
\[ \frac{1}{N} \sum_{\ell=1}^{N} \mathbf{1}\big\{ S^{(\ell)}_{A,n} \ge S_{A,n} \big\}, \quad \text{where} \quad S^{(\ell)}_{A,n} := \phi\Big( \sqrt{n}\big( \mathbb{M}^{W^{(\ell)}}_{A,n} - \mathbb{M}_{A,n} \big)(f) \Big). \]
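The resampling recipe above is generic once the functional φ is fixed. The following sketch (names are ours; Efron's multinomial weights, i.e. W_n ∼ Mult(n; n^{-1}, \ldots, n^{-1}), stand in for a general exchangeable weight scheme) approximates the P-value (1/N) Σ_ℓ 1{S^{(ℓ)}_{A,n} ≥ S_{A,n}}:

```python
import numpy as np

def bootstrap_pvalue(x, stat, boot_stat, num_boot=200, seed=0):
    """Approximate the P-value of a test rejecting for large stat(x).

    stat(x)         -> observed statistic S_{A,n}
    boot_stat(x, w) -> resampled statistic S^{(l)}_{A,n} computed from the
                       exchangeable weights w (here multinomial, as a stand-in)
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    s_obs = stat(x)
    exceed = 0
    for _ in range(num_boot):
        w = rng.multinomial(n, np.full(n, 1.0 / n))  # W_n ~ Mult(n; 1/n, ..., 1/n)
        if boot_stat(x, w) >= s_obs:
            exceed += 1
    return exceed / num_boot
```

The two callables are deliberately left abstract: in the setting of this section, `stat` would compute φ(√n M_{A,n}) and `boot_stat` φ(√n(M^{W}_{A,n} − M_{A,n})).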
For more details and further discussion about the approximation of the P-value by the bootstrap in this framework, we may refer, among others, to Genest and Rémillard [30], Bilodeau and Lafaye de Micheaux [9], Beran et al. [5], Rémillard and Scaillet [54], Kojadinovic and Holmes [42], Bouzebda [11] and Bouzebda and Zari [15]. Notice that the rejection region for the test based on the Möbius decomposition can be constructed, according to Genest and Rémillard [30] and Kojadinovic and Holmes [42], as
\[ \bigcup_{A\in I_p} \big\{ S_{A,n} \ge m_A \big\}, \]
where the m_A are critical values chosen to achieve an asymptotic global significance level α. According to Genest and Rémillard [30] and Kojadinovic and Holmes [42], it is convenient to choose these critical values such that, under independence, for A ∈ I_p,
\[ P\{ S_A \ge m_A \} = 1 - \beta, \quad \text{where} \quad S_A := \phi\big( \mathbb{B}_A(f) \big) \quad \text{and} \quad \beta = (1-\alpha)^{1/(2^p - p - 1)}. \]
Once more, Theorem 2.1 can be used to approximate the critical values. We close this section by mentioning that Genest and Rémillard [30], Beran et al. [5] and Kojadinovic and Holmes [42] have used Fisher's [27] and Tippett's [66] P-value combination methods in order to increase the power of the proposed statistical tests; we may also refer to Genest et al. [29], where additional combination rules were investigated. Since the variables
\[ S_{A,n} := \phi\big( \sqrt{n}\,\mathbb{M}_{A,n}(f) \big), \quad \text{for } A \in I_p, \]
are asymptotically independent (with a continuous limiting cdf), Fisher's P-value combination method yields the overall test of independence
\[ W_n = -2 \sum_{A\in I_p} \log\big\{ 1 - F_{A,0}( S_{A,n} ) \big\}, \]
which should be approximately distributed as chi-square with 2(2^p − p − 1) degrees of freedom; refer to Genest and Rémillard [30]. Here F_{A,0} denotes the distribution function of S_{A,n} under the hypothesis of independence. As discussed in [30], under independence, the P-values 1 − F_{A,0}(S_{A,n}) obtained from the statistics S_{A,n}, for A ∈ I_p, are approximately uniform on [0, 1]. However, F_{A,0} being unknown in practice, the test can be run by using the bootstrap (Theorem 2.1) in a similar way as in [5,42]. In order to extract methodological recommendations for the use of the statistics proposed in this work, it would be interesting to conduct extensive Monte Carlo experiments comparing the procedures presented in the preceding sections, but this would go well beyond the scope of the present paper.
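As a numerical companion to Fisher's combination rule above (function names are ours; the degrees-of-freedom count 2(2^p − p − 1) and the calibration β = (1 − α)^{1/(2^p − p − 1)} are taken from the displays of this section):

```python
import numpy as np

def fisher_statistic(pvalues):
    """Fisher's combination W_n = -2 * sum_A log(p_A) over the 2^p - p - 1
    per-subset P-values; approximately chi-square with 2(2^p - p - 1)
    degrees of freedom under independence."""
    return -2.0 * float(np.sum(np.log(np.asarray(pvalues, dtype=float))))

def per_statistic_level(alpha, p):
    """beta = (1 - alpha)^(1/(2^p - p - 1)): using the corresponding critical
    value m_A for each of the asymptotically independent statistics S_A
    yields global asymptotic level alpha."""
    return (1.0 - alpha) ** (1.0 / (2 ** p - p - 1))
```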
8. Concluding remarks
The problem of testing independence is addressed in the present paper in a general setting. Asymptotic properties of the processes considered in this paper are obtained by means of empirical process theory in connection with functional delta methods. Our results include some previous results as particular cases, obtained by choosing appropriate classes of functions. It would be of interest to extend the present work to the problem of testing conditional independence, which requires nontrivial mathematics that go well beyond the scope of the present paper. Another future research direction would be to study the problem of testing independence, as investigated in this work, in the setting of serially dependent observations.
9. Proof
This section is devoted to the proofs of our results. The previously presented notation continues tobe used in the following.
Proof of Theorem 1.1. Let a_i and b_i, i = 1, \ldots, k, be real numbers. We have the following identity:
\[ \prod_{i=1}^{k} a_i - \prod_{i=1}^{k} b_i = \sum_{i=1}^{k} ( a_i - b_i ) \prod_{j=1}^{i-1} b_j \prod_{h=i+1}^{k} a_h. \tag{9.1} \]
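The telescoping identity (9.1) is easy to check numerically; a throwaway sketch (the name is ours):

```python
import numpy as np

def telescope_rhs(a, b):
    """Right-hand side of (9.1):
    sum_i (a_i - b_i) * prod_{j<i} b_j * prod_{h>i} a_h."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(sum((a[i] - b[i]) * np.prod(b[:i]) * np.prod(a[i + 1:])
                     for i in range(len(a))))
```

Each summand telescopes, so the sum collapses to ∏ a_i − ∏ b_i.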
This, when combined with the independence assumption, implies, in turn, that, for each f ∈ \mathcal{F},
\begin{align*}
\mathbb{A}_n(f) := \sqrt{n}\Big( \mathbb{P}_n f - \prod_{j=1}^{p} \mathbb{P}_n^{(j)} f_j \Big) &= \sqrt{n}\,( \mathbb{P}_n f - P f ) - \sqrt{n}\Big( \prod_{j=1}^{p} \mathbb{P}_n^{(j)} f_j - \prod_{j=1}^{p} P^{(j)} f_j \Big) \\
&= \sqrt{n}\,( \mathbb{P}_n f - P f ) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}_n^{(i)} f_i - P^{(i)} f_i \big) \prod_{j=1}^{i-1} P^{(j)} f_j \prod_{h=i+1}^{p} \mathbb{P}_n^{(h)} f_h. \tag{9.2}
\end{align*}
Notice that the class of functions \mathcal{F} is a P-Glivenko–Cantelli class of functions, i.e.,
\[ \| \mathbb{P}_n - P \|_{\mathcal{F}} = \sup_{f\in\mathcal{F}} | \mathbb{P}_n f - P f | \to 0. \]
This, when combined with (9.2), gives, for each f ∈ \mathcal{F},
\[ \mathbb{A}_n(f) = \sqrt{n}\,( \mathbb{P}_n f - P f ) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}_n^{(i)} f_i - P^{(i)} f_i \big) \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j + o_P(1). \]
We have, in \ell^\infty(\mathcal{F}),
\[ \big\{ \sqrt{n}\,( \mathbb{P}_n f - P f ) : f \in \mathcal{F} \big\} \rightsquigarrow \{ \mathbb{B}(f) : f \in \mathcal{F} \} \]
and, in \ell^\infty(\mathcal{F}_j),
\[ \big\{ \sqrt{n}\big( \mathbb{P}_n^{(j)} f_j - P^{(j)} f_j \big) : f_j \in \mathcal{F}_j \big\} \rightsquigarrow \big\{ \mathbb{B}(f_j) : f_j \in \mathcal{F}_j \big\}. \]
Hence, we have, in \ell^\infty(\mathcal{F}),
\[ \big\{ \mathbb{A}_n(f) : f \in \mathcal{F} \big\} \rightsquigarrow \Big\{ \mathbb{A}(f) = \mathbb{B}(f) - \sum_{i=1}^{p} \mathbb{B}(f_i) \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j : f \in \mathcal{F} \Big\}. \]
For any A ∈ I_p, the map \mathbb{M}_A(\cdot) is Hadamard differentiable tangentially to \ell^\infty(\mathcal{F}), and its derivative at f ∈ \ell^\infty(\mathcal{F}) is a continuous linear map from \ell^\infty(\mathcal{F}) to \ell^\infty(\mathcal{F}), computed as follows. Fix A ∈ I_p and f(\cdot) ∈ \ell^\infty(\mathcal{F}), and let t_n be a sequence of reals converging to 0. Let a_n(\cdot) ∈ \ell^\infty(\mathcal{F}) be a sequence of functions converging to a(\cdot) ∈ \ell^\infty(\mathcal{F}) such that (f + t_n a_n)(\cdot) ∈ \ell^\infty(\mathcal{F}) for every n. Then, by similar arguments as in the proof of Lemma 4 of Kojadinovic and Holmes [42], uniformly in f ∈ \mathcal{F}, as n → ∞,
\begin{align*}
\frac{1}{t_n}\big( \mathbb{M}_A(f + t_n a_n) - \mathbb{M}_A(f) \big) &= \sum_{B\subseteq A} \frac{(-1)^{|A|-|B|}}{t_n} \Big( ( f + t_n a_n )(\mathbf{x}_B) \prod_{k\in A\setminus B} ( f + t_n a_n )(\mathbf{x}_k) - f(\mathbf{x}_B) \prod_{k\in A\setminus B} f(\mathbf{x}_k) \Big) \\
&\to \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( f(\mathbf{x}_B) \sum_{k\in A\setminus B} a(\mathbf{x}_k) \prod_{i\in A\setminus B,\, i\neq k} f(\mathbf{x}_i) + a(\mathbf{x}_B) \prod_{k\in A\setminus B} f(\mathbf{x}_k) \Big) \\
&:= \mathbb{M}'_{A,f}(a)(\mathbf{x}), \quad \text{for } A \in I_p. \tag{9.3}
\end{align*}
Let
\[ \overrightarrow{\mathbb{M}} : \ell^\infty(\mathcal{F}) \to \big( \ell^\infty(\mathcal{F}) \big)^{2^p - p - 1} \]
denote the map whose 2^p − p − 1 components are the maps \mathbb{M}_A, A ∈ I_p. Similar arguments as in [42] lead to the following weak convergence, in (\ell^\infty(\mathcal{F}))^{2^p - p - 1}, as n → ∞,
\[ \big\{ \sqrt{n}\,\overrightarrow{\mathbb{M}}_{A,n}(f) : f \in \mathcal{F} \big\} \rightsquigarrow \big\{ \overrightarrow{\mathbb{M}}'_A(\mathbb{A}(f)) : f \in \mathcal{F} \big\}, \]
whose corresponding components are defined by
\[ \mathbb{M}'_{A,f}(\mathbb{A}(f)) = \mathbb{B}_A(f) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k, \quad \text{for } f \in \mathcal{F} \text{ and } A \in I_p. \]
In the sequel, we tacitly use the identity
\[ \sum_{B\subseteq A} (-1)^{|A|-|B|} = 0. \]
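The tacit identity can be verified directly for any nonempty A (the case A = ∅ gives 1, but the subsets in I_p have at least two elements); a quick sketch:

```python
from itertools import chain, combinations

def alternating_subset_sum(A):
    """sum over B subset of A of (-1)^(|A| - |B|); equals (1 - 1)^{|A|} = 0
    for nonempty A by the binomial theorem, and 1 for A empty."""
    subs = chain.from_iterable(combinations(A, r) for r in range(len(A) + 1))
    return sum((-1) ** (len(A) - len(B)) for B in subs)
```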
We give the details for the form of the limiting process, for which we use similar arguments as in [42]: for each f ∈ \mathcal{F} and A ∈ I_p,
\begin{align*}
\mathbb{M}'_{A,f}(\mathbb{A}(f)) = \mathbb{B}_A(f) &= \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( P f^B \sum_{k\in A\setminus B} \mathbb{A}(f_k) \prod_{i\in A\setminus B,\, i\neq k} P^{(i)} f_i + \mathbb{A}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&= \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} \mathbb{A}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i + \mathbb{A}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&= \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i + \Big( \mathbb{B}(f^B) - \sum_{i\in B} \mathbb{B}(f_i) \prod_{j\in B,\, j\neq i} P^{(j)} f_j \Big) \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&= \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i - \sum_{k\in B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i + \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&= \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k.
\end{align*}
To obtain the last equality, we have used the following equation:
\[ \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i - \sum_{k\in B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \Big) = 0. \tag{9.4} \]
Indeed,
\begin{align*}
\sum_{B\subseteq A} (-1)^{|A|-|B|} \sum_{k\in A\setminus B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i &= \sum_{B\subseteq A} \sum_{k\in A\setminus B} (-1)^{|A|-|B|}\, \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \\
&= \sum_{k\in A} \sum_{B\subseteq A\setminus\{k\}} (-1)^{|A|-|B|}\, \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \\
&= \sum_{k\in A} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \sum_{B\subseteq A\setminus\{k\}} (-1)^{|A|-|B|} = 0.
\end{align*}
In a similar way, we have
\begin{align*}
\sum_{B\subseteq A} (-1)^{|A|-|B|} \sum_{k\in B} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i &= \sum_{B\subseteq A} \sum_{k\in B} (-1)^{|A|-|B|}\, \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \\
&= \sum_{k\in A} \sum_{\substack{B\subseteq A \\ B\ni k}} (-1)^{|A|-|B|}\, \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \\
&= \sum_{k\in A} \mathbb{B}(f_k) \prod_{i\in A,\, i\neq k} P^{(i)} f_i \sum_{\substack{B\subseteq A \\ B\ni k}} (-1)^{|A|-|B|} = 0.
\end{align*}
We now compute the covariance function of the limiting process. We use the same arguments as in [42, pp. 1143–1144], extended to our setting. Fix A and A' in I_p and f, g ∈ \mathcal{F}. Then
\begin{align*}
\mathbb{E}\big( \mathbb{B}_A(f)\mathbb{B}_{A'}(g) \big) &= \mathbb{E}\Big( \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k \times \sum_{B'\subseteq A'} (-1)^{|A'|-|B'|}\, \mathbb{B}(g^{B'}) \prod_{k'\in A'\setminus B'} P^{(k')} g_{k'} \Big) \\
&= \sum_{B'\subseteq A'} \sum_{B\subseteq A} (-1)^{|A|-|B|} (-1)^{|A'|-|B'|}\, \mathbb{E}\big( \mathbb{B}(f^B)\mathbb{B}(g^{B'}) \big) \prod_{k\in A\setminus B} P^{(k)} f_k \prod_{k'\in A'\setminus B'} P^{(k')} g_{k'} \\
&= \sum_{B'\subseteq A'} \sum_{B\subseteq A} (-1)^{|A|-|B|} (-1)^{|A'|-|B'|} \big( P f^B g^{B'} - P f^B\, P g^{B'} \big) \prod_{k\in A\setminus B} P^{(k)} f_k \prod_{k'\in A'\setminus B'} P^{(k')} g_{k'}.
\end{align*}
Let A ∩ A' = R ≠ ∅. Then we readily infer that
\[ \mathbb{E}\big( \mathbb{B}_A(f)\mathbb{B}_{A'}(g) \big) = \sum_{K\subseteq A\setminus R} \sum_{L\subseteq R} \sum_{K'\subseteq A'\setminus R} \sum_{L'\subseteq R} (-1)^{|A|-|K|-|L|+|A'|-|K'|-|L'|} \big( P f^{K\cup L} g^{K'\cup L'} - P f^{K\cup L}\, P g^{K'\cup L'} \big) \prod_{k\in A\setminus(K\cup L)} P^{(k)} f_k \prod_{k'\in A'\setminus(K'\cup L')} P^{(k')} g_{k'}. \]
By construction, K ∩ (L ∪ L' ∪ K') = ∅, and similarly for K'. The mutual independence, in turn, implies that
\[ \mathbb{E}\big( \mathbb{B}_A(f)\mathbb{B}_{A'}(g) \big) = \sum_{L\subseteq R} \sum_{L'\subseteq R} (-1)^{|L|+|L'|} \big( P f^{L} g^{L'} - P f^{L}\, P g^{L'} \big) \prod_{k\in A\setminus L} P^{(k)} f_k \prod_{k'\in A'\setminus L'} P^{(k')} g_{k'} \sum_{K\subseteq A\setminus R} (-1)^{|A|-|K|} \sum_{K'\subseteq A'\setminus R} (-1)^{|A'|-|K'|}. \]
Keep in mind that this covariance is equal to zero unless A = R = A'. This readily implies that
\begin{align*}
\mathbb{E}\big( \mathbb{B}_A(f)\mathbb{B}_A(g) \big) &= \sum_{B\subseteq A} \sum_{K\subseteq A\setminus B} \sum_{L\subseteq B} (-1)^{|B|+|K|+|L|} \big( P f^{B} g^{K\cup L} - P f^{B}\, P g^{K\cup L} \big) \prod_{k\in A\setminus B} P^{(k)} f_k \prod_{k\in A\setminus(K\cup L)} P^{(k)} g_k \\
&= \sum_{B\subseteq A} \sum_{L\subseteq B} (-1)^{|B|+|L|} \big( P f^{B} g^{L} - P f^{B}\, P g^{L} \big) \prod_{k\in A\setminus B} P^{(k)} f_k \prod_{k\in A\setminus L} P^{(k)} g_k \sum_{K\subseteq A\setminus B} (-1)^{|K|}.
\end{align*}
Notice that the summand is zero unless B = A. Now, we obtain the following:
\begin{align*}
\mathbb{E}\big( \mathbb{B}_A(f)\mathbb{B}_A(g) \big) &= \sum_{L\subseteq A} (-1)^{|A|+|L|} \big( P f^{A} g^{L} - P f^{A}\, P g^{L} \big) \prod_{k\in A\setminus L} P^{(k)} g_k \\
&= \sum_{L\subseteq A} (-1)^{|A|+|L|} \big( P f^{L} g^{L} - P f^{L}\, P g^{L} \big) \prod_{k\in A\setminus L} P^{(k)} g_k\, P^{(k)} f_k \\
&= \sum_{L\subseteq A} (-1)^{|A|+|L|} \Big( \prod_{j\in L} P^{(j)} f_j g_j - \prod_{j\in L} P^{(j)} f_j\, P^{(j)} g_j \Big) \prod_{k\in A\setminus L} P^{(k)} g_k\, P^{(k)} f_k.
\end{align*}
This is the difference of two terms, the second of which is zero by the fact that
\[ \prod_{k\in A} P^{(k)} g_k\, P^{(k)} f_k \]
is independent of L and
\[ \sum_{L\subseteq A} (-1)^{|A|+|L|} = 0. \]
Let us recall the multinomial formula: for a_1, \ldots, a_p, b_1, \ldots, b_p ∈ \mathbb{R},
\[ \prod_{i=1}^{p} ( a_i + b_i ) = \sum_{A \subseteq \{1, \ldots, p\}} \prod_{i\in A} a_i \prod_{i\notin A} b_i. \]
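The multinomial formula expands over all subsets of {1, \ldots, p}; a quick numerical check (the names are ours):

```python
import numpy as np
from itertools import chain, combinations

def multinomial_rhs(a, b):
    """Sum over all subsets A of {0, ..., p-1} of
    prod_{i in A} a_i * prod_{i not in A} b_i."""
    p = len(a)
    idx = list(range(p))
    subs = chain.from_iterable(combinations(idx, r) for r in range(p + 1))
    return float(sum(np.prod([a[i] for i in A]) *
                     np.prod([b[i] for i in idx if i not in A])
                     for A in subs))
```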
Making use of the multinomial formula, we finally infer that
\[ \mathbb{E}\big( \mathbb{B}_A(f)\mathbb{B}_A(g) \big) = \sum_{L\subseteq A} (-1)^{|A|+|L|} \prod_{j\in L} P^{(j)} f_j g_j \prod_{k\in A\setminus L} P^{(k)} g_k\, P^{(k)} f_k = \prod_{k\in A} \big\{ P_k f_k g_k - ( P_k f_k )( P_k g_k ) \big\}. \]
This completes the proof of Theorem 1.1. □
Proof of Theorem 2.1. First, we have the following representation of the process \{ \mathbb{A}^W_n(f) : f \in \mathcal{F} \}:
\begin{align*}
\mathbb{A}^W_n(f) &:= \sqrt{n}\big( \mathbb{P}^W_n f - \mathbb{P}_n f \big) - \sqrt{n}\Big( \prod_{j=1}^{p} \mathbb{P}^{(j),W}_n f_j - \prod_{j=1}^{p} \mathbb{P}^{(j)}_n f_j \Big) \\
&= \sqrt{n}\big( \mathbb{P}^W_n f - \mathbb{P}_n f \big) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}^{(i),W}_n f_i - \mathbb{P}^{(i)}_n f_i \big) \prod_{j=1}^{i-1} \mathbb{P}^{(j)}_n f_j \prod_{h=i+1}^{p} \mathbb{P}^{(h),W}_n f_h \\
&= \sqrt{n}\big( \mathbb{P}^W_n f - \mathbb{P}_n f \big) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}^{(i),W}_n f_i - \mathbb{P}^{(i)}_n f_i \big) \prod_{j=1,\, j\neq i}^{p} \mathbb{P}^{(j)}_n f_j \\
&= \sqrt{n}\big( \mathbb{P}^W_n f - \mathbb{P}_n f \big) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}^{(i),W}_n f_i - \mathbb{P}^{(i)}_n f_i \big) \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j.
\end{align*}
Making use of Theorem 2.2 of Præstgaard and Wellner [52], we infer that, in \ell^\infty(\mathcal{F}), as n → ∞,
\[ \big\{ \sqrt{n}\big( \mathbb{P}^W_n f - \mathbb{P}_n f \big) : f \in \mathcal{F} \big\} \overset{P}{\underset{W}{\rightsquigarrow}} \{ \varrho\,\mathbb{B}(f) : f \in \mathcal{F} \}. \tag{9.5} \]
We infer likewise that, in \ell^\infty(\mathcal{F}_j), as n → ∞,
\[ \big\{ \sqrt{n}\big( \mathbb{P}^{(j),W}_n f_j - \mathbb{P}^{(j)}_n f_j \big) : f_j \in \mathcal{F}_j \big\} \overset{P}{\underset{W}{\rightsquigarrow}} \big\{ \varrho\,\mathbb{B}(f_j) : f_j \in \mathcal{F}_j \big\}. \tag{9.6} \]
We then make use of Eq. (9.5) in combination with (9.6) to infer that, in \ell^\infty(\mathcal{F}), as n → ∞,
\[ \big\{ \mathbb{A}^W_n(f) : f \in \mathcal{F} \big\} \overset{P}{\underset{W}{\rightsquigarrow}} \Big\{ \mathbb{A}^{*}(f) = \varrho\,\mathbb{B}(f) - \sum_{i=1}^{p} \varrho\,\mathbb{B}(f_i) \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j : f \in \mathcal{F} \Big\}. \]
An easy argument, by an application of (9.3), now shows that the following weak convergence holds in (\ell^\infty(\mathcal{F}))^{2^p - p - 1}, as n → ∞,
\[ \big\{ \sqrt{n}\big( \overrightarrow{\mathbb{M}}^{W}_{A,n} - \overrightarrow{\mathbb{M}}_{A,n} \big)(f) : f \in \mathcal{F} \big\} \rightsquigarrow \big\{ \varrho\,\overrightarrow{\mathbb{M}}'_A(\mathbb{A}(f)) : f \in \mathcal{F} \big\}, \]
or, equivalently,
\[ \big\{ \sqrt{n}\big( \mathbb{M}^{W}_{A,n} - \mathbb{M}_{A,n} \big)(f) : f \in \mathcal{F} \big\} \overset{P}{\underset{W}{\rightsquigarrow}} \{ \varrho\,\mathbb{B}_A(f) : f \in \mathcal{F} \}, \quad \text{for each } A \in I_p, \]
where we recall that
\[ \mathbb{B}_A(f) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k, \quad \text{for } f \in \mathcal{F}. \]
This completes the proof of Theorem 2.1. □
Proof of Theorem 6.1. Recall that
\[ \mathbb{A}_n(f) := \sqrt{n}\,( \mathbb{P}_n f - P_0 f ) - \sqrt{n}\Big( \prod_{j=1}^{p} \mathbb{P}^{(j)}_n f_j - \prod_{j=1}^{p} P^{(j)} f_j \Big) = \sqrt{n}\,( \mathbb{P}_n f - P_0 f ) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}^{(i)}_n f_i - P^{(i)} f_i \big) \prod_{j=1}^{i-1} P^{(j)} f_j \prod_{h=i+1}^{p} \mathbb{P}^{(h)}_n f_h. \tag{9.7} \]
We have, under H_n,
\[ \mathbb{A}_n(f) = \sqrt{n}\,( \mathbb{P}_n f - Q_n f ) - \sum_{i=1}^{p} \sqrt{n}\big( \mathbb{P}^{(i)}_n f_i - Q^{(i)}_n f_i \big) \prod_{j=1,\, j\neq i}^{p} Q^{(j)}_n f_j + \sqrt{n}\,( Q_n f - P_0 f ) - \sqrt{n}\Big( \prod_{j=1}^{p} Q^{(j)}_n f_j - \prod_{j=1}^{p} P^{(j)} f_j \Big) + o_P(1). \tag{9.8} \]
Notice that, by the identity (9.1), we have
\[ \sqrt{n}\Big( \prod_{j=1}^{p} Q^{(j)}_n f_j - \prod_{j=1}^{p} P^{(j)} f_j \Big) = \sum_{i=1}^{p} \sqrt{n}\big( Q^{(i)}_n f_i - P^{(i)} f_i \big) \prod_{j=1}^{i-1} P^{(j)} f_j \prod_{k=i+1}^{p} Q^{(k)}_n f_k. \]
Recall that, under conditions (6.1) and (6.2), we have, as n → ∞,
\[ \sqrt{n}\,( Q_n f - P_0 f ) - P_0 f h = \frac{1}{2}\int f h\, dP_0^{1/2}\big( dQ_n^{1/2} - dP_0^{1/2} \big) + \int f \Big( \sqrt{n}\big( dQ_n^{1/2} - dP_0^{1/2} \big) - \frac{1}{2}\, h\, dP_0^{1/2} \Big)\big( dQ_n^{1/2} + dP_0^{1/2} \big) \to 0, \]
uniformly in f ∈ \mathcal{F}. Applying Theorem 3.10.12 of van der Vaart and Wellner [68], we readily infer, under H_n, in \ell^\infty(\mathcal{F}), that, as n → ∞,
\[ \big\{ \sqrt{n}\,( \mathbb{P}_n f - Q_n f ) : f \in \mathcal{F} \big\} \rightsquigarrow \{ \mathbb{B}(f) : f \in \mathcal{F} \} \]
and, in \ell^\infty(\mathcal{F}_j),
\[ \big\{ \sqrt{n}\big( \mathbb{P}^{(j)}_n f_j - Q^{(j)}_n f_j \big) : f_j \in \mathcal{F}_j \big\} \rightsquigarrow \big\{ \mathbb{B}(f_j) : f_j \in \mathcal{F}_j \big\}. \]
Hence, we have, under H_n, in \ell^\infty(\mathcal{F}), as n → ∞,
\[ \big\{ \mathbb{A}_n(f) : f \in \mathcal{F} \big\} \rightsquigarrow \Big\{ \widetilde{\mathbb{A}}(f) = \mathbb{B}(f) - \sum_{i=1}^{p} \mathbb{B}(f_i) \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j + P_0 f h - \sum_{i=1}^{p} P^{(i)} f_i h_i \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j : f \in \mathcal{F} \Big\}. \]
Let us write
\[ \widetilde{\mathbb{A}}(f) = \mathbb{A}(f) + \widetilde{P} f h, \quad \text{where} \quad \widetilde{P} f h = P_0 f h - \sum_{i=1}^{p} P^{(i)} f_i h_i \prod_{j=1,\, j\neq i}^{p} P^{(j)} f_j, \]
and
\[ h_i(x_i) = \int h\, dP_1 \cdots dP_{i-1}\, dP_{i+1} \cdots dP_p, \quad \text{for } x_i \in \mathcal{X}_i. \]
Under H_n, we have the following weak convergence, in (\ell^\infty(\mathcal{F}))^{2^p - p - 1}:
\[ \big\{ \sqrt{n}\,\overrightarrow{\mathbb{M}}_{A,n}(f) : f \in \mathcal{F} \big\} \rightsquigarrow \big\{ \overrightarrow{\mathbb{M}}'_A\big( \widetilde{\mathbb{A}}(f) \big) : f \in \mathcal{F} \big\}, \]
whose corresponding components are defined by
\[ \mathbb{M}'_{A,f}\big( \widetilde{\mathbb{A}}(f) \big) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k + \sum_{B\subseteq A} (-1)^{|A|-|B|}\, P_0 f^B h \prod_{k\in A\setminus B} P^{(k)} f_k. \]
We now give the details for the form of the limiting process in Theorem 6.1. We have, for each f ∈ \mathcal{F} and A ∈ I_p,
\begin{align*}
\mathbb{M}'_{A,f}\big( \widetilde{\mathbb{A}}(f) \big) &= \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( P_0 f^B \sum_{k\in A\setminus B} \mathbb{A}(f_k) \prod_{i\in A\setminus B,\, i\neq k} P^{(i)} f_i + \mathbb{A}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&\quad + \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( P_0 f^B \sum_{k\in A\setminus B} \widetilde{P} f_k h_k \prod_{i\in A\setminus B,\, i\neq k} P^{(i)} f_i + \widetilde{P} f^B h \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&= \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k \\
&\quad + \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( P_0 f^B \sum_{k\in A\setminus B} \widetilde{P} f_k h_k \prod_{i\in A\setminus B,\, i\neq k} P^{(i)} f_i + \widetilde{P} f^B h \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&= \sum_{B\subseteq A} (-1)^{|A|-|B|}\, \mathbb{B}(f^B) \prod_{k\in A\setminus B} P^{(k)} f_k + \sum_{B\subseteq A} (-1)^{|A|-|B|}\, P_0 f^B h \prod_{k\in A\setminus B} P^{(k)} f_k.
\end{align*}
Here we have used the fact that
\begin{align*}
&\sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( P_0 f^B \sum_{k\in A\setminus B} \widetilde{P} f_k h_k \prod_{i\in A\setminus B,\, i\neq k} P^{(i)} f_i + \widetilde{P} f^B h \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&\quad = \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} P f_k h_k \prod_{i\in A,\, i\neq k} P^{(i)} f_i + \Big( P_0 f^B h - \sum_{i\in B} P f_i h_i \prod_{j\in B,\, j\neq i} P^{(j)} f_j \Big) \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&\quad = \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} P f_k h_k \prod_{i\in A,\, i\neq k} P^{(i)} f_i - \sum_{k\in B} P f_k h_k \prod_{i\in A,\, i\neq k} P^{(i)} f_i + P_0 f^B h \prod_{k\in A\setminus B} P^{(k)} f_k \Big) \\
&\quad = \sum_{B\subseteq A} (-1)^{|A|-|B|}\, P_0 f^B h \prod_{k\in A\setminus B} P^{(k)} f_k.
\end{align*}
Notice that, as in Eq. (9.4), we have
\[ \sum_{B\subseteq A} (-1)^{|A|-|B|} \Big( \sum_{k\in A\setminus B} P f_k h_k \prod_{i\in A,\, i\neq k} P^{(i)} f_i - \sum_{k\in B} P f_k h_k \prod_{i\in A,\, i\neq k} P^{(i)} f_i \Big) = 0. \]
This completes the proof of Theorem 6.1. □
Acknowledgments
The author would like to thank an Associate Editor and two referees for their very helpfulcomments, which led to a considerable improvement of the original version of the paper and a moresharply focused presentation.
References
[1] D.J. Aldous, Exchangeability and related topics, in: École d’été de Probabilités de Saint-Flour, XIII—1983, in: Lecture Notesin Math., vol. 1117, Springer, Berlin, 1985, pp. 1–198.
[2] J. Baik, P. Deift, K. Johansson, On the distribution of the length of the longest increasing subsequence of randompermutations, J. Amer. Math. Soc. 12 (4) (1999) 1119–1178.
[3] N.K. Bakirov,M.L. Rizzo, G.J. Székely, Amultivariate nonparametric test of independence, J. Multivariate Anal. 97 (8) (2006)1742–1756.
[4] R. Beran, The impact of the bootstrap on statistical algorithms and theory, Statist. Sci. 18 (2) (2003) 175–184. SilverAnniversary of the Bootstrap.
[5] R. Beran, M. Bilodeau, P. Lafaye de Micheaux, Nonparametric tests of independence between random vectors, J.Multivariate Anal. 98 (9) (2007) 1805–1824.
[6] P.J. Bickel, D.A. Freedman, Some asymptotic theory for the bootstrap, Ann. Statist. 9 (6) (1981) 1196–1217.
[7] P.J. Bickel, F. Götze, W.R. van Zwet, Resampling fewer than n observations: gains, losses, and remedies for losses, Statist. Sinica 7 (1) (1997) 1–31. Empirical Bayes, sequential analysis and related topics in statistics and probability (New Brunswick, NJ, 1995).
[8] P. Billingsley, Convergence of Probability Measures, John Wiley & Sons Inc., New York, 1968.
[9] M. Bilodeau, P. Lafaye de Micheaux, A multivariate empirical characteristic function test of independence with normal marginals, J. Multivariate Anal. 95 (2) (2005) 345–369.
[10] J.R. Blum, J. Kiefer, M. Rosenblatt, Distribution free tests of independence based on the sample distribution function, Ann. Math. Statist. 32 (1961) 485–498.
[11] S. Bouzebda, On the strong approximation of bootstrapped empirical copula processes with applications, Math. Methods Statist. 21 (3) (2012) 153–188.
[12] S. Bouzebda, M. Cherfi, General bootstrap for dual φ-divergence estimates, J. Probab. Stat. (2012) 33 pages. http://dx.doi.org/10.1155/2012/834107. Art. ID 834107.
[13] S. Bouzebda, N. Limnios, On general bootstrap of empirical estimator of a semi-Markov kernel with applications, J. Multivariate Anal. 116 (2013) 52–62.
[14] S. Bouzebda, T. Zari, Asymptotic behavior of weighted multivariate Cramér–von Mises-type statistics under contiguous alternatives, Math. Methods Statist. 22 (3) (2013) 226–252.
[15] S. Bouzebda, T. Zari, Strong approximation of empirical copula processes by Gaussian processes, Statistics 47 (5) (2013) 1047–1063.
[16] G. Cheng, J.Z. Huang, Bootstrap consistency for general semiparametric M-estimation, Ann. Statist. 38 (5) (2010) 2884–2915.
[17] H. Cramér, H. Wold, Some theorems on distribution functions, J. Lond. Math. Soc. S1-11 (4) (1936) 290.
[18] S. Csörgő, Multivariate empirical characteristic functions, Z. Wahrscheinlichkeitstheor. Verwandte Geb. 55 (2) (1981) 203–229.
[19] S. Csörgő, Testing by the empirical characteristic function: a survey, in: Asymptotic Statistics, Vol. 2 (Kutná Hora, 1983), Elsevier, Amsterdam, 1984, pp. 45–56.
[20] S. Csörgő, Testing for independence by the empirical characteristic function, J. Multivariate Anal. 16 (3) (1985) 290–299.
[21] S. Csörgő, Consistency of some tests for multivariate normality, Metrika 36 (2) (1989) 107–116.
[22] P. Deheuvels, La fonction de dépendance empirique et ses propriétés. Un test non paramétrique d'indépendance, Acad. Roy. Belg. Bull. Cl. Sci. (5) 65 (6) (1979) 274–292.
[23] P. Deheuvels, An asymptotic decomposition for multivariate distribution-free tests of independence, J. Multivariate Anal. 11 (1) (1981) 102–113.
[24] P. Deheuvels, Weighted multivariate tests of independence, Comm. Statist. Theory Methods 36 (13–16) (2007) 2477–2491.
[25] D. Dugué, Sur des tests d'indépendance indépendants de la loi, C. R. Acad. Sci., Paris Sér. A 281 (24) (1975) 1103–1104.
[26] U. Einmahl, D.M. Mason, Approximations to permutation and exchangeable processes, J. Theoret. Probab. 5 (1) (1992) 101–126.
[27] R.A. Fisher, Statistical Methods for Research Workers, Olivier and Boyd, London, 1932.
[28] J.E. García, V. González-López, Independence tests for continuous random variables based on the longest increasing subsequence, J. Multivariate Anal. 127 (2014) 126–146.
[29] C. Genest, J.-F. Quessy, B. Rémillard, Asymptotic local efficiency of Cramér–von Mises tests for multivariate independence, Ann. Statist. 35 (1) (2007) 166–191.
[30] C. Genest, B. Rémillard, Tests of independence and randomness based on the empirical copula process, TEST 13 (2) (2004) 335–370.
[31] K. Ghoudi, R.J. Kulperger, B. Rémillard, A nonparametric test of serial independence for time series and residuals, J. Multivariate Anal. 79 (2) (2001) 191–218.
[32] E. Giné, J. Zinn, Bootstrapping general empirical measures, Ann. Probab. 18 (2) (1990) 851–869.
[33] A. Gretton, O. Bousquet, A. Smola, B. Schölkopf, Measuring statistical dependence with Hilbert–Schmidt norms, in: Algorithmic Learning Theory, in: Lecture Notes in Comput. Sci., vol. 3734, Springer, Berlin, 2005, pp. 63–77.
[34] A. Gretton, K. Fukumizu, C.H. Teo, L. Song, B. Schölkopf, A. Smola, A kernel statistical test of independence, in: Advances in Neural Information Processing Systems, Vol. 20, MIT Press, Cambridge, MA, 2008, pp. 585–592.
[35] E. Haeusler, D.M. Mason, Weighted approximations to continuous time martingales with applications, Scand. J. Stat. 26 (2) (1999) 281–295.
[36] W. Hoeffding, A class of statistics with asymptotically normal distribution, Ann. Math. Statist. 19 (1948) 293–325.
[37] L. Horváth, Q.-M. Shao, Limit theorems for permutations of empirical processes with applications to change point analysis, Stochastic Process. Appl. 117 (12) (2007) 1870–1888.
[38] A. Janssen, T. Pauls, How do bootstrap and permutation tests work? Ann. Statist. 31 (3) (2003) 768–806.
[39] H. Joe, Multivariate Models and Dependence Concepts, in: Monographs on Statistics and Applied Probability, vol. 73, Chapman & Hall, London, 1997.
[40] O. Kallenberg, Foundations of Modern Probability, second ed., in: Probability and its Applications (New York), Springer-Verlag, New York, 2002.
[41] A. Kankainen, N.G. Ushakov, A consistent modification of a test for independence based on the empirical characteristic function, J. Math. Sci. (N. Y.) 89 (5) (1998) 1486–1494. Stability problems for stochastic models, part 1 (Moscow, 1996).
[42] I. Kojadinovic, M. Holmes, Tests of independence among continuous random vectors based on Cramér–von Mises functionals of the empirical copula process, J. Multivariate Anal. 100 (6) (2009) 1137–1154.
[43] I. Kojadinovic, J. Yan, Tests of serial independence for continuous multivariate time series based on a Möbius decomposition of the independence empirical copula process, Ann. Inst. Statist. Math. 63 (2) (2011) 347–373.
[44] A.N. Kolmogorov, V.M. Tihomirov, ε-entropy and ε-capacity of sets in functional space, Amer. Math. Soc. Transl. Ser. 2 17 (1961) 277–364.
[45] M.R. Kosorok, Introduction to Empirical Processes and Semiparametric Inference, in: Springer Series in Statistics, Springer, New York, 2008.
[46] A.Y. Lo, A Bayesian method for weighted sampling, Ann. Statist. 21 (4) (1993) 2138–2148.
[47] D.M. Mason, M.A. Newton, A rank statistics approach to the consistency of a general bootstrap, Ann. Statist. 20 (3) (1992) 1611–1624.
[48] M. Motoo, On the Hoeffding's combinatorial central limit theorem, Ann. Inst. Statist. Math. Tokyo 8 (1957) 145–154.
[49] R.B. Nelsen, An Introduction to Copulas, second ed., in: Springer Series in Statistics, Springer, New York, 2006.
[50] M. Pauly, Consistency of the subsample bootstrap empirical process, Statistics 46 (5) (2012) 621–626.
[51] D. Pollard, Convergence of Stochastic Processes, in: Springer Series in Statistics, Springer-Verlag, New York, 1984.
[52] J. Præstgaard, J.A. Wellner, Exchangeably weighted bootstraps of the general empirical process, Ann. Probab. 21 (4) (1993) 2053–2086.
[53] J.-F. Quessy, Applications and asymptotic power of marginal-free tests of stochastic vectorial independence, J. Statist. Plann. Inference 140 (11) (2010) 3058–3075.
[54] B. Rémillard, O. Scaillet, Testing for equality between two copulas, J. Multivariate Anal. 100 (3) (2009) 377–386.
[55] G.-C. Rota, On the foundations of combinatorial theory, I. Theory of Möbius functions, Z. Wahrscheinlichkeitstheor. Verwandte Geb. 2 (1964) 340–368.
[56] D.B. Rubin, The Bayesian bootstrap, Ann. Statist. 9 (1) (1981) 130–134.
[57] C. Schensted, Longest increasing and decreasing subsequences, Canad. J. Math. 13 (1961) 179–191.
[58] B. Schweizer, Thirty years of copulas, in: Advances in Probability Distributions with Given Marginals (Rome, 1990), in: Math. Appl., vol. 67, Kluwer Acad. Publ., Dordrecht, 1991, pp. 13–50.
[59] J. Segers, Weak convergence of empirical copula processes under nonrestrictive smoothness assumptions, Bernoulli 18 (3) (2012) 764–782.
[60] D. Sejdinovic, B.K. Sriperumbudur, A. Gretton, K. Fukumizu, Equivalence of distance-based and RKHS-based statistics in hypothesis testing, Ann. Statist. 41 (5) (2013) 2263–2291.
[61] P.K. Sen, A class of permutation tests for stochastic independence, I, Sankhya A 29 (1967) 157–174.
[62] P.K. Sen, On a class of permutation tests for stochastic independence, II, Sankhya A 30 (1968) 23–30.
[63] A. Sklar, Fonctions de répartition à n dimensions et leurs marges, Publ. Inst. Statist. Univ. Paris 8 (1959) 229–231.
[64] A. Sklar, Random variables, joint distribution functions, and copulas, Kybernetika (Prague) 9 (1973) 449–460.
[65] F.L. Spitzer, Introduction aux processus de Markov à paramètre dans Zν, in: A. Badrijian, P.-L. Hennequin (Eds.), École d'Été de Probabilités de Saint-Flour, II—1973, in: Lecture Notes in Math., vol. 390, Springer, New York, 1974, pp. 115–189.
[66] L.H.C. Tippett, The Methods of Statistics, Williams and Norgate, London, 1931.
[67] A. van der Vaart, New Donsker classes, Ann. Probab. 24 (4) (1996) 2128–2140.
[68] A.W. van der Vaart, J.A. Wellner, Weak Convergence and Empirical Processes, with Applications to Statistics, in: Springer Series in Statistics, Springer-Verlag, New York, 1996.
[69] A.W. van der Vaart, J.A. Wellner, Empirical processes indexed by estimated functions, in: Asymptotics: Particles, Processes and Inverse Problems, in: IMS Lecture Notes Monogr. Ser., vol. 55, Inst. Math. Statist., Beachwood, OH, 2007, pp. 234–252.
[70] J.A. Wellner, Y. Zhan, Bootstrapping Z-Estimators, Techn. Report No. 308, Univ. Washington, Seattle, 1996.