
ARTICLE IN PRESS

Contents lists available at ScienceDirect

Computational Statistics and Data Analysis

journal homepage: www.elsevier.com/locate/csda

Fuzzy data treated as functional data: A one-way ANOVA test approach

Gil González-Rodríguez a,∗, Ana Colubi b, María Ángeles Gil b

a European Centre for Soft Computing, Edificio Científico-Tecnológico, 33600 Mieres, Asturias, Spain
b Departamento de Estadística, I.O. y D.M., Universidad de Oviedo, 33071 Oviedo, Spain

Article info

Article history: Received 14 January 2010. Received in revised form 13 June 2010. Accepted 13 June 2010. Available online xxxx.

Keywords: Functional data; Fuzzy data; k-samples test; ANOVA statistic; Hilbert space; Convex cone; Bootstrap; Local alternatives

Abstract

The use of the fuzzy scale of measurement to describe an important number of observations from real-life attributes or variables is first explored. In contrast to other well-known scales (like nominal or ordinal), a wide class of statistical measures and techniques can be properly applied to analyze fuzzy data. This fact is connected with the possibility of identifying the scale with a special subset of a functional Hilbert space. The identification can be used to develop methods for the statistical analysis of fuzzy data by considering techniques in functional data analysis and vice versa. In this respect, an approach to the FANOVA test is presented and analyzed, and it is later particularized to deal with fuzzy data. The proposed approaches are illustrated by means of a real-life case study.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction and motivation

Data which cannot be exactly described by means of numerical values, such as evaluations, medical diagnoses or quality ratings, to name but a few, are frequently classified as either nominal or ordinal. Well-known examples are the so-called Likert scales (cf. Likert, 1932; Allen and Seaman, 2007), in which categories are labeled with numerical values. Using these scales, the statistical analysis is limited. Many parameters and techniques cannot be directly used or, when they can, the interpretation and reliability of the conclusions are considerably reduced (see, for instance, Stevens, 1946, or Chimka and Wolfe, 2009). Additionally, the transition from one category to another is rather abrupt (see, for instance, Ammar and Wright, 2000). A third concern is that categories are not perceived in the same manner by different observers, so that the variability and accuracy cannot always be well captured.

A new easy-to-use representation of such data through fuzzy values is to be considered. The measurement scale of fuzzy values includes, in particular, real vectors and set values as special elements. It is more expressive than ordinal scales and more accurate than rounding or using real or vectorial-valued codes. The arithmetic and the metric to be used make it possible to extend naturally many of the usual statistical measures and techniques. The transition between closely different values can be made gradually, and the variability, accuracy and possible subjectiveness can be well reflected in describing data. Although fuzzy data could be directly viewed as special functional data, the arithmetic and the metric that are coherent with their meaning do not coincide with the usual ones for functional data. Nevertheless, the so-called support function establishes a useful embedding of the space of fuzzy values into a cone of a functional Hilbert space (see Puri and Ralescu, 1983, 1985; Körner and Näther, 2002).

∗ Corresponding author. Tel.: +34 985456545; fax: +34 985456699.
E-mail address: [email protected] (G. González-Rodríguez).

0167-9473/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.csda.2010.06.013

Please cite this article in press as: González-Rodríguez, G., et al., Fuzzy data treated as functional data: A one-way ANOVA test approach. Computational Statistics and Data Analysis (2010), doi:10.1016/j.csda.2010.06.013


Fig. 1. Application to express the perception on the relative length of segments.

A guideline suggesting how to collect/describe fuzzy data associated with random experiments will be presented in Section 2. In Section 3, the natural arithmetic, the support function and a family of metrics on the space of fuzzy values will be recalled. Random fuzzy sets and their connection with Hilbert space-valued random elements will be described in Section 4. In Section 5, a one-way ANOVA test for functional data will be introduced, and it will be particularized to deal with fuzzy data. Specifically, an asymptotic test and its behaviour under local alternatives, as well as a bootstrap procedure, will be analyzed. A case study introduced in Section 2 will be considered in Section 6 to illustrate the approach.

2. Collecting fuzzy data from random experiments

The space $\mathcal{F}_c(\mathbb{R}^p)$ of fuzzy values to be considered contains the mappings $U : \mathbb{R}^p \to [0,1]$ such that for each $\alpha \in (0,1]$ the $\alpha$-level set $U_\alpha = \{x \in \mathbb{R}^p : U(x) \ge \alpha\}$ is a nonempty compact convex set of $\mathbb{R}^p$. When $p = 1$ the fuzzy values are referred to as fuzzy numbers. Formally, fuzzy values are $[0,1]$-valued upper semicontinuous functions with nonempty convex bounded $\alpha$-levels. Real, vectorial, interval and set-valued data can be viewed as particular fuzzy data, by identifying them with the associated indicator functions.

A fuzzy value $U \in \mathcal{F}_c(\mathbb{R}^p)$ models an ill-defined subset of $\mathbb{R}^p$, so that for each $x \in \mathbb{R}^p$ the value $U(x)$ can be interpreted as the 'degree of membership' of $x$ to $U$. Alternatively, $U(x)$ may be interpreted as the 'degree of compatibility' of $x$ with an ill-defined property $U$. In practice, fuzzy data usually come either from a pre-established classification, such as the danger of forest fires (see Colubi and González-Rodríguez, 2007), or from a designed experiment. This is the case of the expert evaluation of the trees in a reforestation analyzed in Colubi (2009), where the ill-defined characteristic 'quality' is individually described through a fuzzy set. The accuracy and variability of data are much better captured by using individual fuzzy assessments than by considering a pre-fixed list of fuzzy values.

The main concepts and methods are to be illustrated by means of a case study which is now introduced along with some guidelines to describe fuzzy data. The case study regards an experiment in which people have been asked for their perception of the relative length of different line segments with respect to a fixed longer segment that is used as a standard for comparison. Fig. 1 displays the screen of the application. On the center top of the screen the pattern (longest line segment) is drawn in black. At each trial a shorter gray line segment is generated and placed below the pattern one, parallel to it and without a fixed location (i.e., indentation or centering).

After an explanation of the fuzzy values, participants are asked for their judgment of relative length for each of several line segments in two ways. First, to choose a label from a Likert-like list: very small, small, medium, large, very large. Second, to describe the perception through a trapezoidal fuzzy number with support included in $[0, 100]$ (0% indicating the minimum relative length and 100% the maximum one). The support is to be chosen as the set of all values that the participant subjectively considers to be compatible with the relative length of the generated segment to a greater or lesser extent. The 1-level has to be the set of all values that the participant considers to be completely compatible with his/her perception about the relative length of the generated segment. The trapezoidal fuzzy set is formed by the linear interpolation of both intervals, although it is possible to change the shape. In other words, out of the support are the values that the participant is not willing to accept as possible values for the relative length at all. The membership degree is linearly increasing from the minimum of the support to the first value for which the participant would say that it is the relative length of the line (see Fig. 1). However, since the participant may have doubts, often there is not a unique value in these conditions, but an interval. This interval with full membership degree is the 1-level set. Analogously, from the last value in this set to the maximum of the support, the membership degree is linearly decreasing.

In Section 6 data corresponding to 17 participants will be analyzed. The relative length can be considered as an underlying real-valued random variable which could be physically measured. However, in Section 6 the interest will only refer to the fuzzy-valued variable corresponding to the judgment of the relative length. Since each participant can express his/her perception and degree of uncertainty, this fuzzy value-based description provides richer information than traditional ones, and the variability of collected data is better captured, which entails an informational gain. Other real-life experiments where experts have employed this fuzzy scale to represent their opinions can be found in a forestry study in Colubi (2009) and in a flood analysis in Fernández et al. (in press).
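The trapezoidal description used in the case study can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Trapezoid`, its fields and the sample answer are hypothetical and do not come from the application shown in Fig. 1; only the linear interpolation between the support and the 1-level follows the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number: support [a, d], 1-level [b, c], a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def __post_init__(self) -> None:
        if not (self.a <= self.b <= self.c <= self.d):
            raise ValueError("need a <= b <= c <= d")

    def membership(self, x: float) -> float:
        """Degree of compatibility of x with the described perception."""
        if x < self.a or x > self.d:
            return 0.0                       # outside the support
        if self.b <= x <= self.c:
            return 1.0                       # inside the 1-level
        if x < self.b:
            return (x - self.a) / (self.b - self.a)   # rising slope
        return (self.d - x) / (self.d - self.c)       # falling slope

    def level(self, alpha: float) -> Tuple[float, float]:
        """Alpha-level set {x : U(x) >= alpha}, an interval for 0 < alpha <= 1."""
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must lie in (0, 1]")
        return (self.a + alpha * (self.b - self.a),
                self.d - alpha * (self.d - self.c))


# Hypothetical answer: compatible values 40%-70%, fully compatible 50%-60%.
answer = Trapezoid(40.0, 50.0, 60.0, 70.0)
print(answer.membership(45.0))   # 0.5
print(answer.level(0.5))         # (45.0, 65.0)
```

Storing only the four corner values is enough here because every α-level of a trapezoid is an interval whose endpoints are linear in α.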

3. Fuzzy data viewed as functional data

The operations to be used are the sum and the product by a scalar (Zadeh, 1975; Nguyen, 1978), which extend the corresponding ones for sets and pay attention to the fuzzy meaning. Given $U, V \in \mathcal{F}_c(\mathbb{R}^p)$ and $\gamma \in \mathbb{R}$, $U + \gamma V \in \mathcal{F}_c(\mathbb{R}^p)$ is defined so that for each $\alpha \in [0,1]$:

\[
(U + \gamma V)_\alpha = U_\alpha + \gamma V_\alpha = \{\, y + \gamma z : y \in U_\alpha,\ z \in V_\alpha \,\}.
\]

This arithmetic does not coincide with the usual one for functions. The application of the functional arithmetic in $\mathcal{F}_c(\mathbb{R}^p)$ may lead to elements out of this space, and the fuzzy meaning would be lost. The space $(\mathcal{F}_c(\mathbb{R}^p), +, \cdot)$ does not have a linear structure (but a semilinear-conical one), because the sum extends level-wise the Minkowski sum of sets.
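For fuzzy numbers ($p = 1$) every α-level is an interval, so the level-wise arithmetic reduces to interval arithmetic. The sketch below assumes trapezoidal fuzzy numbers and uses hypothetical helper names (`trapezoid`, `add_scaled`); it also shows why the structure is not linear, since $U + (-1)U$ is not the crisp value 0.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]
Levels = Callable[[float], Interval]   # alpha in (0, 1] -> alpha-level interval


def trapezoid(a: float, b: float, c: float, d: float) -> Levels:
    """Alpha-levels of a trapezoidal fuzzy number with support [a, d] and 1-level [b, c]."""
    return lambda alpha: (a + alpha * (b - a), d - alpha * (d - c))


def add_scaled(u: Levels, gamma: float, v: Levels) -> Levels:
    """Level-wise Minkowski combination U + gamma * V."""
    def levels(alpha: float) -> Interval:
        u_lo, u_hi = u(alpha)
        v_lo, v_hi = v(alpha)
        # Multiplying an interval by a negative scalar swaps its endpoints.
        lo, hi = sorted((gamma * v_lo, gamma * v_hi))
        return (u_lo + lo, u_hi + hi)
    return levels


U = trapezoid(0.0, 1.0, 2.0, 3.0)
V = trapezoid(1.0, 2.0, 2.0, 4.0)

print(add_scaled(U, 1.0, V)(1.0))    # 1-level of U + V
print(add_scaled(U, -1.0, U)(1.0))   # 1-level of U + (-1)U, an interval around 0
```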

3.1. The support function: a functional representation of fuzzy values

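Consider the space $\mathcal{H} = L^2(\mathbb{S}^{p-1} \times (0,1], \lambda_p \times \lambda)$ of the $L^2$-type real-valued functions defined on the unit sphere $\mathbb{S}^{p-1}$ of $\mathbb{R}^p$ times the interval $(0,1]$, with respect to the corresponding normalized Lebesgue measures denoted by $\lambda_p$ and $\lambda$. Taking inspiration from Trutschnig et al. (2009), the mid/spr decomposition of a function $f \in \mathcal{H}$ can be defined as $f = \operatorname{mid} f + \operatorname{spr} f$ where, for all $u \in \mathbb{S}^{p-1}$ and $\alpha \in (0,1]$,

\[
\operatorname{mid} f(u,\alpha) = \frac{f(u,\alpha) - f(-u,\alpha)}{2}, \qquad
\operatorname{spr} f(u,\alpha) = \frac{f(u,\alpha) + f(-u,\alpha)}{2},
\]

where if $u = (u_1, \ldots, u_p) \in \mathbb{S}^{p-1}$, $-u$ denotes the element $(-u_1, \ldots, -u_p) \in \mathbb{S}^{p-1}$. It can be proven that $\operatorname{mid} f, \operatorname{spr} f \in \mathcal{H}$, that $\operatorname{mid} f$ is an odd function and that $\operatorname{spr} f$ is an even one, w.r.t. the first component.

On this basis, a valuable inner product in $\mathcal{H}$ can be defined. More precisely, let $\theta \in (0,+\infty)$ and let $\varphi$ be a weighting measure formalized as an absolutely continuous probability measure on $([0,1], \mathcal{B}_{[0,1]})$ with positive mass function in $(0,1)$. For $f, g \in \mathcal{H}$ consider the value

\[
\langle f, g \rangle_\theta^\varphi = [\operatorname{mid} f, \operatorname{mid} g]_\varphi + \theta\, [\operatorname{spr} f, \operatorname{spr} g]_\varphi,
\]

where

\[
[f, g]_\varphi = \int_{(0,1]} \int_{\mathbb{S}^{p-1}} f(u,\alpha)\, g(u,\alpha)\, d\lambda_p(u)\, d\varphi(\alpha).
\]

Then, the following properties are satisfied:

(i) $\langle f, g \rangle_\theta^\varphi$ is an inner product in $\mathcal{H}$, for which the associated norm is denoted by $\|\cdot\|_\theta^\varphi$.

(ii) The mid/spr decomposition of a function $f \in \mathcal{H}$ is orthogonal.

(iii) $(\mathcal{H}, \langle \cdot, \cdot \rangle_\theta^\varphi)$ is a separable Hilbert space.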

The support function of $U \in \mathcal{F}_c(\mathbb{R}^p)$ (see Puri and Ralescu, 1985) extends level-wise the notion of the support function of a set (see, for instance, Castaing and Valadier, 1977) and it is given by the mapping $s_U : \mathbb{S}^{p-1} \times (0,1] \to \mathbb{R}$ defined so that

\[
s_U(u,\alpha) = \sup_{v \in U_\alpha} \langle u, v \rangle
\]

for all $u \in \mathbb{S}^{p-1}$, $\alpha \in (0,1]$, where $\langle \cdot, \cdot \rangle$ denotes the inner product on $\mathbb{R}^p$. In general, one can state that $s_U(u,\alpha)$ represents the "oriented" distance from $0 \in \mathbb{R}^p$ to the supporting hyperplane of $U_\alpha$ which is orthogonal to $u$.

The mapping $s : \mathcal{F}_c^2(\mathbb{R}^p) \to \mathcal{H}$, such that $s(U) = s_U$ for all $U \in \mathcal{F}_c^2(\mathbb{R}^p) = \{U \in \mathcal{F}_c(\mathbb{R}^p) : s_U \in \mathcal{H}\}$, is semilinear, that is, it transforms the fuzzy arithmetic to the functional arithmetic in the corresponding cone.

Let $U \in \mathcal{F}_c^2(\mathbb{R}^p)$. Then, from Trutschnig et al. (2009) we have that

(i) for all $\alpha \in (0,1]$ the projection of $U_\alpha$ over a direction $u \in \mathbb{S}^{p-1}$ is given by the interval

\[
\Pi_u U_\alpha = [-s_U(-u,\alpha),\ s_U(u,\alpha)];
\]


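(ii) $\operatorname{mid} s_U(u,\alpha) = \dfrac{s_U(u,\alpha) - s_U(-u,\alpha)}{2}$ = mid-point/center of $\Pi_u U_\alpha$;

(iii) $\operatorname{spr} s_U(u,\alpha) = \dfrac{s_U(u,\alpha) + s_U(-u,\alpha)}{2}$ = spread/radius of $\Pi_u U_\alpha$;

(iv) if $\operatorname{mid} U(\bullet, \star)$ = mid-point of $\Pi_\bullet U_\star$ and $\operatorname{spr} U(\bullet, \star)$ = spread of $\Pi_\bullet U_\star$, then

\[
s_U = \operatorname{mid} U + \operatorname{spr} U.
\]

Consequently, the mid and the spr of a fuzzy value can be interpreted as a kind of functional measurements of its 'location' and 'shape', respectively.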
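For $p = 1$ the unit sphere is $\mathbb{S}^0 = \{-1, +1\}$, and the support function only needs the endpoints of each level interval. The following sketch, which assumes trapezoidal fuzzy numbers and uses hypothetical function names, evaluates $s_U$, $\operatorname{mid} s_U$ and $\operatorname{spr} s_U$ and can be used to check properties (i) to (iv) numerically.

```python
from typing import Callable, Tuple

Levels = Callable[[float], Tuple[float, float]]


def trapezoid(a: float, b: float, c: float, d: float) -> Levels:
    """Alpha-levels of a trapezoidal fuzzy number with support [a, d] and 1-level [b, c]."""
    return lambda alpha: (a + alpha * (b - a), d - alpha * (d - c))


def support(levels: Levels, u: int, alpha: float) -> float:
    """s_U(u, alpha) = sup{u * v : v in U_alpha} for u in S^0 = {-1, +1}."""
    lo, hi = levels(alpha)
    if u == 1:
        return hi
    if u == -1:
        return -lo
    raise ValueError("u must be +1 or -1 when p = 1")


def mid(levels: Levels, u: int, alpha: float) -> float:
    """Odd part of the support function: centre of the projected interval."""
    return (support(levels, u, alpha) - support(levels, -u, alpha)) / 2.0


def spr(levels: Levels, u: int, alpha: float) -> float:
    """Even part of the support function: radius of the projected interval."""
    return (support(levels, u, alpha) + support(levels, -u, alpha)) / 2.0


U = trapezoid(40.0, 50.0, 60.0, 70.0)
alpha = 0.5                                   # U_alpha = [45, 65]
print(support(U, 1, alpha), support(U, -1, alpha))
print(mid(U, 1, alpha), spr(U, 1, alpha))     # centre and radius of [45, 65]
```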

3.2. Distance between fuzzy values and isometrical embedding

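The above-established connection induces a family of $L^2$ metrics on $\mathcal{F}_c^2(\mathbb{R}^p)$ from that associated with the norms $\|\cdot\|_\theta^\varphi$ on $\mathcal{H}$. Specifically, Trutschnig et al. (2009) have introduced the following family of metrics.

Let $\theta \in (0,+\infty)$ and let $\varphi$ be an absolutely continuous probability measure on $([0,1], \mathcal{B}_{[0,1]})$ with positive mass function in $(0,1)$. Then, the mapping $D_\theta^\varphi : \mathcal{F}_c^2(\mathbb{R}^p) \times \mathcal{F}_c^2(\mathbb{R}^p) \to [0,+\infty)$ such that for any $U, V \in \mathcal{F}_c^2(\mathbb{R}^p)$

\[
\big(D_\theta^\varphi(U,V)\big)^2 = \langle s_U - s_V,\ s_U - s_V \rangle_\theta^\varphi = \big(\|s_U - s_V\|_\theta^\varphi\big)^2
\]

satisfies that

(i) $\big(\mathcal{F}_c^2(\mathbb{R}^p), D_\theta^\varphi\big)$ is a separable $L^2$-type metric space.

(ii) The support function $s : \mathcal{F}_c^2(\mathbb{R}^p) \to \mathcal{H}$ states an isometrical embedding of $\mathcal{F}_c^2(\mathbb{R}^p)$ onto a closed convex cone of $\mathcal{H}$.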

As a result, data in the fuzzy setting with the fuzzy arithmetic and the metric $D_\theta^\varphi$ can be systematically translated into data in the setting of functional values with the functional arithmetic and the metric based on the norm $\|\cdot\|_\theta^\varphi$. In this way, although fuzzy data should not be treated directly as functional data, they can be treated as functional data by considering the identification with their support functions. Many developments in functional data analysis could be applied to fuzzy data by using the appropriate identifications and correspondences, whenever it can be guaranteed that the elements which should belong to $s\big(\mathcal{F}_c^2(\mathbb{R}^p)\big)$ are well-defined within it.

The $D_\theta^\varphi$ metric on $\mathcal{F}_c^2(\mathbb{R}^p)$ can be equivalently expressed as follows:

\[
D_\theta^\varphi(U,V) = \sqrt{\big(\|\operatorname{mid} U - \operatorname{mid} V\|_1^\varphi\big)^2 + \theta\, \big(\|\operatorname{spr} U - \operatorname{spr} V\|_1^\varphi\big)^2}.
\]

For each level, the choice of $\theta$ allows us to weight the effect of the deviation between spreads (which can be intuitively translated into the difference in 'shape' or 'imprecision'), in contrast to the effect of the deviation between mids (intuitively translated into the difference in 'location'). On the other hand, the choice of $\varphi$ makes it possible to weight the relevance of different levels.
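For fuzzy numbers the mid/spr expression of the metric can be evaluated numerically. The sketch below is an illustration under explicit assumptions that are not in the text: $p = 1$, trapezoidal fuzzy numbers, $\varphi$ taken as the uniform (Lebesgue) measure on $[0,1]$, the normalized measure on $\mathbb{S}^0 = \{-1,+1\}$ giving weight 1/2 to each direction, and a midpoint rule for the integral over α. The function names are hypothetical.

```python
import math
from typing import Callable, Tuple

Levels = Callable[[float], Tuple[float, float]]


def trapezoid(a: float, b: float, c: float, d: float) -> Levels:
    """Alpha-levels of a trapezoidal fuzzy number with support [a, d] and 1-level [b, c]."""
    return lambda alpha: (a + alpha * (b - a), d - alpha * (d - c))


def distance(u: Levels, v: Levels, theta: float = 1.0, steps: int = 1000) -> float:
    """Approximate D_theta^phi(U, V) for p = 1 with phi uniform on [0, 1].

    Because mid is odd and spr is even in the direction, both directions of
    S^0 contribute the same squared deviation, so one direction suffices.
    """
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) / steps                 # midpoint rule node
        u_lo, u_hi = u(alpha)
        v_lo, v_hi = v(alpha)
        d_mid = (u_lo + u_hi) / 2.0 - (v_lo + v_hi) / 2.0   # location gap
        d_spr = (u_hi - u_lo) / 2.0 - (v_hi - v_lo) / 2.0   # imprecision gap
        total += d_mid ** 2 + theta * d_spr ** 2
    return math.sqrt(total / steps)


U = trapezoid(0.0, 1.0, 2.0, 3.0)
shifted = trapezoid(2.0, 3.0, 4.0, 5.0)   # same shape, moved by 2
narrow = trapezoid(1.0, 1.0, 2.0, 2.0)    # same location, crisp interval [1, 2]

print(distance(U, shifted))               # only the mids differ
print(distance(U, narrow, theta=1.0))     # only the spreads differ
print(distance(U, narrow, theta=0.25))    # smaller theta discounts the shape gap
```

In the second comparison the spread gap is $1 - \alpha$, so the exact value is $\sqrt{\theta/3}$, which the midpoint rule reproduces to several decimals.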

4. Random fuzzy sets and relevant parameters

Random fuzzy sets (for short RFS) were introduced by Puri and Ralescu (1986) as a mathematical model associating a fuzzy value with each outcome of a random experiment and extending level-wise the concept of random set. They are often referred to in the literature as fuzzy random variables in Puri and Ralescu's sense. From Colubi et al. (2001, 2002) and Trutschnig et al. (2009), several measurability conditions are deduced to be equivalent. Namely,

Theorem 4.1. Given a probability space $(\Omega, \mathcal{A}, P)$, consider the mapping $\mathcal{X} : \Omega \to \mathcal{F}_c^2(\mathbb{R}^p)$. Then, the following statements are equivalent:

(i) $\mathcal{X}$ is an RFS, that is, for all $\alpha \in (0,1]$ the $\alpha$-level set-valued mapping

\[
\mathcal{X}_\alpha : \Omega \to \mathcal{K}_c(\mathbb{R}^p) = \{\text{nonempty compact convex sets of } \mathbb{R}^p\}, \qquad \omega \mapsto (\mathcal{X}(\omega))_\alpha,
\]

is a compact convex random set (that is, $\mathcal{A}|\beta$-measurable, where $\beta$ is the Borel $\sigma$-field associated with the Hausdorff metric on $\mathcal{K}_c(\mathbb{R}^p)$).

(ii) $\mathcal{X}$ is a Borel measurable mapping w.r.t. $\mathcal{A}$ and the Borel $\sigma$-field generated by the topology induced by the metric $D_\theta^\varphi$ on $\mathcal{F}_c^2(\mathbb{R}^p)$.

(iii) $s_{\mathcal{X}} : \Omega \to \mathcal{H}$ is an $\mathcal{H}$-valued random element, that is, a Borel measurable mapping w.r.t. $\mathcal{A}$ and the Borel $\sigma$-field generated by the topology induced by $\|\cdot\|_\theta^\varphi$ on $\mathcal{H}$.

(iv) For all $\alpha \in (0,1]$ and $u \in \mathbb{S}^{p-1}$, the function $s_{\mathcal{X}}(u,\alpha) : \Omega \to \mathbb{R}$ is a real-valued random variable.

(v) For all $\alpha \in (0,1]$ and $u \in \mathbb{S}^{p-1}$, the functions $\operatorname{mid} s_{\mathcal{X}}(u,\alpha) : \Omega \to \mathbb{R}$ and $\operatorname{spr} s_{\mathcal{X}}(u,\alpha) : \Omega \to [0,+\infty)$ are real-valued random variables.


On the basis of Theorem 4.1 it is concluded that notions like the distribution induced by an RFS or the stochastic independence of RFSs are the usual ones for Borel measurable mappings in metric spaces.

The mean value of an RFS can be presented in two equivalent ways, either as an extension of the set-valued Aumann expectation or induced from the expectation of an $\mathcal{H}$-valued random element (see Puri and Ralescu, 1983, 1985, 1986). Thus,

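Definition 4.1. Given a probability space $(\Omega, \mathcal{A}, P)$ and an associated RFS $\mathcal{X}$ such that $s_{\mathcal{X}} \in L^1(\Omega, \mathcal{A}, P)$, the (Aumann type) mean value or expected value of $\mathcal{X}$ is the fuzzy value $E(\mathcal{X}) \in \mathcal{F}_c(\mathbb{R}^p)$ such that for all $\alpha \in (0,1]$

\[
\big(E(\mathcal{X})\big)_\alpha = \text{Aumann integral of } \mathcal{X}_\alpha
= \left\{ \int_{\Omega} X(\omega)\, dP(\omega) \ \text{ for all } X : \Omega \to \mathbb{R}^p,\ X \in L^1(\Omega, \mathcal{A}, P),\ X \in \mathcal{X}_\alpha \text{ a.s. } [P] \right\}
\]

or, equivalently, such that

\[
s_{E(\mathcal{X})} = E(s_{\mathcal{X}}).
\]

In case $p = 1$, if $\mathcal{X}$ is an RFS such that $\max\{|\inf \mathcal{X}_0|, |\sup \mathcal{X}_0|\} \in L^1(\Omega, \mathcal{A}, P)$, we have that for each $\alpha \in [0,1]$:

\[
\big(E(\mathcal{X})\big)_\alpha = [E(\inf \mathcal{X}_\alpha),\ E(\sup \mathcal{X}_\alpha)].
\]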

This definition for the mean value is coherent with the considered arithmetic and satisfies the usual properties of linearity. Moreover, it is Fréchet's expectation w.r.t. $D_\theta^\varphi$.

Following Lubiano et al. (2000) and Körner and Näther (2002), the variance of an RFS will be based on Fréchet's approach. The $(\theta, \varphi)$-Fréchet variance is conceived as a measure of the error in approximating or estimating the values of the RFS through the corresponding mean value. The real-valued quantification of the dispersion makes it possible to compare random elements, populations, samples, estimators, etc. by simply ranking real numbers. Due to the properties of the support function and the Hilbertian random elements, the considered variance satisfies the usual properties for this concept.

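Definition 4.2. Given a probability space $(\Omega, \mathcal{A}, P)$ and an associated RFS $\mathcal{X}$ such that $s_{\mathcal{X}} \in L^2(\Omega, \mathcal{A}, P)$, the $(\theta, \varphi)$-Fréchet variance of $\mathcal{X}$ is the real number

\[
\sigma_{\mathcal{X}}^2 = E\Big(\big[D_\theta^\varphi\big(\mathcal{X}, E(\mathcal{X})\big)\big]^2\Big)
\]

or, equivalently,

\[
\begin{aligned}
\sigma_{\mathcal{X}}^2 &= E\Big(\big[\|s_{\mathcal{X}} - s_{E(\mathcal{X})}\|_\theta^\varphi\big]^2\Big) = E\Big(\big[\|s_{\mathcal{X}} - E(s_{\mathcal{X}})\|_\theta^\varphi\big]^2\Big) = \operatorname{Var}(s_{\mathcal{X}}) \\
&= E\Big(\big[\|\operatorname{mid} \mathcal{X} - E(\operatorname{mid} \mathcal{X})\|_\varphi\big]^2\Big) + \theta\, E\Big(\big[\|\operatorname{spr} \mathcal{X} - E(\operatorname{spr} \mathcal{X})\|_\varphi\big]^2\Big) \\
&= \operatorname{Var}(\operatorname{mid} \mathcal{X}) + \theta \operatorname{Var}(\operatorname{spr} \mathcal{X}).
\end{aligned}
\]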

5. One-way FANOVA test and its particularization to a one-way ANOVA test for fuzzy data

There are some key distinctive features of the statistical inference from fuzzy data in contrast to the case of random variables/vectors. Firstly, there is a lack of realistic and operational 'parametric' families of probability distribution models for RFSs (a model for normal RFSs was suggested in Puri and Ralescu (1985), but it is very restrictive and unrealistic). Secondly, there is a lack of Central Limit Theorems for RFSs that are directly applicable for inferential purposes. There exist some results assuring that the limit in law of the normalized distance between the sample and population fuzzy means is the norm of a Gaussian random element with values belonging to a wide functional space (usually not belonging to the cone).

This section aims to develop first an ANOVA test for functional data, and to particularize it later to an ANOVA test for fuzzy data by using the identification explained in Section 3. Special care has to be taken in guaranteeing that the result remains in the considered cone; the bootstrap approach will guarantee it.

5.1. An ANOVA test approach for functional data

The statistical study of functional data has received much attention in recent years (see, for instance, Ramsay and Silverman, 1997, 2002; Ferraty and Vieu, 2006; Yao et al., 2005; Müller, 2005; Cuevas et al., 2006; for an overview of some of the recent trends, see the special issue edited by González-Manteiga and Vieu, 2007, and for some very recent publications see Cao and Ramsay, 2009; Ferraty and Vieu, 2009; Müller and Yang, 2010).

The problem of testing the equality of means of $k$ independent Hilbert space-valued random elements is now to be considered. The framework is quite close to, although more general than, the one analyzed in Cuevas et al. (2004). Let $\mathcal{H}$ be a separable Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and associated norm $\|\cdot\|$, and let $H_1, \ldots, H_k$ be $k$ independent $\mathcal{H}$-valued random elements for which there exist the expected values $m_1, \ldots, m_k$ and the covariance functions $K_1, \ldots, K_k$, respectively. The aim is to test

\[
H_0 : m_1 = \cdots = m_k \quad \text{versus} \quad H_a : \exists\, i_1 \neq i_2 \text{ with } m_{i_1} \neq m_{i_2}.
\]

In Cuevas et al. (2004) an asymptotic ANOVA test for functional data based on a statistic quantifying the sum of the pairwise differences between the sample means has been determined. In contrast, the test statistic to be used in this subsection is a natural extension of the ANOVA statistic for real-valued data (its potential use has been suggested by Cuevas et al., although they did not pursue its study). Additionally, an analysis of the consistency of the test under local alternatives and the bootstrap approximations for both the asymptotic procedure and the consistency result will be developed. The class of Hilbert space-valued random elements the approach applies to is slightly wider than that in Cuevas et al. (2004).

Let $\{H_{ij}\}_{j=1}^{n_i}$ be a simple random sample obtained from the Hilbert space-valued random element $H_i$ for all $i \in \{1, \ldots, k\}$, and let $n \in \mathbb{N}$ be the overall sample size, that is, $n = n_1 + \cdots + n_k$. The natural extension of the ANOVA statistic is given by

\[
A_n = \sum_{i=1}^{k} n_i \big\| \overline{H}_{i\cdot} - \overline{H}_{\cdot\cdot} \big\|^2,
\]

where $\overline{H}_{i\cdot} = \sum_{j=1}^{n_i} H_{ij} / n_i$ and $\overline{H}_{\cdot\cdot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} H_{ij} / n$. This statistic can be decomposed as follows:

\[
A_n = \sum_{i=1}^{k} n_i \big\| \overline{H}^c_{i\cdot} - \overline{H}^c_{\cdot\cdot} \big\|^2
+ \sum_{i=1}^{k} n_i \| m_i - \mu_n \|^2
+ 2 \sum_{i=1}^{k} n_i \big\langle \overline{H}^c_{i\cdot} - \overline{H}^c_{\cdot\cdot},\ m_i - \mu_n \big\rangle,
\]

where $H^c_{ij} = H_{ij} - m_i$, $\mu_n = \frac{1}{n} \sum_{i=1}^{k} n_i m_i$ and $\langle \cdot, \cdot \rangle$ stands for the corresponding inner product. The first term can also be expressed in functional form as follows:

\[
\sum_{i=1}^{k} n_i \big\| \overline{H}^c_{i\cdot} - \overline{H}^c_{\cdot\cdot} \big\|^2 = \eta_n\big(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot}\big),
\]

with $\eta_n : \mathcal{H} \times \cdots \times \mathcal{H} \to \mathbb{R}$ mapping $(h_1, \ldots, h_k)$ into

\[
\sum_{i=1}^{k} \Big\| h_i - \sum_{l=1}^{k} \alpha^n_{li} h_l \Big\|^2,
\]

where $\alpha^n_{li} = \sqrt{n_l / n_i} \,\big/ \sum_{r=1}^{k} (n_r / n_i)$.
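The statistic $A_n$ and its functional form $\eta_n$ can be checked numerically on discretized curves. The sketch below rests on assumptions that are not in the text: each observation is a list of values on a common equispaced grid, $\mathcal{H}$ is taken as $L^2[0,1]$, and the squared norm is approximated by the mean of squared grid values. Function names are hypothetical. Since $\alpha^n_{li} = \sqrt{n_l n_i}/n$, applying $\eta_n$ to the scaled group means $\sqrt{n_i}\,\overline{H}_{i\cdot}$ reproduces $A_n$, which the last lines verify.

```python
import math
from typing import List, Sequence

Curve = List[float]


def mean_curve(curves: Sequence[Curve]) -> Curve:
    """Pointwise average of curves sampled on a common grid."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def sq_norm(f: Curve) -> float:
    """Riemann approximation of the squared L2[0, 1] norm on an equispaced grid."""
    return sum(x * x for x in f) / len(f)


def anova_statistic(groups: Sequence[Sequence[Curve]]) -> float:
    """A_n = sum_i n_i * || group mean i - grand mean ||^2."""
    grand = mean_curve([c for g in groups for c in g])
    total = 0.0
    for g in groups:
        gm = mean_curve(g)
        total += len(g) * sq_norm([x - y for x, y in zip(gm, grand)])
    return total


def eta_n(h: Sequence[Curve], sizes: Sequence[int]) -> float:
    """eta_n(h_1, ..., h_k) = sum_i || h_i - sum_l alpha^n_{li} h_l ||^2."""
    total = 0.0
    for i, h_i in enumerate(h):
        combo = [0.0] * len(h_i)
        for l, h_l in enumerate(h):
            weight = math.sqrt(sizes[l] / sizes[i]) / sum(r / sizes[i] for r in sizes)
            combo = [c + weight * x for c, x in zip(combo, h_l)]
        total += sq_norm([x - c for x, c in zip(h_i, combo)])
    return total


groups = [
    [[0.0, 0.0], [2.0, 2.0]],                  # n_1 = 2
    [[3.0, 3.0], [5.0, 5.0], [7.0, 1.0]],      # n_2 = 3
]
sizes = [len(g) for g in groups]
scaled_means = [[math.sqrt(len(g)) * x for x in mean_curve(g)] for g in groups]

print(anova_statistic(groups))
print(eta_n(scaled_means, sizes))   # agrees with A_n up to rounding
```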

Proposition 5.1. If $n_i \to \infty$, $n_i/n \to p_i > 0$ as $n \to \infty$ for all $i \in \{1, \ldots, k\}$, and the null hypothesis $H_0$ is fulfilled, then $A_n$ converges in law to the distribution of

\[
A = \sum_{i=1}^{k} \Big\| Z_i - \sum_{l=1}^{k} \alpha_{li} Z_l \Big\|^2,
\]

where $\alpha_{li} = \sqrt{p_l / p_i} \,\big/ \sum_{r=1}^{k} (p_r / p_i)$ for all $i, l \in \{1, \ldots, k\}$, and $Z_1, \ldots, Z_k$ are independent centered Gaussian processes in $\mathcal{H}$ with covariance functions $K_1, \ldots, K_k$, respectively.

Proof. If $H_0$ is true, then the above-mentioned decomposition of $A_n$ is reduced to

\[
A_n = \eta_n\big(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot}\big).
\]

The Central Limit Theorem in Hilbert spaces (see, for instance, Laha and Rohatgi, 1979) assures that $(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot})$ converges in law to $(Z_1, \ldots, Z_k)$, where $Z_1, \ldots, Z_k$ are centered independent Gaussian processes with the same covariance functions as $H_1, \ldots, H_k$, respectively.

Consider

\[
\eta(h_1, \ldots, h_k) = \sum_{i=1}^{k} \Big\| h_i - \sum_{l=1}^{k} \alpha_{li} h_l \Big\|^2.
\]

Since $\eta$ only involves linear combinations and the $L^2$ norm, it is continuous, whence if $(h^n_1, \ldots, h^n_k)$ converges to $(h^0_1, \ldots, h^0_k)$ as $n \to \infty$ in $\mathcal{H} \times \cdots \times \mathcal{H}$, then $\eta_n(h^n_1, \ldots, h^n_k)$ converges to $\eta(h^0_1, \ldots, h^0_k)$ as $n \to \infty$.

Consequently, Slutsky's theorem and the continuous mapping theorem guarantee the convergence in law of $A_n = \eta_n(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot})$ to $A = \eta(Z_1, \ldots, Z_k)$.


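To complete the asymptotic study of the statistic $A_n$, the behaviour under local alternatives can be analyzed. To this purpose, consider

\[
m_i = m^* + \frac{\delta_n}{\sqrt{n}}\, m^*_i \quad \text{with } m^*, m^*_i \in \mathcal{H},
\]

for $\delta_n \in \mathbb{R}^+$ and $n \in \mathbb{N}$, and so that there exist $i_1 \neq i_2$ with $m^*_{i_1} \neq m^*_{i_2}$. If $\delta_n / \sqrt{n} \to 0$ then $\|m_i - m_j\|^2 \to 0$ as $n \to \infty$ for all $i, j \in \{1, \ldots, k\}$, that is, although $H_0$ fails for every $n \in \mathbb{N}$, it is approached with "speed" $\delta_n / \sqrt{n}$. Thus,

Proposition 5.2. If $n_i \to \infty$ and $n_i/n \to p_i > 0$ as $n \to \infty$ for all $i \in \{1, \ldots, k\}$, $\delta_n \to \infty$ and $\delta_n / \sqrt{n} \to 0$ as $n \to \infty$, then

\[
P(A_n \le t) \to 0 \quad \text{as } n \to \infty \text{ for all } t \in \mathbb{R}.
\]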

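Proof. To apply arguments analogous to those in Proposition 5.1, by taking into account the expression of the considered population means $m_i$, the decomposition of $A_n$ can be written as follows:

\[
A_n = \eta_n\big(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot}\big)
+ \delta_n^2 \sum_{i=1}^{k} \frac{n_i}{n} \| m^*_i - \mu^*_n \|^2
+ 2\,\delta_n\, \zeta_n\big(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot},\ m^*_1 - \mu^*_n, \ldots, m^*_k - \mu^*_n\big),
\]

where $\mu^*_n = n^{-1} \sum_{i=1}^{k} n_i m^*_i$ and

\[
\zeta_n(h_1, \ldots, h_k, g_1, \ldots, g_k) = \sum_{i=1}^{k} \sqrt{\frac{n_i}{n}}\, \Big\langle h_i - \sum_{l=1}^{k} \alpha^n_{li} h_l,\ g_i \Big\rangle
\]

for all $(h_1, \ldots, h_k, g_1, \ldots, g_k) \in [\mathcal{H}]^{2k}$.

The first term has been proved to converge in law to $\eta(Z_1, \ldots, Z_k)$ in Proposition 5.1. Regarding the second term, we have that $\sum_{i=1}^{k} n_i \| m^*_i - \mu^*_n \|^2 / n$ converges to $\sum_{i=1}^{k} p_i \| m^*_i - \mu^* \|^2$, with $\mu^* = \sum_{i=1}^{k} p_i m^*_i$. Thus, the second term divided by $\delta_n^2$ converges to a positive quantity, since there exist $i_1 \neq i_2$ with $\| m^*_{i_1} - m^*_{i_2} \|^2 > 0$. Concerning the third term, if

\[
\zeta(h_1, \ldots, h_k, g_1, \ldots, g_k) = \sum_{i=1}^{k} \sqrt{p_i}\, \Big\langle h_i - \sum_{l=1}^{k} \alpha_{li} h_l,\ g_i \Big\rangle
\]

for all $(h_1, \ldots, h_k, g_1, \ldots, g_k) \in [\mathcal{H}]^{2k}$, then, analogously to Proposition 5.1, one has that $\zeta_n(\sqrt{n_1} \cdot \overline{H}^c_{1\cdot}, \ldots, \sqrt{n_k} \cdot \overline{H}^c_{k\cdot}, m^*_1 - \mu^*_n, \ldots, m^*_k - \mu^*_n)$ converges in law to $\zeta(Z_1, \ldots, Z_k, m^*_1 - \mu^*, \ldots, m^*_k - \mu^*)$. Consequently, it can be easily concluded that $A_n / \delta_n^{3/2} \to \infty$ in probability, and, hence, $A_n \to \infty$ in probability, which implies that $P(A_n \le t) \to 0$ as $n \to \infty$ for all $t \in \mathbb{R}$.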

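The proposed statistic can be studentized by considering, as usual, a denominator related to the overall variability. The strong law of large numbers and Slutsky's theorem, along with the preceding propositions, immediately yield the next result.

Theorem 5.3. If $n_i \to \infty$ and $n_i/n \to p_i > 0$ as $n \to \infty$ for all $i \in \{1, \ldots, k\}$ and $H_i$ is non-degenerate for some $i \in \{1, \ldots, k\}$, then

\[
B_n = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \| H_{ij} - \overline{H}_{i\cdot} \|^2 \ \to\ \sum_{i=1}^{k} p_i\, E\| H_i - m_i \|^2 > 0 \quad \text{a.s. } [P].
\]

Consequently, if the null hypothesis $H_0$ is true, then the following convergence in law holds:

\[
\frac{\displaystyle \sum_{i=1}^{k} n_i \| \overline{H}_{i\cdot} - \overline{H}_{\cdot\cdot} \|^2}{\displaystyle \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \| H_{ij} - \overline{H}_{i\cdot} \|^2}
\ \xrightarrow{\ \mathcal{L}\ }\
\frac{\displaystyle \sum_{i=1}^{k} \Big\| Z_i - \sum_{l=1}^{k} \alpha_{li} Z_l \Big\|^2}{\displaystyle \sum_{i=1}^{k} p_i\, E\| H_i - m_i \|^2}.
\]

Moreover, under the local alternatives described above, we have that if $\delta_n \to \infty$ as $n \to \infty$, then $P(A_n / B_n \le t) \to 0$ as $n \to \infty$ for all $t \in \mathbb{R}$.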
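The studentized ratio $A_n / B_n$ of Theorem 5.3 can be computed in the same discretized setting. As before, this is a sketch under assumptions not taken from the text: curves are lists of values on a common equispaced grid, and the squared norm is approximated by the mean of squared grid values. The function names are hypothetical.

```python
from typing import List, Sequence

Curve = List[float]


def mean_curve(curves: Sequence[Curve]) -> Curve:
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def sq_norm(f: Curve) -> float:
    """Riemann approximation of the squared L2[0, 1] norm on an equispaced grid."""
    return sum(x * x for x in f) / len(f)


def between_groups(groups: Sequence[Sequence[Curve]]) -> float:
    """A_n: weighted squared distances between group means and the grand mean."""
    grand = mean_curve([c for g in groups for c in g])
    return sum(
        len(g) * sq_norm([x - y for x, y in zip(mean_curve(g), grand)])
        for g in groups
    )


def within_groups(groups: Sequence[Sequence[Curve]]) -> float:
    """B_n: average squared distance of each curve to its own group mean."""
    n = sum(len(g) for g in groups)
    total = 0.0
    for g in groups:
        gm = mean_curve(g)
        total += sum(sq_norm([x - y for x, y in zip(c, gm)]) for c in g)
    return total / n


def studentized_statistic(groups: Sequence[Sequence[Curve]]) -> float:
    """A_n / B_n; requires some within-group variability (B_n > 0)."""
    b_n = within_groups(groups)
    if b_n == 0.0:
        raise ValueError("degenerate sample: no within-group variability")
    return between_groups(groups) / b_n


groups = [
    [[0.0, 0.0], [2.0, 2.0]],
    [[3.0, 3.0], [5.0, 5.0]],
]
print(between_groups(groups), within_groups(groups), studentized_statistic(groups))
```

The explicit check on $B_n$ mirrors the non-degeneracy condition in the theorem, which is what keeps the limit of the denominator strictly positive.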

On the basis of Theorem 5.3, an asymptotic ANOVA test generalizing the usual one to the functional data analysis case is derived.

A bootstrap approximation is now considered to improve asymptotic results by means of re-sampling techniques. In this respect, the Bootstrap Central Limit Theorem by Giné and Zinn (1990) ensures that if for all $i \in \{1, \ldots, k\}$ the $n_i$ functional random variables $H^*_{ij}$ randomly chosen from $\{H_{i1} - \overline{H}_{i\cdot}, \ldots, H_{in_i} - \overline{H}_{i\cdot}\}$ are considered, one has that $\sqrt{n_i} \cdot \overline{H}^*_{i\cdot} \to Z_i$ in law a.s.-$[P]$ as $n_i \to \infty$. Consequently, if the bootstrap statistic is defined as

\[
A^*_n = \sum_{i=1}^{k} n_i \Big\| \overline{H}^*_{i\cdot} - \sum_{l=1}^{k} \alpha^n_{li}\, \overline{H}^*_{l\cdot} \Big\|^2,
\]

by the same arguments as those in Proposition 5.1, the following result is obtained.

Theorem 5.4. If $n_i \to \infty$ and $n_i/n \to p_i > 0$ as $n \to \infty$ for all $i \in \{1, \ldots, k\}$, then $A_n^*$ converges in law a.s.-$[P]$ to the distribution of $A$ in Proposition 5.1 (thus, the distribution of $A_n$ under $H_0$ can be approximated by that of $A_n^*$).
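As an illustration only, the bootstrap procedure behind Theorem 5.4 can be sketched in Python for functional data observed on a common, equally spaced grid. The sketch rests on assumptions that are not taken from the paper: the squared norm is approximated by the average of the squared values over the grid, and the weighted combination in the bootstrap statistic is taken to be the pooled mean of the resampled centered data, so that the bootstrap statistic has the same form as the numerator in Theorem 5.3. The function names `anova_stat` and `bootstrap_pvalue` are hypothetical.

```python
import numpy as np


def anova_stat(groups):
    """Between-group statistic sum_i n_i * ||mean_i - grand_mean||^2.

    Each group is an array of shape (n_i, n_grid); the squared L2 norm is
    approximated by the average over the grid (an assumption of this sketch).
    """
    n = sum(len(g) for g in groups)
    grand = sum(g.sum(axis=0) for g in groups) / n
    return sum(len(g) * np.mean((g.mean(axis=0) - grand) ** 2) for g in groups)


def bootstrap_pvalue(groups, n_boot=1000, seed=0):
    """Bootstrap p-value for H0: all group mean functions coincide."""
    rng = np.random.default_rng(seed)
    observed = anova_stat(groups)
    # Centering each sample at its own mean mimics the null hypothesis.
    centered = [g - g.mean(axis=0) for g in groups]
    exceed = 0
    for _ in range(n_boot):
        # Resample with replacement within each centered group.
        resampled = [c[rng.integers(0, len(c), size=len(c))] for c in centered]
        if anova_stat(resampled) >= observed:
            exceed += 1
    return exceed / n_boot
```

The null hypothesis would be rejected at level α when the returned proportion falls below α, which corresponds to the observed statistic exceeding the estimated 100(1 − α) fractile of the bootstrap distribution.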

5.2. An ANOVA test approach for fuzzy data

The results in Section 5.1 can be particularized to fuzzy data by using the identification in Section 3. Thus, an extension of the ANOVA introduced in Gil et al. (2006) (valid only for simple random fuzzy sets in the 1-dimensional setting) is obtained.

Consider the problem of testing the equality of means of RFSs. Let $X_1, \ldots, X_k$ be $k$ independent RFSs with values in $\mathcal{F}_c^2(\mathbb{R}^p)$. Assume that for each $i \in \{1, \ldots, k\}$ there exist the fuzzy mean values $E(X_i) = m_i \in \mathcal{F}_c^2(\mathbb{R}^p)$, the $(\theta, \varphi)$-Fréchet variances $\sigma^2_{X_i}$, $i \in \{1, \ldots, k\}$, and the covariance functions $K_1, \ldots, K_k$ of $s_{X_1}, \ldots, s_{X_k}$, respectively. The aim is to test

$$H_0 : m_1 = \cdots = m_k \quad \text{versus} \quad H_a : \exists\, i_1 \neq i_2 \text{ with } m_{i_1} \neq m_{i_2},$$

on the basis of a simple random sample $\{X_{ij}\}_{j=1}^{n_i}$ obtained from the RFS $X_i$ for all $i \in \{1, \ldots, k\}$.

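Theorem 5.5. If $n_i \to \infty$, $n_i/n \to p_i > 0$ as $n \to \infty$ for all $i \in \{1, \ldots, k\}$ and $X_i$ is non-degenerate for some $i \in \{1, \ldots, k\}$, then, if the null hypothesis $H_0$ is true, the following convergence in law holds:

$$T_n = \frac{\sum_{i=1}^{k} n_i \left( D_\theta^\varphi(X_{i\cdot}, X_{\cdot\cdot}) \right)^2}{\frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( D_\theta^\varphi(X_{ij}, X_{i\cdot}) \right)^2} \to \frac{\sum_{i=1}^{k} \left( \left\| Z_i - \sum_{l=1}^{k} \alpha_{li} Z_l \right\|_\theta^\varphi \right)^2}{\sum_{i=1}^{k} p_i \sigma^2_{X_i}},$$

where $Z_1, \ldots, Z_k$ are independent centered Gaussian processes in $\mathbb{H}$ with covariance functions $K_1, \ldots, K_k$, $X_{i\cdot} = \sum_{j=1}^{n_i} X_{ij}/n_i$ and $X_{\cdot\cdot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} X_{ij}/n$. Moreover, under the local alternatives, if $\delta_n \to \infty$ as $n \to \infty$, then $P(T_n \le t) \to 0$ as $n \to \infty$ for all $t \in \mathbb{R}$.

Let $\{X^*_{ij}\}_{j=1}^{n_i}$ be randomly chosen from $\{X_{i1}, \ldots, X_{in_i}\}$. If the bootstrap statistic is defined as

$$A_n^* = \sum_{i=1}^{k} n_i \left( D_\theta^\varphi(X^*_{i\cdot} + X_{\cdot\cdot},\; X_{i\cdot} + X^*_{\cdot\cdot}) \right)^2,$$

then the distribution of $A_n$ under $H_0$ can be approximated by that of $A_n^*$. As a consequence, $H_0$ will be rejected at a given significance level $\alpha$ whenever $\sum_{i=1}^{k} n_i \left( D_\theta^\varphi(X_{i\cdot}, X_{\cdot\cdot}) \right)^2$ takes on a value greater than the estimated $100(1 - \alpha)$ fractile of the distribution of $A_n^*$.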

In particularizing the asymptotic results in Section 5.1 to the case of fuzzy data, the Gaussian processes involved in the asymptotic result are not $\mathcal{F}_c^2(\mathbb{R}^p)$-valued. However, the suggested bootstrap approximation has been expressed in terms of sums and distances, and it does not involve differences of fuzzy values, whence the bootstrap statistic is defined on the sampling cone.
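The fact that the bootstrap statistic only requires sums and distances can be made concrete with a short Python sketch. It is a sketch under stated assumptions rather than the authors' implementation: fuzzy data are taken to be fuzzy numbers (p = 1) stored through the lower and upper ends of their level sets on a common, equally spaced grid of levels, and the metric is assumed to be of mid/spread type, with the spreads weighted by θ and the levels averaged uniformly. The names `sq_dist` and `fuzzy_bootstrap_pvalue` are hypothetical.

```python
import numpy as np


def sq_dist(u, v, theta=1.0):
    """Squared mid/spread distance between two fuzzy numbers.

    u and v have shape (2, n_levels): row 0 holds the lower ends and
    row 1 the upper ends of the level sets (assumed representation).
    """
    mid_diff = (u[0] + u[1]) / 2 - (v[0] + v[1]) / 2
    spr_diff = (u[1] - u[0]) / 2 - (v[1] - v[0]) / 2
    return np.mean(mid_diff ** 2) + theta * np.mean(spr_diff ** 2)


def fuzzy_bootstrap_pvalue(groups, n_boot=1000, theta=1.0, seed=0):
    """groups: list of arrays of shape (n_i, 2, n_levels)."""
    rng = np.random.default_rng(seed)
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    means = [g.mean(axis=0) for g in groups]          # sample fuzzy means
    grand = sum(g.sum(axis=0) for g in groups) / n    # overall fuzzy mean
    observed = sum(ni * sq_dist(m, grand, theta) for ni, m in zip(sizes, means))
    exceed = 0
    for _ in range(n_boot):
        # Resample the original (uncentered) fuzzy data within each group.
        boot = [g[rng.integers(0, len(g), size=len(g))] for g in groups]
        b_means = [b.mean(axis=0) for b in boot]
        b_grand = sum(b.sum(axis=0) for b in boot) / n
        # Level-wise sums of end points are Minkowski sums of the intervals,
        # so no difference of fuzzy values is ever formed.
        stat = sum(ni * sq_dist(bm + grand, m + b_grand, theta)
                   for ni, bm, m in zip(sizes, b_means, means))
        if stat >= observed:
            exceed += 1
    return exceed / n_boot
```

Adding the overall mean to the resampled group mean, and the resampled overall mean to the group mean, plays the role of centering without leaving the cone of fuzzy values.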

6. Illustrative example: application to a case study

The survey described in Section 2 was electronically distributed. A sample of 17 people from different countries, professions, ages and sexes (8 women and 9 men) participated. Each of the 17 people performed a sequence of 27 trials. The gray line segment was generated at random, so that one of 9 different relative sizes was chosen in each trial. The 9 relative sizes are equally spaced in such a way that possibilities from very small to very large line segments are covered. Each of the 9 relative sizes appears 3 times in the sequence, although positions and locations vary within each sequence and between sequences (very slight changes can appear in the list of the 9 relative sizes depending on the screen resolution). Variations in the data could be attributed to either variations in the perceptions or variations due to the line segment size; showing the same 9 relative sizes 3 times to each of the participants is useful to control the variations associated with the line segment size. In order to compare the overall perception of relative lengths, and not only the perceptions associated with a particular length, tests about the average perception, as a summary measure, will be carried out. Since all the participants have evaluated the same set of lengths, the mean values should coincide except for differences in perception. Additionally, to show the results for each length, thus avoiding possible compensations, the 9 relative sizes are considered separately to test differences in perception between men and women. The dataset to be analyzed and the application providing it can be found at http://bellman.ciencias.uniovi.es/SMIRE/perceptions.html (website of the research group SMIRE).

In Fig. 2, two of the fuzzy data obtained for line segments of 26.76% and 96.27% relative lengths for 2 individuals (female and male), as well as the average of the 27 trials of each individual (thick line), are shown. The mid-points of the level sets are linked to the location, like the class mark of grouped and interval data. The spreads are linked to the imprecision, in the sense that when the participants have more doubts, they choose wider intervals. For both individuals, the imprecision is greater for the 26.76% length, especially for the female. Concerning the average, since the mid (respectively the spread) of the mean perception is the mean of the mids (respectively the spreads), we can conclude that the imprecision is greater for the female on average, although the average locations of the perceptions seem quite similar. To corroborate this assertion numerically, distances are determined.

Fig. 2. Dotted and dashed lines: two fuzzy data for line segments of 26.76% and 96.27% relative lengths chosen by two participants (left: female, right: male). Thick line: average of the 27 trials of both participants.

Fig. 3. Distances between the two pairs of trials for the female (dotted lines), for the male (dashed lines) and between the averages of both individuals (thick line) in Fig. 2.

The distance between fuzzy sets is now computed as a function of the weighted distances between mids and spreads, by averaging the results of each α-level. Thus, differences between fuzzy sets can be associated with differences between mids (location) and differences between spreads (imprecision). To make this fact clearer, the metric (with ϕ = λ = Lebesgue measure on (0, 1]) will be expressed as a convex linear combination of the squared distances between mids and between spreads in the following way

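$$D_\tau^2(U, V) = (1 - \tau) \left( \|\operatorname{mid} U - \operatorname{mid} V\|_1^\lambda \right)^2 + \tau \left( \|\operatorname{spr} U - \operatorname{spr} V\|_1^\lambda \right)^2$$

for all $U, V \in \mathcal{F}_c^2(\mathbb{R}^p)$, where $\tau \in (0, 1)$ represents the weight of the spreads against the mids.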
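A minimal numerical sketch of this convex combination may help to fix ideas. It assumes the 1-dimensional case (p = 1), trapezoidal fuzzy numbers given by four parameters (a, b, c, d), and an approximation of the integral with respect to the Lebesgue measure on (0, 1] by an average over an equally spaced grid of levels. The helper names `level_sets` and `d_tau_squared` are hypothetical and do not come from the paper.

```python
import numpy as np


def level_sets(trap, alphas):
    """Lower and upper ends of the alpha-level sets of a trapezoidal
    fuzzy number with support [a, d] and core [b, c]."""
    a, b, c, d = trap
    lower = a + alphas * (b - a)
    upper = d - alphas * (d - c)
    return lower, upper


def d_tau_squared(u, v, tau, n_levels=1001):
    """(1 - tau) * ||mid U - mid V||^2 + tau * ||spr U - spr V||^2,
    with both squared norms averaged over the grid of levels."""
    alphas = np.linspace(0.0, 1.0, n_levels)
    lo_u, up_u = level_sets(u, alphas)
    lo_v, up_v = level_sets(v, alphas)
    mid_diff = (lo_u + up_u) / 2 - (lo_v + up_v) / 2   # location part
    spr_diff = (up_u - lo_u) / 2 - (up_v - lo_v) / 2   # imprecision part
    return (1 - tau) * np.mean(mid_diff ** 2) + tau * np.mean(spr_diff ** 2)
```

For two crisp intervals [0, 2] and [1, 5], encoded as (0, 0, 2, 2) and (1, 1, 5, 5), the mids differ by 2 and the spreads by 1, so the squared distance equals 4(1 − τ) + τ; small values of τ therefore emphasize the difference in location and large values the difference in imprecision.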

In Fig. 3 the distances between some fuzzy sets in Fig. 2 are displayed as functions of the weight τ. The dotted lines correspond to the perceptions of the female, the dashed lines to those of the male, and the thick line is associated with the distance between the averages of both individuals. For both individuals, the larger distance corresponds to the case of relative lengths of 26.76%, which seems to be more difficult to evaluate. In this case, the difference in location (the lower the value of τ, the more weight is given to the location) is greater for the male, who chose for the two trials fuzzy sets located in rather different places although with very similar imprecision (the larger the value of τ, the more weight is given to the imprecision). Whereas the distances between the trials decrease as τ increases, the distance between the averages increases with the spread weight, due to the above-mentioned difference in imprecision.


Fig. 4. Sample means for the women group (dotted line) and for the men group (dashed line) for the relative lengths of 26.76%, 96.27% and the average of the 9 lengths.

Fig. 5. p-values of the ANOVA tests for individuals (k = 17), and sex groups (two-sample) as a function of the proportion ρ.

Several ANOVA tests have been studied, namely, a first one for individual analysis (i.e., k = 17 samples of sizes $n_i = 27$), a second one for sex analysis (i.e., k = 2 with $n_1 = 27 \cdot 8 = 216$ and $n_2 = 27 \cdot 9 = 243$), the third and the fourth individually for women (i.e., k = 8 samples of sizes $n_i = 27$) and men (i.e., k = 9 samples of sizes $n_i = 27$), respectively, and 9 for each one of the relative sizes grouped by sex (i.e., k = 2 with $n_1 = 3 \cdot 8 = 24$ and $n_2 = 3 \cdot 9 = 27$). The bootstrap approach in Theorem 5.5 has been applied to approximate the corresponding p-values with 10,000 bootstrap replications. For simplicity, the sample means are represented only for the overall sex analysis and the relative lengths of 26.76% and 96.27% (see Fig. 4).

The test results depend on $D_\tau$ and, therefore, on the parameter τ. To analyze the differences in the fuzzy means by taking advantage of the decomposition of the variance of $X$ into the part related to the location, $\operatorname{mid} X$, and the part related to the imprecision, $\operatorname{spr} X$, the p-values have been computed as a function of the proportion ρ of the total variation that is due to the (weighted) 'variation in imprecision', that is,

$$\rho = \frac{\tau \operatorname{Var}(\operatorname{spr} X)}{(1 - \tau) \operatorname{Var}(\operatorname{mid} X) + \tau \operatorname{Var}(\operatorname{spr} X)}.$$
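The relation between the weight τ and the proportion of variation due to imprecision can be explored with a few lines of Python. This is only a sketch: the function names `rho_from_tau` and `tau_from_rho` are hypothetical, the inverse map is obtained by solving the defining equation for τ, and the variances used in the comment that follows are illustrative values rather than estimates from the case study.

```python
def rho_from_tau(tau, var_mid, var_spr):
    """Proportion of the total weighted variation due to the spreads."""
    return tau * var_spr / ((1 - tau) * var_mid + tau * var_spr)


def tau_from_rho(rho, var_mid, var_spr):
    """Weight tau that yields a prescribed proportion rho (inverse map)."""
    return rho * var_mid / (rho * var_mid + (1 - rho) * var_spr)
```

With the illustrative values Var(mid X) = 100 and Var(spr X) = 1, the weight τ = 0.5 gives a proportion of 1/101, that is, less than 1% of the weighted variation is attributed to imprecision, while `tau_from_rho` returns the weight needed to reach a chosen proportion.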

This proportion ρ is an increasing function of τ and satisfies that ρ → 0+ iff τ → 0+, ρ = 0.5 iff (1 − τ)Var(mid X) = τ Var(spr X) (i.e., half of the overall variation is due to the variation of the spreads, and the other half to that of the mids), ρ → 1− iff τ → 1−, and so on. In practice, Var(mid X) is usually much greater than Var(spr X), due to the difference in magnitude. Consequently, in most real-life situations, weights τ ≤ 0.5 are associated with very small values of ρ. The population variances in the definition of ρ have been approximated by using analogue estimates based on the total sample in each case.

Fig. 5 (respectively Fig. 6) displays the p-values of the first two (respectively the last two) bootstrap ANOVA tests as a function of the proportion ρ. Two vertical reference lines cross the figures; the one on the right corresponds to ρ = 0.5, whereas the left-side one corresponds to a very small value of ρ (the one associated with τ = 0.5). Figs. 5 and 6 indicate that the greater the weight associated with the variation in imprecision, the lower the p-value of both tests. Significant differences are more easily detectable when comparing the mean perception after grouping by sex (respectively women) than when comparing individually (respectively men). This can be justified because the variation in imprecision between sex groups (respectively women) is substantially greater than the variation in imprecision between individuals (respectively men).

Fig. 6. p-values of the ANOVA tests for individual women (k = 8) and individual men (k = 9) as a function of the proportion ρ.

To interpret the results, two representative situations have been considered. Table 1 contains the p-values for the cases ρ = 0.05 and ρ = 0.25. Roughly speaking, in the first situation, the test is more focused on differences in locations of the average perception, because the imprecision is given a low weight. In this case, no significant differences are detected. In the second situation, the imprecision is given more weight by considering ρ = 0.25. Thus, the test is more affected by differences in imprecision. In this case, significant differences have been detected among individuals, between men and women (see Fig. 4 for the sample results) and among females, but not among males, who are concluded to be more uniform in expressing their average perception.

Table 1. p-values of the four ANOVA tests for two representative values of ρ.

            Individuals   Males/females   Males   Females
ρ = 0.05    0.734         0.111           0.979   0.384
ρ = 0.25    0.000         0.000           0.165   0.000

The conclusions concerning the differences related to the imprecision in the average perception due to gender are obtained because of the use of the fuzzy scale, which allows the participants to indicate their degree of uncertainty when evaluating each instance. To illustrate this fact, the data of the Likert-like labels chosen by the participants, classified according to gender, were analyzed. In Table 2 the sample results of 243 data for men and 216 for women are shown. The p-value of the χ²-test is 0.204. If the data are understood to come from a percentage-type evaluation, the p-value of the corresponding bootstrap ANOVA test would be 0.271. Both tests fail to reject the null hypothesis, and no significant differences are found.

Table 2. Contingency table for the labels classified by gender.

         very small (%)   small (%)   medium (%)   large (%)   very large (%)
Men      18.93            20.17       24.28        22.22       14.40
Women    12.96            25.92       21.30        20.30       18.52

It should be underlined that, since the perception for small and large relative lengths may be different, the average over the 9 relative lengths may compensate and obscure the results. For this reason, the analyses of each one of the 9 relative sizes by sex are shown in Figs. 7–9. No significant differences are found for the cases of 26.76% and 96.27% relative sizes (see Fig. 4). For the rest of the cases, the conclusions are the same as those for the average perceptions of all lengths.

7. Concluding remarks

An approach has been presented to develop statistics with fuzzy data by identifying them as special functional data. Based on this identification, the random mechanisms producing fuzzy data and relevant associated parameters have been formalized by using concepts for Hilbert space-valued random elements (see also Körner (2000), Körner and Näther (2002) and González-Rodríguez et al. (2006b)). Nevertheless, special attention should be paid in particularizing the methods from functional to fuzzy data, since fuzzy data have a conical structure instead of the linear one shown by functional data. For this reason, some previous statistical techniques to deal with fuzzy data have been designed ad hoc (cf. Montenegro et al., 2001, 2004; Gil et al., 2006; González-Rodríguez et al., 2006b).

An ANOVA test for functional data has been introduced. The proposed test statistic is an extension of the classical one for real-valued data. The ANOVA approach has been later particularized to an ANOVA test for fuzzy data; it has been remarked that whereas the asymptotic approach would not involve a fuzzy-valued random element, the bootstrap one is well-developed. Several open related problems can be considered, namely, the one-way ANOVA test for dependent samples, the factorial ANOVA test, the posterior tests, the experimental design analyses, and so on. On the other hand, the results in Theorem 5.3 could be applied along with the fuzzy representation of real-valued random variables (see González-Rodríguez et al., 2006a) to develop an ANOVA test for distributions.

Fig. 7. p-values of the ANOVA tests for the small relative sizes, and sex groups as a function of the proportion ρ.

Fig. 8. p-values of the ANOVA tests for the medium relative sizes, and sex groups as a function of the proportion ρ.

Fig. 9. p-values of the ANOVA tests for the large relative sizes, and sex groups as a function of the proportion ρ.

Acknowledgements

This research has been partially supported by/benefited from the Spanish Ministry of Science and Innovation Grants MTM2009-09440-C02-01 and MTM2009-09440-C02-02, the Principality of Asturias Grants IB09-042C1 and IB09-042C2, and the COST Action IC0702. The paper has benefited from the suggestions of the anonymous referees and the discussions with colleagues.

References

Allen, I.E., Seaman, C.A., 2007. Likert scales and data analyses. Qual. Prog. 40, 64–65.
Ammar, S., Wright, R., 2000. Applying fuzzy-set theory to performance evaluation. Socio-Econ. Plan. Sci. 34, 285–302.
Cao, J., Ramsay, J.O., 2009. Generalized profiling estimation for global and adaptive penalized spline smoothing. Comput. Statist. Data Anal. 53, 2550–2562.
Castaing, C., Valadier, M., 1977. Convex Analysis and Measurable Multifunctions. In: Lec. Notes in Math., vol. 580. Springer-Verlag, Berlin.
Chimka, J.R., Wolfe, H., 2009. History of ordinal variables before 1980. Scientific Research and Essays 4, 853–860.
Colubi, A., 2009. Statistical inference about the means of fuzzy random variables: applications to the analysis of fuzzy- and real-valued data. Fuzzy Sets and Systems 160, 344–356.
Colubi, A., Domínguez-Menchero, J.S., López-Díaz, M., Ralescu, D.A., 2001. On the formalization of fuzzy random variables. Inform. Sci. 133, 3–6.
Colubi, A., Domínguez-Menchero, J.S., López-Díaz, M., Ralescu, D.A., 2002. A D_E[0, 1]-representation of random upper semicontinuous functions. Proc. Amer. Math. Soc. 130, 3237–3242.
Colubi, A., González-Rodríguez, G., 2007. Triangular fuzzification of random variables and power of distribution tests: empirical discussion. Comput. Statist. Data Anal. 51, 4742–4750.
Cuevas, A., Febrero, M., Fraiman, R., 2004. An anova test for functional data. Comput. Statist. Data Anal. 47, 111–122.
Cuevas, A., Febrero, M., Fraiman, R., 2006. On the use of the bootstrap for estimating functions with functional data. Comput. Statist. Data Anal. 51, 1063–1074.
Fernández, E., Fernández, M., Anadón, S., González-Rodríguez, G., Colubi, A., 2010. Flood analysis: on the automation of the geomorphological-historical method. In: Combining Soft Computing and Statistical Methods in Data Analysis. Advances in Intelligent and Soft Computing Series, vol. 77. Springer-Verlag, Berlin, Heidelberg, pp. 239–246.
Ferraty, F., Vieu, P., 2006. Nonparametric Functional Data Analysis: Theory and Practice. Springer-Verlag, New York.
Ferraty, F., Vieu, P., 2009. Additive prediction and boosting for functional data. Comput. Statist. Data Anal. 53, 1400–1413.
Gil, M.A., Montenegro, M., González-Rodríguez, G., Colubi, A., Casals, M.R., 2006. Bootstrap approach to the multi-sample test of means with imprecise data. Comput. Statist. Data Anal. 51, 148–162.
Giné, E., Zinn, J., 1990. Bootstrapping general empirical measures. Ann. Probab. 18, 851–869.
González-Manteiga, W., Vieu, P., 2007. Guest editors of the special issue on statistics for functional data. Comput. Statist. Data Anal. 51, 4788–5008.
González-Rodríguez, G., Colubi, A., Gil, M.A., 2006a. A fuzzy representation of random variables: an operational tool in exploratory analysis and hypothesis testing. Comput. Statist. Data Anal. 51, 163–176.
González-Rodríguez, G., Montenegro, M., Colubi, A., Gil, M.A., 2006b. Bootstrap techniques and fuzzy random variables: synergy in hypothesis testing with fuzzy data. Fuzzy Sets and Systems 157, 2608–2613.
Körner, R., 2000. An asymptotic α-test for the expectation of random fuzzy variables. J. Statist. Plann. Inference 83, 331–346.
Körner, R., Näther, W., 2002. On the variance of random fuzzy variables. In: Bertoluzza, C., Gil, M.A., Ralescu, D.A. (Eds.), Statistical Modeling, Analysis and Management of Fuzzy Data. Physica-Verlag, Heidelberg, pp. 22–39.
Laha, R.G., Rohatgi, V.K., 1979. Probability Theory. Wiley, New York.
Likert, R., 1932. A technique for the measurement of attitudes. Arch. Psychol. 140, 1–55.
Lubiano, M.A., Gil, M.A., López-Díaz, M., López-García, M.T., 2000. The $\vec{\lambda}$-mean squared dispersion associated with a fuzzy random variable. Fuzzy Sets and Systems 111, 307–317.
Montenegro, M., Casals, M.R., Lubiano, M.A., Gil, M.A., 2001. Two-sample hypothesis tests of means of a fuzzy random variable. Inform. Sci. 133, 89–100.
Montenegro, M., Colubi, A., Casals, M.R., Gil, M.A., 2004. Asymptotic and Bootstrap techniques for testing the expected value of a fuzzy random variable. Metrika 59, 31–49.
Müller, H.-G., 2005. Functional modelling and classification of longitudinal data. Scandinavian J. Statist. 32, 223–240.
Müller, H.-G., Yang, W., 2010. Dynamic relations for sparsely sampled Gaussian processes (invited paper with discussions). Test 19, 1–65.
Nguyen, H.T., 1978. A note on the extension principle for fuzzy sets. J. Math. Anal. Appl. 64, 369–380.
Puri, M.L., Ralescu, D.A., 1983. Differentials of fuzzy functions. J. Math. Anal. Appl. 91, 552–558.
Puri, M.L., Ralescu, D.A., 1985. The concept of normality for fuzzy random variables. Ann. Probab. 11, 1373–1379.
Puri, M.L., Ralescu, D.A., 1986. Fuzzy random variables. J. Math. Anal. Appl. 114, 409–422.
Ramsay, J.O., Silverman, B.W., 1997. Functional Data Analysis. Springer-Verlag, New York.
Ramsay, J.O., Silverman, B.W., 2002. Applied Functional Data Analysis. Springer-Verlag, New York.
Stevens, S.S., 1946. On the theory of scales of measurement. Science 103 (2884), 677–680.
Trutschnig, W., González-Rodríguez, G., Colubi, A., Gil, M.A., 2009. A new family of metrics for compact, convex (fuzzy) sets based on a generalized concept of mid and spread. Inform. Sci. 179, 3964–3972.
Yao, F., Müller, H.-G., Wang, J.-L., 2005. Functional linear regression analysis for longitudinal data. Ann. Statist. 33, 2873–2903.
Zadeh, L.A., 1975. The concept of a linguistic variable and its application to approximate reasoning, Part 1. Inform. Sci. 8, 199–249; Part 2. Inform. Sci. 8, 301–353; Part 3. Inform. Sci. 9, 43–80.

Please cite this article in press as: González-Rodríguez, G., et al., Fuzzy data treated as functional data: A one-way ANOVA test approach. Computational Statistics and Data Analysis (2010), doi:10.1016/j.csda.2010.06.013

