Uncertainty–aware similarity measures - properties and construction method

Patryk Żywica and Anna Stachowiak

Department of Imprecise Information Processing Methods, Faculty of Mathematics and Computer Science,

Adam Mickiewicz University in Poznań, Umultowska 87, 61-614 Poznań, Poland,

[email protected], [email protected]

Abstract

The concept of an uncertainty–aware similarity measure is defined and discussed. The aim of the paper is to support the view that both the definition and the construction of such a measure should take into account the epistemic nature of the compared incomplete (uncertain) information. This approach, however, generates new challenges resulting from the computational complexity of the problem. We define a set of properties to be satisfied by an uncertainty–aware similarity measure and we propose a new technique for constructing such measures for Interval–Valued Fuzzy Sets.

Keywords: Similarity, Uncertainty, Interval–Valued Fuzzy Sets, IVFS.

1 Introduction

The need for modeling imprecise and incomplete information gave rise to the theory of fuzzy sets and its many extensions. In this paper we give special attention to Interval–Valued Fuzzy Sets (IVFS), since they provide a way to model not only the vagueness of membership values (information imprecision) but also hesitation about those values (information incompleteness or uncertainty).

An IVFS can be defined as a set of possible fuzzy sets, one of which is the "true" or "real" one, presently not known due to the lack of knowledge. Thus, an IVFS is a way to describe or represent some information that is uncertain (the uncertainty is not of the probabilistic type, but arises from the lack of knowledge; it can be reduced when knowledge increases). Such an interpretation of an interval is of an epistemic nature, in contrast to the ontic one, in which the interval is understood as complex, but certain, information [7]. We will thus refer to an IVFS as the set of its possible states. We clearly distinguish a description of an (unknown) object from the object itself (represented by one of the possible states). The notion of uncertainty–aware similarity measure that is proposed in this paper takes this distinction into account. Let us assume, for example, that we want to compare two identical IVFSs, [0.1, 0.8] and [0.1, 0.8]. Obviously we notice the total similarity of their descriptions; however, this does not imply total similarity of the objects that are being described.

The paper is entirely devoted to uncertainty–aware similarity measures that take into account the epistemic nature of data, and to methods of constructing them for IVFSs. We propose a set of properties to be satisfied by such a measure, and we give a method to construct it. The motivation for our work is clear: similarity measures play a fundamental role in many fields and applications such as approximate reasoning, decision-making systems, recommender systems, pattern recognition and others. On the other hand, we believe that the problem of information incompleteness still requires a closer look and insightful investigation.

2 Definitions

Let $U = \{u_1, u_2, \ldots, u_n\}$ be a crisp universal set. A mapping $A : U \to [0, 1]$ is called a fuzzy set (FS) in $U$. For each $1 \le i \le n$, the value $A(u_i)$ ($a_i$ for short) represents the membership grade of $u_i$ in $A$. Any crisp set $X \subseteq U$ can be represented as a fuzzy set by its characteristic function $1_X$. Let $\mathcal{F}(U)$ be the family of all fuzzy sets in $U$.

A binary operation $t : [0, 1] \times [0, 1] \to [0, 1]$ is called a triangular norm (t-norm, for short) if it is commutative, associative, non-decreasing in each argument, and has 1 as neutral element. The most important t-norms are the minimum $t_{\min}(x, y) = \min(x, y)$, the product $t_{\mathrm{prod}}(x, y) = xy$, and the Łukasiewicz t-norm $t_{\mathrm{Luk}}(x, y) = \max(0, x + y - 1)$. A thorough investigation of t-norms is given in the classical monograph of Klement et al. [13].

11th Conference of the European Society for Fuzzy Logic and Technology (EUSFLAT 2019)

Copyright © 2019, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Atlantis Studies in Uncertainty Modelling, volume 1

Figure 1: Visualization of the Interval–Valued Fuzzy Set $\hat{A}$ and its Fuzzy Membership Function Family $\widetilde{A}$ (shaded area). Dashed lines represent two embedded fuzzy sets (possible states) $A_1 \in \widetilde{A}$ and $A_2 \in \widetilde{A}$. (Axes: $x$ from $-2$ to $2$, membership $\mu$ from $0$ to $1$.)

Definition 1. A similarity measure of fuzzy sets is defined as a function on $E \subset \mathcal{F}(U) \times \mathcal{F}(U)$ [6, 30]

\[ s : E \to \mathbb{R}, \tag{1} \]

where $E$ needs to satisfy:

1. $(A, B) \in E$ if and only if $(B, A) \in E$,

2. $(A, B) \in E$ if and only if $(A, 1_U) \in E$.

It is common to assume that higher values of the measure indicate higher similarity of the arguments.

In the case where $E = \mathcal{F}(U) \times \mathcal{F}(U)$, all fuzzy sets are comparable by a given similarity measure. This is not always possible, because some of the standard similarity measures are not defined for certain pairs of fuzzy sets.

Any closed subset $\widetilde{A}$ of $\mathcal{F}(U)$ will be called a Fuzzy Membership Function Family (FMFF) (see [35]). The set $\widetilde{A}$ represents all the possible states that can hide behind uncertain information. An example of an FMFF is given in Figure 1. We denote by $\mathrm{FMFF}(U)$ the set of all FMFFs. This approach is largely inspired by Mendel's representation theorem and his Wavy-Slice representation [16, 17].

The cardinality of fuzzy sets has been extensively discussed in the literature (see [27]). In this paper we will focus on scalar cardinalities of fuzzy sets, which can be characterised by the formula

\[ \sigma_f(A) = \sum_{1 \le i \le n} f(A(u_i)), \tag{2} \]

where $f : [0, 1] \to [0, 1]$ is a weighting function such that $f(0) = 0$, $f(1) = 1$ and $f(a) \le f(b)$ whenever $a \le b$. This approach formalises and reflects the real human counting process under information imprecision [28]. The most common weighting function is the identity function $f_{\mathrm{id}}(x) = x$.
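As an illustration of formula (2), the scalar cardinality can be sketched in a few lines of Python. This is a minimal sketch, assuming a fuzzy set is stored as a plain list of membership grades; the function names and the threshold weighting function are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence


def scalar_cardinality(memberships: Sequence[float],
                       f: Callable[[float], float] = lambda x: x) -> float:
    """Scalar cardinality sigma_f(A): the sum of f(A(u_i)) over the universe.

    The default f is the identity weighting function f_id(x) = x.
    """
    for a in memberships:
        if not 0.0 <= a <= 1.0:
            raise ValueError("membership grades must lie in [0, 1]")
    return sum(f(a) for a in memberships)


def threshold_weight(t: float) -> Callable[[float], float]:
    """Hypothetical weighting function that counts only grades >= t (0 < t <= 1).

    It satisfies f(0) = 0, f(1) = 1 and is non-decreasing, as formula (2) requires.
    """
    return lambda x: 1.0 if x >= t else 0.0


A = [0.0, 0.25, 0.5, 1.0]
print(scalar_cardinality(A))                         # identity weighting
print(scalar_cardinality(A, threshold_weight(0.5)))  # counts grades >= 0.5
```

With the identity weighting the result is the familiar sigma-count; other weighting functions only change how strongly partial memberships contribute.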

3 Similarity

Defining the similarity of epistemic data is a complex problem. For example, it is necessary to decide how to determine the degree of similarity so that it reflects the similarity of information described in an incomplete way. As noted in the Introduction, even the total similarity of incomplete descriptions does not guarantee the similarity of the described phenomena or objects. For this reason, it is necessary to model the similarity by means of a range or subset of values.

By collecting and systematising the properties of many similarity measures [2–5, 8, 11, 14, 15, 18, 20–24, 29, 32–34], it is possible to propose the concept of an uncertainty–aware similarity measure. In the remainder of the paper, examples of such measures are presented along with their basic properties.

Since the similarity of uncertain objects is a more complex problem than classical similarity, the set of properties that must hold is wider.

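Definition 2. A function $\tilde{s} : \widetilde{E} \to \mathcal{P}([0, 1])$ on $\widetilde{E} \subset \mathrm{FMFF}(U) \times \mathrm{FMFF}(U)$ such that:

1. $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$ if and only if $(\widetilde{B}, \widetilde{A}) \in \widetilde{E}$,

2. $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$ if and only if $(\widetilde{A}, \{1_U\}) \in \widetilde{E}$,

3. $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$ if and only if for any fuzzy sets $A \in \widetilde{A}$, $B \in \widetilde{B}$:

\[ (\{A\}, \{B\}) \in \widetilde{E}, \tag{3} \]

is an uncertainty–aware similarity measure if it satisfies the following conditions: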

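(P1) For all $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$,

\[ \tilde{s}(\widetilde{A}, \widetilde{B}) = \tilde{s}(\widetilde{B}, \widetilde{A}). \tag{4} \]

(P2) For $\widetilde{A} = \widetilde{B} = \mathcal{F}(U)$,

\[ \tilde{s}(\widetilde{A}, \widetilde{B}) = [0, 1]. \tag{5} \]

(P3) For all $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$, $(\widetilde{A}, \widetilde{C}) \in \widetilde{E}$ such that $1_X \in \widetilde{A}$, $1_X \in \widetilde{B}$ and $1_{X^c} \in \widetilde{C}$ for some $X \subset U$,

\[ 1 \in \tilde{s}(\widetilde{A}, \widetilde{B}), \tag{6} \]

\[ 0 \in \tilde{s}(\widetilde{A}, \widetilde{C}). \tag{7} \]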


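(P4) For all fuzzy sets $A, B \in \mathcal{F}(U)$ such that $(\{A\}, \{B\}) \in \widetilde{E}$,

\[ \tilde{s}(\{A\}, \{B\}) = \{a\}, \quad \text{for some } a \in [0, 1]. \tag{8} \]

(P5) For all $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$ and for any $A, B \in \mathcal{F}(U)$ such that $A \in \widetilde{A}$, $B \in \widetilde{B}$,

\[ \tilde{s}(\{A\}, \{B\}) \subset \tilde{s}(\widetilde{A}, \widetilde{B}). \tag{9} \]

(P6) For any $(\widetilde{A}, \widetilde{C}) \in \widetilde{E}$, $(\widetilde{B}, \widetilde{D}) \in \widetilde{E}$ such that $\widetilde{A} \subset \widetilde{B}$ and $\widetilde{C} \subset \widetilde{D}$,

\[ \tilde{s}(\widetilde{A}, \widetilde{C}) \subset \tilde{s}(\widetilde{B}, \widetilde{D}). \tag{10} \]

(P7) For all $(\widetilde{A}, \widetilde{B}) \in \widetilde{E}$ such that $0 \in \tilde{s}(\widetilde{A}, \widetilde{B})$ there exist $A \in \widetilde{A}$ and $B \in \widetilde{B}$ such that $\sigma(A \cap B) = 0$.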

To support the choice of the properties (P1)-(P7), let us now discuss them briefly.

The first property, symmetry, is a common and widely accepted condition for every similarity measure, and it remains so in the presence of uncertainty. The next properties should be considered taking into account the specificity of epistemic information. Thus, (P2) requires that no information implies no conclusions: when comparing totally unknown objects, the similarity should also remain unknown. On the other hand, if the information is complete (each FMFF reduces to a single FS), then the similarity should also be completely known (without uncertainty); that is the meaning of property (P4).

By (P3) we make two observations. If two FMFFs share a common crisp state, then 1 must be a possible value of their similarity (6). On the other hand, if two FMFFs are to some extent inconsistent (they contain complementary crisp states), then 0 should be a possible value of their similarity (7). In general, when the degree of uncertainty of two FMFFs decreases, the set of possible similarity values shrinks as well; see (P6). Consequently, for any pair of possible states, their similarity belongs to the similarity of the FMFFs (P5).

Finally, (P7) indicates that if 0 is one of the possible values of the similarity of two FMFFs, then there exist two states that are disjoint.

4 Uncertainty–aware similarity measures for IVFS

Interval–valued fuzzy set (IVFS) theory, which is a special case of type-2 fuzzy set theory, was introduced by Zadeh [31]. Let $\mathcal{I}([0, 1])$ be the set of all closed subintervals of $[0, 1]$. A mapping $\hat{A} : U \to \mathcal{I}([0, 1])$ is called an interval–valued fuzzy set. For each $1 \le i \le n$, the value $\hat{A}(u_i) = [\underline{A}(u_i), \overline{A}(u_i)] \in \mathcal{I}([0, 1])$ represents the membership of an element $u_i$ in $\hat{A}$. Usually $\underline{A}$ and $\overline{A}$ are called the lower and upper membership functions of $\hat{A}$, respectively. In the epistemic approach, the interval $\hat{A}(u_i)$ is understood to contain the true membership degree of $u_i$ in some incompletely known fuzzy set $A$ represented by $\hat{A}$. We denote the set of all interval–valued fuzzy sets in $U$ by $\mathcal{IV}(U)$.

Most known extensions of fuzzy sets can be accurately represented using FMFFs. An Interval–Valued Fuzzy Set $\hat{A}$ can also be viewed as the following FMFF (see Figure 1):

\[ \widetilde{A} = \left\{ A \in \mathcal{F}(U) : \forall_{x \in U}\ \underline{A}(x) \le A(x) \le \overline{A}(x) \right\}. \tag{11} \]

Referring to Mendel's Wavy-Slice representation theorem [16, 17], in the case of Interval–Valued Fuzzy Sets the FMFF is equivalent to the FOU (Footprint of Uncertainty):

\[ \widetilde{A} = \mathrm{FOU}(\hat{A}). \tag{12} \]

Since there is a one-to-one correspondence between $\hat{A}$ and $\widetilde{A}$, those two representations will be used interchangeably.
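The correspondence between an IVFS and its FMFF in (11) can be illustrated with a small Python sketch. It assumes a finite universe and stores the lower and upper membership functions as tuples; the class name `IVFS` and the method `contains` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class IVFS:
    """Interval-valued fuzzy set on a finite universe (illustrative sketch)."""
    lower: Tuple[float, ...]  # lower membership function
    upper: Tuple[float, ...]  # upper membership function

    def __post_init__(self) -> None:
        if len(self.lower) != len(self.upper):
            raise ValueError("lower and upper must have the same length")
        for lo, hi in zip(self.lower, self.upper):
            if not 0.0 <= lo <= hi <= 1.0:
                raise ValueError("need 0 <= lower <= upper <= 1 for every element")

    def contains(self, fuzzy_set: Sequence[float]) -> bool:
        """Check whether a fuzzy set is a possible state, i.e. lies in the FMFF (11)."""
        return len(fuzzy_set) == len(self.lower) and all(
            lo <= a <= hi
            for lo, a, hi in zip(self.lower, fuzzy_set, self.upper)
        )


A_hat = IVFS(lower=(0.1, 0.3), upper=(0.8, 0.3))
print(A_hat.contains([0.5, 0.3]))  # a possible state
print(A_hat.contains([0.9, 0.3]))  # outside the interval for the first element
```

The FMFF itself is an infinite set, so the sketch only offers a membership test rather than an enumeration of states.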

In the previous sections, we presented the properties of an uncertainty–aware similarity measure for general epistemic data represented by FMFFs. In the following we propose a method of constructing such measures for IVFSs. Note that each function $f : X \to Y$ can be extended to set-valued data $A \subset X$ in the following way:

\[ f(A) = \{ f(a) : a \in A \} \subset Y. \tag{13} \]

This approach can be used to obtain new similarity measures for uncertain data.
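For a finite set of values, the set-valued extension (13) is simply the image of the set under the function. The following minimal sketch assumes finite input sets; the helper name `set_extension` is hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set, TypeVar

X = TypeVar("X")
Y = TypeVar("Y", bound=Hashable)


def set_extension(f: Callable[[X], Y], values: Iterable[X]) -> Set[Y]:
    """Image f(A) = {f(a) : a in A} of a finite set A, as in formula (13)."""
    return {f(a) for a in values}


# Two different inputs may map to the same output, so the image can be smaller.
print(set_extension(lambda a: a * a, {-1, 0, 1}))
```

For IVFSs the input set is an infinite family of fuzzy sets, which is why the paper turns to infimum and supremum formulas instead of enumeration.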

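Definition 3. Let $s : E \to [0, 1]$ be a similarity measure of fuzzy sets. A function $\tilde{s} : \widetilde{E} \to \mathcal{P}([0, 1])$ can be defined in the following way:

\[ \tilde{s}(\hat{A}, \hat{B}) = \left\{ s(A, B) : A \in \widetilde{A},\ B \in \widetilde{B} \right\}, \tag{14} \]

where

\[ \widetilde{E} = \left\{ (\hat{A}, \hat{B}) \in \mathcal{IV}(U) \times \mathcal{IV}(U) : \widetilde{A} \times \widetilde{B} \subset E \right\}. \tag{15} \]

Sometimes it may be useful to represent a fuzzy set $A$ on a finite universe $U$ as a vector $x_A = \left( A(u_1), \cdots, A(u_{|U|}) \right) \in [0, 1]^{|U|}$.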

Definition 4. A fuzzy set similarity measure $s : E \to [0, 1]$ is called continuous if the function $f : X \to [0, 1]$ defined on

\[ X = \left\{ (x_A, x_B) \in [0, 1]^{2|U|} : (A, B) \in E \right\} \tag{16} \]

as

\[ f(x_A, x_B) = s(A, B) \tag{17} \]

is continuous in the whole domain.


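Lemma 1. For each continuous fuzzy set similarity measure $s : E \to [0, 1]$, the function $\tilde{s}$ from Definition 3 can be simplified to

\[ \tilde{s}(\hat{A}, \hat{B}) = \left[ \inf_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} s(A, B),\ \sup_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} s(A, B) \right]. \tag{18} \]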
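To make the interval in (18) concrete, the sketch below approximates it numerically by evaluating a given fuzzy set similarity measure on sampled possible states. This is only an illustration under stated assumptions, not the construction used in the paper: an IVFS is assumed to be a list of `(lower, upper)` pairs, the function names are hypothetical, and random sampling yields an inner approximation, that is, a subinterval of the true result.

```python
import random
from typing import Callable, List, Sequence, Tuple

Interval = Tuple[float, float]
FuzzySet = List[float]


def approx_similarity_interval(
    s: Callable[[FuzzySet, FuzzySet], float],
    A_hat: Sequence[Interval],
    B_hat: Sequence[Interval],
    n_samples: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Inner approximation of [inf s(A, B), sup s(A, B)] over possible states."""
    rng = random.Random(seed)

    def extreme_states(ivfs: Sequence[Interval]) -> List[FuzzySet]:
        # The lower and upper membership functions are themselves possible states.
        return [[lo for lo, _ in ivfs], [hi for _, hi in ivfs]]

    def random_state(ivfs: Sequence[Interval]) -> FuzzySet:
        return [rng.uniform(lo, hi) for lo, hi in ivfs]

    values = [s(A, B)
              for A in extreme_states(A_hat)
              for B in extreme_states(B_hat)]
    for _ in range(n_samples):
        values.append(s(random_state(A_hat), random_state(B_hat)))
    return min(values), max(values)


def mean_abs_similarity(A: FuzzySet, B: FuzzySet) -> float:
    """Example measure: one minus the mean absolute difference of grades."""
    return 1.0 - sum(abs(a - b) for a, b in zip(A, B)) / len(A)


# The example from the Introduction: two identical descriptions [0.1, 0.8].
print(approx_similarity_interval(mean_abs_similarity, [(0.1, 0.8)], [(0.1, 0.8)]))
```

For the example from the Introduction the result is approximately [0.3, 1]: identical descriptions do not force the described objects to be identical. The cost of sampling grows quickly with the size of the universe, which motivates the closed-form results in Section 5.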

Theorem 2. For each continuous fuzzy set similarity measure $s : E \to [0, 1]$ that satisfies

1. for each $(A, B) \in E$, we have $s(A, B) = s(B, A)$,

2. for each $(A, B) \in E$, if $s(A, B) = 0$ then $\sigma(A \cap B) = 0$,

3. for each $X \subset U$ such that $(1_X, 1_{X^c}) \in E$ we have $s(1_X, 1_{X^c}) = 0$ and $s(1_X, 1_X) = 1$,

the function $\tilde{s}$ from Definition 3 is an uncertainty–aware similarity measure of IVFSs in the sense of Definition 2.

The proof was given in [34].

5 Extensions of popular similarity measures

This section presents the extensions of known similarity measures to their uncertainty–aware versions obtained using Definition 3. As will be shown, the calculation of some of the generated similarity measures is computationally difficult, while other measures can be calculated using simple formulas. A particularly interesting case is the Jaccard index.

5.1 Distance based similarity measures

Similarity measures based on a distance metric meet the assumptions of Theorem 2. Consequently, their extensions are uncertainty–aware similarity measures.

5.1.1 Minkowski distance

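The extensions of similarity measures based on the generalized Minkowski metric are relatively simple to calculate. By using (18) we get the following measure of similarity:

\[ \tilde{s}_{d_r}(\hat{A}, \hat{B}) = \left[ \inf_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \left( 1 - \frac{d_r(A, B)}{|U|} \right),\ \sup_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \left( 1 - \frac{d_r(A, B)}{|U|} \right) \right], \tag{19} \]

which is defined for all pairs of interval–valued fuzzy sets. For simplicity, let us denote this interval by $[a, b]$. Then we can transform the lower limit value in the following way:

\[
\begin{aligned}
a &= \inf_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \left( 1 - \frac{d_r(A, B)}{|U|} \right) = 1 - \sup_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \frac{d_r(A, B)}{|U|} \\
&= 1 - \frac{1}{|U|} \sup_{\substack{u \in U \\ a_u \in \hat{A}(u) \\ b_u \in \hat{B}(u)}} \left( \sum_{u \in U} |a_u - b_u|^r \right)^{\frac{1}{r}} \\
&= 1 - \frac{1}{|U|} \left( \sum_{u \in U} \max\left\{ |\underline{A}(u) - \overline{B}(u)|,\ |\overline{A}(u) - \underline{B}(u)| \right\}^r \right)^{\frac{1}{r}}.
\end{aligned}
\tag{20}
\]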

Analogous transformations lead to a direct formula for the upper limit $b$. This formula is a simple sum and no numerical optimisation is needed to calculate it. Another important observation is that this formula takes into account only the values of the lower and upper membership functions of the IVFSs $\hat{A}$ and $\hat{B}$. Hence, these measures are computationally efficient, even though their construction is based on a computationally inefficient method that takes into account the infinite FMFF.
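The closed form (20) translates directly into code. The sketch below computes both limits for a pair of IVFSs given as lists of `(lower, upper)` pairs. The lower limit follows (20); the upper limit is not spelled out in the text, so the expression used here (the smallest possible distance between two intervals is the gap between them, or zero if they overlap) is our reading of the "analogous transformation" and should be treated as an assumption. The function name is hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def minkowski_similarity_interval(
    A_hat: Sequence[Interval],
    B_hat: Sequence[Interval],
    r: float = 1.0,
) -> Tuple[float, float]:
    """Interval [a, b] of 1 - d_r(A, B) / |U| over all possible states."""
    if not A_hat or len(A_hat) != len(B_hat):
        raise ValueError("both IVFSs must be defined on the same non-empty universe")
    if r < 1:
        raise ValueError("r must be at least 1")
    n = len(A_hat)
    farthest = 0.0  # sum of r-th powers of the largest per-element distances
    nearest = 0.0   # sum of r-th powers of the smallest per-element distances
    for (a_lo, a_hi), (b_lo, b_hi) in zip(A_hat, B_hat):
        largest = max(abs(a_lo - b_hi), abs(a_hi - b_lo))  # term used in (20)
        smallest = max(0.0, a_lo - b_hi, b_lo - a_hi)      # assumption: interval gap
        farthest += largest ** r
        nearest += smallest ** r
    lower = 1.0 - farthest ** (1.0 / r) / n
    upper = 1.0 - nearest ** (1.0 / r) / n
    return lower, upper


# Two identical descriptions [0.1, 0.8] on a one-element universe.
print(minkowski_similarity_interval([(0.1, 0.8)], [(0.1, 0.8)]))
# A two-element example with disjoint intervals on the first element.
print(minkowski_similarity_interval([(0.2, 0.4), (0.5, 0.5)],
                                    [(0.6, 0.9), (0.5, 0.5)]))
```

Both limits are obtained in a single pass over the universe, in line with the observation that only the lower and upper membership functions are needed.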

In a very similar way we can obtain extensions of similarity measures based on the $d_\infty$ distance. The formulas obtained are analogous and have the same computational complexity.

5.1.2 Other distances

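The similarity measure can also be defined as the cosine of the angle between two vectors:

\[ s_{\cos\theta}(A, B) = \frac{\sum_{u \in U} \mu_A(u)\, \mu_B(u)}{\sqrt{\sum_{u \in U} \mu_A(u)^2}\ \sqrt{\sum_{u \in U} \mu_B(u)^2}}. \tag{21} \]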

When either of the fuzzy sets is empty, the measure is not defined, so

\[ E = \left\{ (A, B) \in \mathcal{F}(U) \times \mathcal{F}(U) : A \ne 1_\emptyset\ \&\ B \ne 1_\emptyset \right\}. \tag{22} \]

The measure $s_{\cos\theta}(A, B)$ takes its largest value when the angle between the vectors representing the fuzzy sets is $0^\circ$. This similarity measure is used when it is particularly important to compare the shape of the membership functions of fuzzy sets. The cosine of the angle is not the only angular measure. Many modifications of (21) can be found in the literature [12].

Unlike in the case of the Minkowski distance, the extension of the above definition cannot easily be simplified. The problem of constructing an algorithm that enables the efficient calculation of such a similarity measure is still open.
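Since no closed form is available, one can only approximate the extended cosine measure numerically. The sketch below implements (21) and a brute-force grid search over possible states; it is meant to illustrate why the problem is hard (the number of evaluations grows exponentially with the size of the universe), not to propose a solution. The function names and the grid resolution `k` are hypothetical, and the result is an inner approximation of the true interval.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def cosine_similarity(A: Sequence[float], B: Sequence[float]) -> float:
    """Cosine of the angle between membership vectors, formula (21)."""
    dot = sum(a * b for a, b in zip(A, B))
    norm_a = math.sqrt(sum(a * a for a in A))
    norm_b = math.sqrt(sum(b * b for b in B))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("the measure is undefined for an empty fuzzy set")
    return dot / (norm_a * norm_b)


def grid(interval: Interval, k: int) -> List[float]:
    """k evenly spaced points of an interval (one point if it is degenerate)."""
    lo, hi = interval
    if k < 2 or lo == hi:
        return [lo]
    return [lo + (hi - lo) * i / (k - 1) for i in range(k)]


def approx_cosine_interval(
    A_hat: Sequence[Interval], B_hat: Sequence[Interval], k: int = 5
) -> Tuple[float, float]:
    """Grid-search inner approximation; cost is about k ** (2 * |U|) evaluations."""
    for ivfs in (A_hat, B_hat):
        if all(lo == 0.0 for lo, _ in ivfs):
            # The empty fuzzy set is a possible state, so by (15) the pair
            # lies outside the domain of the extended measure.
            raise ValueError("the empty fuzzy set is a possible state")
    states_a = list(itertools.product(*(grid(iv, k) for iv in A_hat)))
    states_b = list(itertools.product(*(grid(iv, k) for iv in B_hat)))
    values = [cosine_similarity(A, B) for A in states_a for B in states_b]
    return min(values), max(values)


A_hat = [(0.5, 1.0), (0.0, 0.5)]
print(approx_cosine_interval(A_hat, A_hat))
```

Even for this two-element universe with five grid points per interval the search evaluates 625 pairs of states.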


5.2 Logic based similarity measures

Logic–based measures [9, 10] interpret the membership function of a fuzzy set as the degree of truth of the proposition represented by this fuzzy set. The basic method assumes the use of an implication operator, which allows constructing both measures of inclusion and measures of similarity. In classical logic, the implication operator can be defined in several equivalent ways. The generalization to fuzzy logic, where infinitely many degrees of truth are admitted, resulted in many non-equivalent definitions of the concept. The most frequently used implication operators are S–implications and R–implications [1, 25].

The simplest co-implication operator is defined as:

\[ \Psi(a, b) = \min(a \Rightarrow b,\ b \Rightarrow a). \tag{23} \]

The similarity measure is then defined as the minimum, average or maximum value obtained over all elements of the universe. The most interesting is the case of the average, where the similarity measure is defined as

\[ s_\Psi(A, B) = \frac{1}{|U|} \sum_{u \in U} \Psi(\mu_A(u), \mu_B(u)). \tag{24} \]

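For simplicity, we denote by $[a, b]$ the similarity value returned by the measure extended according to Definition 3. Then the value of the lower bound can be transformed as follows:

\[
\begin{aligned}
a &= \inf_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \frac{1}{|U|} \sum_{u \in U} \Psi(\mu_A(u), \mu_B(u)) \\
&= \frac{1}{|U|} \sum_{u \in U}\ \inf_{\substack{\underline{\mu_A}(u) \le x \le \overline{\mu_A}(u) \\ \underline{\mu_B}(u) \le y \le \overline{\mu_B}(u)}} \Psi(x, y).
\end{aligned}
\tag{25}
\]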

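Similarly, we can convert the upper limit. This simple transformation allows for a significant simplification of the problem of calculating the infimum and supremum. Instead of optimizing the value of the whole sum, it is enough to examine how the co-implication operator behaves. For example, for the Łukasiewicz implication operator, we get the following equality:

\[
\begin{aligned}
\inf_{\substack{\underline{\mu_A}(u) \le x \le \overline{\mu_A}(u) \\ \underline{\mu_B}(u) \le y \le \overline{\mu_B}(u)}} \Psi_{\mathrm{Luk}}(x, y)
&= \inf_{\substack{\underline{\mu_A}(u) \le x \le \overline{\mu_A}(u) \\ \underline{\mu_B}(u) \le y \le \overline{\mu_B}(u)}} \min\{1,\ 1 - x + y,\ 1 - y + x\} \\
&= \min\{1,\ 1 - \overline{\mu_A}(u) + \underline{\mu_B}(u),\ 1 - \overline{\mu_B}(u) + \underline{\mu_A}(u)\}.
\end{aligned}
\]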

Thanks to this property, the extended similarity measure $\tilde{s}_\Psi$ can be calculated directly, without the need for numerical optimization techniques.
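The Łukasiewicz case can be sketched as follows, with an IVFS again assumed to be a list of `(lower, upper)` pairs. The lower limit implements (25) with the per-element equality given above. The upper limit is our own analogous derivation and should be read as an assumption: since $\Psi_{\mathrm{Luk}}(x, y) = 1 - |x - y|$, its supremum over two intervals is one minus the gap between them. The function names are hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def psi_lukasiewicz(x: float, y: float) -> float:
    """Operator (23) built from the Lukasiewicz implication min(1, 1 - x + y)."""
    return min(1.0, 1.0 - x + y, 1.0 - y + x)


def lukasiewicz_similarity_interval(
    A_hat: Sequence[Interval], B_hat: Sequence[Interval]
) -> Tuple[float, float]:
    """Interval [a, b] of the average-based measure (24) over possible states."""
    if not A_hat or len(A_hat) != len(B_hat):
        raise ValueError("both IVFSs must be defined on the same non-empty universe")
    lower_sum = 0.0
    upper_sum = 0.0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(A_hat, B_hat):
        # Per-element infimum, as in the equality following (25).
        lower_sum += min(1.0, 1.0 - a_hi + b_lo, 1.0 - b_hi + a_lo)
        # Per-element supremum (assumption): one minus the gap between intervals.
        upper_sum += 1.0 - max(0.0, a_lo - b_hi, b_lo - a_hi)
    n = len(A_hat)
    return lower_sum / n, upper_sum / n


print(lukasiewicz_similarity_interval([(0.2, 0.4), (0.5, 0.5)],
                                      [(0.6, 0.9), (0.5, 0.5)]))
```

Because $\Psi_{\mathrm{Luk}}(x, y) = 1 - |x - y|$, the resulting interval coincides with the one obtained from the Minkowski-based measure (19) for $r = 1$.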

Unfortunately, the problem of calculating the measure $\tilde{s}_\Psi$ is not so simple in the general case. For the two basic families of implication operators, S–implications and R–implications, no simple way to directly calculate the similarity measure is known. It is an interesting and still open problem for further research, but it is outside the scope of this paper.

5.3 Set theory based similarity measures

The Jaccard index is one of the most commonly used similarity measures. It formalizes the observation that the more common elements and the fewer differing elements two sets have, the more similar they are. As a reminder, the Jaccard index for fuzzy sets is defined as

\[ s_J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \quad \text{where } |A \cup B| \ne 0. \tag{26} \]

Because $A \cap B \subset A \cup B$, the Jaccard index can be viewed as the ratio of the number of common elements of $A$ and $B$ to the number of all elements in $A$ or $B$. Another look at the Jaccard index comes from the observation that

\[ s_J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|(A \cup B) \cap (A \cap B)|}{|A \cup B|} = \sigma(A \cap B \mid A \cup B). \tag{27} \]

According to the interpretation of relative cardinality as a degree of inclusion, the degree of similarity between two fuzzy sets is defined as the degree to which the fuzzy set $A \cup B$ is contained in $A \cap B$. Of course, the opposite inclusion always holds. This approach is also interesting because it refers to the concept of inclusion of fuzzy sets (in this case defined as relative cardinality).

Both proposed interpretations can be generalized using a t-norm $T$ and a t-conorm $S$. In addition, the cardinality of the fuzzy set can also be defined using any weighting function $f$. In this way, we get the following two definitions of the generalized Jaccard index:

\[ s'_{T,S,f}(A, B) = \frac{\sigma_f(A \cap_T B)}{\sigma_f(A \cup_S B)}, \tag{28} \]

\[ s''_{T,S,f}(A, B) = \sigma_{f,T}(A \cap_T B \mid A \cup_S B) = \frac{\sigma_f((A \cup_S B) \cap_T (A \cap_T B))}{\sigma_f(A \cup_S B)}. \tag{29} \]

It should be noted that neither generalization is defined in the case where $\sigma_f(A \cup_S B) = 0$. Unfortunately, an unambiguous definition of the similarity value in this case is not possible. Thus, the domain of the similarity measures can be defined as follows:

\[ E_{S,f} = \{ (A, B) \in \mathcal{F}(U) \times \mathcal{F}(U) : \sigma_f(A \cup_S B) \ne 0 \}. \tag{30} \]
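The two generalizations (28) and (29) can be compared with a short Python sketch for ordinary fuzzy sets stored as lists of membership grades. The function names are hypothetical. The probabilistic sum used as the second t-conorm is not mentioned in the paper; it is the standard dual of the product t-norm and is included here only as an illustrative assumption.

```python
from typing import Callable, List, Sequence

BinaryOp = Callable[[float, float], float]
Weight = Callable[[float], float]


def t_min(x: float, y: float) -> float:
    return min(x, y)


def t_prod(x: float, y: float) -> float:
    return x * y


def s_max(x: float, y: float) -> float:
    return max(x, y)


def s_prob(x: float, y: float) -> float:
    """Probabilistic sum, the t-conorm dual to the product (assumed example)."""
    return x + y - x * y


def identity(x: float) -> float:
    return x


def _parts(A: Sequence[float], B: Sequence[float], T: BinaryOp, S: BinaryOp):
    inter: List[float] = [T(a, b) for a, b in zip(A, B)]
    union: List[float] = [S(a, b) for a, b in zip(A, B)]
    return inter, union


def jaccard_prime(A, B, T: BinaryOp, S: BinaryOp, f: Weight = identity) -> float:
    """Generalized Jaccard index s' from (28)."""
    inter, union = _parts(A, B, T, S)
    denom = sum(f(v) for v in union)
    if denom == 0:
        raise ValueError("pair lies outside the domain E_{S,f} from (30)")
    return sum(f(v) for v in inter) / denom


def jaccard_double_prime(A, B, T: BinaryOp, S: BinaryOp, f: Weight = identity) -> float:
    """Generalized Jaccard index s'' from (29), based on relative cardinality."""
    inter, union = _parts(A, B, T, S)
    denom = sum(f(v) for v in union)
    if denom == 0:
        raise ValueError("pair lies outside the domain E_{S,f} from (30)")
    nested = [T(u, i) for u, i in zip(union, inter)]
    return sum(f(v) for v in nested) / denom


A = [0.2, 0.7, 1.0]
B = [0.5, 0.7, 0.0]
print(jaccard_prime(A, B, t_min, s_max), jaccard_double_prime(A, B, t_min, s_max))
print(jaccard_prime(A, B, t_prod, s_prob), jaccard_double_prime(A, B, t_prod, s_prob))
```

With minimum and maximum the two values agree, while for the product and probabilistic sum they differ, because intersecting the union with the intersection under the product t-norm shrinks the numerator further.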


For any t-conorm $S$ and weighting function $f$, $E_{S,f}$ satisfies both conditions required for the domain of a similarity measure given in Definition 1. In addition, many specific pairs of fuzzy sets belong to this family. For example, for any non-empty set $X \subset U$ both $(1_X, 1_X) \in E_{S,f}$ and $(1_X, 1_{X^c}) \in E_{S,f}$, which results directly from the properties of t-operations and operations on fuzzy sets. One special case seems particularly important, namely when

\[ E_{S,f} = \mathcal{F}(U) \times \mathcal{F}(U) \setminus \{(\emptyset, \emptyset)\} = E_\emptyset, \tag{31} \]

which occurs, for example, for $E_{S_{\max}, f_{\mathrm{id}}}$.

The two similarity measures are equivalent for both classical and fuzzy sets when the intersection and union of sets are defined using $T_{\min}$ and $S_{\max}$ with the identity weighting function. For other t-operations this equality may not hold. Therefore, in the remainder of this section, we consider them separately. As it turns out, they have very similar but not identical properties.

Theorem 3. The fuzzy set similarity measure $s'_{T,S,f}$ meets the assumptions of Theorem 2. In addition, if the t-norm $T$ does not have zero divisors and $\forall_{x \in (0,1]}\ f(x) > 0$, then $s''_{T,S,f}$ also has this property.

Remark 1. The classic fuzzy Jaccard index

\[ s'_{T_{\min}, S_{\max}, f_{\mathrm{id}}}(A, B) = \frac{|A \cap_{\min} B|}{|A \cup_{\max} B|} \tag{32} \]

satisfies the assumptions of Theorem 2.

Theorem 4. The fuzzy set similarity measures $s'_{T,S,f}$ and $s''_{T,S,f}$ are continuous if the functions $T$, $S$, $f$ are continuous in their entire domains.

As with the angular distance, the formulas for the lower and upper limits cannot easily be simplified. The measure $s''_{T,S,f}$, however, is defined through the generalized relative cardinality of fuzzy sets, which makes it possible to simplify the problem of its efficient calculation. For this purpose, we will use the concept of generalized relative cardinality of interval–valued fuzzy sets [36].

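Definition 5. The function $\sigma_{f,T} : E_f \to \mathcal{I}([0, 1])$ induced by a continuous t-norm $T$ and a continuous weighting function $f$, where

\[ \sigma_{f,T}(\hat{A} \mid \hat{B}) = \left[ \inf_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \sigma_{f,T}(A \mid B),\ \sup_{\substack{A \in \widetilde{A} \\ B \in \widetilde{B}}} \sigma_{f,T}(A \mid B) \right] \]

and

\[ E_f = \left\{ (\hat{A}, \hat{B}) \in \mathcal{IV}(U) \times \mathcal{IV}(U) : \forall_{B \in \widetilde{B}}\ \sigma_f(B) > 0 \right\}, \tag{33} \]

is called the generalized interval–valued relative cardinality.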

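One can see the following relationship between the extended Jaccard index $\tilde{s}''_{T,S,f}$ and the generalized interval–valued relative cardinality $\sigma_{f,T}$:

\[
\begin{aligned}
\tilde{s}''_{T,S,f}(\hat{A}, \hat{B})
&= \left[ \inf_{\substack{A \in \widetilde{A},\ B \in \widetilde{B} \\ A' = A \cap_T B \\ B' = A \cup_S B}} \sigma_{f,T}(A' \mid B'),\ \sup_{\substack{A \in \widetilde{A},\ B \in \widetilde{B} \\ A' = A \cap_T B \\ B' = A \cup_S B}} \sigma_{f,T}(A' \mid B') \right] \\
&= \sigma_{f,T}(\hat{A} \cap_T \hat{B} \mid \hat{A} \cup_S \hat{B}).
\end{aligned}
\tag{34}
\]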

Thanks to the above equation, the value of the extended Jaccard index can be calculated by means of the generalized interval–valued relative cardinality. An efficient solution of this computational problem was given in [36]. Unfortunately, the efficient calculation of $\tilde{s}'_{T,S,f}$ in the general case remains an open problem. However, it should be mentioned that for the extension of the classical Jaccard index, efficient algorithms enabling its calculation in $O(n \log n)$ time were given in [19, 26].

6 Conclusions

There are two main contributions made in this paper. The first one is collecting and systematizing a set of properties for uncertainty–aware similarity measures. Our approach gives a full picture of the similarity of incompletely known information and allows reasoning about the amount of uncertainty of the compared objects, and thus informs about the quality of the comparison. The second result, presented in Sections 4 and 5, is the observation that an uncertainty–aware similarity measure may be constructed from fuzzy similarity measures under certain conditions. In this way we have opened a new path for constructing similarity measures. We showed that this path is promising, since we have already managed to obtain some interesting results by restricting our considerations to IVFSs. However, it must be mentioned that processing general epistemic data requires much more research to be computationally feasible.

Acknowledgement

This work was partially supported by the Polish National Science Centre grant number 2016/21/N/ST6/00316.

References

[1] M. Baczynski, B. Jayaram, Fuzzy Implications, Springer, Heidelberg, 2008.

[2] H. Bustince, Indicator of inclusion grade for interval-valued fuzzy sets. Application to approximate reasoning based on interval-valued fuzzy sets, International Journal of Approximate Reasoning 23 (3) (2000) 137–209.

[3] H. Bustince, P. Burillo, Vague sets are intuitionistic fuzzy sets, Fuzzy Sets and Systems 79 (3) (1996) 403–405.

[4] S.-M. Chen, Measures of similarity between vague sets, Fuzzy Sets and Systems 74 (2) (1995) 217–223.

[5] I. Couso, L. Garrido, L. Sanchez, Similarity and dissimilarity measures between fuzzy sets: A formal relational study, Information Sciences 229 (2013) 122–141.

[6] V. V. Cross, T. A. Sudkamp, Similarity and Compatibility in Fuzzy Set Theory. Assessment and Applications, Physica-Verlag, Heidelberg, 2002.

[7] D. Dubois, H. Prade, Gradualness, uncertainty and bipolarity: Making sense of fuzzy sets, Fuzzy Sets and Systems 192 (2012) 3–24.

[8] W.-L. Gau, D. J. Buehrer, Vague sets, IEEE Transactions on Systems, Man and Cybernetics 23 (2) (1993) 610–614.

[9] K. Hirota, W. Pedrycz, Handling fuzziness and randomness in process of matching fuzzy data, in: Proceedings of the Third IFSA Congress, Seattle, USA, 1989, pp. 97–100.

[10] K. Hirota, W. Pedrycz, Matching fuzzy quantities, IEEE Transactions on Systems, Man and Cybernetics 21 (6) (1991) 1580–1586.

[11] I. Jenhani, S. Benferhat, Z. Elouedi, Possibilistic similarity measures, in: B. Bouchon-Meunier, L. Magdalena, et al. (Eds.), Foundations of Reasoning under Uncertainty, Springer-Verlag, Heidelberg, 2010, pp. 99–123.

[12] T. Kailath, The divergence and Bhattacharyya distance measures in signal selection, IEEE Transactions on Communication Technology 15 (1) (1967) 52–60.

[13] E. P. Klement, R. Mesiar, E. Pap, Triangular norms, Vol. 8 of Trends in Logic, Springer, 2000.

[14] Z. Liang, P. Shi, Similarity measures on intuitionistic fuzzy sets, Pattern Recognition Letters 24 (15) (2003) 2687–2693.

[15] X. Luo, C. Zhang, An axiom foundation for uncertain reasonings in rule-based expert systems: NT-algebra, Knowledge and Information Systems 1 (4) (1999) 415–433.

[16] J. M. Mendel, Tutorial on the uses of the interval type-2 fuzzy set's Wavy Slice Representation Theorem, in: Proceedings of Annual Meeting of the North American Fuzzy Information Processing Society (NAFIPS), New York, USA, 2008, pp. 1–6.

[17] J. M. Mendel, R. I. John, F. Liu, Interval type-2 fuzzy logic systems made simple, IEEE Transactions on Fuzzy Systems 14 (6) (2006) 808–821.

[18] H. B. Mitchell, On the Dengfeng–Chuntian similarity measure and its application to pattern recognition, Pattern Recognition Letters 24 (16) (2003) 3101–3104.

[19] H. T. Nguyen, V. Kreinovich, Computing degrees of subsethood and similarity for interval-valued fuzzy sets: fast algorithms, Tech. Rep. 94, Department of Computer Science, UTEP (2008).

[20] A. Stachowiak, P. Żywica, K. Dyczkowski, A. Wojtowicz, An Interval-Valued Fuzzy Classifier Based on an Uncertainty-Aware Similarity Measure, in: P. Angelov, K. T. Atanassov, et al. (Eds.), Intelligent Systems' 2014. Volume 1: Mathematical Foundations, Theory, Analyses, Springer, Switzerland, 2015, pp. 741–751.

[21] E. Szmidt, Distances and Similarities in Intuitionistic Fuzzy Sets, Springer, Switzerland, 2014.

[22] E. Szmidt, J. Kacprzyk, Distances between intuitionistic fuzzy sets, Fuzzy Sets and Systems 114 (3) (2000) 505–518.

[23] E. Szmidt, J. Kacprzyk, Entropy for intuitionistic fuzzy sets, Fuzzy Sets and Systems 118 (3) (2001) 467–477.

[24] E. Szmidt, J. Kacprzyk, A measure of similarity for intuitionistic fuzzy sets, in: Proceedings of EUSFLAT Conference, Zittau, Germany, 2003, pp. 206–209.

[25] E. Trillas, L. Valverde, On mode and implication in approximate reasoning, in: M. Gupta, A. Kandel (Eds.), Approximate reasoning in expert systems, North-Holland, Amsterdam, 1985, pp. 157–166.

[26] D. Wu, J. M. Mendel, Efficient algorithms for computing a class of subsethood and similarity measures for interval type-2 fuzzy sets, in: Proceedings of IEEE International Conference on Fuzzy Systems (FUZZ), Barcelona, Spain, 2010, pp. 1–7.

[27] M. Wygralak, Cardinalities of fuzzy sets, Springer, Heidelberg, 2003.

[28] M. Wygralak, Intelligent Counting Under Information Imprecision, Vol. 292 of Studies in Fuzziness and Soft Computing, Springer, 2013.

[29] L. Xuecheng, Entropy, distance measure and similarity measure of fuzzy sets and their relations, Fuzzy Sets and Systems 52 (3) (1992) 305–318.

[30] L. A. Zadeh, Similarity relations and fuzzy orderings, Information Sciences 3 (2) (1971) 177–200.


[31] L. A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning–I, Information Sciences 8 (3) (1975) 199–249.

[32] W. Zeng, H. Li, Relationship between similarity measure and entropy of interval valued fuzzy sets, Fuzzy Sets and Systems 157 (11) (2006) 1477–1484.

[33] H. Zhang, W. Zhang, C. Mei, Entropy of interval-valued fuzzy sets based on distance and its relationship with similarity measure, Knowledge-Based Systems 22 (6) (2009) 449–454.

[34] P. Żywica, Similarity measures of Interval–Valued Fuzzy Sets in classification of uncertain data. Applications in the diagnosis of ovarian tumors, Ph.D. thesis, Adam Mickiewicz University, Poznań, Poland, (in Polish) (2016).

[35] P. Żywica, Modelling medical uncertainties with use of fuzzy sets and their extensions, in: International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Springer, Cham, 2018, pp. 369–380.

[36] P. Żywica, A. Stachowiak, M. Wygralak, An Algorithmic Study of Relative Cardinalities for Interval-Valued Fuzzy Sets, Fuzzy Sets and Systems 294 (2016) 105–124.
