Fuzzy Optim Decis Making, DOI 10.1007/s10700-012-9127-8

Measure based representation of uncertain information

Ronald R. Yager · Naif Alajlan

© Springer Science+Business Media, LLC 2012

Abstract We discuss the use of monotonic set measures for the representation of uncertain information. We look at some important examples of measure-based uncertainty, specifically probability, possibility and necessity. Other types of uncertainty, such as cardinality-based and quasi-additive measures, are also discussed. We consider the problem of determining the representative value of a variable whose uncertain value is formalized using a monotonic set measure. We note the central role that averaging, and particularly weighted averaging, operations play in obtaining these representative values. We investigate the use of various integrals, such as the Choquet and Sugeno integrals, for obtaining these required averages. We suggest ways of extending a measure defined on a set to the case of fuzzy sets and the power set of the original set. We briefly consider the problem of question answering under uncertain knowledge.

Keywords Measure · Uncertainty · Probability · Integrals · Averages

1 Introduction

The representation and manipulation of uncertainty is fundamental to many tasks in our information-focused technological environment.

R. R. Yager (B)
Iona College, Machine Intelligence Institute, New Rochelle, NY 10801, USA
e-mail: [email protected]

R. R. Yager
King Saud University, Riyadh, Saudi Arabia

N. Alajlan
ALISR Laboratory, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
e-mail: [email protected]


While probability has long provided a formalism for the representation of uncertain information, our desire to include many different sources of information in our decision-making requires us to augment probability theory with other modes of representing uncertainty. This need immediately becomes clear when we consider the type of uncertainty associated with the imprecise nature of the information conveyed by natural language. Monotonic set measures provide a general way of representing uncertain information (Klir 2006). They allow for the representation, in a unified framework, of various different types of uncertainty, including as notable examples probability and possibility. As we shall see, a set measure can be seen as a type of granular object used in granular computing (Pedrycz et al. 2008). Here we shall look at a number of aspects related to the use of a measure representation of uncertain information.

2 Measure representation of uncertain information

Assume V is a variable taking its value in the space X. One approach to representing a decision maker's information about the value of V is to use a monotonic set measure (Klir 2006; Yager 2011a). We recall that if X is a set, a monotonic set measure on X is a mapping μ: 2^X → [0, 1] such that μ(Ø) = 0, μ(X) = 1 and μ(B) ≤ μ(A) if B ⊆ A. We see a monotonic set measure associates values with subsets of the space X. Here we shall simply refer to these as measures. We shall follow the convention of using the expression V is μ to indicate the fact that the measure μ is representing our information about the variable.

We can easily show that for any subsets A and B, μ(A ∩ B) ≤ Min(μ(A), μ(B)) and μ(A ∪ B) ≥ Max(μ(A), μ(B)). These are true because A ∩ B ⊆ A and A ∩ B ⊆ B, hence from monotonicity μ(A ∩ B) ≤ Min(μ(A), μ(B)); and since A ∪ B ⊇ A and A ∪ B ⊇ B, again using monotonicity we get μ(A ∪ B) ≥ Max(μ(A), μ(B)).
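To make this concrete, here is a minimal sketch (ours, not from the paper) of a monotonic set measure on a small finite space, represented as a Python dictionary keyed by frozensets; the specific measure and the helper names `powerset` and `mu` are illustrative assumptions.

```python
from itertools import chain, combinations

X = {"x1", "x2", "x3"}

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# A hand-chosen monotonic measure: mu grows with the set,
# mu(empty) = 0 and mu(X) = 1.
mu = {A: min(1.0, 0.4 * len(A)) for A in powerset(X)}
mu[frozenset()] = 0.0

# The bounds implied by monotonicity:
# mu(A & B) <= Min(mu(A), mu(B)) and mu(A | B) >= Max(mu(A), mu(B)).
for A in powerset(X):
    for B in powerset(X):
        assert mu[A & B] <= min(mu[A], mu[B])
        assert mu[A | B] >= max(mu[A], mu[B])
```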

In using a measure μ to model information about the value of a variable V, we interpret μ(A) as indicating our anticipation that the value of the variable V lies in the set A.

We can associate with any monotonic set measure μ a related monotonic set measure μ̂, called its dual, defined as μ̂(A) = 1 − μ(A^c), where A^c is the complement of A. We see that whereas μ(A) is interpreted as the anticipation of finding the value of V in A, the dual μ̂(A) provides the anticipation of not finding the value of V in not A. We note that a measure and its dual form a unique pair, since the dual of the dual is the original measure: the dual of μ̂ is μ.

As we shall subsequently see, if we are interested in making some decision based upon the knowledge of whether V lies in A, both the measure μ(A), our anticipation that V lies in A, and the dual μ̂(A), our anticipation of not finding V in not A, will be useful.

One property of interest related to these set measures is additivity (Wang et al. 2010; Wang and Klir 2009). We say a measure μ is subadditive if for all A and B, μ(A ∪ B) ≤ μ(A) + μ(B). We say μ is superadditive if for all A ∩ B = Ø we have μ(A ∪ B) ≥ μ(A) + μ(B). We say μ is additive if it is both subadditive and superadditive; in this case, for all A ∩ B = Ø we have μ(A ∪ B) = μ(A) + μ(B).


An idea closely related to the issue of additivity is the concept of the Möbius transform of a measure μ (Grabisch 1997; Wang and Klir 2009; Wang et al. 2010). In anticipation of introducing this idea, consider the following. Let

ν({x1, x2}) = μ({x1, x2}) − (μ({x1}) + μ({x2}))

It is the difference between the measure of {x1, x2} and the sum of the measures of the individual elements. We see that if μ is additive then ν({x1, x2}) = 0. If μ is superadditive then ν({x1, x2}) ≥ 0, and if μ is subadditive then ν({x1, x2}) ≤ 0. A generalization of ν is the Möbius transform of μ. In particular, if X is finite, then the Möbius transform of μ is the set mapping ν defined such that for each E ⊆ X,

ν(E) = Σ_{F⊆E} (−1)^{|E−F|} μ(F)

We note that if E = {xi} then ν(E) = μ(E). One property of the Möbius transform is the following (Klir 2006; Wang et al. 2010): ν(E) = μ(E) − Σ_{F⊂E} ν(F). A direct implication of this is that μ(E) = Σ_{F⊆E} ν(F). We note that for an additive type measure we have ν(E) = μ(E) for E = {xj} and ν(E) = 0 for all other subsets.

While measures provide a very general framework for representing information about the value of a variable, one issue that eventually must be addressed is the determination of the measure's value for each of the subsets of X. As we shall subsequently see, the most commonly used measures are those with very special properties, such as additivity, which allow the calculation of the value of the measure for the different sets from a few parameters. We note that Grabisch (1997) has made considerable use of the Möbius transform to help in this problem. One idea he describes using these Möbius transforms is the concept of k-additive measures. A measure μ is said to be k-additive if its Möbius transform satisfies ν(F) = 0 for any F such that |F| > k and there exists at least one subset E with |E| = k such that ν(E) ≠ 0. The property of k-additivity provides a bound on the number of parameters needed to calculate the value of μ(A) for all A.
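As an illustration of the Möbius transform (a sketch with our own toy measure, not the authors' code), the following computes ν(E) = Σ_{F⊆E} (−1)^{|E−F|} μ(F) on a two-element space and checks the inverse relation μ(E) = Σ_{F⊆E} ν(F):

```python
from itertools import chain, combinations

def subsets(s):
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def mobius(mu, X):
    """Mobius transform: nu(E) = sum over F subset of E of (-1)^|E-F| mu(F)."""
    return {E: sum((-1) ** len(E - F) * mu[F] for F in subsets(E))
            for E in subsets(X)}

X = frozenset({"x1", "x2"})
# A superadditive example: mu({x1}) = 0.2, mu({x2}) = 0.3, mu(X) = 1.
mu = {frozenset(): 0.0, frozenset({"x1"}): 0.2,
      frozenset({"x2"}): 0.3, X: 1.0}
nu = mobius(mu, X)
print(nu[X])  # 1 - 0.2 - 0.3 = 0.5 >= 0, reflecting superadditivity

# Inverse relation: mu(E) = sum over F subset of E of nu(F)
for E in subsets(X):
    assert abs(mu[E] - sum(nu[F] for F in subsets(E))) < 1e-9
```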

3 Probability, possibility and necessity measures

Here we provide a description of some of the measures that are most commonly used in the representation of information about the value of a variable.

A most important measure is the probability measure. This is an additive measure. Here for each xi we have a value pi such that μ({xi}) = pi and μ(A) = Σ_{xi∈A} pi. If X = {x1, x2, . . ., xn} then we need only n parameters, and they must satisfy pi ≥ 0 and Σ_i pi = 1. We note that for an additive type measure the Möbius transform is quite simple: ν(E) = μ(E) for E = {xj} and ν(E) = 0 for all other subsets.

A very important feature of the additive, probabilistic type of uncertainty representation is the fact that it is self-dual. In particular μ̂(A) = 1 − μ(A^c) = 1 − Σ_{xk∈A^c} pk = Σ_{xj∈A} pj = μ(A).


Essentially this means that the probability that V lies in A is the same as the negation of the probability that it lies in not A. Here the measure and its dual provide the same information.

One special case of this measure is the situation in which we know exactly the value of the variable V; here pj = 1 for some xj. In this case μ(A) = 1 if xj ∈ A and μ(A) = 0 if xj ∉ A. A second special case is where each value in X is equally likely; here pj = 1/n for all xj. In this case μ(A) = |A|/n.

As we noted, one property of any monotonic measure is that μ(A ∪ B) ≥ Max(μ(A), μ(B)). An important class of uncertainty measures, the possibility measures (Zadeh 1978; Dubois and Prade 1988), are measures that always satisfy this with equality. A measure μ is called a possibility measure if for all A and B we have μ(A ∪ B) = Max(μ(A), μ(B)). In this case we often refer to μ as Poss. We see here that if μ({xj}) = πj then μ(A) = Max_{xj∈A}[πj]. Thus the measure is completely determined by the n values of the πj's. We note that these values must satisfy πj ≥ 0 and Maxj[πj] = 1. Thus at least one of the πj must be equal to one. One special case of a possibility measure is where we know the exact value of V. Here πj = 1 for some xj and πk = 0 for all others. In this case μ(A) = 1 if xj ∈ A and μ(A) = 0 if xj ∉ A. Actually in this case both the probability and possibility measures have the same form.

Another special case of a possibility measure is one in which πj = 1 for all j. This corresponds to a situation in which we have no knowledge distinguishing the possibility of any of the xj. We note here μ(A) = 1 for all A ≠ Ø and μ(Ø) = 0. We sometimes denote this special measure as μ*. It has the unique feature that for any measure μ we have μ*(A) ≥ μ(A) for all A.

We note that possibilistic uncertainty often arises from linguistically expressed information. As Zadeh (1978) has discussed, we can use fuzzy subsets to describe linguistic information and then obtain a possibility measure from these fuzzy sets by associating the value πj with the membership grade of the corresponding element in the fuzzy set.
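For instance (our sketch, with an illustrative possibility distribution on an age domain), a possibility measure on a finite X is completely determined by the πj, with μ(A) = Max_{xj∈A}[πj]:

```python
# An illustrative possibility distribution; the max value must be 1.
pi = {20: 1.0, 21: 1.0, 22: 1.0, 25: 0.8, 29: 0.6}

def poss(A):
    """Possibility measure of a crisp subset A: the largest pi over A."""
    return max((pi.get(x, 0.0) for x in A), default=0.0)

print(poss({21, 25}))  # 1.0
print(poss({25, 29}))  # 0.8
print(poss(set()))     # 0.0, since mu(empty) = 0
```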

Let us look at the Möbius transform of a possibility distribution. We recall that for any E, ν(E) = Σ_{F⊆E} (−1)^{|E−F|} μ(F). Let X = {x1, . . ., xn} and, without loss of generality, assume the indexing has been done so that πj ≥ πk if j < k, thus π1 ≥ π2 ≥ · · · ≥ πn. We see that if E = {xj} then ν(E) = πj. Consider E = {x1, x2}; in this case

ν(E) = μ({x1, x2}) − μ({x1}) − μ({x2}) = Max(π1, π2) − π1 − π2 = π1 − π1 − π2 = −π2

More generally, if E = {xj, xk} then ν(E) = (−1) Min_{xr∈E}[πr]. Consider now E = {x1, x2, x3}; here

ν(E) = Max(π1, π2, π3) − (Max(π1, π2) + Max(π1, π3) + Max(π2, π3)) + π1 + π2 + π3
     = π1 − (π1 + π1 + π2) + π1 + π2 + π3 = π3


More generally, if E = {xi, xj, xk} then ν(E) = (−1)^{|E|−1} Min_{xr∈E}[πr]. Continuing in this manner we can show that for any E,

ν(E) = (−1)^{|E|−1} Min_{xr∈E}[πr]

Closely related to the possibility measures are the certainty or necessity measures (Dubois and Prade 1987). For these measures

μ(A ∩ B) = Min(μ(A), μ(B))

We note that while for any measure μ(A ∩ B) ≤ Min(μ(A), μ(B)), these measures always attain the boundary. The values of a necessity measure can also be expressed in terms of n parameters. Let Fj = X − {xj} and let us denote μ(Fj) = rj; then we can define any μ(A) in terms of these. In particular

μ(A) = Min_{xj∉A}[μ(Fj)] = Min_{xj∉A}[rj]

We note that since Ø = ∩_{j=1}^{n} Fj and μ(Ø) = 0, we must have Minj[rj] = 0. Thus at least one rj = 0. One special case of these measures is the one where rj = 0 for all j. Here μ(A) = 0 for all A ≠ X and μ(X) = 1. We often denote this as μ_*. We see that if μ is any measure, μ_*(A) ≤ μ(A) for all A.

Consider the case where we know exactly that the value of V is xK. Here we recall μ(A) = 1 if xK ∈ A and μ(A) = 0 if xK ∉ A. We can see this as the special case of a necessity measure where rK = 0 and rj = 1 for all j ≠ K. We observe that for any set A such that xK ∉ A we can write A = FK ∩ E for some E, and thus μ(A) = Min(μ(FK), μ(E)) = 0. On the other hand, if xK ∈ A then A = ∩_{j s.t. xj∉A} Fj and hence μ(A) = Min_{xj∉A}[rj] = 1.

A duality relationship exists between the possibility and necessity measures. Assume μ is a possibility measure. In this case its dual

μ̂(A) = 1 − μ(A^c) = 1 − Max_{xj∈A^c}[πj]

is a necessity measure. Consider

μ̂(A ∩ B) = 1 − μ((A ∩ B)^c) = 1 − μ(A^c ∪ B^c) = 1 − Max(μ(A^c), μ(B^c))

Recalling μ̂(A) = 1 − μ(A^c) and μ̂(B) = 1 − μ(B^c), we get

μ̂(A ∩ B) = 1 − Max(1 − μ̂(A), 1 − μ̂(B)) = 1 − (1 − Min(μ̂(A), μ̂(B))) = Min(μ̂(A), μ̂(B))

Thus we see that if μ is a possibility measure then μ̂ is a necessity measure. Similarly, if μ is a necessity measure then μ̂ is a possibility measure.


There are some very important relationships that these duals satisfy. Assume that μ is a possibility measure where μ({xj}) = πj. Consider μ̂(A) = 1 − μ(A^c) = 1 − Max_{xj∈A^c}[πj] in the case where A = Fj = X − {xj}. In this case μ̂(Fj) = 1 − Max_{xk∈Fj^c}[πk] = 1 − πj, since Fj^c = {xj}. Thus rj = 1 − πj.

Another relationship between possibility and necessity measures is the following. Assume μPoss and μNec are possibility and necessity measures that are duals; then for any A, μPoss(A) ≥ μNec(A). Let us see that this relationship holds. Assume μPoss is a possibility measure; then its dual necessity measure has μNec(A) = 1 − μPoss(A^c) = 1 − Max_{xj∈A^c}[πj]. Since at least one πj = 1, without loss of generality assume π1 = 1. Consider an arbitrary set A; either x1 ∈ A or x1 ∉ A. Assume x1 ∈ A; then μPoss(A) = 1, and since μNec(A) ≤ 1, we have μPoss(A) ≥ μNec(A). Now assume x1 ∉ A. In this case μNec(A) = 1 − Max_{xj∈A^c}[πj] = 1 − 1 = 0.

Here then again we have μPoss(A) ≥ μNec(A). This is a very special and useful relationship.
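The following sketch (ours) checks both relationships numerically for an illustrative possibility distribution: the dual necessity weights satisfy rj = 1 − πj, and μPoss(A) ≥ μNec(A) for every subset A.

```python
from itertools import chain, combinations

pi = {"x1": 1.0, "x2": 0.7, "x3": 0.3}   # possibility distribution, max = 1
X = set(pi)

def poss(A):
    return max((pi[x] for x in A), default=0.0)

def nec(A):
    """Dual of poss: nec(A) = 1 - poss(complement of A)."""
    return 1.0 - poss(X - set(A))

def subsets(s):
    s = list(s)
    return [set(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# r_j = 1 - pi_j, and poss(A) >= nec(A) for every A
for x in X:
    assert abs(nec(X - {x}) - (1.0 - pi[x])) < 1e-9
for A in subsets(X):
    assert poss(A) >= nec(A) - 1e-9
```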

4 Cardinality based measures

Another useful class of measures for modeling uncertain information are the cardinality-based measures. These measures are useful when the fundamental feature determining the anticipation of the occurrence of a subset depends just on the number of elements in the set, independent of the specific elements in the subset. Assuming X = {x1, . . ., xn}, a cardinality-based measure μ on X is defined by a set of parameters

0 = a0 ≤ a1 ≤ a2 ≤ · · · ≤ an−1 ≤ an = 1

such that μ(E) = a_{nE}, where nE is the number of elements in E. We note some special cases of this measure. One is the case where aj = 0 for j ≠ n and an = 1. In this case we get the certainty measure μ_*. Similarly, if aj = 1 for j ≠ 0 and a0 = 0, then we get the possibility measure μ*. Another special case is when aj = j/n. In this case we get the special probability measure with pi = 1/n.

Another kind of special case is when aj = 0 for j < K and aj = 1 for j ≥ K. Here we have a kind of tipping-point uncertainty representation. If the number of elements in a set is less than K we do not anticipate its occurrence, but if this number of elements is at least K we have an anticipation of one.
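The sketch below (ours) builds cardinality-based measures from the weight vector (a0, . . ., an) and exhibits the special cases just described; the constructor name `cardinality_measure` is an assumption for illustration.

```python
def cardinality_measure(a):
    """Build mu(E) = a[|E|] from weights 0 = a[0] <= ... <= a[n] = 1."""
    assert a[0] == 0.0 and a[-1] == 1.0
    assert all(x <= y for x, y in zip(a, a[1:]))   # monotone weights
    return lambda E: a[len(E)]

n = 4
mu_star  = cardinality_measure([0.0] + [1.0] * n)            # possibility mu*
mu_lower = cardinality_measure([0.0] * n + [1.0])            # certainty mu_*
mu_prob  = cardinality_measure([j / n for j in range(n + 1)])  # p_i = 1/n
mu_tip   = cardinality_measure([0.0, 0.0, 1.0, 1.0, 1.0])    # tipping point K=2

E = {"x1", "x3"}
print(mu_star(E), mu_lower(E), mu_prob(E), mu_tip(E))  # 1.0 0.0 0.5 1.0
```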

It is worth noting the difference between μ* and μ_*. We see that μ* is extremely optimistic in the sense that here we anticipate finding the value of V in any set except the null set. At the other extreme is the case of μ_*, which we see as extremely pessimistic. Here we anticipate finding V in no set except the full space. Thus one feature of these cardinality-based measures is their ability to reflect information about a decision maker's attitude with respect to optimism-pessimism. The attitude with respect to optimism versus pessimism can be associated with the weights in the cardinality-based measure μ (Yager 2002). To quantify this attitude we define

Op(μ) = (1/(n − 1)) Σ_{j=1}^{n} (n − j)(aj − aj−1)


We see that if a0 = 0 and aj = 1 for j ≥ 1, then Op(μ) = (n − 1)/(n − 1) = 1. At the other extreme, if aj = 0 for j < n and an = 1, then Op(μ) = 0. We observe that for a K-tipping form measure Op(μ) = (n − K)/(n − 1). The smaller the tipping point, the more optimism. For the case where aj = j/n we can show that Op(μ) = 0.5; it is neutral.

The Möbius transform of a cardinality-based measure can be very neatly expressed.

Assume X = {x1, . . ., xn} and let B be a subset of X of cardinality C. We recall ν(B) = Σ_{F⊆B} (−1)^{|B−F|} μ(F). In this case we see that the sets F ⊆ B have cardinalities j = 0 to C, and any set F ⊆ B of cardinality j has μ(F) = aj; since a0 = 0, only j = 1 to C contribute. Furthermore, the number of subsets of cardinality j contained in B is C!/(j!(C − j)!). Using this we see that

ν(B) = Σ_{j=1}^{C} (−1)^{C−j} (C!/(j!(C − j)!)) aj = C! Σ_{j=1}^{C} (−1)^{C−j} aj/(j!(C − j)!)

In the case of a K-tipping point measure, where aj = 0 for j < K and aj = 1 for j ≥ K, then ν(B) = C! Σ_{j=K}^{C} (−1)^{C−j} 1/(j!(C − j)!). We see that if C < K then ν(B) = 0.

Let us now look at the dual of a cardinality-based measure. We recall μ̂(A) = 1 − μ(A^c). If A has cardinality nA, then the cardinality of A^c is n − nA. Thus μ̂(A) = 1 − a_{n−nA}. First we observe that μ̂ is itself a cardinality-based measure. We further see that if we let bj denote the weight associated with a set A of cardinality j under μ̂, then μ̂(A) = bj = 1 − a_{n−j}. The weights bj and aj are thus related through duality.

We can observe that μ* and μ_* are duals as follows. For μ*, aj = 1 for j = 1 to n and a0 = 0, while for μ_* we have aj = 0 for j = 0 to n − 1 and an = 1. Consider now the dual of μ*. Here bj = 1 − a_{n−j}. In this case b0 = 1 − an = 0, bn = 1 − a0 = 1, and for j = 1 to n − 1 we have bj = 1 − a_{n−j} = 0. Thus the bj are exactly the weights of μ_*.

We also see the following relationship with respect to Op for these cardinality-based measures: Op(μ̂) = 1 − Op(μ).

As discussed in Yager (2002), we can express a cardinality-based measure in a manner independent of the cardinality of X. Here we use a function f: [0, 1] → [0, 1] such that f(0) = 0, f(1) = 1 and f(y) ≥ f(z) if y > z. These are referred to as BUM functions. If X is a set of cardinality n, then using f we can obtain aj = f(j/n). For these function-based cardinality measures we can associate a measure of optimism defined as

Op(f) = ∫_0^1 f(y)dy

An interesting class of such functions is f(y) = y^α for α > 0. Here we see Op(f) = 1/(α + 1). Thus as α → 0 we get Op(f) → 1, and as α → ∞, Op(f) → 0. We see that if α = 1 then f(y) = y and the result is aj = j/n. Here we see that f conveys a kind of attitude, optimistic or pessimistic depending on α.
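As a quick numeric check (ours), approximating Op(f) = ∫_0^1 f(y)dy for f(y) = y^α reproduces 1/(α + 1):

```python
def op(f, steps=100_000):
    """Approximate Op(f) = integral of f over [0, 1] by the midpoint rule."""
    h = 1.0 / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h

for alpha in (0.5, 1.0, 2.0):
    f = lambda y, a=alpha: y ** a
    print(alpha, round(op(f), 4), round(1 / (alpha + 1), 4))
# alpha=0.5 -> ~0.6667, alpha=1 -> 0.5, alpha=2 -> ~0.3333
```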


Another class of these BUM functions are the piecewise linear:

f(y) = 0 if y ≤ a
f(y) = (y − a)/(b − a) if a < y < b
f(y) = 1 if y ≥ b

For these functions

Op(f) = ∫_0^1 f(y)dy = ∫_a^b ((y − a)/(b − a))dy + ∫_b^1 dy = 1 − (a + b)/2

We see that if a = b = 0 then Op(f) = 1; here aj = 1 for j > 0. If a = b = 1 then aj = 0 for j < n and Op(f) = 0. More generally, if a = b = K then Op(f) = 1 − K. Here we see a tipping point type measure; in particular aj = 0 for j < Kn and aj = 1 for j ≥ Kn. Generally, as a and b move to the left we increase our optimism.

The preceding inspires us to consider another class of related measures, which we refer to as attitudinal based probability (ABP) measures. Assume X = {x1, . . ., xn} and associated with each xj we have a probability pj, where pj ∈ [0, 1] and Σj pj = 1. Let f be a BUM function, a monotonic mapping f: [0, 1] → [0, 1] with f(0) = 0 and f(1) = 1. We now define a measure μ on X such that μ(A) = f(Σ_{xj∈A} pj) = f(Σ_{j=1}^{n} pj A(xj)). Here then, depending on f, we obtain as our anticipation the probability biased so as to reflect the decision maker's attitude.
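A minimal sketch (ours) of an ABP measure, using f(y) = y^α as the BUM function; α > 1 discounts the probability (pessimistic) while α < 1 inflates it (optimistic):

```python
def abp_measure(p, f):
    """Attitudinal based probability measure: mu(A) = f(sum of p over A)."""
    return lambda A: f(sum(p[x] for x in A))

p = {"x1": 0.5, "x2": 0.3, "x3": 0.2}
pessimistic = abp_measure(p, lambda y: y ** 2)    # alpha > 1: pessimistic
optimistic  = abp_measure(p, lambda y: y ** 0.5)  # alpha < 1: optimistic

A = {"x1", "x2"}   # Prob(A) = 0.8
print(pessimistic(A), optimistic(A))  # 0.64 and ~0.894
```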

5 Quasi-additive measures

Another class of measures that can be used in the framework of modeling uncertain information are the quasi-additive measures. Let Sj, for j = 1 to q, be a collection of subsets of X, where we associate with each subset a value αj ∈ [0, 1] and require Σj αj = 1. We define the measure μ such that μ(A) = Σ_{j=1}^{q} Gj(A) αj where

Gj(A) = 1 if A ∩ Sj ≠ Ø
Gj(A) = 0 if A ∩ Sj = Ø

Alternatively we can express

μ(A) = Σ_{j=1}^{q} αj Maxx[A(x) Sj(x)]

A more general class of these quasi-additive measures can be had if we use a BUM function f. Using this we obtain

μ(A) = Σ_{j=1}^{q} αj f(|A ∩ Sj| / |Sj|)


We refer to these as f-quasi-additive measures. We see that if f is such that f(0) = 0 and f(y) = 1 for y ≠ 0, then we get our original quasi-additive measure. At the other extreme is the case where f(1) = 1 and f(y) = 0 for y ≠ 1. In this case we are requiring Sj ⊆ A to get the weight αj. Another special case is a linear f, f(y) = y. In this case μ(A) = Σ_{j=1}^{q} αj |A ∩ Sj|/|Sj|. Here the contribution is related to the portion of Sj that is in A. It should be noted that these quasi-additive measures are closely related to the Dempster-Shafer belief structure (Shafer 1976; Smets 1988; Liu and Yager 2008).
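The following sketch (ours) implements the f-quasi-additive measure for a small list of focal sets Sj with weights αj. With f(y) = 1 for y ≠ 0 it behaves like a Dempster-Shafer plausibility over these focal elements, and with f(y) = 1 only for y = 1 like a belief function, echoing the connection just noted; the names are our own.

```python
def f_quasi_additive(focal, f):
    """mu(A) = sum_j alpha_j * f(|A & S_j| / |S_j|), focal = [(S_j, alpha_j)]."""
    assert abs(sum(a for _, a in focal) - 1.0) < 1e-9
    return lambda A: sum(a * f(len(set(A) & S) / len(S)) for S, a in focal)

focal = [({"x1", "x2"}, 0.6), ({"x3"}, 0.4)]

plausibility = f_quasi_additive(focal, lambda y: 1.0 if y > 0 else 0.0)
belief       = f_quasi_additive(focal, lambda y: 1.0 if y == 1.0 else 0.0)
linear       = f_quasi_additive(focal, lambda y: y)

A = {"x1", "x3"}
print(plausibility(A), belief(A), linear(A))  # 1.0 0.4 0.7
```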

6 Averaging, integrals and measure

The multi-dimensional structure of an uncertain description makes comparison and choice among alternatives difficult. In situations where we have uncertainty, we must often simplify the complexity of the uncertain environment in order to be able to make decisions. In these cases we often use a representative value to characterize an alternative having uncertainty. Averaging, and particularly weighted averaging, operations play a central role in obtaining these representative values. A prototypical example is the use of the expected value in probability theory, which is a weighted average of the possible outcomes using their probabilities as the weights. We emphasize that even in the case of probabilistic uncertainty there are averaging operations other than the expected value, such as the median, for characterizing the representative value of an uncertain alternative.

We now turn to the issue of calculating averages in situations in which we have more general uncertain information than simply probabilistic information. We begin by looking at the basic averaging process.

Some fundamental properties of an averaging process are monotonicity, boundedness and idempotency (Beliakov et al. 2007). Monotonicity implies that if the values being averaged increase, then the value of the average can't decrease. Boundedness implies that the average must lie between the smallest and largest of the values being averaged. Idempotency implies that if all the values being averaged are the same, this common value must be the average. Averages can be taken over a finite set of values or over a function defined on some interval. Averages can also involve situations in which each of the elements being averaged has a different weight or importance in the averaging process. The satisfaction of the basic properties of averaging imposes certain constraints on the weights associated with the elements being averaged. These constraints essentially require that the weights be positive and sum to one.

Assume X is some space and F is a function that associates a real number with each element in X, F: X → R. Thus here we have a collection of values, F = ⟨F(x)/x ∈ X⟩. The concept of finding the average of these values is meaningful. We call this the average of F over X. If X is finite, X = {x1, . . ., xn}, then the simple average would be F̄ = (1/n) Σ_{j=1}^{n} F(xj). If X is a continuous interval X = [a, b], then our simple average would be F̄ = (1/(b − a)) ∫_a^b F(x)dx.

Let us highlight the fact that the integral is providing an average; thus integrals and averages are related. One point we want to make here is that we have implicitly assumed all the elements in X are of equal importance.


Another point is that even in this basic case we could use formulations other than the simple average used above to obtain an average. For example, we could use the median or the OWA operator (Yager 1988). Many other averages are possible (Bullen 2003).

We note that in this case of a simple average, if we assign a weight wj to each xj then our weighted average is F̄ = Σ_{j=1}^{n} wj F(xj). It is useful to understand that the integral type of average, F̄ = (1/(b − a)) ∫_a^b F(x)dx, can be viewed as a type of weighted average. In this perspective we see that dx/(b − a) is the weight associated with the value F(x). Here we have implicitly assumed each of the F(x) has the same weight.

We now turn to our main interest, the role of averaging in uncertain environments.

We first consider the finite environment. Here we shall let V be some variable that takes values in the space X = {x1, . . ., xn} and assume F is some function, F: X → R, that associates a value with each x ∈ X. Consider now that we have some uncertainty with respect to the value of V. Let us initially assume that our information about this uncertainty is expressed via a probability distribution where pj is the probability that V = xj. A question of interest is the determination of some representative value for F in this situation. One approach to obtaining a representative value is to use the expected value, EV(F) = Σ_{j=1}^{n} pj F(xj). We see this is an example of a weighted average of the F(xj) where the weights are provided by the probability distribution. Thus the representative value of F is obtained using the weighted average of the F(xj); here the probability distribution is seen as inducing weights on the associated values, the F(xj).

We recall that in the case where X = [a, b], the information about the probabilistic uncertainty is carried by a probability density function h defined on [a, b] such that h(x) ≥ 0 and ∫_a^b h(x)dx = 1. Using this probability density function we calculate the expected value of F over X as

EV(F) = ∫_a^b F(x) · h(x)dx

We note here that this is a weighted average of F where h(x)dx is the weight associated with F(x).

Another approach to obtaining a representative value in this situation is to use the median. Here we must order the payoffs. Let us consider the finite case where X = {x1, . . ., xn}. We shall let ind(j) indicate the index of the outcome with the jth largest payoff; thus F(x_{ind(j)}) is the jth largest payoff. We also let p_{ind(j)} be the associated probability. Using this we obtain as the median of F, Med(F), the value F(x_{ind(j*)}) where j* is such that

Σ_{j=1}^{j*−1} p_{ind(j)} < 0.5 ≤ Σ_{j=1}^{j*} p_{ind(j)}

It can be shown that this is also a weighted average where the weights are provided by the probabilities.
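A small sketch (ours) of this median: order the payoffs in decreasing order, accumulate their probabilities, and return the first payoff at which the cumulative sum reaches 0.5.

```python
def prob_median(payoffs, probs):
    """Median payoff: order payoffs descending, return F(x_ind(j*)) where the
    cumulative probability of the j* largest payoffs first reaches 0.5."""
    ranked = sorted(zip(payoffs, probs), key=lambda t: -t[0])
    cum = 0.0
    for y, p in ranked:
        cum += p
        if cum >= 0.5:
            return y
    return ranked[-1][0]

print(prob_median([10, 40, 20, 30], [0.1, 0.2, 0.4, 0.3]))  # 30
```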

A useful idea in probability theory is the cumulative distribution function, CDF. We recall this is defined as a mapping CDF: R → [0, 1] such that CDF(y) = Prob(F(x) ≤ y).


We note that in the case where X is finite, CDF(y) = Prob{xi | F(xi) ≤ y} = Σ_{xi s.t. F(xi)≤y} pi. We note that the median is easily obtained here as the value of y such that CDF(y) = 0.5.

An interesting relationship exists between the CDF and the expected value. Assume X = {x1, . . ., xn}, let pj be the probability of xj and let F(xj) = yj be the payoff associated with xj. Without loss of generality, assume the indexing of the xi's is such that yj ≥ yk if j ≥ k; the larger j, the larger the payoff. In this case, if we are interested in the CDF for any y, we see from Fig. 1 that for yj ≤ y < yj+1 we have CDF(y) = Σ_{k=1}^{j} pk.

Using this we can calculate the area under the CDF between y1 and yn as

AREA = ∫_{y1}^{yn} CDF(y)dy = Σ_{j=1}^{n−1} CDF(yj)(yj+1 − yj)

AREA = (y2 − y1)p1 + (y3 − y2)(p1 + p2) + (y4 − y3)(p1 + p2 + p3) + · · · + (yn − yn−1)(p1 + · · · + pn−1)

AREA = −Σ_{j=1}^{n−1} pj yj + yn Σ_{j=1}^{n−1} pj. However, since Σ_{j=1}^{n−1} pj = 1 − pn, then AREA = yn − Σ_{j=1}^{n} pj yj.

Furthermore, since Σ_{j=1}^{n} pj yj = EV(F), we see that

EV(F) = yn − Area under CDF = yn − ∫_{y1}^{yn} CDF(y)dy

We can also show that EV(F) = y1 + ∫_{y1}^{yn} (1 − CDF(y))dy.

[Fig. 1 Cumulative distribution function]

Based on the preceding we can make the following observation. Consider two uncertain lotteries, both with possible outcomes X = {x1, . . ., xn} and associated payoffs F(xi) = yi. Consider that in one lottery we have the probabilities pi associated with the different outcomes, and for the other we have the probabilities qi. Furthermore, let CDFp and CDFq be the respective cumulative distribution functions. We observe from the preceding that if CDFq(y) ≤ CDFp(y) for all y ∈ [y1, yn], then EVq(F) ≥ EVp(F). It is also interesting to observe that if CDFq(y) ≤ CDFp(y) for all y, then if yp is the value where CDFp(yp) = 0.5 and yq is such that CDFq(yq) = 0.5, we see that yq ≥ yp, and hence Medq(F) ≥ Medp(F).

Here we consider a formulation related to the CDF. We shall define

UCDF(y) = Prob{xj | F(xj) ≥ y}

In Fig. 2 we show a typical formulation for the discrete case where we have assumed yi ≤ yk for i ≤ k. In particular, from this figure we observe

UCDF(y) = 1 for y ≤ y1
UCDF(y) = 1 − Σ_{k=1}^{j} pk for yj < y ≤ yj+1
UCDF(y) = 0 for y > yn

Consider now the area under this UCDF as calculated by the following integral:

∫_{y1}^{yn} UCDF(y)dy = Σ_{j=1}^{n−1} (yj+1 − yj)(1 − Σ_{k=1}^{j} pk)

∫_{y1}^{yn} UCDF(y)dy = (y2 − y1)(1 − p1) + (y3 − y2)(1 − p1 − p2) + · · · + (yn − yn−1)(1 − Σ_{k=1}^{n−1} pk)

∫_{y1}^{yn} UCDF(y)dy = −y1 + p1y1 + p2y2 + p3y3 + · · · + pnyn

∫_{y1}^{yn} UCDF(y)dy = EV(F) − y1

[Fig. 2 Typical UCDF]


7 Representative values for measure based uncertainty

We now want to again consider the situation in which V is a variable taking its value in some space X and F is a real-valued function on X. However, here we shall assume that our information about the actual value of V, rather than being expressed by a probability distribution, is carried by a set measure μ on X. A question of interest here is the determination of some representative value of F.

We can, as in the case of probabilistic uncertainty, view the measure μ as inducing some weights on the elements in X, and in turn inducing some weights on their associated values, the F(x). Thus now we must consider the problem of finding the μ-weighted average of F over X. A number of approaches can be suggested for accomplishing this task.

One approach for obtaining a weighted average of F over X with respect to μ is to use the Choquet integral, which we shall denote as Cμ(F) (Wang et al. 2010). In the case where X is some interval [a, b], the Choquet integral is obtained as follows. We let Fα = {x | F(x) ≥ α}; it is the set of values in X for which F(x) is at least α. We then define

Cμ(F) = ∫_a^b μ(Fα)dα

We note that in the case when μ is a probabilistic type of uncertainty, the term μ(Fα) corresponds to the probability that F(x) ≥ α; it is what we earlier called the UCDF.

In the special case where X is finite, X = {x1, . . ., xn}, it can be shown (Wang et al. 2010) that the preceding formula for the Choquet integral becomes

Cμ(F) = Σ_{j=1}^{n} (μ(Hj) − μ(Hj−1)) F(x_{ind(j)})

Here ind(j) is the index of the element in X with the jth largest value for F. In the above formulation Hj = {x_{ind(k)} | k = 1 to j}; it is the subset of X with the j largest values for F. By convention H0 = Ø. We see that since Hj−1 ⊆ Hj, then μ(Hj−1) ≤ μ(Hj) and hence μ(Hj) − μ(Hj−1) ≥ 0. If we denote wj = μ(Hj) − μ(Hj−1), we see Cμ(F) = Σ_{j=1}^{n} wj F(x_{ind(j)}) where wj ≥ 0. In addition, since μ(Hn) = μ(X) = 1 and μ(H0) = 0, it is easy to show that Σ_{j=1}^{n} wj = 1. Thus the wj are a set of weights.

From this we see that Cμ(F) is a kind of weighted average. We also note that if μ is a probability measure, then μ(Hj) − μ(Hj−1) = p_{ind(j)} and hence Cμ(F) = Σ_{j=1}^{n} p_{ind(j)} F(x_{ind(j)}) = Σ_{j=1}^{n} pj F(xj), the usual expected value.
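The discrete Choquet integral is easy to state in code. The following sketch (ours) takes F as a dictionary and μ as a function on frozensets, and confirms that for an additive (probability) measure it reduces to the expected value:

```python
def choquet(F, mu):
    """Discrete Choquet integral of F (dict x -> value) w.r.t. measure mu,
    where mu maps a frozenset of elements of X to [0, 1]."""
    xs = sorted(F, key=F.get, reverse=True)   # x_ind(1), ..., x_ind(n)
    total, H_prev = 0.0, frozenset()
    for j, x in enumerate(xs, start=1):
        H = frozenset(xs[:j])                 # the j largest values for F
        total += (mu(H) - mu(H_prev)) * F[x]
        H_prev = H
    return total

# With an additive (probability) measure, the Choquet integral is the
# usual expected value.
p = {"x1": 0.2, "x2": 0.5, "x3": 0.3}
F = {"x1": 10.0, "x2": 4.0, "x3": 7.0}
prob = lambda A: sum(p[x] for x in A)
print(choquet(F, prob))                 # ~6.1
print(sum(p[x] * F[x] for x in F))      # matches (up to float rounding): 6.1
```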

j=1 pjF(xj), the usual expected value.As we indicated earlier there are average functions other than the simple weighted

average. It is possible to generalize the basic Choquet integral to allow for other typesof averages in the face of an uncertainty measure μ. Here we shall initially restrictourselves to case where X is finite.

To accomplish this generalization we introduce a mapping g: [0, 1] → [0, 1] called an averaging attitudinal function. This function will express the type of average we want.


It is required that g(0) = 0, g(1) = 1 and g is monotonic: g(x) ≥ g(y) if x > y. Using this function we obtain a g-type average

gμ(F) = Σ_{j=1}^{n} (g(μ(Hj)) − g(μ(Hj−1))) F(x_{ind(j)})

We first observe that if g is linear, g(x) = x, then we get the basic Choquet integral. If g is such that g(x) = 0 for x < 0.5 and g(x) = 1 for x ≥ 0.5, we get a median type aggregation function. It can be shown that if μ is a probability measure we get the weighted median.

It is well known that the Max of a collection of numbers also provides a type of averaging operator. We can extend this to our situation by defining g such that

g(x) = 1 if x ≠ 0
g(0) = 0

The Min also provides a type of averaging operator; to obtain this in our framework we let

g(1) = 1
g(x) = 0 if x ≠ 1

Various other types of aggregation can be obtained by appropriately selecting g. This leads to different choices for the representative value in our uncertain situation. The choice of g, then, is essentially a choice depending on the attitude of the responsible decision maker.
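Building on the Choquet sketch above, the following (our illustration) implements the g-type average; the four choices of g recover the basic Choquet integral, a median-type aggregation, the Max and the Min, as described (small float tolerances are used in the comparisons):

```python
def g_average(F, mu, g):
    """g-type average: sum_j (g(mu(H_j)) - g(mu(H_{j-1}))) * F(x_ind(j))."""
    xs = sorted(F, key=F.get, reverse=True)
    total, prev = 0.0, 0.0
    for j in range(1, len(xs) + 1):
        cur = g(mu(frozenset(xs[:j])))
        total += (cur - prev) * F[xs[j - 1]]
        prev = cur
    return total

p = {"x1": 0.2, "x2": 0.5, "x3": 0.3}
prob = lambda A: sum(p[x] for x in A)
F = {"x1": 10.0, "x2": 4.0, "x3": 7.0}

identity = lambda t: t                                # basic Choquet integral
median_g = lambda t: 1.0 if t >= 0.5 else 0.0         # median type
max_g    = lambda t: 1.0 if t > 1e-9 else 0.0         # g(x)=1 if x != 0: Max
min_g    = lambda t: 1.0 if t >= 1.0 - 1e-9 else 0.0  # g(x)=1 only at 1: Min

for g in (identity, median_g, max_g, min_g):
    print(g_average(F, prob, g))  # ~6.1, 7.0, 10.0, 4.0
```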

There exists another approach for calculating the average of the function F under a measure μ. This approach is based on the Sugeno integral (Sugeno 1977; Murofushi and Sugeno 2000). The Sugeno integral requires F(x) ∈ [0, 1]. In the situation where X = [a, b],

Sμ(F) = Max_{α∈[0,1]}[α ∧ μ(Fα)]

where again Fα = {x | F(x) ≥ α}.

jth largest value for F then this becomes

Sμ(F) = Maxj=1 to n

[μ(Hj) ∧ xind(j)]

where Hj = {x_{ind(k)} | k = 1 to j}, the subset of X with the j largest values for F.

The Sugeno integral has some advantages and disadvantages compared with the Choquet integral. One advantage of this type of average is that the information used doesn't have to be numeric; it can be ordinal.


Thus if S is some ordinal scale with F(x) ∈ S and μ(A) ∈ S, then the Sugeno integral can provide a weighted average. Its disadvantage is that when working with numbers it is not as sensitive as the Choquet integral.
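A corresponding sketch (ours) of the finite Sugeno integral, here evaluated against a possibility measure; note the requirement F(x) ∈ [0, 1]:

```python
def sugeno(F, mu):
    """Discrete Sugeno integral: max over j of min(mu(H_j), F(x_ind(j))),
    with F taking values in [0, 1]."""
    xs = sorted(F, key=F.get, reverse=True)
    return max(min(mu(frozenset(xs[:j])), F[xs[j - 1]])
               for j in range(1, len(xs) + 1))

pi = {"x1": 1.0, "x2": 0.6, "x3": 0.2}          # possibility distribution
poss = lambda A: max(pi[x] for x in A)

F = {"x1": 0.3, "x2": 0.9, "x3": 0.5}
print(sugeno(F, poss))  # max(min(0.6,0.9), min(0.6,0.5), min(1.0,0.3)) = 0.6
```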

Earlier we introduced the idea of the dual of a measure. We recall that if μ is a measure then its dual μ̂ is defined as μ̂(A) = 1 − μ(A^c). In the context of modeling uncertain information about a variable V, as we indicated, μ(A) provides the anticipation of finding the value of V in A, while μ̂(A) is the anticipation of not finding the value of V in not A. Let us investigate the relationship between the Choquet integral of a function F with respect to μ and with respect to its dual μ̂. We recall Cμ(F) = Σ_{j=1}^{n} (μ(Hj) − μ(Hj−1)) F(x_{ind(j)}), where ind(j) is the index of the element with the jth largest value for F and Hj = {x_{ind(k)} | k = 1 to j}. If we denote wj = μ(Hj) − μ(Hj−1), then we can express Cμ(F) = Σ_{j=1}^{n} wj F(x_{ind(j)}). We note that the wj ∈ [0, 1] and sum to one; thus Cμ(F) is a weighted average of the F(x_{ind(j)}).

Consider now the case of the dual measure. Here

Cμ̂(F) = Σ_{j=1}^{n} (μ̂(Hj) − μ̂(Hj−1)) F(x_{ind(j)}) = Σ_{j=1}^{n} ((1 − μ(Hj^c)) − (1 − μ(Hj−1^c))) F(x_{ind(j)})

Cμ̂(F) = Σ_{j=1}^{n} (μ(Hj−1^c) − μ(Hj^c)) F(x_{ind(j)})

We note that since Hj−1^c ⊇ Hj^c, then μ(Hj−1^c) − μ(Hj^c) ≥ 0. We also easily see that since Hn = X, then Hn^c = Ø, and since H0 = Ø, then H0^c = X. If we denote ŵj = μ(Hj−1^c) − μ(Hj^c), then Cμ̂(F) = Σ_{j=1}^{n} ŵj F(x_{ind(j)}). Further, ŵj ≥ 0 and Σ_{j=1}^{n} ŵj = 1. Thus the ŵj are also a set of weights. However, we observe that there exists a relationship between the wj and the ŵj. Recall wj = μ(Hj) − μ(Hj−1), where Hj = {x_{ind(1)}, x_{ind(2)}, . . ., x_{ind(j)}} and Hj−1 = {x_{ind(1)}, . . ., x_{ind(j−1)}}. Consider now ŵj. Here ŵj = μ(Hj−1^c) − μ(Hj^c), where Hj−1^c = {x_{ind(j)}, . . ., x_{ind(n)}} and Hj^c = {x_{ind(j+1)}, . . ., x_{ind(n)}}.

Consider now ŵ_{n−j+1} = μ(H_{n−j}^c) − μ(H_{n−j+1}^c). We observe that

H_{n−j} = {x_{ind(1)}, . . ., x_{ind(n−j)}} and H_{n−j}^c = {x_{ind(n−j+1)}, . . ., x_{ind(n)}}
H_{n−j+1} = {x_{ind(1)}, . . ., x_{ind(n−j+1)}} and H_{n−j+1}^c = {x_{ind(n−j+2)}, . . ., x_{ind(n)}}

Thus while wj is the difference between the measure of the j elements with the highest values and the j − 1 elements with the highest values, ŵ_{n−j+1} is the difference between the measure of the j elements with the lowest values and the j − 1 elements with the lowest values.

We note that if μ is the additive probabilistic type of measure, then ŵj = μ({x_{ind(j)}}) = wj.


8 Extension to fuzzy sets

As defined, a measure μ on X associates with a subset A of X a value μ(A) ∈ [0, 1]. In our concern with modeling uncertain information, the semantics associated with μ(A) is the anticipation of finding the value of V in A. When μ is a probability measure, μ(A) is the probability of A. When μ is a possibility measure, μ(A) is the possibility of A. A natural question that arises is the determination of μ(B) when B is a fuzzy subset of X. Here we describe some approaches to accomplishing this task.

The first approach is based on the idea of level sets. If B is a fuzzy subset of X we define the α-level set of B as Bα = {x | B(x) ≥ α}. It is the crisp subset of elements with membership grade at least α. This approach uses the representation of a fuzzy subset in terms of these level sets. In particular, if B is a fuzzy set we express B = ∪_α {α/Bα}. Using Zadeh's extension principle we get

μ(B) = ∪_α {α/μ(Bα)}

Since Bα is a crisp set, the measure of Bα is well defined. Here we get μ(B) defined as a fuzzy subset of [0, 1]. We note that this approach is very much in the spirit of Yager's (1984) method for defining the probability of fuzzy sets.

We now shall describe a second approach, in the spirit of Zadeh's (1968) approach to defining the probability of a fuzzy subset. This approach (Wang and Klir 2009) uses the Choquet integral. Again assume μ is a measure defined on the space X. Let A be a crisp subset of X. We recall that associated with A is its characteristic or membership function MA, defined such that MA(x) = 1 if x ∈ A and MA(x) = 0 if x ∉ A. We note that we can view MA as a real function F: X → [0, 1] such that F(x) = MA(x). Let us now take the Choquet integral of this function F with respect to μ. Here Cμ(F) = ∫_0^1 μ(Fα)dα where Fα = {x | MA(x) ≥ α}. We note in this case Fα = A for α > 0. Thus we have Cμ(F) = ∫_0^1 μ(A)dα = μ(A). Thus the Choquet integral of the membership function of A with respect to μ is equal to μ(A). This then suggests that to obtain the measure of a fuzzy subset B of X, we define it as the Choquet integral of the membership function of B. Thus if MB: X → [0, 1] is the membership function of B, then μ(B) = Cμ(MB). Here we emphasize that MB(x) is not necessarily in {0, 1} as in the crisp case, but in [0, 1]. We see that μ(B) is the anticipation of finding the value of the variable V in B. It is a generalization of the idea of the expected value of B.

Consider the case where X is finite and μ is a probability measure; here μ({xj}) = pj is the probability of xj. Let B be a fuzzy set where B(xj) is the membership of xj in B. In this case Cμ(MB) = Σ_{j=1}^{n} (μ(Hj) − μ(Hj−1)) B(x_{ind(j)}). Here x_{ind(j)} is the element with the jth largest membership grade in B. Furthermore, Hj = {x_{ind(k)} | k = 1 to j}; it is the collection of the j elements with the largest membership grades in B. We see that μ(Hj) = Σ_{k=1}^{j} p_{ind(k)}. We see here that μ(Hj) − μ(Hj−1) = p_{ind(j)} and hence Cμ(MB) = Σ_{j=1}^{n} p_{ind(j)} · B(x_{ind(j)}) = Σ_{i=1}^{n} pi B(xi).


Thus the Choquet integral of the fuzzy subset B is exactly the formulation suggested by Zadeh (1968) for extending a probability measure to a fuzzy subset; it is the expected value of the membership function.
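This can be verified in a few lines (our sketch, with illustrative numbers): the measure of a fuzzy subset B under a probability measure comes out as the expected membership Σ_i pi B(xi), matching Zadeh's formulation.

```python
def choquet(F, mu):
    """Discrete Choquet integral (same construction as the earlier sketch)."""
    xs = sorted(F, key=F.get, reverse=True)
    out, prev = 0.0, 0.0
    for j in range(1, len(xs) + 1):
        cur = mu(frozenset(xs[:j]))
        out += (cur - prev) * F[xs[j - 1]]
        prev = cur
    return out

p = {"x1": 0.2, "x2": 0.5, "x3": 0.3}
prob = lambda A: sum(p[x] for x in A)
B = {"x1": 1.0, "x2": 0.4, "x3": 0.7}    # fuzzy subset: membership grades

print(choquet(B, prob))                  # ~0.61
print(sum(p[x] * B[x] for x in B))       # expected membership: ~0.61
```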

We observe that when X = [a, b], μ is a measure and B is a fuzzy subset of X, then

μ(B) = ∫_0^1 μ(Bα)dα    (I)

where Bα = {x | MB(x) ≥ α} is the α-level set. We see a connection between this and the first method of defining μ(B) as a fuzzy set,

μ1(B) = ∪_{α∈[0,1]} {α/μ(Bα)}    (II)

We see that (I) is essentially what we would obtain by applying to the fuzzy subset (II) the procedure suggested by Yager (1981) for getting the representative value of fuzzy sets.

Actually, the Sugeno integral also provides an alternative method for obtaining the measure of fuzzy sets. Again assume A is a crisp set and let MA be its membership function. Let x_{ind(j)} indicate the element with the jth largest membership grade in A. If we calculate Sμ(MA) we have

Sμ(MA) = Max_{j=1 to n}[μ(Hj) ∧ MA(x_{ind(j)})]

where Hj = {x_{ind(k)} | k = 1 to j}. Let j* be such that MA(x_{ind(j)}) = 1 for j ≤ j* and MA(x_{ind(j)}) = 0 for j > j*. Here then Sμ(MA) = Max_{j=1 to j*}[μ(Hj)]. Furthermore, because of the monotonicity of μ(Hj), we see Sμ(MA) = μ(Hj*). However, Hj* consists of all the elements where MA(xk) = 1; thus Hj* = A and hence Sμ(MA) = μ(A). Inspired by this, if B is a fuzzy subset, one approach to obtaining μ(B) is to let

μ(B) = Sμ(MB) = Max_{j=1 to n}[μ(Hj) ∧ B(x_{ind(j)})]

Here MB is the membership function of B, MB(x) = B(x), and x_{ind(j)} is the element with the jth largest membership grade in B.

Consider the use of this in the case where μ is a possibility measure with μ({xi}) = πi and μ(A) = Max_{xj∈A}[πj]. Assume now B is a fuzzy subset of X. In this case

μ(B) = Sμ(MB) = Max_{j=1 to n}[μ(Hj) ∧ B(x_{ind(j)})]

where Hj = {x_{ind(k)} | k = 1 to j}. It can be shown that in this case μ(B) = Max_{j=1 to n}[πj ∧ B(xj)]. This is the value suggested by Zadeh (1979) for finding the possibility of a fuzzy subset.

In the case when μ is a probability measure, the Sugeno integral gives an alternative characterization of the probability of a fuzzy set. Here again μ(B) = Max_{j=1 to n}[μ(Hj) ∧ B(x_{ind(j)})]. With Hj = {x_{ind(k)} | k = 1 to j} we have μ(Hj) = Σ_{k=1}^{j} p_{ind(k)}. Using this we get

μ(B) = Max_{j=1 to n}[B(x_{ind(j)}) ∧ Σ_{k=1}^{j} p_{ind(k)}]

To gain intuition about this form, suppose the indexing has been done in order of decreasing membership grades in B, so that x_{ind(j)} = xj; then we have

μ(B) = Max_{j=1 to n}[B(xj) ∧ Σ_{k=1}^{j} pk]

We observe that if B is a crisp set, so that B(xj) is binary, the above reduces to the usual definition of the probability of B, μ(B) = Σ_{xj∈B} pj.

9 Extension to measures on power sets

We now consider another extension of a measure. Assume μ is a measure on X, μ: 2^X → [0, 1]; here μ(Ø) = 0, μ(X) = 1 and for any A, B ⊆ X we have μ(B) ≤ μ(A) if B ⊆ A. We now consider the extension of μ to the power set of X. We shall denote this μP: 2^{2^X} → [0, 1]. Thus μP associates with a set of subsets of X a value in the unit interval. We must require μP to satisfy the basic properties of a measure:

μP(Ø) = 0, μP(2^X) = 1, and for any E and F ∈ 2^{2^X} we have μP(E) ≥ μP(F) if F ⊆ E.

We can now suggest the following method of getting an extension of μ that satisfies these conditions. Assume F ⊆ 2^X; it is a collection of subsets of X. We shall denote F = {A1, . . ., Aq} where each Aj ⊆ X. We now define

μP(F) = Max_{Aj∈F}[μ(Aj)]

Thus μP(F) is equal to the maximal measure of any subset in F. We see that if F = Ø, then μP(F) = Max_{Aj∈Ø}[μ(Aj)] = 0. If F = 2^X, then we see that X ∈ F, and since μ(X) = 1 we have μP(2^X) = 1. It is also clear that if F ⊆ E then

μP(F) = Max_{Aj∈F}[μ(Aj)] ≤ Max_{Aj∈E}[μ(Aj)] = μP(E)

One very notable feature of this extension is the following. Assume E is a single subset of X, E = {A} for A ⊆ X. Then here μP(E) = μ(A). Thus the singletons have the same measure as they have in the original measure.

Actually, we see here that μP is a possibility measure on the space 2^X. If μP({Aj}) = μ(Aj) = aj, then for any E ⊆ 2^X we have μP(E) = Max_{Aj∈E}[aj]. Here the singletons are the subsets of X.
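A minimal sketch (ours) of the power set extension μP(E) = Max_{Aj∈E}[μ(Aj)], here applied to a probability measure on a three-element space:

```python
def extend_to_powerset(mu):
    """Extension mu_P of a measure mu: mu_P(E) = max over A in E of mu(A),
    where E is a collection (iterable) of frozensets of X."""
    return lambda E: max((mu(A) for A in E), default=0.0)

p = {"x1": 0.2, "x2": 0.5, "x3": 0.3}
prob = lambda A: sum(p[x] for x in A)
mu_P = extend_to_powerset(prob)

E = [frozenset({"x1"}), frozenset({"x2", "x3"})]
print(mu_P(E))                      # max(0.2, 0.8) = 0.8
print(mu_P([frozenset({"x1"})]))    # singleton: equals mu({x1}) = 0.2
```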


Actually, we can provide a more general definition for the extension of μ to the power set. Let S be any t-conorm; using this we define μPS: 2^{2^X} → [0, 1] such that for any E = {A1, . . ., Aq}, where each Aj is a subset of X, we have

μPS(E) = S_{Aj∈E}[μ(Aj)]

We easily see that this satisfies all the required properties, μPS(Ø) = 0, μPS(2^X) = 1, and the monotonicity requirement. In addition, if E is a singleton, E = {Aj}, then μPS(E) = μ(Aj). We shall use the notation μP for the default case where S = Max. A very special property of this extension is that for any S and all E we have μP(E) ≤ μPS(E). Thus the Max t-conorm provides the minimal valued extension.

Assume E is a subset of 2^X; it is a collection of subsets of X. Let E* be the subset of E consisting of subsets in E that are non-dominated; that is, Aj ∈ E* if Aj ∈ E and there exists no subset Ak ∈ E such that Aj ⊂ Ak. We say that E* is the subset of maximal elements in E. Then, because of the monotonicity of μ (if A ⊂ B then μ(A) ≤ μ(B)), we have μP(E) = μP(E*). Thus μP(E) is equal to the maximum μ(Aj) for any Aj ∈ E*. It is the maximal measure of any element in the maximal set of E.

Let us look at μP for some special cases of μ. Here we let X = {x1, . . ., xn}. We first consider the case where μ is a possibility measure with μ({xj}) = πj. For any A ⊆ X we have μ(A) = Max_{xj∈A}[πj] = Maxj[πj ∧ A(xj)]. Consider now μP in this case. What is clear is that for any E = {A1, . . ., Aq} we have μP(E) = μ(∪_{k=1}^{q} Ak) = Max_{k=1}^{q}[Maxj[πj ∧ Ak(xj)]]. Thus it is the largest possibility of any xj that is in a subset Ak contained in E.

We now look at the situation where μ is a probability type measure. Here we let pj = μ({xj}), and for any Ak ⊆ X we have μ(Ak) = Σ_{j=1}^{n} pj · Ak(xj). Consider now the extension μP of this measure. Here for any E ⊆ 2^X we have μP(E) = Max_{Ak∈E}[Σ_{j=1}^{n} pj Ak(xj)]. Thus in this case μP(E) is the maximum probability of a subset contained in E. Actually, we can also use the maximal set E* to obtain μP(E), and then μP(E) = Max_{Ak∈E*}[Σ_{j=1}^{n} pj Ak(xj)].

j=1 pjAk(xj)].We observe that if instead of using the Max t-conorm we use the probabilistic sum,

S(a, b) = a + b − ab = 1 − (1 − a)(1 − b) then we get μPS (E) = 1 − ∏Ak⊂E(1 −

Prob(Ak)).Let us recall here that μP, which we have just previously defined, is a measure on

the power set of X, 2X. It takes a subset E of 2X and gives us a value in the unit interval.Let F be a function that associates a real number with elements in 2X, F : 2X → R.We can now use μP to obtain the Choquet integral of F with respect to μP. Here then

CμP(F) = ∫_R μP(Fα)dα

where Fα = {Aj | Aj ∈ 2^X and F(Aj) ≥ α}. Thus we can define the Choquet integral of F with respect to μP.

Consider now the case where X is finite and hence 2^X is finite. Let us denote 2^X = {A1, . . ., Aq}; it is the collection of all subsets of X.


We note that Ø ∈ 2^X and X ∈ 2^X. In this finite case we let ind(j) be the index of the Ai with the jth largest value for F(Ai). Furthermore, we let Gj = {A_{ind(k)} | k = 1 to j}; it is the subset of 2^X with the j largest values for F. We see that since Gj−1 ⊆ Gj, then μP(Gj−1) ≤ μP(Gj). Using this we can express CμP(F) in this finite case as

CμP(F) = Σ_{j=1}^{q} (μP(Gj) − μP(Gj−1)) F(A_{ind(j)})

As a result of the preceding, a very interesting possibility has now arisen. Let μ∗ be some other measure on X. Since μ∗: 2^X → [0, 1] is a function on 2^X, it is actually a function of the same type as the F used in the preceding Choquet integral. Thus we have just provided a mechanism that allows us to calculate the Choquet integral of the measure μ∗ with respect to μP, CμP(μ∗). Thus CμP(μ∗) = ∫_0^1 μP(μ∗α)dα, where μ∗α = {Aj | μ∗(Aj) ≥ α}. In the finite case we get

CμP(μ∗) = Σ_{j=1}^{q} (μP(Gj) − μP(Gj−1)) μ∗(A_{ind(j)})

where ind(j) is the index of the Ai with the jth largest value for μ∗(Ai) and Gj = {A_{ind(k)} | k = 1 to j}.

We further observe that since μP is the extension of the measure μ, we are essentially obtaining the value Cμ(μ∗). Since, as we previously indicated, the Choquet integral of a set, fuzzy or crisp, is the measure of the set, we have essentially extended this idea to obtain the measure of a measure. Thus if we start with a measure μ on X and μ∗ is another measure on X, then CμP(μ∗) gives the anticipation that the value of the variable V lies in μ∗ given that we know V is μ.

Again let F be a function that associates a real number with each element of 2^X, F: 2^X → R. We can, just as in the preceding, define the Sugeno integral of F with respect to μP. That is,

SμP(F) = Max_{j=1 to q}[μP(Gj) ∧ F(A_{ind(j)})]

Here again A_{ind(j)} is the subset of X, an element of 2^X, which has the jth largest value for F(Ai). In addition, Gj = {A_{ind(k)} | k = 1 to j}; it is the subset of 2^X with the j largest values for F.

In the special case where F is a measure μ∗ on X, then

SμP(μ∗) = Max_{j=1 to q}[μP(Gj) ∧ μ∗(A_{ind(j)})]

where again A_{ind(j)} is the subset of X with the jth largest value of μ∗.

In the preceding we defined the extension of a measure μ on X into a measure μP on 2^X. Thus μP: 2^{2^X} → [0, 1]; it associates with a subset of subsets of X a number in the unit interval. In particular, for any E ⊆ 2^X, μP(E) = Max_{A∈E}[μ(A)].


We now want to introduce the idea of the dual of μP. We recall that the dual μ̂P is defined such that μ̂P(E) = 1 − μP(E^c). We see that E^c = {Ak | Ak ∉ E}. Thus we have

μ̂P(E) = 1 − Max_{Ak∉E}[μ(Ak)]

We easily see that μ̂P is a measure. If E = Ø then X ∉ E, and since μ(X) = 1 we get μ̂P(Ø) = 0. In addition, if E = 2^X then E^c = Ø, and here μP(E^c) = 0 and thus μ̂P(2^X) = 1. Finally, the monotonicity of μ̂P is obvious, since as E increases, Max_{Ak∉E}[μ(Ak)] can't increase.

Thus we have the idea of the dual of μP. Here then μ̂P is just another measure on 2^X; it takes a collection of subsets of X into the unit interval.

10 Question answering

Let V be a variable whose value lies in the set X. Rather than knowing the value of V, we are uncertain about its value. Our point of departure has been the representation of our knowledge of the value of the variable V in terms of a measure μ on its domain X. We recall that under this representation, for any subset A of X, μ(A) indicates our anticipation that the value of V lies in the set A.

An important task that arises in many environments is the determination of whether V satisfies some condition. Consider the following simple example illustrating the situation and providing some intuition. Assume we know that John is in his twenties and we are interested in whether he is 25 or more years old. We see here that our answer is uncertain. It is possible that John is over 25 years old, but we are not sure. At a formal level we see that our knowledge about John's age, that he is in his twenties, has the nature of a possibility type measure. In addition, our question or criterion is expressed as a set B = {x | x ≥ 25}. Thus here we are faced with the problem of determining whether some uncertain variable lies in some set. We note that an affirmative answer to our question essentially requires that one of the possible values for V lies in B and that none of the possible values for V lie in not B.

In Yager (2011b), a general approach to handling these types of problems was suggested with the introduction of the measures of opportunity and assurance. Given a measure μ and a set A, the measure of opportunity that μ satisfies A is defined as

ωμ(A) = μ(A) ∨ μ̂(A)

Here we recall that μ̂ is the dual of μ, defined such that μ̂(A) = 1 − μ(A^c). Secondly, the measure of assurance that μ satisfies A is defined as

λμ(A) = μ(A) ∧ μ̂(A)

We see that since μ̂(A) = 1 − μ(A^c), the measure of assurance is essentially the anticipation that V is in A and also the anticipation that V is not in (not A). We see that this is what we require to affirmatively and conclusively say that A is true.
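The John example can be worked through in a few lines (our sketch; the age grid and names are illustrative). Knowledge "John is in his twenties" is a possibility measure with πx = 1 for x in the twenties; the question set is B = {x | x ≥ 25}:

```python
ages = range(20, 30)                      # "John is in his twenties"
pi = {a: 1.0 for a in ages}               # possibility distribution

X = set(ages)
poss = lambda A: max((pi[x] for x in A if x in pi), default=0.0)
dual = lambda A: 1.0 - poss(X - set(A))   # necessity, the dual of poss

B = {x for x in X if x >= 25}             # question: is John 25 or older?
opportunity = max(poss(B), dual(B))       # mu(B) v dual(B)
assurance   = min(poss(B), dual(B))       # mu(B) ^ dual(B)
print(opportunity, assurance)             # 1.0 0.0: possible, not assured
```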


We see that for our example μ(B) = 1 and also μ(B^c) = 1, but μ̂(B) = 1 − μ(B^c) = 0. Thus in this case λμ(B) = μ(B) ∧ μ̂(B) = 0 while ωμ(B) = μ(B) ∨ μ̂(B) = 1. Thus, as we indicated, it is possible that V satisfies B but we have no assurance.

While we shall not pursue it in detail, our earlier developed ability to extend a measure μ on X to the measure μP on 2^X allows us to extend the concepts of opportunity and assurance to situations in which our questions involve objects other than sets. Assume V is a variable whose domain is X and our knowledge of its value is expressed by some measure μ, V is μ. Let μ∗ be some other measure on X. Assume our question is: does V satisfy μ∗? We can denote this V is μ∗? We can now calculate ωμ(μ∗) = μP(μ∗) ∨ μ̂P(μ∗) and λμ(μ∗) = μP(μ∗) ∧ μ̂P(μ∗).

11 Conclusion

We introduced the use of monotonic set measures for the representation of uncertain information. We looked at some important examples of measure-based uncertainty, specifically probability, possibility and necessity. Other types of uncertainty, such as cardinality-based and quasi-additive measures, were also investigated. We considered the problem of determining the representative value of a variable whose uncertain value is formalized using a monotonic set measure. We noted the central role that averaging, and particularly weighted averaging, operations play in obtaining these representative values. We investigated the use of the Choquet and Sugeno integrals for obtaining these required averages. We suggested ways of extending a measure defined on a set to the case of fuzzy sets and the power set of the original set. We briefly considered the problem of question answering under uncertain knowledge.

Acknowledgments This work has been supported in part by a Multidisciplinary University Research Initiative (MURI) grant (Number W911NF-09-1-0392) for "Unified Research on Network-based Hard/Soft Information Fusion", issued by the US Army Research Office (ARO) under the program management of Dr. John Lavery. This work has also been supported by an ONR grant for "Human Behavior Modeling Using Fuzzy and Soft Technologies", award number N000141010121. We gratefully appreciate this support.

References

Beliakov, G., Pradera, A., & Calvo, T. (2007). Aggregation functions: A guide for practitioners. Heidelberg: Springer.
Bullen, P. S. (2003). Handbook of means and their inequalities. Dordrecht: Kluwer.
Dubois, D., & Prade, H. (1987). Necessity measures and the resolution principle. IEEE Transactions on Systems, Man and Cybernetics, 17, 474–478.
Dubois, D., & Prade, H. (1988). Possibility theory: An approach to computerized processing of uncertainty. New York: Plenum Press.
Grabisch, M. (1997). k-order additive discrete fuzzy measures and their representation. Fuzzy Sets and Systems, 92, 167–189.
Klir, G. J. (2006). Uncertainty and information. New York: Wiley.
Liu, L., & Yager, R. R. (2008). Classic works of the Dempster-Shafer theory of belief functions: An introduction. In R. R. Yager & L. Liu (Eds.), Classic works of the Dempster-Shafer theory of belief functions (pp. 1–34). Heidelberg: Springer.
Murofushi, T., & Sugeno, M. (2000). Fuzzy measures and fuzzy integrals. In M. Grabisch, T. Murofushi, & M. Sugeno (Eds.), Fuzzy measures and integrals (pp. 3–41). Heidelberg: Physica-Verlag.
Pedrycz, W., Skowron, A., & Kreinovich, V. (2008). Handbook of granular computing. New York: Wiley.
Shafer, G. (1976). A mathematical theory of evidence. Princeton, NJ: Princeton University Press.
Smets, P. (1988). Belief functions. In P. Smets, E. H. Mamdani, D. Dubois, & H. Prade (Eds.), Non-standard logics for automated reasoning (pp. 253–277). London: Academic Press.
Sugeno, M. (1977). Fuzzy measures and fuzzy integrals: A survey. In M. M. Gupta, G. N. Saridis, & B. R. Gaines (Eds.), Fuzzy automata and decision processes (pp. 89–102). Amsterdam: North-Holland.
Wang, Z., & Klir, G. J. (2009). Generalized measure theory. New York: Springer.
Wang, Z., Yang, R., & Leung, K.-S. (2010). Nonlinear integrals and their applications in data mining. Singapore: World Scientific.
Yager, R. R. (1981). A procedure for ordering fuzzy subsets of the unit interval. Information Sciences, 24, 143–161.
Yager, R. R. (1984). A representation of the probability of a fuzzy subset. Fuzzy Sets and Systems, 13, 273–283.
Yager, R. R. (1988). On ordered weighted averaging aggregation operators in multi-criteria decision making. IEEE Transactions on Systems, Man and Cybernetics, 18, 183–190.
Yager, R. R. (2002). On the cardinality index and attitudinal character of fuzzy measures. International Journal of General Systems, 31, 303–329.
Yager, R. R. (2011a). A measure based approach to the fusion of possibilistic and probabilistic uncertainty. Fuzzy Optimization and Decision Making, 10, 91–113.
Yager, R. R. (2011b). Measures of assurance and opportunity in modeling uncertain information. Technical Report #MII-3106, Machine Intelligence Institute, Iona College, New Rochelle, NY.
Zadeh, L. A. (1968). Probability measures of fuzzy events. Journal of Mathematical Analysis and Applications, 10, 421–427.
Zadeh, L. A. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1, 3–28.
Zadeh, L. A. (1979). Fuzzy sets and information granularity. In M. M. Gupta, R. K. Ragade, & R. R. Yager (Eds.), Advances in fuzzy set theory and applications (pp. 3–18). Amsterdam: North-Holland.
