Research Article
Lossless Join Decomposition for Extended Possibility-Based Fuzzy Relational Databases

Julie Yu-Chih Liu 1,2

1 Department of Information Management, Yuan Ze University, Chung Li, Taoyuan 320, Taiwan
2 Innovation Center for Big Data & Digital Convergence, Yuan Ze University, Chung Li, Taoyuan 320, Taiwan

Correspondence should be addressed to Julie Yu-Chih Liu; imyuchih@saturn.yzu.edu.tw

Received 16 January 2014; Revised 5 August 2014; Accepted 7 August 2014; Published 21 August 2014

Academic Editor: Hui-Shen Shen

Copyright © 2014 Julie Yu-Chih Liu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Functional dependency is the basis of database normalization. Various types of fuzzy functional dependencies have been proposed for fuzzy relational databases and applied to the process of database normalization. However, the problem of achieving lossless join decomposition arises when fuzzy functional dependencies are employed for database normalization in extended possibility-based fuzzy data models. To resolve the problem, this study defines a fuzzy functional dependency based on a notion of approximate equality for extended possibility-based fuzzy relational databases. Examples show that the notion is more applicable than other similarity concepts to research related to the extended possibility-based data model. We provide a decomposition method that uses the proposed fuzzy functional dependency for database normalization and prove the lossless join property of the decomposition method.

1. Introduction

Database normalization plays a crucial role in the design theory of relational databases: it avoids insertion, deletion, and update anomalies in a database. Normalization involves the decomposition of a relation schema (table) into several smaller ones. The essential requirement of the decomposition is the lossless join property, which ensures that the original relation can be obtained from its decomposed results via combination operations [1]. Several methods have been proposed to design normalized relation schemes based on the keys and functional dependencies of a relation to achieve lossless join decomposition [2, 3]. This design theory has been applied to fuzzy databases, in which uncertain and imprecise information can be represented and manipulated. Fuzzy databases extend classical databases on the basis of fuzzy sets and possibility theory [4], and they follow either the resemblance-based fuzzy model [5, 6] or the possibility-based fuzzy model [7, 8]. In the context of fuzzy databases, the fuzzy functional dependency (FFD) has emerged as an extension of the classical functional dependency that represents functional relationships between classes/attributes of objects in fuzzy database models. Various FFD definitions have been proposed in some fuzzy data models for database normalization [9, 10].

However, very few studies discuss the lossless join property for normalization in possibility-based fuzzy databases. Achieving lossless join decomposition by using FFDs is more difficult for possibility-based fuzzy databases than for resemblance-based fuzzy databases, especially for the extended possibility-based fuzzy database. The extended possibility-based fuzzy database [7] is an extension of the possibility-based fuzzy database [8] that includes a resemblance-based fuzzy model [6]. In this fuzzy database, attribute values can be possibility distributions of the attribute on its domain. Additionally, the elements in a domain have some degree of resemblance. Previous work has applied FFDs to decomposition for the fuzzy database [10–12]. Informally, these FFDs are based on a certain degree of similarity between two attribute values. Namely, two tuples that are similar but not identical might be regarded as redundant. Applying similarity-based FFDs to relation decomposition makes lossless join decomposition difficult on two facets: (i) redundancy removal: how to eliminate redundant tuples that are not identical from the decomposed results so that the results can later be used to reproduce the original relation without losing information, and (ii) tuple merging: how to combine two relations by merging tuples whose attribute values are similar but not identical.

Hindawi Publishing Corporation, Journal of Applied Mathematics, Volume 2014, Article ID 842680, 9 pages, http://dx.doi.org/10.1155/2014/842680

Complicating this problem further, most similarity measures [7, 13, 14] for values in the form of possibility distributions are not transitive. When tuple redundancy is determined by nontransitive similarity measures, the result of eliminating redundant tuples from a decomposed relation might not be unique (that is, it is order sensitive¹). Inconsistent redundancy removal not only leads to unstable results of data integration, as described in [15], but also makes decomposition results lossy. When the decomposition result of a relation is not unique, combining the result has several possible outcomes, at least one of which differs from the original relation. Accordingly, the decomposition inevitably violates the lossless join property. Moreover, nonunique results also occur in relation combination when the attribute values to be joined/merged have a nontransitive similarity relation.

To avoid nontransitivity, Chen et al. provided an FFD with an embedded classical FD [11, 16], where redundancy removal is restricted to duplicate tuples. But this restriction draws the normalization process back to the traditional operations on crisp data. To obtain a transitive relationship among tuples, some research applied the max-min transitive closure to the relationship matrix of similarity degrees between tuples [17]. The max-min transitive closure of a relationship matrix must be a matrix with max-min transitivity [18]. By referring to the transitive closure of the relationship matrix, the tuples whose similarity is higher than a given threshold can be grouped into disjoint sets. The tuples in the same set are regarded as redundant. However, this approach cannot determine the similarity of two tuples by examining these two alone, and the similarity changes when other tuples are inserted or deleted. This nondeterministic and dynamic characteristic is not suitable for database practice.

To our knowledge, very few studies provide a complete guideline for performing normalization that ensures lossless join decomposition in fuzzy databases. The purpose of this study is therefore to fill this gap. This study first proposes a notion of approximate equality, which represents a transitive equivalence relation among tuples. It then provides new definitions of FFD and lossless join decomposition based on approximate equality for the fuzzy databases. Both functional dependencies and lossless join decomposition in a traditional database are special cases of this proposal. Examples show that the notion is more applicable than other similarity concepts to research related to the fuzzy databases. This work also provides a method of achieving lossless join decomposition for the fuzzy databases.

The remainder of this paper is organized as follows. Section 2 gives a brief introduction to database normalization and fuzzy databases and surveys the similarity measures related to the fuzzy database. Section 3 demonstrates the problem of using nontransitive similarity measures for determining tuple redundancy and provides a notion of approximate equality to address it. The FFD is then defined based on approximate equality in Section 4, where the lossless join decomposition is also proposed for the fuzzy databases and its property is proven. Section 5 draws the conclusion of this paper.

2. Preliminaries

This section first briefly reviews the essential operations for lossless join decomposition in traditional databases. It then introduces the fuzzy databases considered in this work and the similarity measures for values in the form of possibility distributions.

2.1. Essential Operations for Lossless Join Decomposition. In a traditional relational database, a row is called a tuple, a column header is called an attribute, and the table is called a relation. Given an $m$-ary relation schema $R(A_1, A_2, \ldots, A_m)$, an instance of $R$, denoted by $r(R)$, is the set of all tuples in $R$. Let $\mathbf{A}$ denote the set of attributes $\{A_1, A_2, \ldots, A_m\}$. A functional dependency (FD) $A_i \to A_j$ existing in $R(\mathbf{A})$ means that tuples having the same value on attribute $A_i$ must be identical on $A_j$, where $A_i, A_j \in \mathbf{A}$. Two operations are related to lossless join decomposition: projection and natural join. The projection operation generates a result by selecting certain attributes from a given relation and removing redundant tuples. Let $\Theta$ denote a set of attributes in $R(\mathbf{A})$, that is, $\Theta \subseteq \mathbf{A}$. The result of projecting $R$ over the attributes $\Theta$ is $\Pi_{\Theta}(R) = \{t[\Theta] \mid t \in r(R)\}$, where $t[\Theta]$ represents the composite of the values on $\Theta$ in tuple $t$. The natural join (denoted by $*$) of $R'(X, Y)$ and $R''(Y, Z)$ is obtained by removing the duplicate attribute from the result of the equal join on the joined attribute $Y$ and is denoted as shown below:

$$R' * R'' = \{(t'[X], t'[Y], t''[Z]) \mid t'[Y] = t''[Y],\ t' \in r(R'),\ t'' \in r(R'')\}. \tag{1}$$

The natural join and projection operations are used to combine and decompose relations, respectively.

Formally, a decomposition $\{R_1, R_2, \ldots, R_k\}$ of $R$ is lossless join if the equation $r(R) = \Pi_{R_1}(R) * \Pi_{R_2}(R) * \cdots * \Pi_{R_k}(R)$ holds. In other words, lossless join decomposition ensures that the combination of the decomposed results of a relation via the natural join operation has neither spurious tuples nor missing tuples with respect to the relation [1].

For example, given a relation $R$, the results of $R' = \Pi_{A_1 A_2}(R)$ and $R'' = \Pi_{A_2 A_3}(R)$ are shown in Table 1. In this case, the decomposition $\{R', R''\}$ of $R$ has the lossless join property because the natural join result of $R'$ and $R''$ is exactly the same as $R$ (as shown in Table 2).
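The check described in this example can be sketched in a few lines of Python. This is only an illustrative sketch for crisp (non-fuzzy) data: the helper names `project` and `natural_join` and the set-of-tuples representation are assumptions made here, not part of the paper, and the relation is the three-tuple relation $R$ of Table 1.

```python
# Relation R of Table 1 as a set of tuples over the attributes (A1, A2, A3).
ATTRS = ("A1", "A2", "A3")
R = {("x", "p", "m"), ("y", "p", "m"), ("z", "q", "n")}

def project(rel, attrs, onto):
    """Projection: keep the columns in `onto`; the set drops duplicate tuples."""
    idx = [attrs.index(a) for a in onto]
    return {tuple(t[i] for i in idx) for t in rel}

def natural_join(r1, attrs1, r2, attrs2):
    """Natural join: equal join on the shared attributes, shared columns kept once."""
    shared = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in shared):
                out.add(t1 + tuple(t2[attrs2.index(a)] for a in extra))
    return out, tuple(attrs1) + tuple(extra)

R1 = project(R, ATTRS, ("A1", "A2"))  # R'
R2 = project(R, ATTRS, ("A2", "A3"))  # R''
joined, joined_attrs = natural_join(R1, ("A1", "A2"), R2, ("A2", "A3"))
is_lossless = joined == R  # lossless join: the join reproduces R exactly
print(sorted(R2), is_lossless)
```

Representing a relation as a Python set makes duplicate elimination in the projection implicit, which mirrors the role of redundancy removal in the crisp case.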

Table 1

          R                 R'           R''
      A1  A2  A3         A1  A2       A2  A3
t1    x   p   m          x   p        p   m
t2    y   p   m          y   p        q   n
t3    z   q   n          z   q

Table 2

      Equal join of R' and R'' on A2         R' * R''
      A1  A2  A2  A3                       A1  A2  A3
t1    x   p   p   m                        x   p   m
t2    y   p   p   m                        y   p   m
t3    z   q   q   n                        z   q   n

2.2. The Fuzzy Databases. In the last two decades, fuzzy concepts have been incorporated into traditional databases [5, 8, 19] and applied to measure the relation between data [20–22]. Fuzzy databases enable dealing with imprecision and uncertainty in the real world based on the theory of fuzzy sets and possibility distribution theory. Possibility-based fuzzy theory has been widely applied in environmental management, such as flood-diversion planning [23], water resources management [24], and air quality management [25]. This work considers the extended possibility-based databases proposed by Chen et al. [7] because they can capture both the possibility-based fuzzy model and the resemblance-based fuzzy concept. This fuzzy database has drawn much research attention on semantic measures, information processing, update operations, and UML class diagrams [20, 26, 27]. The data model of the fuzzy databases is a hybrid of the possibility-based data model in [8] and the resemblance-based data model in [6]. The possibility-based model derives from Zadeh's fuzzy theory. In the theory [4], a fuzzy set $F$ on a universe of discourse $U$ is described by $\{\mu_F(u)/u \mid u \in U\}$, where $\mu_F : U \to [0, 1]$ is a membership function for the fuzzy set $F$ and $\mu_F(u)$ denotes the degree of membership of $u$ in $F$. In a possibility-based database [8], the value of an attribute $A$ on a domain $D$ is a possibility distribution $\pi_A = \{\pi_A(u)/u \mid u \in D\}$, where $\pi_A(u)$ denotes the possibility that $u$ is the actual value of $A$. For example, $D = \{u_1, u_2, u_3\}$ and $\pi_A = \{0.8/u_1, 0.5/u_2, 0.6/u_3\}$. An example of applying the possibility-based fuzzy theory² in the real world is shown below. Consider that the domain of the attribute "eye color" is {black, brown, blue, green} and that a possibility distribution is given below:

$$\text{Asia-Color} = \{1.0/\text{black},\ 0.8/\text{brown},\ 0.3/\text{blue},\ 0.1/\text{green}\}. \tag{2}$$

Suppose that John's eye color is an "Asia color." Then, according to the interpretation of possibility-based fuzzy theory, one concludes that the possibility of John's eye color being blue is 0.3.

In the extended possibility-based database, attribute values are represented by possibility distributions of an attribute on its domain, and each domain is associated with a similarity relation on its elements. Formally, an $m$-ary relation instance $r$ on a schema $R(A_1, A_2, \ldots, A_m)$ in the fuzzy database is a subset of the Cartesian product $\Phi(A_1) \times \Phi(A_2) \times \cdots \times \Phi(A_m)$, where $\Phi(A_i)$ represents the set of all possibility distributions of attribute $A_i$ on its domain. For a domain $D_i$, a proximity relation is given to describe the resemblance between the domain elements in $D_i$. A proximity relation is a mapping $s_i : D_i \times D_i \to [0, 1]$ with reflexivity and symmetry, that is, $s_i(u, u) = 1$ and $s_i(u, v) = s_i(v, u)$. Because a proximity relation need not be transitive, the elements in a domain cannot directly be partitioned into disjoint equivalence classes by a threshold cut on the proximity relation.

To acquire equivalence classes from a proximity relation on a domain, Shenoi et al. [18] proposed the $\alpha$-proximate relation. Two elements $u, v \in D$ are $\alpha$-proximate (denoted by $u S^{+}_{\alpha} v$) if $s(u, v) > \alpha$ or there exists a sequence $w_1, w_2, \ldots, w_r \in D$ such that $\min\{s(u, w_1), s(w_1, w_2), \ldots, s(w_{r-1}, w_r), s(w_r, v)\} > \alpha$. Given a proximity relation $s$ and a threshold $\alpha$ for domain $D$, the domain can be partitioned into disjoint subsets (called $\alpha$-proximate equivalence classes) such that the elements in a partition are $\alpha$-proximate. The equivalence classes are regarded as basic concepts for the methods reviewed or proposed hereinafter.
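The partition into $\alpha$-proximate equivalence classes amounts to finding connected components of the graph that links two elements whenever their proximity exceeds $\alpha$. The following Python sketch illustrates this with a small union-find structure. The function name, the domain, and every proximity value below are hypothetical and chosen only for illustration; the strict comparison follows the wording of the text.

```python
def alpha_proximate_classes(domain, s, alpha):
    """Partition `domain` into alpha-proximate equivalence classes.

    `s(u, v)` is a reflexive, symmetric proximity in [0, 1]. Two elements end
    up in the same class when a chain of elements links them and every
    consecutive proximity along the chain exceeds `alpha`.
    """
    parent = {u: u for u in domain}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    items = list(domain)
    for i, u in enumerate(items):
        for v in items[i + 1:]:
            if s(u, v) > alpha:
                parent[find(u)] = find(v)  # merge the two classes

    classes = {}
    for u in items:
        classes.setdefault(find(u), []).append(u)
    return sorted(sorted(c) for c in classes.values())

# Hypothetical proximity values (not taken from the paper).
PROX = {
    frozenset(("excellent", "good")): 0.9,
    frozenset(("good", "fair")): 0.85,
    frozenset(("excellent", "fair")): 0.5,
}

def s(u, v):
    return 1.0 if u == v else PROX.get(frozenset((u, v)), 0.2)

classes = alpha_proximate_classes(["excellent", "good", "fair", "poor"], s, alpha=0.8)
print(classes)
```

In this toy domain, "excellent" and "fair" have a direct proximity of only 0.5, yet they share a class because the chain through "good" stays above the threshold.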

By extending the traditional functional dependency, research has proposed a variety of fuzzy functional dependencies (FFDs) for fuzzy databases [14, 21, 28]. The FFDs are determined by the degree of similarity of attribute values rather than by their identity. Several similarity measures of attribute values have been proposed for the extended possibility-based frameworks [7, 16, 20, 29]. Most of them provide estimates within the interval [0, 1]. The similarity measures are briefly restated hereinafter, in which $s$ and $\alpha \in [0, 1]$ denote, respectively, the proximity relation and a threshold defined on a given domain $D = \{u_1, u_2, \ldots, u_n\}$, and $\pi_A$ and $\pi_B$ represent two possibility distributions on $D$. The degree of closeness between $\pi_A$ and $\pi_B$, denoted by $\xi_1(\pi_A, \pi_B)$, is defined as follows [29]:

$$\xi_1(\pi_A, \pi_B) = \begin{cases} 0, & \min_{u_i, u_j \in D'} s(u_i, u_j) < \alpha, \\ \min_{u_i \in D} \left(1 - \left|\pi_A(u_i) - \pi_B(u_i)\right|\right), & \text{otherwise}, \end{cases} \tag{3}$$

where $D' = \{u_i \in D : \pi_A(u_i) > 0\} \cup \{u_i \in D : \pi_B(u_i) > 0\}$.

The measure $\xi_1$ may give a low degree of similarity for two values that are very similar to each other; for example, $\xi_1(\{0.9/\text{excellent}, 1/\text{good}\}, \{1/\text{good}\}) = 0.1$. To prevent such counterintuitive estimates of $\xi_1$, Chen et al. defined the possibility that $\pi_A = \pi_B$ is true as shown below [7] (here $\wedge$ denotes minimum):

$$\xi_2(\pi_A, \pi_B) = \sup_{u_i, u_j \in D,\ s(u_i, u_j) \ge \alpha} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right). \tag{4}$$
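Measures (3) and (4) can be compared on the example just given. The Python sketch below is a minimal illustration, not the authors' implementation: possibility distributions are represented as dictionaries, and the proximity value 0.9 between "excellent" and "good" and the threshold 0.8 are assumptions made here so that the first case of (3) does not apply.

```python
def xi1(pa, pb, domain, s, alpha):
    """Degree of closeness, equation (3)."""
    support = [u for u in domain if pa.get(u, 0.0) > 0 or pb.get(u, 0.0) > 0]
    # First case of (3): some pair of supported elements is not similar enough.
    if min((s(u, v) for u in support for v in support), default=1.0) < alpha:
        return 0.0
    return min(1 - abs(pa.get(u, 0.0) - pb.get(u, 0.0)) for u in domain)

def xi2(pa, pb, domain, s, alpha):
    """Possibility that the two values are equal, equation (4)."""
    return max(
        (min(pa.get(u, 0.0), pb.get(v, 0.0))
         for u in domain for v in domain if s(u, v) >= alpha),
        default=0.0,
    )

DOMAIN = ["excellent", "good"]
ALPHA = 0.8  # assumed threshold

def s(u, v):
    # Assumed proximity: the two distinct elements resemble each other at 0.9.
    return 1.0 if u == v else 0.9

pi_a = {"excellent": 0.9, "good": 1.0}
pi_b = {"good": 1.0}
closeness = xi1(pi_a, pi_b, DOMAIN, s, ALPHA)    # about 0.1, as in the text
possibility = xi2(pi_a, pi_b, DOMAIN, s, ALPHA)  # 1.0
print(round(closeness, 2), possibility)
```

Under these assumptions, the two values that $\xi_1$ rates at about 0.1 are rated 1 by $\xi_2$, which is the counterintuitive gap that motivated (4).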

This assessment is widely adopted in the extended possibility-based databases and is adoptable for applications with subnormal distributions (i.e., $\xi_2(\pi_A, \pi_A) < 1$; see [4] for details). For normal distributions, Chen et al. [16] included the identity relation (denoted by $=_{\mathrm{id}}$) into (4) as follows:

$$\xi_3(\pi_A, \pi_B) = \begin{cases} 1, & \pi_A =_{\mathrm{id}} \pi_B, \\ \sup_{u_i, u_j \in D,\ s(u_i, u_j) \ge \alpha} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right), & \text{otherwise}. \end{cases} \tag{5}$$

Ma et al. defined the similarity measure from the perspective of the semantic closeness between two attribute values [20], as shown below:

$$\xi_4(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \tag{6}$$

where $\delta$ denotes a semantic inclusion degree:

$$\delta(\pi_A, \pi_B) = \frac{\sum_{u_i, u_j \in D,\ s(u_i, u_j) \ge \alpha} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right)}{\sum_{j=1}^{n} \pi_B(u_j)}. \tag{7}$$

The notion of $\xi_4$ may violate the convention that the similarity degree of two values lies within [0, 1]. For example, $\xi_4(\pi_A, \pi_B) = 1.52$ for $\pi_A = \{0.9/\text{excellence}, 0.8/\text{good}\}$ and $\pi_B = \{0.7/\text{excellence}, 0.6/\text{good}\}$ when the similarity of "excellence" and "good" reaches the given threshold, that is, $s(\text{excellence}, \text{good}) \ge \alpha$. It is difficult to set a proper threshold for estimates that fall outside [0, 1] and have an unpredictable upper bound.
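The out-of-range estimate can be reproduced directly from (6) and (7). In the Python sketch below, the dictionary representation, the function names, and the concrete proximity value 0.9 with threshold 0.8 are illustrative assumptions; the text only requires that $s(\text{excellence}, \text{good}) \ge \alpha$.

```python
def delta(pa, pb, domain, s, alpha):
    """Semantic inclusion degree, equation (7)."""
    numerator = sum(
        min(pa.get(u, 0.0), pb.get(v, 0.0))
        for u in domain for v in domain if s(u, v) >= alpha
    )
    return numerator / sum(pb.get(u, 0.0) for u in domain)

def xi4(pa, pb, domain, s, alpha):
    """Semantic closeness, equation (6): the smaller of the two inclusion degrees."""
    return min(delta(pa, pb, domain, s, alpha), delta(pb, pa, domain, s, alpha))

DOMAIN = ["excellence", "good"]
ALPHA = 0.8  # assumed threshold

def s(u, v):
    # Assumed proximity that reaches the threshold for the two distinct elements.
    return 1.0 if u == v else 0.9

pi_a = {"excellence": 0.9, "good": 0.8}
pi_b = {"excellence": 0.7, "good": 0.6}
closeness = xi4(pi_a, pi_b, DOMAIN, s, ALPHA)
print(closeness)  # exceeds 1: 2.6/1.7, i.e. 1.52 when truncated to two decimals
```

The numerator of (7) sums over every pair of similar elements, so each possibility can be counted more than once, while the denominator counts each element once. This is why the ratio can exceed 1.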

Liu et al. [13] extended the semantic equivalence to ensure that the result of the similarity measure lies within [0, 1]. The measurement adjusts the possibility distributions of values based on the $\alpha$-proximate equivalence classes of the domain before measuring their similarity. Let $C = \{C_1, C_2, \ldots, C_r\}$ be the $\alpha$-proximate equivalence classes of domain $D$. The adjusted value $\hat{\pi}_{\bullet}$ of a possibility distribution $\pi_{\bullet}$ is defined as follows:

$$\hat{\pi}_{\bullet} = \bigcup_{k=1}^{r} \left\{ u_{\bullet k}/u' \mid u' \in C_k \right\}, \tag{8}$$

where $u_{\bullet k} = \sum_{j=1,\ u_j \in C_k}^{|C_k|} \pi_{\bullet}(u_j) / |\bar{C}_k|$ and $\bar{C}_k = \{u_j \in C_k : \pi_{\bullet}(u_j) > 0\}$, $k = 1, \ldots, r$. Then

$$\xi_5(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \quad \text{where } \delta(\pi_A, \pi_B) = \frac{\sum_{i=1}^{n} \left(\hat{\pi}_A(u_i) \wedge \hat{\pi}_B(u_i)\right)}{\sum_{j=1}^{n} \hat{\pi}_B(u_j)}. \tag{9}$$

Although the methods mentioned above differ from each other in measuring the similarity of attribute values, most of them measure the similarity of tuples in the same way: they adopt the minimum of the similarities of each pair of attribute values. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the resemblance of tuples $t$ and $t'$, denoted by $\eta(t, t')$, is given by

$$\eta(t, t') = \min_{i=1,\ldots,m} \xi_{\bullet}(\pi_{A_i}, \pi'_{A_i}), \tag{10}$$

where $\xi_{\bullet}$ could be any of $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$, or $\xi_5$ in (3)–(9). Tuples $t$ and $t'$ are redundant to each other if $\eta(t, t') \ge \alpha$, where $\alpha$ is a given threshold. The similarity measure of tuples has been applied to extract representative tuples for reducing information redundancy [17].

Fuzzy functional dependency (FFD) is a concept derived from the traditional FD. Both FFD and FD have several applications in databases, for example, redundancy elimination [30], missing data prediction, fuzzy data compression [17, 31], and lossless join decomposition [10, 28, 32]. In the literature, various FFDs are defined for different fuzzy data models. For some fuzzy data representations, such as the similarity-based fuzzy data model [33], FFDs are defined based on the equivalence classes of tuples. In the extended possibility-based databases, the definitions of FFD also vary [10, 14, 34, 35]. One example among the FFD definitions in the literature is listed below.

Definition 1 (see [10], fuzzy functional dependency). Let $X \sim> Y$ denote that attribute $Y$ is fuzzy functionally dependent on attribute $X$ in a relation $R$. The FFD $X \sim> Y$ holds in the instance $r(R)$ if and only if $\eta(t[X], t'[X]) \le \eta(t[Y], t'[Y])$ for every $t, t' \in r(R)$.

This example helps in understanding the problem of applying FFDs to relation decomposition in the fuzzy databases, which is illustrated in Section 3.
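Definition 1 can be checked mechanically on a small instance. The following Python sketch is illustrative only: the function `holds_ffd`, the three-tuple instance, and the stand-in similarity `overlap` (equation (4) under the assumption that only identical domain elements reach the threshold) are all assumptions made here. With single attributes on each side, the tuple resemblance $\eta$ reduces to the value similarity.

```python
from itertools import product

def overlap(pa, pb):
    """Stand-in value similarity: (4) when only identical elements are similar."""
    return max((min(p, pb.get(u, 0.0)) for u, p in pa.items()), default=0.0)

def holds_ffd(rows, x, y, xi=overlap):
    """Definition 1: similarity on x never exceeds similarity on y, for every pair."""
    return all(
        xi(t[x], u[x]) <= xi(t[y], u[y])
        for t, u in product(rows, repeat=2)
    )

# Hypothetical instance: each attribute value is a possibility distribution.
rows = [
    {"X": {"a": 1.0}, "Y": {"p": 1.0}},
    {"X": {"a": 1.0, "b": 0.5}, "Y": {"p": 1.0, "q": 0.3}},
    {"X": {"b": 1.0}, "Y": {"q": 1.0}},
]

x_determines_y = holds_ffd(rows, "X", "Y")
y_determines_x = holds_ffd(rows, "Y", "X")
print(x_determines_y, y_determines_x)
```

In this instance the second and third tuples are more similar on X (0.5) than on Y (0.3), so the dependency from X to Y fails while the one from Y to X holds.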

3. Redundancy Removal and Tuple Merging

Several factors determine whether a relation decomposition possesses the lossless join property: the ways to decompose a relation, to remove redundant tuples, and to combine the decomposed results. Redundancy removal eliminates redundant tuples. If the similarity measure used to determine tuple redundancy is not transitive, the result of redundancy removal may not be unique. An example of nontransitivity is that tuples $t$ and $t'$ are redundant to each other, and $t'$ and $t''$ are redundant as well, but $t$ and $t''$ are not redundant. In this case, the result of redundancy removal is $\{t, t''\}$ if $t'$ is deleted first, which differs from the one-tuple result (either $t$ or $t''$) obtained when a tuple other than $t'$ is deleted first. Nontransitivity thus makes the result of redundancy removal order sensitive and hinders lossless join decomposition.

Nevertheless, most well-defined similarity measures [7, 10, 20, 29] for values in the form of possibility distributions are reflexive and symmetric but not transitive. For example, consider adopting (4) to measure the similarity of tuples when distinct domain elements have proximity below $\alpha$. Given three values $\pi_A = \{0.6/u_1, 1.0/u_2, 0.6/u_3\}$, $\pi'_A = \{1.0/u_1, 1.0/u_2, 0.6/u_3\}$, and $\pi''_A = \{1.0/u_1, 0.6/u_2, 0.6/u_3\}$, we have $\xi_2(\pi_A, \pi'_A) = 1$, $\xi_2(\pi'_A, \pi''_A) = 1$, and $\xi_2(\pi_A, \pi''_A) = 0.6$. Considering tuples $t = (\pi_A)$, $t' = (\pi'_A)$, and $t'' = (\pi''_A)$, we have $\eta(t, t') \ge \alpha$ and $\eta(t', t'') \ge \alpha$ but $\eta(t, t'') < \alpha$ for any $\alpha > 0.6$ according to (10). Thus, the similarity measure of tuples is not transitive.
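The order sensitivity caused by this nontransitivity can be observed with a simple greedy removal procedure. The Python sketch below is an illustration under stated assumptions: the threshold 0.8 is one arbitrary choice above 0.6, distinct domain elements are assumed never to reach the threshold (so (4) reduces to the largest element-wise minimum), and the removal strategy `remove_redundant` is a hypothetical one, not a procedure prescribed by the paper.

```python
def xi2_identity(pa, pb):
    """Equation (4) when only identical domain elements reach the threshold."""
    return max(min(pa.get(u, 0.0), pb.get(u, 0.0)) for u in set(pa) | set(pb))

ALPHA = 0.8  # assumed threshold; any value above 0.6 shows the same effect

t = {"u1": 0.6, "u2": 1.0, "u3": 0.6}
t_prime = {"u1": 1.0, "u2": 1.0, "u3": 0.6}
t_double = {"u1": 1.0, "u2": 0.6, "u3": 0.6}

def redundant(a, b):
    return xi2_identity(a, b) >= ALPHA

def remove_redundant(rows):
    """Greedy removal: keep a row only if no previously kept row is redundant to it."""
    kept = []
    for name, dist in rows:
        if not any(redundant(dist, other) for _, other in kept):
            kept.append((name, dist))
    return [name for name, _ in kept]

order_1 = remove_redundant([("t", t), ("t'", t_prime), ("t''", t_double)])
order_2 = remove_redundant([("t'", t_prime), ("t", t), ("t''", t_double)])
print(order_1, order_2)
```

Scanning the same three tuples in two different orders leaves two tuples in one case and a single tuple in the other, so the outcome of redundancy removal is not unique.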

In generalizing the projection and equal join operations of traditional databases to fuzzy databases, it is hard to obtain lossless join decomposition when redundancy removal is order sensitive. Consider the case that $Y \sim> Z$ holds in the instance $r(R)$ of relation $R(X, Y, Z)$ based on Definition 1, namely, $\eta(t[Y], t'[Y]) \le \eta(t[Z], t'[Z])$ for every $t, t' \in r(R)$. Assume that $r(R)$ consists of three tuples, $\langle x_1, y, z \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$, and that $X$ is a key attribute. It is possible that the two values in each of the pairs $(y, y')$, $(y, y'')$, $(z, z')$, and $(z, z'')$ are redundant to each other but those in $(y', y'')$ and $(z', z'')$ are not. Since $Y \sim> Z$, $R$ should be decomposed to avoid redundancy. After decomposing $R(X, Y, Z)$ into $R'(X, Y)$ and $R''(Y, Z)$, if tuple $\langle y, z \rangle$ is removed first because it is redundant to tuple $\langle y', z' \rangle$, the result of $R''(Y, Z)$ contains two tuples, $\langle y', z' \rangle$ and $\langle y'', z'' \rangle$. The natural join of $R'(X, Y)$ and $R''(Y, Z)$ then generates a four-tuple result, $\langle x_1, y, z' \rangle$, $\langle x_1, y, z'' \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$, which contains a spurious tuple.
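The four-tuple join result can be traced with symbolic values. In the Python sketch below, attribute values are plain strings, the set `REDUNDANT` simply lists the pairs the text declares redundant, and `fuzzy_join` is a hypothetical join that matches Y values whenever they are identical or redundant to each other. None of these names come from the paper.

```python
# Pairs of symbolic values that the text treats as redundant to each other.
REDUNDANT = {
    frozenset(pair)
    for pair in [("y", "y'"), ("y", "y''"), ("z", "z'"), ("z", "z''")]
}

def similar(a, b):
    return a == b or frozenset((a, b)) in REDUNDANT

R = [("x1", "y", "z"), ("x2", "y'", "z'"), ("x3", "y''", "z''")]

# R'(X, Y): X is a key, so no tuple is dropped by the projection.
R1 = [(x, y) for x, y, _ in R]
# R''(Y, Z) after <y, z> was removed first as redundant to <y', z'>.
R2 = [("y'", "z'"), ("y''", "z''")]

def fuzzy_join(r1, r2):
    """Join on Y, matching values that are identical or redundant to each other."""
    return [(x, y, z) for x, y in r1 for y2, z in r2 if similar(y, y2)]

joined = fuzzy_join(R1, R2)
print(len(joined), joined)
```

The key value x1 appears twice in the join result although it occurs once in the original relation, which is the spurious tuple that breaks the lossless join property.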

To resolve this problem, this study proposes projection and equal join operations for the fuzzy databases, which involve the evaluation of redundancy and tuple merging. Since the decomposition of relations is based on FFDs, it depends on the similarity of tuples. For the data in the fuzzy model, (3)–(9) can be used to measure the similarity of tuples and define FFDs in the fuzzy databases. However, (5) restricts redundant tuples to duplicates, and (3), (4), and (6) lack transitivity. Therefore, this work adopts (9) and (10) to define approximate equality for tuples that might not be identical but have a high degree of similarity. Approximate equality makes it possible to obtain a unique result of redundancy removal.

Definition 2 (approximately equal tuples). Two tuples $t = (\pi_1, \pi_2, \ldots, \pi_m)$ and $t' = (\pi'_1, \pi'_2, \ldots, \pi'_m)$ are approximately equal, denoted by $t \cong t'$, if

$$\min_{j=1,\ldots,m} \xi_5(\pi_j, \pi'_j) = 1.$$

In other words, tuples $t$ and $t'$ are approximately equal if their similarity $\eta(t, t') = 1$.

Lemma 3. The approximate equality of tuples (or attribute values) is transitive.

Proof. By (9), $\delta(\pi_A, \pi_B) = 1$ requires $\hat{\pi}_B(u_i) \le \hat{\pi}_A(u_i)$ for every $u_i$, so $\xi_5(\pi_A, \pi_B) = 1$ holds exactly when the adjusted values $\hat{\pi}_A$ and $\hat{\pi}_B$ coincide. Hence, if $\xi_5(\pi_A, \pi_B) = 1$ and $\xi_5(\pi_B, \pi_G) = 1$, then $\xi_5(\pi_A, \pi_G) = 1$. Thus, if $t \cong t'$ and $t' \cong t''$, then $t \cong t''$ based on (10).

Approximately equal tuples are considered to be redundant to each other. The notion of approximate equality can be applied to query processing with predicates containing fuzzy concepts [36] for fuzzy databases in different models. For simplicity, we let $\pi_A \cong \pi_B$ denote $\xi_5(\pi_A, \pi_B) = 1$ hereafter.

Example 4. Given values $\pi_A = \{0.75/\text{pretty}, 0.65/\text{cuteness}\}$ and $\pi_B = \{0.6/\text{pretty}, 0.7/\text{charm}, 0.9/\text{cuteness}\}$ on domain $D$ and equivalence classes $C_1 = \{\text{pretty}\}$ and $C_2 = \{\text{charm}, \text{cuteness}\}$ for $D$, the average possibilities of $\pi_B$ are $u_{B1} = 0.6/1 = 0.6$ and $u_{B2} = (0.7 + 0.9)/2 = 0.8$, yielding $\hat{\pi}_B = \{0.6/\text{pretty}, 0.8/\text{charm}, 0.8/\text{cuteness}\}$. Likewise, $u_{A1} = 0.75$, $u_{A2} = 0.65$, and $\hat{\pi}_A = \{0.75/\text{pretty}, 0.65/\text{charm}, 0.65/\text{cuteness}\}$. We have $\delta(\pi_A, \pi_B) = (0.6 + 0.65 + 0.65)/(0.6 + 0.8 + 0.8) = 0.86$ and $\delta(\pi_B, \pi_A) = 1.9/2.05 = 0.93$. Thus, $\xi_5(\pi_A, \pi_B) = \min\{0.86, 0.93\} = 0.86$. Given $\pi_G = \{0.6/\text{pretty}, 0.65/\text{charm}, 0.85/\text{cuteness}\}$ on $D$, we have $\pi_G \cong \pi_B$ even though $\pi_G$ is not identical to $\pi_B$.
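The arithmetic of Example 4 can be retraced with a short script. The following is a minimal sketch rather than the paper's implementation: the helper names `adjust`, `delta`, and `xi5` are hypothetical, and the formulas are assumptions inferred from the numbers in the example (class-wise averaging over the elements that are present, and $\delta(\pi_A, \pi_B)$ taken as the sum of pointwise minima of the adjusted values divided by the sum of the adjusted $\pi_B$). Equations (8) and (9) are defined earlier in the paper and may differ in detail. The possibility of cuteness in $\pi_B$ is taken as 0.9, the value used in the averaging step.

```python
from typing import Dict, List

Distribution = Dict[str, float]  # element -> possibility


def adjust(pi: Distribution, classes: List[List[str]]) -> Distribution:
    """Assumed form of (8): average the possibilities present in each
    equivalence class and assign that average to every class element."""
    adjusted: Distribution = {}
    for cls in classes:
        present = [pi[x] for x in cls if pi.get(x, 0.0) > 0.0]
        if present:
            avg = sum(present) / len(present)
            for x in cls:
                adjusted[x] = avg
    return adjusted


def delta(pi_a: Distribution, pi_b: Distribution, classes: List[List[str]]) -> float:
    """Assumed one-directional degree: overlap of the adjusted values
    relative to the adjusted pi_b (pi_b is assumed to be non-empty)."""
    hat_a, hat_b = adjust(pi_a, classes), adjust(pi_b, classes)
    elements = sorted(set(hat_a) | set(hat_b))
    overlap = sum(min(hat_a.get(x, 0.0), hat_b.get(x, 0.0)) for x in elements)
    return overlap / sum(hat_b[x] for x in sorted(hat_b))


def xi5(pi_a: Distribution, pi_b: Distribution, classes: List[List[str]]) -> float:
    """Symmetric measure: the smaller of the two directional degrees."""
    return min(delta(pi_a, pi_b, classes), delta(pi_b, pi_a, classes))


classes = [["pretty"], ["charm", "cuteness"]]
pi_a = {"pretty": 0.75, "cuteness": 0.65}
pi_b = {"pretty": 0.6, "charm": 0.7, "cuteness": 0.9}

print(delta(pi_a, pi_b, classes))  # 1.9 / 2.2
print(delta(pi_b, pi_a, classes))  # 1.9 / 2.05
print(xi5(pi_a, pi_b, classes))    # the smaller of the two
```

Under these assumptions a value is always approximately equal to itself, since both directional degrees reduce to a ratio of identical sums.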

Proposition 5. The approximate equality can be used to classify values of the fuzzy database into disjoint sets (equivalence classes).

Proof. Based on the definition of (9), it is obvious that $\xi_5$ is reflexive and symmetric; that is, $\xi_5(\pi_A, \pi_A) = 1$ and $\xi_5(\pi_A, \pi_B) = \xi_5(\pi_B, \pi_A)$ for values $\pi_A$ and $\pi_B$. Besides, approximate equality is transitive according to Lemma 3. Therefore, two sets of approximately equal values are either disjoint or the same class, in which any two of the values are approximately equal to each other.

The transitivity of a similarity measure is important to any operation involving redundancy removal or tuple merging. Besides, a transitive measure can be applied to clustering methods or data groupings such as the ones in [36, 37].
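The role of transitivity in grouping can be illustrated with a small sketch. It is not taken from the paper: the function `partition` and the two toy relations on integers are hypothetical stand-ins, with `same_class` playing the part of a transitive relation such as approximate equality and `near` the part of a nontransitive similarity. Each item is compared only with the first member of every existing group.

```python
from typing import Callable, List, Set, FrozenSet


def partition(items: List[int], related: Callable[[int, int], bool]) -> List[List[int]]:
    """Greedy grouping: put each item into the first group whose
    representative (first member) is related to it."""
    groups: List[List[int]] = []
    for item in items:
        for group in groups:
            if related(group[0], item):
                group.append(item)
                break
        else:
            groups.append([item])
    return groups


def as_sets(groups: List[List[int]]) -> Set[FrozenSet[int]]:
    """Forget ordering so that two groupings can be compared."""
    return {frozenset(g) for g in groups}


def same_class(a: int, b: int) -> bool:
    # Transitive toy relation: an equivalence relation on integers.
    return a // 2 == b // 2


def near(a: int, b: int) -> bool:
    # Nontransitive toy relation: 1~2 and 2~3 hold, but 1~3 does not.
    return abs(a - b) <= 1


# With a transitive relation the grouping does not depend on the order.
print(as_sets(partition([1, 2, 3], same_class)) == as_sets(partition([2, 1, 3], same_class)))
# With a nontransitive relation the grouping is order sensitive.
print(as_sets(partition([1, 2, 3], near)) == as_sets(partition([2, 1, 3], near)))
```

The second comparison differs because the item 3 is related to the representative 2 but not to the representative 1, which is the order sensitivity described in endnote 1.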

Proposition 6. Given $\pi_A$ and its adjusted value $\hat{\pi}_A$ following (8), $\pi_A \cong \hat{\pi}_A$.

Proof. It is obvious by the definition of (9).

Buckles and Petry first proposed tuple merging and applied it to remove redundant tuples in a fuzzy database [5]. Tuple merging can also be used in the join operation. This study extends the tuple merging of Chen et al. [16] to (11) for relation combination as well as redundancy removal. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the tuple merging of $t$ and $t'$, denoted by $t \circ t'$, is given by

$$t \circ t' = (\hat{\pi}_{A_1} \cup_F \hat{\pi}'_{A_1},\ \hat{\pi}_{A_2} \cup_F \hat{\pi}'_{A_2},\ \ldots,\ \hat{\pi}_{A_m} \cup_F \hat{\pi}'_{A_m}), \tag{11}$$

where each $\hat{\pi}_{\bullet}$ (or $\hat{\pi}'_{\bullet}$) is the adjusted value of $\pi_{\bullet}$ (or $\pi'_{\bullet}$) according to (8) and $\cup_F$ denotes fuzzy union. For single-value tuples $t = (\pi_{A_1})$ and $t' = (\pi'_{A_1})$, tuple merging is alternatively denoted by $\pi_{A_1} \circ \pi'_{A_1}$.
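Tuple merging as in (11) can be sketched in a few lines. This is an illustration under stated assumptions, not the author's code: the names `adjust`, `fuzzy_union`, and `merge_tuples` are hypothetical, the fuzzy union is taken to be the standard pointwise maximum, and the adjustment step is assumed to average the possibilities present in each equivalence class (the exact form of (8) is given earlier in the paper).

```python
from typing import Dict, List, Sequence, Tuple

Distribution = Dict[str, float]  # element -> possibility


def adjust(pi: Distribution, classes: List[List[str]]) -> Distribution:
    """Assumed form of (8): class-wise average of the possibilities present."""
    adjusted: Distribution = {}
    for cls in classes:
        present = [pi[x] for x in cls if pi.get(x, 0.0) > 0.0]
        if present:
            avg = sum(present) / len(present)
            for x in cls:
                adjusted[x] = avg
    return adjusted


def fuzzy_union(a: Distribution, b: Distribution) -> Distribution:
    """Standard fuzzy union: pointwise maximum of the two distributions."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}


def merge_tuples(
    t: Sequence[Distribution],
    t_prime: Sequence[Distribution],
    classes_per_attr: Sequence[List[List[str]]],
) -> Tuple[Distribution, ...]:
    """Attribute-wise union of the adjusted values of two tuples."""
    return tuple(
        fuzzy_union(adjust(p, c), adjust(q, c))
        for p, q, c in zip(t, t_prime, classes_per_attr)
    )


classes = [["pretty"], ["charm", "cuteness"]]
pi = {"pretty": 0.6, "charm": 0.7, "cuteness": 0.9}
pi_prime = {"pretty": 0.6, "charm": 0.8, "cuteness": 0.8}

merged = merge_tuples((pi,), (pi_prime,), [classes])[0]
print(sorted(merged.items()))
```

Both inputs adjust to the same distribution here, so the merged value coincides with that common adjusted value up to floating-point rounding.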

Lemma 7. Let $\pi_A$ and $\pi'_A$ be two possibility distributions on the same domain. If $\pi_A \cong \pi'_A$, then $\pi_A \cong \pi_A \circ \pi'_A \cong \pi'_A$.

Proof. Based on (9) and (11), it is obvious that if $\xi_5(\pi_A, \pi'_A) = 1$, then $\xi_5(\pi_A, \pi_A \circ \pi'_A) = 1$ and $\xi_5(\pi'_A, \pi_A \circ \pi'_A) = 1$.

Based on the literature review and Lemma 7, we summarize the properties of different similarity measures with threshold $\alpha = 1$ in Table 3 to show the merit of (9) adopted in this work.

4. Approximate Lossless Join Decomposition

This section first offers the operations for relation decomposition and combination. Then it proposes a notion of approximate lossless join decomposition (ALJD), which incorporates fuzzy concepts into lossless join decomposition. It also provides the method to achieve the ALJD.

Similar to the works in [37], this study generalizes the projection and natural join operations in traditional databases to the fuzzy databases as below. Here, given a relation $R$, $\Theta$ denotes a set of attributes in $R$ (i.e., $\Theta \subset R$) and $t[\Theta]$ denotes the composite of values in tuple $t$ over attributes $\Theta$.


Table 3: Property of different similarity measures with threshold $\alpha = 1$.

Properties/measures            Equation (5) [16]    Equation (6) [20]    Equation (9) [13]
Transitivity                   Yes                  No                   Yes
Result after merging           Lossless             Non-lossless         Approximate lossless
Predicates for lossless join   Identical values     Different values     Different values

For example, given $t \in r(R)$, $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$, and $\Theta = (A_2, A_3, A_4)$, then $t[\Theta] = (\pi_{A_2}, \pi_{A_3}, \pi_{A_4})$.

Projection. Projecting the instance of relation $R$ on attributes $\Theta \subset R$, denoted by $\Pi_\Theta(R)$, is given by

$$\Pi_\Theta(R) = \{\, t[\Theta] \circ t'[\Theta] : t[\Theta] \cong t'[\Theta] \mid t, t' \in r(R) \,\}. \tag{12}$$

Natural Join. The natural join of instances of $R(X, Y)$ and $R'(Y, Z)$, denoted by $R \otimes R'$, is defined as follows:

$$R \otimes R' = \{\, (t[X],\ t[Y] \circ t'[Y],\ t'[Z]) : t[Y] \cong t'[Y] \mid t \in r(R),\ t' \in r(R') \,\}. \tag{13}$$

In (12) and (13), tuple redundancy is determined by approximate equality (e.g., $t[Y] \cong t'[Y]$; see Definition 2), and both redundancy removal and tuple combination use tuple merging in (11).
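The two operations can be sketched generically, with the approximate equality test and the merge step passed in as functions. This is a hedged illustration, not the paper's implementation: `project` and `natural_join` are hypothetical names, rows are plain dictionaries, and the demo replaces possibility distributions by numbers, with `same_bucket` (equal integer part) standing in for a transitive approximate equality and `max` standing in for tuple merging. The sketch relies on the stand-ins satisfying the analogue of Lemma 7, that is, a merged value stays approximately equal to both inputs.

```python
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]
Equal = Callable[[Any, Any], bool]
Merge = Callable[[Any, Any], Any]


def project(rows: List[Row], attrs: Sequence[str], eq: Equal, merge: Merge) -> List[Row]:
    """Sketch of (12): keep the chosen attributes and merge rows that are
    approximately equal on all of them."""
    result: List[Row] = []
    for row in rows:
        candidate = {a: row[a] for a in attrs}
        for i, kept in enumerate(result):
            if all(eq(kept[a], candidate[a]) for a in attrs):
                result[i] = {a: merge(kept[a], candidate[a]) for a in attrs}
                break
        else:
            result.append(candidate)
    return result


def natural_join(left: List[Row], right: List[Row], common: Sequence[str],
                 eq: Equal, merge: Merge) -> List[Row]:
    """Sketch of (13): pair rows that are approximately equal on the common
    attributes and merge the common values."""
    joined: List[Row] = []
    for l in left:
        for r in right:
            if all(eq(l[a], r[a]) for a in common):
                row = {**l, **r}
                for a in common:
                    row[a] = merge(l[a], r[a])
                joined.append(row)
    return joined


def same_bucket(a: float, b: float) -> bool:
    # Toy transitive stand-in for approximate equality.
    return int(a) == int(b)


rows = [
    {"A": 1.0, "X": 2.1, "Y": 5.0},
    {"A": 2.0, "X": 2.4, "Y": 5.3},
    {"A": 3.0, "X": 7.0, "Y": 9.0},
]
r1 = project(rows, ["A", "X"], same_bucket, max)
r2 = project(rows, ["X", "Y"], same_bucket, max)
joined = natural_join(r1, r2, ["X"], same_bucket, max)
print(r2)      # the two rows with X in [2, 3) are merged into one
print(joined)  # three rows, each approximately equal to an original row
```

In the demo the dependency of Y on X holds under `same_bucket`, so joining the two projections produces no spurious rows.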

Proposition 8. The projection result of a relation based on (12) must be unique.

Proof. It can be directly derived from Proposition 5.

Based on the operations (12) and (13), the ALJD is formally defined following the extension of approximate equality from the tuple level to the relation level in Definition 9.

Definition 9 (approximately equal relation instances). Two relation instances $r(R)$ and $r'(R)$ in the fuzzy database are approximately equal, denoted by $r(R) \approx r'(R)$, if for every tuple $t \in r(R)$ there exists a tuple $t' \in r'(R)$ such that $t \cong t'$, and vice versa.

Definition 10 (approximate lossless join). A decomposition $\{R_1, R_2, \ldots, R_k\}$ of a relation $R$ in the fuzzy database is an approximate lossless join if $r(R) \approx (\Pi_{R_1}(R) \otimes \Pi_{R_2}(R) \otimes \cdots \otimes \Pi_{R_k}(R))$.

The approximate lossless join decomposition means that the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of the tuples in the combination result, and vice versa.

Proposition 11. Consider the following:

$$\Pi_\Theta(R) \approx \{\, t[\Theta] \mid t \in r(R) \,\}. \tag{14}$$

Proof. It can be derived from (11) and (12).

Corollary 12. Consider the following:

$$\Pi_R(R) \approx r(R). \tag{15}$$

Proof. It can be derived directly from Proposition 11.

The projection of a relation over the same schema, as shown in Corollary 12, represents no operation other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal of a relation instance is approximately equal to the original instance. This property is essential for obtaining a combination result that is approximately equal to the original instance after relation decomposition.

This study proposes FFD for the decomposition in the fuzzy database as shown below.

Definition 13. The FFD $X \sim> Y$ holds in the relation instance $r(R)$ if, for every $t, t' \in r(R)$, $t[X] \cong t'[X]$ implies $t[Y] \cong t'[Y]$.

Remark 14. An FD in a traditional database is a special case of the FFD. If an FD $X \to Y$ holds in $r(R)$, then $X \sim> Y$ holds as well. This is because $t[\Theta] \cong t'[\Theta]$ must be true for any $\Theta \subset R$ if $t[\Theta] = t'[\Theta]$.
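Definition 13 translates directly into a pairwise check over a relation instance. The following is a minimal sketch with hypothetical names (`ffd_holds`, `same_bucket`); the approximate equality test is a parameter, and the demo uses a toy transitive relation on numbers in place of $\xi_5(\cdot, \cdot) = 1$ on possibility distributions.

```python
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]


def ffd_holds(rows: List[Row], x_attrs: Sequence[str], y_attrs: Sequence[str],
              eq: Callable[[Any, Any], bool]) -> bool:
    """Return True if every pair of rows that agrees (approximately)
    on X also agrees (approximately) on Y."""
    for i, t in enumerate(rows):
        for u in rows[i + 1:]:
            agree_x = all(eq(t[a], u[a]) for a in x_attrs)
            agree_y = all(eq(t[a], u[a]) for a in y_attrs)
            if agree_x and not agree_y:
                return False
    return True


def same_bucket(a: float, b: float) -> bool:
    # Toy transitive stand-in for approximate equality.
    return int(a) == int(b)


rows = [
    {"X": 2.1, "Y": 5.0},
    {"X": 2.4, "Y": 5.3},
    {"X": 7.0, "Y": 9.0},
]
violating = rows + [{"X": 2.9, "Y": 8.0}]

print(ffd_holds(rows, ["X"], ["Y"], same_bucket))       # dependency holds
print(ffd_holds(violating, ["X"], ["Y"], same_bucket))  # last row breaks it
```

Passing exact equality as `eq` turns the same function into a check of a classical FD, which mirrors the special case noted in Remark 14.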

Lemma 15. Given relations $R$ and $R'$ with $r(R') \approx r(R)$, if $r(R)$ satisfies a set $F$ of FFDs, then $r(R')$ satisfies $F$.

Proof. Proof by contradiction: we assume that $r(R') \approx r(R)$ and that there exists an FFD $X \sim> Y$ such that $r(R)$ satisfies $X \sim> Y$ and $r(R')$ does not. Because $X \sim> Y$ holds in $r(R)$,

$$\text{for every } t_1, t_2 \in r(R), \text{ if } t_1[X] \cong t_2[X], \text{ then } t_1[Y] \cong t_2[Y]. \tag{16}$$

Since $X \sim> Y$ does not hold in $r(R')$, there exist $t'_1, t'_2 \in r(R')$ such that $t'_1[X] \cong t'_2[X]$ and $\eta(t'_1[Y], t'_2[Y]) \neq 1$. Since $r(R') \approx r(R)$, there exist $t, t'' \in r(R)$ such that $t \cong t'_1$ and $t'' \cong t'_2$. Then, by the transitivity of approximate equality (Lemma 3), we have $t[X] \cong t''[X]$ but $\eta(t[Y], t''[Y]) \neq 1$, which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including the reflexive rule, augmentation rule, and transitive rule (see endnote 3). This property enables the result of lossless join decomposition to have the dependency preservation property (see endnote 4) [1].


Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that exist in $r(R)$.
Step 1. Let $\mathcal{R} = \{R\}$ and let $F'$ be the set of all $f \in F$ that are not trivial.
Step 2. For an $f'\colon X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.
Step 3. If $X$ is not a key attribute in $R'$, do the following:
    (1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X, Y)}(R')$ and $R_2 = \Pi_{R'(\bar{Y})}(R')$;
    (2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);
    (3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.
Step 4. Go to Step 2 if $F'$ is not empty.
Output: $\mathcal{R}$, the set of relations decomposed from $R$.
Note: $R'(\bar{Y})$ represents the list of all attributes in $R'$ other than $Y$, so that $R_1$ and $R_2$ share $X$ as in Lemma 16.

Algorithm 1: ALJD Algorithm.
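The control flow of Algorithm 1 can be sketched at the schema level as follows. This is an illustration under several assumptions rather than the paper's procedure: schemas and FFDs are represented by attribute-name sets, the key test is supplied by the caller as the hypothetical predicate `is_key` (in the paper it is evaluated on the relation instance), the second schema keeps $X$ so that the two parts share it as Lemma 16 requires, and every FFD is removed from the pending list once it has been examined so that the loop always terminates. The attribute names in the demo are invented.

```python
from typing import Callable, FrozenSet, List, Tuple

Attrs = FrozenSet[str]
FFD = Tuple[Attrs, Attrs]  # (X, Y) stands for X ~> Y


def is_trivial(ffd: FFD, ffds: List[FFD]) -> bool:
    """Definition 17: X ~> Y is trivial if some X ~> Y' in F has Y strictly inside Y'."""
    x, y = ffd
    return any(x2 == x and y < y2 for x2, y2 in ffds)


def aljd(schema: Attrs, ffds: List[FFD],
         is_key: Callable[[Attrs, Attrs], bool]) -> List[Attrs]:
    """Schema-level sketch of the ALJD loop."""
    schemas: List[Attrs] = [schema]
    pending = [f for f in ffds if not is_trivial(f, ffds)]
    while pending:
        x, y = pending.pop(0)  # each FFD is examined once
        target = next((s for s in schemas if x | y <= s), None)
        # Skip when no schema contains X and Y, when X is a key of the
        # schema, or when splitting would leave the schema unchanged.
        if target is None or is_key(x, target) or x | y == target:
            continue
        schemas.remove(target)
        schemas.append(x | y)             # R1 = (X, Y)
        schemas.append(target - (y - x))  # R2 = everything except Y, keeps X
    return schemas


schema = frozenset({"Emp", "Dept", "Mgr", "Site"})
ffds = [
    (frozenset({"Dept"}), frozenset({"Mgr", "Site"})),
    (frozenset({"Dept"}), frozenset({"Mgr"})),  # trivial by Definition 17
]
result = aljd(schema, ffds, lambda x, s: x == frozenset({"Emp"}))
print(result)
```

In the demo the relation is split once, on Dept ~> {Mgr, Site}; the FFD Dept ~> Mgr is filtered out as trivial before the loop starts.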

Lemma 16. Let $R(A, X, Y)$ be a relation and $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$ and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

Proof by contradiction: let us assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$, because $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$. Since $r(R') \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A],\ t_1[X] \circ t_2[X],\ t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A],\ t_1[X] \circ t_2[X],\ t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus, $t' \cong t$, which contradicts the assumption.

Proof by contradiction for the second part, with renewed symbols: assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$ based on (13). Also, we let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1), \quad \hat{t}_2 \in r(R_2), \quad \hat{t}_1 \cong t'[A, X], \quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A, X)}(R)$, based on Proposition 11 there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17) and (18), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus, $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (19), and $t' \cong t$, which contradicts the assumption.

In Lemma 16, each one of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.

Definition 17. Let $F$ be the set of FFDs. An FFD $f\colon X \sim> Y$ in $F$ is trivial if there exists an FFD $f'\colon X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, it takes more join operations to obtain the original data during query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites as follows:

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of the normalization of traditional databases, where only the FD of nonkey attributes is considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for the fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. The exclusion of FFDs containing key attributes prevents unnecessary decomposition of relations which have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations in query processing.

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$ and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived by Proposition 11 and Lemma 15. Statement (ii) can be derived by Lemma 16 and the property of FFD (namely, Armstrong's axioms IR1, IR2, and IR3 described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to the issues related to the application of FFDs.

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for the tuples or relations in the fuzzy databases and provides the measure of the approximate equality. The measure is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in the fuzzy databases. Second, it proposes approximate lossless join decomposition for the fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to the issue of data compression in the fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in the fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for the fuzzy databases are, respectively, the general cases of the traditional FD and lossless join decomposition. This generality is important for dealing with databases containing both crisp data and fuzzy data. Fourth, similar to the existing approaches of database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed approach of decomposition satisfies a degree of lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for the extended possibility-based fuzzy databases is more difficult because the data are more complex.

There are some directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in the fuzzy databases. Research can apply the notion to data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in the fuzzy databases. Studies that aim to incorporate fuzzy concepts into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality for the similarity measures. Since the fuzzy concept has been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of National Science Council NSC102-2410-H-155-036-MY2 and Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distribution and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \to Y$ if $Y \subseteq X$; IR2 (augmentation rule): if $X \to Y$, then $XZ \to YZ$; IR3 (transitive rule): if $X \to Y$ and $Y \to Z$, then $X \to Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relation decomposed from $R$ or can be derived via Armstrong's inference rules from the FFDs in these relations.

References

[1] S. B. Navathe and R. Elmasri, Fundamentals of Database Systems, Addison-Wesley, New York, NY, USA, 2010.

[2] P. A. Bernstein, J. R. Swenson, and D. C. Tsichritzis, “A unified approach to functional dependencies and relations,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 237–245, 1975.

[3] E. F. Codd, “A relational model of data for large shared data banks,” Communications of the ACM, vol. 13, no. 6, pp. 377–387, 1970.

[4] L. A. Zadeh, “Fuzzy sets as a basis for a theory of possibility,” Fuzzy Sets and Systems, vol. 1, no. 1, pp. 3–28, 1978.

[5] B. P. Buckles and F. E. Petry, “A fuzzy representation of data for relational databases,” Fuzzy Sets and Systems, vol. 7, no. 3, pp. 213–226, 1982.

[6] S. Shenoi and A. Melton, “Proximity relations in the fuzzy relational database model,” Fuzzy Sets and Systems, vol. 31, no. 3, pp. 285–296, 1989.

[7] G. Chen, J. Vandenbulcke, and E. E. Kerre, “A general treatment of data redundancy in a fuzzy relational data model,” Journal of the American Society for Information Science, vol. 43, no. 4, pp. 304–311, 1992.

[8] H. Prade and C. Testemale, “Generalizing database relational algebra for the treatment of incomplete or uncertain information and vague queries,” Information Sciences, vol. 34, no. 2, pp. 115–143, 1984.

[9] J. C. Cubero and M. A. Vila, “New definition of fuzzy functional dependency in fuzzy relational databases,” International Journal of Intelligent Systems, vol. 9, no. 5, pp. 441–448, 1994.


[10] K. Raju and A. K. Majumdar, “Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems,” ACM Transactions on Database Systems, vol. 13, no. 2, pp. 129–166, 1988.

[11] G. Chen, E. E. Kerre, and J. Vandenbulcke, “Normalization based on fuzzy functional dependency in a fuzzy relational data model,” Information Systems, vol. 21, no. 3, pp. 299–310, 1996.

[12] S. Jyothi and M. S. Babu, “Multivalued dependencies in fuzzy relational databases and lossless join decomposition,” Fuzzy Sets and Systems, vol. 88, no. 3, pp. 315–332, 1997.

[13] J. Y. Liu, P. Chang, and C. P. C. Yeh, “Consistent data operations for multi-databases in extended possibility-based data models,” Expert Systems with Applications, vol. 36, no. 3, pp. 6174–6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, and F. Mili, “Data dependencies in extended possibility-based fuzzy relational databases,” International Journal of Intelligent Systems, vol. 17, no. 3, pp. 321–332, 2002.

[15] J. Y. Liu, “Data integration constraints for consistent data redundancy in fuzzy databases,” International Journal of Intelligent Systems, vol. 23, no. 6, pp. 635–653, 2008.

[16] G. Chen, J. Vandenbulcke, and E. E. Kerre, “On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model,” in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440–446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, and X. Tang, “Extracting representative information to enhance flexible data queries,” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 928–941, 2012.

[18] S. Shenoi, A. Melton, and L. T. Fan, “An equivalence classes model of fuzzy relational databases,” Fuzzy Sets and Systems, vol. 38, no. 2, pp. 153–170, 1990.

[19] M. Koyuncu and A. Yazici, “IFOOD: an intelligent fuzzy object-oriented database architecture,” IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1137–1154, 2003.

[20] Z. Ma, W. Zhang, and W. Ma, “Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases,” International Journal of Intelligent Systems, vol. 15, no. 8, pp. 705–716, 2000.

[21] V. V. Cross, “Fuzzy extensions for relationships in a generalized object model,” International Journal of Intelligent Systems, vol. 16, no. 7, pp. 843–861, 2001.

[22] V. V. Cross, “Defining fuzzy relationships in object models: abstraction and interpretation,” Fuzzy Sets and Systems, vol. 140, no. 1, pp. 5–27, 2003.

[23] S. Wang and G. H. Huang, “A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning,” Journal of Environmental Management, vol. 117, pp. 208–218, 2013.

[24] S. Wang and G. H. Huang, “An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: an application to water resources management,” Stochastic Environmental Research and Risk Assessment, vol. 27, no. 6, pp. 1493–1506, 2013.

[25] S. Wang and G. H. Huang, “Interactive fuzzy boundary interval programming for air quality management under uncertainty,” Water, Air, and Soil Pollution, vol. 224, article 1574, 2013.

[26] Z. M. Ma and F. Mili, “Handling fuzzy information in extended possibility-based fuzzy relational databases,” International Journal of Intelligent Systems, vol. 17, no. 10, pp. 925–942, 2002.

[27] Z. M. Ma and L. Yan, “Updating extended possibility-based fuzzy relational databases,” International Journal of Intelligent Systems, vol. 22, no. 3, pp. 237–258, 2007.

[28] P. Bosc, D. Dubois, and H. Prade, “Fuzzy functional dependencies: an overview and a critical discussion,” in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325–330, 1994.

[29] P. Bosc and O. Pivert, “On the impact of regular functional dependencies when moving to a possibilistic database framework,” Fuzzy Sets and Systems, vol. 140, no. 1, pp. 207–227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, and W. Bandler, “On nearness measures in fuzzy relational data models,” International Journal of Approximate Reasoning, vol. 3, no. 3, pp. 267–298, 1989.

[31] P. Bosc, D. Dubois, and H. Prade, “Fuzzy functional dependencies and redundancy elimination,” Journal of the American Society for Information Science, vol. 49, no. 3, pp. 217–235, 1998.

[32] Z. M. Ma, W. J. Zhang, and F. Mili, “Fuzzy data compression based on data dependencies,” International Journal of Intelligent Systems, vol. 17, no. 4, pp. 409–426, 2002.

[33] B. Bhuniya and P. Niyogi, “Lossless join property in fuzzy relational databases,” Data and Knowledge Engineering, vol. 11, no. 2, pp. 109–124, 1993.

[34] O. Bahar and A. Yazici, “Normalization and lossless join decomposition of similarity-based fuzzy relational databases,” International Journal of Intelligent Systems, vol. 19, no. 10, pp. 885–917, 2004.

[35] T. K. Bhattacharjee and A. K. Mazumdar, “Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model,” Fuzzy Sets and Systems, vol. 96, no. 3, pp. 343–352, 1998.

[36] Z. M. Ma and L. Yan, “Generalization of strategies for fuzzy query translation in classical relational databases,” Information and Software Technology, vol. 49, no. 2, pp. 172–180, 2007.

[37] J. Zhang, Q. Wei, and G. Chen, “An efficient incremental method for generating equivalence groups of search results in information retrieval and queries,” Knowledge-Based Systems, vol. 32, pp. 91–100, 2012.

[38] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144 of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.


decomposed results so that the results can later be used to produce the original relation without losing information and (ii) tuple merging: how to combine two relations via merging their tuples whose attribute values are similar but not identical.

Complicating this problem further, most similarity measures [7, 13, 14] of values in the form of possibility distributions are not transitive. When tuple redundancy is determined by nontransitive similarity measures, the result of eliminating redundant tuples from a decomposed relation might not be unique (i.e., it is order sensitive; see endnote 1). An inconsistent data redundancy removal not only leads to unstable results of data integration, as described in [15], but also causes decomposition results that are not lossless. When the decomposition result of a relation is not unique, the combination of the result will have many different outcomes, at least one of which is different from the original relation. Accordingly, the decomposition inevitably violates the lossless join property. Moreover, nonunique results occur for relation combination when the attribute values to be joined or merged have a nontransitive similarity relation.

To avoid nontransitivity, Chen et al. provided FFD with embedded classical FD [11, 16], where redundancy removal is restricted to duplicate tuples. But this restriction draws the normalization process back to the traditional operations on crisp data. To obtain a transitive relationship among tuples, some research applied the max-min transitive closure to the relationship matrix of similarity degrees between tuples [17]. The max-min transitive closure of a relationship matrix must be a matrix with max-min transitivity [18]. By referring to the transitive closure of the relationship matrix, the tuples which have similarity higher than a given threshold can be grouped into disjoint sets. The tuples in the same set are regarded as redundant. However, this approach cannot determine the similarity of two tuples by merely examining these two, and the similarity is changed by inserting or deleting other tuples. This nondeterministic and dynamic characteristic is not applicable to the practice of databases.
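The dependence of the closure on other tuples can be seen in a small computation. The sketch below is an illustration only: the function names and the similarity degrees are made up, and the closure is computed by repeatedly taking the elementwise maximum of the matrix and its max-min composition with itself until nothing changes.

```python
from typing import List

Matrix = List[List[float]]


def max_min_compose(a: Matrix, b: Matrix) -> Matrix:
    """Max-min composition: entry (i, j) is max over k of min(a[i][k], b[k][j])."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def max_min_closure(m: Matrix) -> Matrix:
    """Iterate until the matrix is stable under max-min composition."""
    closure = [row[:] for row in m]
    while True:
        composed = max_min_compose(closure, closure)
        updated = [
            [max(closure[i][j], composed[i][j]) for j in range(len(m))]
            for i in range(len(m))
        ]
        if updated == closure:
            return closure
        closure = updated


# Hypothetical similarity degrees between three tuples t1, t2, t3.
similarity = [
    [1.0, 0.9, 0.4],
    [0.9, 1.0, 0.9],
    [0.4, 0.9, 1.0],
]
closure = max_min_closure(similarity)
print(closure[0][2])  # t1 and t3 are linked through t2

# After deleting t2, the similarity of t1 and t3 falls back to its raw value.
without_t2 = max_min_closure([[1.0, 0.4], [0.4, 1.0]])
print(without_t2[0][1])
```

The closure raises the degree between the first and third tuples only because the second tuple is present, which is the dynamic behavior criticized in the paragraph above.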

To our knowledge very few studies provide a completeguideline to perform normalization that ensures lossless joindecomposition in the fuzzy databasesTherefore the purposeof this study is to fill up this gap This study first proposes anotion of approximate equality which represents the transi-tive equivalent relation among tuples Then it provides newdefinition of FFD and lossless join decomposition based onapproximate equality for the fuzzy databases Both functionaldependencies and lossless join decomposition in a traditionaldatabase are special cases in this proposal Examples showthat the notion is more applicable than other similarityconcepts to the research related to the fuzzy databases Thiswork also provides the method of achieving the lossless joindecomposition for the fuzzy databases

The remainder of this paper is organized as follows. Section 2 gives a brief introduction to database normalization and fuzzy databases and surveys the similarity measures related to the fuzzy databases. Section 3 demonstrates the problem of using nontransitive similarity measures for determining tuple redundancy and provides a notion of approximate equality to address it. The FFD is then defined based on the approximate equality in Section 4. In addition, the lossless join decomposition is proposed for the fuzzy databases, and its property is proven as well. Section 5 draws the conclusion of this paper.

2. Preliminaries

This section first briefly reviews the essential operations for lossless join decomposition in traditional databases. Then, it introduces the fuzzy databases considered in this work and the similarity measures of values in the form of possibility distributions.

2.1. Essential Operations for Lossless Join Decomposition. In a traditional relational database, a row is called a tuple, a column header is called an attribute, and the table is called a relation. Given an $m$-ary relation schema $R(A_1, A_2, \ldots, A_m)$, an instance of $R$, denoted by $r(R)$, is the set of all tuples in $R$. Let $\mathbf{A}$ denote the set of attributes $\{A_1, A_2, \ldots, A_m\}$. A functional dependency (FD) $A_i \rightarrow A_j$ existing in $R(\mathbf{A})$ states that tuples having the same value on attribute $A_i$ must be identical on $A_j$, where $A_i, A_j \in \mathbf{A}$. Two operations are related to lossless join decomposition: projection and natural join. The projection operation generates a result by selecting certain attributes from a given relation and removing redundant tuples. Let $\Theta$ denote a set of attributes in $R(\mathbf{A})$, that is, $\Theta \subseteq \mathbf{A}$. The result of projecting $R$ over attributes $\Theta$ is $\Pi_{\Theta}(R) = \{t[\Theta] \mid t \in r(R)\}$, where $t[\Theta]$ represents the composite of values on $\Theta$ in tuple $t$. The natural join (denoted by $*$) of $R'(X, Y)$ and $R''(Y, Z)$ is obtained by removing the duplicate attribute from the result of an equal join on the joined attribute $Y$ and is denoted as shown below:

$$R' * R'' = \{(t'[X], t'[Y], t''[Z]) : t'[Y] = t''[Y] \mid t' \in r(R'),\ t'' \in r(R'')\}. \tag{1}$$

The natural join and projection operations are used to combine and decompose relations, respectively.

Formally, a decomposition $\{R_1, R_2, \ldots, R_k\}$ of $R$ is lossless join if the equation $r(R) = \Pi_{R_1}(R) * \Pi_{R_2}(R) * \cdots * \Pi_{R_k}(R)$ holds. In other words, the lossless join decomposition ensures that the combination, via the natural join operation, of the decomposed results of a relation has no spurious tuple or missing tuple with respect to the relation [1].

For example, given a relation $R$, the results of $R' = \Pi_{A_1 A_2}(R)$ and $R'' = \Pi_{A_2 A_3}(R)$ are shown in Table 1. In this case, the decomposition $\{R', R''\}$ of $R$ has the lossless join property because the natural join result of $R'$ and $R''$ is exactly the same as $R$ (as shown in Table 2).
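The crisp operations of Section 2.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: relations are assumed to be lists of dictionaries, and the helper names `project`, `natural_join`, and `same_instance` are hypothetical. The data reproduce the relation $R$ of Table 1.

```python
def project(rows, attrs):
    """Classical projection: keep `attrs` and drop duplicate tuples."""
    seen = []
    for row in rows:
        sub = {a: row[a] for a in attrs}
        if sub not in seen:
            seen.append(sub)
    return seen


def natural_join(left, right):
    """Classical natural join on the attributes shared by both relations."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    joined = []
    for lt in left:
        for rt in right:
            if all(lt[a] == rt[a] for a in shared):
                merged = {**lt, **rt}
                if merged not in joined:
                    joined.append(merged)
    return joined


def same_instance(r1, r2):
    """Set equality of two relation instances (order-insensitive)."""
    return all(t in r2 for t in r1) and all(t in r1 for t in r2)


# Relation R of Table 1
R = [
    {"A1": "x", "A2": "p", "A3": "m"},
    {"A1": "y", "A2": "p", "A3": "m"},
    {"A1": "z", "A2": "q", "A3": "n"},
]
R1 = project(R, ["A1", "A2"])  # R'
R2 = project(R, ["A2", "A3"])  # R''
lossless = same_instance(natural_join(R1, R2), R)
print(len(R1), len(R2), lossless)
```

Under these assumptions, the join of the two projections contains exactly the three tuples of $R$, which is the lossless join condition stated above.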

Table 1: Relation $R$ and its projections $R'$ and $R''$.

| | $R$: $A_1$ | $A_2$ | $A_3$ | $R'$: $A_1$ | $A_2$ | $R''$: $A_2$ | $A_3$ |
| $t_1$ | $x$ | $p$ | $m$ | $x$ | $p$ | $p$ | $m$ |
| $t_2$ | $y$ | $p$ | $m$ | $y$ | $p$ | $q$ | $n$ |
| $t_3$ | $z$ | $q$ | $n$ | $z$ | $q$ | | |

Table 2: Equal join of $R'$ and $R''$ on $A_2$ and the natural join $R' * R''$.

| | Equal join: $A_1$ | $A_2$ | $A_2$ | $A_3$ | $R' * R''$: $A_1$ | $A_2$ | $A_3$ |
| $t_1$ | $x$ | $p$ | $p$ | $m$ | $x$ | $p$ | $m$ |
| $t_2$ | $y$ | $p$ | $p$ | $m$ | $y$ | $p$ | $m$ |
| $t_3$ | $z$ | $q$ | $q$ | $n$ | $z$ | $q$ | $n$ |

2.2. The Fuzzy Databases. In the last two decades, fuzzy concepts have been incorporated in traditional databases [5, 8, 19] and applied to measure the relation between data [20–22]. The fuzzy databases enable dealing with imprecision and uncertainty in the real world based on the theory of fuzzy sets and possibility distribution theory. The possibility-based fuzzy theory has been widely applied in environmental management, such as flood-diversion planning [23], water resources management [24], and air quality management [25]. This work considers the extended possibility-based databases proposed by Chen et al. [7] because they can capture both the possibility-based fuzzy model and the resemblance-based fuzzy concept. The fuzzy database has drawn much research attention on semantic measures, information processing, update operations, and UML class diagrams [20, 26, 27]. The data model of the fuzzy databases is a hybrid of a possibility-based data model in [8] and a resemblance-based data model in [6]. The possibility-based model derives from Zadeh's fuzzy theory. In the theory [4], a fuzzy set $F$ on a universe of discourse $U$ is described by $\{\mu_F(u)/u \mid u \in U\}$, where $\mu_F : U \rightarrow [0, 1]$ is a membership function for the fuzzy set $F$ and $\mu_F(u)$ denotes the degree of membership of $u$ in $F$. In a possibility-based database [8], the value of an attribute $A$ on a domain $D$ is a possibility distribution $\pi_A = \{\pi_A(u)/u \mid u \in D\}$, where $\pi_A(u)$ denotes the possibility that $u$ is the actual value of $A$. For example, $D = \{u_1, u_2, u_3\}$ and $\pi_A = \{0.8/u_1, 0.5/u_2, 0.6/u_3\}$. An example of applying the possibility-based fuzzy theory² in the real world is shown below. Consider that the domain of attribute "eye color" is $\{\text{black}, \text{brown}, \text{blue}, \text{green}\}$ and a possibility distribution is given below:

$$\text{Asia-Color} = \{1.0/\text{black},\ 0.8/\text{brown},\ 0.3/\text{blue},\ 0.1/\text{green}\}. \tag{2}$$

Suppose that John's eye color is an "Asia color." Then, according to the interpretation of the possibility-based fuzzy theory, one concludes that the possibility of John's eye color being brown is 0.8 and that of it being blue is 0.3.

In the extended possibility-based database, attribute values are represented by possibility distributions of an attribute on its domain, and a domain is associated with a similarity relation of domain elements. Formally, an $m$-ary relation instance $r$ on a schema $R(A_1, A_2, \ldots, A_m)$ in the fuzzy database is a subset of the Cartesian product $\Phi(A_1) \times \Phi(A_2) \times \cdots \times \Phi(A_m)$, where $\Phi(A_i)$ represents the set of all possibility distributions of attribute $A_i$ on its domain. For a domain $D_i$, a proximity relation is given to describe the resemblance between domain elements in $D_i$. A proximity relation is a mapping $s_i : D_i \times D_i \rightarrow [0, 1]$ with reflexivity and symmetry, that is, $s_i(u, u) = 1$ and $s_i(u, v) = s_i(v, u)$. Because a proximity relation need not be transitive, the elements in a domain cannot directly be partitioned into disjoint equivalent classes by a threshold cut on the proximity relation for the domain elements.

To acquire equivalent classes of a proximity relation on a domain, Shenoi et al. [18] proposed the $\alpha$-proximate relation. Two elements $u, v \in D$ are $\alpha$-proximate (denoted by $u S^{+}_{\alpha} v$) if $s(u, v) > \alpha$ or there exists a sequence $w_1, w_2, \ldots, w_r \in D$ such that $\min\{s(u, w_1), s(w_1, w_2), \ldots, s(w_{r-1}, w_r), s(w_r, v)\} > \alpha$. Given a proximity relation $s$ and a threshold $\alpha$ for domain $D$, the domain can be partitioned into disjoint subsets (called $\alpha$-proximate equivalent classes) such that the elements in a partition are $\alpha$-proximate. The equivalent classes are regarded as basic concepts for the methods reviewed or proposed hereinafter.
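The partition into $\alpha$-proximate equivalent classes can be computed as the connected components of the graph that links $u$ and $v$ whenever $s(u, v) > \alpha$. The following is a minimal sketch under that reading; the union-find helper, the function name `alpha_proximate_classes`, and the proximity values in `PROX` are assumptions made for illustration and do not come from the paper.

```python
def alpha_proximate_classes(domain, s, alpha):
    """Partition `domain` into alpha-proximate equivalence classes.

    Elements u, v are linked directly when s(u, v) > alpha; chains of such
    links place elements in the same class (a transitive closure).
    """
    parent = {u: u for u in domain}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for i, u in enumerate(domain):
        for v in domain[i + 1:]:
            if s(u, v) > alpha:
                parent[find(u)] = find(v)

    classes = {}
    for u in domain:
        classes.setdefault(find(u), []).append(u)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical proximity values, chosen only for illustration.
PROX = {
    frozenset(["pretty", "charm"]): 0.3,
    frozenset(["pretty", "cuteness"]): 0.4,
    frozenset(["charm", "cuteness"]): 0.9,
}


def s(u, v):
    """Reflexive, symmetric proximity relation built from PROX."""
    return 1.0 if u == v else PROX[frozenset([u, v])]


classes = alpha_proximate_classes(["pretty", "charm", "cuteness"], s, alpha=0.8)
print(classes)
```

With these assumed values and $\alpha = 0.8$, "charm" and "cuteness" fall into one class and "pretty" forms a class of its own. Two elements may share a class even when their direct proximity is below the threshold, as long as a chain of sufficiently proximate elements connects them.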

By extending the traditional functional dependency, research has proposed a variety of fuzzy functional dependencies (FFDs) for fuzzy databases [14, 21, 28]. The FFDs are determined by the degree of similarity of attribute values rather than by their identity. Several similarity measures of attribute values have been proposed for the extended possibility-based frameworks [7, 16, 20, 29]. Most of them provide estimates within the interval $[0, 1]$. The similarity measures are briefly restated hereinafter, in which $s$ and $\alpha \in [0, 1]$, respectively, denote the proximity relation and a threshold defined on a given domain $D = \{u_1, u_2, \ldots, u_n\}$; $\pi_A$ and $\pi_B$ represent two possibility distributions on $D$. The degree of closeness between $\pi_A$ and $\pi_B$, denoted by $\xi_1(\pi_A, \pi_B)$, is defined as follows [29]:

$$\xi_1(\pi_A, \pi_B) = \begin{cases} 0, & \min_{u_i, u_j \in \bar{D}} s(u_i, u_j) < \alpha, \\ \min_{u_i \in D} \left(1 - \left|\pi_A(u_i) - \pi_B(u_i)\right|\right), & \text{otherwise}, \end{cases} \tag{3}$$

where $\bar{D} = \{u_i \in D : \pi_A(u_i) > 0\} \cup \{u_i \in D : \pi_B(u_i) > 0\}$.

The measure $\xi_1$ may give a low degree of similarity for two values that are very similar to each other, for example, $\xi_1(\{0.9/\text{excellent}, 1/\text{good}\}, \{1/\text{good}\}) = 0.1$. To prevent such counterintuitive estimates of $\xi_1$, Chen et al. defined the possibility that $\pi_A = \pi_B$ is true as shown below [7] (here $\wedge$ denotes minimum):

$$\xi_2(\pi_A, \pi_B) = \sup_{\substack{s(u_i, u_j) \ge \alpha \\ u_i, u_j \in D}} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right). \tag{4}$$

This assessment is widely adopted in the extended possibility-based databases and is applicable to applications with subnormal distributions (i.e., $\xi_2(\pi_A, \pi_A) < 1$; see [4] for details). For normal distributions, Chen et al. [16] included the identity relation (denoted by $=_{\mathrm{id}}$) into (4) as follows:

$$\xi_3(\pi_A, \pi_B) = \begin{cases} 1, & \pi_A =_{\mathrm{id}} \pi_B, \\ \sup_{\substack{s(u_i, u_j) \ge \alpha \\ u_i, u_j \in D}} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right), & \text{otherwise}. \end{cases} \tag{5}$$


Ma et al. defined the similarity measure from the perspective of the semantic closeness between two attribute values [20], as shown below:

$$\xi_4(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \tag{6}$$

where $\delta$ denotes a semantic inclusion degree:

$$\delta(\pi_A, \pi_B) = \frac{\sum_{u_i, u_j \in D,\ s(u_i, u_j) \ge \alpha} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right)}{\sum_{j=1}^{n} \pi_B(u_j)}. \tag{7}$$

The notion of $\xi_4$ may violate the convention that the similarity degree of two values lies within $[0, 1]$: $\xi_4(\pi_A, \pi_B) \approx 1.53$ for the case that $\pi_A = \{0.9/\text{excellence}, 0.8/\text{good}\}$ and $\pi_B = \{0.7/\text{excellence}, 0.6/\text{good}\}$ when the similarity of "excellence" and "good" reaches the given threshold, that is, $s(\text{excellence}, \text{good}) \ge \alpha$. It is difficult to set up a proper threshold for estimates that range outside $[0, 1]$ with an unpredictable upper bound.
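The out-of-range behavior of $\xi_4$ can be checked numerically. The sketch below implements (6) and (7) directly, with possibility distributions assumed to be dictionaries that omit zero-possibility elements; the proximity function `s` and the threshold value are hypothetical and only need to satisfy $s(\text{excellence}, \text{good}) \ge \alpha$.

```python
def delta(pi_a, pi_b, s, alpha):
    """Semantic inclusion degree of (7)."""
    num = sum(
        min(pa, pb)
        for u, pa in pi_a.items()
        for v, pb in pi_b.items()
        if s(u, v) >= alpha
    )
    return num / sum(pi_b.values())


def xi4(pi_a, pi_b, s, alpha):
    """Similarity measure of (6): minimum of the two inclusion degrees."""
    return min(delta(pi_a, pi_b, s, alpha), delta(pi_b, pi_a, s, alpha))


def s(u, v):
    # Hypothetical proximity: every pair of distinct elements resembles at 0.9.
    return 1.0 if u == v else 0.9


pi_a = {"excellence": 0.9, "good": 0.8}
pi_b = {"excellence": 0.7, "good": 0.6}
value = xi4(pi_a, pi_b, s, alpha=0.8)
print(round(value, 3))
```

Here $\delta(\pi_A, \pi_B) = 2.6/1.3 = 2$ and $\delta(\pi_B, \pi_A) = 2.6/1.7 \approx 1.53$, so the minimum exceeds 1 because every cross pair of elements contributes to the numerator while the denominator counts each element only once.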

Liu et al. [13] extended the semantic equivalence to ensure that the result of the similarity measure lies within $[0, 1]$. The measurement adjusts the possibility distributions of values based on the $\alpha$-proximate equivalent classes of the domain before measuring their similarity. Let $C = \{C_1, C_2, \ldots, C_r\}$ be the $\alpha$-proximate equivalent classes of domain $D$. The adjusted value $\hat{\pi}_{\bullet}$ of a possibility distribution $\pi_{\bullet}$ is defined as follows:

$$\hat{\pi}_{\bullet} = \bigcup_{k=1}^{r} \left\{u_{\bullet k}/u' \mid u' \in C_k\right\}, \tag{8}$$

where $u_{\bullet k} = \sum_{j=1,\, u_j \in C_k}^{|C_k|} \pi_{\bullet}(u_j) / |\hat{C}_k|$ and $\hat{C}_k = \{u_j \in C_k : \pi_{\bullet}(u_j) > 0\}$, $k = 1, \ldots, r$. Then,

$$\xi_5(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \quad \text{where } \delta(\pi_A, \pi_B) = \frac{\sum_{i=1}^{n} \left(\hat{\pi}_A(u_i) \wedge \hat{\pi}_B(u_i)\right)}{\sum_{j=1}^{n} \hat{\pi}_B(u_j)}. \tag{9}$$

Although the methods mentioned above differ from each other in measuring the similarity of attribute values, most of them measure the similarity of tuples in the same way: they adopt the minimum of the similarities of each pair of attribute values. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the resemblance of tuples $t$ and $t'$, denoted by $\eta(t, t')$, is given by

$$\eta(t, t') = \min_{i = 1, \ldots, m} \xi_{\bullet}(\pi_{A_i}, \pi'_{A_i}), \tag{10}$$

where $\xi_{\bullet}$ could be any of $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$, or $\xi_5$ in (3)–(9). Tuples $t$ and $t'$ are redundant to each other if $\eta(t, t') \ge \alpha$, where $\alpha$ is a given threshold. The similarity measure of tuples has been applied to extract representative tuples for reducing information redundancy [17].

Fuzzy functional dependency (FFD) is a concept derived from the traditional FD. Both FFD and FD have several applications in databases, for example, redundancy elimination [30], missing data prediction, fuzzy data compression [17, 31], and lossless join decomposition [10, 28, 32]. In the literature, various FFDs are defined for different fuzzy data models. For some fuzzy data representations, such as the similarity-based fuzzy data model [33], FFDs are defined based on the equivalence classes of tuples. In the extended possibility-based databases, the definitions of FFD also vary [10, 14, 34, 35]. One example among the FFD definitions in the literature is listed below.

Definition 1 (see [10], fuzzy functional dependency). Let $X \sim> Y$ denote that attribute $Y$ is fuzzily functionally dependent on attribute $X$ in a relation $R$. The FFD $X \sim> Y$ holds in the instance $r(R)$ if and only if $\eta(t[X], t'[X]) \leqq \eta(t[Y], t'[Y])$ for every $t, t' \in r(R)$.

The example helps in understanding the problem of applying the FFDs to relation decomposition in the fuzzy databases, which is illustrated in Section 3.

3. Redundancy Removal and Tuple Merging

Several factors determine whether a relation decomposition possesses the lossless join property: the ways to decompose a relation, to remove redundant tuples, and to combine the decomposed results. Redundancy removal eliminates redundant tuples. If the similarity measures used to measure tuple redundancy are not transitive, the result of redundancy removal could be nonunique. An example of nontransitivity is that tuples $t$ and $t'$ are redundant to each other and $t'$ and $t''$ are redundant as well, but $t$ and $t''$ are not redundant. In this case, the result of redundancy removal will be $\{t, t''\}$ if $t'$ is deleted first, which differs from the one-tuple result (either $t$ or $t''$) obtained when tuples other than $t'$ are deleted first. The nontransitivity makes the result of redundancy removal order sensitive and hinders the lossless join decomposition.
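The order sensitivity described above can be reproduced with a simple greedy scan. This is a sketch under assumptions: the scan (keep a tuple unless it is redundant to one already kept) is one possible removal policy, not the procedure of any cited work, and the names `remove_redundant` and `PAIRS` are hypothetical.

```python
def remove_redundant(tuples, redundant):
    """Greedy redundancy removal: keep a tuple only if it is not redundant
    to a tuple that has already been kept."""
    kept = []
    for t in tuples:
        if not any(redundant(t, k) for k in kept):
            kept.append(t)
    return kept


# Nontransitive relation from the text: t ~ t' and t' ~ t'', but not t ~ t''.
PAIRS = {frozenset(["t", "t'"]), frozenset(["t'", "t''"])}


def redundant(a, b):
    return a == b or frozenset([a, b]) in PAIRS


first = remove_redundant(["t", "t'", "t''"], redundant)
second = remove_redundant(["t'", "t", "t''"], redundant)
print(first, second)
```

Scanning in the order $t, t', t''$ keeps two tuples, whereas scanning $t'$ first keeps only one, so the outcome depends on the processing order whenever the redundancy relation is not transitive.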

Nevertheless, most well-defined similarity measures [7, 10, 20, 29] for values in the form of possibility distributions are reflexive and symmetric but not transitive. For example, consider adopting (4) to measure the similarity of tuples. Given three values $\pi_A = \{0.6/u_1, 1.0/u_2, 0.6/u_3\}$, $\pi'_A = \{1.0/u_1, 1.0/u_2, 0.6/u_3\}$, and $\pi''_A = \{1.0/u_1, 0.6/u_2, 0.6/u_3\}$, we have $\xi_2(\pi_A, \pi'_A) = 1$, $\xi_2(\pi'_A, \pi''_A) = 1$, and $\xi_2(\pi_A, \pi''_A) = 0.6$. Considering tuples $t = (\pi_A)$, $t' = (\pi'_A)$, and $t'' = (\pi''_A)$, we have $\eta(t, t') \ge \alpha$ and $\eta(t', t'') \ge \alpha$ but $\eta(t, t'') < \alpha$ for any $\alpha > 0.6$ according to (10). Thus, the similarity measure of tuples is not transitive.
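The three values of this example can be checked with a direct implementation of (4). The sketch assumes that distinct domain elements do not resemble each other (so only pairs $u_i = u_j$ pass the threshold), which is the reading under which the stated values are obtained; the function names are hypothetical.

```python
def xi2(pi_a, pi_b, s, alpha):
    """Possibility that pi_a equals pi_b, as in (4)."""
    return max(
        (min(pa, pb)
         for u, pa in pi_a.items()
         for v, pb in pi_b.items()
         if s(u, v) >= alpha),
        default=0.0,
    )


def s(u, v):
    # Assumption: distinct domain elements have proximity 0.
    return 1.0 if u == v else 0.0


pi = {"u1": 0.6, "u2": 1.0, "u3": 0.6}
pi_p = {"u1": 1.0, "u2": 1.0, "u3": 0.6}
pi_pp = {"u1": 1.0, "u2": 0.6, "u3": 0.6}
alpha = 0.8

print(xi2(pi, pi_p, s, alpha), xi2(pi_p, pi_pp, s, alpha), xi2(pi, pi_pp, s, alpha))
```

The first two pairs reach similarity 1 while the outer pair reaches only 0.6, so for any threshold above 0.6 the induced redundancy relation fails to be transitive.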

In generalizing the projection and equal join operations of traditional databases to fuzzy databases, it is hard to obtain lossless join decomposition when the redundancy removal is order sensitive. Consider the case that $Y \sim> Z$ holds in the instance $r(R)$ of relation $R(X, Y, Z)$ based on Definition 1, namely, $\eta(t[Y], t'[Y]) \leqq \eta(t[Z], t'[Z])$ for every $t, t' \in r(R)$. Assuming that $r(R)$ consists of three tuples $\langle x_1, y, z \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$ and $X$ is a key attribute, it is possible that the two values in each of the pairs $(y, y')$, $(y, y'')$, $(z, z')$, and $(z, z'')$ are redundant to each other but $(y', y'')$ and $(z', z'')$ are not. Since $Y \sim> Z$, $R$ should be decomposed to avoid redundancy. After decomposing $R(X, Y, Z)$ into $R'(X, Y)$ and $R''(Y, Z)$, if tuple $\langle y, z \rangle$ is first removed because it is redundant to tuple $\langle y', z' \rangle$, the result of $R''(Y, Z)$ contains two tuples, $\langle y', z' \rangle$ and $\langle y'', z'' \rangle$. The natural join of $R'(X, Y)$ and $R''(Y, Z)$ generates a four-tuple result, $\langle x_1, y, z' \rangle$, $\langle x_1, y, z'' \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$, which contains a spurious tuple.

To resolve this problem, this study proposes projection and equal join operations for the fuzzy databases, which involve the evaluation of redundancy and tuple merging. Since the decomposition of relations is based on FFD, it depends on the similarity of tuples. For the data in the fuzzy model, (3)–(9) can be used to measure the similarity of tuples and define FFDs in the fuzzy databases. However, (5) restricts redundant tuples to duplicates, and (3), (4), and (6) lack transitivity. Therefore, this work adopts (9) and (10) to define approximate equality for tuples that might not be identical but have a high degree of similarity. The approximate equality enables obtaining a unique result of redundancy removal.

Definition 2 (approximately equal tuples). Two tuples $t = (\pi_1, \pi_2, \ldots, \pi_m)$ and $t' = (\pi'_1, \pi'_2, \ldots, \pi'_m)$ are approximately equal, denoted by $t \cong t'$, if

$$\min_{j = 1, \ldots, m} \xi_5(\pi_j, \pi'_j) = 1.$$

In other words, tuples $t$ and $t'$ are approximately equal if their similarity $\eta(t, t') = 1$.

Lemma 3. The approximate equality of tuples (or attribute values) is transitive.

Proof. Based on (9), $\xi_5 = 1$ holds only when the two adjusted distributions coincide, so it is obvious that if $\xi_5(\pi_A, \pi_B) = 1$ and $\xi_5(\pi_B, \pi_G) = 1$, then $\xi_5(\pi_A, \pi_G) = 1$. Thus, if $t \cong t'$ and $t' \cong t''$, then $t \cong t''$ based on (10).

Tuples that are approximately equal are considered to be redundant to each other. The notion of approximate equality can be applied to query processing with predicates containing fuzzy concepts [36] for fuzzy databases in different models. For simplicity, we let $\pi_A \cong \pi_B$ denote $\xi_5(\pi_A, \pi_B) = 1$ hereafter.

Example 4. Given values $\pi_A = \{0.75/\text{pretty}, 0.65/\text{cuteness}\}$ and $\pi_B = \{0.6/\text{pretty}, 0.7/\text{charm}, 0.9/\text{cuteness}\}$ on domain $D$ and equivalent classes $C_1 = \{\text{pretty}\}$ and $C_2 = \{\text{charm}, \text{cuteness}\}$ for $D$, the average possibilities of $\pi_B$ are $u_{B1} = 0.6/1 = 0.6$ and $u_{B2} = (0.7 + 0.9)/2 = 0.8$, yielding the adjusted value $\hat{\pi}_B = \{0.6/\text{pretty}, 0.8/\text{charm}, 0.8/\text{cuteness}\}$. Likewise, $u_{A1} = 0.75$, $u_{A2} = 0.65$, and $\hat{\pi}_A = \{0.75/\text{pretty}, 0.65/\text{charm}, 0.65/\text{cuteness}\}$. We have $\delta(\pi_A, \pi_B) = (0.6 + 0.65 + 0.65)/(0.6 + 0.8 + 0.8) \approx 0.86$ and $\delta(\pi_B, \pi_A) = 1.9/2.05 \approx 0.93$. Thus, $\xi_5(\pi_A, \pi_B) = \min\{0.86, 0.93\} = 0.86$. Given $\pi_G = \{0.6/\text{pretty}, 0.75/\text{charm}, 0.85/\text{cuteness}\}$ on $D$, we have $\pi_G \cong \pi_B$, even though $\pi_G$ is not identical to $\pi_B$, because both values have the same adjusted distribution.
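The computation in Example 4 can be retraced with a short sketch of (8) and (9). It assumes $\pi_B = \{0.6/\text{pretty}, 0.7/\text{charm}, 0.9/\text{cuteness}\}$, the reading consistent with the class average $(0.7 + 0.9)/2 = 0.8$ used in the example. The distribution `pi_h` is hypothetical and chosen only so that its class averages match those of $\pi_B$; the function names and the numerical tolerance are likewise assumptions.

```python
def adjust(pi, classes):
    """Adjusted distribution of (8): every member of a class receives the
    class average, taken over the members with positive possibility."""
    adjusted = {}
    for cls in classes:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        if positive:
            avg = sum(positive) / len(positive)
            for u in cls:
                adjusted[u] = avg
    return adjusted


def delta(pi_a, pi_b, classes):
    """Inclusion degree of (9), computed on the adjusted distributions."""
    a, b = adjust(pi_a, classes), adjust(pi_b, classes)
    return sum(min(a.get(u, 0.0), b[u]) for u in b) / sum(b.values())


def xi5(pi_a, pi_b, classes):
    return min(delta(pi_a, pi_b, classes), delta(pi_b, pi_a, classes))


def approx_equal(pi_a, pi_b, classes, tol=1e-9):
    """Approximate equality: xi5 equals 1 up to a floating-point tolerance."""
    return abs(xi5(pi_a, pi_b, classes) - 1.0) <= tol


CLASSES = [["pretty"], ["charm", "cuteness"]]
pi_a = {"pretty": 0.75, "cuteness": 0.65}
pi_b = {"pretty": 0.6, "charm": 0.7, "cuteness": 0.9}
pi_h = {"pretty": 0.6, "charm": 0.75, "cuteness": 0.85}  # hypothetical value

print(round(xi5(pi_a, pi_b, CLASSES), 2))
print(approx_equal(pi_b, pi_h, CLASSES), approx_equal(pi_a, pi_b, CLASSES))
```

Since the adjusted distributions always lie in $[0, 1]$ and are compared element by element, the ratio in (9) cannot exceed 1, and it equals 1 in both directions exactly when the two adjusted distributions coincide.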

Proposition 5. The approximate equality can be used to classify values of the fuzzy database into disjoint sets (equivalence classes).

Proof. Based on the definition of (9), it is obvious that $\xi_5$ is reflexive and symmetric, that is, $\xi_5(\pi_A, \pi_A) = 1$ and $\xi_5(\pi_A, \pi_B) = \xi_5(\pi_B, \pi_A)$ for values $\pi_A$ and $\pi_B$. Besides, approximate equality is transitive according to Lemma 3. Therefore, two sets of approximately equal values are either disjoint or the same class, in which any two of the values are approximately equal to each other.

The transitivity of a similarity measure is important to any operation involving redundancy removal or tuple merging. Besides, a transitive measure can be applied to clustering methods or data groupings, such as the ones in [36, 37].

Proposition 6. Given $\pi_A$ and its adjusted value $\hat{\pi}_A$ following (8), $\pi_A \cong \hat{\pi}_A$.

Proof. It is obvious by the definition of (9).

Buckles and Petry first proposed tuple merging and applied it to remove redundant tuples in a fuzzy database [5]. Tuple merging can also be used in the join operation. This study extends the tuple merging of Chen et al. [16] to (11) for relation combination as well as redundancy removal. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the tuple merging of $t$ and $t'$, denoted by $t \circ t'$, is given by

$$t \circ t' = \left(\hat{\pi}_{A_1} \cup_F \hat{\pi}'_{A_1},\ \hat{\pi}_{A_2} \cup_F \hat{\pi}'_{A_2},\ \ldots,\ \hat{\pi}_{A_m} \cup_F \hat{\pi}'_{A_m}\right), \tag{11}$$

where each $\hat{\pi}_{\bullet}$ (or $\hat{\pi}'_{\bullet}$) is the adjusted value of $\pi_{\bullet}$ (or $\pi'_{\bullet}$) according to (8) and $\cup_F$ denotes fuzzy union. For single-value tuples $t = (\pi_{A_1})$ and $t' = (\pi'_{A_1})$, tuple merging is alternatively denoted by $\pi_{A_1} \circ \pi'_{A_1}$.

Lemma 7. Let $\pi_A$ and $\pi'_A$ be two possibility distributions on the same domain. If $\pi_A \cong \pi'_A$, then $\pi_A \cong \pi_A \circ \pi'_A \cong \pi'_A$.

Proof. Based on (9) and (11), it is obvious that if $\xi_5(\pi_A, \pi'_A) = 1$, then $\xi_5(\pi_A, \pi_A \circ \pi'_A) = 1$ and $\xi_5(\pi'_A, \pi_A \circ \pi'_A) = 1$.
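Tuple merging as in (11) can be sketched as follows. The fuzzy union is taken here as the pointwise maximum, which is the standard choice but an assumption with respect to this paper; the function names and the distribution `pi_h` are hypothetical.

```python
def adjust(pi, classes):
    """Adjusted distribution of (8): class average over positive members."""
    adjusted = {}
    for cls in classes:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        if positive:
            avg = sum(positive) / len(positive)
            for u in cls:
                adjusted[u] = avg
    return adjusted


def merge_values(pi_a, pi_b, classes):
    """Value merging in the spirit of (11): fuzzy union (assumed to be the
    pointwise maximum) of the two adjusted distributions."""
    a, b = adjust(pi_a, classes), adjust(pi_b, classes)
    return {u: max(a.get(u, 0.0), b.get(u, 0.0)) for u in sorted(set(a) | set(b))}


def merge_tuples(t, t_prime, classes_per_attr):
    """Merge two tuples attribute by attribute."""
    return tuple(
        merge_values(x, y, c) for x, y, c in zip(t, t_prime, classes_per_attr)
    )


CLASSES = [["pretty"], ["charm", "cuteness"]]
pi_b = {"pretty": 0.6, "charm": 0.7, "cuteness": 0.9}
pi_h = {"pretty": 0.6, "charm": 0.75, "cuteness": 0.85}  # hypothetical value

merged = merge_values(pi_b, pi_h, CLASSES)
print({u: round(p, 2) for u, p in merged.items()})
```

Because the two inputs have the same adjusted distribution, their union is that same distribution, and adjusting it again changes nothing. This is the behavior that Lemma 7 relies on: merging approximately equal values yields a value approximately equal to both.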

Based on the literature review and Lemma 7, we summarize the properties of different similarity measures with threshold $\alpha = 1$ in Table 3 to show the merit of (9) adopted in this work.

4. Approximate Lossless Join Decomposition

This section first offers the operations for relation decomposition and combination. Then, it proposes a notion of approximate lossless join decomposition (ALJD), which incorporates fuzzy concepts into lossless join decomposition. It also provides a method to achieve the ALJD.

Similar to the works in [37], this study generalizes the projection and natural join operations of traditional databases to the fuzzy databases as below. Here, given a relation $R$, $\Theta$ denotes a set of attributes in $R$ (i.e., $\Theta \subset R$), and $t[\Theta]$ denotes the composite of values in tuple $t$ over attributes $\Theta$. For example, given $t \in r(R)$, $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$, and $\Theta = (A_2, A_3, A_4)$, then $t[\Theta] = (\pi_{A_2}, \pi_{A_3}, \pi_{A_4})$.

Table 3: Properties of different similarity measures with threshold $\alpha = 1$.

| Properties/measures | Equation (5) [16] | Equation (6) [20] | Equation (9) [13] |
| Transitivity | Yes | No | Yes |
| Result after merging | Lossless | Non-lossless | Approximate lossless |
| Predicates for lossless join | Identical values | Different values | Different values |

Projection. Projecting the instance of relation $R$ on attributes $\Theta \subset R$, denoted by $\Pi_{\Theta}(R)$, is given by

$$\Pi_{\Theta}(R) = \{t[\Theta] \circ t'[\Theta] : t[\Theta] \cong t'[\Theta] \mid t, t' \in r(R)\}. \tag{12}$$

Natural Join. The natural join of instances of $R(X, Y)$ and $R'(Y, Z)$, denoted by $R \otimes R'$, is defined as follows:

$$R \otimes R' = \{(t[X],\ t[Y] \circ t'[Y],\ t'[Z]) : t[Y] \cong t'[Y] \mid t \in r(R),\ t' \in r(R')\}. \tag{13}$$

In (12) and (13), tuple redundancy is determined by approximate equality (e.g., $t[Y] \cong t'[Y]$; see Definition 2), and both redundancy removal and tuple combination use the tuple merging in (11).

Proposition 8. The projection result of a relation based on (12) must be unique.

Proof. It can be directly derived from Proposition 5.

Based on the operations (12) and (13), the ALJD is formally defined after extending approximate equality from the tuple level to the relation level in Definition 9.

Definition 9 (approximately equal relation instances). Two relation instances $r(R)$ and $r'(R)$ in the fuzzy database are approximately equal, denoted by $r(R) \approx r'(R)$, if for every tuple $t \in r(R)$ there exists a tuple $t' \in r'(R)$ such that $t \cong t'$ and vice versa.

Definition 10 (approximate lossless join). A decomposition $\{R_1, R_2, \ldots, R_k\}$ of a relation $R$ in the fuzzy database is approximate lossless join if $r(R) \approx (\Pi_{R_1}(R) \otimes \Pi_{R_2}(R) \otimes \cdots \otimes \Pi_{R_k}(R))$.

The approximate lossless join decomposition means that the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of the tuples in the combination result, and vice versa.
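The fuzzy projection (12), the natural join (13), and the check of Definitions 9 and 10 can be sketched together. This is an illustrative sketch only: relations are assumed to be lists of dictionaries mapping attribute names to possibility distributions, the fuzzy union is assumed to be the pointwise maximum, and approximate equality is tested by comparing adjusted distributions, which relies on the observation that $\xi_5 = 1$ exactly when they coincide. All names and the sample data are hypothetical.

```python
TOL = 1e-9


def adjust(pi, classes):
    """Adjusted distribution of (8)."""
    out = {}
    for cls in classes:
        pos = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        if pos:
            out.update({u: sum(pos) / len(pos) for u in cls})
    return out


def value_eq(pi_a, pi_b, classes):
    """Approximate equality of values via their adjusted distributions."""
    a, b = adjust(pi_a, classes), adjust(pi_b, classes)
    return all(abs(a.get(u, 0.0) - b.get(u, 0.0)) <= TOL for u in set(a) | set(b))


def value_merge(pi_a, pi_b, classes):
    a, b = adjust(pi_a, classes), adjust(pi_b, classes)
    return {u: max(a.get(u, 0.0), b.get(u, 0.0)) for u in set(a) | set(b)}


def tuple_eq(t, t2, attrs, classes):
    return all(value_eq(t[a], t2[a], classes[a]) for a in attrs)


def tuple_merge(t, t2, attrs, classes):
    return {a: value_merge(t[a], t2[a], classes[a]) for a in attrs}


def project(rows, attrs, classes):
    """Projection (12): merge approximately equal sub-tuples."""
    result = []
    for row in rows:
        sub = {a: row[a] for a in attrs}
        for i, kept in enumerate(result):
            if tuple_eq(sub, kept, attrs, classes):
                result[i] = tuple_merge(sub, kept, attrs, classes)
                break
        else:
            result.append(tuple_merge(sub, sub, attrs, classes))
    return result


def natural_join(left, right, x, y, z, classes):
    """Natural join (13) of R(X, Y) and R'(Y, Z) on approximately equal Y."""
    out = []
    for t in left:
        for t2 in right:
            if tuple_eq(t, t2, y, classes):
                row = {a: t[a] for a in x}
                row.update(tuple_merge(t, t2, y, classes))
                row.update({a: t2[a] for a in z})
                out.append(row)
    return out


def relation_eq(r1, r2, attrs, classes):
    """Approximate equality of relation instances (Definition 9)."""
    return (all(any(tuple_eq(t, u, attrs, classes) for u in r2) for t in r1)
            and all(any(tuple_eq(t, u, attrs, classes) for u in r1) for t in r2))


CLASSES = {"A": [["a1"], ["a2"], ["a3"]],
           "X": [["lo"], ["mid", "hi"]],
           "Y": [["y1"], ["y2"]]}
R = [
    {"A": {"a1": 1.0}, "X": {"mid": 0.6, "hi": 1.0}, "Y": {"y1": 1.0}},
    {"A": {"a2": 1.0}, "X": {"mid": 0.8, "hi": 0.8}, "Y": {"y1": 1.0}},
    {"A": {"a3": 1.0}, "X": {"lo": 1.0}, "Y": {"y2": 1.0}},
]
R1 = project(R, ["A", "X"], CLASSES)
R2 = project(R, ["X", "Y"], CLASSES)
joined = natural_join(R1, R2, ["A"], ["X"], ["Y"], CLASSES)
print(len(R1), len(R2), len(joined))
print(relation_eq(joined, R, ["A", "X", "Y"], CLASSES))
```

In this sample instance the first two tuples have approximately equal values on $X$ and equal values on $Y$, so the projection on $(X, Y)$ merges them into one tuple, and the join of the two projections is approximately equal to the original instance.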

Proposition 11. Consider the following:

$$\Pi_{\Theta}(R) \approx \{t[\Theta] : t \in r(R)\}. \tag{14}$$

Proof. It can be derived from (11) and (12).

Corollary 12. Consider the following:

$$\Pi_{R}(R) \approx r(R). \tag{15}$$

Proof. It can be derived directly from Proposition 11.

The projection of a relation over its own schema, as shown in Corollary 12, performs no operation other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal on a relation instance is approximately equal to the original instance. This property is essential for obtaining a combination result that is approximately equal to the original instance after relation decomposition.

This study proposes the following FFD for decomposition in the fuzzy database.

Definition 13. The FFD $X \sim> Y$ holds in the relation instance $r(R)$ if, for every $t, t' \in r(R)$, $t[X] \cong t'[X]$ implies $t[Y] \cong t'[Y]$.

Remark 14. An FD in a traditional database is a special case of the FFD. If an FD $X \rightarrow Y$ holds in $r(R)$, then $X \sim> Y$ holds as well, because $t[\Theta] \cong t'[\Theta]$ must be true for any $\Theta \subset R$ if $t[\Theta] = t'[\Theta]$.
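Definition 13 translates into a pairwise check over a relation instance. In the sketch below, the approximate-equality test is passed in as a caller-supplied predicate `value_eq` (a hypothetical interface); using plain identity for it gives the crisp special case of Remark 14. The sample data are the relation $R$ of Table 1.

```python
def holds_ffd(rows, x_attrs, y_attrs, value_eq):
    """Return True if X ~> Y holds in `rows` in the sense of Definition 13.

    `value_eq(attr, u, v)` decides approximate equality of two values of
    attribute `attr`; it is supplied by the caller.
    """
    for i, t in enumerate(rows):
        for t2 in rows[i + 1:]:
            same_x = all(value_eq(a, t[a], t2[a]) for a in x_attrs)
            same_y = all(value_eq(a, t[a], t2[a]) for a in y_attrs)
            if same_x and not same_y:
                return False
    return True


def crisp_eq(attr, u, v):
    # Crisp special case: approximate equality reduces to identity.
    return u == v


R = [
    {"A1": "x", "A2": "p", "A3": "m"},
    {"A1": "y", "A2": "p", "A3": "m"},
    {"A1": "z", "A2": "q", "A3": "n"},
]
print(holds_ffd(R, ["A2"], ["A3"], crisp_eq))  # A2 ~> A3
print(holds_ffd(R, ["A2"], ["A1"], crisp_eq))  # A2 ~> A1
```

In this instance the dependency of $A_3$ on $A_2$ holds, while $A_2$ does not determine $A_1$ because the first two tuples agree on $A_2$ but differ on $A_1$.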

Lemma 15. Given relations $R$ and $R'$ with $r(R') \approx r(R)$, if $r(R)$ satisfies a set $F$ of FFDs, then $r(R')$ satisfies $F$.

Proof. By contradiction, assume that $r(R') \approx r(R)$ and there exists an FFD $X \sim> Y$ such that $R$ satisfies $X \sim> Y$ and $R'$ does not. Because $X \sim> Y$ holds in $r(R)$,

$$\text{for every } t_1, t_2 \in r(R), \text{ if } t_1[X] \cong t_2[X], \text{ then } t_1[Y] \cong t_2[Y]. \tag{16}$$

Since $X \sim> Y$ does not hold in $r(R')$, there exist $t'_1, t'_2 \in r(R')$ such that $t'_1[X] \cong t'_2[X]$ and $\eta(t'_1[Y], t'_2[Y]) \ne 1$. Since $r(R') \approx r(R)$, there exist $t, t'' \in r(R)$ such that $t \cong t'_1$ and $t'' \cong t'_2$. Then, by the transitivity of approximate equality (Lemma 3), we have $t, t'' \in r(R)$ and $t[X] \cong t''[X]$ but $\eta(t[Y], t''[Y]) \ne 1$, which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including the reflexive rule, the augmentation rule, and the transitive rule.³ This property enables the result of lossless join decomposition to have the dependency preservation property⁴ [1].


Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that exist in $r(R)$.
Step 1. Let $\mathcal{R} = \{R\}$ and $F'$ be the set of all $f \in F$ that are not trivial.
Step 2. For an $f' : X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.
Step 3. If $X$ is not a key attribute in $R'$, do the following:
(1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X, Y)}(R')$ and $R_2 = \Pi_{R'(\bar{Y})}(R')$;
(2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);
(3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.
Step 4. Go to Step 2 if $F'$ is not empty.
Output: $\mathcal{R}$, the set of relations decomposed from $R$.
Note: $R'(\bar{Y})$ represents the list of all attributes in $R'$ other than $Y$, so that $R_1$ and $R_2$ share $X$ as in Lemma 16.

Algorithm 1: ALJD Algorithm.
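At the schema level, Algorithm 1 can be sketched as below. Several points are assumptions rather than statements of the paper: schemas are modeled as sets of attribute names; the key test uses the attribute closure under Armstrong-style inference; $R_2$ is taken as $R'$ without $Y$, the shape used in Lemma 16; and each FFD is removed from the pending list whether or not it triggers a decomposition, so that the loop always terminates. The function names are hypothetical.

```python
def closure(attrs, ffds):
    """Attribute closure of `attrs` under the given (lhs, rhs) dependencies."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in ffds:
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return closed


def aljd_schemas(schema, ffds):
    """Schema-level sketch of Algorithm 1.

    `schema` is a set of attribute names; `ffds` is a list of (lhs, rhs)
    pairs of attribute sets. Returns the list of decomposed schemas.
    """
    ffds = [(frozenset(l), frozenset(r)) for l, r in ffds]
    relations = [frozenset(schema)]
    pending = [f for f in ffds if not f[1] <= f[0]]   # Step 1: nontrivial FFDs
    while pending:                                    # Step 4: until F' is empty
        lhs, rhs = pending.pop(0)                     # Step 2: pick f': X ~> Y
        target = next((r for r in relations if lhs | rhs <= r), None)
        # Step 3: skip when no relation holds X and Y, or X is a key of R'
        if target is None or target <= closure(lhs, ffds):
            continue
        relations.remove(target)
        relations += [lhs | rhs, target - (rhs - lhs)]  # R1 = (X, Y), R2 = R' - Y
    return relations


result = aljd_schemas({"A", "X", "Y"}, [({"X"}, {"Y"})])
print([sorted(r) for r in result])
```

For the schema $R(A, X, Y)$ with $X \sim> Y$, the sketch returns the schemas $(X, Y)$ and $(A, X)$, the decomposition considered in Lemma 16. A schema in which $X$ is already a key is left unchanged.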

Lemma 16. Let $R(A, X, Y)$ be a relation and $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$, and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

For the first part, we proceed by contradiction. Assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$, because $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$. Since $r(R') \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A], t_1[X] \circ t_2[X], t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A], t_1[X] \circ t_2[X], t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus $t' \cong t$, which contradicts the assumption.

For the second part, we again proceed by contradiction, with renewed symbols. Assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$ based on (13). Also, we let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1), \quad \hat{t}_2 \in r(R_2), \quad \hat{t}_1 \cong t'[A, X], \quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A,X)}(R)$, based on Proposition 11 there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17) and (18), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (19), and $t' \cong t$, which contradicts the assumption.

In Lemma 16, each one of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.

Definition 17. Let $F$ be the set of FFDs. An FFD $f\colon X \sim> Y$ in $F$ is trivial if there exists an FFD $f'\colon X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, it takes more join operations to obtain the original data during query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites, as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of normalization in traditional databases, where only the FDs of nonkey attributes are considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for the fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. Excluding FFDs that contain key attributes prevents unnecessary decomposition of relations that have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations during query processing.
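The steps of Algorithm 1 can be sketched at the schema level as follows. This is only an illustrative sketch under several assumptions that are not part of the paper: relations are modeled as sets of attribute names, an FFD is a pair `(X, Y)` of attribute sets, and the key test is approximated by attribute closure under the given FFDs rather than by inspecting tuples with $\cong$. Following Lemma 16, the sketch keeps $X$ in both parts and projects the remainder onto all attributes other than $Y$; this is an assumed reading of the note in Algorithm 1. The names `closure`, `nontrivial`, and `aljd` are hypothetical.

```python
def closure(attrs, ffds):
    """Attribute closure of `attrs` under `ffds` (stand-in for the key test)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for x, y in ffds:
            if x <= result and not y <= result:
                result |= y
                changed = True
    return result


def nontrivial(ffds):
    """Drop FFDs X ~> Y that are trivial in the sense of Definition 17,
    i.e. some other FFD X ~> Y' in the set has Y as a proper subset of Y'."""
    return [(x, y) for x, y in ffds
            if not any(x == x2 and y < y2 for x2, y2 in ffds)]


def aljd(schema, ffds):
    """Schema-level sketch of the ALJD steps; each FFD is visited once."""
    relations = {frozenset(schema)}
    for x, y in nontrivial(ffds):
        # Step 2: choose a relation that contains both X and Y.
        target = next((r for r in relations if (x | y) <= r), None)
        if target is None:
            continue
        # Step 3: skip the FFD when X acts as a key of the chosen relation.
        if target <= closure(x, ffds):
            continue
        r1 = x | y          # projection onto (X, Y)
        r2 = target - y     # assumed: all attributes other than Y, so X is kept
        relations = (relations - {target}) | {r1, r2}
    return relations


ffds = [
    (frozenset("X"), frozenset("YZ")),
    (frozenset("X"), frozenset("Y")),   # trivial, because Y is a subset of YZ
]
result = aljd("AXYZ", ffds)
print(sorted("".join(sorted(r)) for r in result))
```

Visiting each FFD once is a design choice of the sketch: it avoids looping on an FFD that is skipped because its left-hand side is a key.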

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$ and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived from Proposition 11 and Lemma 15. Statement (ii) can be derived from Lemma 16 and the property of FFDs (namely, Armstrong's inference rules IR1, IR2, and IR3 described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to the issues related to the application of FFDs.

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for the tuples or relations in the fuzzy databases and provides a measure of the approximate equality. The measure is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in the fuzzy databases. Second, it proposes approximate lossless join decomposition for the fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to the issue of data compression in the fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in the fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for the fuzzy databases are, respectively, the general cases of the traditional FD and lossless join decomposition. This generality is important for dealing with databases containing both crisp data and fuzzy data. Fourth, similar to the existing approaches of database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed approach of decomposition satisfies a degree of lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for the extended possibility-based fuzzy databases is more difficult because the data are more complex.

There are several directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in the fuzzy databases. Research can apply the notion to topics related to data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in the fuzzy databases. Studies that aim to incorporate the fuzzy concept into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality for the similarity measures. Since the fuzzy concept has been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in the fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of National Science Council NSC102-2410-H-155-036-MY2 and Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distributions and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \to Y$ if $Y \subseteq X$. IR2 (augmentation rule): if $X \to Y$, then $XZ \to YZ$. IR3 (transitive rule): if $X \to Y$ and $Y \to Z$, then $X \to Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relation decomposed from $R$ or can be represented via Armstrong's inference rules from the FFDs in these relations.

References

[1] S. B. Navathe and R. Elmasri, Fundamentals of Database Systems, Addison-Wesley, New York, NY, USA, 2010.

[2] P. A. Bernstein, J. R. Swenson, and D. C. Tsichritzis, "A unified approach to functional dependencies and relations," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 237–245, 1975.

[3] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM, vol. 13, no. 6, pp. 377–387, 1970.

[4] L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility," Fuzzy Sets and Systems, vol. 1, no. 1, pp. 3–28, 1978.

[5] B. P. Buckles and F. E. Petry, "A fuzzy representation of data for relational databases," Fuzzy Sets and Systems, vol. 7, no. 3, pp. 213–226, 1982.

[6] S. Shenoi and A. Melton, "Proximity relations in the fuzzy relational database model," Fuzzy Sets and Systems, vol. 31, no. 3, pp. 285–296, 1989.

[7] G. Chen, J. Vandenbulcke, and E. E. Kerre, "A general treatment of data redundancy in a fuzzy relational data model," Journal of the American Society for Information Science, vol. 43, no. 4, pp. 304–311, 1992.

[8] H. Prade and C. Testemale, "Generalizing database relational algebra for the treatment of incomplete or uncertain information and vague queries," Information Sciences, vol. 34, no. 2, pp. 115–143, 1984.

[9] J. C. Cubero and M. A. Vila, "New definition of fuzzy functional dependency in fuzzy relational databases," International Journal of Intelligent Systems, vol. 9, no. 5, pp. 441–448, 1994.

[10] K. Raju and A. K. Majumdar, "Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems," ACM Transactions on Database Systems, vol. 13, no. 2, pp. 129–166, 1988.

[11] G. Chen, E. E. Kerre, and J. Vandenbulcke, "Normalization based on fuzzy functional dependency in a fuzzy relational data model," Information Systems, vol. 21, no. 3, pp. 299–310, 1996.

[12] S. Jyothi and M. S. Babu, "Multivalued dependencies in fuzzy relational databases and lossless join decomposition," Fuzzy Sets and Systems, vol. 88, no. 3, pp. 315–332, 1997.

[13] J. Y. Liu, P. Chang, and C. P. C. Yeh, "Consistent data operations for multi-databases in extended possibility-based data models," Expert Systems with Applications, vol. 36, no. 3, pp. 6174–6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, and F. Mili, "Data dependencies in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 3, pp. 321–332, 2002.

[15] J. Y. Liu, "Data integration constraints for consistent data redundancy in fuzzy databases," International Journal of Intelligent Systems, vol. 23, no. 6, pp. 635–653, 2008.

[16] G. Chen, J. Vandenbulcke, and E. E. Kerre, "On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model," in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440–446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, and X. Tang, "Extracting representative information to enhance flexible data queries," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 928–941, 2012.

[18] S. Shenoi, A. Melton, and L. T. Fan, "An equivalence classes model of fuzzy relational databases," Fuzzy Sets and Systems, vol. 38, no. 2, pp. 153–170, 1990.

[19] M. Koyuncu and A. Yazici, "IFOOD: An intelligent fuzzy object-oriented database architecture," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1137–1154, 2003.

[20] Z. Ma, W. Zhang, and W. Ma, "Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 15, no. 8, pp. 705–716, 2000.

[21] V. V. Cross, "Fuzzy extensions for relationships in a generalized object model," International Journal of Intelligent Systems, vol. 16, no. 7, pp. 843–861, 2001.

[22] V. V. Cross, "Defining fuzzy relationships in object models: abstraction and interpretation," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 5–27, 2003.

[23] S. Wang and G. H. Huang, "A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning," Journal of Environmental Management, vol. 117, pp. 208–218, 2013.

[24] S. Wang and G. H. Huang, "An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: An application to water resources management," Stochastic Environmental Research and Risk Assessment, vol. 27, no. 6, pp. 1493–1506, 2013.

[25] S. Wang and G. H. Huang, "Interactive fuzzy boundary interval programming for air quality management under uncertainty," Water, Air, and Soil Pollution, vol. 224, article 1574, 2013.

[26] Z. M. Ma and F. Mili, "Handling fuzzy information in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 10, pp. 925–942, 2002.

[27] Z. M. Ma and L. Yan, "Updating extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 22, no. 3, pp. 237–258, 2007.

[28] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies: an overview and a critical discussion," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325–330, 1994.

[29] P. Bosc and O. Pivert, "On the impact of regular functional dependencies when moving to a possibilistic database framework," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 207–227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, and W. Bandler, "On nearness measures in fuzzy relational data models," International Journal of Approximate Reasoning, vol. 3, no. 3, pp. 267–298, 1989.

[31] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies and redundancy elimination," Journal of the American Society for Information Science, vol. 49, no. 3, pp. 217–235, 1998.

[32] Z. M. Ma, W. J. Zhang, and F. Mili, "Fuzzy data compression based on data dependencies," International Journal of Intelligent Systems, vol. 17, no. 4, pp. 409–426, 2002.

[33] B. Bhuniya and P. Niyogi, "Lossless join property in fuzzy relational databases," Data and Knowledge Engineering, vol. 11, no. 2, pp. 109–124, 1993.

[34] O. Bahar and A. Yazici, "Normalization and lossless join decomposition of similarity-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 19, no. 10, pp. 885–917, 2004.

[35] T. K. Bhattacharjee and A. K. Mazumdar, "Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model," Fuzzy Sets and Systems, vol. 96, no. 3, pp. 343–352, 1998.

[36] Z. M. Ma and L. Yan, "Generalization of strategies for fuzzy query translation in classical relational databases," Information and Software Technology, vol. 49, no. 2, pp. 172–180, 2007.

[37] J. Zhang, Q. Wei, and G. Chen, "An efficient incremental method for generating equivalence groups of search results in information retrieval and queries," Knowledge-Based Systems, vol. 32, pp. 91–100, 2012.

[38] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144 of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.


Table 1

          R                  R'           R''
      A1   A2   A3        A1   A2       A2   A3
t1    x    p    m         x    p        p    m
t2    y    p    m         y    p        q    n
t3    z    q    n         z    q

Table 2

      Equal join of R' and R'' on A2       R' * R''
      A1   A2   A2   A3                    A1   A2   A3
t1    x    p    p    m                     x    p    m
t2    y    p    p    m                     y    p    m
t3    z    q    q    n                     z    q    n

fuzzy sets and possibility distribution theory. The possibility-based fuzzy theory has been widely applied in environmental management, such as flood-diversion planning [23], water resources management [24], and air quality management [25]. This work considers the extended possibility-based databases proposed by Chen et al. [7] because they can capture both the possibility-based fuzzy model and the resemblance-based fuzzy concept. The fuzzy database has drawn much research attention on semantic measures, information processing, update operations, and UML class diagrams [20, 26, 27]. The data model of the fuzzy databases is a hybrid of the possibility-based data model in [8] and the resemblance-based data model in [6]. The possibility-based model derives from Zadeh's fuzzy theory. In the theory [4], a fuzzy set $F$ on a universe of discourse $U$ is described by $\{\mu_F(u)/u \mid u \in U\}$, where $\mu_F\colon U \to [0, 1]$ is a membership function for the fuzzy set $F$ and $\mu_F(u)$ denotes the degree of membership of $u$ in $F$. In a possibility-based database [8], the value of an attribute $A$ on a domain $D$ is a possibility distribution $\pi_A = \{\pi_A(u)/u \mid u \in D\}$, where $\pi_A(u)$ denotes the possibility that $u$ is the actual value of $A$. For example, $D = \{u_1, u_2, u_3\}$ and $\pi_A = \{0.8/u_1, 0.5/u_2, 0.6/u_3\}$. An example of applying the possibility-based fuzzy theory (see endnote 2) in the real world is shown below. Consider that the domain of the attribute "eye color" is {black, brown, blue, green} and a possibility distribution is given below:

$$\text{Asia-Color} = \{1.0/\text{black},\ 0.8/\text{brown},\ 0.3/\text{blue},\ 0.1/\text{green}\}. \tag{2}$$

Suppose that John's eye color is an "Asia color." Then, according to the interpretation of the possibility-based fuzzy theory, one concludes that the possibility of John's eye color being blue is 0.3.

In the extended possibility-based database, attribute values are represented by possibility distributions of an attribute on its domain, and a domain is associated with a similarity relation of domain elements. Formally, an $m$-ary relation instance $r$ on a schema $R(A_1, A_2, \ldots, A_m)$ in the fuzzy database is a subset of the Cartesian product $\Phi(A_1) \times \Phi(A_2) \times \cdots \times \Phi(A_m)$, where $\Phi(A_i)$ represents the set of all possibility distributions of attribute $A_i$ on its domain. For a domain $D_i$, a proximity relation is given to describe the resemblance between domain elements in $D_i$. A proximity is a mapping $s_i\colon D_i \times D_i \to [0, 1]$ with reflexivity and symmetry, that is, $s_i(u, u) = 1$ and $s_i(u, v) = s_i(v, u)$. The elements in a domain cannot directly be partitioned into disjoint equivalent classes by a threshold cutting on the proximity relation for the domain elements.

To acquire equivalent classes of a proximity relation on a domain, Shenoi et al. [18] proposed the $\alpha$-proximate relation. Two elements $u, v \in D$ are $\alpha$-proximate (denoted by $u S_\alpha^{+} v$) if $s(u, v) > \alpha$ or there exists a sequence $w_1, w_2, \ldots, w_r \in D$ such that $\min\{s(u, w_1), s(w_1, w_2), \ldots, s(w_{r-1}, w_r), s(w_r, v)\} > \alpha$. Given a proximity relation $s$ and a threshold $\alpha$ for domain $D$, the domain can be partitioned into disjoint subsets (called $\alpha$-proximate equivalent classes) such that the elements in a partition are $\alpha$-proximate. The equivalent classes are regarded as basic concepts for the methods being reviewed or proposed hereinafter.
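As a rough illustration of the partitioning described above, the following sketch groups domain elements that are connected by a chain of pairs whose proximity exceeds $\alpha$, which is equivalent to the minimum along the chain exceeding $\alpha$. The domain values, the proximity table, and the function name `alpha_proximate_classes` are made up for this example, and unlisted pairs are assumed to have proximity 0.

```python
from itertools import combinations


def alpha_proximate_classes(domain, proximity, alpha):
    """Partition `domain` into alpha-proximate equivalent classes.

    `proximity` maps a frozenset of two elements to their proximity degree;
    pairs that are not listed are treated as 0 (an assumption of this sketch).
    """
    parent = {u: u for u in domain}

    def find(u):
        # Follow parent links to the class representative (with path halving).
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Merge every pair whose proximity exceeds alpha; chains merge transitively.
    for u, v in combinations(domain, 2):
        if proximity.get(frozenset((u, v)), 0.0) > alpha:
            parent[find(u)] = find(v)

    classes = {}
    for u in domain:
        classes.setdefault(find(u), set()).add(u)
    return sorted(sorted(c) for c in classes.values())


domain = ["poor", "fair", "good", "excellent"]
proximity = {
    frozenset(("good", "excellent")): 0.9,
    frozenset(("fair", "good")): 0.8,
    frozenset(("fair", "excellent")): 0.5,
    frozenset(("poor", "fair")): 0.3,
}
classes = alpha_proximate_classes(domain, proximity, alpha=0.7)
print(classes)
```

In this made-up domain, "fair" and "excellent" fall into the same class even though their direct proximity is only 0.5, because "good" links them.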

By extending traditional functional dependency, research has proposed a variety of fuzzy functional dependencies (FFDs) for fuzzy databases [14, 21, 28]. The FFDs are determined by the degree of similarity of attribute values rather than by identity. Several similarity measures of attribute values have been proposed for the extended possibility-based frameworks [7, 16, 20, 29]. Most of them provide estimates within the interval $[0, 1]$. The similarity measures are briefly restated hereinafter, in which $s$ and $\alpha \in [0, 1]$, respectively, denote the proximity relation and a threshold defined on a given domain $D = \{u_1, u_2, \ldots, u_n\}$, and $\pi_A$ and $\pi_B$ represent two possibility distributions on $D$. The degree of closeness between $\pi_A$ and $\pi_B$, denoted by $\xi_1(\pi_A, \pi_B)$, is defined as follows [29]:

$$\xi_1(\pi_A, \pi_B) = \begin{cases} 0, & \min_{u_i, u_j \in \bar{D}} s(u_i, u_j) < \alpha, \\ \min_{u_i \in D} \left(1 - \left|\pi_A(u_i) - \pi_B(u_i)\right|\right), & \text{otherwise,} \end{cases} \tag{3}$$

where $\bar{D} = \{u_i \in D : \pi_A(u_i) > 0\} \cup \{u_i \in D : \pi_B(u_i) > 0\}$.

The measure $\xi_1$ may give a low degree of similarity for two values that are very similar to each other, for example, $\xi_1(\{0.9/\text{excellent}, 1/\text{good}\}, \{1/\text{good}\}) = 0.1$. To prevent some counter-intuitive estimates of $\xi_1$, Chen et al. defined the possibility that $\pi_A = \pi_B$ is true, as shown below [7] (here $\wedge$ denotes minimum):

$$\xi_2(\pi_A, \pi_B) = \sup_{s(u_i, u_j) \ge \alpha;\ u_i, u_j \in D} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right). \tag{4}$$

This assessment is widely adopted in the extended possibility-based databases and is adoptable for applications with subnormal distributions (i.e., $\xi_2(\pi_A, \pi_A) < 1$; see [4] for details). For normal distributions, Chen et al. [16] included the identity relation (denoted by $=_{\mathrm{id}}$) into (4) as follows:

$$\xi_3(\pi_A, \pi_B) = \begin{cases} 1, & \pi_A =_{\mathrm{id}} \pi_B, \\ \sup_{s(u_i, u_j) \ge \alpha;\ u_i, u_j \in D} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right), & \text{otherwise.} \end{cases} \tag{5}$$
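The measures in (3) and (4) can be checked numerically with a short sketch. Possibility distributions are represented here as dictionaries from domain elements to possibility degrees, and the proximity value 0.9 between "excellent" and "good" together with the threshold 0.8 are assumptions chosen so that the pair passes the threshold; the paper does not state them. The function names `s`, `xi1`, and `xi2` are hypothetical.

```python
# Assumed proximity table for this example; unlisted pairs default to 0.
PROXIMITY = {frozenset(("excellent", "good")): 0.9}


def s(u, v):
    """Reflexive and symmetric proximity relation on the domain."""
    return 1.0 if u == v else PROXIMITY.get(frozenset((u, v)), 0.0)


def xi1(pa, pb, alpha):
    """Degree of closeness following (3); assumes a non-empty support."""
    support = [u for u in set(pa) | set(pb)
               if pa.get(u, 0.0) > 0 or pb.get(u, 0.0) > 0]
    if min(s(u, v) for u in support for v in support) < alpha:
        return 0.0
    # Elements outside the support contribute 1, so they cannot lower the minimum.
    return min(1 - abs(pa.get(u, 0.0) - pb.get(u, 0.0)) for u in support)


def xi2(pa, pb, alpha):
    """Possibility that the two values are equal, following (4)."""
    return max((min(pa[u], pb[v]) for u in pa for v in pb if s(u, v) >= alpha),
               default=0.0)


a = {"excellent": 0.9, "good": 1.0}
b = {"good": 1.0}
print(round(xi1(a, b, alpha=0.8), 2))  # 0.1
print(xi2(a, b, alpha=0.8))            # 1.0
```

Under these assumptions, $\xi_1$ rates the two values as only 0.1 similar while $\xi_2$ rates them as fully possible to be equal.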


Ma et al. defined the similarity measure from the perspective of the semantic closeness between two attribute values [20], as shown below:

$$\xi_4(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \tag{6}$$

where $\delta$ denotes a semantic inclusion degree. Consider the following:

$$\delta(\pi_A, \pi_B) = \frac{\sum_{u_i, u_j \in D;\ s(u_i, u_j) \ge \alpha} \left(\pi_A(u_i) \wedge \pi_B(u_j)\right)}{\sum_{j=1}^{n} \pi_B(u_j)}. \tag{7}$$

The notion of $\xi_4$ may violate the convention that the similarity degree of two values lies within $[0, 1]$: $\xi_4(\pi_A, \pi_B) = 1.52$ for the case that $\pi_A = \{0.9/\text{excellence}, 0.8/\text{good}\}$ and $\pi_B = \{0.7/\text{excellence}, 0.6/\text{good}\}$ when the similarity of "excellence" and "good" is larger than the given threshold, that is, $s(\text{excellence}, \text{good}) \ge \alpha$. It is difficult to set up a proper threshold for estimates that range outside $[0, 1]$ and have an unpredictable upper bound.
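The out-of-range value in this example can be reproduced with a small sketch of (6) and (7). The proximity degree 0.9 and the threshold 0.8 are assumptions made only so that $s(\text{excellence}, \text{good}) \ge \alpha$ holds; the function names `s`, `delta`, and `xi4` are hypothetical.

```python
# Assumed proximity table; it only needs s(excellence, good) >= alpha.
PROXIMITY = {frozenset(("excellence", "good")): 0.9}


def s(u, v):
    """Reflexive and symmetric proximity relation on the domain."""
    return 1.0 if u == v else PROXIMITY.get(frozenset((u, v)), 0.0)


def delta(pa, pb, alpha):
    """Semantic inclusion degree following (7)."""
    numerator = sum(min(pa[u], pb[v])
                    for u in pa for v in pb if s(u, v) >= alpha)
    return numerator / sum(pb.values())


def xi4(pa, pb, alpha):
    """Semantic closeness following (6): minimum of both inclusion degrees."""
    return min(delta(pa, pb, alpha), delta(pb, pa, alpha))


a = {"excellence": 0.9, "good": 0.8}
b = {"excellence": 0.7, "good": 0.6}
value = xi4(a, b, alpha=0.8)
print(round(value, 3))  # 1.529
```

All four element pairs pass the threshold, so the numerator is 2.6 in both directions, and the smaller quotient is 2.6/1.7, which is about 1.529 and agrees with the value of 1.52 quoted above up to truncation.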

Liu et al. [13] extended the semantic equivalence to ensure that the result of the similarity measure lies within $[0, 1]$. The measurement adjusts the possibility distributions of values based on the $\alpha$-proximate equivalent classes of the domain before measuring their similarity. Let $C = \{C_1, C_2, \ldots, C_r\}$ be the $\alpha$-proximate equivalent classes of domain $D$. The adjusted value of a possibility distribution $\pi_\bullet$ is defined as follows:

$$\hat{\pi}_\bullet = \bigcup_{k=1}^{r} \left\{ \hat{u}_{\bullet k} / u' : u' \in C_k \right\}, \tag{8}$$

where $\hat{u}_{\bullet k} = \sum_{j=1,\ u_j \in C_k}^{|C_k|} \pi_\bullet(u_j) / |\bar{C}_k|$ and $\bar{C}_k = \{u_j \in C_k : \pi_\bullet(u_j) > 0\}$, $k = 1, \ldots, r$. Then

$$\xi_5(\pi_A, \pi_B) = \delta(\hat{\pi}_A, \hat{\pi}_B) \wedge \delta(\hat{\pi}_B, \hat{\pi}_A), \quad \text{where } \delta(\hat{\pi}_A, \hat{\pi}_B) = \frac{\sum_{i=1}^{n} \left(\hat{\pi}_A(u_i) \wedge \hat{\pi}_B(u_i)\right)}{\sum_{j=1}^{n} \hat{\pi}_B(u_j)}. \tag{9}$$

Although the methods mentioned above differ from each other in measuring the similarity of attribute values, most methods of measuring the similarity of tuples are the same. The methods adopt the minimum of the similarity of each pair of attribute values. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the resemblance of tuples $t$ and $t'$, denoted by $\eta(t, t')$, is given by

$$\eta(t, t') = \min_{i=1,\ldots,m} \xi_\bullet(\pi_{A_i}, \pi'_{A_i}), \tag{10}$$

where $\xi_\bullet$ could be any of $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$, or $\xi_5$ in (3)–(9). Tuples $t$ and $t'$ are redundant to each other if $\eta(t, t') \ge \alpha$, where $\alpha$ is a given threshold. The similarity measure of tuples has been applied to extract representative tuples for reducing information redundancy [17].

Fuzzy functional dependency (FFD) is a concept derived from traditional FD. Both FFD and FD have several applications in databases, for example, redundancy elimination [30], missing data prediction, fuzzy data compression [17, 31], and lossless join decomposition [10, 28, 32]. In the literature, various FFDs are defined for different fuzzy data models. For some fuzzy data representations, FFDs are defined based on the equivalence classes of tuples, such as in the similarity-based fuzzy data model [33]. In the extended possibility-based databases, the definitions of FFD also vary [10, 14, 34, 35]. One example among the FFD definitions in the literature is listed below.

Definition 1 (see [10], fuzzy functional dependency). Let $X \sim> Y$ denote that attribute $Y$ is fuzzy functionally dependent on attribute $X$ in a relation $R$. The FFD $X \sim> Y$ holds in the instance $r(R)$ if and only if $\eta(t[X], t'[X]) \le \eta(t[Y], t'[Y])$ for every $t, t' \in r(R)$.

This example helps in understanding the problem of applying FFDs to relation decomposition in the fuzzy databases, which is illustrated in Section 3.

3. Redundancy Removal and Tuple Merging

Several factors determine whether a relation decomposition possesses the lossless join property. They are the ways to decompose a relation, to remove redundant tuples, and to combine the decomposed results. Redundancy removal eliminates redundant tuples. If the similarity measure used to assess tuple redundancy is not transitive, the result of redundancy removal could be nonunique. An example of nontransitivity is that tuples $t$ and $t'$ are redundant to each other and $t'$ and $t''$ are redundant as well, but $t$ and $t''$ are not redundant. In this case, the result of redundancy removal will be $\{t, t''\}$ if $t'$ is deleted first, which differs from the one-tuple result (either $t$ or $t''$) when first deleting the tuples other than $t'$. The nontransitivity makes the result of redundancy removal order sensitive and hinders the lossless join decomposition.

Nevertheless, most well-defined similarity measures [7, 10, 20, 29] for the values of possibility distributions are reflexive and symmetric but not transitive. For example, consider adopting (4) to measure the similarity of tuples. Given three values $\pi_A = \{0.6/u_1, 1.0/u_2, 0.6/u_3\}$, $\pi'_A = \{1.0/u_1, 1.0/u_2, 0.6/u_3\}$, and $\pi''_A = \{1.0/u_1, 0.6/u_2, 0.6/u_3\}$, we have $\xi_2(\pi_A, \pi'_A) = 1$, $\xi_2(\pi'_A, \pi''_A) = 1$, and $\xi_2(\pi_A, \pi''_A) = 0.6$. Considering tuples $t = (\pi_A)$, $t' = (\pi'_A)$, and $t'' = (\pi''_A)$, we have $\eta(t, t') \ge \alpha$ and $\eta(t', t'') \ge \alpha$ but $\eta(t, t'') < \alpha$ for any $\alpha > 0.6$ according to (10). Thus, the similarity measure of tuples is not transitive.
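The nontransitivity in this example, and the resulting order sensitivity of redundancy removal, can be reproduced with the sketch below. It assumes that distinct domain elements have proximity 0 (so only identical elements pass the threshold) and uses $\alpha = 0.8$; both are assumptions of this illustration, as are the greedy removal procedure and the names `xi2`, `eta`, and `remove_redundant`.

```python
ALPHA = 0.8  # any threshold above 0.6 shows the effect


def s(u, v):
    """Assumed proximity: distinct domain elements are not similar at all."""
    return 1.0 if u == v else 0.0


def xi2(pa, pb):
    """Possibility that two attribute values are equal, following (4)."""
    return max((min(pa[u], pb[v]) for u in pa for v in pb if s(u, v) >= ALPHA),
               default=0.0)


def eta(t1, t2):
    """Tuple resemblance following (10): minimum over attribute positions."""
    return min(xi2(a, b) for a, b in zip(t1, t2))


def remove_redundant(tuples, deletion_order):
    """Greedy removal: drop a tuple if it is redundant to another remaining one."""
    remaining = dict(tuples)
    for name in deletion_order:
        others = [t for key, t in remaining.items() if key != name]
        if any(eta(remaining[name], other) >= ALPHA for other in others):
            del remaining[name]
    return sorted(remaining)


tuples = {
    "t":   ({"u1": 0.6, "u2": 1.0, "u3": 0.6},),
    "t'":  ({"u1": 1.0, "u2": 1.0, "u3": 0.6},),
    "t''": ({"u1": 1.0, "u2": 0.6, "u3": 0.6},),
}
print(eta(tuples["t"], tuples["t'"]))    # 1.0
print(eta(tuples["t'"], tuples["t''"]))  # 1.0
print(eta(tuples["t"], tuples["t''"]))   # 0.6
print(remove_redundant(tuples, ["t'", "t", "t''"]))  # ['t', "t''"]
print(remove_redundant(tuples, ["t", "t'", "t''"]))  # ["t''"]
```

Deleting $t'$ first leaves two tuples, whereas starting with $t$ leaves a single tuple, so the outcome depends on the deletion order.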

In generalizing the projection and equal join operations of traditional databases to fuzzy databases, it is hard to obtain a lossless join decomposition when redundancy removal is order sensitive. Consider the case that $Y \sim> Z$ holds in the instance $r(R)$ of relation $R(X, Y, Z)$ based on Definition 1, namely, $\eta(t[Y], t'[Y]) \le \eta(t[Z], t'[Z])$ for every $t, t' \in r(R)$. Assuming that $r(R)$ consists of three tuples $\langle x_1, y, z \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$ and that $X$ is a key attribute, it is possible that the two values in each of the pairs $(y, y')$, $(y, y'')$, $(z, z')$, and $(z, z'')$ are redundant to each other but those in $(y', y'')$ and $(z', z'')$ are not. Since $Y \sim> Z$, $R$ should be decomposed to avoid redundancy. After decomposing $R(X, Y, Z)$ into $R'(X, Y)$ and $R''(Y, Z)$, if tuple $\langle y, z \rangle$ is removed first because it is redundant to tuple $\langle y', z' \rangle$, the result of $R''(Y, Z)$ contains two tuples, $\langle y', z' \rangle$ and $\langle y'', z'' \rangle$. The natural join of $R'(X, Y)$ and $R''(Y, Z)$ then generates a four-tuple result, $\langle x_1, y, z' \rangle$, $\langle x_1, y, z'' \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$, which contains a spurious tuple.
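The scenario above can be replayed with a small sketch. Crisp strings stand in for the fuzzy values, and the set `REDUNDANT` simply encodes the pairs that the text assumes to be redundant to each other; both are hypothetical devices for illustration, not part of the paper's model.

```python
# Pairs assumed redundant in the text; (y', y'') and (z', z'') are deliberately absent,
# so the relation is not transitive.
REDUNDANT = {frozenset(p) for p in [("y", "y'"), ("y", "y''"), ("z", "z'"), ("z", "z''")]}

def similar(a, b):
    return a == b or frozenset((a, b)) in REDUNDANT

r = [("x1", "y", "z"), ("x2", "y'", "z'"), ("x3", "y''", "z''")]
r_xy = [(x, y) for x, y, _ in r]   # R'(X, Y); X is a key, so nothing is removed
r_yz = [(y, z) for _, y, z in r]   # R''(Y, Z) before redundancy removal

def remove_if_redundant(rows, victim):
    """Drop `victim` when some other row is redundant to it on every component."""
    others = [row for row in rows if row != victim]
    if any(all(similar(a, b) for a, b in zip(victim, o)) for o in others):
        return others
    return rows

# <y, z> is removed first because it is redundant to <y', z'>.
r_yz = remove_if_redundant(r_yz, ("y", "z"))

# Natural join on Y, matching values that are redundant to each other.
joined = [(x, y, z) for x, y in r_xy for y2, z in r_yz if similar(y, y2)]
print(len(r), len(joined))  # 3 4
```

The join returns one tuple more than the original instance, which is the loss of information that the approximate equality introduced next is meant to avoid.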

To resolve this problem, this study proposes projection and equal join operations for fuzzy databases that involve the evaluation of redundancy and tuple merging. Since the decomposition of relations is based on FFDs, it depends on the similarity of tuples. For the data in the fuzzy model, (3)–(9) can be used to measure the similarity of tuples and define FFDs in fuzzy databases. However, (5) restricts redundant tuples to duplicates, and (3), (4), and (6) lack transitivity. Therefore, this work adopts (9) and (10) to define approximate equality for tuples that might not be identical but have a high similarity degree. Approximate equality yields a unique result of redundancy removal.

Definition 2 (approximately equal tuples). Two tuples $t = (\pi_1, \pi_2, \ldots, \pi_m)$ and $t' = (\pi'_1, \pi'_2, \ldots, \pi'_m)$ are approximately equal, denoted by $t \cong t'$, if

$$\min_{j = 1, \ldots, m} \xi_5(\pi_j, \pi'_j) = 1.$$

In other words, tuples $t$ and $t'$ are approximately equal if their similarity $\eta(t, t') = 1$.

Lemma 3. The approximate equality of tuples (or attribute values) is transitive.

Proof. Based on (9), $\xi_5 = 1$ holds exactly when the two adjusted values coincide, so if $\xi_5(\pi_A, \pi_B) = 1$ and $\xi_5(\pi_B, \pi_G) = 1$, then $\xi_5(\pi_A, \pi_G) = 1$. Thus, if $t \cong t'$ and $t' \cong t''$, then $t \cong t''$ based on (10).

Approximately equal tuples are considered redundant to each other. The notion of approximate equality can be applied to query processing with predicates containing fuzzy concepts [36] for fuzzy databases in different models. For simplicity, we let $\pi_A \cong \pi_B$ denote $\xi_5(\pi_A, \pi_B) = 1$ hereafter.

Example 4. Given values $\pi_A = \{0.75/\text{pretty}, 0.65/\text{cuteness}\}$ and $\pi_B = \{0.6/\text{pretty}, 0.7/\text{charm}, 0.8/\text{cuteness}\}$ on domain $D$ and equivalent classes $C_1 = \{\text{pretty}\}$ and $C_2 = \{\text{charm}, \text{cuteness}\}$ for $D$, the average possibilities of $\pi_B$ are $u_{B1} = 0.6/1 = 0.6$ and $u_{B2} = (0.7 + 0.8)/2 = 0.75$, yielding $\hat{\pi}_B = \{0.6/\text{pretty}, 0.75/\text{charm}, 0.75/\text{cuteness}\}$. Likewise, $u_{A1} = 0.75$, $u_{A2} = 0.65$, and $\hat{\pi}_A = \{0.75/\text{pretty}, 0.65/\text{charm}, 0.65/\text{cuteness}\}$. We have $\delta(\pi_A, \pi_B) = (0.6 + 0.65 + 0.65)/(0.6 + 0.75 + 0.75) = 1.9/2.1 \approx 0.90$ and $\delta(\pi_B, \pi_A) = 1.9/2.05 \approx 0.93$. Thus, $\xi_5(\pi_A, \pi_B) = \min\{0.90, 0.93\} = 0.90$. Given $\pi_G = \{0.6/\text{pretty}, 0.65/\text{charm}, 0.85/\text{cuteness}\}$ on $D$, the adjusted value of $\pi_G$ equals that of $\pi_B$, so we have $\pi_G \cong \pi_B$ even though $\pi_G$ is not identical to $\pi_B$.
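The computation in Example 4 can be reproduced with a short sketch of (8) and (9), taking $\pi_A$, $\pi_B$, $\pi_G$ and the classes $C_1$, $C_2$ exactly as listed in the example. The function names `adjust`, `delta`, and `xi5` are hypothetical, and the sketch assumes that the average in (8) is taken over the elements of a class with positive possibility.

```python
CLASSES = [("pretty",), ("charm", "cuteness")]  # C1, C2 from Example 4

def adjust(pi):
    """Adjusted value per (8): average the positive possibilities within each class."""
    adjusted = {}
    for cls in CLASSES:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        if positive:
            avg = sum(positive) / len(positive)
            adjusted.update({u: avg for u in cls})
    return adjusted

def delta(pa, pb):
    """Inclusion degree per (9), computed on the adjusted values."""
    a, b = adjust(pa), adjust(pb)
    num = sum(min(a.get(u, 0.0), b.get(u, 0.0)) for u in set(a) | set(b))
    return num / sum(b.values())

def xi5(pa, pb):
    return min(delta(pa, pb), delta(pb, pa))

pi_a = {"pretty": 0.75, "cuteness": 0.65}
pi_b = {"pretty": 0.6, "charm": 0.7, "cuteness": 0.8}
pi_g = {"pretty": 0.6, "charm": 0.65, "cuteness": 0.85}

print(round(xi5(pi_a, pi_b), 4))         # similarity of pi_A and pi_B
print(abs(xi5(pi_g, pi_b) - 1.0) < 1e-9)  # True: pi_G and pi_B are approximately equal
```

Because $\xi_5$ compares adjusted values only, two distributions that differ inside a class but share the same class averages are treated as approximately equal.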

Proposition 5. The approximate equality can be used to classify values of the fuzzy database into disjoint sets (equivalence classes).

Proof. Based on the definition in (9), $\xi_5$ is reflexive and symmetric; that is, $\xi_5(\pi_A, \pi_A) = 1$ and $\xi_5(\pi_A, \pi_B) = \xi_5(\pi_B, \pi_A)$ for values $\pi_A$ and $\pi_B$. Besides, approximate equality is transitive according to Lemma 3. Therefore, two sets of approximately equal values are either disjoint or the same class, in which any two of the values are approximately equal to each other.

The transitivity of the similarity measure is important to any operation involving redundancy removal or tuple merging. Besides, a transitive measure can be applied to clustering methods or data groupings such as the ones in [36, 37].

Proposition 6. Given $\pi_A$ and its adjusted value $\hat{\pi}_A$ following (8), $\pi_A \cong \hat{\pi}_A$.

Proof. It is obvious by the definition in (9).

Buckles and Petry first proposed tuple merging and applied it to remove redundant tuples in a fuzzy database [5]. Tuple merging can also be used in join operations. This study extends the tuple merging of Chen et al. [16] to (11) for relation combination as well as redundancy removal. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the tuple merging of $t$ and $t'$, denoted by $t \circ t'$, is given by

$$t \circ t' = (\hat{\pi}_{A_1} \cup_F \hat{\pi}'_{A_1}, \hat{\pi}_{A_2} \cup_F \hat{\pi}'_{A_2}, \ldots, \hat{\pi}_{A_m} \cup_F \hat{\pi}'_{A_m}), \tag{11}$$

where each $\hat{\pi}_{\bullet}$ (or $\hat{\pi}'_{\bullet}$) is the adjusted value of $\pi_{\bullet}$ (or $\pi'_{\bullet}$) according to (8) and $\cup_F$ denotes fuzzy union. For single-value tuples $t = (\pi_{A_1})$ and $t' = (\pi'_{A_1})$, tuple merging is alternatively denoted by $\pi_{A_1} \circ \pi'_{A_1}$.
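A minimal sketch of the merging in (11) follows. It assumes that the fuzzy union $\cup_F$ is the standard pointwise maximum, and it uses hypothetical equivalence classes and values chosen so that the two tuples are approximately equal; `adjust`, `fuzzy_union`, and `merge` are illustrative names only.

```python
def adjust(pi, classes):
    """Adjusted value per (8): average the positive possibilities within each class."""
    adjusted = {}
    for cls in classes:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        if positive:
            avg = sum(positive) / len(positive)
            adjusted.update({u: avg for u in cls})
    return adjusted

def fuzzy_union(p, q):
    """Standard fuzzy union: pointwise maximum (an assumption about the union used in (11))."""
    return {u: max(p.get(u, 0.0), q.get(u, 0.0)) for u in set(p) | set(q)}

def merge(t, t2, classes_per_attr):
    """Tuple merging per (11): attribute-wise fuzzy union of the adjusted values."""
    return tuple(
        fuzzy_union(adjust(a, c), adjust(b, c))
        for a, b, c in zip(t, t2, classes_per_attr)
    )

classes = [[("pretty",), ("charm", "cuteness")]]  # one attribute, two classes
t = ({"pretty": 0.5, "charm": 0.25, "cuteness": 0.75},)
t2 = ({"pretty": 0.5, "cuteness": 0.5},)

merged = merge(t, t2, classes)
print(merged[0] == {"pretty": 0.5, "charm": 0.5, "cuteness": 0.5})  # True
```

Since both tuples have the same adjusted value, the merged tuple equals that adjusted value, which is the situation described by Lemma 7.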

Lemma 7. Let $\pi_A$ and $\pi'_A$ be two possibility distributions on the same domain. If $\pi_A \cong \pi'_A$, then $\pi_A \cong \pi_A \circ \pi'_A \cong \pi'_A$.

Proof. Based on (9) and (11), if $\xi_5(\pi_A, \pi'_A) = 1$, then $\xi_5(\pi_A, \pi_A \circ \pi'_A) = 1$ and $\xi_5(\pi'_A, \pi_A \circ \pi'_A) = 1$.

Based on the literature review and Lemma 7, we summarize the properties of different similarity measures with threshold $\alpha = 1$ in Table 3 to show the merit of (9), which is adopted in this work.

4. Approximate Lossless Join Decomposition

This section first offers the operations for relation decomposition and combination. Then, it proposes a notion of approximate lossless join decomposition (ALJD), which incorporates fuzzy concepts into lossless join decomposition. It also provides the method to achieve the ALJD.

Similar to the works in [37], this study generalizes the projection and natural join operations of traditional databases to fuzzy databases as below. Here, given a relation $R$, $\Theta$ denotes a set of attributes in $R$ (i.e., $\Theta \subset R$), and $t[\Theta]$ denotes the composite of values in tuple $t$ over attributes $\Theta$. For example, given $t \in r(R)$, $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$, and $\Theta = (A_2, A_3, A_4)$, then $t[\Theta] = (\pi_{A_2}, \pi_{A_3}, \pi_{A_4})$.

Table 3: Properties of different similarity measures with threshold $\alpha = 1$.

| Properties/measures | Equation (5) [16] | Equation (6) [20] | Equation (9) [13] |
|---|---|---|---|
| Transitivity | Yes | No | Yes |
| Result after merging | Lossless | Non-lossless | Approximate lossless |
| Predicates for lossless join | Identical values | Different values | Different values |

Projection. Projecting the instance of relation $R$ on attributes $\Theta \subset R$, denoted by $\Pi_{\Theta}(R)$, is given by

$$\Pi_{\Theta}(R) = \{ t[\Theta] \circ t'[\Theta] : t[\Theta] \cong t'[\Theta] \mid t, t' \in r(R) \}. \tag{12}$$

Natural Join. Natural join of instances of $R(X, Y)$ and $R'(Y, Z)$, denoted by $R \otimes R'$, is defined as follows:

$$R \otimes R' = \{ (t[X], t[Y] \circ t'[Y], t'[Z]) : t[Y] \cong t'[Y] \mid t \in r(R), t' \in r(R') \}. \tag{13}$$

In (12) and (13), tuple redundancy is determined by approximate equality (e.g., $t[Y] \cong t'[Y]$; see Definition 2), and both redundancy removal and tuple combination use the tuple merging in (11).
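The two operations can be sketched as follows. The sketch is an illustration under stated assumptions rather than the paper's implementation: the equivalence classes in `CLASSES` and the instance `r` are hypothetical, fuzzy union is taken as the pointwise maximum, and approximate equality is tested as equality of adjusted values, which is what $\xi_5 = 1$ amounts to under (9).

```python
from itertools import product

# Hypothetical alpha-proximate equivalence classes for each attribute domain.
CLASSES = {
    "X": [("a",), ("b",)],
    "Y": [("charm", "cuteness")],
    "Z": [("low",), ("high",)],
}

def adjust(pi, classes):
    """Adjusted value per (8): average the positive possibilities within each class."""
    out = {}
    for cls in classes:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        if positive:
            avg = round(sum(positive) / len(positive), 9)
            out.update({u: avg for u in cls})
    return out

def approx_eq(t, s, attrs):
    """Approximate equality: all attribute values share the same adjusted value."""
    return all(adjust(a, CLASSES[n]) == adjust(b, CLASSES[n])
               for a, b, n in zip(t, s, attrs))

def merge(t, s, attrs):
    """Tuple merging per (11): fuzzy union (max) of the adjusted values."""
    merged = []
    for a, b, n in zip(t, s, attrs):
        pa, pb = adjust(a, CLASSES[n]), adjust(b, CLASSES[n])
        merged.append({u: max(pa.get(u, 0.0), pb.get(u, 0.0)) for u in set(pa) | set(pb)})
    return tuple(merged)

def dedup(rows):
    """Keep one copy of each identical tuple."""
    seen, out = set(), []
    for row in rows:
        key = tuple(tuple(sorted(v.items())) for v in row)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def project(rows, attrs, keep):
    """Projection per (12): merge every pair of approximately equal sub-tuples."""
    idx = [attrs.index(a) for a in keep]
    sub = [tuple(row[i] for i in idx) for row in rows]
    return dedup(merge(t, s, keep) for t, s in product(sub, sub) if approx_eq(t, s, keep))

def natural_join(r_xy, r_yz):
    """Natural join per (13) for instances of R(X, Y) and R'(Y, Z)."""
    out = []
    for (x, y), (y2, z) in product(r_xy, r_yz):
        if approx_eq((y,), (y2,), ("Y",)):
            out.append((x, merge((y,), (y2,), ("Y",))[0], z))
    return dedup(out)

ATTRS = ("X", "Y", "Z")
r = [
    ({"a": 1.0}, {"charm": 0.25, "cuteness": 0.75}, {"low": 1.0}),
    ({"b": 1.0}, {"cuteness": 0.5}, {"low": 1.0}),
]
r_xy = project(r, ATTRS, ("X", "Y"))
r_yz = project(r, ATTRS, ("Y", "Z"))
joined = natural_join(r_xy, r_yz)
print(len(r_xy), len(r_yz), len(joined))  # 2 1 2
```

In this instance the two $Y$ values are approximately equal, so the projection on $(Y, Z)$ collapses to a single tuple, and the join still returns one tuple for each original tuple.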

Proposition 8. The projection result of a relation based on (12) must be unique.

Proof. It can be directly derived from Proposition 5.

Based on operations (12) and (13), the ALJD is formally defined by extending approximate equality from the tuple level to the relation level, as in Definition 9.

Definition 9 (approximately equal relation instances). Two relation instances $r(R)$ and $r'(R)$ in the fuzzy database are approximately equal, denoted by $r(R) \approx r'(R)$, if for every tuple $t \in r(R)$ there exists a tuple $t' \in r'(R)$ such that $t \cong t'$, and vice versa.

Definition 10 (approximate lossless join). A decomposition $\{R_1, R_2, \ldots, R_k\}$ of a relation $R$ in the fuzzy database is an approximate lossless join if $r(R) \approx (\Pi_{R_1}(R) \otimes \Pi_{R_2}(R) \otimes \cdots \otimes \Pi_{R_k}(R))$.

The approximate lossless join decomposition means that the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of the tuples in the combination result, and vice versa.

Proposition 11. Consider the following:

$$\Pi_{\Theta}(R) \approx \{ t[\Theta] : t \in r(R) \}. \tag{14}$$

Proof. It can be derived from (11) and (12).

Corollary 12. Consider the following:

$$\Pi_{R}(R) \approx r(R). \tag{15}$$

Proof. It can be derived directly from Proposition 11.

The projection of a relation over the same schema, as shown in Corollary 12, represents no operation other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal of a relation instance is approximately equal to the original instance. This property is essential for obtaining a combination result that is approximately equal to the original instance after relation decomposition.

This study proposes an FFD for the decomposition in the fuzzy database, as shown below.

Definition 13. The FFD $X \sim> Y$ holds in the relation instance $r(R)$ if $r(R)$ satisfies that, for every $t, t' \in r(R)$, if $t[X] \cong t'[X]$, then $t[Y] \cong t'[Y]$.

Remark 14. An FD in a traditional database is a special case of the FFD. If an FD $X \rightarrow Y$ holds in $r(R)$, then $X \sim> Y$ holds as well. This is because $t[\Theta] \cong t'[\Theta]$ must be true for any $\Theta \subset R$ if $t[\Theta] = t'[\Theta]$.

Lemma 15. Given relations $R$ and $R'$ with $r(R') \approx r(R)$, if $r(R)$ satisfies a set $F$ of FFDs, then $r(R')$ satisfies $F$.

Proof. By contradiction, assume that $r(R') \approx r(R)$ and there exists an FFD $X \sim> Y$ such that $r(R)$ satisfies $X \sim> Y$ and $r(R')$ does not. Because $X \sim> Y$ holds in $r(R)$,

$$\text{for every } t_1, t_2 \in r(R), \text{ if } t_1[X] \cong t_2[X], \text{ then } t_1[Y] \cong t_2[Y]. \tag{16}$$

Since $X \sim> Y$ does not hold in $r(R')$, there exist $t'_1, t'_2 \in r(R')$ such that $t'_1[X] \cong t'_2[X]$ and $\eta(t'_1[Y], t'_2[Y]) \ne 1$. Since $r(R') \approx r(R)$, there exist $t, t'' \in r(R)$ such that $t \cong t'_1$ and $t'' \cong t'_2$. Then, by the transitivity in Lemma 3, we have $t, t'' \in r(R)$ and $t[X] \cong t''[X]$ but $\eta(t[Y], t''[Y]) \ne 1$, which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including the reflexive rule, the augmentation rule, and the transitive rule (see endnote 3). This property enables the result of lossless join decomposition to have the dependency preservation property (see endnote 4) [1].


Algorithm 1: ALJD Algorithm.

Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that exist in $r(R)$.
Step 1. Let $\mathcal{R} = \{R\}$ and let $F'$ be the set of all $f \in F$ that are not trivial.
Step 2. For an $f'$: $X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.
Step 3. If $X$ is not a key attribute in $R'$, do the following:
    (1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X,Y)}(R')$ and $R_2 = \Pi_{R'(\bar{Y})}(R')$;
    (2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);
    (3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.
Step 4. Go to Step 2 if $F'$ is not empty.
Output: $\mathcal{R}$, the set of relations decomposed from $R$.
Note: $R'(\bar{Y})$ represents the list of all attributes in $R'$ other than $Y$, so that $R_2$ retains $X$ as in Lemma 16.
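The control flow of Algorithm 1 can be sketched at the schema level, working on attribute names only. This is a sketch under several assumptions that are not taken from the paper: the key test depends on the data and is therefore passed in as a caller-supplied function `is_key`; each nontrivial FFD is visited exactly once so that the loop always terminates; and, following Lemma 16, $R_2$ keeps every attribute of $R'$ except those in $Y$.

```python
def aljd(schema, ffds, is_key):
    """Schema-level sketch of the ALJD algorithm.

    schema: frozenset of attribute names.
    ffds: list of (X, Y) pairs of frozensets, read as X ~> Y.
    is_key(X, relation): caller-supplied test (hypothetical helper).
    """
    # Step 1: discard trivial FFDs (Definition 17): X ~> Y with Y a proper subset
    # of Y' for some X ~> Y' in the same set.
    pending = [(x, y) for x, y in ffds
               if not any(x == x2 and y < y2 for x2, y2 in ffds)]
    relations = [frozenset(schema)]
    for x, y in pending:  # Steps 2 and 4: take the FFDs one at a time
        target = next((rel for rel in relations if x | y <= rel), None)
        if target is None or is_key(x, target) or x | y == target:
            continue  # Step 3 applies only when X is not a key of the chosen relation
        r1 = x | y          # R1 = projection on (X, Y)
        r2 = target - y     # R2 = all attributes other than Y, as in Lemma 16
        relations.remove(target)
        relations += [r1, r2]
    return relations

never_key = lambda x, rel: False  # assume X is never a key in this toy run

print(aljd(frozenset("AXY"), [(frozenset("X"), frozenset("Y"))], never_key))
```

For $R(A, X, Y)$ with $X \sim> Y$, the sketch returns the two schemas $(X, Y)$ and $(A, X)$, which is the decomposition treated in Lemma 16.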

Lemma 16. Let $R(A, X, Y)$ be a relation and $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$ and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

For the first part, by contradiction, let us assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$ because $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A], t_1[X] \circ t_2[X], t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A], t_1[X] \circ t_2[X], t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus, $t' \cong t$, which contradicts the assumption.

For the second part, by contradiction and with renewed symbols, assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$ based on (13). Also, we let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1), \quad \hat{t}_2 \in r(R_2), \quad \hat{t}_1 \cong t'[A, X], \quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A,X)}(R)$, based on Proposition 11, there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17), (18), and (19), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus, $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (19), and $t' \cong t$, which contradicts the assumption.

In Lemma 16, each one of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.

Definition 17. Let $F$ be the set of FFDs. An FFD $f$: $X \sim> Y$ in $F$ is trivial if there exists an FFD $f'$: $X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, it takes more join operations to obtain the original data for query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

In the ALJD algorithm, an FFD whose left-hand side is a key attribute is excluded from the decomposition process at Step 3. This follows the concept of the normalization of traditional databases, where only the FDs of nonkey attributes are considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two distinct tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. Excluding FFDs that contain key attributes prevents unnecessary decomposition of relations that have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations in query processing.

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$ and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived from Proposition 11 and Lemma 15. Statement (ii) can be derived from Lemma 16 and the properties of FFDs (namely, Armstrong's inference rules IR1, IR2, and IR3 described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to the issues related to the application of FFDs.

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for tuples and relations in fuzzy databases and provides the measure of the approximate equality. The measure is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in fuzzy databases. Second, it proposes approximate lossless join decomposition for fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to data compression in fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for fuzzy databases are, respectively, the general cases of the traditional FD and lossless join decomposition. This generality is important for dealing with databases containing both crisp data and fuzzy data. Fourth, similar to the existing approaches of database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed approach of decomposition satisfies a degree of the lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for extended possibility-based fuzzy databases is more difficult because the data are more complex.

There are some directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in fuzzy databases. Research can apply the notion to data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in fuzzy databases. Studies that incorporate the fuzzy concept into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality as the similarity measure. Since the fuzzy concept has been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of National Science Council NSC102-2410-H-155-036-MY2 and Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distribution and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \rightarrow Y$ if $Y \subseteq X$. IR2 (augmentation rule): if $X \rightarrow Y$, then $XZ \rightarrow YZ$. IR3 (transitive rule): if $X \rightarrow Y$ and $Y \rightarrow Z$, then $X \rightarrow Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relations that are decomposed from $R$ or can be represented via Armstrong's inference rules of the FFDs in these relations.

References

[1] S. B. Navathe and R. Elmasri, Fundamentals of Database Systems, Addison-Wesley, New York, NY, USA, 2010.

[2] P. A. Bernstein, J. R. Swenson, and D. C. Tsichritzis, "A unified approach to functional dependencies and relations," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 237–245, 1975.

[3] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM, vol. 13, no. 6, pp. 377–387, 1970.

[4] L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility," Fuzzy Sets and Systems, vol. 1, no. 1, pp. 3–28, 1978.

[5] B. P. Buckles and F. E. Petry, "A fuzzy representation of data for relational databases," Fuzzy Sets and Systems, vol. 7, no. 3, pp. 213–226, 1982.

[6] S. Shenoi and A. Melton, "Proximity relations in the fuzzy relational database model," Fuzzy Sets and Systems, vol. 31, no. 3, pp. 285–296, 1989.

[7] G. Chen, J. Vandenbulcke, and E. E. Kerre, "A general treatment of data redundancy in a fuzzy relational data model," Journal of the American Society for Information Science, vol. 43, no. 4, pp. 304–311, 1992.

[8] H. Prade and C. Testemale, "Generalizing database relational algebra for the treatment of incomplete or uncertain information and vague queries," Information Sciences, vol. 34, no. 2, pp. 115–143, 1984.

[9] J. C. Cubero and M. A. Vila, "New definition of fuzzy functional dependency in fuzzy relational databases," International Journal of Intelligent Systems, vol. 9, no. 5, pp. 441–448, 1994.

[10] K. Raju and A. K. Majumdar, "Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems," ACM Transactions on Database Systems, vol. 13, no. 2, pp. 129–166, 1988.

[11] G. Chen, E. E. Kerre, and J. Vandenbulcke, "Normalization based on fuzzy functional dependency in a fuzzy relational data model," Information Systems, vol. 21, no. 3, pp. 299–310, 1996.

[12] S. Jyothi and M. S. Babu, "Multivalued dependencies in fuzzy relational databases and lossless join decomposition," Fuzzy Sets and Systems, vol. 88, no. 3, pp. 315–332, 1997.

[13] J. Y. Liu, P. Chang, and C. P. C. Yeh, "Consistent data operations for multi-databases in extended possibility-based data models," Expert Systems with Applications, vol. 36, no. 3, pp. 6174–6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, and F. Mili, "Data dependencies in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 3, pp. 321–332, 2002.

[15] J. Y. Liu, "Data integration constraints for consistent data redundancy in fuzzy databases," International Journal of Intelligent Systems, vol. 23, no. 6, pp. 635–653, 2008.

[16] G. Chen, J. Vandenbulcke, and E. E. Kerre, "On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model," in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440–446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, and X. Tang, "Extracting representative information to enhance flexible data queries," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 928–941, 2012.

[18] S. Shenoi, A. Melton, and L. T. Fan, "An equivalence classes model of fuzzy relational databases," Fuzzy Sets and Systems, vol. 38, no. 2, pp. 153–170, 1990.

[19] M. Koyuncu and A. Yazici, "IFOOD: an intelligent fuzzy object-oriented database architecture," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1137–1154, 2003.

[20] Z. Ma, W. Zhang, and W. Ma, "Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 15, no. 8, pp. 705–716, 2000.

[21] V. V. Cross, "Fuzzy extensions for relationships in a generalized object model," International Journal of Intelligent Systems, vol. 16, no. 7, pp. 843–861, 2001.

[22] V. V. Cross, "Defining fuzzy relationships in object models: abstraction and interpretation," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 5–27, 2003.

[23] S. Wang and G. H. Huang, "A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning," Journal of Environmental Management, vol. 117, pp. 208–218, 2013.

[24] S. Wang and G. H. Huang, "An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: an application to water resources management," Stochastic Environmental Research and Risk Assessment, vol. 27, no. 6, pp. 1493–1506, 2013.

[25] S. Wang and G. H. Huang, "Interactive fuzzy boundary interval programming for air quality management under uncertainty," Water, Air, and Soil Pollution, vol. 224, article 1574, 2013.

[26] Z. M. Ma and F. Mili, "Handling fuzzy information in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 10, pp. 925–942, 2002.

[27] Z. M. Ma and L. Yan, "Updating extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 22, no. 3, pp. 237–258, 2007.

[28] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies: an overview and a critical discussion," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325–330, 1994.

[29] P. Bosc and O. Pivert, "On the impact of regular functional dependencies when moving to a possibilistic database framework," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 207–227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, and W. Bandler, "On nearness measures in fuzzy relational data models," International Journal of Approximate Reasoning, vol. 3, no. 3, pp. 267–298, 1989.

[31] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies and redundancy elimination," Journal of the American Society for Information Science, vol. 49, no. 3, pp. 217–235, 1998.

[32] Z. M. Ma, W. J. Zhang, and F. Mili, "Fuzzy data compression based on data dependencies," International Journal of Intelligent Systems, vol. 17, no. 4, pp. 409–426, 2002.

[33] B. Bhuniya and P. Niyogi, "Lossless join property in fuzzy relational databases," Data and Knowledge Engineering, vol. 11, no. 2, pp. 109–124, 1993.

[34] O. Bahar and A. Yazici, "Normalization and lossless join decomposition of similarity-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 19, no. 10, pp. 885–917, 2004.

[35] T. K. Bhattacharjee and A. K. Mazumdar, "Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model," Fuzzy Sets and Systems, vol. 96, no. 3, pp. 343–352, 1998.

[36] Z. M. Ma and L. Yan, "Generalization of strategies for fuzzy query translation in classical relational databases," Information and Software Technology, vol. 49, no. 2, pp. 172–180, 2007.

[37] J. Zhang, Q. Wei, and G. Chen, "An efficient incremental method for generating equivalence groups of search results in information retrieval and queries," Knowledge-Based Systems, vol. 32, pp. 91–100, 2012.

[38] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144 of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.



Ma et al. defined the similarity measure from the perspective of the semantic closeness between two attribute values [20], as shown below:

$$\xi_4(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \tag{6}$$

where $\delta$ denotes a semantic inclusion degree. Consider the following:

$$\delta(\pi_A, \pi_B) = \frac{\sum_{u_i, u_j \in D,\; s(u_i, u_j) \ge \alpha} \bigl(\pi_A(u_i) \wedge \pi_B(u_j)\bigr)}{\sum_{j=1}^{n} \pi_B(u_j)}. \tag{7}$$

The notion of $\xi_4$ may violate the convention that the similarity degree of two values lies within $[0, 1]$. For instance, $\xi_4(\pi_A, \pi_B) = 2.6/1.7 \approx 1.53$ for the case that $\pi_A = \{0.9/\text{excellence}, 0.8/\text{good}\}$ and $\pi_B = \{0.7/\text{excellence}, 0.6/\text{good}\}$ when the similarity of "excellence" and "good" is larger than the given threshold, that is, $s(\text{excellence}, \text{good}) \ge \alpha$. It is difficult to set a proper threshold for estimates that fall outside $[0, 1]$ and have an unpredictable upper bound.
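The out-of-range behaviour of (6) and (7) can be checked numerically. The following is a minimal Python sketch, not code from the cited work: the function names `inclusion`, `xi4`, and `always_similar` are hypothetical, $\wedge$ is taken to be the minimum, and every pair of domain elements is assumed to clear the threshold $\alpha$, as in the case described above.

```python
def inclusion(pa, pb, similar):
    """Semantic inclusion degree delta(pa, pb) following (7)."""
    numerator = sum(
        min(pa[ui], pb[uj])
        for ui in pa
        for uj in pb
        if similar(ui, uj)
    )
    return numerator / sum(pb.values())


def xi4(pa, pb, similar):
    """Similarity (6): the smaller of the two inclusion degrees."""
    return min(inclusion(pa, pb, similar), inclusion(pb, pa, similar))


def always_similar(ui, uj):
    # Assumption: s(ui, uj) >= alpha for every pair in this tiny domain.
    return True


pi_a = {"excellence": 0.9, "good": 0.8}
pi_b = {"excellence": 0.7, "good": 0.6}
value = xi4(pi_a, pi_b, always_similar)
print(round(value, 3))
```

Because each element of one distribution is paired with every similar element of the other, the numerator of (7) can exceed the denominator, which is why the result leaves $[0, 1]$.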

Liu et al. [13] extended the semantic equivalence to ensure that the result of the similarity measure lies within $[0, 1]$. The measurement adjusts the possibility distributions of values based on the $\alpha$-proximate equivalent classes of the domain before measuring their similarity. Let $C = \{C_1, C_2, \ldots, C_r\}$ be the $\alpha$-proximate equivalent classes of domain $D$. The adjusted value $\tilde{\pi}_\bullet$ of a possibility distribution $\pi_\bullet$ is defined as follows:

$$\tilde{\pi}_\bullet = \bigcup_{k=1}^{r} \{\, u_{\bullet k}/u' \mid u' \in C_k \,\}, \tag{8}$$

where $u_{\bullet k} = \sum_{u_j \in \bar{C}_k} \pi_\bullet(u_j) / |\bar{C}_k|$ and $\bar{C}_k = \{u_j \in C_k \mid \pi_\bullet(u_j) > 0\}$, $k = 1, \ldots, r$. Then

$$\xi_5(\pi_A, \pi_B) = \delta(\pi_A, \pi_B) \wedge \delta(\pi_B, \pi_A), \quad \text{where } \delta(\pi_A, \pi_B) = \frac{\sum_{i=1}^{n} \bigl(\tilde{\pi}_A(u_i) \wedge \tilde{\pi}_B(u_i)\bigr)}{\sum_{j=1}^{n} \tilde{\pi}_B(u_j)}. \tag{9}$$

Although the methods mentioned above differ from each other in measuring the similarity of attribute values, most of them measure the similarity of tuples in the same way: they adopt the minimum of the similarities of each pair of attribute values. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the resemblance of tuples $t$ and $t'$, denoted by $\eta(t, t')$, is given by

$$\eta(t, t') = \min_{i=1,\ldots,m} \xi_\bullet(\pi_{A_i}, \pi'_{A_i}), \tag{10}$$

where $\xi_\bullet$ could be any of $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$, or $\xi_5$ in (3)–(9). Tuples $t$ and $t'$ are redundant to each other if $\eta(t, t') \ge \alpha$, where $\alpha$ is a given threshold. The similarity measure of tuples has been applied to extract representative tuples for reducing information redundancy [17].

Fuzzy functional dependency (FFD) is a concept derived from the traditional FD. Both FFD and FD have several applications in databases, for example, redundancy elimination [30], missing data prediction, fuzzy data compression [17, 31], and lossless join decomposition [10, 28, 32]. In the literature, various FFDs are defined for different fuzzy data models. For some fuzzy data representations, such as the similarity-based fuzzy data model [33], FFDs are defined based on the equivalence classes of tuples. In the extended possibility-based databases, the definitions of FFD also vary [10, 14, 34, 35]. One example among the FFD definitions in the literature is listed below.

Definition 1 (see [10], fuzzy functional dependency). Let $X \sim> Y$ denote that attribute $Y$ is fuzzy functionally dependent on attribute $X$ in a relation $R$. The FFD $X \sim> Y$ holds in the instance $r(R)$ if and only if $\eta(t[X], t'[X]) \le \eta(t[Y], t'[Y])$ for every $t, t' \in r(R)$.

This definition is used in Section 3 to illustrate the problem of applying FFDs to relation decomposition in fuzzy databases.

3. Redundancy Removal and Tuple Merging

Several factors determine whether a relation decomposition possesses the lossless join property: the way a relation is decomposed, the way redundant tuples are removed, and the way the decomposed results are combined. Redundancy removal eliminates redundant tuples. If the similarity measure used to determine tuple redundancy is not transitive, the result of redundancy removal may not be unique. An example of nontransitivity is that tuples $t$ and $t'$ are redundant to each other, and $t'$ and $t''$ are redundant as well, but $t$ and $t''$ are not redundant. In this case, the result of redundancy removal is $\{t, t''\}$ if $t'$ is deleted first, which differs from the one-tuple result (such as $t$ or $t''$) obtained when a tuple other than $t'$ is deleted first. The nontransitivity makes the result of redundancy removal order sensitive and hinders the lossless join decomposition.

Nevertheless, most well-defined similarity measures [7, 10, 20, 29] for the values of possibility distributions are reflexive and symmetric but not transitive. For example, consider adopting (4) to measure the similarity of tuples. Given three values $\pi_A = \{0.6/u_1, 1.0/u_2, 0.6/u_3\}$, $\pi'_A = \{1.0/u_1, 1.0/u_2, 0.6/u_3\}$, and $\pi''_A = \{1.0/u_1, 0.6/u_2, 0.6/u_3\}$, we have $\xi_2(\pi_A, \pi'_A) = 1$, $\xi_2(\pi'_A, \pi''_A) = 1$, and $\xi_2(\pi_A, \pi''_A) = 0.6$. Considering tuples $t = (\pi_A)$, $t' = (\pi'_A)$, and $t'' = (\pi''_A)$, we have $\eta(t, t') \ge \alpha$ and $\eta(t', t'') \ge \alpha$ but $\eta(t, t'') < \alpha$ for any $\alpha > 0.6$, according to (10). Thus, the similarity measure of tuples is not transitive.
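The order sensitivity caused by a nontransitive measure can be reproduced with a few lines of Python. This is an illustrative sketch only: `xi2` is an assumed stand-in for (4) that takes the largest pointwise minimum of two distributions on a shared domain (it reproduces the three values 1, 1, and 0.6 quoted above, but the exact form of (4) is not restated here), and `remove_redundant` is a hypothetical helper that deletes tuples in a caller-chosen order.

```python
def xi2(pa, pb):
    """Assumed stand-in for (4): sup-min overlap on a shared domain."""
    return max(min(pa[u], pb[u]) for u in pa)


def remove_redundant(values, order, alpha):
    """Visit names in `order`; drop a value if a kept value is redundant to it."""
    kept = dict(values)
    for name in order:
        others = [o for o in kept if o != name]
        if any(xi2(kept[name], kept[o]) >= alpha for o in others):
            del kept[name]
    return sorted(kept)


values = {
    "t": {"u1": 0.6, "u2": 1.0, "u3": 0.6},
    "t'": {"u1": 1.0, "u2": 1.0, "u3": 0.6},
    "t''": {"u1": 1.0, "u2": 0.6, "u3": 0.6},
}
alpha = 0.8  # any threshold above 0.6 shows the same effect

print(remove_redundant(values, ["t'", "t", "t''"], alpha))
print(remove_redundant(values, ["t", "t'", "t''"], alpha))
```

Deleting $t'$ first leaves two tuples, whereas starting with $t$ leaves a single tuple, so the outcome depends on the deletion order.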

When the projection and equal join operations of traditional databases are generalized to fuzzy databases, it is hard to obtain a lossless join decomposition if the redundancy removal is order sensitive. Consider the case that $Y \sim> Z$ holds in the instance $r(R)$ of relation $R(X, Y, Z)$ based on Definition 1, namely, $\eta(t[Y], t'[Y]) \le \eta(t[Z], t'[Z])$ for every $t, t' \in r(R)$. Assume that $r(R)$ consists of three tuples $\langle x_1, y, z\rangle$, $\langle x_2, y', z'\rangle$, and $\langle x_3, y'', z''\rangle$ and that $X$ is a key attribute. It is possible that the two values in each of the pairs $(y, y')$, $(y, y'')$, $(z, z')$, and $(z, z'')$ are redundant to each other but those in $(y', y'')$ and $(z', z'')$ are not. Since $Y \sim> Z$, $R$ should be decomposed to avoid redundancy. After decomposing $R(X, Y, Z)$ into $R'(X, Y)$ and $R''(Y, Z)$, if tuple $\langle y, z\rangle$ is removed first because it is redundant to tuple $\langle y', z'\rangle$, the result of $R''(Y, Z)$ contains two tuples, $\langle y', z'\rangle$ and $\langle y'', z''\rangle$. The natural join of $R'(X, Y)$ and $R''(Y, Z)$ generates a four-tuple result, $\langle x_1, y, z'\rangle$, $\langle x_1, y, z''\rangle$, $\langle x_2, y', z'\rangle$, and $\langle x_3, y'', z''\rangle$, which contains spurious tuples.
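The four-tuple join result can be traced with symbolic values. The sketch below is a hypothetical illustration in Python: the attribute values are plain strings, the nontransitive redundancy relation is hard-coded in `REDUNDANT`, and the join matches $Y$ values whenever they are redundant to each other.

```python
# Redundant pairs of Y values; (y', y'') is deliberately absent.
REDUNDANT = {frozenset(p) for p in [("y", "y'"), ("y", "y''")]}


def redundant(a, b):
    """True if two Y values are identical or listed as redundant."""
    return a == b or frozenset((a, b)) in REDUNDANT


r1 = [("x1", "y"), ("x2", "y'"), ("x3", "y''")]          # R'(X, Y)
r2_full = [("y", "z"), ("y'", "z'"), ("y''", "z''")]     # R''(Y, Z) before removal
r2 = r2_full[1:]                                          # <y, z> removed first

joined = [(x, y, z) for x, y in r1 for y2, z in r2 if redundant(y, y2)]
for row in joined:
    print(row)
```

The key value `x1` appears twice in the join result although it occurs once in the original instance, which is the spurious-tuple effect described above.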

To resolve this problem, this study proposes projection and equal join operations for fuzzy databases that involve the evaluation of redundancy and tuple merging. Since the decomposition of relations is based on FFDs, it depends on the similarity of tuples. For the data in the fuzzy model, (3)–(9) can be used to measure the similarity of tuples and to define FFDs in fuzzy databases. However, (5) restricts redundant tuples to duplicates, and (3), (4), and (6) lack transitivity. Therefore, this work adopts (9) and (10) to define approximate equality for tuples that might not be identical but have a high degree of similarity. The approximate equality yields a unique result of redundancy removal.

Definition 2 (approximately equal tuples). Two tuples $t = (\pi_1, \pi_2, \ldots, \pi_m)$ and $t' = (\pi'_1, \pi'_2, \ldots, \pi'_m)$ are approximately equal, denoted by $t \cong t'$, if

$$\min_{j=1,\ldots,m} \xi_5(\pi_j, \pi'_j) = 1.$$

In other words, tuples $t$ and $t'$ are approximately equal if their similarity $\eta(t, t') = 1$.

Lemma 3. The approximate equality of tuples (or attribute values) is transitive.

Proof. By (9), $\xi_5(\pi_A, \pi_B) = 1$ requires both inclusion degrees to equal 1, which happens exactly when the adjusted values of $\pi_A$ and $\pi_B$ obtained from (8) coincide. Hence, if $\xi_5(\pi_A, \pi_B) = 1$ and $\xi_5(\pi_B, \pi_G) = 1$, then $\xi_5(\pi_A, \pi_G) = 1$. Thus, if $t \cong t'$ and $t' \cong t''$, then $t \cong t''$ based on (10).

Approximately equal tuples are considered to be redundant to each other. The notion of approximate equality can be applied to query processing with predicates containing fuzzy concepts [36] for fuzzy databases in different models. For simplicity, we let $\pi_A \cong \pi_B$ denote $\xi_5(\pi_A, \pi_B) = 1$ hereafter.

Example 4. Given values $\pi_A = \{0.75/\text{pretty}, 0.65/\text{cuteness}\}$ and $\pi_B = \{0.6/\text{pretty}, 0.7/\text{charm}, 0.8/\text{cuteness}\}$ on domain $D$ and equivalent classes $C_1 = \{\text{pretty}\}$ and $C_2 = \{\text{charm}, \text{cuteness}\}$ for $D$, the average possibilities of $\pi_B$ are $u_{B1} = 0.6/1 = 0.6$ and $u_{B2} = (0.7 + 0.8)/2 = 0.75$, yielding $\tilde{\pi}_B = \{0.6/\text{pretty}, 0.75/\text{charm}, 0.75/\text{cuteness}\}$. Likewise, $u_{A1} = 0.75$, $u_{A2} = 0.65$, and $\tilde{\pi}_A = \{0.75/\text{pretty}, 0.65/\text{charm}, 0.65/\text{cuteness}\}$. We have $\delta(\pi_A, \pi_B) = (0.6 + 0.65 + 0.65)/(0.6 + 0.75 + 0.75) = 1.9/2.1 \approx 0.90$ and $\delta(\pi_B, \pi_A) = 1.9/2.05 \approx 0.93$. Thus, $\xi_5(\pi_A, \pi_B) = \min\{0.90, 0.93\} = 0.90$. Given $\pi_G = \{0.6/\text{pretty}, 0.65/\text{charm}, 0.85/\text{cuteness}\}$ on $D$, the adjusted value $\tilde{\pi}_G = \{0.6/\text{pretty}, 0.75/\text{charm}, 0.75/\text{cuteness}\}$ coincides with $\tilde{\pi}_B$, so we have $\pi_G \cong \pi_B$ even though $\pi_G$ is not identical to $\pi_B$.
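The adjustment in (8) and the measure in (9) can be sketched in Python for the distributions listed in Example 4. This is a minimal illustration under stated assumptions rather than the implementation of [13]: the names `adjust` and `xi5` are hypothetical, $\wedge$ is taken to be the minimum, a class without any positive possibility is assigned level 0, and distributions are assumed to be nonempty.

```python
def adjust(pi, classes):
    """Adjusted distribution of (8): class-wise average over positive members."""
    adjusted = {}
    for cls in classes:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        level = sum(positive) / len(positive) if positive else 0.0
        for u in cls:
            adjusted[u] = level
    return adjusted


def xi5(pa, pb, classes):
    """Similarity (9) computed on the adjusted distributions."""
    a, b = adjust(pa, classes), adjust(pb, classes)
    overlap = sum(min(a[u], b[u]) for u in a)
    return min(overlap / sum(b.values()), overlap / sum(a.values()))


classes = [("pretty",), ("charm", "cuteness")]
pi_a = {"pretty": 0.75, "cuteness": 0.65}
pi_b = {"pretty": 0.6, "charm": 0.7, "cuteness": 0.8}
pi_g = {"pretty": 0.6, "charm": 0.65, "cuteness": 0.85}

print(adjust(pi_b, classes))
print(round(xi5(pi_a, pi_b, classes), 2))
print(round(xi5(pi_g, pi_b, classes), 2))
```

Since the adjusted forms of $\pi_G$ and $\pi_B$ coincide, the sketch reports a similarity of 1 for that pair, which is the situation that approximate equality is meant to capture.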

Proposition 5. The approximate equality can be used to classify values of the fuzzy database into disjoint sets (equivalence classes).

Proof. Based on the definition in (9), it is obvious that $\xi_5$ is reflexive and symmetric, that is, $\xi_5(\pi_A, \pi_A) = 1$ and $\xi_5(\pi_A, \pi_B) = \xi_5(\pi_B, \pi_A)$ for values $\pi_A$ and $\pi_B$. Besides, approximate equality is transitive according to Lemma 3. Therefore, two sets of approximately equal values are either disjoint or the same class, in which any two of the values are approximately equal to each other.

The transitivity of a similarity measure is important to any operation involving redundancy removal or tuple merging. Besides, a transitive measure can be applied to clustering methods or data groupings such as the ones in [36, 37].

Proposition 6. Given $\pi_A$ and its adjusted value $\tilde{\pi}_A$ following (8), $\pi_A \cong \tilde{\pi}_A$.

Proof. It is obvious by the definition in (9), since adjusting $\tilde{\pi}_A$ by (8) leaves it unchanged.

Buckles and Petry first proposed tuple merging and applied it to remove redundant tuples in a fuzzy database [5]. Tuple merging can also be used in join operations. This study extends the tuple merging of Chen et al. [16] to (11) for relation combination as well as redundancy removal. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the tuple merging of $t$ and $t'$, denoted by $t \circ t'$, is given by

$$t \circ t' = (\tilde{\pi}_{A_1} \cup_F \tilde{\pi}'_{A_1},\; \tilde{\pi}_{A_2} \cup_F \tilde{\pi}'_{A_2},\; \ldots,\; \tilde{\pi}_{A_m} \cup_F \tilde{\pi}'_{A_m}), \tag{11}$$

where each $\tilde{\pi}_\bullet$ (or $\tilde{\pi}'_\bullet$) is the adjusted value of $\pi_\bullet$ (or $\pi'_\bullet$) according to (8) and $\cup_F$ denotes fuzzy union. For single-value tuples $t = (\pi_{A_1})$ and $t' = (\pi'_{A_1})$, tuple merging is alternatively denoted by $\pi_{A_1} \circ \pi'_{A_1}$.

Lemma 7. Let $\pi_A$ and $\pi'_A$ be two possibility distributions on the same domain. If $\pi_A \cong \pi'_A$, then $\pi_A \cong \pi_A \circ \pi'_A \cong \pi'_A$.

Proof. Based on (9) and (11), it is obvious that if $\xi_5(\pi_A, \pi'_A) = 1$, then $\xi_5(\pi_A, \pi_A \circ \pi'_A) = 1$ and $\xi_5(\pi'_A, \pi_A \circ \pi'_A) = 1$.

Based on the literature review and Lemma 7, we summarize the properties of different similarity measures with threshold $\alpha = 1$ in Table 3 to show the merit of (9) adopted in this work.

4. Approximate Lossless Join Decomposition

This section first offers the operations for relation decomposition and combination. Then it proposes a notion of approximate lossless join decomposition (ALJD), which incorporates fuzzy concepts into lossless join decomposition. It also provides a method to achieve the ALJD.

Similar to the works in [37], this study generalizes the projection and natural join operations of traditional databases to fuzzy databases as below. Here, given a relation $R$, $\Theta$ denotes a set of attributes in $R$ (i.e., $\Theta \subset R$) and $t[\Theta]$ denotes the composite of values in tuple $t$ over the attributes $\Theta$. For example, given $t \in r(R)$, $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$, and $\Theta = (A_2, A_3, A_4)$, then $t[\Theta] = (\pi_{A_2}, \pi_{A_3}, \pi_{A_4})$.

Table 3: Properties of different similarity measures with threshold $\alpha = 1$.

| Properties / measures | Equation (5) [16] | Equation (6) [20] | Equation (9) [13] |
| --- | --- | --- | --- |
| Transitivity | Yes | No | Yes |
| Result after merging | Lossless | Non-lossless | Approximate lossless |
| Predicates for lossless join | Identical values | Different values | Different values |

Projection. Projecting the instance of relation $R$ on attributes $\Theta \subset R$, denoted by $\Pi_\Theta(R)$, is given by

$$\Pi_\Theta(R) = \{\, t[\Theta] \circ t'[\Theta] : t[\Theta] \cong t'[\Theta] \mid t, t' \in r(R) \,\}. \tag{12}$$

Natural Join. The natural join of instances of $R(X, Y)$ and $R'(Y, Z)$, denoted by $R \otimes R'$, is defined as follows:

$$R \otimes R' = \{\, (t[X],\; t[Y] \circ t'[Y],\; t'[Z]) : t[Y] \cong t'[Y] \mid t \in r(R),\; t' \in r(R') \,\}. \tag{13}$$

In (12) and (13), tuple redundancy is determined by approximate equality (e.g., $t[Y] \cong t'[Y]$; see Definition 2), and both redundancy removal and tuple combination use the tuple merging in (11).
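The operations (12) and (13) can be sketched in Python as follows. This is an illustrative sketch under several assumptions that are not part of the original definitions: all attributes share one hypothetical domain whose $\alpha$-proximate classes are listed in `CLASSES`, crisp key values are modelled as singleton distributions, the fuzzy union in (11) is taken to be the pointwise maximum, and approximate equality is tested by comparing adjusted distributions (which is equivalent to $\xi_5 = 1$). All function names are hypothetical.

```python
import math

# Hypothetical alpha-proximate classes; k1..k3 model crisp key values.
CLASSES = [("k1",), ("k2",), ("k3",), ("pretty",), ("charm", "cuteness")]


def adjust(pi):
    """Adjusted distribution of (8)."""
    out = {}
    for cls in CLASSES:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        level = sum(positive) / len(positive) if positive else 0.0
        out.update({u: level for u in cls})
    return out


def approx_equal(pa, pb):
    """xi_5 = 1 exactly when the adjusted distributions coincide."""
    a, b = adjust(pa), adjust(pb)
    return all(math.isclose(a[u], b[u], abs_tol=1e-9) for u in a)


def merge(pa, pb):
    """Tuple merging (11) for one attribute, with max as the fuzzy union."""
    a, b = adjust(pa), adjust(pb)
    return {u: max(a[u], b[u]) for u in a}


def tuples_equal(t1, t2):
    return all(approx_equal(x, y) for x, y in zip(t1, t2))


def project(rows, idx):
    """Projection (12): approximately equal sub-tuples collapse into one."""
    result = []
    for row in rows:
        sub = tuple(row[i] for i in idx)
        for k, kept in enumerate(result):
            if tuples_equal(kept, sub):
                result[k] = tuple(merge(x, y) for x, y in zip(kept, sub))
                break
        else:
            result.append(tuple(adjust(x) for x in sub))
    return result


def natural_join(left, right):
    """Natural join (13) of R(X, Y) and R'(Y, Z), given as lists of pairs."""
    return [
        (x, merge(y, y2), z)
        for (x, y) in left
        for (y2, z) in right
        if approx_equal(y, y2)
    ]


rows = [
    ({"k1": 1.0}, {"charm": 0.7, "cuteness": 0.8}, {"pretty": 0.6}),
    ({"k2": 1.0}, {"charm": 0.65, "cuteness": 0.85}, {"pretty": 0.6}),
    ({"k3": 1.0}, {"pretty": 1.0}, {"charm": 0.5}),
]
r1 = project(rows, [0, 1])
r2 = project(rows, [1, 2])
joined = natural_join(r1, r2)
print(len(r1), len(r2), len(joined))
```

In this toy instance the second attribute determines the third up to approximate equality, so the join of the two projections yields one tuple per original tuple and no spurious ones.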

Proposition 8. The projection result of a relation based on (12) must be unique.

Proof. It can be directly derived from Proposition 5.

Based on the operations (12) and (13), the ALJD is formally defined after approximate equality is extended from the tuple level to the relation level in Definition 9.

Definition 9 (approximately equal relation instances). Two relation instances $r(R)$ and $r'(R)$ in the fuzzy database are approximately equal, denoted by $r(R) \approx r'(R)$, if for every tuple $t \in r(R)$ there exists a tuple $t' \in r'(R)$ such that $t \cong t'$, and vice versa.

Definition 10 (approximate lossless join). A decomposition $\{R_1, R_2, \ldots, R_k\}$ of a relation $R$ in the fuzzy database is an approximate lossless join decomposition if $r(R) \approx (\Pi_{R_1}(R) \otimes \Pi_{R_2}(R) \otimes \cdots \otimes \Pi_{R_k}(R))$.

The approximate lossless join decomposition means that the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of the tuples in the combination result, and vice versa.

Proposition 11. Consider the following:

$$\Pi_\Theta(R) \approx \{\, t[\Theta] \mid t \in r(R) \,\}. \tag{14}$$

Proof. It can be derived from (11) and (12).

Corollary 12. Consider the following:

$$\Pi_R(R) \approx r(R). \tag{15}$$

Proof. It can be derived directly from Proposition 11.

The projection of a relation over its own schema, as shown in Corollary 12, performs no operation other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal of a relation instance is approximately equal to the original instance. This property is essential for obtaining a combination result that is approximately equal to the original instance after relation decomposition.

This study proposes an FFD for decomposition in the fuzzy database, as shown below.

Definition 13. The FFD $X \sim> Y$ holds in the relation instance $r(R)$ if, for every $t, t' \in r(R)$, $t[X] \cong t'[X]$ implies $t[Y] \cong t'[Y]$.

Remark 14. An FD in a traditional database is a special case of the FFD. If an FD $X \rightarrow Y$ holds in $r(R)$, then $X \sim> Y$ holds as well, because $t[\Theta] \cong t'[\Theta]$ must be true for any $\Theta \subset R$ if $t[\Theta] = t'[\Theta]$.
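A direct check of Definition 13 on a small instance can be sketched in Python. The checker `ffd_holds` is a hypothetical helper that takes the equality predicate as a parameter, so the same code covers the crisp case of Remark 14 (ordinary equality) and the fuzzy case. The predicate `approx_equal` below is an assumed implementation that compares class-wise averages from (8) over two hypothetical classes.

```python
import operator
from itertools import combinations

CLASSES = [("pretty",), ("charm", "cuteness")]  # hypothetical classes


def levels(pi):
    """Class-wise averages of (8), rounded to absorb floating-point noise."""
    out = []
    for cls in CLASSES:
        positive = [pi[u] for u in cls if pi.get(u, 0.0) > 0]
        out.append(round(sum(positive) / len(positive), 9) if positive else 0.0)
    return tuple(out)


def approx_equal(pa, pb):
    """Approximate equality: the adjusted distributions coincide."""
    return levels(pa) == levels(pb)


def ffd_holds(rows, x_idx, y_idx, equal):
    """Definition 13: equal X parts imply equal Y parts for every tuple pair."""
    for t, u in combinations(rows, 2):
        same_x = all(equal(t[i], u[i]) for i in x_idx)
        same_y = all(equal(t[i], u[i]) for i in y_idx)
        if same_x and not same_y:
            return False
    return True


rows = [
    ({"charm": 0.7, "cuteness": 0.8}, {"pretty": 0.6}),
    ({"charm": 0.65, "cuteness": 0.85}, {"pretty": 0.6}),
    ({"pretty": 1.0}, {"charm": 0.5}),
]
violating = rows + [({"charm": 0.75, "cuteness": 0.75}, {"pretty": 0.9})]
crisp = [(1, "a"), (1, "a"), (2, "b")]

print(ffd_holds(rows, [0], [1], approx_equal))
print(ffd_holds(violating, [0], [1], approx_equal))
print(ffd_holds(crisp, [0], [1], operator.eq))
```

Passing `operator.eq` reduces the check to the traditional FD test, which mirrors the special-case relationship stated in Remark 14.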

Lemma 15. Given relations $R$ and $R'$ with $r(R') \approx r(R)$, if $r(R)$ satisfies a set $F$ of FFDs, then $r(R')$ satisfies $F$.

Proof. The proof is by contradiction. Assume that $r(R') \approx r(R)$ and there exists an FFD $X \sim> Y$ such that $r(R)$ satisfies $X \sim> Y$ and $r(R')$ does not. Because $X \sim> Y$ holds in $r(R)$,

$$\text{for every } t_1, t_2 \in r(R), \text{ if } t_1[X] \cong t_2[X], \text{ then } t_1[Y] \cong t_2[Y]. \tag{16}$$

Since $X \sim> Y$ does not hold in $r(R')$, there exist $t'_1, t'_2 \in r(R')$ such that $t'_1[X] \cong t'_2[X]$ and $\eta(t'_1[Y], t'_2[Y]) \ne 1$. Since $r(R') \approx r(R)$, there exist $t, t'' \in r(R)$ such that $t \cong t'_1$ and $t'' \cong t'_2$. Then we have $t, t'' \in r(R)$ and $t[X] \cong t''[X]$ but $\eta(t[Y], t''[Y]) \ne 1$, which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including the reflexive rule, augmentation rule, and transitive rule (see endnote 3). This property enables the result of lossless join decomposition to have the dependency preservation property (see endnote 4) [1].

Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that hold in $r(R)$.
Step 1. Let $\mathcal{R} = \{R\}$ and let $F'$ be the set of all $f \in F$ that are not trivial.
Step 2. For an $f': X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.
Step 3. If $X$ is not a key attribute in $R'$, do the following:
    (1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X,Y)}(R')$ and $R_2 = \Pi_{R'(Y)}(R')$;
    (2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);
    (3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.
Step 4. Go to Step 2 if $F'$ is not empty.
Output: $\mathcal{R}$, the set of relations decomposed from $R$.
Note: $R'(Y)$ represents the list of all attributes in $R'$ other than $Y$, so that $R_1$ and $R_2$ share $X$ as in Lemma 16.

Algorithm 1: ALJD Algorithm.
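The control flow of Algorithm 1 can be sketched at the schema level in Python. This is a hedged illustration, not the author's implementation: relation schemas are modelled as frozensets of attribute names, an FFD is a pair of frozensets, and the key test is delegated to a caller-supplied function `is_key` (hypothetical), because deciding key attributes requires the relation instance. Three further assumptions are made: following Lemma 16, the second relation keeps $X$ and drops $Y$; an FFD is discarded once it has been considered, whether or not it triggers a split, so that the loop terminates; and an FFD whose attributes no longer sit in a single relation is skipped.

```python
def is_trivial(ffd, ffds):
    """Definition 17: X ~> Y is trivial if F has X ~> Y' with Y a proper subset of Y'."""
    x, y = ffd
    return any(x == x2 and y < y2 for x2, y2 in ffds)


def aljd(schema, ffds, is_key):
    """Schema-level sketch of Algorithm 1; returns the list of decomposed schemas."""
    relations = [frozenset(schema)]
    pending = [f for f in ffds if not is_trivial(f, ffds)]       # Step 1
    while pending:                                               # Step 4
        x, y = pending.pop(0)                                    # Step 2
        target = next((r for r in relations if x | y <= r), None)
        if target is None or is_key(x, target):
            continue                                             # Step 3 skipped
        r1, r2 = x | y, target - y
        if r1 == target:
            continue                                             # nothing to split
        relations.remove(target)
        relations.extend([r1, r2])
    return relations


# Hypothetical example: R(A, X, Y, Z) with A as the only key attribute.
ffds = [
    (frozenset("X"), frozenset("YZ")),
    (frozenset("X"), frozenset("Y")),   # trivial because of X ~> YZ
    (frozenset("A"), frozenset("X")),
]
result = aljd("AXYZ", ffds, is_key=lambda x, r: x == frozenset("A"))
print([sorted(r) for r in result])
```

Filtering trivial FFDs first means the relation is split once on $X \sim> YZ$ instead of separately on $X \sim> Y$ and $X \sim> Z$, which keeps the number of joins needed at query time low.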

Lemma 16. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$ and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

For the first part, the proof is by contradiction. Assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$ because $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A],\; t_1[X] \circ t_2[X],\; t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A],\; t_1[X] \circ t_2[X],\; t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus, $t' \cong t$, which contradicts the assumption.

For the second part, the proof is again by contradiction, with renewed symbols. Assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$ based on (13). Also, we let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1), \quad \hat{t}_2 \in r(R_2), \quad \hat{t}_1 \cong t'[A, X], \quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A,X)}(R)$, based on Proposition 11, there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17), (18), and (19), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus, $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (19) and (17). Together with $t[A, X] \cong \hat{t}_1 \cong t'[A, X]$ from (18) and (17), this gives $t' \cong t$, which contradicts the assumption.

In Lemma 16, each one of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.

Definition 17. Let $F$ be a set of FFDs. An FFD $f: X \sim> Y$ in $F$ is trivial if there exists an FFD $f': X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, more join operations are needed to obtain the original data during query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of normalization of traditional databases, where only the FDs of nonkey attributes are considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two distinct tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. Excluding FFDs that contain key attributes prevents unnecessary decomposition of relations that have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations in query processing.

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$ and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived from Proposition 11 and Lemma 15. Statement (ii) can be derived from Lemma 16 and the properties of FFDs (namely, Armstrong's inference rules IR1, IR2, and IR3 described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to the issues related to the application of FFDs.

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for tuples and relations in fuzzy databases and provides a measure of the approximate equality. The measure is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in fuzzy databases. Second, it proposes approximate lossless join decomposition for fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to data compression in fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for fuzzy databases are, respectively, general cases of the traditional FD and lossless join decomposition. This generality is important for dealing with databases that contain both crisp data and fuzzy data. Fourth, similar to the existing approaches to database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed approach of decomposition satisfies a degree of lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for the extended possibility-based fuzzy databases is more difficult because the data are more complex.

There are several directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in fuzzy databases. The notion can also be applied to research on data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in fuzzy databases. Studies that incorporate fuzzy concepts into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality as the similarity measure. Since fuzzy concepts have been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of the National Science Council (NSC102-2410-H-155-036-MY2) and the Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distributions and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \rightarrow Y$ if $Y \subseteq X$. IR2 (augmentation rule): if $X \rightarrow Y$, then $XZ \rightarrow YZ$. IR3 (transitive rule): if $X \rightarrow Y$ and $Y \rightarrow Z$, then $X \rightarrow Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relation decomposed from $R$ or can be derived via Armstrong's inference rules from the FFDs in these relations.

References

[1] S B Navathe and R Elmasri Fundamentals of Database Sys-tems Addison-Wesley New York NY USA 2010

[2] P A Bernstein J R Swenson and D C Tsichritzis ldquoA unifiedapproach to functional dependencies and relationsrdquo in Pro-ceedings of the ACM SIGMOD International Conference onManagement of Data pp 237ndash245 1975

[3] E F Codd ldquoA relational model of data for large shared databanksrdquo Communications of the ACM vol 13 no 6 pp 377ndash3871970

[4] L A Zadeh ldquoFuzzy sets as a basis for a theory of possibilityrdquoFuzzy Sets and Systems vol 1 no 1 pp 3ndash28 1978

[5] B P Buckles and F E Petry ldquoA fuzzy representation of data forrelational databasesrdquo Fuzzy Sets and Systems vol 7 no 3 pp213ndash226 1982

[6] S Shenoi and A Melton ldquoProximity relations in the fuzzyrelational database modelrdquo Fuzzy Sets and Systems vol 31 no3 pp 285ndash296 1989

[7] G Chen J Vandenbulcke and E E Kerre ldquoA general treatmentof data redundancy in a fuzzy relational data modelrdquo Journal ofthe American Society for Information Science vol 43 no 4 pp304ndash311 1992

[8] H Prade and C Testemale ldquoGeneralizing database relationalalgebra for the treatment of incomplete or uncertain informa-tion and vague queriesrdquo Information Sciences vol 34 no 2 pp115ndash143 1984

[9] J C Cubero andM A Vila ldquoNew definition of fuzzy functionaldependency in fuzzy relational databasesrdquo International Journalof Intelligent Systems vol 9 no 5 pp 441ndash448 1994

Journal of Applied Mathematics 9

[10] K. Raju and A. K. Majumdar, "Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems," ACM Transactions on Database Systems, vol. 13, no. 2, pp. 129–166, 1988.

[11] G. Chen, E. E. Kerre, and J. Vandenbulcke, "Normalization based on fuzzy functional dependency in a fuzzy relational data model," Information Systems, vol. 21, no. 3, pp. 299–310, 1996.

[12] S. Jyothi and M. S. Babu, "Multivalued dependencies in fuzzy relational databases and lossless join decomposition," Fuzzy Sets and Systems, vol. 88, no. 3, pp. 315–332, 1997.

[13] J. Y. Liu, P. Chang, and C. P. C. Yeh, "Consistent data operations for multi-databases in extended possibility-based data models," Expert Systems with Applications, vol. 36, no. 3, pp. 6174–6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, and F. Mili, "Data dependencies in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 3, pp. 321–332, 2002.

[15] J. Y. Liu, "Data integration constraints for consistent data redundancy in fuzzy databases," International Journal of Intelligent Systems, vol. 23, no. 6, pp. 635–653, 2008.

[16] G. Chen, J. Vandenbulcke, and E. E. Kerre, "On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model," in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440–446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, and X. Tang, "Extracting representative information to enhance flexible data queries," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 928–941, 2012.

[18] S. Shenoi, A. Melton, and L. T. Fan, "An equivalence classes model of fuzzy relational databases," Fuzzy Sets and Systems, vol. 38, no. 2, pp. 153–170, 1990.

[19] M. Koyuncu and A. Yazici, "IFOOD: An intelligent fuzzy object-oriented database architecture," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1137–1154, 2003.

[20] Z. Ma, W. Zhang, and W. Ma, "Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 15, no. 8, pp. 705–716, 2000.

[21] V. V. Cross, "Fuzzy extensions for relationships in a generalized object model," International Journal of Intelligent Systems, vol. 16, no. 7, pp. 843–861, 2001.

[22] V. V. Cross, "Defining fuzzy relationships in object models: abstraction and interpretation," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 5–27, 2003.

[23] S. Wang and G. H. Huang, "A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning," Journal of Environmental Management, vol. 117, pp. 208–218, 2013.

[24] S. Wang and G. H. Huang, "An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: An application to water resources management," Stochastic Environmental Research and Risk Assessment, vol. 27, no. 6, pp. 1493–1506, 2013.

[25] S. Wang and G. H. Huang, "Interactive fuzzy boundary interval programming for air quality management under uncertainty," Water, Air, and Soil Pollution, vol. 224, article 1574, 2013.

[26] Z. M. Ma and F. Mili, "Handling fuzzy information in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 10, pp. 925–942, 2002.

[27] Z. M. Ma and L. Yan, "Updating extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 22, no. 3, pp. 237–258, 2007.

[28] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies: an overview and a critical discussion," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325–330, 1994.

[29] P. Bosc and O. Pivert, "On the impact of regular functional dependencies when moving to a possibilistic database framework," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 207–227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, and W. Bandler, "On nearness measures in fuzzy relational data models," International Journal of Approximate Reasoning, vol. 3, no. 3, pp. 267–298, 1989.

[31] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies and redundancy elimination," Journal of the American Society for Information Science, vol. 49, no. 3, pp. 217–235, 1998.

[32] Z. M. Ma, W. J. Zhang, and F. Mili, "Fuzzy data compression based on data dependencies," International Journal of Intelligent Systems, vol. 17, no. 4, pp. 409–426, 2002.

[33] B. Bhuniya and P. Niyogi, "Lossless join property in fuzzy relational databases," Data and Knowledge Engineering, vol. 11, no. 2, pp. 109–124, 1993.

[34] O. Bahar and A. Yazici, "Normalization and lossless join decomposition of similarity-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 19, no. 10, pp. 885–917, 2004.

[35] T. K. Bhattacharjee and A. K. Mazumdar, "Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model," Fuzzy Sets and Systems, vol. 96, no. 3, pp. 343–352, 1998.

[36] Z. M. Ma and L. Yan, "Generalization of strategies for fuzzy query translation in classical relational databases," Information and Software Technology, vol. 49, no. 2, pp. 172–180, 2007.

[37] J. Zhang, Q. Wei, and G. Chen, "An efficient incremental method for generating equivalence groups of search results in information retrieval and queries," Knowledge-Based Systems, vol. 32, pp. 91–100, 2012.

[38] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144 of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.


are not. Since $Y \sim> Z$, $R$ should be decomposed to avoid redundancy. After decomposing $R(X, Y, Z)$ into $R'(X, Y)$ and $R''(Y, Z)$, if tuple $\langle y, z \rangle$ is first removed because it is redundant to tuple $\langle y', z' \rangle$, the result of $R''(Y, Z)$ contains two tuples, $\langle y', z' \rangle$ and $\langle y'', z'' \rangle$. The natural join of $R'(X, Y)$ and $R''(Y, Z)$ generates a four-tuple result, $\langle x_1, y, z' \rangle$, $\langle x_1, y, z'' \rangle$, $\langle x_2, y', z' \rangle$, and $\langle x_3, y'', z'' \rangle$, which contains a spurious tuple.
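The order sensitivity behind such spurious tuples can be reproduced with a small sketch. It does not use any of the paper's similarity measures: the hypothetical relation `similar` (numbers within distance 1) merely stands in for a resemblance that is reflexive and symmetric but not transitive, and `remove_redundant` is an illustrative greedy pass rather than the procedure of any cited work.

```python
def similar(a, b):
    # Hypothetical non-transitive resemblance: 1 ~ 2 and 2 ~ 3, but not 1 ~ 3.
    return abs(a - b) <= 1


def remove_redundant(values):
    """Greedy pass: drop a value if it resembles one that was already kept."""
    kept = []
    for v in values:
        if not any(similar(v, k) for k in kept):
            kept.append(v)
    return kept


# The same three values, visited in two different orders.
print(remove_redundant([1, 2, 3]))  # [1, 3]
print(remove_redundant([2, 1, 3]))  # [2]
```

Because the result depends on the visiting order, a later join can see different surviving tuples from one run to the next, which is the situation a transitive notion of equality is meant to rule out.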

To resolve this problem, this study proposes the operations of projection and equal join for fuzzy databases, which involve the evaluation of redundancy and tuple merging. Since the decomposition of relations is based on FFDs, it depends on the similarity of tuples. For the data in the fuzzy model, (3)–(9) can be used to measure the similarity of tuples and to define FFDs in the fuzzy databases. However, (5) restricts redundant tuples to exact duplicates, and (3), (4), and (6) lack transitivity. Therefore, this work adopts (9) and (10) to define approximate equality for tuples that might not be identical but have a high degree of similarity. The approximate equality makes the result of redundancy removal unique.

Definition 2 (approximately equal tuples). Two tuples $t = (\pi_1, \pi_2, \ldots, \pi_m)$ and $t' = (\pi'_1, \pi'_2, \ldots, \pi'_m)$ are approximately equal, denoted by $t \cong t'$, if

$$\min_{j = 1, \ldots, m} \xi_5(\pi_j, \pi'_j) = 1.$$

In other words, tuples $t$ and $t'$ are approximately equal if their similarity $\eta(t, t') = 1$.

Lemma 3. The approximate equality of tuples (or attribute values) is transitive.

Proof. Based on (9), it is obvious that if $\xi_5(\pi_A, \pi_B) = 1$ and $\xi_5(\pi_B, \pi_G) = 1$, then $\xi_5(\pi_A, \pi_G) = 1$. Thus, if $t \cong t'$ and $t' \cong t''$, then $t \cong t''$ based on (10).

Tuples that are approximately equal are considered redundant to each other. The notion of approximate equality can be applied to query processing with predicates containing fuzzy concepts [36] for fuzzy databases in different models. For simplicity, we let $\pi_A \cong \pi_B$ denote $\xi_5(\pi_A, \pi_B) = 1$ hereafter.

Example 4. Given values $\pi_A = \{0.75/\text{pretty}, 0.65/\text{cuteness}\}$ and $\pi_B = \{0.6/\text{pretty}, 0.7/\text{charm}, 0.8/\text{cuteness}\}$ on domain $D$ and equivalence classes $C_1 = \{\text{pretty}\}$ and $C_2 = \{\text{charm}, \text{cuteness}\}$ for $D$, the average possibilities of $\pi_B$ are $u_{B1} = 0.6/1 = 0.6$ and $u_{B2} = (0.7 + 0.9)/2 = 0.8$, yielding $\hat{\pi}_B = \{0.6/\text{pretty}, 0.8/\text{charm}, 0.8/\text{cuteness}\}$. Likewise, $u_{A1} = 0.75$, $u_{A2} = 0.65$, and $\hat{\pi}_A = \{0.75/\text{pretty}, 0.65/\text{charm}, 0.65/\text{cuteness}\}$. We have $\delta(\pi_A, \pi_B) = (0.6 + 0.65 + 0.65)/(0.6 + 0.8 + 0.8) = 0.86$ and $\delta(\pi_B, \pi_A) = 1.9/2.05 = 0.92$. Thus, $\xi_5(\pi_A, \pi_B) = \min\{0.86, 0.92\} = 0.86$. Given $\pi_G = \{0.6/\text{pretty}, 0.65/\text{charm}, 0.85/\text{cuteness}\}$ on $D$, we have $\pi_G \cong \pi_B$ even though $\pi_G$ is not identical to $\pi_B$.
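The mechanics of Example 4 can be sketched in a few lines. Equations (8) and (9) are not restated in this part of the paper, so the forms below are assumptions inferred from the arithmetic of the example: `adjust` averages the possibilities present in each equivalence class and assigns the average to every element of the class, `delta` divides the sum of element-wise minima by the sum of the second adjusted distribution, and `xi5` takes the smaller of the two directions. All function names are hypothetical, and the inputs are the values of $\pi_A$, $\pi_B$, and $\pi_G$ as listed in the example.

```python
from fractions import Fraction as F


def adjust(pi, classes):
    """Assumed form of (8): spread each class average over the whole class."""
    adjusted = {}
    for cls in classes:
        present = [pi[e] for e in cls if e in pi]
        if present:
            avg = sum(present) / len(present)
            for e in cls:
                adjusted[e] = avg
    return adjusted


def delta(pi_a, pi_b, classes):
    """Assumed form: sum of element-wise minima over the sum of adjusted pi_b."""
    a, b = adjust(pi_a, classes), adjust(pi_b, classes)
    overlap = sum(min(a.get(e, 0), b.get(e, 0)) for e in set(a) | set(b))
    return overlap / sum(b.values())


def xi5(pi_a, pi_b, classes):
    """Assumed form of (9): the smaller of the two directed measures."""
    return min(delta(pi_a, pi_b, classes), delta(pi_b, pi_a, classes))


classes = [{"pretty"}, {"charm", "cuteness"}]
pi_a = {"pretty": F("0.75"), "cuteness": F("0.65")}
pi_b = {"pretty": F("0.6"), "charm": F("0.7"), "cuteness": F("0.8")}
pi_g = {"pretty": F("0.6"), "charm": F("0.65"), "cuteness": F("0.85")}

# Adjusted value of pi_A, shown as floats for readability.
print({e: float(p) for e, p in sorted(adjust(pi_a, classes).items())})
# {'charm': 0.65, 'cuteness': 0.65, 'pretty': 0.75}

# pi_G and pi_B differ element by element but share one adjusted value.
print(xi5(pi_g, pi_b, classes) == 1)  # True
```

Exact fractions are used instead of floats so that the test for a similarity of exactly 1 is not disturbed by rounding.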

Proposition 5. The approximate equality can be used to classify the values of the fuzzy database into disjoint sets (equivalence classes).

Proof. Based on the definition in (9), it is obvious that $\xi_5$ is reflexive and symmetric; that is, $\xi_5(\pi_A, \pi_A) = 1$ and $\xi_5(\pi_A, \pi_B) = \xi_5(\pi_B, \pi_A)$ for values $\pi_A$ and $\pi_B$. Besides, approximate equality is transitive according to Lemma 3. Therefore, two sets of approximately equal values are either disjoint or the same class, in which any two of the values are approximately equal to each other.
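Proposition 5 is what allows a partition to be built by comparing each value with a single representative per class. The sketch below illustrates this with a stand-in predicate: the values are assumed to be already-adjusted distributions, and `same_adjusted` (a hypothetical helper) treats two values as approximately equal when these distributions coincide. The sample values are illustrative only.

```python
def partition(values, approx_eq):
    """Group values into classes; one representative per class is enough
    because the relation is reflexive, symmetric, and transitive."""
    classes = []
    for v in values:
        for cls in classes:
            if approx_eq(cls[0], v):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes


def same_adjusted(v, w):
    # Stand-in for xi_5(v, w) = 1 on values that are already adjusted.
    return v == w


values = [
    {"pretty": 0.6, "charm": 0.75, "cuteness": 0.75},
    {"pretty": 0.75, "charm": 0.65, "cuteness": 0.65},
    {"cuteness": 0.75, "charm": 0.75, "pretty": 0.6},
]
print([len(c) for c in partition(values, same_adjusted)])  # [2, 1]
```

With a non-transitive predicate the same loop would still terminate, but the classes it returns would depend on the order of `values`.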

The transitivity of a similarity measure is important to any operation involving redundancy removal or tuple merging. Besides, a transitive measure can be applied to clustering methods or data groupings, such as the ones in [36, 37].

Proposition 6. Given $\pi_A$ and its adjusted value $\hat{\pi}_A$ following (8), $\pi_A \cong \hat{\pi}_A$.

Proof. It is obvious by the definition in (9).

Buckles and Petry first proposed tuple merging and applied it to remove redundant tuples in a fuzzy database [5]. Tuple merging can also be used in the join operation. This study extends the tuple merging of Chen et al. [16] to (11) for relation combination as well as redundancy removal. Given tuples $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$ and $t' = (\pi'_{A_1}, \pi'_{A_2}, \ldots, \pi'_{A_m})$, the tuple merging of $t$ and $t'$, denoted by $t \circ t'$, is given by

$$t \circ t' = (\hat{\pi}_{A_1} \cup_F \hat{\pi}'_{A_1}, \hat{\pi}_{A_2} \cup_F \hat{\pi}'_{A_2}, \ldots, \hat{\pi}_{A_m} \cup_F \hat{\pi}'_{A_m}), \tag{11}$$

where each $\hat{\pi}_{\bullet}$ (or $\hat{\pi}'_{\bullet}$) is the adjusted value of $\pi_{\bullet}$ (or $\pi'_{\bullet}$) according to (8) and $\cup_F$ denotes fuzzy union. For single-value tuples $t = (\pi_{A_1})$ and $t' = (\pi'_{A_1})$, tuple merging is alternatively denoted by $\pi_{A_1} \circ \pi'_{A_1}$.
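Tuple merging as in (11) can be sketched as a component-wise fuzzy union. The sketch assumes that every attribute value has already been adjusted by (8) and that the fuzzy union is the usual element-wise maximum; the function names and the sample attribute values (including the label `tall`) are hypothetical.

```python
def fuzzy_union(v, w):
    """Element-wise maximum of two possibility distributions."""
    return {e: max(v.get(e, 0.0), w.get(e, 0.0)) for e in sorted(set(v) | set(w))}


def merge(t, t_prime):
    """Sketch of (11) for tuples whose values are already adjusted by (8)."""
    return tuple(fuzzy_union(v, w) for v, w in zip(t, t_prime))


t = ({"pretty": 0.6, "charm": 0.75, "cuteness": 0.75}, {"tall": 1.0})
t_prime = ({"pretty": 0.6, "charm": 0.75, "cuteness": 0.75}, {"tall": 1.0})

# Merging tuples whose adjusted values coincide returns the same tuple.
print(merge(t, t_prime) == t)  # True
```

When the adjusted values of two tuples coincide, the merged tuple equals both of them, which is the behavior that makes merging safe for approximately equal tuples.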

Lemma 7. Let $\pi_A$ and $\pi'_A$ be two possibility distributions on the same domain. If $\pi_A \cong \pi'_A$, then $\pi_A \cong \pi_A \circ \pi'_A \cong \pi'_A$.

Proof. Based on (9) and (11), it is obvious that if $\xi_5(\pi_A, \pi'_A) = 1$, then $\xi_5(\pi_A, \pi_A \circ \pi'_A) = 1$ and $\xi_5(\pi'_A, \pi_A \circ \pi'_A) = 1$.

Based on the literature review and Lemma 7, we summarize the properties of different similarity measures with threshold $\alpha = 1$ in Table 3 to show the merit of (9), which is adopted in this work.

4. Approximate Lossless Join Decomposition

This section first offers the operations for relation decomposition and combination. Then it proposes a notion of approximate lossless join decomposition (ALJD), which incorporates fuzzy concepts into lossless join decomposition. It also provides a method to achieve the ALJD.

Similar to the works in [37], this study generalizes the projection and natural join operations of traditional databases to fuzzy databases as below. Here, given a relation $R$, $\Theta$ denotes a set of attributes in $R$ (i.e., $\Theta \subset R$), and $t[\Theta]$ denotes the composite of values in tuple $t$ over the attributes $\Theta$. For example, given $t \in r(R)$, $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$, and $\Theta = (A_2, A_3, A_4)$, then $t[\Theta] = (\pi_{A_2}, \pi_{A_3}, \pi_{A_4})$.

Table 3: Properties of different similarity measures with threshold $\alpha = 1$.

| Properties/measures | Equation (5) [16] | Equation (6) [20] | Equation (9) [13] |
| Transitivity | Yes | No | Yes |
| Result after merging | Lossless | Non-lossless | Approximate lossless |
| Predicates for lossless join | Identical values | Different values | Different values |

Projection. Projecting the instance of relation $R$ on attributes $\Theta \subset R$, denoted by $\Pi_\Theta(R)$, is given by

$$\Pi_\Theta(R) = \{ t[\Theta] \circ t'[\Theta] : t[\Theta] \cong t'[\Theta] \mid t, t' \in r(R) \}. \tag{12}$$

Natural Join. The natural join of instances of $R(X, Y)$ and $R'(Y, Z)$, denoted by $R \otimes R'$, is defined as follows:

$$R \otimes R' = \{ (t[X], t[Y] \circ t'[Y], t'[Z]) : t[Y] \cong t'[Y] \mid t \in r(R), t' \in r(R') \}. \tag{13}$$

In (12) and (13), tuple redundancy is determined by approximate equality (e.g., $t[Y] \cong t'[Y]$; see Definition 2), and both redundancy removal and tuple combination use the tuple merging in (11).
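The two operations can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: attribute values are assumed to be already adjusted by (8), so `approx_eq` reduces to plain equality and `merge` to an element-wise maximum; a relation instance is modeled as a list of dictionaries keyed by attribute name; and the sample relation with attributes `A`, `X`, and `Y` is hypothetical.

```python
def approx_eq(v, w):
    # Stand-in for xi_5(v, w) = 1 on values that are already adjusted.
    return v == w


def merge(v, w):
    # Fuzzy union (element-wise maximum) of two adjusted distributions.
    return {e: max(v.get(e, 0), w.get(e, 0)) for e in set(v) | set(w)}


def project(rows, attrs):
    """Sketch of (12): keep attrs and merge approximately equal tuples."""
    result = []
    for row in rows:
        sub = {a: row[a] for a in attrs}
        for kept in result:
            if all(approx_eq(kept[a], sub[a]) for a in attrs):
                for a in attrs:
                    kept[a] = merge(kept[a], sub[a])
                break
        else:
            result.append(sub)
    return result


def natural_join(left, right):
    """Sketch of (13): join on the attributes that both relations share."""
    joined = []
    for t in left:
        for u in right:
            common = [a for a in t if a in u]
            if all(approx_eq(t[a], u[a]) for a in common):
                row = {**t, **u}
                for a in common:
                    row[a] = merge(t[a], u[a])
                joined.append(row)
    return joined


def crisp(label):
    # A crisp value written as a possibility distribution.
    return {label: 1.0}


r = [
    {"A": crisp("a1"), "X": crisp("x1"), "Y": crisp("y1")},
    {"A": crisp("a2"), "X": crisp("x1"), "Y": crisp("y1")},
    {"A": crisp("a3"), "X": crisp("x2"), "Y": crisp("y2")},
]
r1 = project(r, ["A", "X"])
r2 = project(r, ["X", "Y"])
print(len(r1), len(r2))           # 3 2
print(natural_join(r1, r2) == r)  # True
```

In this crisp special case the join of the two projections returns exactly the original instance; with genuinely fuzzy values the result would be approximately equal to it rather than identical.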

Proposition 8. The projection result of a relation based on (12) must be unique.

Proof. It can be directly derived from Proposition 5.

Based on the operations (12) and (13), the ALJD is formally defined following the extension of approximate equality from the tuple level to the relation level in Definition 9.

Definition 9 (approximately equal relation instances). Two relation instances $r(R)$ and $r'(R)$ in the fuzzy database are approximately equal, denoted by $r(R) \approx r'(R)$, if for every tuple $t \in r(R)$ there exists a tuple $t' \in r'(R)$ such that $t \cong t'$, and vice versa.

Definition 10 (approximate lossless join). A decomposition $\{R_1, R_2, \ldots, R_k\}$ of a relation $R$ in the fuzzy database is an approximate lossless join decomposition if $r(R) \approx (\Pi_{R_1}(R) \otimes \Pi_{R_2}(R) \otimes \cdots \otimes \Pi_{R_k}(R))$.

Approximate lossless join decomposition means that the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of the tuples in the combination result, and vice versa.

Proposition 11. Consider the following:

$$\Pi_\Theta(R) \approx \{ t[\Theta] : t \in r(R) \}. \tag{14}$$

Proof. It can be derived from (11) and (12).

Corollary 12. Consider the following:

$$\Pi_R(R) \approx r(R). \tag{15}$$

Proof. It can be derived directly from Proposition 11.

The projection of a relation over the same schema, as shown in Corollary 12, represents no operation other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal of a relation instance is approximately equal to the original instance. This property is essential for obtaining a combination result that is approximately equal to the original instance after relation decomposition.

This study proposes the following FFD for decomposition in the fuzzy database.

Definition 13. The FFD $X \sim> Y$ holds in the relation instance $r(R)$ if, for every $t, t' \in r(R)$, $t[X] \cong t'[X]$ implies $t[Y] \cong t'[Y]$.
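Definition 13 translates into a pairwise check over a relation instance. In the sketch below, `approx_eq` is a stand-in for $\xi_5 = 1$ on already-adjusted values, crisp labels are used as values for brevity, and the sample relation is hypothetical.

```python
from itertools import combinations


def approx_eq(v, w):
    # Stand-in for xi_5(v, w) = 1 on values that are already adjusted.
    return v == w


def ffd_holds(rows, lhs, rhs):
    """True if every pair of tuples that agrees on lhs also agrees on rhs."""
    for t, u in combinations(rows, 2):
        agree_lhs = all(approx_eq(t[a], u[a]) for a in lhs)
        agree_rhs = all(approx_eq(t[a], u[a]) for a in rhs)
        if agree_lhs and not agree_rhs:
            return False
    return True


r = [
    {"A": "a1", "X": "x1", "Y": "y1"},
    {"A": "a2", "X": "x1", "Y": "y1"},
    {"A": "a3", "X": "x2", "Y": "y2"},
]
print(ffd_holds(r, ["X"], ["Y"]))  # True
print(ffd_holds(r, ["Y"], ["A"]))  # False
```

Pairs of a tuple with itself are skipped because reflexivity makes them satisfy the condition trivially.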

Remark 14. An FD in a traditional database is a special case of the FFD. If an FD $X \to Y$ holds in $r(R)$, then $X \sim> Y$ holds as well. This is because $t[\Theta] \cong t'[\Theta]$ must be true for any $\Theta \subset R$ if $t[\Theta] = t'[\Theta]$.

Lemma 15. Given relations $R$ and $R'$ with $r(R') \approx r(R)$, if $r(R)$ satisfies a set $F$ of FFDs, then $r(R')$ satisfies $F$.

Proof. Proof by contradiction: assume that $r(R') \approx r(R)$ and that there exists an FFD $X \sim> Y$ such that $r(R)$ satisfies $X \sim> Y$ and $r(R')$ does not. Because $X \sim> Y$ holds in $r(R)$,

$$\text{for every } t_1, t_2 \in r(R), \text{ if } t_1[X] \cong t_2[X], \text{ then } t_1[Y] \cong t_2[Y]. \tag{16}$$

Since $X \sim> Y$ does not hold in $r(R')$, there exist $t'_1, t'_2 \in r(R')$ such that $t'_1[X] \cong t'_2[X]$ and $\eta(t'_1[Y], t'_2[Y]) \neq 1$. Since $r(R') \approx r(R)$, there exist $t, t'' \in r(R)$ such that $t \cong t'_1$ and $t'' \cong t'_2$. Then we have $t, t'' \in r(R)$ and $t[X] \cong t''[X]$ but $\eta(t[Y], t''[Y]) \neq 1$, which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including the reflexive rule, the augmentation rule, and the transitive rule (see endnote 3). This property enables the result of lossless join decomposition to have the dependency preservation property (see endnote 4) [1].

Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that exist in $r(R)$.
Step 1. Let $\mathcal{R} = \{R\}$ and let $F'$ be the set of all $f \in F$ that are not trivial.
Step 2. For an $f'\colon X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.
Step 3. If $X$ is not a key attribute in $R'$, do the following:
  (1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X, Y)}(R')$ and $R_2 = \Pi_{R'(\bar{Y})}(R')$;
  (2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);
  (3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.
Step 4. Go to Step 2 if $F'$ is not empty.
Output: $\mathcal{R}$, the set of relations decomposed from $R$.
Note: $R'(\bar{Y})$ represents the list of all attributes in $R'$ other than $Y$, so that $R_1$ and $R_2$ share $X$ as in Lemma 16.

Algorithm 1: ALJD algorithm.
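At the schema level, the steps of Algorithm 1 can be sketched as below. Several points are assumptions rather than part of the algorithm as stated: schemas and FFD sides are modeled as frozensets of single-letter attribute names; `is_key` is a caller-supplied, hypothetical predicate (the paper defines keys on the relation instance); following Lemma 16, the second relation keeps every attribute of the chosen relation except those in $Y$; and each FFD is dropped from the work list once it has been examined, whether or not it triggered a decomposition, so that the loop always terminates.

```python
def is_trivial(ffd, ffds):
    """Definition 17: X ~> Y is trivial if F has X ~> Y' with Y a proper subset of Y'."""
    lhs, rhs = ffd
    return any(l == lhs and rhs < r for l, r in ffds)


def aljd(schema, ffds, is_key):
    relations = [frozenset(schema)]                         # Step 1
    pending = [f for f in ffds if not is_trivial(f, ffds)]
    while pending:                                          # Step 4
        lhs, rhs = pending.pop(0)                           # Step 2
        target = next((r for r in relations if lhs | rhs <= r), None)
        if target is None or is_key(lhs, target):
            continue
        r1, r2 = lhs | rhs, target - rhs                    # Step 3
        relations.remove(target)
        for r in (r1, r2):
            if r not in relations:
                relations.append(r)
    return relations


ffds = [
    (frozenset("X"), frozenset("YZ")),
    (frozenset("X"), frozenset("Y")),  # trivial, because X ~> YZ is in F
]
result = aljd("AXYZ", ffds, is_key=lambda lhs, rel: lhs == frozenset("A"))
print(sorted("".join(sorted(r)) for r in result))  # ['AX', 'XYZ']
```

The example decomposes $R(A, X, Y, Z)$ once, on $X \sim> YZ$, and ignores the trivial $X \sim> Y$, which matches the intent of avoiding decompositions on trivial FFDs.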

Lemma 16. Let $R(A, X, Y)$ be a relation and $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$ and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

Proof by contradiction for the first part: assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$, because $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A], t_1[X] \circ t_2[X], t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A], t_1[X] \circ t_2[X], t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus, $t' \cong t$, which contradicts the assumption.

Proof by contradiction for the second part, with renewed symbols: assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$ based on (13). Also, we let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1), \quad \hat{t}_2 \in r(R_2), \quad \hat{t}_1 \cong t'[A, X], \quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A, X)}(R)$, based on Proposition 11, there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17)–(19), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus, $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (17) and (19), and $t' \cong t$, which contradicts the assumption.

In Lemma 16, each of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.

Definition 17. Let $F$ be the set of FFDs. An FFD $f\colon X \sim> Y$ in $F$ is trivial if there exists an FFD $f'\colon X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, it takes more join operations to obtain the original data during query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites, as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of normalization in traditional databases, where only the FDs of nonkey attributes are considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. Excluding FFDs that contain key attributes prevents unnecessary decomposition of relations that have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations in query processing.

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A, X)}(R)$ and $R_2 = \Pi_{(X, Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$ and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived from Proposition 11 and Lemma 15. Statement (ii) can be derived from Lemma 16 and the properties of FFDs (namely, Armstrong's axioms IR1, IR2, and IR3, described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to issues related to the application of FFDs.

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for the tuples or relations in fuzzy databases and provides a measure of the approximate equality. The measure is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in fuzzy databases. Second, it proposes approximate lossless join decomposition for fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to data compression in fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for fuzzy databases are, respectively, general cases of the traditional FD and lossless join decomposition. This generality is important for dealing with databases that contain both crisp and fuzzy data. Fourth, similar to the existing approaches to database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed decomposition approach satisfies a degree of the lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for extended possibility-based fuzzy databases is more difficult because the data are more complex.

There are several directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in fuzzy databases. Research can apply the notion to data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in fuzzy databases. Studies that aim to incorporate fuzzy concepts into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality as the similarity measure. Since fuzzy concepts have been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of the National Science Council (NSC102-2410-H-155-036-MY2) and the Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distributions and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \to Y$ if $Y \subseteq X$. IR2 (augmentation rule): if $X \to Y$, then $XZ \to YZ$. IR3 (transitive rule): if $X \to Y$ and $Y \to Z$, then $X \to Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relation decomposed from $R$ or can be represented via Armstrong's inference rules from the FFDs in these relations.


[22] V V Cross ldquoDefining fuzzy relationships in object modelsabstraction and interpretationrdquo Fuzzy Sets and Systems vol 140no 1 pp 5ndash27 2003

[23] S Wang and G H Huang ldquoA two-stage mixed-integerfuzzy programming with interval-valued membership func-tions approach for flood-diversion planningrdquo Journal of Envi-ronmental Management vol 117 pp 208ndash218 2013

[24] S Wang and G H Huang ldquoAn interval-parameter two-stagestochastic fuzzy program with type-2 membership functionsAn application to water resources managementrdquo StochasticEnvironmental Research and Risk Assessment vol 27 no 6 pp1493ndash1506 2013

[25] S Wang and G H Huang ldquoInteractive fuzzy boundary intervalprogramming for air quality management under uncertaintyrdquoWater Air and Soil Pollution vol 224 article 1574 2013

[26] Z M Ma and F Mili ldquoHandling fuzzy information in extendedpossibility-based fuzzy relational databasesrdquo International Jour-nal of Intelligent Systems vol 17 no 10 pp 925ndash942 2002

[27] Z M Ma and L Yan ldquoUpdating extended possibility-basedfuzzy relational databasesrdquo International Journal of IntelligentSystems vol 22 no 3 pp 237ndash258 2007

[28] P Bosc D Dubois and H Prade ldquoFuzzy functional dependen-ciesan overview and a critical discussionrdquo in Proceedings of theIEEE International Conference on Fuzzy Systems pp 325ndash3301994

[29] P Bosc and O Pivert ldquoOn the impact of regular functionaldependencies when moving to a possibilistic database frame-workrdquo Fuzzy Sets and Systems vol 140 no 1 pp 207ndash227 2003

[30] E A Rundensteiner L W Hawkes and W Bandler ldquoOn near-ness measures in fuzzy relational data modelsrdquo InternationalJournal of Approximate Reasoning vol 3 no 3 pp 267ndash2981989

[31] P Bosc D Dubois and H Prade ldquoFuzzy functional depen-dencies and redundancy eliminationrdquo Journal of the AmericanSociety for Information Science vol 49 no 3 pp 217ndash235 1998

[32] Z M Ma W J Zhang and F Mili ldquoFuzzy data compressionbased on data dependenciesrdquo International Journal of IntelligentSystems vol 17 no 4 pp 409ndash426 2002

[33] B Bhuniya and P Niyogi ldquoLossless join property in fuzzyrelational databasesrdquo Data and Knowledge Engineering vol 11no 2 pp 109ndash124 1993

[34] O Bahar andA Yazici ldquoNormalization and lossless join decom-position of similarity-based fuzzy relational databasesrdquo Inter-national Journal of Intelligent Systems vol 19 no 10 pp 885ndash917 2004

[35] T K Bhattacharjee and A K Mazumdar ldquoAxiomatisationof fuzzy multivalued dependencies in a fuzzy relational datamodelrdquo Fuzzy Sets and Systems vol 96 no 3 pp 343ndash352 1998

[36] Z M Ma and L Yan ldquoGeneralization of strategies for fuzzyquery translation in classical relational databasesrdquo Informationand Software Technology vol 49 no 2 pp 172ndash180 2007

[37] J Zhang Q Wei and G Chen ldquoAn efficient incrementalmethod for generating equivalence groups of search results ininformation retrieval and queriesrdquo Knowledge-Based Systemsvol 32 pp 91ndash100 2012

[38] D Dubois and H Prade Fuzzy Sets and Systems Theory andApplications vol 144 ofMathematics in Science and EngineeringAcademic Press New York NY USA 1980


6 Journal of Applied Mathematics

Table 3: Properties of different similarity measures with threshold $\alpha = 1$.

| Properties \ Measures | Equation (5) [16] | Equation (6) [20] | Equation (9) [13] |
|---|---|---|---|
| Transitivity | Yes | No | Yes |
| Result after merging | Lossless | Non-lossless | Approximate lossless |
| Predicates for lossless join | Identical values | Different values | Different values |

For example, given $t \in r(R)$, $t = (\pi_{A_1}, \pi_{A_2}, \ldots, \pi_{A_m})$, and $\Theta = (A_2, A_3, A_4)$, then $t[\Theta] = (\pi_{A_2}, \pi_{A_3}, \pi_{A_4})$.

Projection. Projecting the instance of relation $R$ on attributes $\Theta \subset R$, denoted by $\Pi_\Theta(R)$, is given by

$$\Pi_\Theta(R) = \{\, t[\Theta] \circ t'[\Theta] : t[\Theta] \cong t'[\Theta] \mid t, t' \in r(R) \,\} \tag{12}$$

Natural Join. The natural join of instances of $R(X, Y)$ and $R'(Y, Z)$, denoted by $R \otimes R'$, is defined as follows:

$$R \otimes R' = \{\, (t[X],\ t[Y] \circ t'[Y],\ t'[Z]) : t[Y] \cong t'[Y] \mid t \in r(R),\ t' \in r(R') \,\} \tag{13}$$

In (12) and (13), tuple redundancy is determined by approximate equality (e.g., $t[Y] \cong t'[Y]$; see Definition 2), and both redundancy removal and tuple combination use tuple merging in (11).
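As a rough illustration of (12) and (13), the following Python sketch implements projection and natural join over toy tuples. The names `approx_eq`, `merge`, `project`, and `natural_join` are hypothetical. The bucket-based `approx_eq` and the `min`-based `merge` are only stand-ins for the approximate equality of Definition 2 and the tuple merging of (11); they were chosen because they form an equivalence relation and keep merged values inside the same class, and they are not the measures defined in this paper.

```python
from typing import Dict, List, Sequence

Row = Dict[str, int]  # attribute name -> toy crisp value


def approx_eq(a: int, b: int) -> bool:
    """Toy stand-in for approximate equality: same bucket of width 10 (assumption)."""
    return a // 10 == b // 10


def merge(a: int, b: int) -> int:
    """Toy stand-in for value merging in (11); the result stays in the shared bucket."""
    return min(a, b)


def rows_approx_eq(t: Row, u: Row, attrs: Sequence[str]) -> bool:
    """Tuples are approximately equal on attrs when every listed value is."""
    return all(approx_eq(t[a], u[a]) for a in attrs)


def project(rows: List[Row], attrs: Sequence[str]) -> List[Row]:
    """Sketch of (12): restrict to attrs and merge approximately equal tuples."""
    result: List[Row] = []
    for t in rows:
        t_proj = {a: t[a] for a in attrs}
        for u in result:
            if rows_approx_eq(t_proj, u, attrs):
                for a in attrs:  # fold t_proj into the existing class
                    u[a] = merge(u[a], t_proj[a])
                break
        else:
            result.append(t_proj)
    return result


def natural_join(left: List[Row], right: List[Row],
                 x: Sequence[str], y: Sequence[str], z: Sequence[str]) -> List[Row]:
    """Sketch of (13): combine tuples whose shared part y is approximately equal."""
    joined: List[Row] = []
    for t in left:
        for u in right:
            if rows_approx_eq(t, u, y):
                row = {a: t[a] for a in x}
                row.update({a: merge(t[a], u[a]) for a in y})
                row.update({a: u[a] for a in z})
                joined.append(row)
    return joined


r = [{"A": 3, "X": 12, "Y": 31}, {"A": 27, "X": 15, "Y": 38}, {"A": 3, "X": 41, "Y": 77}]
r_ax = project(r, ["A", "X"])   # no two tuples are approximately equal here
r_xy = project(r, ["X", "Y"])   # the first two tuples collapse into one
r_join = natural_join(r_ax, r_xy, ["A"], ["X"], ["Y"])
```

Because the stand-in equality is transitive, the greedy pass in `project` yields the same classes regardless of tuple order; with a non-transitive measure that would not be guaranteed.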

Proposition 8. The projection result of a relation based on (12) must be unique.

Proof. It can be directly derived from Proposition 5.

Based on the operations (12) and (13), the ALJD is formally defined following the extension of approximate equality from tuple level to relation level in Definition 9.

Definition 9 (approximately equal relation instances). Two relation instances $r(R)$ and $r'(R)$ in the fuzzy database are approximately equal, denoted by $r(R) \approx r'(R)$, if for every tuple $t \in r(R)$ there must exist a tuple $t' \in r'(R)$ such that $t \cong t'$, and vice versa.

Definition 10 (approximate lossless join). A decomposition $\{R_1, R_2, \ldots, R_k\}$ of a relation $R$ in the fuzzy database is an approximate lossless join decomposition if $r(R) \approx (\Pi_{R_1}(R) \otimes \Pi_{R_2}(R) \otimes \cdots \otimes \Pi_{R_k}(R))$.

The approximate lossless join decomposition means that the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of the tuples in the combination result, and vice versa (Definition 9).
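A minimal Python sketch of the relation-level test in Definitions 9 and 10 follows. The function names and the bucket-based tuple predicate `same_bucket` are hypothetical stand-ins for the paper's approximate equality, and the `rejoined` instance is written out by hand rather than computed, so the block only illustrates the comparison step.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
TupleEq = Callable[[Row, Row], bool]


def relations_approx_eq(r1: List[Row], r2: List[Row], tuple_eq: TupleEq) -> bool:
    """Definition 9: every tuple on each side has an approximately equal partner."""
    def covered(src: List[Row], dst: List[Row]) -> bool:
        return all(any(tuple_eq(t, u) for u in dst) for t in src)
    return covered(r1, r2) and covered(r2, r1)


def same_bucket(t: Row, u: Row) -> bool:
    """Toy tuple-level approximate equality (assumption): all values share a bucket of width 10."""
    return t.keys() == u.keys() and all(t[a] // 10 == u[a] // 10 for a in t)


original = [{"A": 3, "X": 12, "Y": 31}, {"A": 27, "X": 15, "Y": 38}, {"A": 3, "X": 41, "Y": 77}]
# Hand-written result of joining the projections of `original` (hypothetical data).
rejoined = [{"A": 3, "X": 12, "Y": 31}, {"A": 27, "X": 12, "Y": 31}, {"A": 3, "X": 41, "Y": 77}]
# The same result with one spurious tuple that matches nothing in `original`.
spurious = rejoined + [{"A": 3, "X": 12, "Y": 52}]

is_lossless = relations_approx_eq(original, rejoined, same_bucket)
is_lossy = not relations_approx_eq(original, spurious, same_bucket)
```

The check is symmetric on purpose: a join that only adds spurious tuples still covers every original tuple, so the "vice versa" direction is what detects the loss.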

Proposition 11. Consider the following:

$$\Pi_\Theta(R) \approx \{\, t[\Theta] \mid t \in r(R) \,\} \tag{14}$$

Proof. It can be derived from (11) and (12).

Corollary 12. Consider the following:

$$\Pi_R(R) \approx r(R) \tag{15}$$

Proof. It can be derived directly from Proposition 11.

The projection of a relation over the same schema, as shown in Corollary 12, represents no operations other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal of a relation instance is approximately equal to the original instance. This property is essential for obtaining the combination result that is approximately equal to the original instance after relation decomposition.

This study proposes FFD for the decomposition in the fuzzy database, as shown below.

Definition 13. The FFD $X \sim> Y$ holds in the relation instance $r(R)$ if, for every $t, t' \in r(R)$, $t[X] \cong t'[X]$ implies $t[Y] \cong t'[Y]$.

Remark 14. An FD in a traditional database is a special case of the FFD. If an FD $X \to Y$ holds in $r(R)$, then $X \sim> Y$ holds as well, because $t[\Theta] \cong t'[\Theta]$ must be true for any $\Theta \subset R$ if $t[\Theta] = t'[\Theta]$.
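Definition 13 can be checked directly on a relation instance. The sketch below is one possible reading, with a hypothetical `holds_ffd` helper that takes the value-level approximate equality as a parameter; the bucket predicate is a toy stand-in, not the paper's measure. Passing plain equality gives the crisp FD check, in the spirit of Remark 14.

```python
from itertools import combinations
from operator import eq
from typing import Callable, Dict, List, Sequence

Row = Dict[str, int]


def holds_ffd(rows: List[Row], x: Sequence[str], y: Sequence[str],
              approx_eq: Callable[[int, int], bool]) -> bool:
    """Definition 13: whenever two tuples agree on x (approximately), they agree on y."""
    def agree(t: Row, u: Row, attrs: Sequence[str]) -> bool:
        return all(approx_eq(t[a], u[a]) for a in attrs)
    # Unordered pairs suffice when approx_eq is reflexive and symmetric.
    return all(agree(t, u, y) for t, u in combinations(rows, 2) if agree(t, u, x))


def same_bucket(a: int, b: int) -> bool:
    """Toy approximate equality (assumption): same bucket of width 10."""
    return a // 10 == b // 10


fuzzy_rows = [{"X": 12, "Y": 31}, {"X": 15, "Y": 38}, {"X": 41, "Y": 77}]
ffd_holds = holds_ffd(fuzzy_rows, ["X"], ["Y"], same_bucket)
ffd_broken = holds_ffd(fuzzy_rows + [{"X": 18, "Y": 52}], ["X"], ["Y"], same_bucket)
crisp_fd = holds_ffd(fuzzy_rows, ["X"], ["Y"], eq)  # ordinary FD check
```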

Lemma 15. Given relations $R$ and $R'$ with $r(R') \approx r(R)$, if $r(R)$ satisfies a set $F$ of FFDs, then $r(R')$ satisfies $F$.

Proof. By contradiction, assume that $r(R') \approx r(R)$ and that there exists an FFD $X \sim> Y$ such that $r(R)$ satisfies $X \sim> Y$ and $r(R')$ does not. Because $X \sim> Y$ holds in $r(R)$,

$$\text{for every } t_1, t_2 \in r(R),\ \text{if } t_1[X] \cong t_2[X] \text{ then } t_1[Y] \cong t_2[Y]. \tag{16}$$

Since $X \sim> Y$ does not hold in $r(R')$, there exist $t'_1, t'_2 \in r(R')$ such that $t'_1[X] \cong t'_2[X]$ and $\eta(t'_1[Y], t'_2[Y]) \neq 1$. Since $r(R') \approx r(R)$, there exist $t, t'' \in r(R)$ such that $t \cong t'_1$ and $t'' \cong t'_2$. Then, by the transitivity of $\cong$, we have $t, t'' \in r(R)$ and $t[X] \cong t''[X]$ but $\eta(t[Y], t''[Y]) \neq 1$, which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including the reflexive rule, augmentation rule, and transitive rule (see endnote 3). This property enables the result of lossless join decomposition to have the dependency preservation property (see endnote 4) [1].


Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that hold in $r(R)$.
Step 1. Let $\mathcal{R} = \{R\}$ and let $F'$ be the set of all $f \in F$ that are not trivial.
Step 2. For an $f'$: $X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.
Step 3. If $X$ is not a key attribute in $R'$, do the following:
    (1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X,Y)}(R')$ and $R_2 = \Pi_{R'(X)}(R')$;
    (2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);
    (3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.
Step 4. Go to Step 2 if $F'$ is not empty.
Output: $\mathcal{R}$, the set of relations decomposed from $R$.
Note: $R'(X)$ represents the list of all attributes in $R'$ other than $X$.

Algorithm 1: ALJD algorithm.
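The steps of Algorithm 1 can be sketched at the schema level in Python. This is an illustrative reading under stated assumptions, not the author's implementation: schemas are sets of attribute names, the key test is supplied by the caller as a hypothetical `is_key` callback, and, following Lemma 16, the second projection is assumed to keep $X$ and drop only the attributes of $Y$. Two small guards are additions of this sketch: every selected FFD is removed from the pending list even when it is skipped (so the loop always terminates), and an FFD that already spans a whole relation is not used to split it.

```python
from typing import Callable, FrozenSet, List, Set, Tuple

Attrs = FrozenSet[str]
FFD = Tuple[Attrs, Attrs]  # (X, Y) stands for X ~> Y


def nontrivial(ffds: List[FFD]) -> List[FFD]:
    """Step 1: drop X ~> Y when some X ~> Y' in F has Y as a proper subset of Y'."""
    return [(x, y) for x, y in ffds
            if not any(x == x2 and y < y2 for x2, y2 in ffds)]


def aljd(schema: Attrs, ffds: List[FFD],
         is_key: Callable[[Attrs, Attrs], bool]) -> Set[Attrs]:
    """Schema-level sketch of Algorithm 1; is_key(x, rel) is a caller-supplied test."""
    relations: Set[Attrs] = {schema}
    pending = nontrivial(ffds)
    while pending:                                   # Step 4
        x, y = pending.pop()                         # Step 2; FFD leaves F' here (sketch guard)
        target = next((r for r in relations if x | y <= r), None)
        if target is None or is_key(x, target):      # Step 3 precondition
            continue
        r1 = x | y                                   # R1 = projection on (X, Y)
        r2 = target - (y - x)                        # R2 keeps X, drops Y (assumption per Lemma 16)
        if r1 == target:                             # nothing left to split off (sketch guard)
            continue
        relations.remove(target)
        relations |= {r1, r2}
    return relations


fs = frozenset
schema = fs({"Emp", "Dept", "Mgr", "Floor"})
ffds = [(fs({"Dept"}), fs({"Mgr"})), (fs({"Dept"}), fs({"Mgr", "Floor"}))]
result = aljd(schema, ffds, is_key=lambda x, rel: x == fs({"Emp"}))
```

In this toy run `Dept ~> Mgr` is filtered out as trivial, and the schema is split once on `Dept ~> {Mgr, Floor}`.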

Lemma 16. Let $R(A, X, Y)$ be a relation and $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$, and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

For the first part, by contradiction, assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$, because $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$. Since $r(R') \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A],\ t_1[X] \circ t_2[X],\ t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A],\ t_1[X] \circ t_2[X],\ t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus $t' \cong t$, which contradicts the assumption.

For the second part, again by contradiction and with renewed symbols, assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$, based on (13). Also, let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1),\quad \hat{t}_2 \in r(R_2),\quad \hat{t}_1 \cong t'[A, X],\quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A,X)}(R)$, based on Proposition 11 there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17), (18), and (19), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (19), and hence $t' \cong t$, which contradicts the assumption.

In Lemma 16, each one of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.

Definition 17. Let $F$ be the set of FFDs. An FFD $f$: $X \sim> Y$ in $F$ is trivial if there exists an FFD $f'$: $X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, more join operations are needed to obtain the original data during query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.
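The claim that the closure of $F$ contains $X \sim> Y$ whenever $F$ contains $X \sim> YZ$ can be spelled out in two steps. The derivation below writes the rules of endnote 3 with $\sim>$ in place of $\to$, which relies on the statement that the FFD of Definition 13 satisfies Armstrong's axioms:

```latex
\begin{align*}
  & YZ \sim> Y  && \text{(IR1, since } Y \subseteq YZ\text{)} \\
  & X \sim> YZ  && \text{(given, } X \sim> YZ \in F\text{)} \\
  & X \sim> Y   && \text{(IR3 applied to } X \sim> YZ \text{ and } YZ \sim> Y\text{)}
\end{align*}
```

The same two steps with $Z \subseteq YZ$ give $X \sim> Z$.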

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of the normalization of traditional databases, where only the FD of nonkey attributes is considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for the fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two distinct tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. Excluding FFDs that contain key attributes prevents unnecessary decomposition of relations that have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations in query processing.
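The generalized key condition above can be tested on an instance as follows. This is a small sketch with a hypothetical helper name, and the bucket predicate again stands in for the paper's approximate equality.

```python
from itertools import combinations
from typing import Callable, Dict, List

Row = Dict[str, int]


def is_key_attribute(rows: List[Row], attr: str,
                     approx_eq: Callable[[int, int], bool]) -> bool:
    """True when no two distinct tuples are approximately equal on attr."""
    return not any(approx_eq(t[attr], u[attr]) for t, u in combinations(rows, 2))


def same_bucket(a: int, b: int) -> bool:
    """Toy approximate equality (assumption): same bucket of width 10."""
    return a // 10 == b // 10


rows = [{"Emp": 5, "Dept": 12}, {"Emp": 25, "Dept": 15}, {"Emp": 47, "Dept": 41}]
emp_is_key = is_key_attribute(rows, "Emp", same_bucket)
dept_is_key = is_key_attribute(rows, "Dept", same_bucket)
```

Here `Emp` qualifies as a key attribute, while `Dept` does not because the first two tuples carry approximately equal `Dept` values.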

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$ and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived by Proposition 11 and Lemma 15. Statement (ii) can be derived by Lemma 16 and the property of FFD (namely, Armstrong's axioms IR1, IR2, and IR3, described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to the issues related to the application of FFDs.

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for the tuples or relations in the fuzzy databases and provides the measure of the approximate equality. The measurement is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in the fuzzy databases. Second, it proposes approximate lossless join decomposition for the fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to the issue of data compression in the fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in the fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for the fuzzy databases are, respectively, the general cases of the traditional FD and lossless join decomposition. The general property is important for dealing with databases containing both crisp data and fuzzy data. Fourth, similar to the existing approaches of database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed approach of decomposition satisfies a degree of lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for the extended possibility-based fuzzy databases is more difficult because the data are more complex.

There are several directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in the fuzzy databases. The notion can also be applied to research on data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in the fuzzy databases. Studies that incorporate the fuzzy concept into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality as the similarity measure. Since the fuzzy concept has been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in the fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of National Science Council NSC102-2410-H-155-036-MY2 and the Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distribution and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \to Y$ if $Y \subseteq X$; IR2 (augmentation rule): if $X \to Y$, then $XZ \to YZ$; IR3 (transitive rule): if $X \to Y$ and $Y \to Z$, then $X \to Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relations decomposed from $R$ or can be represented via Armstrong's inference rules of the FFDs in these relations.

References

[1] S. B. Navathe and R. Elmasri, Fundamentals of Database Systems, Addison-Wesley, New York, NY, USA, 2010.

[2] P. A. Bernstein, J. R. Swenson, and D. C. Tsichritzis, "A unified approach to functional dependencies and relations," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 237–245, 1975.

[3] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM, vol. 13, no. 6, pp. 377–387, 1970.

[4] L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility," Fuzzy Sets and Systems, vol. 1, no. 1, pp. 3–28, 1978.

[5] B. P. Buckles and F. E. Petry, "A fuzzy representation of data for relational databases," Fuzzy Sets and Systems, vol. 7, no. 3, pp. 213–226, 1982.

[6] S. Shenoi and A. Melton, "Proximity relations in the fuzzy relational database model," Fuzzy Sets and Systems, vol. 31, no. 3, pp. 285–296, 1989.

[7] G. Chen, J. Vandenbulcke, and E. E. Kerre, "A general treatment of data redundancy in a fuzzy relational data model," Journal of the American Society for Information Science, vol. 43, no. 4, pp. 304–311, 1992.

[8] H. Prade and C. Testemale, "Generalizing database relational algebra for the treatment of incomplete or uncertain information and vague queries," Information Sciences, vol. 34, no. 2, pp. 115–143, 1984.

[9] J. C. Cubero and M. A. Vila, "New definition of fuzzy functional dependency in fuzzy relational databases," International Journal of Intelligent Systems, vol. 9, no. 5, pp. 441–448, 1994.

[10] K. Raju and A. K. Majumdar, "Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems," ACM Transactions on Database Systems, vol. 13, no. 2, pp. 129–166, 1988.

[11] G. Chen, E. E. Kerre, and J. Vandenbulcke, "Normalization based on fuzzy functional dependency in a fuzzy relational data model," Information Systems, vol. 21, no. 3, pp. 299–310, 1996.

[12] S. Jyothi and M. S. Babu, "Multivalued dependencies in fuzzy relational databases and lossless join decomposition," Fuzzy Sets and Systems, vol. 88, no. 3, pp. 315–332, 1997.

[13] J. Y. Liu, P. Chang, and C. P. C. Yeh, "Consistent data operations for multi-databases in extended possibility-based data models," Expert Systems with Applications, vol. 36, no. 3, pp. 6174–6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, and F. Mili, "Data dependencies in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 3, pp. 321–332, 2002.

[15] J. Y. Liu, "Data integration constraints for consistent data redundancy in fuzzy databases," International Journal of Intelligent Systems, vol. 23, no. 6, pp. 635–653, 2008.

[16] G. Chen, J. Vandenbulcke, and E. E. Kerre, "On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model," in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440–446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, and X. Tang, "Extracting representative information to enhance flexible data queries," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 928–941, 2012.

[18] S. Shenoi, A. Melton, and L. T. Fan, "An equivalence classes model of fuzzy relational databases," Fuzzy Sets and Systems, vol. 38, no. 2, pp. 153–170, 1990.

[19] M. Koyuncu and A. Yazici, "IFOOD: An intelligent fuzzy object-oriented database architecture," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1137–1154, 2003.

[20] Z. Ma, W. Zhang, and W. Ma, "Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 15, no. 8, pp. 705–716, 2000.

[21] V. V. Cross, "Fuzzy extensions for relationships in a generalized object model," International Journal of Intelligent Systems, vol. 16, no. 7, pp. 843–861, 2001.

[22] V. V. Cross, "Defining fuzzy relationships in object models: abstraction and interpretation," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 5–27, 2003.

[23] S. Wang and G. H. Huang, "A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning," Journal of Environmental Management, vol. 117, pp. 208–218, 2013.

[24] S. Wang and G. H. Huang, "An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: An application to water resources management," Stochastic Environmental Research and Risk Assessment, vol. 27, no. 6, pp. 1493–1506, 2013.

[25] S. Wang and G. H. Huang, "Interactive fuzzy boundary interval programming for air quality management under uncertainty," Water, Air, and Soil Pollution, vol. 224, article 1574, 2013.

[26] Z. M. Ma and F. Mili, "Handling fuzzy information in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 10, pp. 925–942, 2002.

[27] Z. M. Ma and L. Yan, "Updating extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 22, no. 3, pp. 237–258, 2007.

[28] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies: an overview and a critical discussion," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325–330, 1994.

[29] P. Bosc and O. Pivert, "On the impact of regular functional dependencies when moving to a possibilistic database framework," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 207–227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, and W. Bandler, "On nearness measures in fuzzy relational data models," International Journal of Approximate Reasoning, vol. 3, no. 3, pp. 267–298, 1989.

[31] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies and redundancy elimination," Journal of the American Society for Information Science, vol. 49, no. 3, pp. 217–235, 1998.

[32] Z. M. Ma, W. J. Zhang, and F. Mili, "Fuzzy data compression based on data dependencies," International Journal of Intelligent Systems, vol. 17, no. 4, pp. 409–426, 2002.

[33] B. Bhuniya and P. Niyogi, "Lossless join property in fuzzy relational databases," Data and Knowledge Engineering, vol. 11, no. 2, pp. 109–124, 1993.

[34] O. Bahar and A. Yazici, "Normalization and lossless join decomposition of similarity-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 19, no. 10, pp. 885–917, 2004.

[35] T. K. Bhattacharjee and A. K. Mazumdar, "Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model," Fuzzy Sets and Systems, vol. 96, no. 3, pp. 343–352, 1998.

[36] Z. M. Ma and L. Yan, "Generalization of strategies for fuzzy query translation in classical relational databases," Information and Software Technology, vol. 49, no. 2, pp. 172–180, 2007.

[37] J. Zhang, Q. Wei, and G. Chen, "An efficient incremental method for generating equivalence groups of search results in information retrieval and queries," Knowledge-Based Systems, vol. 32, pp. 91–100, 2012.

[38] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144 of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.


Journal of Applied Mathematics 7

Inputs: $R$ and $F$, where $R$ is a relation and $F$ is the set of FFDs that exist in $r(R)$.

Step 1. Let $\mathcal{R} = \{R\}$ and let $F'$ be the set of all $f \in F$ that are not trivial.

Step 2. For an $f'\colon X \sim> Y$ in $F'$, let $R'$ be the relation chosen from $\mathcal{R}$ such that both $X$ and $Y$ are in $R'$.

Step 3. If $X$ is not a key attribute in $R'$, do the following:

(1) decompose $R'$ into $R_1$ and $R_2$ such that $R_1 = \Pi_{(X,Y)}(R')$ and $R_2 = \Pi_{R'(X)}(R')$;

(2) let $F' = F' - \{f'\}$ (remove $f'$ from $F'$);

(3) let $\mathcal{R} = \mathcal{R} \cup \{R_1, R_2\} - \{R'\}$.

Step 4. Go to Step 2 if $F'$ is not empty.

Output is $\mathcal{R}$, the set of relations decomposed from $R$.

Note: $R'(X)$ represents the list of all attributes in $R'$ other than $X$.

Algorithm 1: ALJD algorithm.
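The steps of Algorithm 1 can be sketched at the schema level as follows. This is a minimal illustration rather than the author's implementation, and it rests on several assumptions that are not in the paper: relations are modeled only as sets of attribute names, the hypothetical parameter `keys` lists the attribute sets already known to be keys, every FFD is consumed once whether or not it triggers a split (so the loop always terminates), and $R_2$ keeps $X$ and drops $Y$, which is the shape used in Lemma 16.

```python
from typing import FrozenSet, List, Set, Tuple

Attrs = FrozenSet[str]
FFD = Tuple[Attrs, Attrs]  # (X, Y) stands for the dependency X ~> Y


def is_trivial(ffd: FFD, ffds: List[FFD]) -> bool:
    """Definition 17: X ~> Y is trivial if some X ~> Y' in F has Y as a proper subset of Y'."""
    x, y = ffd
    return any(x == x2 and y < y2 for x2, y2 in ffds)


def aljd(schema: Attrs, ffds: List[FFD], keys: Set[Attrs]) -> Set[Attrs]:
    """Schema-level sketch of the ALJD steps (assumptions are listed in the lead-in)."""
    relations: Set[Attrs] = {schema}                        # Step 1
    pending = [f for f in ffds if not is_trivial(f, ffds)]
    while pending:                                          # Step 4
        x, y = pending.pop(0)                               # Step 2
        target = next((r for r in relations if x | y <= r), None)
        if target is None or x in keys:                     # Step 3: skip key determinants
            continue
        r1, r2 = x | y, target - (y - x)                    # Step 3(1), Lemma 16 shape
        if r1 == target or r2 == target:                    # nothing would be split off
            continue
        relations = (relations - {target}) | {r1, r2}       # Step 3(3)
    return relations
```

For a schema with attributes A, X, Y, Z and the FFDs X ~> Y and X ~> YZ, the sketch discards X ~> Y as trivial and splits once on X ~> YZ, giving the two schemas {X, Y, Z} and {A, X}.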

Lemma 16. Let $R(A, X, Y)$ be a relation and $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then the decomposition $\{R_1, R_2\}$ of $R$ has the approximate lossless join property.

Proof. We prove that $r(R) \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ based on Definition 10. Let $R'$ be a relation such that $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$. We first prove that for all $t \in r(R)$ there exists $t' \in r(R')$ with $t' \cong t$, and then prove that for all $t \in r(R')$ there must exist $t' \in r(R)$ with $t' \cong t$.

Proof by contradiction: let us assume that $t \in r(R)$ is a tuple such that no $t' \in r(R')$ satisfies $t' \cong t$. Let $t_1$ and $t_2$ be tuples such that $t_1 = t[A, X]$ and $t_2 = t[X, Y]$. Based on Proposition 11, there must be $\hat{t}_1 \in r(R_1)$ and $\hat{t}_2 \in r(R_2)$ such that $\hat{t}_1 \cong t_1$ and $\hat{t}_2 \cong t_2$, because $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$. Since $r(R') \approx \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$ and $t_1[X] \cong t[X] \cong t_2[X]$, there must exist $t' \in r(R')$ such that $t' \cong (t_1[A], t_1[X] \circ t_2[X], t_2[Y])$ according to (13). Also, since $t_1 = t[A, X]$ and $t_2 = t[X, Y]$, we have $(t_1[A], t_1[X] \circ t_2[X], t_2[Y]) \cong (t[A], t[X], t[Y])$ by Lemma 7. Thus $t' \cong t$, which contradicts the assumption.

Proof by contradiction for the second part, with renewed symbols: assume that $t' \in r(R')$ is a tuple such that no $t \in r(R)$ satisfies $t' \cong t$. Since $r(R') = \Pi_{R_1}(R) \otimes \Pi_{R_2}(R)$, there exist $t_1 \in r(R_1)$ and $t_2 \in r(R_2)$ such that $t_1 \cong t'[A, X]$, $t_2 \cong t'[X, Y]$, and $t_1[X] \cong t_2[X]$ based on (13). Also, we let $\hat{t}_1$ and $\hat{t}_2$ be the tuples such that

$$\hat{t}_1 \in r(R_1), \quad \hat{t}_2 \in r(R_2), \quad \hat{t}_1 \cong t'[A, X], \quad \hat{t}_2 \cong t'[X, Y]. \tag{17}$$

Since $R_1 = \Pi_{(A,X)}(R)$, based on Proposition 11, there must exist $t \in r(R)$ such that

$$t[A, X] \cong \hat{t}_1. \tag{18}$$

Likewise, there must also exist $t'' \in r(R)$ such that

$$t''[X, Y] \cong \hat{t}_2. \tag{19}$$

Based on (17) and (18), we have $t[X] \cong t''[X]$. Because $X \sim> Y$, $t[Y] \cong t''[Y]$ holds based on Definition 13. Thus $t[Y] \cong \hat{t}_2[Y] \cong t'[Y]$ based on (19), and $t' \cong t$, which contradicts the assumption.

In Lemma 16, each one of $A$, $X$, and $Y$ could be a single attribute or a set of attributes.
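The statement of Lemma 16 can be checked on small crisp data, which the paper treats as a special case of the fuzzy setting. The sketch below is only an illustration under stated assumptions: ordinary equality stands in for the approximate equality $\cong$, so the merge operator $\circ$ is not needed, and the function names `project`, `equal_join`, and `rejoins_exactly` are hypothetical helpers, not the paper's operators.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Row = Dict[str, str]
FrozenRow = FrozenSet[Tuple[str, str]]


def project(rows: List[Row], attrs: List[str]) -> Set[FrozenRow]:
    """Projection with duplicate removal, using crisp equality (an assumption)."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def equal_join(left: Set[FrozenRow], right: Set[FrozenRow], on: List[str]) -> Set[FrozenRow]:
    """Join two projected relations on the shared attributes in `on`."""
    joined: Set[FrozenRow] = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in on):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined


def rejoins_exactly(rows: List[Row], a: List[str], x: List[str], y: List[str]) -> bool:
    """True if joining the (A, X) and (X, Y) projections gives back the original tuples."""
    whole = project(rows, a + x + y)
    return whole == equal_join(project(rows, a + x), project(rows, x + y), x)
```

When the dependency from X to Y holds in the data, the rejoined relation matches the original. When two tuples agree on X but differ on Y, the join produces spurious tuples and the check fails.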

Definition 17. Let $F$ be the set of FFDs. An FFD $f\colon X \sim> Y$ in $F$ is trivial if there exists an FFD $f'\colon X \sim> Y'$ in $F$ such that $Y \subset Y'$.

Based on Armstrong's inference rules IR1 and IR3 (see endnote 3), if a set $F$ of FFDs contains $X \sim> YZ$, the closure of $F$ will also contain $X \sim> Y$ and $X \sim> Z$, which are trivial. When a relation is decomposed into more relations, it takes more join operations to obtain the original data during query processing. Considering the cost of join operations, it is not efficient to decompose a relation that is already in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites, as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if $X$ is not a key in $r(R)$, then $R$ will be decomposed based on $X \sim> YZ$ rather than on the trivial FFD $X \sim> Y$ or $X \sim> Z$. Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).
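The claim that $X \sim> YZ$ yields $X \sim> Y$ and $X \sim> Z$ can be written out step by step. The derivation below assumes, as the text states, that the inference rules IR1 and IR3 listed in the endnotes carry over from FDs to FFDs.

```latex
\begin{align*}
& YZ \sim> Y && \text{by IR1, since } Y \subseteq YZ \\
& X \sim> YZ \ \text{and}\ YZ \sim> Y \ \Longrightarrow\ X \sim> Y && \text{by IR3} \\
& YZ \sim> Z && \text{by IR1, since } Z \subseteq YZ \\
& X \sim> YZ \ \text{and}\ YZ \sim> Z \ \Longrightarrow\ X \sim> Z && \text{by IR3}
\end{align*}
```

Both derived FFDs share the left-hand side $X$ with $X \sim> YZ$ and have a right-hand side properly contained in $YZ$, so they are trivial in the sense of Definition 17.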

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of normalization in traditional databases, where only the FDs of nonkey attributes are considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for the fuzzy databases; namely, an attribute $A$ is a key attribute in $R$ if there do not exist two tuples $t$ and $t'$ in $r(R)$ such that $t[A] \cong t'[A]$. Excluding FFDs that contain key attributes prevents unnecessary decomposition of relations that have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations in query processing.
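The generalized key test can be sketched as a pairwise check over the tuples of a relation. In this minimal sketch the parameter `approx_equal` is a hypothetical placeholder for the paper's approximate-equality measure $\cong$, which is not reproduced here; it defaults to crisp equality, and tuples are modeled as plain dictionaries.

```python
from itertools import combinations
from typing import Any, Callable, Dict, List


def is_key_attribute(
    rows: List[Dict[str, Any]],
    attr: str,
    approx_equal: Callable[[Any, Any], bool] = lambda u, v: u == v,
) -> bool:
    """True if no two tuples in `rows` are approximately equal on `attr`."""
    return not any(
        approx_equal(s[attr], t[attr]) for s, t in combinations(rows, 2)
    )
```

With crisp equality this reduces to the usual uniqueness test. Passing a coarser comparison, for example one that treats ages in the same decade as equal, makes the same attribute fail the key test once two tuples fall into one group.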

Proposition 18. Let $R(A, X, Y)$ be a relation and let $X \sim> Y$ be an FFD in $r(R)$. If $R_1 = \Pi_{(A,X)}(R)$ and $R_2 = \Pi_{(X,Y)}(R)$, then (i) each FFD existing in $r(R_1)$ or $r(R_2)$ must exist in $r(R)$, and (ii) every FFD existing in $r(R)$ must either exist in $r(R_1)$ or $r(R_2)$ or be derived via FFDs in $r(R_1)$ and $r(R_2)$.

Proof. Statement (i) can be derived from Proposition 11 and Lemma 15. Statement (ii) can be derived from Lemma 16 and the properties of FFDs (namely, Armstrong's axioms IR1, IR2, and IR3, described in endnote 3).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important for issues related to the application of FFDs.
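Closure preservation can be illustrated with the standard attribute-closure computation from classical FD theory. The sketch below is a simplification under stated assumptions: it works on crisp, schema-level dependencies, and it counts a dependency as available in a decomposed relation only if it is listed in `fds` and fits entirely inside that relation. It is therefore a sufficient check, not a complete one, and the names `closure` and `preserved` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

Attrs = FrozenSet[str]
FD = Tuple[Attrs, Attrs]  # (X, Y) stands for X -> Y


def closure(attrs: Iterable[str], fds: List[FD]) -> Attrs:
    """Attributes derivable from `attrs` by repeatedly applying the dependencies in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def preserved(fds: List[FD], parts: List[Attrs]) -> bool:
    """True if every dependency follows from those that fit inside a single part."""
    local = [(l, r) for l, r in fds if any(l | r <= p for p in parts)]
    return all(r <= closure(l, local) for l, r in fds)
```

For the decomposition into (A, X) and (X, Y), a dependency from A to Y is recovered through A to X and X to Y when both are present, which mirrors statement (ii) of Proposition 18.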

5. Conclusion

The contribution of this work is fourfold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for the tuples or relations in the fuzzy databases and provides a measure of the approximate equality. The measure is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has a unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in the fuzzy databases. Second, it proposes approximate lossless join decomposition (ALJD) for the fuzzy databases and defines two operations, projection and equal join, for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to data compression in the fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in the fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for the fuzzy databases are, respectively, general cases of the traditional FD and lossless join decomposition. This generality is important for dealing with databases containing both crisp data and fuzzy data. Fourth, similar to the existing approaches to database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed decomposition approach satisfies a degree of lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for the extended possibility-based fuzzy databases is more difficult because the data are more complex.
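Because the measure is reflexive, symmetric, and transitive, it behaves as an equivalence relation, so tuples can be grouped by comparing each one against a single representative per class. The sketch below shows this under an assumption: the parameter `equivalent` is a hypothetical stand-in for the paper's approximate equality, which is not reproduced here.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def partition(items: List[T], equivalent: Callable[[T, T], bool]) -> List[List[T]]:
    """Group items into disjoint classes; one comparison per class is enough
    only because `equivalent` is assumed to be an equivalence relation."""
    classes: List[List[T]] = []
    for item in items:
        for cls in classes:
            if equivalent(cls[0], item):   # compare with the class representative
                cls.append(item)
                break
        else:
            classes.append([item])         # no matching class: start a new one
    return classes
```

The resulting classes depend on the input order only in how they are listed, not in their membership. A tolerance test such as `abs(u - v) < eps` would not qualify as `equivalent`, since it is not transitive, and the grouping could then change with the order in which tuples are processed.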

There are several directions for future work. Future studies can adopt the notion of approximate equality to define data operations for query processing in the fuzzy databases. Research can apply the notion to work on data compression, fuzzy association rules, missing value prediction, relation compactness, and integrity constraints in the fuzzy databases. Studies that incorporate the fuzzy concept into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations can adopt the approximate equality as the similarity measure. Since the fuzzy concept has been incorporated into object-oriented databases in the literature, future work can provide the approximate equality specifically for the data in fuzzy object-oriented data models.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to acknowledge the support of National Science Council NSC102-2410-H-155-036-MY2 and Innovation Center for Big Data & Digital Convergence.

Endnotes

1. Different orders of removing redundant tuples could lead to different results.

2. For further details on possibility distribution and on the difference between possibility and probability measures, the reader is referred to [38].

3. IR1 (reflexive rule): $X \to Y$ if $Y \subseteq X$. IR2 (augmentation rule): if $X \to Y$, then $XZ \to YZ$. IR3 (transitive rule): if $X \to Y$ and $Y \to Z$, then $X \to Z$ (see [1]).

4. Each FFD in $r(R)$ either directly exists in some individual relation decomposed from $R$ or can be derived via Armstrong's inference rules from the FFDs in these relations.

References

[1] S. B. Navathe and R. Elmasri, Fundamentals of Database Systems, Addison-Wesley, New York, NY, USA, 2010.

[2] P. A. Bernstein, J. R. Swenson, and D. C. Tsichritzis, "A unified approach to functional dependencies and relations," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 237–245, 1975.

[3] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM, vol. 13, no. 6, pp. 377–387, 1970.

[4] L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility," Fuzzy Sets and Systems, vol. 1, no. 1, pp. 3–28, 1978.

[5] B. P. Buckles and F. E. Petry, "A fuzzy representation of data for relational databases," Fuzzy Sets and Systems, vol. 7, no. 3, pp. 213–226, 1982.

[6] S. Shenoi and A. Melton, "Proximity relations in the fuzzy relational database model," Fuzzy Sets and Systems, vol. 31, no. 3, pp. 285–296, 1989.

[7] G. Chen, J. Vandenbulcke, and E. E. Kerre, "A general treatment of data redundancy in a fuzzy relational data model," Journal of the American Society for Information Science, vol. 43, no. 4, pp. 304–311, 1992.

[8] H. Prade and C. Testemale, "Generalizing database relational algebra for the treatment of incomplete or uncertain information and vague queries," Information Sciences, vol. 34, no. 2, pp. 115–143, 1984.

[9] J. C. Cubero and M. A. Vila, "New definition of fuzzy functional dependency in fuzzy relational databases," International Journal of Intelligent Systems, vol. 9, no. 5, pp. 441–448, 1994.

[10] K. Raju and A. K. Majumdar, "Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems," ACM Transactions on Database Systems, vol. 13, no. 2, pp. 129–166, 1988.

[11] G. Chen, E. E. Kerre, and J. Vandenbulcke, "Normalization based on fuzzy functional dependency in a fuzzy relational data model," Information Systems, vol. 21, no. 3, pp. 299–310, 1996.

[12] S. Jyothi and M. S. Babu, "Multivalued dependencies in fuzzy relational databases and lossless join decomposition," Fuzzy Sets and Systems, vol. 88, no. 3, pp. 315–332, 1997.

[13] J. Y. Liu, P. Chang, and C. P. C. Yeh, "Consistent data operations for multi-databases in extended possibility-based data models," Expert Systems with Applications, vol. 36, no. 3, pp. 6174–6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, and F. Mili, "Data dependencies in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 3, pp. 321–332, 2002.

[15] J. Y. Liu, "Data integration constraints for consistent data redundancy in fuzzy databases," International Journal of Intelligent Systems, vol. 23, no. 6, pp. 635–653, 2008.

[16] G. Chen, J. Vandenbulcke, and E. E. Kerre, "On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model," in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440–446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, and X. Tang, "Extracting representative information to enhance flexible data queries," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 928–941, 2012.

[18] S. Shenoi, A. Melton, and L. T. Fan, "An equivalence classes model of fuzzy relational databases," Fuzzy Sets and Systems, vol. 38, no. 2, pp. 153–170, 1990.

[19] M. Koyuncu and A. Yazici, "IFOOD: an intelligent fuzzy object-oriented database architecture," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1137–1154, 2003.

[20] Z. Ma, W. Zhang, and W. Ma, "Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 15, no. 8, pp. 705–716, 2000.

[21] V. V. Cross, "Fuzzy extensions for relationships in a generalized object model," International Journal of Intelligent Systems, vol. 16, no. 7, pp. 843–861, 2001.

[22] V. V. Cross, "Defining fuzzy relationships in object models: abstraction and interpretation," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 5–27, 2003.

[23] S. Wang and G. H. Huang, "A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning," Journal of Environmental Management, vol. 117, pp. 208–218, 2013.

[24] S. Wang and G. H. Huang, "An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: an application to water resources management," Stochastic Environmental Research and Risk Assessment, vol. 27, no. 6, pp. 1493–1506, 2013.

[25] S. Wang and G. H. Huang, "Interactive fuzzy boundary interval programming for air quality management under uncertainty," Water, Air, and Soil Pollution, vol. 224, article 1574, 2013.

[26] Z. M. Ma and F. Mili, "Handling fuzzy information in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 17, no. 10, pp. 925–942, 2002.

[27] Z. M. Ma and L. Yan, "Updating extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 22, no. 3, pp. 237–258, 2007.

[28] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies: an overview and a critical discussion," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325–330, 1994.

[29] P. Bosc and O. Pivert, "On the impact of regular functional dependencies when moving to a possibilistic database framework," Fuzzy Sets and Systems, vol. 140, no. 1, pp. 207–227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, and W. Bandler, "On nearness measures in fuzzy relational data models," International Journal of Approximate Reasoning, vol. 3, no. 3, pp. 267–298, 1989.

[31] P. Bosc, D. Dubois, and H. Prade, "Fuzzy functional dependencies and redundancy elimination," Journal of the American Society for Information Science, vol. 49, no. 3, pp. 217–235, 1998.

[32] Z. M. Ma, W. J. Zhang, and F. Mili, "Fuzzy data compression based on data dependencies," International Journal of Intelligent Systems, vol. 17, no. 4, pp. 409–426, 2002.

[33] B. Bhuniya and P. Niyogi, "Lossless join property in fuzzy relational databases," Data and Knowledge Engineering, vol. 11, no. 2, pp. 109–124, 1993.

[34] O. Bahar and A. Yazici, "Normalization and lossless join decomposition of similarity-based fuzzy relational databases," International Journal of Intelligent Systems, vol. 19, no. 10, pp. 885–917, 2004.

[35] T. K. Bhattacharjee and A. K. Mazumdar, "Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model," Fuzzy Sets and Systems, vol. 96, no. 3, pp. 343–352, 1998.

[36] Z. M. Ma and L. Yan, "Generalization of strategies for fuzzy query translation in classical relational databases," Information and Software Technology, vol. 49, no. 2, pp. 172–180, 2007.

[37] J. Zhang, Q. Wei, and G. Chen, "An efficient incremental method for generating equivalence groups of search results in information retrieval and queries," Knowledge-Based Systems, vol. 32, pp. 91–100, 2012.

[38] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144 of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.








