
Date post: 16-Aug-2021

A FUZZY DATALOG DEDUCTIVE DATABASE SYSTEM 1

A Fuzzy Datalog Deductive Database System

Pascual Julián-Iranzo, Fernando Sáenz-Pérez

Abstract: This paper describes a proposal for a deductive database system with fuzzy Datalog as its query language. Concepts supporting the fuzzy logic programming system Bousi∼Prolog are tailored to the needs of the deductive database system DES. We develop a version of fuzzy Datalog where programs and queries are compiled to the DES core Datalog language. Weak unification and weak SLD resolution are adapted for this setting, and extended to allow rules with truth degree annotations. We provide a public implementation in Prolog which is open-source, multiplatform, portable, and in-memory, featuring a graphical user interface. A distinctive feature of this system is that, unlike others, we have formally demonstrated that our implementation techniques fit the proposed operational semantics. We also study the efficiency of these implementation techniques through a series of detailed experiments. Moreover, a database example for a recommender system is used to illustrate some of the features of the system and its usefulness.

Index Terms: Deductive Database, Fuzzy Logic Programming, Fuzzy Prolog, Weak Unification, Bousi∼Prolog, Datalog Educational System.

I. INTRODUCTION

Fuzzy Logic Programming integrates concepts coming from fuzzy logic [1] into pure logic programming [2] in order to deal with imprecise information, uncertainty and/or vagueness by using declarative techniques. In the last decade there has been a renewed interest in this amalgamation, as revealed in the works [3], [4]. There is no common method for introducing fuzzy concepts into logic programming. In particular, Bousi∼Prolog [5], [6], an extension of the Prolog language, follows the approach of [7], where the syntactic unification mechanism of classical SLD resolution is replaced by a weak unification algorithm based on proximity/similarity relations. This algorithm provides a weak most general unifier as well as a numerical value, called the approximation degree. Intuitively, the approximation degree represents the truth degree associated with the (query) computed instance. Programs written in this language consist, in essence, of a set of ordinary (Prolog) clauses together with a set of "proximity equations", which play an important role during the unification process.

Datalog [8] is a query language for deductive databases that can be seen as a syntactic subset of Prolog. Pure Datalog is a truly declarative language because neither the order of rules nor the order of goals in a program affects its operational semantics (in particular, non-logical constructs such as the cut are disallowed), but it is not Turing-complete, as it is meant to be a database query language.

Pascual Julián-Iranzo: Dep. of Information Technologies and Systems, University of Castilla-La Mancha. E-mail: [email protected]

Fernando Sáenz-Pérez: Dep. of Software Engineering and Artificial Intelligence, Universidad Complutense de Madrid. E-mail: [email protected]

This work has been partially supported by FEDER and the Spanish Ministry of Economy and Competition under grant TIN2016-76843-C4-2-R, projects CAVI-ART (TIN2013-44742-C4-3-R) and CAVI-ART-2 (TIN2017-86217-R), and Madrid regional project N-GREENS Software-CM (S2013/ICE-2731). We would like to thank the anonymous referees for helping us improve this paper with their suggestions.

Fuzzy Datalog [9] is an extension of Datalog-like languages using lower bounds of uncertainty degrees in facts and rules. Related proposals such as [10], [11] explicitly include the computation of the rule degree, as well as an additional argument to represent this degree. However, as in Bousi∼Prolog, we are interested in removing this burden from the user with an automatic rule transformation that elides both the degree argument and the explicit call to fuzzy connective computations in user rules. In addition, we provide support for several proximity/similarity relations, as in Bousi∼Prolog.

Thus, in this paper, we are interested in the implementation of a deductive fuzzy database with such features by extending the Datalog Educational System (DES) [12] into a system that we call FuzzyDES from here on. The DES system is a deductive database targeted at teaching databases and their query languages. We focus on extending Datalog with fuzzy relations (i.e., relations where each data tuple is associated with an approximation degree), and with weak unification and resolution. In contrast to a fuzzy Prolog system, answers are (multi)sets instead of single answers retrieved via backtracking. Whereas queries solved under SLD resolution can easily fail to terminate, queries under tabled SLD resolution [13] are ensured to terminate for user predicates, which is a natural requirement in the database arena (cf. SQL). Due to the interactive nature of DES, we keep this feature in the fuzzy setting to let users play and interact with the system by providing commands to assert new rules and facts on the fly.

For the implementation, we transfer the techniques developed for the Bousi∼Prolog system into the DES system. In general, following the approach of the high-level implementation of Bousi∼Prolog, a FuzzyDES program is compiled into a set of core Datalog rules, which in turn are interpreted by a deductive engine implemented in Prolog. The translation includes calls to a collection of auxiliary predicates able to reproduce Weak SLD resolution with graded rules (i.e., rules annotated with weights acting as truth degrees). Since Datalog differs from Prolog (e.g., non-compound terms and ground answers), we adapt and specialize the Weak SLD resolution (WSLD) procedure found in the Bousi∼Prolog system to DES. A distinctive feature of the resulting system is that, unlike others, we have formally demonstrated that our techniques for implementing the WSLD resolution operational semantics are correct, i.e., they produce the expected answers and only those. We also study the efficiency of these implementation techniques through a series of detailed experiments.

To the best of our knowledge, there is no publicly available implementation of a fuzzy Datalog system like the one we propose in this work. This paper describes our approach to developing such a system. Our motivation is that, although SQL (as a query language) dominates the database panorama, over the years SQL has become quite complex and some statements have started to cause problems for end users and even for professionals. Also, one of the main advantages of deductive databases is the ability to specify recursive rules without the limitations present in SQL systems (linearity and termination control). This, together with the addition of fuzzy features, can facilitate the flexible use of databases and knowledge bases. Additionally, in recent years we have witnessed an increased interest in recursive Datalog queries in a variety of application domains such as data integration, information extraction, networking, program analysis, security, and even cloud computing (cf. Section III). Finally, it is important to emphasize that our system has been specifically designed to facilitate flexible fuzzy querying and question answering while encapsulating fuzzy management. This is a distinctive feature of our system, absent from others based on more traditional techniques, in which the user has to set certain parameters (such as truth degrees and their manipulation) when specifying queries and rules. It is a characteristic inherited from Bousi∼Prolog that allows logic, vague knowledge and control to be specified completely separately. All of these are valuable features that facilitate the modelling of knowledge systems from a more declarative perspective.

The following section gives a practical motivation for this work by describing an application example of FuzzyDES, showing its ability to model uncertain knowledge and facilitate flexible query answering.¹

II. A MOTIVATING EXAMPLE

Recommender systems are an effective way to assist people by offering advice on finding suitable products and services to facilitate online decision-making [14]. Subjective information (such as how a person evaluates a product or service, and how much another person trusts an opinion depending on its confidence) can be specified with linguistic, fuzzy information. In order to motivate our work, as an example of a fuzzy database in DES, we apply these ideas to modelling a small recommender system for restaurants in Madrid. We consider people interested in asking questions relating the location of the restaurant with respect to their own location, the quality of the restaurant in terms of others' opinions, and the type of food served. Thus, we can distinguish two fuzzy relations: a proximity relation near (reflexive and symmetric) for representing walking distances, and a predefined similarity relation ~ (which is, in addition, transitive) for representing quality degrees. These are defined in the next code excerpt of the database file, which can be downloaded from http://des.sourceforge.net/fuzzy/recommender.dl.

    :-fuzzy_relation(near,[reflexive,symmetric]).
    sol near callao = 0.6.
    sol near cruz = 0.5.
    callao near plaza_españa = 0.4.

    :-fuzzy_relation(~,[reflexive,symmetric,transitive]).
    plain~good=0.5.
    good~very_good=0.5.
    very_good~excellent=0.3.

¹ The documentation in the downloadable system contains some other application examples and all the available commands.

The first assertion fuzzy_relation defines the operator near as a proximity relation for walking distance, with the reflexive and symmetric properties. Analogously, a similarity relation ~ is explicitly defined next (in fact, this assertion can be removed since ~ is defined as such by default). Proximity equations, such as plain~good=0.5, specify the degree of semantic similarity between two different syntactic symbols. In this particular case, it states that plain and good are similar with approximation degree 0.5.

A fuzzy relation confidence/1 defines the degree of reliability that the opinion of a given type of user deserves. For example, a local guide is assumed to be a person who contributes serious comments to the database, whereas normal and casual users may not, so each type is given a different confidence:

    confidence(local_guide) with 0.9.
    confidence(normal_user) with 0.5.
    confidence(casual_user) with 0.3.

Next, several relations are defined with facts in the database. The relation restaurant/3 relates the name of a restaurant, its location, and the food served. Users and their types are related in the relation user/2. Finally, user comments are represented in the relation comment/3, relating user, restaurant, and comment.

    restaurant(don_oso,cruz,burguer).
    restaurant(rodilla,callao,snacks).
    restaurant(roque,sol,rice).
    restaurant(tagliatella,benavente,italian).

    user(juan,local_guide).
    user(sara,normal_user).
    user(pepe,casual_user).

    comment(juan,don_oso,plain).
    comment(juan,rodilla,good).
    comment(pepe,roque,excellent).
    comment(sara,tagliatella,very_good).

A recommendation relies on the quality of a restaurant, which is defined with a single rule that takes into account the comment a user has provided, the type of this user, and the confidence assigned to that type:

    quality(Restaurant,Quality) :-
      comment(User,Restaurant,Quality),
      user(User,Type),
      confidence(Type).

So, in order to provide recommendations, the rule recommend relates restaurants (Restaurant) to the location of the user (Origin), the food served (Food), and an acknowledged quality (Quality):

    recommend(Origin,Food,Quality,Restaurant) :-
      restaurant(Restaurant,Location,Food),
      Location near Origin,
      quality(Restaurant,Quality).

Contrary to Fuzzy Prolog implementations, the set-oriented approach of deductive databases makes queries return all the answers, as in relational databases, so that a query to this database can return several recommendations with different approximation degrees at once. For instance, the query in the following system session returns recommendations for a user located at sol:

    DES> /system_mode fuzzy
    FDES> /consult recommender
    Info: 22 rules consulted.
    FDES> recommend(sol,Food,Quality,Restaurant)
    Info: Processing:
      answer(Food,Quality,Restaurant) :-
        recommend(sol,Food,Quality,Restaurant).
    {
      answer(snacks,good,rodilla) with 0.6,
      answer(burguer,plain,don_oso) with 0.5,
      answer(rice,excellent,roque) with 0.3
    }

In this session, after switching the deductive system to the fuzzy setting, the command consult loads the database located in the given file (with default extension .dl). The query returns all possible restaurants, indicating the type of food served, the quality, and the recommended restaurant. Answer tuples are ordered by default by descending approximation degree. The first tuple indicates that rodilla has a support degree of 0.6, a value that takes into account that the restaurant is located at callao, which is near sol. Also, there is a comment from the local guide juan (with a support of 0.9) stating that the restaurant is good. The next tuple in the answer refers to the restaurant don_oso, which receives a support degree of 0.5 with quality plain, because the same local guide commented on this restaurant with this quality level and the restaurant is located at cruz, near sol. The last tuple receives a small support because the comment was made by a user whose type is scored rather low.
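The degrees in this answer can be reproduced by hand. The following Python sketch is only an illustration, not part of FuzzyDES: it assumes that the degrees of the body literals are combined with the minimum (Gödel) t-norm, which is consistent with the values shown in the session, and the names `near` and `recommend` are chosen here to mirror the example's relations.

```python
# Proximity equations for `near`, stored once; the lookup below supplies
# reflexivity and symmetry.
NEAR = {("sol", "callao"): 0.6, ("sol", "cruz"): 0.5,
        ("callao", "plaza_españa"): 0.4}
CONFIDENCE = {"local_guide": 0.9, "normal_user": 0.5, "casual_user": 0.3}
RESTAURANT = [("don_oso", "cruz", "burguer"), ("rodilla", "callao", "snacks"),
              ("roque", "sol", "rice"), ("tagliatella", "benavente", "italian")]
USER = {"juan": "local_guide", "sara": "normal_user", "pepe": "casual_user"}
COMMENT = [("juan", "don_oso", "plain"), ("juan", "rodilla", "good"),
           ("pepe", "roque", "excellent"), ("sara", "tagliatella", "very_good")]


def near(a, b):
    """Degree to which two locations are near (reflexive and symmetric)."""
    if a == b:
        return 1.0
    return NEAR.get((a, b), NEAR.get((b, a), 0.0))


def recommend(origin):
    """Return ((food, quality, restaurant), degree) pairs, best first."""
    answers = {}
    for name, location, food in RESTAURANT:
        for user, commented, quality in COMMENT:
            if commented != name:
                continue
            # Assumption: body degrees are combined with the minimum t-norm.
            degree = min(near(location, origin), CONFIDENCE[USER[user]])
            if degree > 0:
                key = (food, quality, name)
                answers[key] = max(answers.get(key, 0.0), degree)
    return sorted(answers.items(), key=lambda item: -item[1])


for answer, degree in recommend("sol"):
    print(answer, "with", degree)
```

Under these assumptions, rodilla obtains min(0.6, 0.9) = 0.6, don_oso obtains min(0.5, 0.9) = 0.5, and roque obtains min(1.0, 0.3) = 0.3, while tagliatella is discarded because benavente is not related to sol.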

We can interactively add information about the type of food served with:

    FDES> /assert burguer~fast_food=0.7.
    FDES> /assert snacks~fast_food=0.9.

Then, a similar query looking for good fast food near sol (with no need to reload the database) would retrieve:

    FDES> recommend(sol,fast_food,good,Restaurant)
    Info: Processing:
      answer(Restaurant) :-
        recommend(sol,fast_food,good,Restaurant).
    {
      answer(rodilla) with 0.6,
      answer(don_oso) with 0.5
    }

There are really two tuples for don_oso fulfilling the question, with the same support degree (one for plain and another for good quality). Due to the default set-oriented behavior of the system, duplicates are removed. However, duplicates can be enabled with the command /duplicates on, and the same query would then return 3 tuples.²

An approximation degree threshold (λ-cut) can be stated with the command /lambda_cut Value, which prunes computations as soon as a degree lower than the threshold is computed as a result of an application of the t-norm △, which is a binary truth function generalizing classical conjunction. In this last example, the second answer would be removed for a λ-cut of 0.55.
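The pruning effect of a λ-cut can be sketched as follows. This is a hypothetical Python illustration rather than the system's code: it assumes the minimum t-norm and strict pruning below the threshold, and the per-literal degrees are read off from the example (food similarity, walking proximity, quality similarity, and user confidence, in that order).

```python
def conjoin_with_cut(degrees, lambda_cut=0.0):
    """Fold degrees with the minimum t-norm, abandoning the computation
    as soon as the running degree drops below the lambda-cut."""
    current = 1.0
    for degree in degrees:
        current = min(current, degree)
        if current < lambda_cut:
            return None  # pruned: no need to evaluate the remaining literals
    return current


# Degrees contributed by each step for the query
# recommend(sol,fast_food,good,Restaurant):
CANDIDATES = {
    # snacks~fast_food, callao near sol, good~good, confidence(local_guide)
    "rodilla": [0.9, 0.6, 1.0, 0.9],
    # burguer~fast_food, cruz near sol, plain~good, confidence(local_guide)
    "don_oso": [0.7, 0.5, 0.5, 0.9],
}


def answers(lambda_cut):
    """Keep the candidates whose computation survives the lambda-cut."""
    result = {}
    for name, degrees in CANDIDATES.items():
        degree = conjoin_with_cut(degrees, lambda_cut)
        if degree is not None:
            result[name] = degree
    return result


print(answers(0.0))
print(answers(0.55))
```

With no threshold both restaurants are returned, whereas with a λ-cut of 0.55 the computation for don_oso is abandoned at its second step, where the running degree falls to 0.5.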

Facts and rules can also be interactively added (with the command /assert Rule, or from a file with /reconsult File) so that the database does not have to be recompiled each time it is modified. Finally, a query for the predicate recommend with variables in all its arguments would return all possible recommendations. (The reader is encouraged to try the system with different queries.)

² Nonetheless, as will be explained later in Subsection IV-F, enabling answer subsumption automatically removes output tuples with equal or lesser approximation degrees.

III. DATALOG EDUCATIONAL SYSTEM

The Datalog Educational System (DES) [12] is a free, open-source, multiplatform, portable, in-memory, Prolog-based implementation of a deductive database system. DES 5.0.1 (des.sourceforge.net) is the current release, which includes several query languages: Datalog, SQL, Relational Algebra and Relational Calculi. Fuzzy Datalog is the main addition for this new major release. DES features tabling, types, integrity constraints, stratified negation [8], persistency, full-fledged arithmetic, ODBC connections to relational databases, novel approaches to Datalog and SQL declarative debugging [15], [16], test case generation for SQL views [17], null value support, outer join, and aggregate predicates and functions [12]. The system is used in many universities worldwide (e.g., Imperial College London (UK), TU München (Germany), Université Lille 1 (France), UCLA (USA) and The University of Sydney (Australia)) for teaching deductive databases (see http://des.sourceforge.net/html/what s des for .html, in its web page, for an extensive list of institutions). It is also used as a test-bed for research (Mozilla leaks, smart grid networks, deductive data warehouses, declarative debugging, ...), and it has had more than 71K downloads to date.

A. Syntax

A deductive database in DES is defined by a normal logic program with rules of the form A ← Q, where the body Q ≡ L1 ∧ ... ∧ Ln is a query, the head A is an atom, and each Li is a literal. In this paper, we restrict ourselves to definite clauses (with positive literals). Thus, a literal can be a call to either a defined predicate or a built-in predicate. A disjunctive rule A ← Q1 ∨ Q2 is understood as the two rules A ← Q1 and A ← Q2. Textual syntax in the system follows the Prolog convention: variables start with either an upper-case letter or an underscore, and predicate symbols and string constants either start with a lower-case letter or are delimited by single quotes. The left implication ← is written as :-, conjunction as a comma (,), and disjunction as a semicolon (;). We use the following pairs of terms interchangeably: predicate and relation, clause and rule, goal and query.

B. Safe Databases

Though the Datalog language might be understood as a syntactic subset of Prolog, it is claimed to be a truly declarative language (see, e.g., [18]) in the sense that rule and goal ordering in a program does not affect declarative semantics. However, as a database language, users expect termination and finite answers, so some restrictions are imposed on programs. The first restriction concerns safe rules [8], [19], which ensure closed (ground) answers. Open answers (i.e., those including unbound variables) are not allowed as they can represent infinitely many tuples (for example, the answer p(X) represents ∀X p(X), over a domain which is not defined as finite in the database). A second restriction refers to built-ins because, whereas user-defined predicates represent finite relations, some useful built-ins represent infinite relations. For example, a comparison operator (\=, >, ...) relates two arguments, representing a relation that holds for all the possible elements fulfilling the comparison (e.g., X<Y represents all the pairs (X,Y) such that X is less than Y). To ensure a finite answer, these operators demand ground arguments.
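As a rough illustration of these two restrictions, the Python sketch below checks the classical range-restriction condition on a single rule. It is a simplification written for this text and is not the check that DES performs: the rule encoding, the names `is_safe` and `BUILTIN_COMPARISONS`, and the set of operators are all assumptions.

```python
# Hypothetical set of comparison built-ins that need ground arguments.
BUILTIN_COMPARISONS = {"<", ">", "=<", ">=", "\\="}


def is_variable(term):
    """Prolog convention: variables start with upper case or underscore."""
    return isinstance(term, str) and term != "" and (
        term[0].isupper() or term[0] == "_")


def variables(args):
    return {term for term in args if is_variable(term)}


def is_safe(head, body):
    """A rule is (classically) safe if every variable in the head and in a
    comparison built-in also occurs in an ordinary body literal, which
    draws its values from a finite relation."""
    limited = set()
    needed = variables(head[1])
    for predicate, args in body:
        if predicate in BUILTIN_COMPARISONS:
            needed |= variables(args)
        else:
            limited |= variables(args)
    return needed <= limited


# p(X) :- q(X).              safe
print(is_safe(("p", ("X",)), [("q", ("X",))]))
# p(X).                      unsafe: X is unbound, an open answer
print(is_safe(("p", ("X",)), []))
# less(X,Y) :- r(X), X<Y.    unsafe: Y is not ground when X<Y is called
print(is_safe(("less", ("X", "Y")), [("r", ("X",)), ("<", ("X", "Y"))]))
```

Adding a literal such as s(Y) to the body of the last rule would make it safe, since both arguments of the comparison would then be bound by finite relations.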

C. Stratification and Aggregates

Deductive system implementations require some form of fixpoint computation to deduce the meaning of programs and queries. Keeping this computation monotonic is achieved by the well-known technique called stratification [8], which is a syntactic restriction that rejects programs which combine both negation and recursion in a computation path. Though in our work we do not deal with negation, we still take advantage of this technique for computing aggregates. Computing an aggregate can be a source of non-monotonicity because, in a given iteration of a fixpoint computation, not all the tuples of the relation on which the aggregate operates are available in general. So, there is a risk of deducing an incorrect outcome in a given iteration that is invalidated in a later iteration. For instance, assume that the tuple r(0) is in the current fixpoint iteration, so that computing the maximum of the single argument of r yields 0. It is possible that in a later iteration a new tuple (say r(1)) is computed, thereby invalidating the previously computed value for the aggregate. Thus, stratification is applied to ensure that all the tuples of the relation r are computed before trying to compute the aggregate.
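The r(0)/r(1) scenario can be replayed with a few lines of Python. The sketch is purely illustrative: the list of per-iteration tuples is made up to match the example, and the function names are not taken from DES.

```python
def max_after_each_iteration(iterations):
    """Maximum of r as it would be observed after each fixpoint iteration."""
    seen = set()
    observed = []
    for new_tuples in iterations:
        seen |= set(new_tuples)
        observed.append(max(seen))
    return observed


def stratified_max(iterations):
    """Maximum computed only once all tuples of r are available."""
    complete = set()
    for new_tuples in iterations:
        complete |= set(new_tuples)
    return max(complete)


# r(0) is derived in the first iteration, r(1) in the second.
iterations = [[0], [1]]
print(max_after_each_iteration(iterations))  # the early value 0 is later invalidated
print(stratified_max(iterations))
```

Evaluating the aggregate inside the iteration yields 0 and then 1, whereas evaluating it after the relation is complete yields only the correct value.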

A stratification for a program is built with the aid of a predicate dependency graph (PDG) [19], showing both the positive and negative dependencies between predicates in the program. Each node in this graph is a program predicate symbol, and there are as many nodes as such symbols in the program. Arcs come from each predicate in a rule body (antecedent) to its rule predicate (consequent). If the antecedent occurs as an aggregate argument, its arc is labelled as negative, and positive otherwise. A stratification collects predicates into numbered strata so that, given the function str(Π, p) which assigns a stratum number to predicate p in a database Π, then for a positive arc p ← q, str(Π, p) ≤ str(Π, q), and for a negative arc p ¬← q, str(Π, p) < str(Π, q).

In this paper, we use the aggregate group_by(R,Vs,E), where R is a relation call, Vs are the variables in R used as a grouping criterion, and E is a Boolean expression including, in general, an aggregate. For example, the query group_by(r(X,Y),[X],M=max(Y)) is intended to compute the maximum value of Y for each group formed by tuples with the same value for X. Thus, if the meaning of r is defined by { r(a,0), r(a,1), r(b,3) }, then the previous query returns { answer(a,1), answer(b,3) }.
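The intended result of this grouping query can be mimicked in Python as follows. This is only a sketch of the semantics described above for the specific case M=max(Y) grouped by X; `group_by_max` is a name introduced here and not a DES function.

```python
from collections import defaultdict


def group_by_max(tuples):
    """For each value of the first component, return the maximum of the
    second one, mirroring group_by(r(X,Y),[X],M=max(Y))."""
    groups = defaultdict(list)
    for x, y in tuples:
        groups[x].append(y)
    return sorted((x, max(ys)) for x, ys in groups.items())


r = [("a", 0), ("a", 1), ("b", 3)]
print(group_by_max(r))
```

For the meaning of r given in the text, the two groups a and b produce the pairs (a, 1) and (b, 3).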

D. Tabling

Instead of inferring the meaning of the whole database, DES solves queries with a top-down-driven, bottom-up fixpoint computation based on tabling, following the ideas found in [20]. In this section we assume safe databases.

Implementing tabling in DES resorts to a call table ct, which stores the goal calls made along resolution as atom entries φ, and an answer table at, which stores answers as id : ψ entries, where id is a clause identifier and ψ is a positive ground atom. The answer and call tables are filled by the memo function, which proceeds by tabled SLD resolution [13], [20], [21]. Along resolution, a call to a goal is executed if there is no entry in the call table subsuming³ the current call. If so, this call is added to the call table. Otherwise, a more general call has already been made and its answers are available in the answer table, from which they can simply be retrieved for the current call. Analogously, upon resolving a given call, its result is added as an entry to the answer table if there is no other entry in the same table subsuming it. While in logic programming systems a subsumption test involves open entries in general, in deductive systems such as DES a subsumption test cannot involve open entries because of safety. Thus, in this latter case, answer subsumption simply amounts to a membership test.
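The call-table test described above can be sketched in Python for function-free atoms. The encoding of atoms as (predicate, arguments) pairs and the names `subsumes` and `must_execute` are assumptions made for illustration; they do not reproduce the memo function of DES.

```python
def is_variable(term):
    """Prolog convention: variables start with upper case or underscore."""
    return isinstance(term, str) and term != "" and (
        term[0].isupper() or term[0] == "_")


def subsumes(general, specific):
    """True if some substitution sigma makes general*sigma equal to specific,
    i.e. if `specific` is an instance of `general`."""
    (pred_g, args_g), (pred_s, args_s) = general, specific
    if pred_g != pred_s or len(args_g) != len(args_s):
        return False
    sigma = {}
    for g, s in zip(args_g, args_s):
        if is_variable(g):
            # A variable must be mapped consistently across its occurrences.
            if sigma.setdefault(g, s) != s:
                return False
        elif g != s:
            return False
    return True


def must_execute(call, call_table):
    """Execute (and record) a call only if no recorded call subsumes it."""
    if any(subsumes(entry, call) for entry in call_table):
        return False  # answers of the more general call can be reused
    call_table.append(call)
    return True


ct = []
print(must_execute(("p", ("X",)), ct))  # first call: executed and recorded
print(must_execute(("p", ("a",)), ct))  # subsumed by p(X): answers reused
```

Because answers in a safe database are ground, the same `subsumes` test applied to two answers degenerates into an equality check, which is the membership test mentioned above.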

The answer and call tables are filled stratum by stratum, ensuring that the meaning of the atoms required to prove other goals is already in the answer table. Such atoms can be either those in negative literals (which are out of the scope of the current work) or those in aggregates. So, following the stratification for the database Π, for a given goal φ, a goal dependency graph gdg(Π, φ) is computed, which is the subgraph of the PDG that contains all the nodes reachable from φ in Π. Then, for each node pi in the subgraph such that there is a negative arc coming out from pi, an open goal φi is built with the same arity as pi. Goals φi are ordered by str(Π, φi), so that lower-strata goals are computed before upper-strata goals.

The stratified meaning of a program restricted to a goal is obtained by filling the tables as specified next:

Definition 3.1: Stratified Meaning of a Program restricted to a Goal. Given a program $\Pi$ and a goal $\phi_k$,

\[ \langle ct_i, at_i \rangle = \bigsqcup_{n \geq 0} memo^n(\phi_i, \Pi, ct_{i-1}, at_{i-1}) \]

where $gdg(\Pi, \phi_k) = \langle N, A \rangle$, $p_i \in N$, $i \in \{1, \ldots, k\}$, $q \stackrel{\neg}{\leftarrow} p_i \in A$ for some $q$, $\phi_i = p_i(X_1, \ldots, X_{arity(p_i)})$, the $X_j$ are fresh variables, $arity(p_i)$ is the arity of the predicate $p_i$, the indexes $i$ are ordered such that $str(\Pi, \phi_i) \leq str(\Pi, \phi_{i+1})$, and $\bigsqcup_{n \geq 0}$ is the least upper bound of the successive applications of the function $memo$, which solves a goal using call and answer tables [13].

So, the meaning of a tabled goal is defined analogously to the meaning of a goal:

Definition 3.2: Meaning of a Tabled Goal. The meaning of a tabled goal φ w.r.t. an answer table at for a source tuple identifier id is defined as solve(φ, at) = {ψ such that id : ψ ∈ at, and φθ = ψ}, where solve returns a bag and θ is a substitution.

Thanks to tabling, while some Prolog queries (goals) are non-terminating, equivalent Datalog queries are terminating (modulo infinite built-in predicates). For example, the query p(X) in the context of the program { p(1), (p(X):-p(X)) } has the answer { p(1) }. Moreover, termination in Prolog might depend on rule ordering. For example, the same query p(X) in the context of the program { (p(X):-p(Y), Y>0, X is Y-1), p(1) } does not terminate under SLD resolution, as the first rule is selected infinitely many times without any possibility of selecting the second one. By contrast, tabling provides the answer { p(0), p(1) }.

³ An atom A subsumes another atom B if there exists a substitution σ such that Aσ = B. That is, if B is an instance of A.

Tabling has been acknowledged as an efficient approach in logic programming, where complexity results are given (see, e.g., [22]), so that our approach seems adequate for supporting a fuzzy deductive database system.
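The termination behaviour of the two example programs can be reproduced with a naive fixpoint over an answer table. This Python sketch is a simplification, not the memo-based tabling of DES: the function names are hypothetical and each rule is hand-translated into a set-valued step function.

```python
def tabled_answers(facts, step):
    """Apply the rules until the answer table stops growing."""
    table = set(facts)
    while True:
        new = step(table) - table
        if not new:          # no new answers: the fixpoint is reached
            return table
        table |= new

# Program { (p(X) :- p(Y), Y>0, X is Y-1), p(1) }: rule order is irrelevant here.
def p_rule(table):
    return {y - 1 for y in table if y > 0}

# Program { p(1), (p(X) :- p(X)) }: the rule only re-derives known answers.
def p_loop(table):
    return set(table)

print(sorted(tabled_answers({1}, p_rule)))
print(sorted(tabled_answers({1}, p_loop)))
```

Both calls terminate because an answer already in the table is never added again, which is the membership test that answer subsumption relies on.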

IV. FUZZYDES

FuzzyDES is an extension of the Datalog language with truth degree annotations and an operational semantics, which we call weak SLD resolution, based on similarity relations. In this section, we briefly summarize the main theoretical concepts supporting FuzzyDES and outline the key points of its implementation.

A. Fuzzy Relations and Weak Unification

A binary fuzzy relation on a set U is a fuzzy subset of U × U (that is, a mapping U × U → [0, 1]). There are some important properties that fuzzy relations may have: (1) (Reflexive) R(x, x) = 1 for any x ∈ U; (2) (Symmetric) R(x, y) = R(y, x) for any x, y ∈ U; (3) (Transitive) R(x, z) ≥ R(x, y) △ R(y, z) for any x, y, z ∈ U; where the operator △ is an arbitrary t-norm. The notion of transitivity above is △-transitive, and if the operator △ is the minimum of two elements (Gödel t-norm), we speak of min-transitive.

A proximity relation is a binary fuzzy relation which is reflexive and symmetric. A proximity relation which in addition fulfills the transitive property is called a similarity relation.
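As an illustration of these definitions, the following Python sketch checks the three properties on a finite relation stored as a dictionary. The encoding, the function names, and the sample relations are assumptions made for this example only; missing pairs are read as degree 0.

```python
from itertools import product

def is_reflexive(R, U):
    return all(R.get((x, x), 0.0) == 1.0 for x in U)

def is_symmetric(R, U):
    return all(R.get((x, y), 0.0) == R.get((y, x), 0.0)
               for x, y in product(U, repeat=2))

def is_transitive(R, U, tnorm=min):
    # R(x, z) >= R(x, y) t-norm R(y, z) for every triple of elements.
    return all(R.get((x, z), 0.0) >= tnorm(R.get((x, y), 0.0), R.get((y, z), 0.0))
               for x, y, z in product(U, repeat=3))

def classify(R, U, tnorm=min):
    if not (is_reflexive(R, U) and is_symmetric(R, U)):
        return "neither"
    return "similarity" if is_transitive(R, U, tnorm) else "proximity"

U = {"a", "b", "c"}
prox = {(x, x): 1.0 for x in U}
prox.update({("a", "b"): 0.8, ("b", "a"): 0.8, ("b", "c"): 0.6, ("c", "b"): 0.6})
sim = dict(prox)
sim.update({("a", "c"): 0.6, ("c", "a"): 0.6})

print(classify(prox, U), classify(sim, U))
```

The first relation is only a proximity because R(a, c) = 0 is below min(R(a, b), R(b, c)) = 0.6; adding the missing entry makes it min-transitive.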

We are primarily interested in similarity relations on the alphabet of a first order language. The reason is to break the usual constraint in programming languages by which different syntactic symbols represent distinct information. So, in this context, two different constant symbols or predicate (or function) symbols with the same arity can be treated as equal up to a certain degree. Syntactically equal variables are considered similar with approximation degree 1; otherwise, their approximation degree is 0.

For a Datalog language, where a term can only be either a constant or a variable (i.e., no function symbols are allowed), the similarity relation R on the symbols of the alphabet can be extended to atomic formulas as follows: Let p and q be two n-ary predicate symbols and let t1, ..., tn, s1, ..., sn be either constants or variables, then,

$$R(p(t_1, \ldots, t_n), q(s_1, \ldots, s_n)) = R(p, q) \triangle (\triangle_{i=1}^{n} R(t_i, s_i)).$$
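The extension to atoms can be sketched in Python as a fold of the t-norm over the predicate degree and the argument degrees. The pair encoding of atoms and the names `sym_degree` and `atom_degree` are hypothetical; the relation is given as a dictionary and read symmetrically.

```python
from functools import reduce

def sym_degree(R, s, t):
    """Degree between two symbols; identical symbols (and variables) get 1."""
    if s == t:
        return 1.0
    return max(R.get((s, t), 0.0), R.get((t, s), 0.0))

def atom_degree(R, atom1, atom2, tnorm=min):
    """R(p(t1..tn), q(s1..sn)) = R(p,q) t-norm (t-norm over all R(ti,si))."""
    (p, args1), (q, args2) = atom1, atom2
    if len(args1) != len(args2):
        return 0.0
    degrees = [sym_degree(R, p, q)]
    degrees += [sym_degree(R, s, t) for s, t in zip(args1, args2)]
    return reduce(tnorm, degrees)

R = {("p", "q"): 0.6, ("a", "b"): 0.8}
print(atom_degree(R, ("p", ("a", "X")), ("q", ("b", "X"))))
```

Two distinct variables are not related in `R`, so they contribute degree 0, matching the convention stated above for variables.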

FuzzyDES inherits the weak unification algorithm of Bousi∼Prolog. When we work with similarity relations, this algorithm coincides with the one defined by M. Sessa [7]. Here we give our slightly different version of it, which is thresholded by a cut value λ ∈ [0, 1], also known as λ-cut.

In the context of a similarity relation it is possible to define a weak notion of most general unifier. Let R be a similarity relation, λ be a cut value, and E1 and E2 be two expressions. The substitution θ is a weak unifier of level λ for E1 and E2 with respect to R (or λ-unifier) if its unification degree R(E1θ, E2θ) ≥ λ. A substitution θ is a weak most general unifier (wmgu) of level λ (or λ-wmgu), w.r.t. R, for E1 and E2, denoted by wmgu^λ_R(E1, E2), if: (1) θ is a λ-unifier of E1 and E2; and (2) for any λ-unifier σ of E1 and E2, the substitution θ is more general than the substitution σ with level λ; that is, there exists a substitution δ such that, for any variable x in Dom(σ) ∪ Dom(θ), R(xσ, xθδ) ≥ λ.

Weak most general unifiers are computed by means of a weak unification algorithm which is formalized as a transition system supported by a similarity-based unification relation ⇒. For a similarity relation R and a cut value λ, the unification of the expressions E1 and E2 is obtained by a state transformation sequence starting from an initial state 〈G0 ≡ {E1 ≈ E2}, id, α0〉, where id is the identity substitution and α0 = 1: 〈G0, id, α0〉 ⇒+ 〈Gn, θn, αn〉. When Gn = ∅ and αn ≥ λ, the expressions E1 and E2 are unifiable by similarity with wmgu θn and unification degree αn. Otherwise, if Gn = Fail, E1 and E2 fail to unify.

The similarity-based unification relation (⇒) is defined as in the classical unification algorithm, except for the term decomposition and failure rules (see later in Subsection IV-C and in [6] for a more formal and extensive discussion of this topic). Note that, unlike [7], the resulting approximation degree is limited by a cut value λ ≥ 0. Nevertheless, the weak unification theorem proved in [7, p. 412] is valid in our framework.
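For flat Datalog atoms, the state transformation described above can be approximated by a short Python sketch. It is a simplified reading, not the algorithm of [6] or [7]: the encoding, the names `walk` and `weak_unify`, and the choice to fail as soon as the accumulated degree is 0 or drops below λ are assumptions.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def sym_degree(R, s, t):
    return 1.0 if s == t else max(R.get((s, t), 0.0), R.get((t, s), 0.0))

def walk(t, theta):
    """Follow variable bindings until an unbound variable or a constant."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def weak_unify(atom1, atom2, R, lam=0.0, tnorm=min):
    """Return (substitution, degree) or None; atoms are (pred, args) pairs."""
    (p, args1), (q, args2) = atom1, atom2
    if len(args1) != len(args2):
        return None
    theta, degree = {}, sym_degree(R, p, q)
    if degree == 0.0 or degree < lam:
        return None
    for s, t in zip(args1, args2):
        s, t = walk(s, theta), walk(t, theta)
        if s == t:
            continue
        if is_var(s):
            theta[s] = t                      # variable binding: degree unchanged
        elif is_var(t):
            theta[t] = s
        else:                                 # two constants: weaken the degree
            degree = tnorm(degree, sym_degree(R, s, t))
            if degree == 0.0 or degree < lam:
                return None                   # failure state
    return theta, degree

R = {("p", "q"): 0.6, ("a", "b"): 0.8}
print(weak_unify(("p", ("a", "X")), ("q", ("Y", "c")), R))
print(weak_unify(("p", ("X", "X")), ("p", ("a", "b")), R))
```

The second call shows the difference from syntactic unification: X is bound to a, and the clash between a and b is accepted with degree 0.8 instead of failing.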

B. Implementing Fuzzy Relations

As in Bousi∼Prolog, a fuzzy relation R can be specified by stating, first, a set of what we call relationship equations and, second, its properties. A relationship equation is a program declaration with textual form x/n r y/n = a, which represents an entry R(x, y) = α of the fuzzy binary relation R (textually represented by an infix operator r). Its intuitive reading is that two n-ary symbols x and y are related with a certain degree α (written a in the textual form), where the arity specification for constants (/0) is omitted in the equation. Note that when the relation partially specified by the relationship equations is a proximity (or a similarity), we will often speak of proximity equations. The relationship equations specifying the standard similarity relation are represented internally by atoms $~(X,Y,D), meaning that two symbols X and Y are related with the approximation degree D.

The properties attached to a relation R represent its intensional description, so that users are not expected to include all the relationship equations fulfilling such properties. For instance, obtaining the (additional) entries in a similarity relation R (which are needed for the compilation of e-clauses, see Subsection IV-E) is performed by automatically computing its △-closure [23], [24]. This closure can be understood as the shortest paths in a weighted directed graph (cf. Floyd-Warshall algorithm), where the length of a path

$$x_1 \xrightarrow{\alpha_1} x_2 \xrightarrow{\alpha_2} \cdots \xrightarrow{\alpha_{n-1}} x_n$$

is $\triangle_{i=1}^{n-1} \alpha_i$. By taking advantage of the DES deductive engine, this closure can be neatly specified with Datalog rules for each of the properties, with no need to resort to a specific implementation of a △-closure algorithm.
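For comparison with the Datalog specification, the same closure can be computed procedurally. The following Python sketch is a Floyd-Warshall-style loop that keeps, for each pair, the best path degree; it is not the mechanism used by FuzzyDES, and the function name and input encoding are assumptions.

```python
def similarity_closure(entries, tnorm=min):
    """Reflexive, symmetric and t-norm-transitive closure of the given entries."""
    symbols = sorted({s for pair in entries for s in pair})
    R = {(x, y): 0.0 for x in symbols for y in symbols}
    for x in symbols:
        R[(x, x)] = 1.0                                   # reflexivity
    for (x, y), d in entries.items():
        R[(x, y)] = max(R[(x, y)], d)                     # given entry
        R[(y, x)] = max(R[(y, x)], d)                     # symmetry
    for k in symbols:                                     # transitivity via k
        for i in symbols:
            for j in symbols:
                R[(i, j)] = max(R[(i, j)], tnorm(R[(i, k)], R[(k, j)]))
    return R

closure = similarity_closure({("a", "b"): 0.8, ("b", "c"): 0.6})
print(closure[("a", "c")])
```

Here the path degree is the t-norm of the arc degrees and the best path is the one with maximum degree, so "shortest" corresponds to taking a maximum rather than a minimum.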


In the concrete implementation of FuzzyDES, a default similarity relation R is denoted by the identifier ~ and it is represented internally by a set of atoms ~(X,Y,D) relating two symbols X and Y with the approximation degree D. The similarity relation ~ is intensionally defined by its properties with the following Datalog rules, where underscored variables are not relevant for the outcome:⁴

  % Reflexivity
  $~(X,X,1.0) :- $~(X,_Y,_D1) ; $~(_Y,X,_D2).
  % Symmetry
  $~(X,Y,D) :- $~(Y,X,D).
  % Transitivity
  $~(X,Y,D) :- $~(X,Z,D1), $~(Z,Y,D2), $t_norm(~,[D1,D2],D).
  % Maximum degree for each pair of symbols
  ~(X,Y,D) :- group_by($~(X,Y,D1),[X,Y],D=max(D1)).

where the call to the predicate $t_norm/3 represents the application of the t-norm △ associated to a relation R (e.g., the relation ~ has the minimum t-norm associated by default), provided in its first argument, to the list of approximation degrees given in the second argument. The result of this operation is returned in the third argument of $t_norm/3. The call to the aggregate metapredicate group_by/3 groups tuples in ~ by the criteria [X,Y], and applies the aggregate expression D=max(D1) over the partitioned relation (cf. extended relational algebra operation [25]). Note that '='/2 in DES does not just represent unification, but performs expression evaluation of both arguments followed by their unification (compound terms are not allowed as data). In plain words, this operation selects, among the different $~(X,Y,D1) with the same arguments X and Y, the one which has maximum D1.

Solving the aggregate in the metapredicate $group_by/3 requires that the relation $~, in its first argument, be placed in a lower stratum (similarly to what happens with negation) than the one of the predicate (~) containing the rule with the metapredicate. For a relation with no transitive property, its closure coincides with the classical closure, and the last rule simply becomes: ~(X,Y,D) :- $~(X,Y,D), so that both $~ and ~ are in the same stratum. Finally, note that only some built-ins, such as this grouping, can contain compound terms and goals as arguments.

The system includes several commands for stating the properties and t-norm of a given relation. In particular, the command /fuzzy_relation Relation ListOfProperties sets the relation name with its properties given as a list (including reflexive, symmetric and transitive). The command /t_norm Relation TNorm sets the t-norm (goedel, lukasiewicz, product, hamacher, nilpotent, where min is synonymous for goedel, and luka for lukasiewicz) for the given relation.
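To make the t-norm names concrete, the following Python sketch gives the usual textbook definitions behind each name. Whether FuzzyDES uses exactly these formulas (in particular for hamacher and nilpotent) is an assumption; the table `T_NORMS` and the helper `t_norm` are hypothetical and only loosely mirror the role of $t_norm/3.

```python
from functools import reduce

# Textbook binary t-norms on [0, 1]; assumed, not taken from the DES sources.
T_NORMS = {
    "goedel": lambda x, y: min(x, y),
    "lukasiewicz": lambda x, y: max(0.0, x + y - 1.0),
    "product": lambda x, y: x * y,
    "hamacher": lambda x, y: 0.0 if x == y == 0.0 else (x * y) / (x + y - x * y),
    "nilpotent": lambda x, y: min(x, y) if x + y > 1.0 else 0.0,
}
T_NORMS["min"] = T_NORMS["goedel"]          # synonyms listed in the text
T_NORMS["luka"] = T_NORMS["lukasiewicz"]

def t_norm(name, degrees):
    """Fold a named t-norm over a list of degrees (1.0 is the neutral element)."""
    return reduce(T_NORMS[name], degrees, 1.0)

print(t_norm("goedel", [0.8, 0.6]), t_norm("luka", [0.5, 0.75]))
```

Folding from 1.0 is safe because 1 is the neutral element of every t-norm, so an empty list yields full truth.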

C. Implementing Weak Unification

Weak unification applies both to terms and predicates. The specific weak unification algorithm is implemented following closely Martelli and Montanari's unification algorithm for syntactic unification [26]. The occurs check is not needed in the deductive setting because compound terms are not allowed (as in X=f(X) in Prolog, which would represent an infinite structure).

⁴ Otherwise, a warning would be raised signalling them as singletons.

  weak_unify(Atomic1, Atomic2, Lambda, Degree) :-
    atomic(Atomic1), atomic(Atomic2), !,
    unification_degree(Atomic1, Atomic2, Degree),
    Degree >= Lambda.
  weak_unify(Term, Variable, _Lambda, 1.0) :-
    nonvar(Term), var(Variable), !,
    Variable = Term.
  weak_unify(Variable, Term, _Lambda, 1.0) :-
    var(Variable),
    Variable = Term.

The first clause uses the approximation degree between two constants defined by ~ for returning the wmgu degree. The predicate unification_degree(Atomic1, Atomic2, Degree) returns this degree, and it is checked next to be above the λ-cut value. The predicate is implemented as:

  unification_degree(Atomic, Atomic, 1.0) :-
    ( fuzzy_relation_property(~, reflexive) -> ! ; fail ).
  unification_degree(Atomic1, Atomic2, Degree) :-
    ~(Atomic1, Atomic2, Degree), !.

Its first clause returns the top approximation degree for unifying a constant with itself if the relation is reflexive.⁵ Otherwise, it either returns the degree between them (if it exists) or fails. The final cut removes alternatives and provides a way to select the representative of the wmgu class. Hence, this implementation provides a weak most general unifier as well as a numerical value, called the unification degree in [7]. Intuitively, the unification degree will represent the truth degree associated with the (query) computed instance.

FuzzyDES implements a weak unification operator, denoted by ~~, which is the fuzzy counterpart of the syntactical unification operator = of standard Prolog. It can be used, first, to unify two terms as in the goal Term1~~Term2, and, second, to construct expressions. So, the expression Term1~~Term2 returns the unification degree when evaluated, and can be used wherever an expression is expected in a goal, as in 1-a~~b>0.5 (if a~b=0.4, then the goal succeeds because 1 − 0.4 > 0.5).

Solving a goal Expr1 Op Expr2 in DES, where Op is a comparison operator (=, \=, >, <, >=, =<) and each Expri is an expression, amounts to evaluating both expressions and comparing them with respect to the operator (with the exception of =, which performs classical unification on evaluated terms).

D. FuzzyDES Programs and Weak SLD Resolution

FuzzyDES defines a program Π as a fuzzy theory, that is, as a mapping from a finite set of formulas, namely rules, to the elements (truth values) of the lattice (0, 1]. Informally, a FuzzyDES program can be seen as a set of pairs 〈R; α〉, where R is a rule and α = Π(R) is a truth degree expressing the confidence which the system user has in the truth of the rule R. We call such a pair a graded rule. For rules, FuzzyDES follows the same syntactical conventions as described in Subsection III-A, and the textual syntax of a graded rule 〈R; α〉 is the textual syntax for R followed by the keyword with and the degree α.

⁵ This clause is introduced for efficiency reasons, by omitting the search for entries ~(Atomic,Atomic,1.0) in the △-closure. These extensional entries could be omitted as well.


The next definition, which formalizes FuzzyDES operational semantics, enhances the definition in Subsection 5.2 of [6] to deal with graded rules.

Definition 4.1: Let Π be a FuzzyDES program, R a similarity relation on the first order alphabet induced by Π, △ the fixed t-norm associated to R, and λ a λ-cut. We define Weak SLD (WSLD) resolution as a transition system 〈E, ⇒WSLD〉 where E is a set of triples 〈G, θ, α〉 (goal, substitution, approximation degree), that we call the state of a computation, and whose transition relation ⇒WSLD ⊆ (E × E) is the smallest relation which satisfies:

$$\langle (\leftarrow A' \wedge Q'), \theta, \alpha \rangle \Rightarrow_{WSLD} \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \delta \triangle \beta \triangle \alpha \rangle$$

if 〈R ≡ (A ← Q); δ〉 << Π, δ ≥ λ, σ = wmgu^λ_R(A, A′) ≠ fail, β = R(Aσ, A′σ) ≥ λ, and (δ △ β △ α) ≥ λ, where Q and Q′ are conjunctions of atoms and the notation R << Π represents that R is a standardized apart rule in Π.

A WSLD derivation for Π ∪ {G0} is a sequence of WSLD resolution steps 〈G0, id, 1〉 ⇒WSLD 〈G1, θ1, α1〉 ⇒WSLD ... ⇒WSLD 〈Gn, θn, αn〉. A WSLD refutation is a WSLD derivation 〈G0, id, 1〉 ⇒WSLD* 〈□, σ, α〉, where □ is the empty clause and 〈σ, α〉 is a fuzzy computed answer, with σ a computed substitution and α its approximation degree. Certainly, a WSLD refutation computes a family of answers, in the sense that, if θ = {x1/t1, ..., xn/tn} is a computed substitution with degree α, then any substitution θ′ = {x1/s1, ..., xn/sn} satisfying R(si, ti) ≥ λ, for any 1 ≤ i ≤ n, is also a computed substitution if its approximation degree $\beta = \alpha \triangle (\triangle_{i=1}^{n} R(s_i, t_i)) \geq \lambda$.

E. Compilation of FuzzyDES Programs

Similarly to Bousi∼Prolog, FuzzyDES implements the operational mechanism of WSLD resolution by compiling programs into core Datalog rules (instead of Prolog clauses). The idea is to obtain a set of Datalog rules that allows the emulation of the WSLD resolution procedure, splitting it into two steps: i) the crisp unification of a defined predicate (in the head of the transformed rule); and ii) the weak unification of the attributes of the defined predicate (in the body of the transformed rules) with the help of the built-in weak unification predicate explained in Subsection IV-C. To this end, the transformation moves each head argument to an explicit weak unification performed by a specific predicate, as formalized later.

Moreover, if there exists R(p, q) = α between predicates p and q, then simulating a flexible matching of these predicate symbols by using a classical unification technique can be done by introducing a new clause for each predicate q which is close to p.

The following definition formalizes the program transformation just outlined above, but first we need to introduce an extended language obtained by adding to the source language alphabet the elements of the lattice [0, 1] (of approximation degrees). Clauses in this extended language contain bodies with literals which are interpreted as approximation degrees. We call these clauses e-clauses (expanded clauses). Also, e-clauses with an empty head are called e-goals.

Definition 4.2: Let Π be a logic program, R a similarity relation on the syntactic domain generated by Π, △ the fixed t-norm associated to R, and λ ∈ [0, 1] a cut value. Let 〈p(t1, ..., tn) ← Q; δ〉 be a graded clause in Π defining the n-ary predicate p. Then, for each R(p, q) = α > λ add to the transformed program Π′ the new e-clause:

$$q(x_1, \ldots, x_n) \leftarrow (\delta \triangle \alpha) \wedge x_1 \approx t_1 \wedge \cdots \wedge x_n \approx t_n \wedge Q.$$

where each xi is a fresh variable and xi ≈ ti is an expression that returns the approximation degree obtained from the weak unification of the term bound to the variable xi and the term ti.

Observe that, since R(p, p) = 1 for any symbol p, if 〈p(t1, ..., tn) ← Q; δ〉 is in the original program, the e-clause p(x1, ..., xn) ← δ ∧ x1 ≈ t1 ∧ ··· ∧ xn ≈ tn ∧ Q will be in the transformed program. Thus, we give a uniform treatment for all clauses in the transformed program. Since this transformation was initially implemented in Bousi∼Prolog, we name it (for short) BPL expansion, and the transformed programs generated by its application BPL expanded programs. This transformation has been adapted and implemented in the FuzzyDES system with the predicate expand_rule/7 (called at compilation-time) and produces Datalog code able to be executed by the FuzzyDES system. The behaviour of this predicate is illustrated by the following example.⁶

Example 1: Consider the following fragment of a simple FuzzyDES program, where the truth degree of a graded rule is specified with the operator with:

  % PROXIMITY EQUATIONS
  p/1~q/1=0.6.

  % FACTS AND RULES
  r(a).
  p(b).
  p(X) :- r(X) with 0.8.
  q(c).

After the execution of the predicate expand_rule/7, the following Datalog code is generated (having issued the command /fuzzy_expansion bpl):

  r(A,_D) :- '$unify_arguments'([[A,a,B]]),
             '$t_norm'(~,[_D,B],_D).
  p(A,_D) :- '$unify_arguments'([[A,b,B]]),
             '$t_norm'(~,[_D,B],_D).
  p(A,_D) :- '$unify_arguments'([[A,X,B]]),
             r(X,C),
             '$t_norm'(~,[_D,0.8,C,B],_D).
  p(A,_D) :- '$over_lambda_cut'(0.6),
             '$unify_arguments'([[A,c,B]]),
             '$t_norm'(~,[_D,B,0.6],_D).
  q(A,_D) :- '$unify_arguments'([[A,c,B]]),
             '$t_norm'(~,[_D,B],_D).
  q(A,_D) :- '$over_lambda_cut'(0.6),
             '$unify_arguments'([[A,b,B]]),
             '$t_norm'(~,[_D,B,0.6],_D).
  q(A,_D) :- '$over_lambda_cut'(0.6),
             '$unify_arguments'([[A,X,B]]),
             r(X,C),
             '$t_norm'(~,[_D,0.8,C,B,0.6],_D).

Note that each transformed predicate has an additional argument in order to store and allow the propagation of truth degrees. Here, $unify_arguments implements the weak unification operator ≈ in Definition 4.2, and $t_norm (as explained in Subsection IV-B) is an internal predicate used to propagate the truth degrees of the rules and the approximation degrees coming from the fuzzy relations. The predicate $over_lambda_cut anticipates failure during solving of the expanded rule if the current λ-cut is above the approximation degree given by the equation corresponding to the expansion. Since p/1 is close to q/1 (with degree 0.6), one rule for q/1 is added for each rule defining p/1 (and vice versa).

⁶ The rules listed as a result of the transformation correspond to core Datalog, and they can be examined with the command /listing after enabling development listings with /development on.
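The shape of the BPL expansion can be reproduced with a small Python sketch that builds one e-clause per graded clause and per close predicate. The tuple encoding and the name `bpl_expand` are hypothetical, the threshold test is written as `>=` by assumption, and, unlike the generated Datalog code, the sketch combines δ and α eagerly instead of passing them to a t-norm call.

```python
from collections import Counter

def bpl_expand(clauses, R, lam=0.0, tnorm=min):
    """clauses: list of (pred, args, body, delta); R: {(p, q): degree}."""
    expanded = []
    for pred, args, body, delta in clauses:
        close = {pred: 1.0}                       # R(p, p) = 1
        for (a, b), alpha in R.items():           # equations are read symmetrically
            if alpha >= lam and alpha > 0.0:
                if a == pred:
                    close[b] = alpha
                if b == pred:
                    close[a] = alpha
        for q, alpha in close.items():
            fresh = [f"X{i + 1}" for i in range(len(args))]
            unifications = list(zip(fresh, args))  # the subgoals xi ≈ ti
            expanded.append((q, fresh, tnorm(delta, alpha), unifications, body))
    return expanded

# Program fragment of Example 1.
example1 = [
    ("r", ["a"], [], 1.0),
    ("p", ["b"], [], 1.0),
    ("p", ["X"], [("r", ["X"])], 0.8),
    ("q", ["c"], [], 1.0),
]
e_clauses = bpl_expand(example1, {("p", "q"): 0.6})
heads = Counter(head for head, *_ in e_clauses)
print(len(e_clauses), dict(heads))
```

For Example 1 this yields seven e-clauses (one for r, three for p, three for q), the same number of rules as in the listing above.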

Though users can inspect core Datalog by issuing the command /development on, the default mode is intended for reasoning at the source Datalog level, so that users manipulate both graded rules and proximity equations in the high-level source syntax (e.g., interactively adding or removing them).

In general, adding as many rules as those defining p for a predicate q close to p with a truth degree α might cause a space problem for a high number of rules. Fortunately, thanks to the tabling-based operational mechanism of FuzzyDES, it is possible to simplify this transformation. So, q can be defined as a straight call to p with a truth degree α (see the second item of the next definition).

Definition 4.3: Let Π be a logic program, R a similarity relation on the syntactic domain generated by Π, and λ ∈ [0, 1] a cut value. For each predicate p defined in Π,

1) for each graded clause 〈p(t1, ..., tn) ← Q; δ〉 in Π defining the n-ary predicate p, add to the transformed program Π′ the e-clause:

$$p(x_1, \ldots, x_n) \leftarrow \delta \wedge x_1 \approx t_1 \wedge \cdots \wedge x_n \approx t_n \wedge Q.$$

2) for each entry R(p, q) = α ≥ λ in R (with q ≢ p) add to the transformed program Π′ the e-clause:

$$q(x_1, \ldots, x_n) \leftarrow \alpha \wedge p(x_1, \ldots, x_n).$$

where each xi is a fresh variable.

Since this transformation is specific to the FuzzyDES system, we name it (for short) FDES expansion, and the transformed programs generated by its application FDES expanded programs.

Under this definition, for a program with n predicates, where the i-th predicate is defined by r_i rules and is close to c_i other predicates, this last program transformation generates

$$\sum_{i=1}^{n} (r_i + c_i)$$

rules. However, the program transformation of Definition 4.2 generates

$$\sum_{i=1}^{n} (r_i + r_i \times c_i)$$

rules. Therefore, many transformed rules can be saved for a proximity (or similarity) relation with a huge amount of defined rules.
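The two counts can be checked on the program fragment of Example 1, where r has one rule and no close predicate, p has two rules and one close predicate, and q has one rule and one close predicate. The following Python sketch only evaluates the two sums; the function names and the dictionary layout are illustrative.

```python
def fdes_rule_count(predicates):
    """Sum of r_i + c_i over all predicates (Definition 4.3)."""
    return sum(r + c for r, c in predicates.values())

def bpl_rule_count(predicates):
    """Sum of r_i + r_i * c_i over all predicates (Definition 4.2)."""
    return sum(r + r * c for r, c in predicates.values())

# predicate -> (number of defining rules r_i, number of close predicates c_i)
example1 = {"r": (1, 0), "p": (2, 1), "q": (1, 1)}
print(fdes_rule_count(example1), bpl_rule_count(example1))

# A hypothetical larger case: 10 predicates, 100 rules each, each close to 5 others.
large = {f"p{i}": (100, 5) for i in range(10)}
print(fdes_rule_count(large), bpl_rule_count(large))
```

The gap is small for the toy program but grows with the product of rules and close predicates, which is the saving the paragraph above refers to.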

Note that the transformation of Definition 4.3 is only viable in the context of languages, such as Datalog, with specific syntactic constraints and using a tabling-based computation strategy. Languages that follow a top-down strategy, like Bousi∼Prolog, cannot benefit from this transformation, because it can convert a program with a finite search space into one with an infinite search space under a pure top-down computational strategy (for instance, think of R(p, q) = α1 and R(q, p) = α2, which would lead to mutually recursive e-clauses).

The following example shows a transformation following Definition 4.3, enabled with the command /fuzzy_expansion des.

Example 2: Consider again the program fragment of Example 1. After the execution of the predicate expand_rule/7 the following Datalog code is generated:

  r(A,_D) :- '$unify_arguments'([[A,a,B]]),
             '$t_norm'(~,[_D,B],_D).
  p(A,_D) :- '$unify_arguments'([[A,b,B]]),
             '$t_norm'(~,[_D,B],_D).
  p(A,_D) :- '$unify_arguments'([[A,X,B]]),
             r(X,C),
             '$t_norm'(~,[_D,0.8,C,B],_D).
  p(A,_D) :- '$over_lambda_cut'(0.6),
             q(A,B),
             '$t_norm'(~,[0.6,B,_D],_D).
  q(A,_D) :- '$unify_arguments'([[A,c,B]]),
             '$t_norm'(~,[_D,B],_D).
  q(A,_D) :- '$over_lambda_cut'(0.6),
             p(A,B),
             '$t_norm'(~,[0.6,B,_D],_D).

Solving e-clauses from this transformation in the deductive setting poses no nontermination problems.
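Termination despite the mutual recursion between p and q can be illustrated by evaluating the rules of Example 2 to a fixpoint. The Python sketch below hard-codes those six rules, assumes the Gödel t-norm and a default cut value of 0, keeps only the best degree per tuple, and exploits the fact that no constants are related in this program; it is an illustration and not the DES engine.

```python
def evaluate_example2(tnorm=min, lam=0.0):
    table = {}                                    # (pred, constant) -> best degree

    def add(pred, const, degree):
        """Store the tuple if it passes the cut and improves the stored degree."""
        if degree >= lam and degree > table.get((pred, const), 0.0):
            table[(pred, const)] = degree
            return True
        return False

    changed = True
    while changed:
        changed = False
        changed |= add("r", "a", 1.0)             # r(a).
        changed |= add("p", "b", 1.0)             # p(b).
        changed |= add("q", "c", 1.0)             # q(c).
        for (pred, c), d in list(table.items()):
            if pred == "r":
                changed |= add("p", c, tnorm(0.8, d))   # p(X) :- r(X) with 0.8.
            if pred == "q":
                changed |= add("p", c, tnorm(0.6, d))   # p as a call to q
            if pred == "p":
                changed |= add("q", c, tnorm(0.6, d))   # q as a call to p
    return table

answers = evaluate_example2()
for atom in sorted(answers):
    print(atom, answers[atom])
```

The loop stops because a re-derived tuple is only stored when its degree strictly improves, so the mutually recursive rules for p and q cannot add answers forever.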

Finally, as we shall justify in Section V, the Datalog code generated by the transformation of either Definition 4.2 or 4.3 (depending on the flag fuzzy_expansion) is executed by the deductive engine of the FuzzyDES system, therefore emulating the result of executing the original program under WSLD resolution.

F. Fuzzy Answer Subsumption

As introduced in Subsection III-D, answer subsumption in a deductive database simply resorts to a membership test. Indeed, this approach can also be used in the fuzzy setting, but a different, more efficient notion of answer subsumption can be applied, as we propose at the end of this subsection.

Following any of the expansions explained in Subsection IV-E, each program rule is compiled to a rule with an additional argument for the approximation degree. FuzzyDES handles a user query Q by transforming the query into an autoview for the predicate answer/m+1, where its body is the compiled query, and m is the number of relevant variables in Q. The last argument of the transformed call is the placeholder for the approximation degree. The outcome to the user query is thus built by solving the call answer(X1,...,Xm,Xδ), where {X1,...,Xm} ⊆ vars(Q) are the relevant variables in the user query, and Xδ is the variable for the approximation degree. Each matching tuple ti in the answer table (i.e., ti = answer(X1,...,Xm,Xδ)σi) is presented to the user as answer(X1,...,Xm)σi with Xδσi.

The answer table holds entries for all user predicates involved in solving a user query, each one with the last argument being the approximation degree, which is ground as the other arguments are. Then, during solving, it is possible to deduce a tuple t1 ≡ t(C, D) when there already exists another tuple t2 ≡ t(C, D′) in the answer table, differing only in the approximation degree. If D′ > D, it is not worthwhile to add the same tuple with a lower approximation degree, so we say that t2 subsumes t1. This can be seen as a generalization of answer subsumption to the fuzzy setting, and we apply it in the deductive setting, obtaining at least three advantages: First, the size of the answer table is reduced. Second, answer table look-ups are therefore more efficient. And, third, joins are simplified by taking into account fewer tuples.
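A minimal Python sketch of this subsumption test follows. It assumes, hypothetically, that the answer table is a dictionary from ground tuples to their best degree; the name `add_answer` is not part of FuzzyDES.

```python
def add_answer(table, atom, degree):
    """Insert the answer unless an entry with an equal or higher degree subsumes it.

    Returns True when the table changes."""
    best = table.get(atom)
    if best is not None and best >= degree:
        return False          # subsumed by the stored tuple
    table[atom] = degree      # new tuple, or a better degree replaces the old one
    return True

table = {}
print(add_answer(table, ("p", "a"), 0.6))   # first derivation is stored
print(add_answer(table, ("p", "a"), 0.4))   # lower degree is discarded
print(add_answer(table, ("p", "a"), 0.8))   # higher degree replaces the entry
print(table)
```

Keeping a single entry per ground tuple is what bounds the table size and reduces the number of tuples considered in joins.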

V. OPERATIONAL SEMANTICS FOR EXPANDED PROGRAMS

This section formally describes, at a high abstraction level, the operational semantics for expanded programs implemented both by Bousi∼Prolog and the FuzzyDES systems. This semantics simulates the WSLD resolution rule of Definition 4.1, and is the basis of the implementation described in Subsection IV-E. The main goal of this section is to establish that the behavior of a program executed by using WSLD resolution is in fact equivalent to that of the corresponding expanded program when it is executed by this abstract operational mechanism.

In the remainder of this section we shall work inside the framework of the extended language built by e-clauses and e-goals. Π′ denotes a transformed program (or expanded program) obtained by applying either Definition 4.2 or Definition 4.3 on a logic program Π equipped with a similarity relation R and a cut value λ. In what follows, transition steps are applied to underlined fragments, and the symbols Q, Q′ denote conjunctions of atoms and, possibly, approximation degrees.

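Definition 5.1: We define the operational semantics for expanded programs as a transition system 〈E, ⇒Ex〉 where E is a set of triples 〈G, θ, α〉 (e-goal, substitution, approximation degree), and the transition relation ⇒Ex ⊆ (E × E) is the smallest relation which satisfies:

Rule 1: if β ∈ (0, 1], and (β △ α) ≥ λ,

$$\langle (\leftarrow \underline{\beta} \wedge Q'), \theta, \alpha \rangle \Rightarrow_{Ex} \langle \leftarrow Q', \theta, \beta \triangle \alpha \rangle$$

Rule 2: if σ = wmgu^λ_R(A, B) ≠ fail, β = R(Aσ, Bσ) ≥ λ, and (β △ α) ≥ λ,

$$\langle (\leftarrow \underline{A \approx B} \wedge Q'), \theta, \alpha \rangle \Rightarrow_{Ex} \langle \leftarrow Q'\sigma, \theta\sigma, \beta \triangle \alpha \rangle$$

Rule 3: if (p(x1, ..., xn) ← β ∧ x1 ≈ t1 ∧ ··· ∧ xn ≈ tn ∧ Q) << Π′,

$$\langle (\leftarrow \underline{p(s_1, \ldots, s_n)} \wedge Q'), \theta, \alpha \rangle \Rightarrow_{Ex} \langle (\leftarrow \beta \wedge s_1 \approx t_1 \wedge \cdots \wedge s_n \approx t_n \wedge Q \wedge Q'), \theta, \alpha \rangle$$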

Note that in the operational step defined by Rule 3, we perform a syntactic unification of the selected atom of the e-goal and the head of the e-clause. Note also that, in this case, the most general unifier {x1/s1, ..., xn/sn} does not participate in the final computed answer because its domain variables are standardized apart (i.e., they are fresh variables) and it does not affect the bindings of the substitution θ. In what follows, we will often speak of simplification steps to refer to the steps performed with Rule 1 above.
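As a worked illustration of the three rules, consider the e-clause $q(x_1) \leftarrow 0.6 \wedge x_1 \approx b$, which Definition 4.2 produces from the fact p(b) of Example 1 and the equation R(p, q) = 0.6. The derivation below assumes the Gödel t-norm (so that 1 △ 0.6 = 0.6), a cut value λ ≤ 0.6, and the ground e-goal ← q(b); it is a sketch added for illustration only.

$$
\begin{array}{lll}
\langle \leftarrow \underline{q(b)},\ id,\ 1 \rangle
  & \Rightarrow_{Ex} \langle \leftarrow \underline{0.6} \wedge b \approx b,\ id,\ 1 \rangle & \text{(Rule 3)} \\
  & \Rightarrow_{Ex} \langle \leftarrow \underline{b \approx b},\ id,\ 0.6 \triangle 1 = 0.6 \rangle & \text{(Rule 1)} \\
  & \Rightarrow_{Ex} \langle \Box,\ id,\ 1 \triangle 0.6 = 0.6 \rangle & \text{(Rule 2)}
\end{array}
$$

In the last step, the weak unifier of b with itself is the identity substitution and β = R(b, b) = 1, so the computed answer is 〈id, 0.6〉.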

To achieve a more condensed notation in the proofs, throughout the remainder of this section, we introduce the following notation: we write $\overline{o_n}$ for the sequence of syntactic objects $o_1, \ldots, o_n$; similarly, $\overline{\sigma_n}$ denotes the composition of substitutions $\sigma_1 \sigma_2 \cdots \sigma_n$.

A. Semantic Equivalence for BPL Expanded Programs

In the sequel, we prove the semantic equivalence between the WSLD rule, applied to a logic program Π, and the operational mechanism of Definition 5.1, when it is applied to the BPL expanded program Π′ (obtained from Π and generated by Definition 4.2).

First we recall a useful property of a similarity relation R proved by M. Sessa.

Proposition 5.2: [7, p. 397] Let R be a similarity relation and λ > 0 a cut value. For any substitution θ and terms t, t′, if R(t, t′) ≥ λ then R(tθ, t′θ) = R(t, t′) ≥ λ.

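Lemma 5.3: Given a program Π with a similarity relation R, an associated t-norm △, and a cut value λ ∈ (0, 1], let Π′ be the BPL expanded program. If there is a step S:

$$\langle (\leftarrow q(\overline{s_n}) \wedge Q'), \theta, \alpha \rangle \Rightarrow_{WSLD} \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \gamma \triangle \alpha \rangle$$

in Π, then there is a derivation $\langle \leftarrow q(\overline{s_n}) \wedge Q', \theta, \alpha \rangle \Rightarrow_{Ex}^{+} \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \gamma \triangle \alpha \rangle$ in Π′, which computes the same state.

PROOF. If there is a step S in Π, it is because there exists a graded rule $C = \langle p(\overline{t_n}) \leftarrow Q; \delta \rangle$ in Π such that

$$\sigma = wmgu^{\lambda}_{R}(q(\overline{s_n}), p(\overline{t_n})) = wmgu^{\lambda}_{R}(\{\overline{s_n \approx t_n}\}),$$

with approximation degree

$$\lambda' = R(q(\overline{s_n})\sigma, p(\overline{t_n})\sigma) = R(q, p) \triangle (\triangle_{i=1}^{n} R(s_i\sigma, t_i\sigma)) = \beta \triangle (\triangle_{i=1}^{n} \lambda'_i) \geq \lambda,$$

and γ = δ △ λ′ ≥ λ.

On the other hand, by Definition 4.2, if R(p, q) = β ≥ λ, there is an e-clause $q(\overline{x_n}) \leftarrow (\delta \triangle \beta) \wedge \overline{x_n \approx t_n} \wedge Q$ in Π′. Therefore, it is possible to construct the following derivation in Π′:

$$
\begin{array}{ll}
 & \langle (\leftarrow q(\overline{s_n}) \wedge Q'), \theta, \alpha \rangle \\
\Rightarrow_{Ex} & \langle (\leftarrow (\delta \triangle \beta) \wedge \overline{s_n \approx t_n} \wedge Q \wedge Q'), \theta, \alpha \rangle \\
\Rightarrow_{Ex} & \langle (\leftarrow \overline{s_n \approx t_n} \wedge Q \wedge Q'), \theta, (\delta \triangle \beta) \triangle \alpha \rangle \\
\Rightarrow_{Ex} & \langle (\leftarrow (s_2 \approx t_2 \wedge \ldots \wedge s_n \approx t_n \wedge Q \wedge Q')\sigma_1), \theta\sigma_1, (\delta \triangle \beta) \triangle \lambda'_1 \triangle \alpha \rangle \\
\ldots & \\
\Rightarrow_{Ex} & \langle (\leftarrow (s_n \approx t_n \wedge Q \wedge Q')\overline{\sigma_{n-1}}), \theta\overline{\sigma_{n-1}}, (\delta \triangle \beta) \triangle (\triangle_{i=1}^{n-1} \lambda'_i) \triangle \alpha \rangle \\
\Rightarrow_{Ex} & \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \gamma \triangle \alpha \rangle
\end{array}
$$

where $\sigma_1 = wmgu^{\lambda}_{R}(\{s_1 \approx t_1\})$; for 2 ≤ i ≤ n, $\sigma_i = wmgu^{\lambda}_{R}(\{s_i \approx t_i\}\overline{\sigma_{i-1}})$; and, thanks to Proposition 5.2, $R(s_i\overline{\sigma_i}, t_i\overline{\sigma_i}) = R(t_i\sigma, s_i\sigma) = \lambda'_i$.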

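Note that, when we apply the weak unification algorithm described in Section IV-A to the unification problem $\{\overline{s_n \approx t_n}\}$, we reach a successful configuration:

$$
\begin{array}{ll}
 & \langle \{s_1 \approx t_1, \ldots, s_n \approx t_n\}, id, 1 \rangle \\
\Rightarrow^{+} & \langle \{s_2 \approx t_2, \ldots, s_n \approx t_n\}\sigma_1, \sigma_1, \lambda'_1 \rangle \\
\Rightarrow^{+} & \langle \{s_3 \approx t_3, \ldots, s_n \approx t_n\}\overline{\sigma_2}, \overline{\sigma_2}, \lambda'_1 \triangle \lambda'_2 \rangle \\
\ldots & \\
\Rightarrow^{+} & \langle \{s_n \approx t_n\}\overline{\sigma_{n-1}}, \overline{\sigma_{n-1}}, \triangle_{i=1}^{n-1} \lambda'_i \rangle \\
\Rightarrow^{+} & \langle \emptyset, \overline{\sigma_n}, \triangle_{i=1}^{n} \lambda'_i \rangle
\end{array}
$$

and this weak unification process closely follows the steps performed with the subgoals $s_i \approx t_i$ in the former derivation with the BPL expanded program. Therefore, $\overline{\sigma_n} = wmgu^{\lambda}_{R}(\{\overline{t_n \approx s_n}\}) = \sigma$ and $R(\{\overline{t_n \approx s_n}\}) = \triangle_{i=1}^{n} \lambda'_i$. ∎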

The following proposition establishes a kind of completeness result where we prove that derivations in the original program using the WSLD resolution rule can be reproduced by the FuzzyDES operational mechanism in the transformed program.

Proposition 5.4: Given a program Π with a similarity relation R, let Π′ be the BPL expanded program. If there is a derivation D = (〈←Q, θ, α〉 ⇒WSLD* 〈←Q′, θ′, α′〉) in Π, then there is a derivation 〈←Q, θ, α〉 ⇒Ex* 〈←Q′, θ′, α′〉 in Π′, which computes the same state.

PROOF. By induction on the length of the derivation D and Lemma 5.3. ∎

Now we proceed by demonstrating the reverse of the last proposition, which constitutes a kind of soundness result.

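Lemma 5.5: Given a program Π with a similarity relation R, an associated t-norm △, and a cut value λ ∈ (0, 1], let Π′ be the BPL expanded program. If there is a derivation D′: $\langle \leftarrow p(\overline{s_n}) \wedge Q', \theta, \alpha \rangle \Rightarrow_{Ex}^{+} \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \lambda' \triangle \alpha \rangle$ in Π′, then there is a step S: $\langle \leftarrow p(\overline{s_n}) \wedge Q', \theta, \alpha \rangle \Rightarrow_{WSLD} \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \lambda' \triangle \alpha \rangle$ in Π.

PROOF. Note that we can assume that the shape of the derivation D′ is:

$$
\begin{array}{ll}
 & \langle (\leftarrow p(\overline{s_n}) \wedge Q'), \theta, \alpha \rangle \\
\Rightarrow_{Ex}^{+} & \langle (\leftarrow \beta \wedge (Q \wedge Q')\sigma), \theta\sigma, (\triangle_{i=1}^{n} \lambda'_i) \triangle \alpha \rangle \\
\Rightarrow_{Ex} & \langle \leftarrow (Q \wedge Q')\sigma, \theta\sigma, \lambda' \triangle \alpha \rangle
\end{array}
$$

where the first step is performed by Rule 3 and then it is followed by a sequence of applications of Rule 2, with $\lambda'_i = R(t_i\sigma, s_i\sigma)$ and $\sigma = wmgu^{\lambda}_{R}(\{\overline{t_n \approx s_n}\})$. Finally, a simplification step with Rule 1 is performed and the degree β is compounded with $\triangle_{i=1}^{n} \lambda'_i$ to obtain λ′.

If the first step of derivation D′ is possible, it is because there exists an e-clause $C' \equiv (p(\overline{x_n}) \leftarrow (\delta \triangle \beta) \wedge \overline{x_n \approx t_n} \wedge Q)$ in Π′ and there must be an entry R(q, p) = β ≥ λ in R. So, there exists a clause $C \equiv \langle q(\overline{t_n}) \leftarrow Q; \delta \rangle$ in Π, whose head weakly unifies with $p(\overline{s_n})$ and $wmgu^{\lambda}_{R}(q(\overline{t_n}), p(\overline{s_n})) = wmgu^{\lambda}_{R}(\{\overline{t_n \approx s_n}\}) = \sigma$ with approximation degree $R(q(\overline{t_n})\sigma, p(\overline{s_n})\sigma) = R(p, q) \triangle (\triangle_{i=1}^{n} R(t_i\sigma, s_i\sigma)) = \beta \triangle (\triangle_{i=1}^{n} \lambda'_i) = \lambda'$. Therefore, the WSLD step S is possible. ∎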

Proposition 5.6: Given a program Π with a similarity relation R, let Π′ be the BPL expanded program. If there exists a derivation D′ = (〈←Q, θ, α〉 ⇒Ex* 〈←Q′, θ′, α′〉) in Π′, then there exists a derivation 〈←Q, θ, α〉 ⇒WSLD* 〈←Q′, θ′, α′〉 in Π, which computes the same state.

PROOF. By induction on the length of the derivation D′. Without loss of generality we can assume that the steps in derivation D′ are conveniently ordered to allow the application of Lemma 5.5. ∎

The last proposition, together with Proposition 5.4, establishes the equivalence of both operational mechanisms and the correctness of our implementation.

B. Semantic Equivalence for FDES Expanded Programs

In this subsection we turn our attention to FDES expanded programs. The ultimate objective, as in the previous subsection, is to prove the equivalence between WSLD resolution applied to the source program Π and the operational mechanism of Definition 5.1 applied to an FDES expanded program Π′ obtained from Π. In this case, we proceed in an indirect way, studying the relation between these two types of transformed programs, and proving that they are semantically equivalent (modulo subsumed answers) with respect to the operational semantics for expanded programs of Definition 5.1.

Given a source program Π, we first prove that the BPL expanded program Π′ for Π results from the unfolding transformation of the corresponding FDES expanded program Π′′.

Program transformation is an optimization technique for computer programs that, from an initial program Π0, derives a sequence Π1, . . . , Πn of transformed programs by applying elementary transformation rules which improve the original program under some criteria.

Unfolding is a well-known semantics-preserving program transformation rule (first introduced in [27] to optimize functional programs). In essence, it is usually based on the application of operational steps on the body of program rules. The unfolding transformation is able to improve programs, generating more efficient code. Unfolding is the basis for developing sophisticated and powerful programming tools, such as folding/unfolding transformation systems and partial evaluators.

In our fuzzy framework, the unfolding transformation can be defined, similarly to [28], as follows:

Definition 5.7: Let Π′ be an expanded program and R : (A ← Q) ∈ Π′ a (non-unit) program rule. Then, the fuzzy unfolding of Π′ with respect to the rule R and a fixed selection rule is the new expanded program Π′′ = (Π′ ∪ U) \ {R} such that:

U = {Aσ ← α ∧ Q′ | 〈Q; id; 1〉 ⇒_Ex 〈Q′; σ; α〉}

In order to accelerate the unfolding transformation process, we allow a sequence of simplification steps to be performed after an unfolding step.
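As an informal illustration of Definition 5.7, the following Python sketch performs one unfolding step followed by simplification on a heavily simplified rule representation. It is an assumption-laden toy, not the DES implementation: the `Rule` type, the `unfold` function and the sample degrees are hypothetical; heads are reduced to predicate symbols because all heads of an expanded program have the form p(x_n), so σ = id; the remaining body items are opaque strings; the selected subgoal is always the first body item; and a rule whose body collapses to its own head is discarded because it yields no new answers.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    head: str       # predicate symbol of the head, e.g. "q" for q(x_n)
    degree: float   # truth degree annotating the body
    body: tuple     # remaining body items, left to right

def unfold(program, rule, t_norm):
    """Unfold `rule` on its first body item against every rule defining it,
    then return (program ∪ U) \\ {rule}."""
    selected, rest = rule.body[0], rule.body[1:]
    unfolded = []
    for r in program:
        if r.head != selected:
            continue
        # Simplification step: both degrees are compounded with the t-norm.
        new_rule = Rule(rule.head, t_norm(rule.degree, r.degree), r.body + rest)
        if new_rule.body == (new_rule.head,):
            continue  # q ← d ∧ q provides no new answers
        unfolded.append(new_rule)
    return [r for r in program if r != rule] + unfolded

product = lambda a, b: a * b

# Hypothetical FDES-style program: two rules for p and R(p, q) = 0.5.
p1 = Rule("p", 0.9, ("x~t1", "Q1"))
p2 = Rule("p", 1.0, ("x~t2", "Q2"))
r_pq = Rule("q", 0.5, ("p",))   # q(x_n) ← 0.5 ∧ p(x_n)
r_qp = Rule("p", 0.5, ("q",))   # p(x_n) ← 0.5 ∧ q(x_n)
pi1 = [p1, p2, r_pq, r_qp]
pi2 = unfold(pi1, r_pq, product)
```

In this sketch `pi2` no longer contains `r_pq`; instead it holds one rule for q per rule defining p, each annotated with the compounded degree.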

A BPL expanded program can be obtained by unfolding the FDES expanded program. The next proposition states this result.

Proposition 5.8: Given a program Π with a similarity relation R, an associated t-norm △, and a cut value λ ∈ (0, 1], let Π′ be the FDES expanded program. The BPL expanded program Π′′ can be obtained by unfolding Π′, after disregarding rules that produce subsumed answers.

PROOF. Without loss of generality, we can assume a program Π defining two predicates p and q, defined by k and l rules respectively, and equipped with a similarity relation R (characterized by the reflexive, symmetric, transitive t-closure of the entries R(p, q) = α, R(q, r) = β and R(p, r) = γ). Note that since the predicate r is not defined by rules, there are no e-clauses for it.

Π = { R_{p1} : 〈p(t^1_n) ← Q_1, δ_{p1}〉,
      . . .
      R_{pk} : 〈p(t^k_n) ← Q_k, δ_{pk}〉,
      R_{q1} : 〈q(s^1_n) ← B_1, δ_{q1}〉,
      . . .
      R_{ql} : 〈q(s^l_n) ← B_l, δ_{ql}〉 }

For this program, the FDES expanded program is:

Π1 = { R^1_{p1} : p(x_n) ← δ_{p1} ∧ x_n ≈ t^1_n ∧ Q_1,
       . . .
       R^1_{pk} : p(x_n) ← δ_{pk} ∧ x_n ≈ t^k_n ∧ Q_k,
       R_{p,q} : q(x_n) ← α ∧ p(x_n),
       R_{p,r} : r(x_n) ← γ ∧ p(x_n),
       R^1_{q1} : q(x_n) ← δ_{q1} ∧ x_n ≈ s^1_n ∧ B_1,
       . . .
       R^1_{ql} : q(x_n) ← δ_{ql} ∧ x_n ≈ s^l_n ∧ B_l,
       R_{q,p} : p(x_n) ← α ∧ q(x_n),
       R_{q,r} : r(x_n) ← β ∧ q(x_n) }

Now, we proceed to unfold Π1 with respect to R_{p,q} using a selection rule that selects the subgoal p(x_n) in the body of R_{p,q}. Then, for each rule R^1_{pi} in Π1:

〈α ∧ p(x_n), id, 1〉
  ⇒_Ex (by R^1_{pi}) 〈α ∧ δ_{pi} ∧ x_n ≈ t^i_n ∧ Q_i, id, 1〉
  ⇒_Ex^+ (by Simp) 〈x_n ≈ t^i_n ∧ Q_i, id, α △ δ_{pi}〉

leading, by Definition 5.7, to the rules:

q(x_n) ← (α △ δ_{pi}) ∧ x_n ≈ t^i_n ∧ Q_i

For the rule R_{q,p}, also defining p, we have the derivation

〈α ∧ p(x_n), id, 1〉
  ⇒_Ex (by R_{q,p}) 〈α ∧ α ∧ q(x_n), id, 1〉
  ⇒_Ex^+ (by Simp) 〈q(x_n), id, α〉

However, this last derivation leads to the recursive rule q(x_n) ← α ∧ q(x_n), which can be removed since it does not provide new answers. Therefore,

U1 = { R^2_{q1} : q(x_n) ← (α △ δ_{p1}) ∧ x_n ≈ t^1_n ∧ Q_1,
       . . .
       R^2_{qk} : q(x_n) ← (α △ δ_{pk}) ∧ x_n ≈ t^k_n ∧ Q_k }

With this first unfolding step we obtain an unfolded program Π2 = (Π1 ∪ U1) \ {R_{p,q}}. Note that the rule R_{p,q} in Π1 has been removed, and the rules of the BPL expanded program produced by the k rules defining p in Π and the entry R(p, q) = α (see Definition 4.2) have been added.

Starting from Π2, we can undertake a new unfolding step. In this case, we proceed to unfold Π2 with respect to R_{p,r}, again selecting the subgoal p(x_n) in the body of R_{p,r}. Then, for each rule R^1_{pi} in Π2:

〈γ ∧ p(x_n), id, 1〉
  ⇒_Ex (by R^1_{pi}) 〈γ ∧ δ_{pi} ∧ x_n ≈ t^i_n ∧ Q_i, id, 1〉
  ⇒_Ex^+ (by Simp) 〈x_n ≈ t^i_n ∧ Q_i, id, γ △ δ_{pi}〉

leading, by Definition 5.7, to the rules:

r(x_n) ← (γ △ δ_{pi}) ∧ x_n ≈ t^i_n ∧ Q_i

For the rule R_{q,p}, also defining p, we have the derivation

〈γ ∧ p(x_n), id, 1〉
  ⇒_Ex (by R_{q,p}) 〈γ ∧ α ∧ q(x_n), id, 1〉
  ⇒_Ex^+ (by Simp) 〈q(x_n), id, γ △ α〉

However, this last derivation leads to the recursive rule r(x_n) ← (γ △ α) ∧ q(x_n). It is important to note that β ≥ (γ △ α), thanks to the transitive property of the similarity R. Hence, we can remove the preceding unfolded rule, since it does not provide better answers than the rule R_{q,r}, which subsumes it. Therefore,

U2 = { R^2_{r1} : r(x_n) ← (γ △ δ_{p1}) ∧ x_n ≈ t^1_n ∧ Q_1,
       . . .
       R^2_{rk} : r(x_n) ← (γ △ δ_{pk}) ∧ x_n ≈ t^k_n ∧ Q_k }

and we obtain an unfolded program Π3 = (Π2 ∪ U2) \ {R_{p,r}}. Once again, the rule R_{p,r} has been removed, and the rules of the BPL expanded program produced by the k rules defining p in Π and the entry R(p, r) = γ (see Definition 4.2) have been added.

In a completely similar way, we can unfold Π3 w.r.t. R_{q,p}, giving the unfolded program Π4, and in a further unfolding transformation we can unfold Π4 w.r.t. R_{q,r}, giving the unfolded program Π5, where the rules R_{q,p} and R_{q,r} have been removed and the rules of the BPL expanded program produced by the l rules defining q in Π and the entries R(q, p) = α and R(q, r) = β (see Definition 4.2) have been added. As a result, Π5 is the BPL expanded program.

In summary, starting from the FDES expanded program Π1, we carry out successive unfolding transformation steps, leading to a sequence of unfolded programs Π2, Π3, Π4, Π5, which ends in the BPL expanded program for Π. §

Finally, we can prove that an FDES expanded program executed by the operational mechanism of Definition 5.1 is semantically equivalent to the original program executed by WSLD resolution.

Proposition 5.9: Given a program Π with a similarity relation R and a cut value λ ∈ (0, 1], let Π′ be the FDES expanded program. Π′, executed by the operational mechanism of Definition 5.1, is semantically equivalent to Π executed by WSLD resolution. That is, they produce the same fuzzy computed answers (modulo subsumed answers).

PROOF. Given a program Π and its corresponding FDES expanded program Π′, by Proposition 5.8, the BPL expanded program Π′′ is the unfolded program obtained from Π′ if some rules that produce subsumed answers are removed. First, note that unfolding is a semantics-preserving program transformation (preserving fuzzy computed answers). On the other hand, according to Propositions 5.4 and 5.6, the BPL expanded program Π′′, executed by the operational mechanism of Definition 5.1, is semantically equivalent to the original program executed by WSLD resolution. Therefore, we can affirm that an FDES expanded program executed by the operational mechanism of Definition 5.1 is semantically equivalent to the original program executed by WSLD resolution (modulo subsumed answers). §

VI. PERFORMANCE

This section highlights how different parameters of the system affect its performance. Solving time and data structure sizes are the elements of interest in the experiments. A number of tests have been executed with different parameters to analyse scalability. As the test platform, we used a Windows 10 64-bit OS on an Intel Xeon CPU E3-1505M v5 (4 physical cores) running at 2.8 GHz at its peak, with 16 GB RAM. DES (programmed in Prolog) is run with SICStus Prolog 4.3.1 64-bit in interpreted mode (no compiled executable). Solving time accounts only for solving, excluding, e.g., parsing and display. Statistics in DES are enabled to collect data such as timings, look-ups, and data structure sizes. This incurs a small but noticeable overhead on the system performance. For the fuzzy environment, we consider a similarity (transitive) relation ~ with a product t-norm.

Regarding solving parameters, we analyse the impact of the two different expansions BPL and FDES. The next two subsections focus on this: the first one on extensional predicate similarity, and the second one on intensional predicate similarity. We do not handle the case of constant similarity because it does not affect expansions, so that the same performance results would be retrieved for both. Subsection VI-C analyses the performance impact of enabling answer subsumption for both expansions. Finally, Subsection VI-D analyses the performance impact of specifying a λ-cut for both expansions.

A. Comparing Expansions for Extensional Predicate Similarity

A first experiment consists of examining the performance of the two expansions BPL and FDES for extensional predicates (consisting only of facts). Here, we consider n extensional predicates so that they are similar as follows: p_i/2 ~ p_{i+1}/2 = 0.5 for i ∈ {1, . . . , n − 1}. Each predicate is defined by m facts of the form p_i(c_{i+j−1}, c_{2i+j−1}), where i ∈ {1, . . . , n} and j ∈ {1, . . . , m}. For example, for n = 2 and m = 3, the resulting test program is:

p1/2~p2/2=0.5.
p1(c1,c2). p1(c2,c3). p1(c3,c4).
p2(c2,c4). p2(c3,c5). p2(c4,c6).

This way, a pair of constants which is not in a predicate p_{i1} but is in p_{i2} can be deduced for p_{i1} with an approximation degree less than one, and as small as the product distance between p_{i1} and p_{i2}. As parameter instances we selected n ∈ {2, 4, 6} and m ∈ {10, 100, 200, 500, 1000, 2000, 5000, 10000, 15000, 20000}. A failing goal p1(a,b) (which produces no answer tuples) has been selected with the aim of traversing all the rules in the database (instead of testing the performance of caching due to tabling), either directly through the predicate p1/2 or indirectly through the rest of the predicates via the similarity relation.
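The test programs of this experiment follow a regular pattern, so they can be produced mechanically. The Python sketch below is one possible generator written for illustration; the function name `extensional_test_program` and the layout of one line per predicate are assumptions, while the similarity equations and the fact pattern p_i(c_{i+j−1}, c_{2i+j−1}) are the ones described above.

```python
def extensional_test_program(n, m, degree=0.5):
    """Build the fuzzy Datalog test program for n extensional
    predicates with m facts each, as plain text."""
    # Similarity equations p_i/2 ~ p_{i+1}/2 = degree for i in 1..n-1.
    lines = [f"p{i}/2~p{i + 1}/2={degree}." for i in range(1, n)]
    # Facts p_i(c_{i+j-1}, c_{2i+j-1}) for j in 1..m, one line per predicate.
    for i in range(1, n + 1):
        facts = [f"p{i}(c{i + j - 1},c{2 * i + j - 1})." for j in range(1, m + 1)]
        lines.append(" ".join(facts))
    return "\n".join(lines)

program = extensional_test_program(2, 3)
```

For n = 2 and m = 3 the generated text coincides with the example program given for this experiment.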

Table I includes results for these parameters m and n, and for each combination of them, three measures are shown: columns “So.”, “Co.”, and “Rs.” respectively show the ratio of solving time, consult time, and number of rules. Ratios are always shown as the BPL measure with respect to the FDES measure. For each test configuration, 10 runs have been executed, and the best one has been selected.

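TABLE I
EXTENSIONAL PREDICATE SIMILARITY EXPANSIONS COMPARISON

| m | n = 2 So. | n = 2 Co. | n = 2 Rs. | n = 4 So. | n = 4 Co. | n = 4 Rs. | n = 6 So. | n = 6 Co. | n = 6 Rs. |
|---|---|---|---|---|---|---|---|---|---|
| 10 | 1.07 | 1.03 | 1.62 | 0.79 | 1.32 | 2.77 | 0.12 | 0.43 | 3.67 |
| 100 | 1.36 | 1.20 | 1.95 | 0.88 | 1.80 | 3.82 | 0.18 | 2.46 | 5.63 |
| 200 | 0.86 | 1.15 | 1.97 | 1.11 | 2.21 | 3.91 | 0.23 | 3.99 | 5.81 |
| 500 | 1.13 | 1.31 | 1.99 | 1.30 | 3.61 | 3.96 | 0.35 | 9.16 | 5.92 |
| 1,000 | 1.00 | 1.54 | 1.99 | 1.56 | 6.21 | 3.98 | 0.51 | 15.16 | 5.96 |
| 2,000 | 1.14 | 1.88 | 2.00 | 1.66 | 9.35 | 3.99 | 0.73 | 20.89 | 5.98 |
| 5,000 | 1.31 | 2.74 | 2.00 | 1.65 | 11.50 | 4.00 | 1.20 | 24.33 | 5.99 |
| 10,000 | 1.32 | 3.25 | 2.00 | 1.79 | 12.34 | 4.00 | 1.38 | 28.94 | 6.00 |
| 15,000 | 1.38 | 3.18 | 2.00 | 1.77 | 12.46 | 4.00 | 1.58 | 26.05 | 6.00 |
| 20,000 | 1.39 | 3.48 | 2.00 | 1.84 | 12.84 | 4.00 | 1.75 | 22.55 | 6.00 |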

For n = 2, the solving speed-up fluctuates for small m up to 2,000. Note that solving times for these values of m are rather low (less than 32 ms), so OS activity mostly interferes with these measurements. For larger values, a solving speed-up of up to 1.39× is obtained. Observe that only two extensional predicates are made similar. The ratio of rules increases until the asymptote 2.00 is reached (BPL generates twice as many rules as FDES). Before this, for small values of m, the threshold which corresponds to the fixed number of rules for computing the t-closure is noticeable when computing the ratio. For example, the number 1.62 is computed as (40 + 7)/(22 + 7), where 40 is the number of compiled rules for the predicates p1 and p2 for the BPL expansion, 22 is the number for FDES, and 7 is the number of rules needed to compute the t-closure (in this case, the t-closure delivers only 4 equations).

Considering higher values for n, an increasing solving speed-up greater than 1 for increasing m is found from m = 200 and m = 5,000 for n = 4 and n = 6, respectively. Recall that the FDES expansion converts any predicate (whether recursive or not) into a recursive predicate. However, the implementation of DES has no optimization for recursion, such as the differential semi-naïve optimization. This makes recursive predicates harder to solve than non-recursive ones, and as n increases, so does the number of recursive predicates. However, for given values of n and m there is a point at which this cost is overcome by the cost of handling a greater number of rules per predicate in the database. For example, for n = 4 and m = 200 there are 3,209 rules for BPL vs. 821 rules for FDES (a ratio of 3.91), and for n = 6 and m = 5,000 there are 180,011 rules for BPL vs. 30,041 rules for FDES (a ratio of 5.99).

The cost of consulting rules notably increases mainly because of two factors. First, the database is dynamic and not indexed, so that when adding new rules, the database must be sequentially explored. Second, PDG construction is implemented with an exponential-time algorithm. Following the results of this experiment, two issues are amenable to improvement: recursion optimization, and consulting time (rule traversal and PDG construction).
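The rule counts quoted in this discussion can be cross-checked with a short calculation. The Python sketch below rests on counting formulas inferred from the reported figures rather than taken from the DES implementation: n²·m compiled rules for BPL, n·m + n(n − 1) for FDES, plus a fixed number of t-closure rules. Only the value 7 for n = 2 is stated in the text; the values 9 and 11 for n = 4 and n = 6 are back-computed from the reported totals and should be read as assumptions, as should the names `rule_counts` and `T_CLOSURE_RULES`.

```python
# t-closure rule counts per n: 7 is stated in the text for n = 2;
# 9 and 11 are inferred from the reported totals (assumption).
T_CLOSURE_RULES = {2: 7, 4: 9, 6: 11}

def rule_counts(n, m):
    """Return (BPL rules, FDES rules) for n similar extensional
    predicates with m facts each, under the inferred formulas."""
    overhead = T_CLOSURE_RULES[n]
    bpl = n * n * m + overhead             # every fact copied for each similar predicate
    fdes = n * m + n * (n - 1) + overhead  # facts plus one rule per ordered similar pair
    return bpl, fdes

def rule_ratio(n, m):
    """Ratio reported in column Rs. (BPL with respect to FDES)."""
    bpl, fdes = rule_counts(n, m)
    return bpl / fdes

counts_small = rule_counts(2, 10)     # (40 + 7, 22 + 7)
counts_mid = rule_counts(4, 200)
counts_large = rule_counts(6, 5000)
```

Under these assumptions the sketch reproduces the totals quoted above and the corresponding Rs. entries of Table I, which suggests that the Rs. asymptote for each n is simply n.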

B. Comparing Expansions for Intensional Predicate Similarity

Following the comparison of both expansions, a second experiment considers a database composed of both extensional and intensional predicates (with non-empty rule bodies). We use the same parameters n and m as in the last subsection, and introduce a new parameter k. In contrast to the former experiment, here we focus on similarities between k intensional predicates out of n. We denote the k intensional predicates as p_1, . . . , p_k, while the n − k extensional predicates are denoted as p_{k+1}, . . . , p_n. The form of each intensional predicate clause is p_i(X_1,X_{n−k+1}) :- p_{k+1}(X_1,X_2), p_{k+2}(X_2,X_3), ..., p_n(X_{n−k},X_{n−k+1}), where i ∈ {1, . . . , k}. Hence, intensional predicate bodies are defined as calls to each extensional predicate, correlated with shared variables so that a call p_{k+l}(X_l,X_{l+1}) is followed by p_{k+l+1}(X_{l+1},X_{l+2}) for l < n − k. As for the intensional predicate head, the first (resp. second) head variable is the first (resp. last) variable in the body.

TABLE II
INTENSIONAL PREDICATE SIMILARITY EXPANSIONS COMPARISON

| m | k = 3 So. | k = 3 Co. | k = 5 So. | k = 5 Co. | k = 7 So. | k = 7 Co. |
|---|---|---|---|---|---|---|
| 10 | 0.00 | 0.96 | 0.00 | 1.98 | 0.40 | 103.43 |
| 100 | 1.00 | 0.96 | 0.94 | 1.28 | 0.00 | 30.80 |
| 200 | 0.27 | 1.00 | 1.00 | 1.14 | 3.00 | 16.94 |
| 500 | 2.67 | 1.05 | 0.44 | 1.06 | 1.23 | 7.11 |
| 1,000 | 1.94 | 1.01 | 1.47 | 1.06 | 1.00 | 3.82 |
| 2,000 | 0.95 | 0.99 | 1.26 | 1.00 | 2.07 | 2.14 |
| 5,000 | 1.28 | 1.03 | 1.56 | 1.06 | 1.34 | 1.27 |
| 10,000 | 1.13 | 0.99 | 1.33 | 1.03 | 1.51 | 0.99 |
| 15,000 | 1.38 | 1.00 | 1.37 | 0.99 | 1.70 | 1.05 |
| 20,000 | 1.35 | 0.97 | 1.41 | 1.02 | 1.67 | 1.01 |

As just noted, similarity is only specified for intensional predicates. We assign, axiomatically, proximity equations for the k intensional predicates. Specifically, we write p_i/2 ~ p_{i+1}/2 = 0.5 for i ∈ {1, . . . , k − 1}. On the other hand, the remaining n − k (extensional) predicates are defined as in the first experiment, i.e., each one is defined with m facts. For example, for n = 4, m = 2, k = 2, the resulting test program is:

p1/2 ~ p2/2 = 0.5.
p1(X1,X3) :- p3(X1,X2), p4(X2,X3).
p2(X1,X3) :- p3(X1,X2), p4(X2,X3).
p3(c1,c2). p3(c2,c3). p4(c1,c3). p4(c2,c4).

For the experiments we consider 10 predicates (n = 10), m ∈ {10, 100, 200, 500, 1000, 2000, 5000, 10000, 15000, 20000}, and k ∈ {3, 5, 7}. The same goal p1(a,b) has been considered.

Table II includes results for these parameters m and k, and for each combination of them, two measures are shown: “So.” and “Co.”, with the same meaning as before but with the ratios the other way round (the FDES measure with respect to the BPL measure). Observe that the number of rules in the expansion coincides for both BPL and FDES (we therefore omit the column “Rs.”), and ranges from 87 to 140,017 (including the rules for t-closure computation).

As in the first experiment, fluctuations occur for small solving times (less than 50 ms), which correspond to small values of m. In this case, focusing on larger values of m, solving times are better in BPL than in FDES, and increase as m grows. It is also observed that higher speed-ups are obtained for larger k (the number of intensional predicates). Although the number of expanded rules is the same in both cases, in FDES there is one mutually recursive rule for each intensional predicate instead of a non-recursive rule as in BPL. Because (mutually) recursive rules are harder to solve in DES, as already mentioned, this explains such behaviour. With respect to consult times, the ratio between both expansions converges to 1 for larger values of m because the same number of rules is generated. However, for small values of m, there is a timing threshold due to PDG construction, which follows a naïve algorithm. Again, recursive predicates make this algorithm perform badly. This is also the reason for omitting the PDG update when submitting queries, which are automatic views (that is, rules added to the database, so that the PDG should in principle be updated). For the queries posed, there is no need for a PDG update since, in particular, the stratification does not change.

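TABLE III
IMPACT OF ANSWER SUBSUMPTION

| n | m | k | l | Exp. | S. S-U | ET. Gain | Ans. Gain |
|---|---|---|---|---|---|---|---|
| 5 | 10 | 2 | 4 | BPL | 1 | 1 | 2 |
| 5 | 10 | 2 | 4 | FDES | 1,078 | 385 | 1,025 |
| 7 | 100 | 4 | 10 | BPL | 1 | 2 | 4 |
| 7 | 100 | 4 | 10 | FDES | 546 | 302 | 1,025 |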

C. Analysing Answer Subsumption

Subsection IV-F explained the concept of fuzzy answer subsumption to prune computations as an optimization of the naïve case. Here we test the benefits of this optimization with respect to both solving time and data size. In contrast to the former experiments, the goal must succeed and return answer tuples in order to examine the number of such tuples, the tuples in the extension table, and the solving time. Here we select the goal p1(c1,X). Regarding the test program, we reuse the one in the last subsection, which can be parametrized to specify the total number n of predicates, the number k of similar predicates, the number l of similar constants, and the number m of facts per extensional predicate.

Table III shows two test instances for (n, m, k, l) ∈ {(5, 10, 2, 4), (7, 100, 4, 10)} and the expansions BPL and FDES (column Exp.). For the optimized case (answer subsumption enabled), it includes the solving speed-up (column S. S-U) and the reduction of both the ET size (ET. Gain) and the answer tuples (Ans. Gain).

Whereas in the BPL expansion gains are not noticeable when enabling answer subsumption (because the BPL expansion avoids the generation of many redundant answers), in the FDES expansion great gains are achieved in solving time, ET size, and number of answers. Recall that the product t-norm was selected for the experiments. Then, as the FDES expansion transforms non-recursive predicates related by similarity into mutually recursive predicates, recursive cycles are built along fixpoint iterations, producing the same tuples with decreasing similarity degrees. Thus, this optimization is a huge advantage in this scenario.
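The effect described in the last paragraph can be pictured with a minimal Python sketch. It is only an illustration under a stated assumption, namely that an answer is subsumed by another answer for the same tuple with a greater or equal approximation degree; the function `subsume` and the sample data are hypothetical and do not reflect how the DES extension table is implemented.

```python
def subsume(answers):
    """Keep, for each tuple, only its best approximation degree.

    `answers` is an iterable of (tuple, degree) pairs; an answer is
    dropped when the same tuple is already known with a degree that is
    greater than or equal to its own."""
    best = {}
    for tup, degree in answers:
        if degree > best.get(tup, 0.0):
            best[tup] = degree
    return best

# Hypothetical cycle p -> q -> p with R(p, q) = 0.5 and the product t-norm:
# every trip around the cycle re-derives the same tuple scaled by 0.25.
derived = [(("c1", "c2"), 0.25 ** i) for i in range(4)]
kept = subsume(derived)
```

Here four derivations of the same tuple, with degrees 1, 0.25, 0.0625 and 0.015625, collapse into a single answer carrying the best degree, which is the kind of redundancy the FDES expansion generates and answer subsumption removes.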

D. Analysing λ-cut

Finally, this subsection analyses the effect of the λ-cut value for both expansions. Again, we consider the same benchmark and parameters as in the last subsection with (n, m, k, l) = (7, 100, 4, 30), that is, we specify 4 similar intensional predicates (out of 7, with 3 extensional predicates, each one with 100 facts) and 30 similar constants. This number of similar constants will highlight the effect of the λ-cut for both expansions. The goal is the same as in the last subsection.

Table IV includes the column λ-cut for different selected values, showing results for both expansions BPL and FDES, and for both solving time and ET size gains. These columns have the same meaning as in the last subsection.

Measures for the value 0.00 of λ-cut correspond to the worst case and are therefore compared to themselves (ratios of 1). For the next value (0.05, very close to 0.00) a great improvement is achieved (even when answer subsumption is enabled) since many computations derived from constant similarity are pruned. Note that in all cases the number of answers is the same and, thus, the column “Ans. Gain” is omitted. This improvement increases as the λ-cut does. For the last value, there are no further computation cuts and the gains remain constant. It is also worth noting that the gains are better in FDES than in BPL because more computations (due to the recursive part) are cut. Nevertheless, the computation time is still better in BPL.
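The pruning effect of the cut value can be sketched in a few lines of Python. This is a simplified picture under the assumption, consistent with the condition R(q, p) = β ≥ λ used in Section V, that a similarity entry takes part in a computation only if its degree reaches λ; the function `lambda_cut` and the sample relation are hypothetical.

```python
def lambda_cut(relation, cut):
    """Return the entries of a similarity relation whose degree is at
    least `cut`; entries below the cut cannot contribute to answers."""
    return {pair: degree for pair, degree in relation.items() if degree >= cut}

# Hypothetical similarity entries between constants.
relation = {
    ("c1", "c2"): 0.80,
    ("c1", "c3"): 0.20,
    ("c2", "c3"): 0.03,
    ("c3", "c4"): 0.01,
}
surviving = {cut: len(lambda_cut(relation, cut)) for cut in (0.0, 0.05, 0.25, 0.9)}
```

Even a cut value close to zero already discards the weakest entries, and raising it further leaves fewer and fewer candidates, up to the point where nothing else can be removed.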

VII. RELATED WORK

The idea of integrating concepts derived from fuzzy logic within a database is not new. In this section we give an abbreviated record of these achievements (see the supplementary material provided at IEEE Xplore for a more detailed study on fuzzy databases).

Buckles and Petry [29] fuzzify relational databases by associating similarity relations to the attribute domains in a database. In that way, they can weaken the equality relation in order to perform some kind of flexible querying through an extended relational algebra. Later on, Shenoi and Melton [30] extended this model to deal with proximity relations. Note that, although we share with these authors the use of similarity/proximity relations, the methods they use to represent and deal with the fuzzy component of a database differ by far from ours.

Umano [31] proposed an alternative model based on the concept of possibility distribution. It was generalized by Prade and Testemale [32] and became a mainstream approach in the area. FREDDI [10] synthesizes the most outstanding aspects of this prior approach while incorporating deductive capabilities. An extended implementation of the FREDDI architecture [11] adopts a clausal representation for the rules of the intensional database, and deductive capabilities are obtained by implementing a non-classical bottom-up algorithm inspired by Datalog techniques. Although this framework presents deductive characteristics, unlike ours, it requires the user to explicitly take into account the propagation of the truth degrees through an additional parameter that stores a “matching” degree, leading to a kind of hybrid system combining SQL and logic rules. In addition, though they claim that their proposal is based on two theoretical models, there is no theoretical provision for their combination. Moreover, this system does not seem to be available.

TABLE IV
IMPACT OF λ-CUT

| λ-cut | Exp. | S. S-U | ET. Gain |
|---|---|---|---|
| 0.00 | BPL | 1 | 1 |
| 0.00 | FDES | 1 | 1 |
| 0.05 | BPL | 93 | 12 |
| 0.05 | FDES | 120 | 11 |
| 0.10 | BPL | 146 | 19 |
| 0.10 | FDES | 190 | 16 |
| 0.25 | BPL | 247 | 33 |
| 0.25 | FDES | 316 | 27 |
| 0.50 | BPL | 370 | 73 |
| 0.50 | FDES | 492 | 60 |
| 0.75 | BPL | 494 | 367 |
| 0.75 | FDES | 738 | 443 |
| 0.90 | BPL | 494 | 367 |
| 0.90 | FDES | 738 | 443 |

The introduction of fuzzy features in classical deductive databases, with Datalog as a query language, is a more recent activity and has not produced so many examples. In this field, some of the work has consisted of the addition of confidence factors or truth degrees to the facts and rules of the extensional and the intensional databases, which are manipulated by fuzzy connectives (e.g., the work presented in [33]). However, we are interested in those works that introduce fuzzy characteristics within the framework of the Datalog language, primarily through the use of fuzzy relations. In this sense, the work of A. Achs et al. [9] can be considered a pioneering work, since it integrated the use of fuzzy relations in the framework of the Datalog language for the first time. They started from an extended Datalog language, with truth degrees in facts and rules, into which they incorporated a similarity relation on the domain of attributes. Unfortunately, and contrary to our proposal, this work has the following limitations: i) it only considers the similarity between the attributes of a relation, not between the relations themselves; and ii) Datalog programs undergo a transformation guided by the similarity relation, but this transformation is complex (and possibly inefficient) and does not adequately handle rules with non-variable arguments. Finally, to the best of our knowledge, there is no report of a publicly available implementation of their system.

The work of A. Achs et al. was developed in parallel with, and independently of, the works [34], [35] and [7]. There, similarity relations were introduced in a logic programming language, and the concept of unification by similarity was first developed. It is in this last stream, and especially in the works of M. Sessa, that we have found our inspiration. As within the Bousi∼Prolog language [5], [6], in the FuzzyDES system we make use of the unification algorithm proposed in [7]. However, there are some striking differences between the concepts presented in [7] and those we use:⁷ i) We adopt a different notion of weak unifier and weak most general unifier. ii) Also, we have defined a more appropriate concept of similar clause, which has a beneficial impact on the efficiency of the system we have implemented. iii) We have introduced some new peculiarities into the operational semantics of our language (such as the use of a threshold or a different notion of family of computed answers). iv) Finally, we have developed new implementation techniques for efficient execution of programs. These conceptual and practical differences must be added to the differences arising from the fact that FuzzyDES uses an operational mechanism based on tabling (a bottom-up, top-down-driven strategy), which has been adequately explained throughout this work.

VIII. CONCLUSIONS

This paper has introduced formal and practical aspects needed to transfer some techniques found in the fuzzy logic programming system Bousi∼Prolog into the deductive system DES. The new extension of DES, called FuzzyDES, defines a fuzzy Datalog language which is executed by the operational mechanism of Weak SLD resolution. Our implementation performs this task by compiling FuzzyDES programs into core Datalog in order to obtain a set of Datalog rules that allows the emulation of the mentioned Weak SLD resolution procedure. We have precisely defined two processes for generating such compiled programs, leading to what we call the BPL expanded program and the FDES expanded program. By using program transformation techniques, we have studied the relationships between both expansions. We have also proved the semantic equivalence between the WSLD rule, applied to a logic program Π, and the operational mechanism of Definition 5.1, when it is applied to a BPL or FDES expanded program Π′ obtained from Π. Moreover, we performed an experimental analysis of the system for performance and scalability, comparing the proposed expansions. We have provided a working system able to solve queries and commands on fuzzy deductive databases, and shown some of its features with an application example. Since the DES system has already been used for big data applications (deductive data warehouses and OLAP [36] and their performance [37]), we expect its fuzzy extension to be amenable to such applications as well. To the best of our knowledge, this is the only publicly available system implementing an interactive fuzzy deductive database with user-defined fuzzy relations.

⁷ See [6] for a more extensive discussion of these differences.

REFERENCES

[1] H. Nguyen and E. Walker, A First Course in Fuzzy Logic. Chapman & Hall/CRC, Boca Raton, Florida, 2000.

[2] J. Lloyd, Foundations of Logic Programming. Springer-Verlag, Berlin, 1987, second edition.

[3] S. Guadarrama, S. Muñoz, and C. Vaucheret, “Fuzzy Prolog: A new approach using soft constraints propagation,” Fuzzy Sets and Systems, Elsevier, vol. 144, no. 1, pp. 127–150, 2004.

[4] P. Vojtáš, “Fuzzy Logic Programming,” Fuzzy Sets and Systems, vol. 124, no. 1, pp. 361–370, 2001.

[5] C. Rubio-Manzano and P. Julián-Iranzo, “Fuzzy Linguistic Prolog and its Applications,” J Intell Fuzzy Syst, vol. 26, pp. 1503–1516, 2014.

[6] P. Julián-Iranzo and C. Rubio-Manzano, “A sound and complete semantics for a similarity-based logic programming language,” Fuzzy Sets and Systems, pp. 1–26, 2017.

[7] M. I. Sessa, “Approximate reasoning by similarity-based SLD resolution,” Theor Comput Sci, vol. 275, no. 1-2, pp. 389–426, 2002.

[8] J. D. Ullman, Database and Knowledge-Base Systems, Vols. I and II. Computer Science Press, 1988.

[9] A. Achs and A. Kiss, “Fuzzy extension of datalog,” Acta Cybernetica, vol. 12, no. 2, pp. 153–166, 1995.

[10] J. M. Medina, O. Pons, J. C. Cubero, and M. A. Vila, “FREDDI: A fuzzy relational deductive database interface,” International Journal of Intelligent Systems, vol. 12, no. 8, pp. 597–613, 1997.

[11] I. Blanco, “Deducción en base de datos relacionales difusas,” Ph.D. dissertation, Dep. of Comp. Sci. and AI, Univ. of Granada, Spain, 2001.

[12] F. Sáenz-Pérez, “DES: A Deductive Database System,” Electronic Notes on Theoretical Computer Science, vol. 271, pp. 63–78, March 2011.

[13] ——, “Implementing Tabled Hypothetical Datalog,” in Proc. of the 25th IEEE ICTAI, 2013, pp. 596–601.

[14] L.-C. Cheng and H.-A. Wang, “A fuzzy recommender system based on the integration of subjective preferences and objective information,” Applied Soft Computing, vol. 18, pp. 290–301, 2014.

[15] R. Caballero, Y. García-Ruiz, and F. Sáenz-Pérez, “A Theoretical Framework for the Declarative Debugging of Datalog Programs,” in Intl. Workshop on SDKB, ser. LNCS, vol. 4925. Springer, 2008, pp. 143–159.

[16] R. Caballero, Y. García-Ruiz, and F. Sáenz-Pérez, “Algorithmic Debugging of SQL Views,” in Ershov Informatics Conference (PSI’11), ser. LNCS, vol. 7162. Springer, 2011, pp. 77–85.

[17] ——, “Applying Constraint Logic Programming to SQL Test Case Generation,” in Proc. Intl. Symp. on FLOPS’10, ser. LNCS, vol. 6009. Springer, 2010, pp. 191–206.

[18] S. Ceri, G. Gottlob, and L. Tanca, “What you always wanted to know about Datalog (and never dared to ask),” IEEE Transactions on Knowledge and Data Engineering, vol. 1, no. 1, pp. 146–166, 1989.

[19] C. Zaniolo, S. Ceri, C. Faloutsos, R. T. Snodgrass, V. S. Subrahmanian, and R. Zicari, Advanced Database Systems. Morgan Kaufmann, 1997.

[20] S. W. Dietrich, “Extension tables: Memo relations in logic programming,” in IEEE Symp. on Logic Programming, 1987, pp. 264–272.

[21] S. Verbaeten, K. Sagonas, and D. De Schreye, “Termination proofs for logic programs with tabling,” ACM Transactions on Computational Logic, vol. 2, no. 1, pp. 57–92, 2001.

[22] T. Swift and D. S. Warren, “XSB: extending prolog with tabled logic programming,” TPLP, vol. 12, no. 1-2, pp. 157–187, 2012.

[23] S. Kundu, “An optimal O(N²) algorithm for computing the min-transitive closure of a weighted graph,” Information Processing Letters, vol. 74, no. 5, pp. 215–220, 2000.

[24] P. Julián-Iranzo, “A procedure for the construction of a similarity relation,” in Proc. of the 12th Intl. Conf. on IPMU, 2008, pp. 489–496.

[25] A. Silberschatz, H. Korth, and S. Sudarshan, Database Systems Concepts, 5th ed. New York, NY, USA: McGraw-Hill, Inc., 2006.

[26] A. Martelli and U. Montanari, “An Efficient Unification Algorithm,” ACM Tran. Progr. Lang. Sys., vol. 4, pp. 258–282, 1982.

[27] R. Burstall and J. Darlington, “A Transformation System for Developing Recursive Programs,” Journal of the ACM, vol. 24, no. 1, pp. 44–67, 1977.

[28] P. Julián, G. Moreno, and J. Penabad, “On Fuzzy Unfolding. A Multi-Adjoint Approach,” Fuzzy Sets and Systems, vol. 154, pp. 16–33, 2005.

[29] B. Buckles and F. Petry, “A fuzzy representation of data for relational databases,” Fuzzy Sets and Systems, vol. 7, pp. 213–226, 1982.

[30] S. Shenoi and A. Melton, “Proximity relations in the fuzzy relational database model,” Fuzzy Sets and Systems, vol. 31, pp. 285–296, 1989.

[31] M. Umano, “FREEDOM-O: A fuzzy database system,” in Fuzzy Information and Decision Processes, North-Holland, 1982, pp. 339–347.

[32] Prade and Testemale, “Generalizing database relational algebra for the treatment of incomplete/uncertain information and vague queries,” Information Science, vol. 34, pp. 115–143, 1984.

[33] K. Ježek and M. Zíma, “Efficient Evaluation of Fuzzy Deduction,” in Proc. of the 3rd ISM, 2000, pp. 45–50.

[34] F. Formato, G. Gerla, and M. I. Sessa, “Similarity-based unification,” Fundam. Inform., vol. 41, no. 4, pp. 393–414, 2000.

[35] F. A. Fontana and F. Formato, “A similarity-based resolution rule,” Int. J. Intell. Syst., vol. 17, no. 9, pp. 853–872, 2002.

[36] K. Rabuzin, “Deductive data warehouses,” International Journal of Data Warehousing and Mining (IJDWM), vol. 10, no. 1, pp. 16–31, 2014.

[37] ——, “Deductive data warehouses: testing performances,” in Proc. of CSSE, Wit Press, 2014, pp. 205–212.

