SeLeNe Kick-off Meeting 15-16/11/2002

SeLeNe-related Research At Birkbeck

Alex Poulovassilis and Peter T. Wood

Database and Web Technologies Group

School of Computer Science and Information Systems

Birkbeck, University of London


Research in CS & IS at Birkbeck

Main groups:

• Database and Web Technologies

• Computational Intelligence

• Bioinformatics

• Software Engineering

Main research funding sources: EPSRC, BBSRC, EU, Wellcome Trust, HEFCE, industry

URL http://www.dcs.bbk.ac.uk/~research/groups.html


Teaching in CS & IS at Birkbeck

• Foundation Degree in IT (part-time)

• BSc Computing (pt)

• BSc Information Systems and Management (pt)

• MSc Computing Science (ft and pt)

• PG Dip & MSc in e-commerce (ft and pt)

• MSc in Advanced Information Systems (ft and pt)

• MRes in Computer Science (ft and pt)

• MPhil/PhD in Computer Science (ft and pt)

URL http://www.dcs.bbk.ac.uk/~courses/


1. ECA Rules for XML

This is work by us in collaboration with James Bailey at Melbourne. It is currently being implemented by George Papamarkos, who has just started at Birkbeck as a research student and part-time RA on SeLeNe

XML repositories are increasingly being used in dynamic applications where actions need to be taken in a timely fashion in response to updates to the data

Thus, there is a need for reactive functionality on XML repositories:

event-condition-action (ECA) rules are a natural candidate


ECA Rules

ECA rules take the form: on event if condition do action

[Figure: Users/Apps, Event Detection, Condition Evaluation, Action Execution]


ECA rules in Active Databases

ECA rules in active relational databases are of the form

on insert/delete/update of a table

if SQL condition

do SQL statement(s)

When an insertion/deletion/update occurs, the DBMS provides a set of instantiations for the variables $new and $old

These variables can be used within the condition and action parts of rules
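As a rough illustration of this rule form, the sketch below uses SQLite through Python's sqlite3 module. SQLite writes the two variables as NEW and OLD rather than $new and $old, and the table and trigger names (stock, audit, low_stock) are invented for the example.

```python
import sqlite3

def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stock (product TEXT PRIMARY KEY, qty INTEGER);
        CREATE TABLE audit (product TEXT, old_qty INTEGER, new_qty INTEGER);

        -- event: update of stock; condition: quantity fell; action: log it
        CREATE TRIGGER low_stock AFTER UPDATE ON stock
        WHEN NEW.qty < OLD.qty
        BEGIN
            INSERT INTO audit VALUES (NEW.product, OLD.qty, NEW.qty);
        END;
    """)
    conn.execute("INSERT INTO stock VALUES ('p1', 10)")
    conn.execute("UPDATE stock SET qty = 7 WHERE product = 'p1'")  # condition true
    conn.execute("UPDATE stock SET qty = 9 WHERE product = 'p1'")  # condition false
    rows = conn.execute("SELECT * FROM audit").fetchall()
    conn.close()
    return rows
```

Both updates raise the event, but only the first satisfies the condition, so run_demo() returns a single audit row.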


ECA in Active Databases

ECA rules are used in conventional data warehouses for

• generation and incremental maintenance of materialised views

• checking integrity constraints

• performing automatic repairs when violations are detected

• maintaining audit trails of the data

• maintaining statistics of data warehouse performance and usage

By analogy, ECA rules can be used to provide similar functionality on semi-structured data such as XML and RDF.


Our ECA Language for XML

In our WWW2002 and Computer Networks 2002 papers, we present a language for defining ECA rules on XML

Rather than introducing yet another language for XML, we use fragments of the XPath and XQuery languages within the event, condition and action parts of our ECA rules

This allows leverage of ongoing work on XPath and XQuery


Our ECA Language for XML

The event part of an ECA rule is of the form

INSERT e

or

DELETE e

where e is a simple XPath expression

Simple XPath disallows the use of any axis other than the child, parent, self, or descendant-or-self axes, and the use of all functions other than document()


Our ECA Language - Events

In a rule event part of the form

INSERT e

the XPath expression e evaluates to a set of nodes

The rule is triggered if this set of nodes includes any node that has been inserted by the most recent update on the XML database

The set of instantiations for the variable $delta is the set of new nodes returned by e
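A minimal sketch of this triggering test, assuming Python's xml.etree.ElementTree stands in for the XML database and that the set of nodes inserted by the most recent update is handed to us as a list. The helper name delta_for_event is hypothetical, and the path is written relative to the root element because ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

def delta_for_event(root, path, inserted_nodes):
    """Return the $delta instantiations: nodes selected by `path` that were
    also inserted by the most recent update (compared by identity)."""
    inserted = {id(n) for n in inserted_nodes}
    return [n for n in root.findall(path) if id(n) in inserted]

stores = ET.fromstring(
    '<stores><store id="s1"><product id="p1"/></store></stores>'
)

# Most recent update: insert <product id="p2"/> below store s1.
new_product = ET.SubElement(stores.find("store"), "product", {"id": "p2"})

# Event part: INSERT /stores/store/product
delta = delta_for_event(stores, "store/product", [new_product])
triggered = bool(delta)   # the rule is triggered iff $delta is non-empty
```

Here the existing product p1 is selected by the path but is not new, so only p2 becomes a value of $delta.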


Our ECA Language - Events

Similarly, in a rule event part of the form

DELETE e

the XPath expression e evaluates to a set of nodes

The rule is triggered if this set of nodes includes any node that has been deleted by the most recent update on the XML database

The set of instantiations for the variable $delta is the set of deleted nodes returned by e


Our ECA Language - Conditions

The condition part of a rule is either

• TRUE, or

• one or more simple XPath expressions connected by and, or, not

A rule’s actions are executed on each XML document which has been changed by an event of the form specified in the rule's event part, and are executed for each value of $delta for which the rule's condition is True


Our ECA Language - Actions

Each rule action is of the form

INSERT r BELOW e

or

DELETE e

where e is a simple XPath expression and r is a simple XQuery expression

Simple XQuery disallows the use of full FLWR expressions, essentially permitting only the Return part of an expression.


An Example

An XML database containing two documents s.xml and p.xml:

s.xml:

    <stores>
      <store id="s1">
        <location>...</location>
        <manager>...</manager>
        <product id="p1"/>
        <product id="p2"/>
        ...
      </store>
      ...
    </stores>

p.xml:

    <products>
      <product id="p1">
        <name>...</name>
        <price>...</price>
        <store id="s1"/>
        <store id="s2"/>
        ...
      </product>
      ...
    </products>


Example (cont’d)

If one or more products are added to a store in s.xml, this rule appends that store to the children of those products in p.xml if it’s not already a child:

Rule 1:

on INSERT document('s.xml')/stores/store/product

if not (document('p.xml')/products/product[@id=$delta/@id]/store[@id=$delta/../@id])

do INSERT <store id='{$delta/../@id}'/> BELOW document('p.xml')/products/product[@id=$delta/@id]
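The behaviour of Rule 1 can be imitated in a few lines of Python. This is only a hand-written simulation under stated assumptions, not the rule engine itself: the two documents are held as ElementTree trees, inserted_products stands in for the set of new nodes the database would supply, and a parent map replaces the parent axis, which ElementTree does not offer.

```python
import xml.etree.ElementTree as ET

def rule1(stores, products, inserted_products):
    """Imitate Rule 1 for products just inserted below stores in s.xml."""
    parent_of = {child: parent for parent in stores.iter() for child in parent}
    for delta in stores.findall("store/product"):
        if not any(delta is n for n in inserted_products):
            continue                                  # not a new node
        store_id = parent_of[delta].get("id")         # $delta/../@id
        for product in products.findall("product"):
            if product.get("id") != delta.get("id"):  # product[@id=$delta/@id]
                continue
            already = any(s.get("id") == store_id
                          for s in product.findall("store"))
            if not already:                           # rule condition
                ET.SubElement(product, "store", {"id": store_id})  # rule action

s_xml = ET.fromstring('<stores><store id="s1"/><store id="s2"/></stores>')
p_xml = ET.fromstring(
    '<products><product id="p1"><store id="s1"/></product></products>'
)

# Update: add product p1 to both stores in s.xml.
new_nodes = [ET.SubElement(store, "product", {"id": "p1"})
             for store in s_xml.findall("store")]
rule1(s_xml, p_xml, new_nodes)
linked = [s.get("id") for s in p_xml.find("product").findall("store")]
```

For the value of $delta under store s1 the condition is false, because p1 already lists s1, so only store s2 is appended.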


Example (cont’d)

In a symmetric way, if one or more stores are added to a product in p.xml, this rule appends that product to the children of those stores in s.xml if it’s not already a child:

Rule 2:

on INSERT document('p.xml')/products/product/store

if not (document('s.xml')/stores/store[@id=$delta/@id]/product[@id=$delta/../@id])

do INSERT <product id='{$delta/../@id}'/> BELOW document('s.xml')/stores/store[@id=$delta/@id]


ECA Rule Analysis

We have also developed techniques for analysing the triggering and activation dependencies between our XML ECA rules, described in the two papers mentioned earlier

These analysis techniques are also useful beyond ECA rules, since they generally determine the effects of updates upon queries.

They can therefore also be used to analyse the effects of other, not necessarily rule-initiated, updates made to an XML repository, e.g.

• to determine if integrity constraints may have been violated, or

• whether materialised views need to be re-calculated.
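To give a feel for what a triggering dependency is, the sketch below builds a naive triggering graph for Rule 1 and Rule 2. It is a deliberately crude approximation and not the analysis from the papers: each rule is reduced to a hand-written summary of the nodes its event watches and the nodes its action inserts, predicates are dropped, and paths are compared by plain equality. The names Rule, triggering_graph and may_cycle are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    event: tuple    # (operation, path of the nodes the event part watches)
    effect: tuple   # (operation, path of the nodes the action inserts/deletes)

def triggering_graph(rules):
    """Edge r -> s when r's action may raise the event that s watches."""
    return {r.name: [s.name for s in rules if r.effect == s.event]
            for r in rules}

def may_cycle(graph):
    """Depth-first search for a cycle in the triggering graph."""
    state = {}
    def visit(node):
        if state.get(node) == "open":
            return True            # reached a node still on the search path
        if state.get(node) == "done":
            return False
        state[node] = "open"
        found = any(visit(nxt) for nxt in graph[node])
        state[node] = "done"
        return found
    return any(visit(n) for n in graph)

rules = [
    Rule("Rule 1", ("INSERT", "s.xml/stores/store/product"),
                   ("INSERT", "p.xml/products/product/store")),
    Rule("Rule 2", ("INSERT", "p.xml/products/product/store"),
                   ("INSERT", "s.xml/stores/store/product")),
]
graph = triggering_graph(rules)
```

The graph contains a cycle between the two rules, which only says that they may trigger each other. In this example the rule conditions stop the cascade: when Rule 1 fires Rule 2, the element Rule 2 would insert is already present, so its condition is false.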


Relation to SeLeNe

Similarly, we are planning to define an ECA rule language for RDF as part of the SeLeNe project

We need to specify the syntax and semantics of:

• queries (for rule conditions),

• updates (for rule actions), and

• events (for rule event parts)

e.g. as fragments of the FORTH RDF suite’s RQL language (and the planned extensions to it with update facilities for SeLeNe)


Relation to SeLeNe

George Papamarkos will implement a prototype RDF ECA rule execution engine

Within the SeLeNe architecture, such RDF ECA rules could be used to materialise views and to propagate changes from source learning objects to derived learning objects

Also, GP will work on developing techniques for automatically generating such ECA rules from declarative view specifications (cf. earlier such techniques developed for relational databases)


2. The AutoMed Project

In work with Peter McBrien, AP has developed a new framework to support integration of heterogeneous data sources

The theoretical foundation of the framework consists of:

• a new notion of schema equivalence

• a set of primitive schema transformations which can be composed to define unconditional or conditional equivalences between schemas


The AutoMed Project

The modelling constructs of higher-level data models (e.g. relational, object-oriented, semi-structured, XML, RDF) are specified in terms of a low-level hypergraph data model (HDM)

The specification of a modelling construct C automatically generates addC, delC and renC primitive schema transformations

add and del transformations have as an argument a query

Composite schema transformations consist of a sequence of primitive transformations, and allow constructs from different modelling languages to be mixed within the same intermediate schema


Query and Data Translation

Schema transformations set up a two-way transformation pathway between pairs of schemas:

From a pathway T: S → S’ we:

• compose the queries in the add steps to derive a definition of each construct in S’ as a view over S, and

• compose the queries in the del steps to derive a definition of each construct in S as a view over S’

These view definitions can then be used to automatically translate data and queries between S and S’. The process generalises to a set of local schemas being integrated into a global schema
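The bookkeeping behind these two bullets can be sketched as follows. This is an illustrative data structure, not the AutoMed API: Step, views and reverse are invented names, queries are kept as opaque strings, and the example constructs and query texts are hypothetical. The sketch only collects the queries; unfolding queries that mention intermediate constructs, which true composition needs, is omitted. The reverse function reflects one reading of "two-way", namely that a pathway is reversed by swapping add with del and undoing renames.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    kind: str        # "add", "del" or "ren"
    construct: str   # construct added, deleted or renamed
    arg: str         # query for add/del, new name for ren

def views(pathway):
    """Collect the queries carried by the add and del steps of S -> S'."""
    target_over_source = {s.construct: s.arg for s in pathway if s.kind == "add"}
    source_over_target = {s.construct: s.arg for s in pathway if s.kind == "del"}
    return target_over_source, source_over_target

def reverse(pathway):
    """Pathway S' -> S: reverse the order, swap add and del, undo renames."""
    flipped = []
    for s in reversed(pathway):
        if s.kind == "ren":
            flipped.append(Step("ren", s.arg, s.construct))
        else:
            kind = "del" if s.kind == "add" else "add"
            flipped.append(Step(kind, s.construct, s.arg))
    return flipped

# Hypothetical pathway: S has `person`, S' has `staff` and `student`.
pathway = [
    Step("add", "staff", "person where role = 'staff'"),
    Step("add", "student", "person where role = 'student'"),
    Step("del", "person", "staff union student"),
]
to_target, to_source = views(pathway)
```

Here to_target defines the constructs of S’ over S and to_source defines the deleted construct of S over S’, which is the information a translator between the two schemas would start from.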


Both-As-View integration

Our schema transformation pathways capture at least the information available from the global-as-view (GAV) or local-as-view (LAV) approaches

We discuss this in a forthcoming paper (ICDE’03) and term our integration approach both-as-view (BAV)

Unlike GAV and LAV, our framework readily supports the evolution of both local and global schemas (CAiSE’02, ICDE’03)


Unstructured Text Sources

As well as integrating structured and semi-structured data sources, we are also working on extracting structure from unstructured text sources – Dean Williams

We are using existing IE technology (the GATE tool from Sheffield) for text annotation. Natural language and domain ontologies will extend these annotations.

The extracted information will be matched with existing structured information to derive new facts and perhaps new global schema constructs


Materialised integration

Finally, as well as virtual integration of data sources, we are also investigating using the AutoMed framework for materialised data integration i.e. a data warehousing approach

In particular, we are looking at incremental view maintenance and data lineage tracing using the AutoMed schema transformation pathways – Hao Fan
