SeLeNe Kick-off Meeting 15-16/11/2002
SeLeNe-related Research At Birkbeck
Alex Poulovassilis and Peter T. Wood
Database and Web Technologies Group
School of Computer Science and Information Systems
Birkbeck, University of London
Research in CS & IS at Birkbeck
Main groups:
• Database and Web Technologies
• Computational Intelligence
• Bioinformatics
• Software Engineering
Main research funding sources: EPSRC, BBSRC, EU, Wellcome Trust, HEFCE, industry
URL http://www.dcs.bbk.ac.uk/~research/groups.html
Teaching in CS & IS at Birkbeck
• Foundation Degree in IT (part-time)
• BSc Computing (pt)
• BSc Information Systems and Management (pt)
• MSc Computing Science (ft and pt)
• PG Dip & MSc in e-commerce (ft and pt)
• MSc in Advanced Information Systems (ft and pt)
• MRes in Computer Science (ft and pt)
• MPhil/PhD in Computer Science (ft and pt)
URL http://www.dcs.bbk.ac.uk/~courses/
1. ECA Rules for XML
This is work by us in collaboration with James Bailey at Melbourne. It is currently being implemented by George Papamarkos, who has just started at Birkbeck as a research student and part-time RA on SeLeNe
XML repositories are increasingly being used in dynamic applications where actions need to be taken in a timely fashion in response to updates to the data
Thus, there is a need for reactive functionality on XML repositories:
event-condition-action (ECA) rules are a natural candidate
ECA Rules
ECA rules take the form: on event if condition do action
[Diagram: Users/Apps, Event Detection, Condition Evaluation, Action Execution]
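The on/if/do form can be illustrated with a minimal sketch in Python. This is not the system described in these slides: the names Rule, Engine and signal, and the product payloads, are hypothetical and chosen only to show event detection, condition evaluation and action execution in sequence.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    event: str                        # "on" part: name of the event to react to
    condition: Callable[[Any], bool]  # "if" part: evaluated on the event payload
    action: Callable[[Any], None]     # "do" part: run only if the condition holds


class Engine:
    """Hypothetical toy engine: detect event, evaluate condition, execute action."""

    def __init__(self):
        self.rules = []

    def signal(self, event, payload):
        for rule in self.rules:
            # Event detection, then condition evaluation, then action execution.
            if rule.event == event and rule.condition(payload):
                rule.action(payload)


log = []
engine = Engine()
engine.rules.append(
    Rule("insert", lambda p: p["price"] > 100, lambda p: log.append(p["id"]))
)

engine.signal("insert", {"id": "p1", "price": 150})  # event and condition match
engine.signal("insert", {"id": "p2", "price": 50})   # condition is false
engine.signal("delete", {"id": "p3", "price": 500})  # event does not match
```

Only the first signal reaches the action, so log ends up holding the single id p1.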
ECA rules in Active Databases
ECA rules in active relational databases are of the form
on insert/delete/update of a table
if SQL condition
do SQL statement(s)
When an insertion/deletion/update occurs, the DBMS provides a set of instantiations for the variables $new and $old
These variables can be used within the condition and action parts of rules
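As a concrete illustration, SQLite triggers follow this pattern, with NEW and OLD playing the role of $new and $old. The sketch below uses Python's sqlite3 module; the tables product and price_audit and the trigger audit_price are invented for the example and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id TEXT PRIMARY KEY, price REAL);
CREATE TABLE price_audit (id TEXT, old_price REAL, new_price REAL);

-- on update of product / if the price changed / do record an audit row
CREATE TRIGGER audit_price
AFTER UPDATE OF price ON product
FOR EACH ROW
WHEN NEW.price <> OLD.price
BEGIN
    INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
END;
""")

conn.execute("INSERT INTO product VALUES ('p1', 10.0)")
conn.execute("UPDATE product SET price = 12.5 WHERE id = 'p1'")  # condition true
conn.execute("UPDATE product SET price = 12.5 WHERE id = 'p1'")  # condition false

audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
```

The second UPDATE leaves the price unchanged, so the WHEN condition fails and only one audit row is written.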
ECA in Active Databases
ECA rules are used in conventional data warehouses for
• generation and incremental maintenance of materialised views
• checking integrity constraints
• performing automatic repairs when violations are detected
• maintaining audit trails of the data
• maintaining statistics of data warehouse performance and usage
By analogy, ECA rules can be used to provide similar functionality on semi-structured data such as XML and RDF.
Our ECA Language for XML
In our WWW2002 and Computer Networks 2002 papers, we present a language for defining ECA rules on XML
Rather than introducing yet another language for XML, we use fragments of the XPath and XQuery languages within the event, condition and action parts of our ECA rules
This lets us build on ongoing work on XPath and XQuery
Our ECA Language for XML
The event part of an ECA rule is of the form
INSERT e
or
DELETE e
where e is a simple XPath expression
Simple XPath disallows the use of any axis other than the child, parent, self, or descendant-or-self axes, and the use of all functions other than document()
Our ECA Language - Events
In a rule event part of the form
INSERT e
the XPath expression e evaluates to a set of nodes
The rule is triggered if this set of nodes includes any node that has been inserted by the most recent update on the XML database
The set of instantiations for the variable $delta is the set of new nodes returned by e
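A rough sketch of this triggering test, using Python's xml.etree.ElementTree in place of a real XML database. The function delta_for_insert and the explicit list of inserted nodes are assumptions made for illustration; how the actual implementation tracks inserted nodes is not stated here.

```python
import xml.etree.ElementTree as ET

stores = ET.fromstring(
    '<stores><store id="s1"><product id="p1"/></store>'
    '<store id="s2"/></stores>'
)

# The most recent update: two nodes are inserted under store s1.
s1 = stores.find("./store[@id='s1']")
new_product = ET.SubElement(s1, "product", {"id": "p2"})
new_manager = ET.SubElement(s1, "manager")
inserted = [new_product, new_manager]


def delta_for_insert(root, path, inserted_nodes):
    """Return the nodes selected by `path` that the last update inserted."""
    inserted_ids = {id(node) for node in inserted_nodes}
    return [node for node in root.findall(path) if id(node) in inserted_ids]


# Event part: INSERT /stores/store/product (written relative to the root element).
delta = delta_for_insert(stores, "./store/product", inserted)
triggered = bool(delta)
```

Here the path selects both products under s1, but only p2 was inserted by the update, so $delta contains just that node and the rule is triggered.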
Our ECA Language - Events
Similarly, in a rule event part of the form
DELETE e
the XPath expression e evaluates to a set of nodes
The rule is triggered if this set of nodes includes any node that has been deleted by the most recent update on the XML database
The set of instantiations for the variable $delta is the set of deleted nodes returned by e
Our ECA Language - Conditions
The condition part of a rule is either
• TRUE, or
• one or more simple XPath expressions connected by and, or, not
A rule’s actions are executed on each XML document that has been changed by an event of the form specified in the rule's event part, once for each value of $delta for which the rule's condition is true
Our ECA Language - Actions
Each rule action is of the form
INSERT r BELOW e
or
DELETE e
where e is a simple XPath expression and r is a simple XQuery expression
Simple XQuery disallows the use of full FLWR expressions, essentially permitting only the Return part of an expression.
An Example
An XML database containing two documents s.xml and p.xml:
s.xml:
<stores>
  <store id="s1">
    <location>...</location>
    <manager>...</manager>
    <product id="p1"/>
    <product id="p2"/>
    ...
  </store>
  ...
</stores>

p.xml:
<products>
  <product id="p1">
    <name>...</name>
    <price>...</price>
    <store id="s1"/>
    <store id="s2"/>
    ...
  </product>
  ...
</products>
Example (cont’d)
If one or more products are added to a store in s.xml, this rule appends that store to the children of those products in p.xml if it’s not already a child:
Rule 1:
on INSERT document('s.xml')/stores/store/product
if not (document('p.xml')/products/
product[@id=$delta/@id]/store[@id=$delta/../@id])
do INSERT <store id='{$delta/../@id}'/>
BELOW document('p.xml')/products/product[@id=$delta/@id]
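The effect of Rule 1 can be imitated in Python with xml.etree.ElementTree. This is a hand-written simulation under stated assumptions, not the rule engine itself: the function rule1 is hypothetical, the two documents are tiny in-memory stand-ins for s.xml and p.xml, and product ids are assumed to be unique within p.xml.

```python
import xml.etree.ElementTree as ET

s_doc = ET.fromstring('<stores><store id="s1"/><store id="s2"/></stores>')
p_doc = ET.fromstring(
    '<products><product id="p1"><store id="s2"/></product></products>'
)


def rule1(s_root, p_root, delta_nodes):
    """Apply Rule 1 once per newly inserted <product> node in s.xml."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in s_root.iter() for child in parent}
    for delta in delta_nodes:
        product_id = delta.get("id")           # $delta/@id
        store_id = parent_of[delta].get("id")  # $delta/../@id
        for product in p_root.findall(f"./product[@id='{product_id}']"):
            # Condition: the store is not already a child of this product.
            if product.find(f"./store[@id='{store_id}']") is None:
                # Action: INSERT <store id='...'/> BELOW the product.
                ET.SubElement(product, "store", {"id": store_id})


# Update: product p1 is added to store s1 in s.xml.
s1 = s_doc.find("./store[@id='s1']")
delta = [ET.SubElement(s1, "product", {"id": "p1"})]

rule1(s_doc, p_doc, delta)
rule1(s_doc, p_doc, delta)  # the condition is now false, so nothing is added

store_ids = [s.get("id") for s in p_doc.findall("./product[@id='p1']/store")]
```

After the first call, product p1 in p.xml lists stores s2 and s1; the second call shows that the condition prevents a duplicate insertion.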
Example (cont’d)
In a symmetric way, if one or more stores are added to a product in p.xml, this rule appends that product to the children of those stores in s.xml if it’s not already a child:
Rule 2:
on INSERT document('p.xml')/products/product/store
if not (document('s.xml')/stores/
store[@id=$delta/@id]/product[@id=$delta/../@id])
do INSERT <product id='{$delta/../@id}'/>
BELOW document('s.xml')/stores/store[@id=$delta/@id]
ECA Rule Analysis
We have also developed techniques for analysing the triggering and activation dependencies between our XML ECA rules, described in the two papers mentioned earlier
These analysis techniques are also useful beyond ECA rules, since in general they determine the effects of updates upon queries.
So they can also be used for analysing the effects of other, not necessarily rule-initiated, updates made to an XML repository, e.g.
• to determine if integrity constraints may have been violated, or
• whether materialised views need to be re-calculated.
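To give a flavour of what a triggering dependency is, the sketch below builds a triggering graph for Rule 1 and Rule 2 using a deliberately crude syntactic test. It is not the analysis from the papers: RuleSummary, may_trigger and the predicate-stripping comparison are assumptions made for illustration, and only INSERT actions are considered.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSummary:
    name: str
    event_op: str       # "INSERT" or "DELETE"
    event_path: str     # XPath in the event part
    action_op: str      # "INSERT" or "DELETE"
    action_target: str  # XPath after BELOW
    inserted_tag: str   # element name created by the action


def strip_predicates(path):
    """Drop [...] predicates so that paths can be compared step by step."""
    return re.sub(r"\[[^\]]*\]", "", path)


def may_trigger(a, b):
    """Crude test: could a's INSERT action create a node matched by b's event?"""
    if a.action_op != "INSERT" or b.event_op != "INSERT":
        return False
    produced = strip_predicates(a.action_target) + "/" + a.inserted_tag
    return produced == strip_predicates(b.event_path)


def triggering_graph(rules):
    return {a.name: [b.name for b in rules if may_trigger(a, b)] for a in rules}


def has_cycle(graph):
    """Depth-first search for a cycle in the triggering graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


rule1 = RuleSummary(
    "Rule 1", "INSERT", "document('s.xml')/stores/store/product",
    "INSERT", "document('p.xml')/products/product[@id=$delta/@id]", "store",
)
rule2 = RuleSummary(
    "Rule 2", "INSERT", "document('p.xml')/products/product/store",
    "INSERT", "document('s.xml')/stores/store[@id=$delta/@id]", "product",
)

graph = triggering_graph([rule1, rule2])
cyclic = has_cycle(graph)
```

The graph is cyclic, so this syntactic test alone cannot rule out non-termination. In this particular pair the rule conditions stop the cascade: once Rule 1 has inserted the store under the product, Rule 2's condition finds the product already present under that store and is false.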
Relation to SeLeNe
Similarly, we are planning to define an ECA rule language for RDF as part of the SeLeNe project
We need to specify the syntax and semantics of:
• queries (for rule conditions),
• updates (for rule actions), and
• events (for rule event parts)
e.g. as fragments of the FORTH RDF Suite’s RQL language (and the planned extensions to it with update facilities for SeLeNe)
Relation to SeLeNe
George Papamarkos will implement a prototype RDF ECA rule execution engine
Within the SeLeNe architecture, such RDF ECA rules could be used to materialise views and to propagate changes from source learning objects to derived learning objects
Also, GP will work on developing techniques for automatically generating such ECA rules from declarative view specifications (cf. earlier such techniques developed for relational databases)
2. The AutoMed Project
In work with Peter McBrien, AP has developed a new framework to support integration of heterogeneous data sources
The theoretical foundation of the framework consists of:
• a new notion of schema equivalence
• a set of primitive schema transformations which can be composed to define unconditional or conditional equivalences between schemas
The AutoMed Project
The modelling constructs of higher-level data models (e.g. relational, object-oriented, semi-structured, XML, RDF) are specified in terms of a low-level hypergraph data model (HDM)
The specification of a modelling construct C automatically generates addC, delC and renC primitive schema transformations
add and del transformations have as an argument a query
Composite schema transformations consist of a sequence of primitive transformations, and allow constructs from different modelling languages to be mixed within the same intermediate schema
Query and Data Translation
Schema transformations set up a two-way transformation pathway between pairs of schemas:
From a pathway T: S → S’ we:
• compose the queries in the add steps to derive a definition of each construct in S’ as a view over S, and
• compose the queries in the del steps to derive a definition of each construct in S as a view over S’
These view definitions can then be used to automatically translate data and queries between S and S’. The process generalises to a set of local schemas being integrated into a global schema
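The idea of translating data in both directions along a pathway can be sketched in Python as follows. This is a toy model, not AutoMed's API: schemas are reduced to dictionaries from construct names to sets, queries are plain Python functions, and the reversal rule used here (swap add and del while keeping the query, invert renames, reverse the order) is an assumption about how the two-way pathway is obtained. The person/male/female schema is invented for the example.

```python
def apply_pathway(pathway, extents):
    """Translate data along a pathway; `extents` maps construct name -> set."""
    result = dict(extents)
    for kind, construct, arg in pathway:
        if kind == "add":
            # arg is a query over the constructs currently in the schema.
            result[construct] = arg(result)
        elif kind == "del":
            # arg records how the removed construct can be recovered.
            del result[construct]
        elif kind == "ren":
            # arg is the new name of the construct.
            result[arg] = result.pop(construct)
    return result


def reverse_pathway(pathway):
    """Derive the opposite pathway (assumed rule: add <-> del, invert ren)."""
    reversed_steps = []
    for kind, construct, arg in reversed(pathway):
        if kind == "add":
            reversed_steps.append(("del", construct, arg))
        elif kind == "del":
            reversed_steps.append(("add", construct, arg))
        else:
            reversed_steps.append(("ren", arg, construct))
    return reversed_steps


# Hypothetical pathway from S = {person} to S' = {male, female}.
pathway = [
    ("add", "male", lambda e: {n for (n, s) in e["person"] if s == "M"}),
    ("add", "female", lambda e: {n for (n, s) in e["person"] if s == "F"}),
    ("del", "person",
     lambda e: {(n, "M") for n in e["male"]} | {(n, "F") for n in e["female"]}),
]

source = {"person": {("ann", "F"), ("bob", "M")}}
target = apply_pathway(pathway, source)
round_trip = apply_pathway(reverse_pathway(pathway), target)
```

The queries in the add steps define male and female as views over S, and the query in the del step defines person as a view over S’, so the data survives a round trip in this example.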
Both-As-View integration
Our schema transformation pathways capture at least the information available from the global-as-view (GAV) or local-as-view (LAV) approaches
We discuss this in a forthcoming paper (ICDE’03) and term our integration approach both-as-view (BAV)
Unlike GAV and LAV, our framework readily supports the evolution of both local and global schemas (CAiSE’02, ICDE’03)
Unstructured Text Sources
As well as integrating structured and semi-structured data sources, we are also working on extracting structure from unstructured text sources (Dean Williams)
We are using existing IE technology (the GATE tool from Sheffield) for text annotation. Natural language and domain ontologies will extend these annotations.
The extracted information will be matched with existing structured information to derive new facts and perhaps new global schema constructs
Materialised integration
Finally, as well as virtual integration of data sources, we are also investigating using the AutoMed framework for materialised data integration, i.e. a data warehousing approach
In particular, we are looking at incremental view maintenance and data lineage tracing using the AutoMed schema transformation pathways (Hao Fan)