Date post: | 11-May-2015 |
Upload: | fariz-darari |
Completeness Statements about RDF Data Sources and Their Use for Query Answering
Fariz Darari, joint work with Werner Nutt, Giuseppe Pirrò, and Simon Razniewski
KRDB, Free University of Bozen-Bolzano, Italy
Thousands of RDF data sources are today available on the Web.
Machine-readable qualitative descriptions of their content are crucial.
We focus on data completeness, an important aspect of data quality.
How to formalize and express in a machine-readable way
completeness information about RDF data sources?
How to leverage such completeness information?
1. Formal framework for expressing completeness information.
2. Study of query completeness from completeness information in various settings.
Completeness statement on the Web
Users visiting this source can prefer it to other sources.
However, the completeness statement verified as complete is
only human readable!
Why is DBpedia not complete for the query?
The completeness statement in DBpedia says that it is complete for Tarantino’s movies (dv:st1). However, the query asks about all movies for which Tarantino is the director, and also an actor.
It is not stated that DBpedia includes all the actors of Tarantino’s movies. Therefore, DBpedia is possibly not complete for this query.
Why is LinkedMDB complete?
The completeness statements in LMDB say that it is complete for Tarantino’s movies (lv:st1) and also for the actors of those movies (lv:st2).
Implementation
http://rdfcorner.wordpress.com
Query completeness in a single data source scenario
lv:st2 c:hasCondition [c:subject [spin:varName "m"];
c:predicate rdf:type; c:object schema:Movie].
lv:st2 c:hasCondition [c:subject [spin:varName "m"];
c:predicate schema:director; c:object dbp:Tarantino].
dv:dbpdataset rdf:type void:Dataset.
dv:dbpdataset c:hasComplStmt dv:st1.
dv:st1 c:hasPattern [c:subject [spin:varName "m"];
c:predicate rdf:type; c:object schema:Movie].
dv:st1 c:hasPattern [c:subject [spin:varName "m"];
c:predicate schema:director; c:object dbp:Tarantino].
SELECT ?m
WHERE {?m rdf:type schema:Movie.
?m schema:director dbp:Tarantino.
?m schema:actor dbp:Tarantino}
Select all the movies for which
Tarantino is the director and also an actor
DBpedia (SPARQL endpoint) is complete for all Tarantino's movies.
The answer is incomplete.
LinkedMDB (SPARQL endpoint) is complete for all Tarantino's movies and also movies for which he is an actor.
The answer is complete.
@prefix c: <http://inf.unibz.it/ontologies/completeness#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix spin: <http://spinrdf.org/sp#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dv: <http://dbpedia.org/void/> .
@prefix lv: <http://linkedmdb.org/void/> .
@prefix dbp: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
lv:st2 c:hasPattern [c:subject [spin:varName "m"];
c:predicate schema:actor; c:object [spin:varName "a"]].
Endpoint IRI: DBPe
Endpoint IRI: LMDBe
lv:lmdbdataset c:hasComplStmt lv:st1.
lv:st1 c:hasPattern [c:subject [spin:varName "m"];
c:predicate rdf:type; c:object schema:Movie ].
lv:st1 c:hasPattern [c:subject [spin:varName "m"];
c:predicate schema:director; c:object dbp:Tarantino].
lv:lmdbdataset rdf:type void:Dataset.
lv:lmdbdataset c:hasComplStmt lv:st2.
For each completeness statement, all the triple patterns defined via hasPattern are collected into a set P1 and all the triple patterns defined via hasCondition are collected into a set P2. A completeness statement is interpreted as:
CONSTRUCT {P1} WHERE {P1 . P2}
When a data source has a completeness statement (defined via hasComplStmt), it means that if the query above is evaluated over an “ideal” graph then all the results are in the data source.
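The interpretation above can be illustrated with a small Python sketch. It is not the CoRner implementation: triples are plain string tuples, terms starting with "?" stand for variables, and the "ex:" resources and the helper names (match, evaluate, construct) are made up for illustration.

```python
# Triples and triple patterns are 3-tuples of strings; terms starting
# with "?" are variables. All "ex:" names below are made-up sample data.

def match(pattern, triple, binding):
    """Return `binding` extended so that `pattern` equals `triple`, or None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if extended.setdefault(p, t) != t:
                return None  # variable already bound to a different term
        elif p != t:
            return None      # constant does not match
    return extended

def evaluate(patterns, graph):
    """Return all variable bindings under which every pattern occurs in graph."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for triple in graph:
                m = match(pattern, triple, b)
                if m is not None:
                    extended.append(m)
        bindings = extended
    return bindings

def construct(p1, p2, graph):
    """Evaluate CONSTRUCT {P1} WHERE {P1 . P2} over graph."""
    result = set()
    for b in evaluate(list(p1) + list(p2), graph):
        for pattern in p1:
            result.add(tuple(b.get(term, term) for term in pattern))
    return result

# A hypothetical "ideal" graph: everything that holds in the real world.
ideal = {
    ("ex:m1", "rdf:type", "schema:Movie"),
    ("ex:m1", "schema:director", "dbp:Tarantino"),
    ("ex:m1", "schema:actor", "ex:a1"),
    ("ex:m2", "rdf:type", "schema:Movie"),
    ("ex:m2", "schema:director", "ex:d2"),
}

# dv:st1: two hasPattern triple patterns, no hasCondition.
st1_p1 = [("?m", "rdf:type", "schema:Movie"),
          ("?m", "schema:director", "dbp:Tarantino")]
required = construct(st1_p1, [], ideal)

# A source satisfies the statement if it contains every required triple.
source = {
    ("ex:m1", "rdf:type", "schema:Movie"),
    ("ex:m1", "schema:director", "dbp:Tarantino"),
}
satisfies = required <= source
```

Here the statement only obliges the source to hold the type and director triples of ex:m1; the actor triple and everything about ex:m2 may be missing without violating it.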
SPARQL queries with OPT
Completeness with RDFS inference
Completeness statement on the Semantic Web
Extensions
CoRner: Completeness Reasoner
Given a query Q and a data source with completeness statements S:
1. Create a template answer graph GQ of Q.
2. Over GQ, evaluate all CONSTRUCT queries derived from S.
3. Check whether GQ can be obtained after the evaluation.
If yes, the query is complete; otherwise it might be incomplete.
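The three steps can be sketched in Python for basic graph pattern queries. This is an illustrative sketch under simplifying assumptions, not CoRner's code: triples are string tuples, "?"-prefixed terms are variables, the template graph is built by replacing each variable with a "frozen:" constant, and all function names are hypothetical.

```python
def match(pattern, triple, binding):
    """Return `binding` extended so that `pattern` equals `triple`, or None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended

def construct(p1, p2, graph):
    """Evaluate CONSTRUCT {P1} WHERE {P1 . P2} over graph."""
    bindings = [{}]
    for pattern in list(p1) + list(p2):
        bindings = [m for b in bindings for triple in graph
                    if (m := match(pattern, triple, b)) is not None]
    return {tuple(b.get(term, term) for term in pattern)
            for b in bindings for pattern in p1}

def freeze(term):
    # Assumption: a variable becomes a fresh constant in the template graph.
    return "frozen:" + term[1:] if term.startswith("?") else term

def is_query_complete(query, statements):
    # Step 1: template answer graph GQ of Q.
    gq = {tuple(freeze(term) for term in pattern) for pattern in query}
    # Step 2: evaluate every CONSTRUCT query derived from S over GQ.
    derived = set()
    for p1, p2 in statements:
        derived |= construct(p1, p2, gq)
    # Step 3: the query is complete if all of GQ is obtained again.
    return gq <= derived

is_movie = ("?m", "rdf:type", "schema:Movie")
by_tarantino = ("?m", "schema:director", "dbp:Tarantino")

# Q: movies directed by Tarantino in which he is also an actor.
query = [is_movie, by_tarantino, ("?m", "schema:actor", "dbp:Tarantino")]

# Each statement is a pair (hasPattern patterns, hasCondition patterns).
dbpedia = [([is_movie, by_tarantino], [])]                      # dv:st1
lmdb = [([is_movie, by_tarantino], []),                         # lv:st1
        ([("?m", "schema:actor", "?a")], [is_movie, by_tarantino])]  # lv:st2

dbpedia_complete = is_query_complete(query, dbpedia)
lmdb_complete = is_query_complete(query, lmdb)
```

With these inputs the DBpedia statements cannot reproduce the actor triple of GQ, so the query might be incomplete there, while the two LinkedMDB statements together reproduce all of GQ.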
Semantics of completeness statements
Checking query completeness
Context Problem Contributions
lv:st1 c:hasCondition
[c:subject [spin:varName "m"]; c:predicate rdf:type; c:object schema:Movie].
lv:st1 c:hasCondition
[c:subject [spin:varName "m"]; c:predicate schema:director; c:object dbp:Tarantino].
lv:st1 c:hasPattern
[c:subject [spin:varName "m"]; c:predicate schema:actor; c:object [spin:varName "a"]].
lv:lmdbdataset rdf:type void:Dataset.
lv:lmdbdataset c:hasComplStmt lv:st1.
LMDB is complete for all Tarantino’s movies and all their actors.
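To make the link between such a statement and its CONSTRUCT reading concrete, the following Python sketch renders a statement as query text. The function name and the tuple encoding of triple patterns are assumptions made for this example only.

```python
def statement_to_construct(has_pattern, has_condition):
    """Render a completeness statement as CONSTRUCT {P1} WHERE {P1 . P2}."""
    def render(patterns):
        return " ".join(" ".join(p) + " ." for p in patterns)
    head = render(has_pattern)
    body = render(list(has_pattern) + list(has_condition))
    return "CONSTRUCT { " + head + " } WHERE { " + body + " }"

# The statement shown above: actors (hasPattern) of movies directed by
# Tarantino (hasCondition).
query_text = statement_to_construct(
    [("?m", "schema:actor", "?a")],
    [("?m", "rdf:type", "schema:Movie"),
     ("?m", "schema:director", "dbp:Tarantino")],
)
print(query_text)
```

The condition patterns appear only in the WHERE clause, so the statement asserts completeness of the actor triples without asserting that the type and director triples themselves are complete.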
Federated query completeness
SPARQL queries with negations and comparisons
Live, Web-based CoRner
Work In Progress
Empirical evaluation of query completeness checking