Test-driven Evaluation of Linked Data Quality
Dimitris Kontokostas (1,4), Patrick Westphal (1), Sören Auer (2),
Sebastian Hellmann (1,4), Jens Lehmann (1), Roland Cornelissen (3),
Amrapali Zaveri (1)
(1) AKSW, University of Leipzig
(2) University of Bonn
(3) Stichting Bibliotheek.nl
(4) DBpedia Association
2014-04-11
Kontokostas et al. (WWW2014) TD-LD Quality Evaluation 2014-04-11 1 / 20
Problem Definition
Unprecedented volume of structured data on the Web
Datasets are of varying quality
LOD contains many crowd-sourced datasets with good coverage, but often poor, non-uniform quality
OWL schemas are often not sufficiently developed or exploited for quality evaluation
Motivation
Quality is fitness for use
Key to the success of the Data Web
Major barrier for industry adoption
Methodology inspired by test-driven software development
Vocabularies, ontologies and knowledge bases should be accompanied by a number of test cases, which help to ensure a basic level of quality
Test-Driven Development (Software)
Test case: input on which the program under test is executed during testing
Test suite: a set of test cases for testing a program
Status: Success or Fail (Error)
Test cases are implemented largely manually or with limited programmatic support
H. Zhu, P. A. V. Hall, and J. H. R. May. Software unit test coverage and adequacy. ACM Comput. Surv., 29(4):366–427, 1997.
Test-Driven Development (RDF)
Test case: a data constraint that involves one or more triples
Test suite: a set of test cases for testing a dataset
Status: Success, Fail, Timeout (complexity) or Error (e.g. network)
Fail: Error, warning or notice
RDF: basis for both data and schema
The unified model facilitates automatic test case generation
SPARQL serves as the test case definition language
Example test case
A person should never have a birth date after a death date
Test cases are written in SPARQL
SELECT ?s WHERE {
  ?s dbo:birthDate ?v1 .
  ?s dbo:deathDate ?v2 .
  FILTER (?v1 > ?v2) }
We query for errors
Success: Query returns empty result set
Fail: Query returns results
Every result we get is a violation instance
Timeout / Error: needs further investigation of SPARQL engine capabilities, query syntax or query complexity
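The query-for-errors semantics above can be sketched in plain Python. This is a toy in-memory triple list, not the RDFUnit implementation; the resource names (ex:alice, ex:bob) are illustrative:

```python
# Sketch of "querying for errors": an empty result means Success,
# every returned row is a violation instance.
triples = [
    ("ex:alice", "dbo:birthDate", "1901-05-02"),
    ("ex:alice", "dbo:deathDate", "1880-01-01"),  # death before birth -> violation
    ("ex:bob",   "dbo:birthDate", "1950-03-10"),
    ("ex:bob",   "dbo:deathDate", "2000-07-21"),
]

def birth_after_death_violations(triples):
    """Emulates: SELECT ?s WHERE { ?s dbo:birthDate ?v1 .
                                   ?s dbo:deathDate ?v2 .
                                   FILTER (?v1 > ?v2) }"""
    births = {s: o for s, p, o in triples if p == "dbo:birthDate"}
    deaths = {s: o for s, p, o in triples if p == "dbo:deathDate"}
    # ISO-8601 date strings compare correctly as plain strings
    return [s for s in births if s in deaths and births[s] > deaths[s]]

violations = birth_after_death_violations(triples)
status = "Success" if not violations else "Fail"
```

Each element of `violations` corresponds to one violation instance reported for the test case.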
Patterns & Bindings
Data Quality Test Patterns (DQTPs): abstract patterns, which can be further refined into concrete data quality test cases using test pattern bindings
Existing library of 20 patterns (DBpedia mailing lists since 2008)

SELECT ?s WHERE {
  ?s %%P1%% ?v1 .
  ?s %%P2%% ?v2 .
  FILTER (?v1 %%OP%% ?v2) }

Bindings: mapping of variables to valid pattern replacements

P1 => dbo:birthDate   |  SELECT ?s WHERE {
P2 => dbo:deathDate   |    ?s dbo:birthDate ?v1 .
OP => >               |    ?s dbo:deathDate ?v2 .
                      |    FILTER (?v1 > ?v2) }
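A minimal sketch of how a binding turns a DQTP into a concrete test case, assuming the `%%VAR%%` placeholder syntax from the slides is implemented as simple string substitution (the helper name `instantiate` is illustrative, not RDFUnit's API):

```python
# Comparison pattern from the slides, with %%VAR%% placeholders.
COMP_PATTERN = """SELECT ?s WHERE {
  ?s %%P1%% ?v1 .
  ?s %%P2%% ?v2 .
  FILTER (?v1 %%OP%% ?v2) }"""

def instantiate(pattern, bindings):
    """Replace every %%VAR%% placeholder with its bound value."""
    query = pattern
    for var, value in bindings.items():
        query = query.replace(f"%%{var}%%", value)
    return query

# The birth/death binding yields the concrete test case.
query = instantiate(COMP_PATTERN, {"P1": "dbo:birthDate",
                                   "P2": "dbo:deathDate",
                                   "OP": ">"})
```

Any other comparable property pair (e.g. start/end dates) can reuse the same pattern with a different binding.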
Test Auto Generators (TAGs)
RDF(S) & OWL (partial) support
Query the schema for supported axioms

SELECT DISTINCT ?T1 ?T2 WHERE {
  ?T1 owl:disjointWith ?T2 . }

For every result, a binding to a pattern is generated and a test case instantiated
Supported axioms at the moment:
RDFS: domain & range
OWL: minCardinality, maxCardinality, cardinality, functionalProperty, InverseFunctionalProperty, disjointClass, propertyDisjointWith, AsymmetricProperty and deprecated
TAG Example
INVFUNC pattern
SELECT ?s WHERE {
  ?a %%P1%% ?resource .
  ?b %%P1%% ?resource .
  FILTER (?a != ?b) }

Test Auto Generators => query the schema & generate bindings, e.g. for owl:InverseFunctionalProperty

SELECT DISTINCT ?P1 ?MESSAGE WHERE {
  { ?P1 rdf:type owl:InverseFunctionalProperty . }
  UNION
  { ?P1 rdfs:subPropertyOf+ owl:InverseFunctionalProperty }
}

Bindings => for every result of a TAG, bind values to a pattern and instantiate test cases, e.g. bind foaf:homepage to %%P1%%
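The TAG step can be sketched as follows, using a toy in-memory schema instead of a SPARQL endpoint (the function name and data layout are illustrative; only the FOAF/OWL terms come from the slides):

```python
# INVFUNC pattern from the slides.
INVFUNC_PATTERN = """SELECT ?s WHERE {
  ?a %%P1%% ?resource .
  ?b %%P1%% ?resource .
  FILTER (?a != ?b) }"""

# Toy schema: RDFUnit would obtain this via the TAG's SPARQL query.
schema = [
    ("foaf:homepage", "rdf:type", "owl:InverseFunctionalProperty"),
    ("foaf:mbox",     "rdf:type", "owl:InverseFunctionalProperty"),
    ("foaf:name",     "rdf:type", "owl:DatatypeProperty"),
]

def generate_invfunc_tests(schema):
    """For each declared inverse-functional property, bind it to the
    INVFUNC pattern and return one instantiated test case per property."""
    props = [s for s, p, o in schema
             if p == "rdf:type" and o == "owl:InverseFunctionalProperty"]
    return {p: INVFUNC_PATTERN.replace("%%P1%%", p) for p in props}

tests = generate_invfunc_tests(schema)
```

Each generated query then runs against the dataset; any two distinct subjects sharing a value for the property are reported as violations.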
Test Coverage (1/2)
Coverage computation function f : Q → 2^E
Takes a SPARQL query q ∈ Q corresponding to a test case pattern binding as input and returns a set of entities.
Property domain coverage (dom): identifies the ratio of property occurrences where a test case is defined for verifying domain restrictions of the property.

F'(QS, D) = Σ_{p ∈ F(QS)} pfreq(p), where pfreq(p) is the frequency of a property p

f_dom returns the set of all properties p such that the triple pattern (?s, p, ?o) occurs in q and there is at least one other triple pattern using ?s in q.
Property range coverage (ran): identifies the ratio of property occurrences where a test case is defined for verifying range restrictions of the property (similar to dom).
Test Coverage (2/2)
Property dependency coverage (pdep): ratio of property occurrences where a test case is defined for verifying dependencies with other properties.
At least two different properties
Property cardinality coverage (card): ratio of property occurrences where a test case is defined for verifying the cardinality of the property.
GROUP BY ?s/o and HAVING(count(?s/o) <op> <number>)
Class instance coverage (mem): ratio of classes with test cases regarding class membership.
(?s, rdf:type, c)
Class dependency coverage (cdep): ratio of class occurrences for which test cases verifying relationships with other classes are defined.
At least two different classes
Cov(QS, D) = 1/6 (F'_dom(QS, D) + F'_ran(QS, D) + F'_pdep(QS, D) + F'_card(QS, D) + F'_mem(QS, D) + F'_cdep(QS, D))
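A toy illustration of the frequency-weighted coverage components and their average. The numbers are invented, and for brevity all six components are treated uniformly as covered-property sets, which oversimplifies the class-based metrics (mem, cdep):

```python
def weighted_coverage(covered, freq):
    """F'(QS, D): frequency-weighted share of occurrences that are
    covered by at least one test case."""
    total = sum(freq.values())
    return sum(freq[p] for p in covered if p in freq) / total

# Invented occurrence frequencies in a toy dataset.
freq = {"dbo:birthDate": 60, "dbo:deathDate": 30, "dbo:name": 10}

components = {
    "dom":  weighted_coverage({"dbo:birthDate", "dbo:deathDate"}, freq),  # 0.9
    "ran":  weighted_coverage({"dbo:birthDate"}, freq),                   # 0.6
    "pdep": weighted_coverage({"dbo:birthDate", "dbo:deathDate"}, freq),  # 0.9
    "card": weighted_coverage(set(), freq),                               # 0.0
    "mem":  weighted_coverage(set(freq), freq),                           # 1.0
    "cdep": weighted_coverage(set(), freq),                               # 0.0
}

# Cov(QS, D): unweighted mean of the six components.
cov = sum(components.values()) / 6
```

Frequent properties dominate each component, so covering a rarely used property barely moves the score, matching the occurrence-ratio reading of the metrics above.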
Schema Enrichment
Optionally run schema enrichment on the dataset (e.g. DL-Learner)
Get new axioms (filter manually)
Run TAGs on the axioms and get additional automatic test cases
[Diagram: 3-phase enrichment learning approach (DL-Learner)]
Input: Entity URI, Axiom Type, Knowledge Base (SPARQL Endpoint)
1. Obtain schema information (Reasoner / SPARQL Endpoint; only executed once per knowledge base)
2. Obtain axiom type and entity-specific data (background knowledge + relevant instance data; sample data if necessary; optional invocation)
3. Run machine learning algorithm (Learner: DL-Learner)
Output: list of axiom suggestions + metadata
Iterate over all axiom types and schema entities for full enrichment
Test Case Elicitation Workflow
Test case elicitation
Test cases depend on a schema or a dataset
Automatic & Manual test cases are reusable
A dataset can be tested against a number of schemas
e.g. dbo, foaf, skos
Linked Open Vocabularies (http://lov.okfn.org):
Describes 400 vocabularies in RDF (prefix, URI, description, etc.)
Run TAGs on all vocabularies
32,293 unique reusable test cases (10/2013)
Used for prefix dereferencing
Evaluation
Implemented the methodology in the RDFUnit tool (formerly Databugger)
Tested on 3 crowd-sourced and 2 library datasets
dbpedia.org (owl, dbo, foaf, dcterms, dc, skos, geo, prov)
nl.dbpedia.org (owl, dbo, foaf, dcterms, dc, skos, geo, prov)
linkedgeodata.org (ngeo, spatial, lgdo, dcterms, gsp, owl, geo, skos, foaf)
id.loc.gov (owl, foaf, dcterms, skos, mads, mrel, premis)
datos.bne.es (owl, frbrer, isbd, dcterms and skos)
Defined manual test cases for dbo (22), lgdo (6) & skos (20)
Enrich datasets with DL-Learner to get additional test cases
Evaluation Results
Dataset Triples Subjects TC Pass Fail TO Errors ManEr EnrEr E/R
dben 817,467,330 24,922,670 6,064 4,288 1,860 55 63,644,169 5,224,298 249,857 2.55
dbnl 74,790,253 4,831,594 5,173 4,149 812 73 5,375,671 211,604 15,041 1.11
lgd 274,690,851 51,918,417 634 545 86 3 57,693,912 133,140 1 1.11
datos 60,017,091 7,470,044 2,473 2,376 89 8 27,943,993 25 537 3.74
loc 436,126,273 53,072,042 536 499 28 9 9,392,909 49 3,663 0.18
Pattern dben dbnl lgd datos loc
COMP 1.7M 7 - - -
INVFUNC 279K 13K - 511 3.5K
LITRAN 9 - - - -
MATCH 171K 103K 637 - -
OWLASYMP 19K 3K - - -
OWLCARD 610 291 1 1 3
OWLDISJC 92 - - 8.1K 1.1K
OWLDISJP 3.4K 7K - 53 223
OWLIRREFL 1.4K 14 - - -
PVT 267K 1.2K 22 - -
RDFSDOMAIN 31M 2.3M 55M 28M 9M
RDFSRANGE 26M 2.5M 191K 320K 111K
RDFSRANGED 760K 286K 2.7M 2 -
TRIPLE - - 132K - -
TYPDEP 674K - - - -
TYPRODEP 2M 100K - - -
Errors
Schema TC dben dbnl lgd dat. loc
dbo 5.7K 7.9M 716K - - -
frbrer 2.1K - - - 11K -
lgdo 224 - - 2.8M - -
isbd 179 - - - 28M -
prov 125 25M - - - -
foaf 95 25M 4.6M - - 59
gsp 83 - - 39M - -
mads 75 - - - - 0.3M
owl 48 5 3 2 5 -
skos 28 41 - - - 9M
dcterms 28 960 881 191K 37K 659
ngeo 18 - 119 - -
geo 7 2.8M 120K 16M - -
Evaluation Coverage
Richer / stricter schemas result in higher test coverage
Metric dben lgd datos loc
fpdom 20.32% 8.98% 72.26% 20.35%
fpran 23.67% 10.78% 37.64% 28.78%
fpdep 24.93% 13.65% 77.75% 29.78%
fcard 23.67% 10.78% 37.63% 28.78%
fmem 73.51% 12.78% 93.57% 58.62%
fcdep 37.55% 0% 93.56% 36.86%
Cov(QS ,D) 33.94% 9.49% 68.74% 33.86%
Related Work
SPIN: expresses SPARQL queries in RDF and allows SPARQL queries with arguments
Does not fully support our pattern bindings (e.g. operators)
Compatible, but would exponentially expand our pattern library

SELECT ?x WHERE {
  ?c1 owl:disjointWith ?c2 .
  ?x a ?c1 .
  ?x a ?c2 . }

Pellet ICV: converts OWL constraints to SPARQL
Expresses constraints only within OWL
Does not support the (re-)use of DQTPs
No schema enrichment step
Conclusion
Methodology to define reusable test cases
Evaluation revealed a substantial amount of data quality issues
First step in a larger research and development agenda
Future directions
Web service
Test-driven data quality cockpit
Automatic repair strategies
Thank you!
Dimitris Kontokostas
With kind support of
http://rdfunit.aksw.org
http://github.com/AKSW/RDFUnit