Test-driven Evaluation of Linked Data Quality
Dimitris Kontokostas (1,4), Patrick Westphal (1), Sören Auer (2),
Sebastian Hellmann (1,4), Jens Lehmann (1), Roland Cornelissen (3),
Amrapali Zaveri (1)
(1) AKSW, University of Leipzig
(2) University of Bonn
(3) Stichting Bibliotheek.nl
(4) DBpedia Association
2014-04-11
Kontokostas et al. (WWW2014) TD-LD Quality Evaluation 2014-04-11 1 / 20
Problem Definition
Unprecedented volume of structured data on the Web
Datasets are of varying quality
LOD contains many crowd-sourced datasets with good coverage, but often poor, non-uniform quality
OWL schemas are often not sufficiently developed or exploited for quality evaluation
Motivation
Quality is fitness for use
Key to the success of the Data Web
Major barrier for industry adoption
Methodology inspired by test-driven software development
Vocabularies, ontologies and knowledge bases should be accompanied by a number of test cases, which help to ensure a basic level of quality
Test-Driven Development (Software)
Test case: input on which the program under test is executed during testing
Test suite: a set of test cases for testing a program
Status: Success or Fail (Error)
Test cases are implemented largely manually or with limited programmatic support
H. Zhu, P. A. V. Hall, and J. H. R. May. Software unit test coverage and adequacy. ACM Comput. Surv., 29(4):366–427, 1997.
Test-Driven Development (RDF)
Test case: a data constraint that involves one or more triples
Test suite: a set of test cases for testing a dataset
Status: Success, Fail, Timeout (complexity) or Error (e.g. network)
Fail: Error, warning or notice
RDF: basis for both data and schema
The unified model facilitates automatic test case generation
SPARQL serves as the test case definition language
Example test case
A person should never have a birth date after a death date
Test cases are written in SPARQL
SELECT ?s WHERE {
  ?s dbo:birthDate ?v1 .
  ?s dbo:deathDate ?v2 .
  FILTER (?v1 > ?v2) }
We query for errors
Success: Query returns empty result set
Fail: Query returns results
Every result we get is a violation instance
Timeout / Error: needs further investigation of SPARQL engine capabilities, query syntax or query complexity
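The query-for-errors semantics above can be sketched in plain Python. This is a toy in-memory triple list, not the RDFUnit implementation; the resource names (ex:alice, ex:bob) are illustrative:

```python
# Sketch of "querying for errors": an empty result means Success,
# every returned row is a violation instance.
triples = [
    ("ex:alice", "dbo:birthDate", "1901-05-02"),
    ("ex:alice", "dbo:deathDate", "1880-01-01"),  # death before birth -> violation
    ("ex:bob",   "dbo:birthDate", "1950-03-10"),
    ("ex:bob",   "dbo:deathDate", "2000-07-21"),
]

def birth_after_death_violations(triples):
    """Emulates: SELECT ?s WHERE { ?s dbo:birthDate ?v1 .
                                   ?s dbo:deathDate ?v2 .
                                   FILTER (?v1 > ?v2) }"""
    births = {s: o for s, p, o in triples if p == "dbo:birthDate"}
    deaths = {s: o for s, p, o in triples if p == "dbo:deathDate"}
    # ISO-8601 date strings compare correctly as plain strings
    return [s for s in births if s in deaths and births[s] > deaths[s]]

violations = birth_after_death_violations(triples)
status = "Success" if not violations else "Fail"
```

Each element of `violations` corresponds to one violation instance reported for the test case.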
Patterns & Bindings
Data Quality Test Patterns (DQTPs): abstract patterns, which can be further refined into concrete data quality test cases using test pattern bindings
Existing library of 20 patterns (DBpedia mailing lists since 2008)

SELECT ?s WHERE {
  ?s %%P1%% ?v1 .
  ?s %%P2%% ?v2 .
  FILTER (?v1 %%OP%% ?v2) }

Bindings: mapping of variables to valid pattern replacements

P1 => dbo:birthDate   |  SELECT ?s WHERE {
P2 => dbo:deathDate   |    ?s dbo:birthDate ?v1 .
OP => >               |    ?s dbo:deathDate ?v2 .
                      |    FILTER (?v1 > ?v2) }
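A minimal sketch of how a binding turns a DQTP into a concrete test case, assuming the `%%VAR%%` placeholder syntax from the slides is implemented as simple string substitution (the helper name `instantiate` is illustrative, not RDFUnit's API):

```python
# Comparison pattern from the slides, with %%VAR%% placeholders.
COMP_PATTERN = """SELECT ?s WHERE {
  ?s %%P1%% ?v1 .
  ?s %%P2%% ?v2 .
  FILTER (?v1 %%OP%% ?v2) }"""

def instantiate(pattern, bindings):
    """Replace every %%VAR%% placeholder with its bound value."""
    query = pattern
    for var, value in bindings.items():
        query = query.replace(f"%%{var}%%", value)
    return query

# The birth/death binding yields the concrete test case.
query = instantiate(COMP_PATTERN, {"P1": "dbo:birthDate",
                                   "P2": "dbo:deathDate",
                                   "OP": ">"})
```

Any other comparable property pair (e.g. start/end dates) can reuse the same pattern with a different binding.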
Test Auto Generators (TAGs)
RDF(S) & OWL (partial) support
Query the schema for supported axioms

SELECT DISTINCT ?T1 ?T2 WHERE {
  ?T1 owl:disjointWith ?T2 . }

For every result, a binding to a pattern is generated and a test case instantiated
Supported axioms at the moment:
RDFS: domain & range
OWL: minCardinality, maxCardinality, cardinality, functionalProperty, InverseFunctionalProperty, disjointClass, propertyDisjointWith, AsymmetricProperty and deprecated
TAG Example
INVFUNC pattern
SELECT ?s WHERE {
  ?a %%P1%% ?resource .
  ?b %%P1%% ?resource .
  FILTER (?a != ?b) }

Test Auto Generators => query the schema & generate bindings, e.g. for owl:InverseFunctionalProperty

SELECT DISTINCT ?P1 ?MESSAGE WHERE {
  { ?P1 rdf:type owl:InverseFunctionalProperty . }
  UNION
  { ?P1 rdfs:subPropertyOf+ owl:InverseFunctionalProperty }
}

Bindings => for every result of a TAG, bind values to a pattern and instantiate test cases, e.g. bind foaf:homepage to %%P1%%
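The TAG step can be sketched as follows, using a toy in-memory schema instead of a SPARQL endpoint (the function name and data layout are illustrative; only the FOAF/OWL terms come from the slides):

```python
# INVFUNC pattern from the slides.
INVFUNC_PATTERN = """SELECT ?s WHERE {
  ?a %%P1%% ?resource .
  ?b %%P1%% ?resource .
  FILTER (?a != ?b) }"""

# Toy schema: RDFUnit would obtain this via the TAG's SPARQL query.
schema = [
    ("foaf:homepage", "rdf:type", "owl:InverseFunctionalProperty"),
    ("foaf:mbox",     "rdf:type", "owl:InverseFunctionalProperty"),
    ("foaf:name",     "rdf:type", "owl:DatatypeProperty"),
]

def generate_invfunc_tests(schema):
    """For each declared inverse-functional property, bind it to the
    INVFUNC pattern and return one instantiated test case per property."""
    props = [s for s, p, o in schema
             if p == "rdf:type" and o == "owl:InverseFunctionalProperty"]
    return {p: INVFUNC_PATTERN.replace("%%P1%%", p) for p in props}

tests = generate_invfunc_tests(schema)
```

Each generated query then runs against the dataset; any two distinct subjects sharing a value for the property are reported as violations.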
Test Coverage (1/2)
Coverage computation function f : Q → 2^E
Takes a SPARQL query q ∈ Q corresponding to a test case pattern binding as input and returns a set of entities.
Property domain coverage (dom): identifies the ratio of property occurrences where a test case is defined for verifying domain restrictions of the property.

F'(QS, D) = Σ_{p ∈ F(QS)} pfreq(p), where pfreq(p) is the frequency of a property p

f_dom returns the set of all properties p such that the triple pattern (?s, p, ?o) occurs in q and there is at least one other triple pattern using ?s in q.
Property range coverage (ran): identifies the ratio of property occurrences where a test case is defined for verifying range restrictions of the property (similar to dom).
Test Coverage (2/2)
Property dependency coverage (pdep): ratio of property occurrences where a test case is defined for verifying dependencies with other properties.
At least two different properties
Property cardinality coverage (card): ratio of property occurrences where a test case is defined for verifying the cardinality of the property.
GROUP BY ?s/o and HAVING(count(?s/o) <op> <number>)
Class instance coverage (mem): ratio of classes with test cases regarding class membership.
(?s, rdf:type, c)
Class dependency coverage (cdep): ratio of class occurrences for which test cases verifying relationships with other classes are defined.
At least two different classes
Cov(QS, D) = 1/6 (F'_dom(QS, D) + F'_ran(QS, D) + F'_pdep(QS, D) + F'_card(QS, D) + F'_mem(QS, D) + F'_cdep(QS, D))
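A toy illustration of the frequency-weighted coverage components and their average. The numbers are invented, and for brevity all six components are treated uniformly as covered-property sets, which oversimplifies the class-based metrics (mem, cdep):

```python
def weighted_coverage(covered, freq):
    """F'(QS, D): frequency-weighted share of occurrences that are
    covered by at least one test case."""
    total = sum(freq.values())
    return sum(freq[p] for p in covered if p in freq) / total

# Invented occurrence frequencies in a toy dataset.
freq = {"dbo:birthDate": 60, "dbo:deathDate": 30, "dbo:name": 10}

components = {
    "dom":  weighted_coverage({"dbo:birthDate", "dbo:deathDate"}, freq),  # 0.9
    "ran":  weighted_coverage({"dbo:birthDate"}, freq),                   # 0.6
    "pdep": weighted_coverage({"dbo:birthDate", "dbo:deathDate"}, freq),  # 0.9
    "card": weighted_coverage(set(), freq),                               # 0.0
    "mem":  weighted_coverage(set(freq), freq),                           # 1.0
    "cdep": weighted_coverage(set(), freq),                               # 0.0
}

# Cov(QS, D): unweighted mean of the six components.
cov = sum(components.values()) / 6
```

Frequent properties dominate each component, so covering a rarely used property barely moves the score, matching the occurrence-ratio reading of the metrics above.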
Schema Enrichment
Optionally run schema enrichment on the dataset (e.g. DL-Learner)
Get new axioms (filter manually)
Run TAGs on the axioms and get additional automatic test cases
[Diagram: 3-phase enrichment learning approach (DL-Learner)]
Input: Entity URI, Axiom Type, Knowledge Base (SPARQL Endpoint)
1. Obtain schema information (Reasoner / SPARQL Endpoint; only executed once per knowledge base)
2. Obtain axiom type and entity-specific data (background knowledge + relevant instance data; sample data if necessary; optional invocation)
3. Run machine learning algorithm (Learner: DL-Learner)
Output: list of axiom suggestions + metadata
Iterate over all axiom types and schema entities for full enrichment
Test Case Elicitation Workflow
Test case elicitation
Test cases depend on a schema or a dataset
Automatic & Manual test cases are reusable
A dataset can be tested against a number of schemas
e.g. dbo, foaf, skos
Linked Open Vocabularies (http://lov.okfn.org):
Describes 400 vocabularies in RDF (prefix, URI, description, etc.)
Run TAGs on all vocabularies
32,293 unique reusable test cases (10/2013)
Used for prefix dereferencing
Evaluation
Implemented the methodology in the RDFUnit tool (formerly Databugger)
Tested on 3 crowd-sourced and 2 library datasets
dbpedia.org (owl, dbo, foaf, dcterms, dc, skos, geo, prov)
nl.dbpedia.org (owl, dbo, foaf, dcterms, dc, skos, geo, prov)
linkedgeodata.org (ngeo, spatial, lgdo, dcterms, gsp, owl, geo, skos, foaf)
id.loc.gov (owl, foaf, dcterms, skos, mads, mrel, premis)
datos.bne.es (owl, frbrer, isbd, dcterms and skos)
Defined manual test cases for dbo (22), lgdo (6) & skos (20)
Enrich datasets with DL-Learner to get additional test cases
Evaluation Results
Dataset Triples Subjects TC Pass Fail TO Errors ManEr EnrEr E/R
dben 817,467,330 24,922,670 6,064 4,288 1,860 55 63,644,169 5,224,298 249,857 2.55
dbnl 74,790,253 4,831,594 5,173 4,149 812 73 5,375,671 211,604 15,041 1.11
lgd 274,690,851 51,918,417 634 545 86 3 57,693,912 133,140 1 1.11
datos 60,017,091 7,470,044 2,473 2,376 89 8 27,943,993 25 537 3.74
loc 436,126,273 53,072,042 536 499 28 9 9,392,909 49 3,663 0.18
Pattern dben dbnl lgd datos loc
COMP 1.7M 7 - - -
INVFUNC 279K 13K - 511 3.5K
LITRAN 9 - - - -
MATCH 171K 103K 637 - -
OWLASYMP 19K 3K - - -
OWLCARD 610 291 1 1 3
OWLDISJC 92 - - 8.1K 1.1K
OWLDISJP 3.4K 7K - 53 223
OWLIRREFL 1.4K 14 - - -
PVT 267K 1.2K 22 - -
RDFSDOMAIN 31M 2.3M 55M 28M 9M
RDFSRANGE 26M 2.5M 191K 320K 111K
RDFSRANGED 760K 286K 2.7M 2 -
TRIPLE - - 132K - -
TYPDEP 674K - - - -
TYPRODEP 2M 100K - - -
Errors
Schema TC dben dbnl lgd dat. loc
dbo 5.7K 7.9M 716K - - -
frbrer 2.1K - - - 11K -
lgdo 224 - - 2.8M - -
isbd 179 - - - 28M -
prov 125 25M - - - -
foaf 95 25M 4.6M - - 59
gsp 83 - - 39M - -
mads 75 - - - - 0.3M
owl 48 5 3 2 5 -
skos 28 41 - - - 9M
dcterms 28 960 881 191K 37K 659
ngeo 18 - 119 - -
geo 7 2.8M 120K 16M - -
Evaluation Coverage
Richer / stricter schemas result in higher test coverage
Metric dben lgd datos loc
fpdom 20.32% 8.98% 72.26% 20.35%
fpran 23.67% 10.78% 37.64% 28.78%
fpdep 24.93% 13.65% 77.75% 29.78%
fcard 23.67% 10.78% 37.63% 28.78%
fmem 73.51% 12.78% 93.57% 58.62%
fcdep 37.55% 0% 93.56% 36.86%
Cov(QS ,D) 33.94% 9.49% 68.74% 33.86%
Related Work
SPIN: expresses SPARQL queries in RDF and allows SPARQL queries with arguments
Does not fully support our pattern bindings (e.g. operators)
Compatible, but would exponentially expand our pattern library

SELECT ?x WHERE {
  ?c1 owl:disjointWith ?c2 .
  ?x a ?c1 .
  ?x a ?c2 . }

Pellet ICV: converts OWL constraints to SPARQL
Expresses constraints only within OWL
Does not support the (re-)use of DQTPs
No schema enrichment step
Conclusion
Methodology to define reusable test cases
Evaluation revealed a substantial amount of data quality issues
First step in a larger research and development agenda
Future directions
Web service
Test-driven data quality cockpit
Automatic repair strategies
Thank you!
Dimitris Kontokostas
With kind support of
http://rdfunit.aksw.org
http://github.com/AKSW/RDFUnit