TripleCheckMate

Date post: 31-May-2015
Upload: amrapali-zaveri
Description:
Presentation of the TripleCheckMate tool: http://aksw.org/Projects/TripleCheckMate.html @KESW 2013 (kesw.ifmo.ru/kesw2013/)
Transcript
Page 1: TripleCheckMate

TripleCheckMate: A Tool for Crowdsourcing the Quality Assessment of Linked Data

Dimitris Kontokostas, Amrapali Zaveri, Sören Auer and Jens Lehmann

KESW 2013 Oct 08, 2013

Page 2: TripleCheckMate

Outline

❏ Data Quality
❏ Data Quality Assessment Methodology
❏ Evaluation Methodology - Manual
    ❏ Phase I: Quality Problem Taxonomy
    ❏ Phase II: Crowdsourcing Quality Assessment
❏ TripleCheckMate
    ❏ Architecture
    ❏ Demo
❏ Conclusion & Future Work

Page 3: TripleCheckMate

Data Quality

● Data Quality (DQ) is defined as:
    ○ fitness for a certain use case*
● On the Data Web - varying quality of information covering various domains
● High quality datasets
    ○ curated over decades - life science domain
    ○ crowdsourcing process - extracted from unstructured and semi-structured information, e.g. DBpedia

* J. Juran. The Quality Control Handbook. McGraw-Hill, New York, 1974.

Page 4: TripleCheckMate

Data Quality Assessment Methodology

4-Step Methodology:

❏ Step 1: Resource selection
    ❏ Per Class
    ❏ Completely random
    ❏ Manual
❏ Step 2: Evaluation mode selection
    ❏ Manual
    ❏ Semi-automatic
    ❏ Automatic
❏ Step 3: Resource evaluation
❏ Step 4: DQ improvement
    ❏ Direct
    ❏ Indirect
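Step 1 can be illustrated with a small sketch. The slides do not show the queries the tool actually issues, so the function names, the query shape, and the offset-based sampling below are assumptions made for illustration only. The idea is that "Per Class" selection restricts candidates to instances of one class, and a random window over that list approximates random selection.

```python
import random


def per_class_query(class_iri: str, limit: int = 10, offset: int = 0) -> str:
    """Build a SPARQL query listing distinct instances of one class.

    Hypothetical helper: the query shape is an assumption, not the
    query TripleCheckMate itself sends.
    """
    return (
        "SELECT DISTINCT ?s WHERE { "
        f"?s a <{class_iri}> "
        "} "
        f"LIMIT {limit} OFFSET {offset}"
    )


def random_offset(total: int, limit: int, rng: random.Random) -> int:
    """Pick a random window start so the window stays inside [0, total)."""
    return rng.randrange(max(total - limit, 0) + 1)
```

A caller would first obtain the number of instances of the class (for example with a COUNT query), then pass it as `total` to choose the window. "Manual" selection needs no query at all, since the evaluator supplies the resource IRI directly.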

Page 5: TripleCheckMate

Evaluation Methodology - Manual

❏ Phase I: Creation of quality problem taxonomy

❏ Phase II: Crowdsourcing quality assessment

Page 6: TripleCheckMate

Phase I: Quality Problem Taxonomy

A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. Quality assessment methodologies for Linked Open Data: A Review. Under review, available at http://www.semantic-webjournal.net/content/quality-assessment-methodologieslinked-open-data.


Page 7: TripleCheckMate

Phase II: Crowdsourcing Quality Assessment

             | Crowdsourcing                            | Our Approach
Type         | Human Intelligence Tasks (HITs)          | Contest-based
Participants | Labor market                             | Linked Data (LD) experts
Task         | Detect quality issues in triples         | Detect & classify quality issues in resources
Reward       | Per task/triple                          | Highest number of resources evaluated
Tool         | Amazon Mechanical Turk, CrowdFlower etc. | TripleCheckMate

Page 8: TripleCheckMate

TripleCheckMate - Architecture (1/2)


Page 9: TripleCheckMate

TripleCheckMate - Architecture (2/2)

● Built on Java / GWT
    ○ GWT compiles to native cross-browser HTML/JS
● Tomcat / Jetty & MySQL as minimal backend
    ○ store/retrieve evaluation data only
● Application logic is built on the client
    ○ SPARQL executed on client
    ○ Portable
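To make "SPARQL executed on client" concrete, here is a rough sketch of the two client-side steps: building a query for all triples of one resource and turning the endpoint's answer into triples. The tool does this in GWT-compiled JavaScript inside the browser; the Python below only mirrors the idea. The function names and the query are hypothetical, and the response layout assumed is the standard SPARQL 1.1 JSON results format.

```python
import json


def resource_triples_query(resource_iri: str) -> str:
    """Build a query for every predicate/object pair of one resource.

    Hypothetical helper, not taken from the TripleCheckMate code base.
    """
    return f"SELECT ?p ?o WHERE {{ <{resource_iri}> ?p ?o }}"


def parse_bindings(resource_iri: str, results_json: str) -> list:
    """Convert a SPARQL JSON result document into (s, p, o) tuples."""
    data = json.loads(results_json)
    triples = []
    for row in data["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...}.
        triples.append((resource_iri, row["p"]["value"], row["o"]["value"]))
    return triples
```

Because the client talks to the SPARQL endpoint itself, the backend only ever sees the evaluation results, which is what keeps it minimal and the tool portable across datasets.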

Page 10: TripleCheckMate

Evaluation storage schema

● Designed to support multiple campaigns and different ontologies

● Quality taxonomy is stored in the database which makes it easy to adapt
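The slide's schema diagram is not part of this transcript, so the following is only a guess at what such a storage layout could look like. Every table and column name is hypothetical, and SQLite stands in for the MySQL backend so that the sketch runs without a server. It shows the two stated design goals: evaluations are keyed by campaign, and the taxonomy lives in a table rather than in code.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the tool's real tables.
SCHEMA = """
CREATE TABLE campaign (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    sparql_endpoint TEXT NOT NULL
);
CREATE TABLE error_category (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES error_category(id),
    label TEXT NOT NULL
);
CREATE TABLE evaluated_triple (
    id INTEGER PRIMARY KEY,
    campaign_id INTEGER NOT NULL REFERENCES campaign(id),
    user_name TEXT NOT NULL,
    subject TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    error_category_id INTEGER REFERENCES error_category(id)
);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def issues_per_category(conn: sqlite3.Connection, campaign_id: int) -> dict:
    """Count flagged triples per taxonomy label for one campaign."""
    rows = conn.execute(
        "SELECT c.label, COUNT(*) FROM evaluated_triple t "
        "JOIN error_category c ON c.id = t.error_category_id "
        "WHERE t.campaign_id = ? GROUP BY c.label ORDER BY c.label",
        (campaign_id,),
    )
    return dict(rows.fetchall())
```

With a layout like this, adapting the taxonomy means inserting or editing rows in `error_category`, and a new campaign against a different dataset is one more row in `campaign`.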


Page 11: TripleCheckMate

TripleCheckMate - Demo

http://tinyurl.com/TCM-Demo
http://tinyurl.com/TCM-Screencast

Page 12: TripleCheckMate

Conclusion & Future Work

● TripleCheckMate
    ○ Tool for crowdsourcing quality assessment
    ○ Linked Data quality assessment
    ○ Supports inter-rater agreement
    ○ Can be used with any Linked Dataset
● Future Work
    ○ Directly integrating semi-automatic methods
    ○ Improve efficiency of quality assessment
    ○ Include support for Patch Ontology* as output format

* M. Knuth, J. Hercher, and H. Sack. Collaboratively patching linked data. CoRR, 2012.
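The slides do not say which agreement measure the tool reports. As one common possibility, the sketch below computes Cohen's kappa for two evaluators who judged the same triples; the choice of measure, the function name, and the label values are assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa for two raters labelling the same items.

    Assumed measure: the source only states that inter-rater
    agreement is supported, not how it is computed.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Share of items on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

A value near 1 means the evaluators flag the same triples with the same label, while a value near 0 means their agreement is about what chance alone would produce. When more than two evaluators assess the same resource, a multi-rater measure such as Fleiss' kappa would be the analogous choice.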

Page 13: TripleCheckMate

Thank You
Questions?

http://nl.dbpedia.org:8080/TripleCheckMate-Demo/
https://github.com/AKSW/TripleCheckMate

http://aksw.org/
[email protected]

Twitter: @amrapaliz