SPARQL and Linked Data Benchmarking

Page 1: SPARQL and Linked Data Benchmarking

Benchmarks

What benchmarks are commonly used and what they mean

Page 2: SPARQL and Linked Data Benchmarking

Overview

• SP2Bench

• LUBM

• BSBM

• UOBM

• SIB

• DBPedia SPARQL Benchmark

• LODIB

• FedBench

• THALIA Testbed

• Benchmark for Spatial Semantic Web Systems

• LODQA

• LinkBench

Page 3: SPARQL and Linked Data Benchmarking

SP2Bench

• Language-specific: a SPARQL performance benchmark.

• Components: data generator, query set

• Provides a scalable RDF data generator and a set of benchmark queries designed to test typical SPARQL operator constellations and RDF data access patterns (see the sketch below).

• Example comparison: http://arxiv.org/pdf/0806.4627v2.pdf
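A minimal sketch of the kind of measurement such a benchmark automates, using Python's rdflib; the file name and the query are illustrative assumptions that only imitate SP2Bench's DBLP-style data, not actual SP2Bench artifacts.

# Sketch: time one SPARQL query over locally generated benchmark data.
# "sp2bench_sample.n3" is an assumed file name; the OPTIONAL + !BOUND pattern
# is one of the typical operator constellations such benchmarks exercise.
import time
from rdflib import Graph

g = Graph()
g.parse("sp2bench_sample.n3", format="n3")

QUERY = """
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?doc WHERE {
  ?doc dc:creator ?person .
  OPTIONAL { ?person foaf:name ?name }
  FILTER (!BOUND(?name))                 # closed-world negation via OPTIONAL/!BOUND
}
"""

start = time.perf_counter()
rows = list(g.query(QUERY))              # materialise results to measure full cost
elapsed = time.perf_counter() - start
print(f"{len(rows)} results in {elapsed:.3f}s")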

Page 4: SPARQL and Linked Data Benchmarking

LUBM

• The Lehigh University Benchmark

• “The Lehigh University Benchmark is developed to facilitate the evaluation of Semantic Web repositories in a standard and systematic way. The benchmark is intended to evaluate the performance of those repositories with respect to extensional queries over a large data set that commits to a single realistic ontology.”

• Components: ontology, data generator, test queries, tester (see the sketch below)

• http://swat.cse.lehigh.edu/projects/lubm/
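The “extensional queries over a large data set that commits to a single realistic ontology” generally need some reasoning (subclass/subproperty closure) to return complete answers. A minimal sketch of that idea with Python's rdflib and owlrl; the file names are assumptions, and the query only paraphrases the style of LUBM's test queries.

# Sketch: load the LUBM ontology plus one generated data file, materialise the
# OWL-RL closure, then count all students (instances of subclasses such as
# GraduateStudent are only found after inference). File names are assumed.
from rdflib import Graph
from owlrl import DeductiveClosure, OWLRL_Semantics

g = Graph()
g.parse("univ-bench.owl", format="xml")      # the LUBM ontology
g.parse("University0_0.owl", format="xml")   # output of the data generator

DeductiveClosure(OWLRL_Semantics).expand(g)  # forward-chain entailments in place

QUERY = """
PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
SELECT (COUNT(?s) AS ?students) WHERE { ?s a ub:Student . }
"""
for row in g.query(QUERY):
    print(row.students)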

Page 5: SPARQL and Linked Data Benchmarking

BSBM

• Berlin SPARQL Benchmark

• Compares the performance of RDF and Named Graph stores as well as RDF-mapped relational databases and other systems that expose SPARQL endpoints. Designed around an e-commerce use case; SPARQL and SQL versions are available (see the sketch below).

• http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/
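A minimal sketch of the query-mix idea against any system that exposes a SPARQL endpoint, using Python's SPARQLWrapper; the endpoint URL and the two trivial queries are placeholders, not the parameterised e-commerce queries of the real BSBM test driver.

# Sketch: replay a small "query mix" against a SPARQL endpoint and report the
# mix runtime. Endpoint URL and queries are assumptions, not BSBM's own.
import time
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://localhost:8890/sparql"    # assumed local endpoint

QUERY_MIX = [
    "SELECT ?p (COUNT(*) AS ?n) WHERE { ?s ?p ?o } GROUP BY ?p LIMIT 10",
    "SELECT ?s WHERE { ?s a ?type } LIMIT 100",
]

client = SPARQLWrapper(ENDPOINT)
client.setReturnFormat(JSON)

start = time.perf_counter()
for q in QUERY_MIX:
    client.setQuery(q)
    client.query().convert()                 # execute and fetch the full result
elapsed = time.perf_counter() - start
print(f"query mix completed in {elapsed:.3f}s "
      f"({len(QUERY_MIX) / elapsed:.1f} queries/s)")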

Page 6: SPARQL and Linked Data Benchmarking

UOBM

• The University Ontology Benchmark

• Extends the LUBM benchmark in terms of inference and scalability testing.

• Components: ontology and test data set

• http://www.springerlink.com/content/l0wu543x26350462/

Page 7: SPARQL and Linked Data Benchmarking

SIB

• Social Network Intelligence Benchmark (SIB)

• A benchmark suite developed by people at CWI and OpenLink that uses a social-network schema to generate test cases where RDF/SPARQL can truly excel and that challenge query processing over a highly connected graph (see the sketch below).

• http://www.w3.org/wiki/Social_Network_Intelligence_BenchMark
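A minimal sketch of the kind of highly connected graph query such a social-network workload stresses, using a SPARQL 1.1 property path over a foaf:knows graph with Python's rdflib; the data file and the pattern are assumptions, not SIB queries.

# Sketch: a friends-of-friends traversal, the style of query that makes densely
# connected social data hard on query engines. "social_sample.ttl" is assumed.
from rdflib import Graph

g = Graph()
g.parse("social_sample.ttl", format="turtle")

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person (COUNT(DISTINCT ?reachable) AS ?reach) WHERE {
  ?person foaf:knows/foaf:knows ?reachable .   # two-hop property path
  FILTER (?person != ?reachable)
}
GROUP BY ?person
ORDER BY DESC(?reach)
LIMIT 10
"""
for row in g.query(QUERY):
    print(row.person, row.reach)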

Page 8: SPARQL and Linked Data Benchmarking

DBPedia SPARQL Benchmark

• Designed to benchmark against DBpedia data in order to provide a clear picture of real-world performance.

• “Performance Assessment with Real Queries on Real Data.”

• http://svn.aksw.org/papers/2011/VLDB_AKSWBenchmark/public.pdf

Page 9: SPARQL and Linked Data Benchmarking

LODIB

• The Linked Data Integration Benchmark

• Is a benchmark for comparing the expressivity as well as the runtime performance of Linked Data translation/integration systems.

• http://wifo5-03.informatik.uni-mannheim.de/bizer/lodib/

Page 10: SPARQL and Linked Data Benchmarking

FedBench

• Benchmark for measuring the performance of federated SPARQL query processing (see the sketch below).

• ISWC2011 whitepaper: https://www.uni-koblenz.de/~goerlitz/publications/ISWC2011-FedBench.pdf
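A minimal sketch of what federated SPARQL query processing looks like at the query level: a SPARQL 1.1 SERVICE clause joining a local endpoint with DBpedia, sent via Python's SPARQLWrapper. The local endpoint URL and the pattern are illustrative assumptions; FedBench ships its own cross-dataset query set.

# Sketch: the outer pattern is evaluated by the local endpoint, which ships the
# SERVICE sub-pattern to DBpedia and joins the results. URLs/pattern assumed.
from SPARQLWrapper import SPARQLWrapper, JSON

FEDERATED_QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?author ?abstract WHERE {
  ?author a foaf:Person .
  SERVICE <https://dbpedia.org/sparql> {
    ?author <http://dbpedia.org/ontology/abstract> ?abstract .
  }
}
LIMIT 10
"""

client = SPARQLWrapper("http://localhost:8890/sparql")   # assumed local endpoint
client.setReturnFormat(JSON)
client.setQuery(FEDERATED_QUERY)
for binding in client.query().convert()["results"]["bindings"]:
    print(binding["author"]["value"])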

Page 11: SPARQL and Linked Data Benchmarking

THALIA Testbed

• Is designed to test the expressiveness of relational-to-RDF mapping languages.

• http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/THALIATestbed

Page 12: SPARQL and Linked Data Benchmarking

Benchmark for Spatial Semantic Web Systems

• Extends LUBM with sample spatial data.

• https://filebox.vt.edu/users/dkolas/public/ssbm/

Page 13: SPARQL and Linked Data Benchmarking

LODQA

• The Linked Open Data Quality Assessment

• Is a benchmark for comparing data quality assessment and data fusion systems.

• https://filebox.vt.edu/users/dkolas/public/ssbm/

Page 14: SPARQL and Linked Data Benchmarking

LinkBench

• A database benchmark that is designed for the Facebook Social Graph.

• Whitepaper: http://people.cs.uchicago.edu/~tga/pubs/sigmod-linkbench-2013.pdf

• https://www.facebook.com/notes/facebook-engineering/linkbench-a-database-benchmark-for-the-social-graph/10151391496443920

Page 15: SPARQL and Linked Data Benchmarking

Performance Results

• Results provided by store implementers themselves:

– Virtuoso BSBM benchmark results (native RDF store versus mapped relational database)
– Jena TDB BSBM benchmark results (native RDF store)
– OWLIM Benchmark results (LUBM, BSBM and Linked Data loading/inference)
– SemWeb .NET library BSBM benchmark results
– Virtuoso LUBM benchmark results
– AllegroGraph 2.0 Benchmark for LUBM-50-0
– Sesame NativeStore LUBM benchmark results
– RacerPro LUBM benchmark results
– SwiftOWLIM benchmark results for the LUBM and City benchmark (from slide 27 onwards)
– Oracle 11g benchmark results for the LUBM and Uniprot benchmark (from slide 20 onwards)
– Jena SDB/Query performance and SDB/Loading performance
– Bigdata BSBM V3 Reduced Query Mix benchmark results

Page 16: SPARQL and Linked Data Benchmarking

Performance Results

• Results provided by third parties:

– Cudré-Mauroux et al.: NoSQL Databases for RDF: An Empirical Evaluation (November 2013; uses the BSBM benchmark with workloads from 10 million to 1 billion triples to benchmark several NoSQL databases).

– Peter Boncz, Minh-Duc Pham: Berlin SPARQL Benchmark Results for Virtuoso, Jena TDB, BigData, and BigOWLIM (April 2013, 100 million to 150 billion triples, Explore and Business Intelligence Use Cases).

– Christian Bizer, Andreas Schultz: Berlin SPARQL Benchmark Results for Virtuoso, Jena TDB, 4store, BigData, and BigOWLIM (February 2011, 100 and 200 million triples, Explore and Update Use Cases).

– Christian Bizer, Andreas Schultz: Berlin SPARQL Benchmark Results for Virtuoso, Jena TDB and BigOWLIM (November 2009, 100 and 200 million triples).

– L. Sidirourgos et al.: Column-Store Support for RDF Data Management: not all swans are white. An experimental analysis along two dimensions – triple-store vs. vertically-partitioned and row-store vs. column-store – individually, before analyzing their combined effects. In VLDB 2008.

– Christian Bizer, Andreas Schultz: Berlin SPARQL Benchmark Results. Benchmark along an e-commerce use case comparing Virtuoso, Sesame, Jena TDB, D2R Server and MySQL with datasets ranging from 250,000 to 100,000,000 triples, relating the results to two RDBMS. 2008. (Note: As discussed in Orri Erling's blog, the SQL mix results did not accurately reflect the steady state of all players and should be taken with a grain of salt. Warm-up steps will change for future runs.)

Page 17: SPARQL and Linked Data Benchmarking

Performance Results

• Results provided by third parties (cont):

– Michael Schmidt et al.: SP2Bench: A SPARQL Performance Benchmark. Benchmark based on the DBLP data set comparing current versions of ARQ, Redland, Sesame, SDB, and Virtuoso. TR, 2008 (short version of the TR to appear in ICDE 2009).

– Michael Schmidt et al.: An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario. Benchmarking Relational Database schemes on top of SP2Bench Suite. In ISWC 2008.

– Atanas Kiryakov: Measurable Targets for Scalable Reasoning
– Baolin Liu and Bo Hu: An Evaluation of RDF Storage Systems for Large Data Applications
– Christian Becker: RDF Store Benchmarks with DBpedia comparing Virtuoso, SDB and Sesame. 2007
– Kurt Rohloff et al.: An Evaluation of Triple-Store Technologies for Large Data Stores. Comparing Sesame, Jena and AllegroGraph. 2007
– Christian Weiske: SPARQL Engines Benchmark Results
– Ryan Lee: Scalability Report on Triple Store Applications comparing Jena, Kowari, 3store, Sesame. 2004
– Martin Svihala, Ivan Jelinek: Benchmarking RDF Production Tools. Paper comparing the performance of relational database to RDF mapping tools (METAmorphoses, D2RQ, SquirrelRDF) with native RDF stores (Jena, Sesame)

– Michael Streatfield, Hugh Glaser: Benchmarking RDF Triplestores, 2005

Page 18: SPARQL and Linked Data Benchmarking

Observations

• LUBM and BSBM results are often shown on the major players’ own websites.

• SP2Bench results are harder to find on those websites and in other resources.

• Many of the results reported by vendors highlight performance on very high-end hardware rather than commodity computers.

• http://www.w3.org/wiki/RdfStoreBenchmarking

