Date posted: 07-Jul-2015
How much semantic data on small devices?
Mathieu d’Aquin, Andriy Nikolov and Enrico Motta
Knowledge Media Institute, The Open University, UK
@mdaquin
Semantic Data on Small Devices?
Benchmarking Semantic Data Tools
Large Scale Benchmarks
LUBM(1,0): 103,397 triples
Extracting sets of small-scale ontologies
Clusters of ontologies having similar characteristics, except for size
• Characteristics of ontologies
– Size (triples): varies from very small scale to medium scale
– Ratio class/prop: allowing 50% variance
– Ratio class/inst.: allowing 50% variance
– DL expressivity: complexity of the language
• 99 automatically created clusters
• Manual selection of 10
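The grouping criteria above can be illustrated with a small sketch. This is not the authors' clustering procedure, which the slides do not describe; it is a greedy grouping under stated assumptions: "allowing 50% variance" is read as "within ±50% of the first ontology placed in the cluster", DL expressivity must match exactly, and size is deliberately ignored. The names `OntologyProfile`, `within_variance`, `similar` and `cluster` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OntologyProfile:
    """Hypothetical summary of one ontology (not a structure from the slides)."""
    name: str
    triples: int
    class_prop_ratio: Optional[float]  # None when the ratio is undefined
    class_inst_ratio: Optional[float]  # None when the ratio is undefined
    dl_expressivity: str


def within_variance(seed: Optional[float], other: Optional[float],
                    tolerance: float = 0.5) -> bool:
    """Assumed reading of '50% variance': other lies within +/-50% of seed."""
    if seed is None or other is None:
        return seed is None and other is None
    if seed == 0:
        return other == 0
    return abs(seed - other) <= tolerance * abs(seed)


def similar(seed: OntologyProfile, other: OntologyProfile) -> bool:
    """Similar in everything except size, which is intentionally not compared."""
    return (seed.dl_expressivity == other.dl_expressivity
            and within_variance(seed.class_prop_ratio, other.class_prop_ratio)
            and within_variance(seed.class_inst_ratio, other.class_inst_ratio))


def cluster(profiles: List[OntologyProfile]) -> List[List[OntologyProfile]]:
    """Greedy grouping: join the first cluster whose seed is similar, else start one."""
    clusters: List[List[OntologyProfile]] = []
    for profile in profiles:
        for members in clusters:
            if similar(members[0], profile):
                members.append(profile)
                break
        else:
            clusters.append([profile])
    # Order each cluster by size so it reads as a small-to-medium scale series.
    for members in clusters:
        members.sort(key=lambda p: p.triples)
    return clusters


example = [
    OntologyProfile("a", 10, 1.0, 2.0, "AL"),
    OntologyProfile("b", 3000, 1.4, 2.5, "AL"),
    OntologyProfile("c", 50, 1.0, 2.0, "EL"),
    OntologyProfile("d", 500, 3.0, 2.0, "AL"),
]
groups = cluster(example)
```

With the example data, `a` and `b` end up in one cluster spanning 10 to 3,000 triples, while `c` (different expressivity) and `d` (class/property ratio too far from the seed) each form their own.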
Results
Size (triples)   Prop/class   Ind/class    DL
9-2742           0.65-1.0     1.0-2.0      ALO
27-3688          0.21-0.48    0.07-0.14    ALH
2-8502           N/A          N/A          -
17-3696          0.66-2.0     4.5-20.5     -
3208-658808      N/A          N/A          EL
1514-153298      N/A          N/A          ELR+
8-3657           N/A          N/A          -
7-4959           1.41-4.0     N/A          AL
1-2759           N/A          N/A          -
43-5132          1.0-2.0      13.0-22.09   -
Queries
• Using real-life ontologies requires domain-independent queries
• A set of 8 generic queries of varying complexity, whose results may depend on inference
Select all labels
Select all comments
Select all labels and comments
Select all RDFS classes
Select all classes (RDFS/OWL/DAML)
Select all instances of all classes
Select all properties applied to instances of all classes
Select all properties by their domain
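The slides list the eight queries only by description. The sketch below shows one plausible SPARQL formulation of each, held as Python strings so they can be fed to any store. The exact query text used in the benchmark is not given in the source, so every query body here is an assumption, as are the names `PREFIXES`, `QUERIES` and `full_query`.

```python
# Namespace prefixes shared by all queries (standard RDF/RDFS/OWL/DAML+OIL URIs).
PREFIXES = """\
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX daml: <http://www.daml.org/2001/03/daml+oil#>
"""

# Assumed formulations of the 8 generic queries; the originals are not in the slides.
QUERIES = {
    "labels": "SELECT ?s ?label WHERE { ?s rdfs:label ?label }",
    "comments": "SELECT ?s ?comment WHERE { ?s rdfs:comment ?comment }",
    "labels_and_comments":
        "SELECT ?s ?label ?comment WHERE "
        "{ ?s rdfs:label ?label . ?s rdfs:comment ?comment }",
    "rdfs_classes": "SELECT ?c WHERE { ?c rdf:type rdfs:Class }",
    "all_classes":
        "SELECT ?c WHERE { { ?c rdf:type rdfs:Class } "
        "UNION { ?c rdf:type owl:Class } "
        "UNION { ?c rdf:type daml:Class } }",
    "instances":
        "SELECT ?i ?c WHERE { ?c rdf:type rdfs:Class . ?i rdf:type ?c }",
    "instance_properties":
        "SELECT DISTINCT ?p WHERE "
        "{ ?c rdf:type rdfs:Class . ?i rdf:type ?c . ?i ?p ?o }",
    "properties_by_domain": "SELECT ?p ?d WHERE { ?p rdfs:domain ?d }",
}


def full_query(name: str) -> str:
    """Return the named query with the shared prefix declarations prepended."""
    return PREFIXES + QUERIES[name]
```

Queries such as `instances` show why results may depend on inference: with RDFS reasoning enabled, a store would also return instances of subclasses and classes typed only implicitly, so the same query can yield more rows and take longer.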
Running the benchmarks – Triple Stores
Jena with TDB persistent storage
As above + RDFS reasoning
Sesame with persistent storage
As above + RDFS reasoning
Mulgara with default configuration
Running the benchmarks – Device
Asus EEE PC 700 (2G)
Running the benchmarks - Measures
• Loading time: for each ontology, loaded into an empty, re-initialized store.
• Disk space: of the persistent store right after loading.
• Memory consumption: of the triple store process right after loading the ontology.
• Query time: for each ontology, averaged over the 8 queries.
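As a rough illustration of how three of these measures could be collected, here is a minimal harness sketch. It is not the authors' tooling: `load` and `run_query` are hypothetical callables standing in for whatever API a given triple store exposes, and `store_dir` is assumed to be the directory holding the persistent store. Memory consumption is left out because measuring another process's memory is platform-specific.

```python
import os
import time
from statistics import mean
from typing import Callable, Dict, Iterable


def directory_size(path: str) -> int:
    """Total size in bytes of all files under path (the persistent store)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for filename in files:
            total += os.path.getsize(os.path.join(root, filename))
    return total


def timed(action: Callable[[], object]) -> float:
    """Wall-clock seconds taken by one call to action."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def benchmark(load: Callable[[], object],
              run_query: Callable[[str], object],
              queries: Iterable[str],
              store_dir: str) -> Dict[str, float]:
    """Measure loading time, disk space after loading, and average query time.

    The caller is expected to pass an empty, re-initialized store.
    """
    load_seconds = timed(load)
    disk_bytes = directory_size(store_dir)
    # Bind each query as a default argument so the lambda uses the right one.
    avg_query_seconds = mean(timed(lambda q=q: run_query(q)) for q in queries)
    return {
        "load_seconds": load_seconds,
        "disk_bytes": disk_bytes,
        "avg_query_seconds": avg_query_seconds,
    }
```

Running `benchmark` once per ontology and per store configuration would give one row of measurements for each point in the result charts.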
Results – Loading time
[Chart: loading time by triple store configuration]
Results – Disk Space
[Chart: disk space by triple store configuration]
Results – Memory consumption
[Chart: memory consumption by triple store configuration]
Results – Query time
[Chart: query time by triple store configuration]
Conclusion – on tests
• Sesame performs best in almost all aspects, even when including reasoning
• Reasoning has a big impact on Jena TDB at query time
• Mulgara is clearly not adequate in a small-scale scenario
Conclusion – on small-scale benchmarking
• Validates our assumption that small-scale benchmarks give different results than large-scale benchmarks
• Points out the need for more work to tackle the small-scale scenarios
• Results are not always clear-cut in every aspect: benchmarks serve as support for deciding which tool to use, depending on the application constraints