Unified Benchmarking of Big Data Platforms: The HOBBIT Platform
Axel-Cyrille Ngonga Ngomo
Horizon 2020, GA No 688227
01/12/2016–30/11/2018
Apache Big Data, Sevilla, Spain
November 15, 2016
Ngonga Ngomo (InfAI) Benchmarking Big Data November 15th, 2016 1 / 44
Summary
Rationale
A community-driven unified benchmarking platform for the community
Focus on Big (Linked) Data
Provide benchmarks and baselines
Provide reference implementation of KPIs
Extensible and referenceable
Result analysis
Open-Source
http://project-hobbit.eu
@hobbit_project
A Lot of Data
Figure source: http://www.ibmbigdatahub.com/infographic/four-vs-big-data
A Lot of Tools
Figure source: https://cloudramblings.me/
A Lot ... of Tools
Figure source: http://mattturck.com/2016/02/01/big-data-landscape/
Questions
Developers: How good is my tool?
Vendors: Who is my tool good for?
Users: Which tool(s) should I use for my application?
Many Questions
Where are the current bottlenecks?
Which steps of the data lifecycle are critical?
Which solutions are available?
Which key performance indicators are relevant?
How well do or should tools perform?
How do existing solutions perform w.r.t. relevant indicators?
A Lot of Views
Figure source: https://steemit.com/philosophy/@l0k1/subjectivity-and-truth-how-blockchains-model-consensus-building
Solution: Benchmark
Components
  Dataset(s), e.g., Twitter stream, sensor data
  Task(s), e.g., entity recognition, storage
  Performance indicators, e.g., precision, recall, queries per second
Solution: Benchmark
TPC-H (3,000 GB Results): −5.6 × 10^6 QphH between 2014 and 2016 (source: http://www.tpc.org/tpch/results/tpch_perf_results.asp)
QALD: ≈ 5% increase in Micro F-measure
ACE2004: ≈ 6% increase in Micro F-measure
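The QALD and ACE2004 figures above are reported as micro F-measure. As a reminder of what that KPI aggregates, here is a minimal Python sketch that contrasts it with the macro variant. The function names and the per-document counts are illustrative assumptions and are not taken from any benchmark's code.

```python
from typing import Iterable, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when nothing was found or expected."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(per_doc: Iterable[Counts]) -> float:
    """Sum the counts over all documents first, then compute one F1."""
    docs = list(per_doc)
    tp = sum(d[0] for d in docs)
    fp = sum(d[1] for d in docs)
    fn = sum(d[2] for d in docs)
    return f1(tp, fp, fn)


def macro_f1(per_doc: Iterable[Counts]) -> float:
    """Compute F1 per document, then average the scores."""
    docs = list(per_doc)
    return sum(f1(*d) for d in docs) / len(docs) if docs else 0.0


# Hypothetical counts: one large document and one small one.
counts = [(90, 10, 10), (1, 3, 3)]
print(round(micro_f1(counts), 3), round(macro_f1(counts), 3))
```

With these made-up counts the large document dominates the micro score, while the macro score weights both documents equally. This is one reason why a KPI needs a reference implementation.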
Challenges: Dataset and KPI Mismatch
[Table: annotation systems and frameworks versus the datasets used to evaluate them. The check marks (✓) are listed per row; their column positions are not recoverable.]
Columns: Year, ACE, Wiki, AQUAINT, MSNBC, IITB, Meij, AIDA/CoNLL, N3 collection, KORE 50, Wiki-Disamb30, Wiki-Annot30, Spotlight Corpus, SemEval-2013 task 12, SemEval-2007 task 7, SemEval-2007 task 17, Senseval-3, NIF-based corpus, Microposts2014, Software available?, Web service available?
Cucerzan 2007 ✓
Wikipedia Miner 2008 ✓* ✓
Illinois Wikifier 2011 ✓ ✓ ✓* ✓ ✓
Spotlight 2011 ✓ ✓ ✓
AIDA 2011 ✓ ✓ ✓**
TagMe 2 2012 ✓ ✓ ✓ ✓
Dexter 2013 ✓ ✓
KEA 2013 ✓
WAT 2013 ✓ ✓
AGDISTIS 2014 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Babelfy 2014 ✓ ✓ ✓ ✓ ✓ ✓ ✓
NERD-ML 2014 ✓ ✓ ✓ ✓
BAT-Framework 2013 ✓ ✓ ✓ ✓ ✓ ✓ ✓* ✓
NERD Framework 2014 ✓ ✓ ✓ ✓ ✓
GERBIL 2014 ✓ ✓ ✓ ✓ ✓ ✓ ✓* ✓ ✓ ✓ ✓ ✓ ✓ ✓
Challenges: Unclear KPI Semantics
Example
  Federated queries in distributed storage solutions
  Which time do we measure?
    First or last result?
    With or without network delay?
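The two readings of "query time" can differ a lot for streamed results. The following Python sketch measures both on the client side, so network delay is included. The `slow_endpoint` generator is a hypothetical stand-in for a federated endpoint, not a real API.

```python
import time
from typing import Iterable, Iterator, Tuple


def timed_results(results: Iterable[str]) -> Tuple[float, float, int]:
    """Consume a result stream; return (seconds to first result, seconds to last result, count)."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in results:
        if first is None:
            first = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    # For an empty stream both timings collapse to the total wait.
    return (first if first is not None else total, total, count)


def slow_endpoint(n: int, delay: float) -> Iterator[str]:
    """Hypothetical endpoint that streams n rows with a fixed delay per row."""
    for i in range(n):
        time.sleep(delay)
        yield f"row-{i}"


first, last, n = timed_results(slow_endpoint(5, 0.01))
print(f"first={first:.3f}s last={last:.3f}s rows={n}")
```

A benchmark that reports only one of the two numbers, without saying which, produces results that cannot be compared with those of another benchmark.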
Challenges: Unclear KPI Semantics
Example
  Entity recognition and linking
  When is an annotation correct?
    Weak or strong annotation?
    Semantically equivalent or exact URI?
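To make the first question concrete, the sketch below assumes that a "strong" match requires the exact same span and a "weak" match only requires overlapping spans. Both compare URIs as exact strings, so the second question (semantic equivalence) is deliberately left open. The `Annotation` type and the offsets are illustrative assumptions.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention
    uri: str    # entity the mention is linked to


def strong_match(gold: Annotation, found: Annotation) -> bool:
    """Exact span and same URI."""
    return (gold.start, gold.end, gold.uri) == (found.start, found.end, found.uri)


def weak_match(gold: Annotation, found: Annotation) -> bool:
    """Overlapping span and same URI."""
    overlaps = gold.start < found.end and found.start < gold.end
    return overlaps and gold.uri == found.uri


# Hypothetical example: gold marks "President Obama", the system marks only "Obama".
gold = Annotation(0, 15, "http://dbpedia.org/resource/Barack_Obama")
found = Annotation(10, 15, "http://dbpedia.org/resource/Barack_Obama")
print(strong_match(gold, found), weak_match(gold, found))
```

The same system output counts as an error under one definition and as a hit under the other, which changes precision and recall.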
Solution! Unified Benchmarking Framework
Rationale
  Provide all benchmark components in one package
  Include reference datasets and baselines
  Devise standardized tasks and reference KPI implementations
[Figure: GERBIL architecture. Components: GERBIL Core Controller, Benchmark Core, Dataset Wrapper, Annotator Wrapper, Interface (View), Configuration (Model), Persistent Experiment Database (Model), Open Datasets, DataHub.io, Your Dataset, Your Annotator. Datasets and annotators are reached through web service calls.]
GERBIL: HOBBIT v0.1
Features
  Unified benchmarking platform for NER/NEL
  18 reference annotation systems
  32 reference datasets
  Reference implementations of KPIs
Advantages
  Benchmarking ≈ 30× faster
  Archiving of results
  Citeable URIs
  Additional analysis
Availability
  Open-source project
  Local deployment
  Online instance
  Feedback for developers and users
http://gerbil.aksw.org
http://github.org/aksw/gerbil
GERBIL: HOBBIT v0.1
Annotator                  Tasks
NIF-based Annotators       2519
Babelfy                    958
DBpedia Spotlight          922
TagMe 2                    811
WAT                        787
Kea                        763
Wikipedia Miner            714
NERD-ML                    639
Dexter                     587
AGDISTIS                   443
Entityclassifier.eu NER    410
FOX                        352
Cetus                      1
Overall: 24.3K experiments
50+ papers
http://gerbil.aksw.org
http://github.org/aksw/gerbil
HOBBIT: Rationale
A community-driven unified benchmarking platform for the community
  Build upon 24.3K GERBIL experiments
  Experiments focus on Big Linked Data
  Designed to accommodate all Big Data
  Cover all steps of the Big (Linked) Data lifecycle
  Open benchmarks based on industrial data and use cases
HOBBIT: Aims
1 Gather real requirements
    Performance indicators
    Performance thresholds
2 Develop benchmarks based on real data
3 Provide universal benchmarking platform
    Standardized hardware
    Comparable results
4 Periodic benchmarking challenges
5 Periodic reporting
6 Found independent Hobbit association
HOBBIT: Overview
[Figure: project overview. Elements: Data Collection (industry data), Measure Collection, Benchmark Creation (Benchmark 1, Benchmark 2, ..., Benchmark n, each with tasks and KPIs), HOBBIT Platform, Solution 1, Solution 2, ..., Solution k, Challenges, Reports, Participants/Community.]
Survey: Questions
Questions
  In what areas are organizations active?
  What do people expect from benchmarks?
  How are benchmarks being used?
Profile                 Count
Solution providers      56
Technology users        67
Scientific community    65
Survey: Can your solution be benchmarked?
Survey: Do you benchmark your solution?
Own datasets and settings in many cases
Own implementations of measures
Results not comparable
Survey: Application Areas
http://big-data-europe.eu
HOBBIT Platform: Features
Uses established deployment technologies (Docker)
Decoupled components
Benchmark and systems can be written in different languages
Uses scalable message queues for communication
Open-source implementation
Supports distributed benchmarks and systems
Online instance on server cluster
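The decoupling through message queues can be illustrated with a small Python sketch. It is not the platform's code: in-process `queue.Queue` objects stand in for a real message broker, and the message fields (`task_id`, `payload`, `answer`) are hypothetical. The point is that the two sides exchange only serialized messages, which is what allows them to be written in different languages.

```python
import json
import queue
import threading

# In-process queues stand in for a real message broker in this sketch.
task_queue = queue.Queue()
result_queue = queue.Queue()
STOP = None  # sentinel that tells the system to shut down


def benchmark(n_tasks):
    """Benchmark side: publish tasks as JSON text, then the stop sentinel."""
    for i in range(n_tasks):
        task_queue.put(json.dumps({"task_id": i, "payload": f"text {i}"}))
    task_queue.put(STOP)


def system_under_test():
    """System side: only sees serialized messages, never benchmark objects."""
    while True:
        message = task_queue.get()
        if message is STOP:
            break
        task = json.loads(message)
        answer = {"task_id": task["task_id"], "answer": task["payload"].upper()}
        result_queue.put(json.dumps(answer))


worker = threading.Thread(target=system_under_test)
worker.start()
benchmark(3)
worker.join()

results = [json.loads(result_queue.get()) for _ in range(3)]
print(results)
```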
HOBBIT Platform: Features
Features
  Unified benchmarking platform for Big Data
  20+ reference annotation systems
  40+ reference datasets
  Reference implementations of KPIs
Advantages
  Benchmarks derived from real industrial data and use cases
  Scalable size of benchmarks
  Archiving of results
  Citeable URIs
  Result analysis
Availability
  Open-source project
  Local deployment
HOBBIT Platform: Architecture
[Figure: platform architecture. Components: Front End, Platform Controller, Storage, Analysis, Logging, Benchmark Controller, Data Generators, Task Generators, Benchmarked System, Evaluation Storage, Evaluation Module. Legend: data flow; creates component.]
HOBBIT Platform: Benchmark Initialization
[Figure: benchmark initialization. Components shown: Platform Controller, Storage, Benchmark Controller, Data Generators, Task Generators, Benchmarked System, Evaluation Storage. Legend: data flow; creates component.]
HOBBIT Platform: Benchmark Execution
[Figure: benchmark execution. Components shown: Platform Controller, Storage, Benchmark Controller, Data Generators, Task Generators, Benchmarked System, Evaluation Storage. Legend: data flow; creates component. Data shown: ex:Entity rdf:type ex:Class ...]
HOBBIT Platform: Benchmark Execution
[Figure: benchmark execution. Components shown: Platform Controller, Storage, Benchmark Controller, Data Generators, Task Generators, Benchmarked System, Evaluation Storage. Legend: data flow; creates component. Data shown: the query SELECT ?v WHERE { ?v a ex:Class } and a result table with column v and value ex:Entity ...]
HOBBIT Platform: Benchmark Execution
[Figure: benchmark execution. Components shown: Platform Controller, Storage, Benchmark Controller, Data Generators, Task Generators, Benchmarked System, Evaluation Storage. Legend: data flow; creates component. Data shown: a marker X and a result table with column v and value ex:Entity ...]
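One plausible reading of the three execution slides is: data reaches the benchmarked system, a task (here the SELECT query) follows, and both the expected and the received answer end up in the evaluation storage. The Python sketch below walks through that reading with in-memory stand-ins. `ToySystem`, the dictionary used as evaluation storage and the task identifier are hypothetical, and the query is answered by simple pattern matching rather than by a SPARQL engine.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"


class ToySystem:
    """Hypothetical benchmarked system: stores triples, answers one query shape."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def load(self, data: List[Triple]) -> None:
        self.triples.update(data)

    def instances_of(self, cls: str) -> Set[str]:
        # Equivalent in spirit to: SELECT ?v WHERE { ?v a <cls> }
        return {s for (s, p, o) in self.triples if p == RDF_TYPE and o == cls}


# Step 1: a data generator sends data to the system.
data = [("ex:Entity", RDF_TYPE, "ex:Class"), ("ex:Other", RDF_TYPE, "ex:Thing")]
system = ToySystem()
system.load(data)

# Step 2: a task generator sends a task and records the expected answer.
eval_storage: Dict[str, Dict[str, Set[str]]] = {}
task_id, task_class = "task-1", "ex:Class"
eval_storage[task_id] = {"expected": {"ex:Entity"}}

# Step 3: the system's response is stored next to the expected answer.
eval_storage[task_id]["received"] = system.instances_of(task_class)
print(eval_storage)
```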
HOBBIT Platform: Benchmark Evaluation
[Figure: benchmark evaluation. Components shown: Platform Controller, Storage, Benchmark Controller, Evaluation Module, Evaluation Storage. Legend: data flow; creates component. Data shown: two result tables with column v and value ex:Entity ...; KPI values precision=..., recall=..., F1-score=...; benchmark parameters: ...]
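The KPIs named on this slide can be computed from a stored pair of expected and received answers. The following Python sketch uses plain set comparison; the `evaluate` function and the example bindings are illustrative assumptions, not the platform's evaluation module.

```python
from typing import Dict, Set


def evaluate(expected: Set[str], received: Set[str]) -> Dict[str, float]:
    """Set-based precision, recall and F1 for one task."""
    tp = len(expected & received)
    precision = tp / len(received) if received else 0.0
    recall = tp / len(expected) if expected else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "F1-score": f1}


# Hypothetical stored pair: the system returned one correct and one wrong binding.
expected = {"ex:Entity"}
received = {"ex:Entity", "ex:Other"}
print(evaluate(expected, received))
```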
HOBBIT Platform: Benchmarks
Streaming and static deterministic benchmarks
Realistic benchmarks
Controlled volume and velocity
Generation and Acquisition
  Conversion of XML into RDF
  Entity recognition and linking
  Relation extraction
Analysis and Processing
  Link Discovery
  Machine Learning
  Supervised and unsupervised
Storage and Curation
  Triple stores
  Versioning
  Incl. updates
Visualization and Services
  Question Answering
  Faceted Browsing
  Usage-based benchmarks
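"Deterministic" together with "controlled volume and velocity" can be read as: the same parameters always yield the same stream. A minimal Python sketch of that idea follows. The parameter names `seed`, `volume` and `rate_per_s` are assumptions made for illustration.

```python
import random
from typing import Iterator, Tuple


def event_stream(seed: int, volume: int, rate_per_s: float) -> Iterator[Tuple[float, int]]:
    """Yield (timestamp in seconds, value) pairs.

    seed fixes the values, volume fixes how many events are produced,
    and rate_per_s fixes the spacing between them.
    """
    rng = random.Random(seed)  # private generator, so runs are repeatable
    for i in range(volume):
        yield (i / rate_per_s, rng.randint(0, 999))


run_a = list(event_stream(seed=42, volume=5, rate_per_s=10.0))
run_b = list(event_stream(seed=42, volume=5, rate_per_s=10.0))
print(run_a == run_b, len(run_a))
```

Two systems benchmarked with the same seed, volume and rate receive identical input, so differences in their results come from the systems and not from the data.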
Datasets: TWIG
Goal: Simulate real Twitter Firehose
Relies on 476 million tweets as training data
Mimicking algorithm based on
  Distribution of character frequencies
  Distribution of transportation frequency
  Network topology
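As an illustration of the first ingredient only, the Python sketch below learns a character frequency distribution from training texts and samples new text from it. It is a simplified stand-in and not the TWIG algorithm: characters are drawn independently, and the two training strings are hypothetical placeholders for the tweet corpus.

```python
import random
from collections import Counter


def learn_char_distribution(training_texts):
    """Count how often each character occurs in the training texts."""
    counts = Counter()
    for text in training_texts:
        counts.update(text)
    return counts


def generate_text(counts, length, seed):
    """Sample characters independently, weighted by their observed frequency."""
    rng = random.Random(seed)
    chars = sorted(counts)  # fixed order keeps output stable for a given seed
    weights = [counts[c] for c in chars]
    return "".join(rng.choices(chars, weights=weights, k=length))


# Hypothetical two-message training set standing in for the tweet corpus.
counts = learn_char_distribution(["big data", "linked data"])
sample = generate_text(counts, length=20, seed=7)
print(sample)
```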
Datasets: LinkedConnections
Goal: Simulate real transport network
Real transportation data from Belgium for training
Mimicking algorithm based on
  Observed correlation between population density and transportation
  Distribution of transportation frequency
  Network topology
Datasets: Printing Machinery
Goal: Simulate events from printing machinery
Mimicking algorithm using event correlations and distributions
[Figure: events by type over time, May 01 00:00 to May 02 00:00. x-axis: Time; y-axis: Events. Event types: Changing plate, Double sheet, Early sheet, Finish job, Misaligned sheet, Missing sheet, Operation partially completed, Performance, Printing interval, Production Good Sheet, Side guide warning, Start job, Washing blanket, Washing impression cylinder, Washing ink rollers, with washing ink fountain roller, with washing plates.]
Datasets: Weidmüller
Goal: Simulate events from injection molding machinery
Mimicking algorithm using event correlations and distributions
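The slides do not say how event correlations are modelled for the machinery datasets. One simple way to capture them, shown here purely as an assumption, is a first-order Markov chain: count which event follows which in an observed log, then sample new sequences from those counts. The short log below is hypothetical and reuses event names from the printing machinery slide.

```python
import random
from collections import Counter, defaultdict


def learn_transitions(event_log):
    """Count which event follows which in the observed log."""
    transitions = defaultdict(Counter)
    for current, following in zip(event_log, event_log[1:]):
        transitions[current][following] += 1
    return transitions


def generate_events(transitions, start, length, seed):
    """Walk the transition counts to emit a synthetic event sequence."""
    rng = random.Random(seed)
    events = [start]
    while len(events) < length:
        options = transitions.get(events[-1])
        if not options:  # no observed successor: stop early
            break
        names = sorted(options)
        weights = [options[n] for n in names]
        events.append(rng.choices(names, weights=weights, k=1)[0])
    return events


# Hypothetical log using event names from the printing machinery slide.
log = ["Start job", "Printing interval", "Double sheet", "Printing interval", "Finish job"]
model = learn_transitions(log)
print(generate_events(model, start="Start job", length=6, seed=1))
```

Every generated pair of consecutive events was observed in the log, so the synthetic stream keeps the pairwise correlations of the original while varying the concrete sequence.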
Datasets: Semantic Publishing
Goal: Simulate data from the BBC
Generator based on manually configurable set of correlations
HOBBIT Runs: Triple Stores
[Figure: panels QmpH and Updates for Virtuoso, Blazegraph and Fuseki. Axis labels: SPARQL worker (ticks 1, 4, 16) and updates (ticks 1, 10, 100, 1000).]
HOBBIT Runs: Runtimes
10× more effort for reduction of error rate by 30%
HOBBIT Runs: A2KB
System             AIDA/CoNLL-Comp.  IITB   KORE 50  MSNBC  Microposts2014-Train  N3-Reuters-128
AIDA               0.668             0.141  0.625    0.622  0.363                 0.391
Babelfy            0.448             0.129  0.564    0.423  0.311                 0.289
DBpedia Spotlight  0.545             0.262  0.341    0.457  0.448                 0.320
FOX                0.512             0.100  0.268    0.127  0.309                 0.518
FREME NER          0.358             0.074  0.160    0.208  0.254                 0.263
WAT                0.673             0.137  0.543    0.631  0.403                 0.480
xLisa              0.363             0.233  0.352    0.365  0.322                 0.274
Summary
Rationale
A community-driven unified benchmarking platform for the community
Provide benchmarks and baselines
Provide reference implementation of KPIs
Extensible and referenceable
Open-Source
@hobbit_project
Join HOBBIT
α completed
Join the HOBBIT community
Provide KPIs
Provide datasets
Join the platform development
Follow us on Twitter
https://project-hobbit.eu
Thank You
Axel Ngonga
AKSW Research Group
Institute for Applied Informatics
[email protected]

Michael Röder
AKSW Research Group
Institute for Applied Informatics
[email protected]
Acknowledgment
This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).