
Unified Benchmarking of Big Data Platforms: The HOBBIT Platform

Axel-Cyrille Ngonga Ngomo

Horizon 2020, GA No 688227

01/12/2016–30/11/2018

Apache Big Data, Sevilla, Spain

November 11, 2016

Ngonga Ngomo (InfAI) Benchmarking Big Data November 15th, 2016 1 / 42

A Lot of Data

[Figure; source: http://www.ibmbigdatahub.com/infographic/four-vs-big-data]

A Lot of Tools

[Figure; source: https://cloudramblings.me/]

A Lot ... of Tools

[Figure; source: http://mattturck.com/2016/02/01/big-data-landscape/]

A Lot of Views

[Figure; source: https://steemit.com/philosophy/@l0k1/subjectivity-and-truth-how-blockchains-model-consensus-building]

Core Questions

- Developers: How good is my tool?
- Vendors: Who is my tool good for?
- Users: Which tool(s) should I use for my application?

Many Questions

- Where are the current bottlenecks?
- Which steps of the data lifecycle are critical?
- Which solutions are available?
- Which key performance indicators are relevant?
- How well do or should tools perform?
- How do existing solutions perform w.r.t. relevant indicators?

Solution: Benchmark

Components
- Dataset(s), e.g., Twitter stream, sensor data
- Task(s), e.g., named entity recognition (NER), named entity linking (NEL), ingestion
- Key Performance Indicators, e.g., precision, recall
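
To make the KPI component concrete, here is a minimal Python sketch of precision, recall and F1 over sets of annotations. The function name `precision_recall_f1` and the example entities are assumptions made for illustration; they are not taken from GERBIL or HOBBIT.

```python
def precision_recall_f1(expected, predicted):
    """Return (precision, recall, F1) for two collections of hashable items."""
    expected, predicted = set(expected), set(predicted)
    true_positives = len(expected & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold standard and system output for an entity linking task.
gold = {"ex:Berlin", "ex:Germany", "ex:Angela_Merkel", "ex:Bundestag"}
system = {"ex:Berlin", "ex:Germany", "ex:Paris"}

p, r, f1 = precision_recall_f1(gold, system)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Two of the three predicted entities are correct and two of the four expected entities are found, so the sketch reports a precision of 2/3 and a recall of 1/2.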


Challenges: Dataset Mismatch

[Table: datasets on which each annotation system or framework was evaluated. The per-cell check marks are not recoverable from the extracted text.]

Columns: Year, ACE, Wiki, AQUAINT, MSNBC, IITB, Meij, AIDA/CoNLL, N3 collection, KORE 50, Wiki-Disamb30, Wiki-Annot30, Spotlight Corpus, SemEval-2013 task 12, SemEval-2007 task 7, SemEval-2007 task 17, Senseval-3, NIF-based corpus, Microposts2014, Software available?, Web service available?

Systems (year): Cucerzan (2007), Wikipedia Miner (2008), Illinois Wikifier (2011), Spotlight (2011), AIDA (2011), TagMe 2 (2012), Dexter (2013), KEA (2013), WAT (2013), AGDISTIS (2014), Babelfy (2014), NERD-ML (2014)

Frameworks (year): BAT-Framework (2013), NERD Framework (2014), GERBIL (2014)

Challenges: Unclear KPI Semantics

Example: Which time do we measure?
- First or last result?
- With or without network delay?
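
The ambiguity can be sketched in a few lines of Python. The helper `measure_latencies` and the toy generator `slow_results` are hypothetical names introduced for this illustration. The sketch records both the time to the first result and the time to the last result for the same run.

```python
import time


def measure_latencies(result_stream, clock=time.perf_counter):
    """Consume an iterable of results; return (time_to_first, time_to_last, count) in seconds."""
    start = clock()
    first = last = None
    count = 0
    for _ in result_stream:
        now = clock()
        if first is None:
            first = now - start
        last = now - start
        count += 1
    return first, last, count


def slow_results(n, delay):
    """Hypothetical system under test that emits n results, one every `delay` seconds."""
    for i in range(n):
        time.sleep(delay)
        yield i


first, last, count = measure_latencies(slow_results(5, 0.01))
print(f"first result after {first:.3f}s, last of {count} results after {last:.3f}s")
```

Both numbers are legitimate "runtimes" for the same experiment, so a benchmark has to state which one it reports. A clock read on the client side, as here, would also include any network delay between client and system.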


Challenges: Unclear KPI Semantics

Example: When is an annotation correct?
- Weak or strong annotation?
- Semantically equivalent or exact URI?
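
A small Python sketch of the two matching notions, under the assumption that a strong match requires identical character spans while a weak match only requires overlapping spans. The `Annotation` type and both function names are hypothetical. URIs are compared as plain strings, so semantically equivalent URIs are deliberately not handled here.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int   # character offset of the mention
    length: int  # number of characters in the mention
    uri: str     # entity the mention is linked to


def strong_match(gold: Annotation, predicted: Annotation) -> bool:
    """Exact match: same span and same URI."""
    return gold == predicted


def weak_match(gold: Annotation, predicted: Annotation) -> bool:
    """Overlap match: the spans share at least one character and the URIs agree."""
    gold_end = gold.start + gold.length
    predicted_end = predicted.start + predicted.length
    overlaps = gold.start < predicted_end and predicted.start < gold_end
    return overlaps and gold.uri == predicted.uri


# Text: "President Barack Obama visited Berlin."
gold = Annotation(start=10, length=12, uri="ex:Barack_Obama")      # "Barack Obama"
predicted = Annotation(start=0, length=22, uri="ex:Barack_Obama")  # "President Barack Obama"

print("strong match:", strong_match(gold, predicted))
print("weak match:", weak_match(gold, predicted))
```

The same system output counts as an error under strong matching and as a hit under weak matching, which is why scores are only comparable when the matching rule is fixed.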


Solution: Unified Benchmarking Framework

[Figure: GERBIL architecture. Labels: Benchmark Core; GERBIL Core Controller; Dataset Wrapper; Annotator Wrapper; Interface View; Configuration (Model); Persistent Experiment Database (Model); Web service calls; Open Datasets; DataHub.io; Your Annotator; Your Dataset.]

GERBIL: Overview

- Evaluation platform for NER/NEL
- 18 reference annotation systems
- 32 reference datasets
- Benchmarking 10× faster
- Archiving of results
- Citeable URIs
- Additional analysis
- Open-source project
- Local deployment
- Normalized implementation of KPIs
- Online instance
- Feedback for developers and users


GERBIL

Annotator                   Tasks
NIF-based Annotators         2519
Babelfy                       958
DBpedia Spotlight             922
TagMe 2                       811
WAT                           787
Kea                           763
Wikipedia Miner               714
NERD-ML                       639
Dexter                        587
AGDISTIS                      443
Entityclassifier.eu NER       410
FOX                           352
Cetus                           1

Overall: 24.3K experiments
50+ papers

HOBBIT: Rationale

A community-driven benchmarking framework for the community

- Focus on Big (Linked) Data
- Build upon 24.3K experiments performed with GERBIL
- Cover all steps of the Linked Data lifecycle

- Used by a growing number of companies
- Mature and maturing technologies

- Open benchmarks based on industrial data and use cases


Aims

1. Gather real requirements
   - Performance indicators
   - Performance thresholds
2. Develop benchmarks based on real data
3. Provide universal benchmarking platform
   - Standardized hardware
   - Comparable results
4. Periodic benchmarking challenges
5. Periodic reporting
6. Found independent HOBBIT association

Overview

[Figure: HOBBIT workflow. Labels: Data Collection (industry data); Measure Collection; Benchmark Creation; Benchmark 1, Benchmark 2, ..., Benchmark n (each with KPIs and tasks); HOBBIT Platform; Solution 1, Solution 2, ..., Solution k; Challenges; Reports; Participants/Community.]

Survey

Questions
- In what areas are organizations active?
- What do people expect from benchmarks?
- How are benchmarks being used?

Profile                 Count
Solution providers         56
Technology users           67
Scientific community       65

Survey: Can your solution be benchmarked?

Survey: Do you benchmark your solution?

- Own datasets and settings in many cases
- Own implementations of measures
- Results not comparable

Survey: Application Areas

HOBBIT Platform: Features

- Uses established deployment technologies (Docker)
- Decoupled components
- Benchmark and systems can be written in different languages
- Uses scalable message queues for communication
- Open-source implementation
- Supports distributed benchmarks and systems
- Online instance on server cluster

HOBBIT Benchmarks: Features

- Address all steps of the Linked Data lifecycle
- Benchmarks derived from industry use cases
- Real data underlying the benchmarks
- Scalable size of benchmarks
- Open-source implementation

HOBBIT Platform: Benchmarks

- Streaming and static deterministic benchmarks
- Realistic benchmarks
- Controlled volume and velocity

Generation and Acquisition
- Conversion of XML into RDF
- Entity recognition and linking
- Relation extraction

Analysis and Processing
- Link Discovery
- Machine Learning
- Supervised and unsupervised

Storage and Curation
- Triple stores
- Versioning
- Incl. updates

Visualization and Services
- Question Answering
- Faceted Browsing
- Usage-based benchmarks

HOBBIT Platform: Architecture

[Figure: platform architecture. Components: Platform Controller; Front End; Storage; Analysis; Logging; Benchmark Controller; Data Generators; Task Generators; Benchmarked System; Evaluation Module; Eval. Storage. Legend: data flow; creates component.]

HOBBIT Platform: Benchmark Initialization

[Figure: initialization step. Components shown: Platform Controller; Storage; Benchmark Controller; Data Generators; Task Generators; Benchmarked System; Eval. Storage. Legend: data flow; creates component.]

HOBBIT Platform: Benchmark Execution

[Figure, shown in three steps. Components shown: Platform Controller; Storage; Benchmark Controller; Data Generators; Task Generators; Benchmarked System; Eval. Storage. Legend: data flow; creates component.]

- Step 1 annotation: ex:Entity rdf:type ex:Class ...
- Step 2 annotations: SELECT ?v WHERE { ?v a ex:Class } and a result table (column v: ex:Entity ...)
- Step 3 annotations: X and a result table (column v: ex:Entity ...)
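
The execution steps can be sketched as a toy, single-process Python script. The direction of the data flow is an assumption, since the arrows of the figure are not part of the extracted text: data generators send data to the benchmarked system, task generators send tasks to the system and expected answers to the evaluation storage, and the system sends its answers to the evaluation storage. In-memory queues stand in for the platform's message queues, and all names are hypothetical.

```python
from queue import Queue

# In-memory stand-ins for the message queues between components (hypothetical names).
data_queue = Queue()   # data generator -> benchmarked system
task_queue = Queue()   # task generator -> benchmarked system
eval_storage = {}      # task id -> {"expected": ..., "received": ...}

RDF_TYPE = "rdf:type"


def data_generator():
    """Send the benchmark data to the system as (subject, predicate, object) triples."""
    for triple in [("ex:Entity", RDF_TYPE, "ex:Class"),
                   ("ex:Other", RDF_TYPE, "ex:OtherClass")]:
        data_queue.put(triple)


def task_generator():
    """Send one task to the system and its expected answer to the evaluation storage."""
    task_id = "task-1"
    # Stands in for the query: SELECT ?v WHERE { ?v a ex:Class }
    task_queue.put((task_id, "ex:Class"))
    eval_storage[task_id] = {"expected": {"ex:Entity"}, "received": None}


def benchmarked_system():
    """Toy system: load all triples, then answer each task by scanning them."""
    triples = []
    while not data_queue.empty():
        triples.append(data_queue.get())
    while not task_queue.empty():
        task_id, wanted_class = task_queue.get()
        answer = {s for s, p, o in triples if p == RDF_TYPE and o == wanted_class}
        eval_storage[task_id]["received"] = answer


data_generator()
task_generator()
benchmarked_system()
print(eval_storage)
```

Because the components only share queues and the evaluation storage, each one could in principle run in its own container and be written in a different language, which is the decoupling the platform features describe.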

HOBBIT Platform: Benchmark Evaluation

[Figure: evaluation step. Components shown: Platform Controller; Storage; Benchmark Controller; Evaluation Module; Eval. Storage. Legend: data flow; creates component. Annotations: two result tables (column v: ex:Entity ...); "precision=... recall=... F1-score=..." (shown twice); "benchmark parameters: ...".]

Datasets: TWIG

- Goal: Simulate real Twitter Firehose
- Relies on 476 million tweets as training data
- Mimicking algorithm based on
  - Distribution of character frequencies
  - Distribution of transportation frequency
  - Network topology
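
One ingredient of such a mimicking approach, sampling text from an observed character frequency distribution, can be illustrated in Python. This is not the TWIG algorithm, only a sketch of the general idea under simplifying assumptions (characters are drawn independently, and the two training strings are made up). The function names are hypothetical.

```python
import random
from collections import Counter


def fit_character_distribution(corpus):
    """Count how often each character occurs across the training texts."""
    counts = Counter()
    for text in corpus:
        counts.update(text)
    return counts


def generate_text(counts, length, rng):
    """Draw `length` characters independently, weighted by their observed frequency."""
    characters = sorted(counts)  # sorted so the ordering is reproducible
    weights = [counts[c] for c in characters]
    return "".join(rng.choices(characters, weights=weights, k=length))


# Hypothetical stand-in for the training tweets.
training = ["big data needs benchmarks", "linked data at scale"]
counts = fit_character_distribution(training)
rng = random.Random(42)  # fixed seed, so repeated runs generate the same text
print(generate_text(counts, 40, rng))
```

Seeding the random number generator keeps the generated data deterministic, so every benchmarked system can be given the same synthetic stream.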

Datasets: LinkedConnections

- Goal: Simulate real transport network
- Real transportation data from Belgium for training
- Mimicking algorithm based on
  - Observed correlation between population density and transportation
  - Distribution of transportation frequency
  - Network topology

Datasets: Printing Machinery

- Goal: Simulate events from printing machinery
- Mimicking algorithm using event correlations and distributions

[Figure: event timeline. x-axis: Time (May 01 00:00 to May 02 00:00); y-axis: Events. Event types: Changing plate; Double sheet; Early sheet; Finish job; Misaligned sheet; Missing sheet; Operation partially completed; Performance; Printing interval; Produktion Good Sheet; Side guide warning; Start job; Washing blanket; Washing impression cylinder; Washing ink rollers; with washing ink fountain roller; with washing plates.]

Datasets: Weidmüller

- Goal: Simulate events from injection molding machinery
- Mimicking algorithm using event correlations and distributions

Datasets: Semantic Publishing

- Goal: Simulate data from the BBC
- Generator based on manually configurable set of correlations

Join HOBBIT

- Join the HOBBIT community
- Provide KPIs
- Provide datasets
- Join the platform development
- Follow us on Twitter: https://twitter.com/hobbit_project


HOBBIT Run: Runtimes

HOBBIT Run: Effectiveness

System     Precision  Recall  F1-measure
FOX            0.515   0.310       0.351
Balie          0.369   0.230       0.249
Illinois       0.500   0.288       0.327
OpenNLP        0.442   0.241       0.285
Stanford       0.486   0.303       0.335

HOBBIT Run: Example results

A2KB, weak annotation match, Micro F1-measure

System             AIDA/CoNLL-Comp.  IITB   KORE 50  MSNBC  Microp.2014-Train  N3-Reuters-128
AIDA               0.668             0.141  0.625    0.622  0.363              0.391
Babelfy            0.448             0.129  0.564    0.423  0.311              0.289
DBpedia Spotlight  0.545             0.262  0.341    0.457  0.448              0.320
FOX                0.512             0.100  0.268    0.127  0.309              0.518
FREME NER          0.358             0.074  0.160    0.208  0.254              0.263
WAT                0.673             0.137  0.543    0.631  0.403              0.480
xLisa              0.363             0.233  0.352    0.365  0.322              0.274
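
The table reports micro-averaged F1. A short Python sketch of how micro and macro averaging differ follows, using made-up per-document counts. The function names are hypothetical, and the macro variant shown (mean of per-document F1 scores) is one of several definitions in use.

```python
def f1(tp, fp, fn):
    """F1 from true positive, false positive and false negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def micro_f1(per_document_counts):
    """Sum the counts over all documents, then compute a single F1."""
    tp = sum(c[0] for c in per_document_counts)
    fp = sum(c[1] for c in per_document_counts)
    fn = sum(c[2] for c in per_document_counts)
    return f1(tp, fp, fn)


def macro_f1(per_document_counts):
    """Compute F1 per document, then average the scores."""
    scores = [f1(*c) for c in per_document_counts]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for three documents.
counts = [(8, 2, 2), (1, 0, 3), (0, 1, 1)]
print(f"micro F1 = {micro_f1(counts):.3f}")
print(f"macro F1 = {macro_f1(counts):.3f}")
```

Micro averaging weights every annotation equally, so long documents dominate. Macro averaging weights every document equally. On these counts the two disagree noticeably, which is why the averaging mode is stated alongside the results.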

Thank You

http://project-hobbit.eu/get-involved/
http://goo.gl/forms/1iRIoG4Xpb
https://twitter.com/hobbit_project

Acknowledgment

This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).

