
TOWARDS A BIG DATA COMMUNITY CHALLENGE

Date post: 15-Feb-2016
Description:
Towards a Big Data Community Challenge. Tilmann Rabl, Florian Stegmaier, Michael Granitzer and Hans-Arno Jacobsen. 3rd Workshop on Big Data Benchmarking, July 16-17, Xi'an, China.
Transcript
Page 1: TOWARDS A BIG DATA COMMUNITY CHALLENGE

TOWARDS A BIG DATA COMMUNITY CHALLENGE

Tilmann Rabl, Florian Stegmaier, Michael Granitzer and Hans-Arno Jacobsen

3rd Workshop on Big Data Benchmarking
July 16-17

Xi'an, China

Page 2: TOWARDS A BIG DATA COMMUNITY CHALLENGE

BIG DATA – WHY COMMUNITY CHALLENGES MATTER

• Big Data is a major buzzword in the scientific world
- Conferences, workshops, tutorials, panels
- Component benchmarks, end-to-end systems, etc.

• Variety leads to incomparability of results

• Research communities run challenges to
… enable comparability of results
… foster evolution of a research field

“Kites rise highest against the wind, not with it.” (W. Churchill)

Page 3: TOWARDS A BIG DATA COMMUNITY CHALLENGE

WHAT SHOULD BE IN THE FOCUS?

DATA!

“[...] other communities, like information retrieval, natural language processing, or Web research, have a much richer and agile culture in creating, disseminating, and re-using interesting new data resources for scientific experimentation [...]” – G. Weikum, SIGMOD Blog

HOW SHOULD IT BE?

INTERESTING!

Page 4: TOWARDS A BIG DATA COMMUNITY CHALLENGE

HOW ARE “THE OTHERS” DOING?

• Information retrieval community:
– TREC, TRECVid (task-based, measurable scientific impact)
– CLEF Initiative (task-based, benchmarking initiatives)

• Multimedia community:
– Multimedia Grand Challenge (tasks defined by “global players”, e.g., Yahoo! and Microsoft)
– Open Source Software Competition (foster community activities)

• Semantic Web community:
– Linked Data Cup (data generation)
– Semantic Web in-Use (mashup creation)

Page 5: TOWARDS A BIG DATA COMMUNITY CHALLENGE

SUCCESSFUL COMMUNITY CHALLENGES: TAKE-HOME MESSAGE

• Challenges are not a single event

• On-going process, running through different stages:
– Data generation
– Solving restricted, high-impact issues
– Fostering open source frameworks
– Assembling mashups

• Accepted by the community

Page 6: TOWARDS A BIG DATA COMMUNITY CHALLENGE

BRAINSTORMING AREA: STRUCTURE OF THE CHALLENGE

• Challenge needs to be focused on specific tasks:
– Tasks assemble a “Big Data pipeline”
– Specified by academia and industry

• Hybrid approach to engage participants:
– Utilize benchmark activities
– Computing tasks on “Open Data”

Page 7: TOWARDS A BIG DATA COMMUNITY CHALLENGE

TIME TO BREAKOUT!

• Discussions should focus on:
– Where to find large-scale, interesting “open” data sets?
– Which tasks could form a sophisticated Big Data pipeline ensuring a broad range of implementations?

BREAKOUT HOW-TO:

• Breakout and student groups as yesterday
• Prepare one slide for each question

