Post on 10-May-2015

Kennisalliantie New Year's Reception, 31 January 2013: Prof. dr. Jacob de Vlieg, CEO and Scientific Director of the Netherlands eScience Center (NLeSC): “Taming the Big Data Beast Together”


Netherlands eScience Center ICT Synergy Hub, Amsterdam

Taming the Big Data Beast - Together

Kennisalliantie New Year's Meeting, Delft, 31 January 2013

Prof. dr. Jacob de Vlieg ¹ ²

1. CEO & Scientific Director of Netherlands eScience Center, NWO-SURF

2. Head Computational Design & Discovery, CMBI, Radboud University Medical Center, Nijmegen, Netherlands


• Big Data in Science: Challenges & Opportunities – Top Sector ICT Roadmap theme: “Data, Data, Data”

• Netherlands eScience Center (NLeSC)

– Expert centre for Big Data Research

• Joint NWO-NLeSC “Big Data” project call

– Public-private partnerships

Data are the lifeblood of modern science and the digital economy

Big Data: a complex concept – 4Vs: Volume, Variety, Velocity, Verification

Big Data is inextricably connected to eScience/HPC

ICT top sector roadmap: eScience is about intelligent infrastructure to model and/or access big data

Key eScience challenges in Big Data research

– Cross-type data integration

– Data-driven & multi-model simulations

– Visualization & analytics

– High performance computing: connected computers & fast networks

Science itself is changing… We need to change with it…

Neelie Kroes in “Giving Europe’s Scientists the Tools to Deliver”

Two key words: multidisciplinary research & data-driven discovery

eScience and the mystery of the empty labs

• Much more data per experiment (miniaturization and/or automation)

• External data sources & outsourcing

• Experimental design, data management & analytics (eScience)

Use apps and wearable sensors to monitor daily life, e.g. hours of sleep, food consumed, exercise taken, etc.

Quantified Self = Big Data + Mobile + Sensors + Visualization + Gamification

Quantified Self Movement -> Big Data

eScience Hero

• Big Data

• Pattern recognition

• Machine learning

• Social Media

Andy Grove (ex-CEO Intel)

Fights for medical innovation; Parkinson’s disease

Voice algorithms spot Parkinson's disease: data-driven diagnostics

• Machine learning algorithms that analyse voice recordings to detect Parkinson's symptoms early on (Little et al. @ Media Lab, MIT)

• Social Media:

Looking for volunteers to contribute to the database to improve pattern recognition

• 23andMe

• PatientsLikeMe.com

• And so on

Social networking health sites: patient-driven data collection

Big Data V = Verification: privacy, compliance, etc.

'Data Scientist' is now the hottest job title in Silicon Valley…

Tim O'Reilly

Founder of O'Reilly Media

Supporter of the free software and open source movements

McKinsey projected that the US needs 140,000 to 190,000 more workers with “deep analytical expertise”

Netherlands eScience Center

Netherlands organization for scientific research:

Principal Dutch body for ICT innovation for research

NLeSC, SURF: Science Park, Amsterdam; SARA, EGI

Networked innovation model

Bridge:

• Science & advanced ICT

• Industry & academic research

• Training & education

New ways to do research made possible because of Big Data/eScience

NLeSC portfolio divided into themes

• Sustainability & Environment

– Climate

– Water management

– Energy

– Ecology

• Chemistry & Materials

– Chemistry

• Humanities & Social Sciences

– Humanities

– Social Sciences

• Life Sciences

– Green Genetics

– Translational Research IT

– Foods

– Cognition/Neuroscience

• eScience Methodology & ‘Big Data’

– eScience Methodology

– Astronomy

Can scientists from digital humanities help food researchers?

Digital Humanities: BiographyNED

Project Leader: Guus Schreiber

Will improve the current version of the Biography Portal by incorporating analytical tools to show interconnections, trends, geographical maps and timelines.

Food Research: Food Specific Ontologies for Food Focused Text Mining

Project Leader: Wynand Alkema

Addresses the absence of domain-specific structured vocabularies, which limits the use of data mining & knowledge management methods in food research.
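To illustrate why a structured vocabulary matters for text mining, here is a minimal Python sketch. It is not the project's method: the vocabulary entries, the `tag_concepts` helper and the sample sentence are all hypothetical. It only shows how mapping variant terms onto one preferred label lets differently worded texts be retrieved under the same concept.

```python
import re

# Hypothetical mini-vocabulary: surface term -> preferred concept label.
# A real food ontology would be far larger and curated by domain experts.
FOOD_VOCABULARY = {
    "vitamin c": "ascorbic acid",
    "ascorbic acid": "ascorbic acid",
    "yoghurt": "yogurt",
    "yogurt": "yogurt",
}


def tag_concepts(text, vocabulary):
    """Return the sorted preferred labels whose terms occur in the text."""
    lowered = text.lower()
    found = set()
    for term, concept in vocabulary.items():
        # Word boundaries stop a term from matching inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add(concept)
    return sorted(found)


# Illustrative sentence, not taken from any real corpus.
concepts = tag_concepts("Yoghurt enriched with Vitamin C.", FOOD_VOCABULARY)
```

Without the vocabulary, a search for "ascorbic acid" would miss this sentence; with it, both spellings and both names resolve to the same concepts.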

eScience & Big Data: providing leads for new food applications

NLeSC eScience engineers: Scientists bridging research and advanced ICT

Deliver sustainable solutions for data-driven research

Work both at center and on site:

• Exchange of eScience expertise

• Re-use of proven eScience (technology hopping)

• Career development & training

Collaborative Innovation Network Taming the Data Beast Together


Grand scientific challenges lead to innovative eScience & Big Data Research

• eScience to allow unprecedented level of detail (large scale distributed computing)

• State-of-the-art visualization techniques to analyze hundreds of Terabytes of output

• Re-use of proven eScience concepts in new areas (e.g. water sector)

Prof. Henk Dijkstra, Univ. of Utrecht NLeSC Integrator Climate

eSalsa NLeSC project: data-driven simulations & advanced visualization to understand Climate Change

Dr. Jason Maassen eScience Engineer NLeSC

The number of data-driven start-ups is growing—particularly when it comes to social media.

Taming the Big Data Beast

Development of a high performance Twitter analysis platform

Hadoop – MapReduce architecture @ a large SARA computer cluster

Smart search & analysis software

Goal is to ask “Big Data” research questions e.g.

• Ability to analyze microblogging data produced over years

• Time-dependent

• Real-time sentiment analysis

• And so on…
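The MapReduce idea behind such a platform can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the platform's code: it runs in a single process without Hadoop, and the sample tweets and the function names `map_tweet`, `reduce_counts` and `run_job` are hypothetical. It counts how often each word appears per day, the kind of time-dependent question listed above.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample records: (ISO date, tweet text). Not real data.
TWEETS = [
    ("2013-01-30", "big data is everywhere"),
    ("2013-01-30", "taming the data beast"),
    ("2013-01-31", "data driven research"),
]


def map_tweet(date, text):
    """Map step: emit ((date, word), 1) for every word in a tweet."""
    for word in text.lower().split():
        yield (date, word), 1


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one (date, word) key."""
    return key, sum(values)


def run_job(tweets):
    """Run map, shuffle and reduce in memory and return a dict of counts."""
    # Shuffle phase: sorting makes pairs with equal keys adjacent.
    mapped = sorted(
        pair for date, text in tweets for pair in map_tweet(date, text)
    )
    return dict(
        reduce_counts(key, (count for _, count in group))
        for key, group in groupby(mapped, key=itemgetter(0))
    )


counts = run_job(TWEETS)
```

On a real cluster the map and reduce functions would run in parallel on many machines, with the framework handling the shuffle; the logic per record stays the same.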

Prof. Antal van den Bosch NLeSC Integrator Humanities Radboud University Nijmegen

Dr. Erik Tjong Kim Sang eScience Engineer NLeSC

Cyber-common: a facility for 21st century data-driven research and multidisciplinary team work


To link minds and eScience

The key to scientific questions y!

Joint NWO-NLeSC “data sciences” call

• Focus on stimulating public-private partnerships

• Three instruments:

– Industrial Partnership Programme (IPP)

– Technology Areas (TA)

– Knowledge Innovation Mapping SMEs (KIEM MKB)

Rosemarie van der Veen-Oei (NLeSC) r.vanderveen@nwo.nl T 070 3440 851

Mark Kas (NWO) m.kas@nwo.nl T 070 3440 811, M 06 205 93 207

www.nlesc.nl Netherlands eScience Center

