The BlueBRIDGE approach to collaborative research

Date post: 15-Apr-2017
Page 1: The BlueBRIDGE approach to collaborative research

BlueBRIDGE receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 675680
www.bluebridge-vres.eu

The BlueBRIDGE approach to collaborative research

Gianpaolo Coro
CNR, Italy

Page 2: The BlueBRIDGE approach to collaborative research

Context

Progress in Information Technology has changed the paradigms of Science

The large and fast increase in the volume and complexity of data requires new approaches to collecting, curating and analysing the data

It also requires new tools to guarantee the exchange and longevity of the data and the ability to re-run experiments

Page 3: The BlueBRIDGE approach to collaborative research

Big Data

• Large volume

• High generation velocity

• Large variety

• Untrustworthy (veracity)

• High complexity (variability)

Big Data: a dataset with large volume, variety, generation velocity, containing complex and untrustworthy information that requires nonconventional methods to extract, manage and process information within a reasonable time.

Page 4: The BlueBRIDGE approach to collaborative research

New Science Paradigms

Open Science: make scientific research, data and dissemination accessible to all levels of an inquiring society, amateur or professional.

Keywords: Open Access, Open research, Open Notebook Science

E-Science: computationally intensive science is carried out in highly distributed network environments that use large data sets and require distributed computing and collaborative tools.

Keywords: Provenance of the scientific process, Scientific workflows

Science 2.0: process and publish large data sets using a collaborative approach. Share from raw data to experimental results and processes. Support collaborative experiments and Reproducibility-Repeatability-Reusability (R-R-R) of Science.

Keywords: collaborative and repeatable Science

Page 5: The BlueBRIDGE approach to collaborative research

Requirements for IT systems

• Support collaborative research and experimentation
• Implement Reproducibility-Repeatability-Reusability of Science
• Allow sharing data, processes and findings
• Grant free access to the produced scientific knowledge
• Tackle Big Data challenges
• Sustainability: low operational and maintenance costs
• Manage heterogeneous data/processes access policies
• Meet industrial processes requirements

Page 6: The BlueBRIDGE approach to collaborative research

e-Infrastructures

e-Infrastructures enable researchers at different locations across the world to collaborate in the context of their home institutions or in national or multinational scientific initiatives.

• People can work together having shared access to unique or distributed scientific facilities (including data, instruments, computing and communications).

Examples:

• Belief, http://www.beliefproject.org/
• OpenAire, http://www.openaire.eu/
• i-Marine, http://www.i-marine.eu/
• EU-Brazil OpenBio, http://www.eubrazilopenbio.eu/

Page 7: The BlueBRIDGE approach to collaborative research

Virtual Research Environments

• Define sub-communities

• Allow temporary dedicated assignment of computational, storage, and data resources

• Manage policies

• Support data and information sharing

[Figure labels: e-Infrastructure; "Integrates"; Unified Resource Space; "Enables"; VRE, VRE, VRE; WPS; External e-Infrastructures]

Page 8: The BlueBRIDGE approach to collaborative research

Virtual Research Environments

Innovative, web-based, community-oriented, comprehensive, flexible, and secure working environments.

• Communities are provided with applications to interact with the VRE services
• Client services are provided both with APIs (Java, R) and simple HTTP-REST interfaces

Page 9: The BlueBRIDGE approach to collaborative research

VREs Example: The D4Science e-Infrastructure

D4Science supports scientists in several domains

1. More than 25 000 taxonomic studies per month (www.i-marine.eu)

2. More than 60 000 species distribution maps produced and hosted (www.d4science.eu)

3. Used to build a pan-European geothermal energy map (www.egip.d4science.org)

4. Processing and management of heterogeneous environmental and Earth system data (www.envriplus.eu)

5. Enhances communication and exchange in Linguistic Studies, Humanities, Cultural Heritage, History and Archaeology (www.parthenos-project.eu)

Page 10: The BlueBRIDGE approach to collaborative research

BlueBRIDGE VREs

Stock Assessment: assess the health status of fisheries stocks.

http://www.bluebridge-vres.eu/services/stock-assessment

CMSY model

Marine Protected Areas: reduce the adverse impact of human activities (e.g. fishing, aquaculture, tourism) on ecosystems, and ensure these activities are properly embedded in policy frameworks.

http://www.bluebridge-vres.eu/services/protected-area-impact-maps

Page 11: The BlueBRIDGE approach to collaborative research

Education VREs

Lecture-style: the emphasis on course topics varies with the audience

Interactive: after each topic is explained, students do experiments

Experimental: students reproduce the experiment shown by the teacher and possibly repeat it on their own data

Social: students communicate via messaging or the VRE discussion panel

• 1 course/year in Pisa
• 1 course/year in Paris
• 12 courses in Copenhagen (International Council for the Exploration of the Sea)
• 38 courses all over the world, 1000+ attendees

www.bluebridge-vres.eu

Page 12: The BlueBRIDGE approach to collaborative research

BlueBRIDGE VREs: Social Networking

Social networking is key to sharing information in an e-Infrastructure.

BlueBRIDGE offers a continuously updated list of events and news produced by users and applications.

[Screenshot labels: User-shared News; Application-shared News; Share News]

Page 13: The BlueBRIDGE approach to collaborative research

BlueBRIDGE VREs: The Workspace, an online file storage system

A free-to-use, folder-based file system allows managing and sharing information objects.

Information objects can be:
• files, datasets, workflows, experiments, etc.
• organized into folders
• shared
• disseminated via public URLs

Page 14: The BlueBRIDGE approach to collaborative research

BlueBRIDGE Facilities: Overview

Storage: databases, cloud storage, geospatial data

Data management: metadata generation and management, harmonisation, sharing

Processing: cloud computing, elastic resources assignment, multi-platform (R, Java, Fortran)

Page 15: The BlueBRIDGE approach to collaborative research

Data Processing

Page 16: The BlueBRIDGE approach to collaborative research

Cloud Computing Platform

• Experiments on Big Data
• Sharing inputs and results
• Saving the provenance of experiments
• Supporting R-R-R of experiments

[Figure labels: WPS; REST; Input/Output, Parameters, Provenance]
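Invocation through WPS can be illustrated with a minimal sketch that builds an Execute request in the key-value-pair encoding of OGC WPS 1.0.0. The endpoint, process identifier and input names below are hypothetical placeholders, not actual BlueBRIDGE services, and a real deployment would typically also need an authorisation credential that is not shown here.

```python
from urllib.parse import urlencode


def build_wps_execute_url(endpoint, process_id, inputs):
    """Return a WPS 1.0.0 key-value-pair Execute URL for one process."""
    # In the KVP encoding, inputs are name=value pairs joined by ';'.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": data_inputs,
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process and parameters, for illustration only.
url = build_wps_execute_url(
    "https://example.org/wps/WebProcessingService",
    "org.example.StockAssessment",
    {"InputFile": "https://example.org/data/catch.csv", "Iterations": 100},
)
print(url)
```

Because the request is a plain URL, the same experiment can be repeated from a browser, a script or third-party software by reusing the identical inputs and parameters.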

Page 17: The BlueBRIDGE approach to collaborative research

BlueBRIDGE computational capabilities

Project resources:
• 6 Virtual Machines (VMs) with 16 virtual CPU cores, 16 GB of RAM and 100 GB of storage
• 100 VMs with 2 virtual CPU cores, 8 GB of RAM and 20 GB of storage

Processes:
• ~200 algorithms hosted in all the VREs
• ~20 contributing institutes
• ~30,000 requests per month
• ~2000 scientists/students in 44 countries using VREs
• Programming languages: R, Java, Python, Fortran, Linux-compiled

External providers (European Grid Infrastructure):
• 6 VMs: 8 virtual CPU cores, 16 GB of RAM and 100 GB of storage
• 2 VMs: 16 virtual CPU cores, 32 GB of RAM and 100 GB of storage
• 24 VMs: 2 virtual CPU cores, 8 GB of RAM and 50 GB of storage
• 5 VMs: 4 virtual CPU cores, 8 GB of RAM and 80 GB of disk

Page 18: The BlueBRIDGE approach to collaborative research

Integrating new processes

Integration: putting a script that works offline into the Cloud computing platform.

Tools:
• https://wiki.gcube-system.org/gcube/How-to_Implement_Algorithms_for_the_Statistical_Manager
• https://wiki.gcube-system.org/gcube/Statistical_Algorithms_Importer

[Figure labels: R script; SAI - Importing tool; Automatic; Computing platform Web interface and Web service]

Page 19: The BlueBRIDGE approach to collaborative research

Advantages

• The process is available as-a-Service
• Invoked via communication standards
• Higher computational capabilities
• Automatic creation of a Web interface
• Provenance management
• Storage of results on a high-availability system
• Collaboration and sharing
• Re-usability, e.g. from other software (e.g. QGIS)

Page 20: The BlueBRIDGE approach to collaborative research

Collaborative experiments

[Figure labels: WS; Shared online folders; Inputs; Outputs; Results; Computational system; In the e-Infrastructure; Through third party software]

Page 21: The BlueBRIDGE approach to collaborative research

Ensemble Model

Implementation of an ensemble model approach to support advice and management in fisheries.

Thorpe et al. (2015). Evaluation and management implications of uncertainty in a multispecies size structured model of population and community responses to fishing. Methods in Ecology and Evolution, 6(1), 49-58.

Input:
• Diet Information
• Life history diet information
• Historical fishing scenarios
• MSY fishing scenarios
• Initial abundance values
• Life history prior information

Process: Python script

Output:
• Total Biomass
• Stock Spawning Biomass
• Life history traits

Page 22: The BlueBRIDGE approach to collaborative research

EM Integration

• Download the Python script and the user’s data
• Execute the script
• Collect the output
• Destroy local copies of I/O and script
• Save the output on the user’s Workspace, with provenance info

[Figure labels: Scientist’s provided script; User’s data; Infrastructure machine]
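The integration steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the script and data are passed in as strings instead of being downloaded, a local directory stands in for the user's Workspace, and the provenance record layout (`provenance.json` with SHA-256 digests) is invented for the example and is not the platform's actual format.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def run_isolated(script_text, input_text, workspace_dir):
    """Run a script on a private copy of the input; keep only output and provenance."""
    workspace = Path(workspace_dir)
    # The temporary directory is deleted when the block exits, which
    # destroys the local copies of the script, the input and the output.
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        (tmp_path / "script.py").write_text(script_text)  # stands in for the download
        (tmp_path / "input.txt").write_text(input_text)
        subprocess.run(
            [sys.executable, "script.py", "input.txt", "output.txt"],
            cwd=tmp_path, capture_output=True, text=True, check=True,
        )
        output_text = (tmp_path / "output.txt").read_text()  # collect the output
    # Save the output on the (stand-in) Workspace together with provenance info.
    (workspace / "output.txt").write_text(output_text)
    provenance = {
        "executed_at": datetime.now(timezone.utc).isoformat(),
        "script_sha256": hashlib.sha256(script_text.encode()).hexdigest(),
        "input_sha256": hashlib.sha256(input_text.encode()).hexdigest(),
        "output_file": "output.txt",
    }
    (workspace / "provenance.json").write_text(json.dumps(provenance, indent=2))
    return provenance


# Hypothetical provider script: sums the numbers found in the input file.
demo_script = """
import sys
with open(sys.argv[1]) as src:
    total = sum(float(line) for line in src if line.strip())
with open(sys.argv[2], "w") as dst:
    dst.write(str(total))
"""

workspace_dir = tempfile.mkdtemp(prefix="workspace_")  # stand-in for the Workspace
record = run_isolated(demo_script, "1\n2\n3.5\n", workspace_dir)
saved_output = Path(workspace_dir, "output.txt").read_text()
print(saved_output, record["script_sha256"][:12])
```

Recording digests of the script and the input alongside the output is one simple way to let another user check that a repeated run used the same ingredients.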

Page 23: The BlueBRIDGE approach to collaborative research

EM Interface

User’s private Workspace

Page 24: The BlueBRIDGE approach to collaborative research

EM Interface

Page 25: The BlueBRIDGE approach to collaborative research

EM Interface

Page 26: The BlueBRIDGE approach to collaborative research

EM Interface

Page 27: The BlueBRIDGE approach to collaborative research

Scientific Workflow

• The script provider updates the script on his/her private Workspace
• The service downloads the script on-the-fly
• A user executes an experiment on his/her data
• The output, the input and the parameters can be shared with another user
• This user can execute the experiment again and share the computation with the other user

Page 28: The BlueBRIDGE approach to collaborative research

Limitations and requirements

[Figure: required vs provided. Required: input, script and output as separate parts. Provided: a single script.]

Issues:
• Code is often designed for one precise data set
• Often, prototype scripts have code that is not separable from the I/O

In the context of e-Infrastructures and Science 2.0:
• Modularity is necessary for integration
• Scripts should be reorganised so that they can be reused on other data without changing the code
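The separation of code from I/O asked for here can be shown with a small sketch. The column name `catch` and the two tiny data sets are hypothetical; the point is only that the computation never opens a file itself, so the same code runs unchanged on any input stream.

```python
import csv
import io


def mean_catch(rows):
    """Pure computation: average of the 'catch' column, with no file access."""
    values = [float(row["catch"]) for row in rows]
    return sum(values) / len(values)


def run(in_stream, out_stream):
    """I/O wrapper: parse CSV from any stream and write the result to any stream."""
    rows = list(csv.DictReader(in_stream))
    out_stream.write(f"{mean_catch(rows):.2f}\n")


# Two different (hypothetical) data sets processed by the same, unchanged code.
dataset_a = io.StringIO("year,catch\n2001,10\n2002,20\n2003,30\n")
dataset_b = io.StringIO("year,catch\n2010,1.5\n2011,2.5\n")
result_a, result_b = io.StringIO(), io.StringIO()
run(dataset_a, result_a)
run(dataset_b, result_b)
print(result_a.getvalue(), result_b.getvalue())
```

In a hosted setting the streams would be replaced by the files the platform supplies at run time, for example paths received as command-line arguments.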

Page 29: The BlueBRIDGE approach to collaborative research

[Figure labels: WS; Self-consistent computational products; Repeatability: provenance (Prov-O); Reusability: use of standards; Reproducibility]
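As a rough illustration of what provenance expressed with Prov-O might look like, the sketch below describes one experiment run as subject-predicate-object triples using terms from the W3C PROV-O vocabulary (`prov:Activity`, `prov:used`, `prov:wasGeneratedBy`, `prov:startedAtTime`, `prov:endedAtTime`). The `https://example.org/experiment/` namespace, the identifiers and the timestamps are hypothetical, and the layout is an assumption rather than the format actually stored by the infrastructure.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/experiment/"  # hypothetical namespace


def prov_triples(run_id, input_ids, output_id, started, ended):
    """Describe one run as (subject, predicate, object) triples with PROV-O terms."""
    run = f"{EX}{run_id}"
    triples = [
        (run, RDF_TYPE, f"{PROV}Activity"),
        (run, f"{PROV}startedAtTime", started),
        (run, f"{PROV}endedAtTime", ended),
    ]
    # Every input is an entity that the run used.
    for input_id in input_ids:
        triples.append((run, f"{PROV}used", f"{EX}{input_id}"))
    # The output is an entity generated by the run.
    triples.append((f"{EX}{output_id}", f"{PROV}wasGeneratedBy", run))
    return triples


# Hypothetical identifiers and timestamps, for illustration only.
triples = prov_triples(
    "run-42",
    ["script-v1", "dataset-7"],
    "result-1",
    "2017-04-15T10:00:00Z",
    "2017-04-15T10:05:00Z",
)
for triple in triples:
    print(triple)
```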

Conclusions

• E-Infrastructures endow processes with several Science 2.0 features
• BlueBRIDGE offers an e-Infrastructure and resources to host processes and collaborate
• Effort is required from algorithm providers to comply with service and generalisation requirements
