eScience on Distributed Infrastructure in Poland
Marian Bubak
AGH University of Science and Technology, ACC Cyfronet, Krakow, Poland
dice.cyfronet.pl
PLAN-E, the Platform of National eScience/Data Research Centers in Europe, 29-30 September 2014, Amsterdam
Outline
- ACC Cyfronet AGH
- PL-Grid Consortium and Programme
- Focus on users: training and support
- Platforms and tools: towards a PL-ecosystem
- International cooperation, conferences
- Summary
Credits
ACC Cyfronet AGH: Michał Turała, Krzysztof Zieliński, Karol Krawentek, Agnieszka Szymańska, Maciej Twardy, Angelika Zaleska-Walterbach, Andrzej Oziębło, Zofia Mosurska, Marcin Radecki, Renata Słota, Tomasz Gubała, Darin Nikolow, Aleksandra Pałuk, Patryk Lasoń, Marek Magryś, Łukasz Flis
ICM: Marek Niezgódka, Piotr Bała, Maciej Filocha
PCSS: Maciej Stroiński, Norbert Meyer, Krzysztof Kurowski, Bartek Palak, Tomasz Piontek, Dawid Szejnfeld, Paweł Wolniewicz
WCSS: Paweł Tykierko, Paweł Dziekoński, Bartłomiej Balcerek
TASK: Rafał Tylman, Mścislaw Nakonieczny, Jarosław Rybicki
… and many other domain experts
ACC Cyfronet AGH
High Performance Computing: computational power, storage and libraries for scientific research; coordinator of PL-Grid Infrastructure development.
High Performance Networking: main node of the Cracow MAN; main node of the PIONIER network for southern Poland; access to the GEANT network.
Centre of Competence: participation in and coordination of national and international scientific projects.
40 years of expertise
TOP500 list, VI.2014:
Rank: 176 | Site: Cyfronet, Poland | System: Cluster Platform, Infiniband (Hewlett-Packard) | Cores: 25,468 | Rmax: 266.9 TFlops | Rpeak: 373.9 TFlops
Motivation and background
- Experiments in silico: advanced, distributed computing; big international collaborations
- e-Science and e-Infrastructure interaction
- World progress in Big Science: theory, experiment, simulation
- Data-intensive computing
- Numerically intensive computing
- Computational Science problems to be addressed: algorithms, environments and deployment
- 4th paradigm, Big Data, Data Farming
- Needs: increase of resources; support for doing science
PL-Grid Consortium
- Consortium creation – January 2007: a response to requirements from Polish scientists and to ongoing Grid activities in Europe (EGEE, EGI_DS)
- Aim: a significant extension of the computing resources provided to the scientific community (start of the PL-Grid Programme)
- Development based on: projects funded by the European Regional Development Fund as part of the Innovative Economy Programme; close international collaboration (EGI, …); previous projects (FP5, FP6, FP7, EDA, …)
- National network infrastructure available: PIONIER national project
- Computing resources: TOP500 list
- Polish scientific communities: ~75% of highly rated Polish publications come from 5 communities
- PL-Grid Consortium members: 5 Polish High Performance Computing centres, representing these communities, coordinated by ACC Cyfronet AGH
PL-Grid and PLGrid Plus in short
PL-Grid Project (2009–2012)
- Budget: total 21 M€, from EU: 17 M€
- Outcome: common base infrastructure; National Grid Infrastructure (NGI_PL)
- Resources: 230 TFlops, 3.6 PB
PLGrid Plus Project (2011–2014)
- Budget: total ca. 18 M€, from EU: ca. 15 M€
- Expected outcome: focus on users; specific computing environments; QoS by SLM
- Extension of resources and services by: 500 TFlops, 4.4 PB
- Keeping diversity for users: clusters (thin and thick nodes, GPU), SMP, vSMP, Clouds
PL-Grid project
Polish Infrastructure for Supporting Computational Science in the European Research Space – PL-Grid
PL-Grid aimed at significantly extending the amount of computing resources provided to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and at constructing a Grid system that would facilitate effective and innovative use of the available resources.
Budget: total 21 M€, from EC: 17 M€
Duration: 1.1.2009 – 31.3.2012
Managed by the PL-Grid Consortium, made up of 5 Polish supercomputing and networking centres
Project coordinator: Academic Computer Centre Cyfronet AGH, Krakow, Poland
Project web site: projekt.plgrid.pl
Main Project Objectives:
- Common (compatible) base infrastructure
- Capacity to construct specialized domain Grid systems for specific applications
- Efficient use of available financial resources
- Focus on HPC and scalability computing for domain-specific Grids
PL-Grid project – results
Publication of a book presenting the scientific and technical achievements of the Polish NGI, published by Springer in March 2012:
„Building a National Distributed e-Infrastructure – PL-Grid”
Lecture Notes in Computer Science, Vol. 7136, subseries: Information Systems and Applications
Content: 26 articles describing the experience and the scientific results obtained by the PL-Grid project partners, as well as the outcome of research and development activities carried out within the Project.
First working NGI in Europe in the framework of EGI.eu (since March 31, 2010)
Number of users (March 2012): 900+
Number of jobs per month: 750,000 - 1,500,000
Resources available:
Computing power: ca. 230 TFlops
Storage: ca. 3600 TBytes
High availability and reliability of the resources
Facilitating effective use of these resources by providing:
innovative grid services and end-user tools like Efficient Resource Allocation, Experimental Workbench and Grid Middleware
Scientific Software Packages
User support: helpdesk system, broad training offer
Various well-executed dissemination activities, carried out at national and international levels, which raised awareness and knowledge of the Project and of grid technology in Poland.
PLGrid Plus project
Domain-oriented services and resources of the Polish Infrastructure for Supporting Computational Science in the European Research Space – PLGrid Plus
Budget: total ca. 18 M€ including funding from the EC: ca.15 M€
Duration: 1.10.2011 – 31.12.2014
Five PL-Grid Consortium Partners
Project Coordinator: ACC CYFRONET AGH
The main aim of the PLGrid Plus project is to increase the potential of Polish science by providing the necessary IT services to research teams in Poland, in line with European solutions.
Preparation of specific computing environments, so-called domain grids, i.e. solutions, services and extended infrastructure (including software) tailored to the needs of different groups of scientists.
Domain-specific solutions were created for 13 groups of users, representing strategic areas and topics important for Polish and international science.
PLGrid Plus project – activities
Integration Services: national and international levels; dedicated portals and environments; unification of distributed databases; virtual laboratories; remote visualization; service value = utility + warranty; SLA management
Computing-Intensive Solutions: specific computing environments; adoption of suitable algorithms and solutions; workflows; cloud computing; porting of scientific packages
Data-Intensive Computing: access to distributed scientific databases; homogeneous access to distributed data; data discovery, processing, visualization, validation…; the 4th paradigm of scientific research
Instruments in Grid: remote transparent access to instruments; sensor networks
Organizational: organizational backbone; professional support for specific disciplines and topics
PLGrid Plus project – results
- New domain-specific services for 13 identified scientific domains
- Extension of the resources available in the PL-Grid Infrastructure by ca. 500 TFlops of computing power and ca. 4.4 PBytes of storage capacity
- Design and start-up of support for new domain grids
- Deployment of a Quality of Service system for users by introducing SLA agreements
- Deployment of new infrastructure services
- Deployment of a Cloud infrastructure for users
- Broad consultancy, training and dissemination offer
Publication of a book presenting the scientific and technical achievements of PLGrid Plus, published by Springer in September 2014:
„eScience on Distributed Computing Infrastructure”
Lecture Notes in Computer Science, Vol. 8500, subseries: Information Systems and Applications
Content: 36 articles describing the experience and the scientific results obtained by the PLGrid Plus project partners, as well as the outcome of research and development activities carried out within the Project.
A huge effort by 147 authors, 76 reviewers and the editorial team in Cyfronet.
PLGrid NG project
New-generation domain-specific services in the PL-Grid infrastructure for Polish Science
Budget: total ca. 14,889,773.23 PLN, including funding from the EC: 12,651,715.38 PLN
Duration: 01.01.2014 – 31.10.2015
Five PL-Grid Consortium Partners
Project Coordinator: ACC CYFRONET AGH
The aim of the PLGrid NG project is to provide a set of dedicated, domain-specific computing services for 14 new groups of researchers and to implement these services in the PL-Grid national computing infrastructure.
The 14 new domains: Meteorology, Biology, Personalized Medicine, Complex Networks, Mathematics, UNRES, Medicine, Computational Chemistry, eBaltic-Grid, Hydrology, Nuclear Power and CFD, OpenOxides, Geoinformatics, Metal Processing Technologies.
PLGrid NG project – activities
Tasks:
- Additional groups of experts involved – 14 communities/scientific topics identified
- Development and maintenance of the IT infrastructure, in line with best IT Service Management (ITSM) practices such as ITIL or ISO 20000
- Security of new applications, audits: at the development stage, before deployment and during exploitation
- Optimization of resource usage – IT experts, Operations Center
- Optimization of application porting
- User support: first-line support, Helpdesk, domain experts, training
[Diagram: domain grids and applications running on the PL-Grid Grid infrastructure (Grid services), built of clusters, high-performance computers and data repositories, on top of the PIONIER National Computer Network, together with New Advanced Service Platforms.]
PLGrid Core project
Competence Centre in the Field of Distributed Computing Grid Infrastructures
Budget: total 104,949,901.16 PLN, including funding from the EC: 89,207,415.99 PLN
Duration: 01.01.2014 – 30.11.2015
Project Coordinator: Academic Computer Centre CYFRONET AGH
The main objective of the project is to support the development of ACC Cyfronet AGH as a specialized competence centre in the field of distributed computing infrastructures, with particular emphasis on grid technologies, cloud computing and infrastructures supporting computations on big data.
PLGrid Core project – services
Basic infrastructure services
Uniform access to distributed data
PaaS Cloud for scientists
Application maintenance environment of the MapReduce type
End-user services
Technologies and environments implementing the Open Science paradigm
Computing environment for interactive processing of scientific data
Platform for development and execution of large-scale applications organized in a workflow
Automatic selection of scientific literature
Environment supporting mass data-farming computations
Focus on users
[Diagram: computer centres deliver hardware/software as user-friendly services; domain experts, a Help Desk, QoS/SLM and computing grants connect the infrastructure with real users.]
User support
Users of the Cyfronet computing resources are provided with support and professional help in solving any problems related to access and effective use of these resources.
An interdisciplinary team of IT experts with extensive knowledge of:
- different programming methods used in research: parallel, distributed and GPGPU programming
- various scientific software
- the specifics of work with HPC/Cloud systems
- various aspects of work with large data sets
Support methods:
- PL-Grid Infrastructure user support systems (Helpdesk, User’s Forum)
- documentation services, the PL-Grid User’s Manual
- face-to-face meetings and consultations at ACC Cyfronet AGH and at users' home institutions
International cooperation: collaboration with various institutions and initiatives dedicated to training scientists: Software Sustainability Institute (UK), Software Carpentry, Data Carpentry, Mozilla Science Lab, ELIXIR UK
Cyfronet is making every effort to become a Software Carpentry regional centre for Poland or Central Europe
Training
Training on basic and advanced services:
- traditional – at ACC Cyfronet AGH or at the interested users' home scientific institutions
- remote – using a teleconference platform (Adobe Connect) and e-learning platforms (Blackboard Learn currently; Moodle planned)
Courses are prepared based on the experts' experience gained, among others, during previous projects
A survey assessing the training is conducted after each course
GridSpace: a platform for e-Science applications
Experiment: an e-science application composed of code fragments (snippets), expressed in either general-purpose scripting programming languages, domain-specific languages or purpose-specific notations. Each snippet is evaluated by a corresponding interpreter.
GridSpace2 Experiment Workbench: a web application serving as the entry point to GridSpace2. It facilitates exploratory development, execution and management of e-science experiments.
Embedded Experiment: a published experiment embedded in a web site.
GridSpace2 Core: a Java library providing an API for development, storage, management and execution of experiments. Records all available interpreters and their installations on the underlying computational resources.
Computational Resources: servers, clusters, grids, clouds and e-infrastructures where the experiments are computed.
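The snippet/interpreter model described above can be sketched in a few lines. This is an illustrative toy, not GridSpace2 code: an "experiment" is an ordered list of snippets, and each snippet is evaluated by the interpreter registered for its language, just as GridSpace2 Core records the available interpreters.

```python
# Toy sketch of the GridSpace experiment model (illustrative, not real API):
# an experiment is an ordered list of snippets; each snippet is evaluated
# by the interpreter registered for its language.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Snippet:
    language: str   # e.g. "python", "bash", or a domain-specific notation
    code: str

class ExperimentEngine:
    def __init__(self):
        self.interpreters: Dict[str, Callable[[str], str]] = {}

    def register(self, language: str, interpreter: Callable[[str], str]):
        # GridSpace2 Core similarly keeps a registry of interpreters.
        self.interpreters[language] = interpreter

    def run(self, experiment: List[Snippet]) -> List[str]:
        # Evaluate snippets in order, each with its own interpreter.
        return [self.interpreters[s.language](s.code) for s in experiment]

engine = ExperimentEngine()
engine.register("echo", lambda code: code.upper())        # toy "interpreter"
engine.register("python", lambda code: str(eval(code)))   # toy evaluator

results = engine.run([Snippet("python", "6 * 7"), Snippet("echo", "done")])
print(results)  # ['42', 'DONE']
```

In the real platform the interpreters dispatch to installations on the underlying computational resources rather than to in-process lambdas.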
Collage: executable e-Science publications
Goal:
Extending the traditional scientific publishing model with computational access and interactivity mechanisms; enabling readers (including reviewers) to replicate and verify experimentation results and browse large-scale result spaces.
Challenges:
Scientific: A common description schema for primary data (experimental data, algorithms, software, workflows, scripts) as part of publications; deployment mechanisms for on-demand reenactment of experiments in e-Science.
Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing primary data sources.
Organizational: Provisioning of executable paper services to a large community of users representing various branches of computational science; fostering further uptake through involvement of major players in the field of scientific publishing.
DataNet: collaborative metadata management
Objectives
Provide means for ad-hoc metadata model creation and deployment of corresponding storage facilities
Create a research space for metadata model exchange and discovery, with associated data repositories and access restrictions in place
Support different types of storage sites and data transfer protocols
Support the exploratory paradigm by making the models evolve together with data
Architecture
Web Interface is used by users to create, extend and discover metadata models
Model repositories are deployed in the PaaS Cloud layer for scalable and reliable access from computing nodes through REST interfaces
Data items from Storage Sites are linked from the model repositories
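The ideas above, ad-hoc model creation and models that evolve together with their data, can be illustrated with a minimal in-memory sketch. The class and method names here are hypothetical and do not reflect the DataNet API:

```python
# Illustrative sketch (hypothetical names, not the DataNet API):
# an in-memory model repository supporting ad-hoc metadata model
# creation, record linking, and schema evolution alongside the data.
class ModelRepository:
    def __init__(self):
        self.models = {}   # model name -> {"fields": set, "records": list}

    def create_model(self, name, fields):
        self.models[name] = {"fields": set(fields), "records": []}

    def evolve_model(self, name, new_fields):
        # Exploratory paradigm: extend the schema as the research evolves.
        self.models[name]["fields"] |= set(new_fields)

    def add_record(self, name, record):
        model = self.models[name]
        unknown = set(record) - model["fields"]
        if unknown:
            raise ValueError(f"fields not in model: {unknown}")
        model["records"].append(record)

repo = ModelRepository()
repo.create_model("sample", ["id", "temperature"])
repo.add_record("sample", {"id": 1, "temperature": 295.0})
repo.evolve_model("sample", ["pressure"])   # the model grows with the data
repo.add_record("sample", {"id": 2, "pressure": 1.2})
print(len(repo.models["sample"]["records"]))  # 2
```

In DataNet, such model repositories are deployed in the PaaS Cloud layer and reached through REST interfaces rather than in-process calls.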
Cloud Platform: resource allocation management
The Atmosphere Cloud Platform is a one-stop management service for hybrid cloud resources, ensuring optimal deployment of application services on the underlying hardware.
Customized applications may directly interface Atmosphere via its RESTful API, called the Cloud Facade.
[Diagram: the Atmosphere Management Service (AMS), with its Atmosphere Internal Registry (AIR) and cloud stack plugins (Fog), manages an OpenStack/Nova computational cloud site (head node, worker nodes, Glance image store), Amazon EC2 and other cloud sites. Admins, developers and scientists work through the VPH-Share Master Interface; generic invokers, workflow management and external applications reach the platform through the secure RESTful Cloud Facade and its client.]
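A customized application talking to a secured RESTful management API such as the Cloud Facade might look like the sketch below. The base URL, endpoint path and token handling are hypothetical, chosen only to illustrate the pattern, and are not the real Cloud Facade routes:

```python
# Illustrative sketch of a REST client for a secured management API.
# The base URL and the "/appliances" endpoint are hypothetical examples,
# not documented Cloud Facade routes.
from urllib.parse import urljoin

class FacadeClient:
    def __init__(self, base_url, token):
        self.base_url = base_url.rstrip("/") + "/"
        self.token = token  # the real API is secured; a bearer token stands in

    def request(self, method, path, **params):
        # Build a request description; a real client would send it with an
        # HTTP library, passing the token in the Authorization header.
        url = urljoin(self.base_url, path.lstrip("/"))
        return {"method": method, "url": url,
                "headers": {"Authorization": f"Bearer {self.token}"},
                "params": params}

client = FacadeClient("https://cloud.example.org/api", token="secret")
req = client.request("POST", "/appliances", template="ubuntu-hpc")
print(req["url"])  # https://cloud.example.org/api/appliances
```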
InSilicoLab science gateway framework
Goals:
- Complex computations done in an uncomplicated way
- Separating users from the concept of jobs and from the infrastructure
- Modelling computation scenarios in an intuitive way
- Different granularities of computation
- Interactive nature of applications
- Dependencies between applications
Summary:
- The framework proved to be an easy way to integrate new domain-specific scenarios, even when done by external teams
- It natively supports multiple types of computational resources, including private resources, e.g. private clouds
- It supports various types of computations
[Figure: Architecture of the InSilicoLab framework: Domain Layer, Mediation Layer with its Core Services, and Resource Layer. In the Resource Layer, Workers ('W') of different kinds (marked with different colors) are shown.]
Scalarm
Scalarm overview:
- A self-scalable platform adapting to experiment size and simulation type
- An exploratory approach to conducting experiments
- Support for online analysis of partial experiment results
- Integration with clusters, Grids and Clouds
What problems are addressed with Scalarm?
- Data farming experiments with an exploratory approach
- Parameter space generation with support for design-of-experiments methods
- Access to heterogeneous computational infrastructure
- Self-scalability of the management part
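The parameter-space generation mentioned above can be sketched with one simple design-of-experiments method, a full-factorial design built as the Cartesian product of parameter value lists. This is an illustration of the general idea, not Scalarm's implementation:

```python
# Illustrative sketch of parameter-space generation for a data farming
# experiment: a full-factorial design (one simple design-of-experiments
# method) as the Cartesian product of per-parameter value lists.
from itertools import product

def parameter_space(parameters):
    """Yield one parameter dict per simulation run."""
    names = list(parameters)
    for values in product(*(parameters[n] for n in names)):
        yield dict(zip(names, values))

space = list(parameter_space({
    "velocity": [1.0, 2.0, 3.0],
    "temperature": [300, 350],
}))
print(len(space))   # 3 * 2 = 6 runs
print(space[0])     # {'velocity': 1.0, 'temperature': 300}
```

Each dict in `space` would be handed to one simulation instance on the underlying infrastructure; richer designs (e.g. Latin hypercube sampling) replace only the generator.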
VeilFS
Functionalities provided by VeilFS:
A system operating in user space (via FUSE) which virtualizes organizationally distributed, heterogeneous storage systems to provide uniform and efficient access to data.
End users access the data stored within VeilFS through one of the provided user interfaces:
- a FUSE client, which implements a file system in user space to hide the data location and exposes a standard POSIX file system interface,
- a Web-based GUI, which allows data management via any Internet browser,
- a REST API.
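Because the FUSE client exposes a standard POSIX interface, applications need no special library: ordinary file calls work unchanged, only the path points at the mount. In this sketch a temporary directory stands in for the VeilFS mount point (the mount path is an assumption for illustration):

```python
# Illustrative sketch: applications use plain POSIX file operations on a
# VeilFS mount. A temporary directory stands in for the mount point here
# (in a real deployment the path would be the VeilFS FUSE mount).
import os
import tempfile

with tempfile.TemporaryDirectory() as mount:   # stand-in for the mount point
    path = os.path.join(mount, "results.dat")
    with open(path, "w") as f:                 # plain POSIX open/write
        f.write("42\n")
    with open(path) as f:                      # plain POSIX read
        data = f.read()
print(data.strip())  # 42
```

The same code would run unmodified against data that VeilFS actually stores across several distributed storage systems, which is the point of the uniform POSIX layer.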
Chemistry: InSilicoLab for chemistry
The service aims to support the launch of complex computational quantum chemistry experiments in the PL-Grid Infrastructure.
The experiments offered by this service facilitate planning sequential computation schemes that require the preparation of a series of data files based on a common schema.
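Generating a series of input files from one common schema can be sketched with a simple template. The field names and file naming below are hypothetical, chosen only to illustrate the idea, and are not the service's actual input formats:

```python
# Illustrative sketch: generating a series of input files from a common
# schema (template). Field names and file naming are hypothetical, not
# the actual formats used by the chemistry service.
from string import Template

schema = Template("molecule = $molecule\nbasis = $basis\nmethod = $method\n")

runs = [{"molecule": "H2O", "basis": "cc-pVDZ", "method": "CCSD"},
        {"molecule": "NH3", "basis": "cc-pVDZ", "method": "CCSD"}]

# One input file per planned computation, all derived from the same schema.
files = {f"input_{i:02d}.inp": schema.substitute(run)
         for i, run in enumerate(runs)}
print(sorted(files))  # ['input_00.inp', 'input_01.inp']
```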
Metallurgy: Simulations of the extrusion process in 3D
Main Objective: optimization of the metallurgical process of profile extrusion.
Optimization includes:
- shape of the foramera,
- channel position on the die,
- calibration stripes,
- extrusion velocity, ingot temperatures, tools.
The proposed grid-based software simulates the extrusion of thin profiles and rods made of special magnesium alloys containing calcium additions. These alloys are characterized by extremely low technological plasticity during metal forming. An FEM mathematical model was developed.
Life Science: Integromics – a system for researchers from biomedicine and biotechnology
The system was developed to allow:
- data collection from experiments, laboratory diagnostics, diagnostic imaging, instrumental analysis and medical interviews,
- integration, management, processing and analysis of the collected data using specialized software and data mining techniques,
- hypothesis generation,
- data sharing and presentation of the results.
Example: the diagram of an artificial neural network used to classify patients based on the expression of selected genes. The method allows new hypotheses to be raised about the influence of individual genes on changes in organisms.
SynchroGrid: Elegant – a service for those involved in the design and operation of a synchrotron
The developed service consists in:
- provision of the elegant (ELEctron Generation ANd Tracking) application in a parallel version on a cluster,
- configuring the Matlab software to read the output files produced by this application in the Self-Describing Data Sets (SDDS) format and to generate the final results in the form of plots.
Objectives:
- Preparation of the tools needed for synchrotron deployment and operation, aimed at running and researching the beam line.
- Addressing the estimated users' needs in this scientific area, focusing on data access and management, especially metadata for the experimental data gathered during beam time.
International cooperation – EU funded projects
ACC Cyfronet AGH is involved in numerous projects co-financed by the EU funds and the Polish government.
Research conducted at Cyfronet focuses on:
grid and cloud environments,
programming paradigms,
research portals,
efficient use of computing and storage resources,
reconfigurable FPGA and GPGPU computing systems.
Organization of conferences
Cyfronet for many years has been organizing national and international conferences, workshops and seminars, which bring together computer scientists and researchers involved in the creation, development and application of information technologies, as well as the users of these technologies.
The Centre has also initiated a series of conferences:
• CGW Workshop, held yearly since 2001
• ACC Cyfronet AGH Users' Conference, held yearly since 2008
• as well as the International Conference on Computational Science (ICCS), organized twice: in 2004 and 2008
http://www.cyfronet.krakow.pl/cgw14/
Summary: what we offer
We develop and deploy research e-infrastructure in three dimensions:
- Network & Future Internet
- HPC/Grid/Clouds
- Data & Knowledge layer
Deployments have a national scope, with close European links
Developments are oriented towards end users & projects
Synergy between research projects and e-infrastructures is achieved through close cooperation and by offering relevant services
Durability for at least 5 years after the projects finish, confirmed in contracts
Future plans: continuation of the current policy with support from EU Structural Funds
- Center of Excellence in Life Science
- CGW as a venue for exchanging experience and for collaboration between eScience centres in Europe