
Supporting Research With Flexible Computation Resources

Page 1: Supporting Research With Flexible Computation Resources

Supporting Research With Flexible Computation Resources

David Wallom

Associate Director – Innovation, Oxford e-Research Centre

Technical Director – UK NES

Former VP Communities - OGF

- Federating Clouds in the UK NES, Oxford e-Research Centre and leading to EGI

SDCD 2012: Supporting Science with Cloud Computing 19th November 2012

Page 2: Supporting Research With Flexible Computation Resources

UK NGS Cloud Activities

• NGS Agile Deployment Environments: EPSRC funded, 2 years

• Staff:
– David Wallom (OeRC, Oxford);
– David Fergusson (NeSC, Edinburgh);
– Steve Thorn (NeSC, Edinburgh);
– Matteo Turilli (OeRC, Oxford).

• Goals:
– EC2 compatible, open source solution;
– development of a dedicated pool of images, supporting both end user and NGS requirements such as training;
– collecting data about feasibility, costs, stability;
– identify use cases and gather further requirements.

Page 3: Supporting Research With Flexible Computation Resources

Cloud Infrastructure for Research

Centralisation vs Federation

• Centralisation: one large, dedicated datacentre that serves the national HEI demand

• Federation: heterogeneous set of local infrastructures coordinated nationally in order to satisfy the HEI demand

Evaluation criteria

• Funding
• Scalability
• Flexibility
• Maintenance
• Support
• Accountability
• Obsolescence
• Competitiveness
• Security

Page 4: Supporting Research With Flexible Computation Resources

Eucalyptus Vs Nimbus, OpenNebula, OpenStack

Eucalyptus Pros

• Very good implementation of EC2 and EBS APIs;

• Enterprise support offered by Canonical through UEC;

• Dedicated installation in UEC;

• Modular design;

• Xen and KVM compatible;

• Open source and commercial.

Eucalyptus Cons

• Design limitations;

• AAA.

The others

• Limited EC2 API implementation;

• No native support for EBS;

• Globus WS4 (Nimbus);

• Early development stage;

• Slow development.

To keep an eye on

• OpenNebula 2.2 (to be tested);

• OpenStack Compute and OpenStack Object Storage.

Page 5: Supporting Research With Flexible Computation Resources

NGS Cloud Prototypes

Oxford III

• 6 x 2 AMD 2 core; 8GB RAM.
• 1 x 4 AMD 2 core; 32GB RAM.
• CentOS 5.4;
• Eucalyptus 1.6.2 installed from rpm repositories;
• Ganglia and Nagios monitoring systems;
• 5 default VM templates = 44/44/22/22/11 VMs (editable);
• 2TB ECB, 80GB Walrus.

Page 6: Supporting Research With Flexible Computation Resources

NGS Cloud Prototypes

Oxford IV

• 3 x 4 Xeon 6 core; 48GB RAM.
• 2 x 1 Xeon 2 core; 32GB RAM.
• Ubuntu 10.10;
• Ubuntu Enterprise Cloud;
• 2+2 bonded public NICs on CC;
• 12TB ECB, 12TB Walrus on SED disks;
• TPM on every motherboard.

Page 7: Supporting Research With Flexible Computation Resources

NGS Cloud Prototypes

Edinburgh II

• 32 x Sun Fire X4100
• Dual-core, 2.8 GHz Opteron, 8 GB RAM, 70 GB RAID1
• 64 cores
• 1 Headnode (Cloud and Cluster controllers)
• 31 Nodes (Node controller)
• Max 2 VMs per core: 124 slots (2GB RAM)
• VLANs for VM isolation
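
The slot count on this slide can be reproduced with a short calculation. The sketch below is only an illustration: it assumes that the 64 cores are spread evenly over the 32 machines (2 cores each), that only the 31 node controllers host VMs, and that host memory is divided evenly with no allowance for the hypervisor. The function and variable names are made up for the example.

```python
# Figures taken from the slide; names are illustrative only.
NODES = 31             # node controllers (the 32nd machine is the head node)
CORES_PER_NODE = 2     # assumption: 64 cores spread evenly over 32 machines
VMS_PER_CORE = 2       # stated maximum
RAM_PER_NODE_GB = 8


def vm_slots(nodes, cores_per_node, vms_per_core):
    """Total number of VM slots across all node controllers."""
    return nodes * cores_per_node * vms_per_core


def ram_per_slot_gb(ram_per_node_gb, cores_per_node, vms_per_core):
    """Memory per slot if host RAM is split evenly (hypervisor overhead ignored)."""
    return ram_per_node_gb / (cores_per_node * vms_per_core)


slots = vm_slots(NODES, CORES_PER_NODE, VMS_PER_CORE)
ram = ram_per_slot_gb(RAM_PER_NODE_GB, CORES_PER_NODE, VMS_PER_CORE)
print(slots, ram)
```

Under these assumptions the result matches the 124 slots of 2GB quoted above.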

Page 8: Supporting Research With Flexible Computation Resources

Managing and Monitoring

Tools

• Hybridfox + euca-tools: overall cloud usage and status + testing;

• Landscape: Canonical's management solution for UEC, not open source. RightScale was not tried, as it is a fairly expensive, hosted service;

• Linux CLI: dedicated scripts to monitor logs and daemons status.

Issues

• Public IP Database corruption (addressed in version 2);

• No user quota on the open source version of Eucalyptus;

• No accounting on the open source version of Eucalyptus;

• VERY verbose, non-persistent logs;

• Lack of error feedback in some conditions.
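
One way to cope with very verbose, non-persistent logs is a small script that keeps only the important entries and appends them to a file of its own. The following is a minimal sketch of that idea, not the scripts used on the NGS prototypes. The line layout (`<time> <LEVEL> <message>`), the level names, and the function names are assumptions made for the example; real Eucalyptus log lines may look different.

```python
import re

# Hypothetical line layout: "<time> <LEVEL> <message>".
LINE_RE = re.compile(
    r"^(?P<time>\S+)\s+(?P<level>DEBUG|INFO|WARN|ERROR|FATAL)\s+(?P<msg>.*)$"
)
KEEP = {"WARN", "ERROR", "FATAL"}  # levels worth persisting


def filter_log(lines):
    """Return (time, level, message) tuples for the entries worth keeping."""
    kept = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in KEEP:
            kept.append((match.group("time"), match.group("level"), match.group("msg")))
    return kept


def append_to_archive(entries, path):
    """Append the filtered entries to a persistent, tab-separated file."""
    with open(path, "a", encoding="utf-8") as handle:
        for time, level, msg in entries:
            handle.write(f"{time}\t{level}\t{msg}\n")


sample = [
    "12:00:01 DEBUG polling node state",
    "12:00:02 ERROR no free public address",
    "not a log line",
    "12:00:03 WARN volume attach retried",
]
important = filter_log(sample)
print(important)
```

Run periodically (for example from cron) with `append_to_archive`, a filter like this would keep a compact history even when the original log files are rotated away.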

Page 9: Supporting Research With Flexible Computation Resources

User Support

Tools

• Ticketing system: web-based platform (footprints). Addressed around 200 tickets in 1 year;

• Web site: subscription instructions, links to Eucalyptus documentation and to the support e-mail;

• Mailing list: used mainly to announce new services, scheduled or unscheduled downtime, planned upgrades.

Issues

• Access through institutional firewall via proxy;

• Available resources (limitation of Eucalyptus design);

• Instructions on how to build a dedicated image;

• Almost no issues about research and cloud computing.

• Difficult to manage user access with separate cloud systems…

Page 10: Supporting Research With Flexible Computation Resources

NGS Cloud Usage 2010/2011

• 106 registered users: uptake has been very fast and users stayed engaged throughout the whole testing period;

• 26 institutions: 23 HEI both universities and colleges, 3 companies;

• 30 projects;

• 10 research areas: Life sciences, Teaching, Mathematics, Cloud R&D, Physics, Ecology, Geography, Medicine, Social Science, Engineering.

Page 11: Supporting Research With Flexible Computation Resources

Exemplar Case Studies

• Evolutionary Genomics: “analysis and Information management of Next Generation Sequencing (NGS) of Genomic data poses many challenges in terms of time and size. We are exploring the translation of high quality NGS scientific analysis pipelines to make best use of Cloud infrastructure”;

• Geospatial Science: “geospatial data is a mix of raster and vector data. As rasterizing is CPU-hungry process, and all maps displayed on the screen of the final user are rasters, it is more efficient to do the process on the server side. I am investigating how this process can be dispersed across many, if not unlimited instances in a cloud”;

• Agent-based modelling of crime: “at the moment I have a tomcat server that hosts some web services used to run social simulation model, it needs access to the file system to run fortran scripts, create files etc. There are loads of problems with running our own server at uni and I think a virtual machine that I could have control over would be much better”.

Page 12: Supporting Research With Flexible Computation Resources

Flexible Services for the Support of Research (FleSSR)

6 Partners

• Academic and industrial;

• 3 cloud infrastructures.

Goals

Building federated cloud infrastructure, extending the use of UK NGS central services with cloud brokering and accounting.

Use cases

• Multi Platform Software Development;

• On demand Research data storage.

Page 13: Supporting Research With Flexible Computation Resources

FleSSR Architecture

[Diagram: cloud infrastructures at Oxford, Reading and Eduserv; Zeel/i Broker; STFC/NES Accounting Database]

Page 14: Supporting Research With Flexible Computation Resources

FleSSR Infrastructure

• Local/Global: services depend on either local or global access. Cloud brokering is not mandatory for AWS-like service access;

• Multiple identities: every user may have multiple identities, both local and global;

• Only personal identities: group identities are not implemented. The management of every single identity is left to the legally responsible user;

• Multiple AA technologies: AA may differ depending on local and global policies/technologies;

• Multiple accounting: usage is accounted separately for every single identity, so an individual may receive multiple invoices.
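
The consequence of per-identity accounting can be illustrated with a small sketch: usage records are grouped by identity rather than by person, so one person with a local and a global identity ends up with two invoices. Everything here is hypothetical (the record fields, the identity names and the rate); it is not the FleSSR accounting schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    person: str       # the legally responsible user
    identity: str     # one of that person's identities
    scope: str        # "local" or "global"
    cpu_hours: float


def invoices_by_person(records, rate_per_cpu_hour):
    """Return {person: {identity: amount}}, one invoice per identity."""
    totals = defaultdict(float)
    for record in records:
        totals[(record.person, record.identity)] += record.cpu_hours
    invoices = defaultdict(dict)
    for (person, identity), hours in totals.items():
        invoices[person][identity] = round(hours * rate_per_cpu_hour, 2)
    return {person: dict(per_identity) for person, per_identity in invoices.items()}


records = [
    UsageRecord("alice", "alice@oxford", "local", 10.0),
    UsageRecord("alice", "alice@broker", "global", 4.0),
    UsageRecord("alice", "alice@oxford", "local", 2.5),
    UsageRecord("bob", "bob@reading", "local", 1.0),
]
invoices = invoices_by_person(records, rate_per_cpu_hour=0.10)
print(invoices)
```

In this example the person "alice" receives two invoices, one for each identity, while "bob" receives one.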

Page 15: Supporting Research With Flexible Computation Resources

FleSSR Use Case: Multi Platform Software Development

[Diagram: Zeel/i Broker; Instance configuration manager; FleSSR cloud; Build manager; CVS / SVN repository; Build instances 1 to 5]

Page 16: Supporting Research With Flexible Computation Resources

FleSSR Use Case: On demand Research data storage

[Diagram: Zeel/i Broker; Volume Manager; FleSSR cloud; VM; EBS Interface; EBS Volume]

Page 17: Supporting Research With Flexible Computation Resources

FleSSR Output

Code

• Instance configuration and build manager: Perl command line utility + Java client utilising the Zeel/i API;

• Personal EBS volume manager: web-based, Java client for EBS volumes handling + tailored VM image with multiple data interfaces (SFTP, WebDAV, GlusterFS, rsync, ssh);

• Eucalyptus open-source accounting system: Perl aggregators and parsers for standard eucalyptus open-source log files + MySQL accounting database + PHP accounting client.
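
The parse, aggregate and store pipeline described in the last item can be sketched as follows. This is not the FleSSR code: the original used Perl parsers, a MySQL database and a PHP client, whereas this sketch uses Python with the standard-library `sqlite3` module as a stand-in so that it runs without a server. The record layout (`<user> <instance_id> <start_epoch> <end_epoch>`) and the table name are assumptions made for the example.

```python
import sqlite3


def parse_line(line):
    """Parse one hypothetical record; return None for lines that do not fit."""
    parts = line.split()
    if len(parts) != 4:
        return None
    username, instance, start, end = parts
    try:
        return username, instance, int(start), int(end)
    except ValueError:
        return None


def load(conn, lines):
    """Store all parsable records and return how many were loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vm_usage "
        "(username TEXT, instance TEXT, start_s INTEGER, end_s INTEGER)"
    )
    rows = [row for row in map(parse_line, lines) if row is not None]
    conn.executemany("INSERT INTO vm_usage VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def seconds_per_user(conn):
    """Aggregate instance run time per user."""
    cursor = conn.execute(
        "SELECT username, SUM(end_s - start_s) FROM vm_usage "
        "GROUP BY username ORDER BY username"
    )
    return dict(cursor.fetchall())


conn = sqlite3.connect(":memory:")
loaded = load(conn, [
    "alice i-001 0 3600",
    "bob i-002 100 700",
    "garbage",
    "alice i-003 0 1800",
])
report = seconds_per_user(conn)
print(loaded, report)
```

Keeping the parser separate from the database code means a change in the log layout only touches `parse_line`.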

Use cases

• SKA community testing of Use case;

• Institutional ICT team testing WebDAV, GridFTP & GlusterFS solution as Use case 2.

Page 18: Supporting Research With Flexible Computation Resources

Aiming to support multiple heterogeneous user communities, the EGI Federated Cloud Task Force

With thanks to Matteo Turilli, EGI FCTF Chair

Page 19: Supporting Research With Flexible Computation Resources

EGI New Challenges and Cloud Computing

Personalised environments for individual research communities in the European Research Area.

[Diagram: EGI.eu coordination, core software and support; community platforms (gLite, UNICORE, dCache, ARC, Globus); community services; IaaS offered by NGIs and by commercial providers, each with VM management, data, image sharing, monitoring, accounting and notification; EGI-wide message bus]

Page 20: Supporting Research With Flexible Computation Resources

Task Force Members and Technologies

Members

• 63 individuals.
• 23 institutions.
• 13 countries.

Technologies

• 7 OpenNebula.
• 3 StratusLab.
• 3 OpenStack.
• 1 Okeanos.
• 1 WNoDeS.

Stakeholders

• 15 Resource Providers.
• 7 Technology Providers.
• 6 User Communities.
• 3 Liaisons.

[Institution logos: BSC, CNRS, LMU, OeRC, Masaryk, TUD, IFAE, Cyfronet, SixSq, CESNET, TCD, SRCE, DANTE, FZJ, GRNET, GWDG, Utrecht, STFC, SARA, KTH, INFN, FCTSG, EGI.eu]

Page 21: Supporting Research With Flexible Computation Resources

Federation Model

[Diagram: several Hardware stacks, each with its own Cloud Management layer, exposed to User Communities through federated interfaces and federated services]

• Standards and validation: emerging standards for the interfaces and images – OCCI, CDMI, OVF.

• Resource integration: Cloud Computing to be integrated into the existing production infrastructure.

• Heterogeneous implementation: no mandate on the cloud technology.

• Provider agnosticism: the only condition to federate resources is to expose the chosen interfaces and services.
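
The "provider agnosticism" principle can be sketched in code: the federation only asks that each provider exposes the agreed interface and does not care how it is implemented. The sketch below is purely illustrative. The interface, its two methods and the site classes are hypothetical stand-ins and are not the OCCI or CDMI APIs.

```python
from abc import ABC, abstractmethod


class FederatedCompute(ABC):
    """Hypothetical stand-in for the interface every federated site exposes."""

    @abstractmethod
    def start_vm(self, image_id: str) -> str:
        """Start a VM from an image and return its identifier."""

    @abstractmethod
    def usage_records(self) -> list:
        """Return usage records for accounting."""


class SiteA(FederatedCompute):
    """One cloud stack; keeps its VMs in a list."""

    def __init__(self):
        self._vms = []

    def start_vm(self, image_id):
        vm_id = f"a-{len(self._vms)}"
        self._vms.append((vm_id, image_id))
        return vm_id

    def usage_records(self):
        return [{"vm": vm, "image": image} for vm, image in self._vms]


class SiteB(FederatedCompute):
    """A different stack with different internals; keeps its VMs in a dict."""

    def __init__(self):
        self._vms = {}

    def start_vm(self, image_id):
        vm_id = f"b-{len(self._vms)}"
        self._vms[vm_id] = image_id
        return vm_id

    def usage_records(self):
        return [{"vm": vm, "image": image} for vm, image in self._vms.items()]


class NotFederated:
    """A site that does not expose the agreed interface."""


def federate(candidates):
    """Admit only the sites that expose the agreed interface."""
    return [site for site in candidates if isinstance(site, FederatedCompute)]


sites = federate([SiteA(), SiteB(), NotFederated()])
started = [site.start_vm("img-1") for site in sites]
all_records = [record for site in sites for record in site.usage_records()]
print(started, len(all_records))
```

The two site classes share no implementation, yet the federation code treats them identically, which is the point of mandating interfaces rather than a cloud technology.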

Page 22: Supporting Research With Flexible Computation Resources

Federation Test bed – Sep 2012

Composed of 4 services, 2 management interfaces, 7 cloud infrastructures operated by 6 Resource Providers. 3 more providers are in the process of being federated.

Page 23: Supporting Research With Flexible Computation Resources

Federation Demo – Sep 2012

Federated services

• Information: GLUE 2.0, BDII
• Monitoring: Nagios
• Accounting: OGF UR, UR+ & StAR
• VM metadata: Marketplace
• Message Bus

Resource Providers (interfaces listed in the order the labels appear in the diagram)

• Venus-C: CDMI 1.0
• GWDG (ON/OS): OCCI 1.1, CDMI 1.0
• CESNET (ON): OCCI 1.1, CDMI 1.0
• CYFRONET (ON): OCCI 1.1
• KTH (ON): OCCI 1.1, CDMI 1.0
• CESGA (ON): OCCI 1.1
• FZJ (OS): OCCI 1.1
• IN2P3-CC (OS): OCCI 1.1

LDAP and MP/UR Clients labels are repeated alongside the Resource Providers in the diagram.

ON = OpenNebula.

OS = OpenStack.

MP = Marketplace.

UR = Usage Records.

Page 24: Supporting Research With Flexible Computation Resources

Use Cases

• Structural biology – We-NMR project: Gromacs training environments.

• Musicology – Peachnote project: music score search engine and analysis platform.

• Linguistics – CLARIN project: scalable ‘British National Corpus’ service (BNCWeb).

• Ecology – BioVel project: remote hosting of OpenModeller service.

• Software development – SCI-BUS project: simulated environments for portal testing.

• Space science – ASTRA-GAIA project: data integration with scalable workflows.

Page 25: Supporting Research With Flexible Computation Resources

EGI FCTF Conclusions

Output

• Adoption of standards for VM and data management.
• Interoperability across multiple cloud management platforms.
• Federation model compatible and consistent with current EGI infrastructure.
• Contribution to EGI user communities engagement and support.
• Documentation made available to the community.

Cycle #3, Sep 2012 – Mar 2013: Integration

• Focus on dev tools for management interfaces and clients for the test bed.
• Integration of the test bed services into the EGI infrastructure.
• Cloud brokering evaluation and deployment.
• Focus on use cases coordination and implementation.
• Opening of the test bed to early adopters.

Page 26: Supporting Research With Flexible Computation Resources

Usage so far

• Compute Capacity
– >900 VM slots

• Data
– ~16TB

• Marketplace
– 11 VM templates stored and available

• VM instantiation/Usage
– >3200 VMs (accounted for in the EGI central accounting facility)

Page 27: Supporting Research With Flexible Computation Resources

Federation Conclusions

• Utilisation of virtual infrastructure is the only scalable method to support a large number of disparate user communities across multiple different application design models;

• Federation is a robust and scalable model of national/European cloud infrastructure for research;

• Federation is only possible through the availability of open standards;

• Successful pilot tests of multiple prototypes of cloud infrastructure allowed a quicker development of the final model for EGI;

• Research & Development plays a crucial role in customising open-source cloud infrastructure solutions to the specific needs of academic research;

• Cloud is part of an ecosystem of e-infrastructure, not an e-infrastructure on its own.

Page 28: Supporting Research With Flexible Computation Resources

Questions?

