Transcript
Slide 1
Research Area Background
Area: systems - applied computer science
Question: what to do?
Dr. Dan Reed, Vice President at Microsoft, stated in his keynote talk "Clouds: from Both Sides New" in Washington in 2011 (my interpretation):
University researchers should find a research niche, because they do not have enough resources (human and financial) to compete against the mainstream research carried out by big companies.
SaaS Clouds Supporting HPC Biology Sciences - CMU CV July 2013
Slide 2
Andrzej Goscinski
Service and Cloud Computing Lab
Senior Members: A. Wong, P. Church, M. Brock
Slide 3
Biology and Medicine Needs
Biology and medicine specialists collect a lot of data.
Many of them use only their workstations, desktops and even laptops to carry out data analysis.
Many of them are not familiar with HPC.
Many biology and medicine specialists do not program well and do not have system administration skills (nor, I guess, should they need them).
Biology and medicine specialists would like to use computers to get analysis results quickly, without the burden of computing jargon.
Slide 4
Lab (Current) Research
Aim: to carry out a study into the development of a technology for simplifying the deployment, exposure, access and customization of HPC science applications in SaaS clouds.
This technology forms the basis of research environments that enable science specialists to use HPC resources in clouds to run their computationally demanding software easily, on demand and at reasonable cost, for the discovery of new and significant discipline knowledge.
Slide 5
The NIST Definition of Cloud Computing
NIST Special Publication 800-145, P. Mell and T. Grance, Sept 2011
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
This cloud model is composed of five essential characteristics, three service models, and four deployment models.
Slide 6
NIST: Service Models
Infrastructure as a Service (IaaS)
  The delivery of hardware resources as a service.
  Users are granted access to cloud infrastructure through virtual machines.
Platform as a Service (PaaS)
  Builds services on IaaS clouds that support cloud application deployment.
  Most cloud platforms consist of a high-level language and a well-defined Application Programming Interface.
Software as a Service (SaaS)
  Exposes applications designed to run on a cloud as services.
  Eliminates the need to install or run applications on the customer's computer and is often cheaper than buying a full software licence.
Slide 7
NIST: Deployment Models
Public Clouds
  Accessed by the general public.
  Allow users to rent resources such as computational time or storage as necessary.
Private Clouds
  Used exclusively by an organisation.
  Allow a specific service level agreement (SLA) to be made to ensure availability and security.
Community Clouds
  Used by a group of users that have shared concerns.
  Allow for a shared mission statement with specific security and policy requirements.
Hybrid Clouds
  Combine cloud resources from two or more deployment models to accomplish a user's goal.
Characteristics of Clouds that Attract Business
Clients pay only for what they consume.
Rather than spending money on buying, managing and upgrading servers, business administrators concentrate on the management of their applications.
The required service is always there: availability is very high, which leads to short times from submission to the completion of execution.
Cloud computing provides opportunities to small businesses by giving them access to world-class systems that would otherwise be unaffordable.
On the other hand, even small companies can export their specialized services to clients.
Slide 10
When Using Clouds, Additional Steps Must Be Carried Out Depending on the Service Model
IaaS: involves construction of a virtual cluster, and compilation and deployment of distributed software.
  These are system administrators' jobs.
PaaS: aimed at developers; provides users with a development environment and automates the deployment of resources.
  Limited access to development tools and languages.
SaaS: users are able to access HPC applications through graphical interfaces; however, users are reliant on what cloud service providers have made available.
  Such software may have expensive licenses or may not be readily available.
Slide 11
Cloud Trends (ChangeWave Investing Weekly Update, 5/21/2013)
Over the past 2.5 years, the percentage of companies that say they currently use public cloud computing services has climbed from 14% to 40%.
Slide 12
Cloud Trends (ChangeWave Investing Weekly Update, 5/21/2013)
The results of the latest ChangeWave cloud survey point to continued growth for public, private and hybrid cloud computing.
Within public cloud computing, software as a service (SaaS) remains the area with the fastest growth rate.
When asked why their companies do not use cloud computing, respondents most often cite Security Concerns (41%), while 15% cite the Complexity of Integrating with Existing IT Infrastructure.
HPC vs. HPC Clouds vs. Discipline Specialists
Problem 1: HPC requires
  powerful and expensive computational and data storage hardware,
  advanced middleware,
  sophisticated discipline-oriented applications,
  knowledgeable programmers and system managers.
Clouds have been created for business ($$$), not to earn money from HPC ($).
Most HPC clouds are based on IaaS clouds enhanced by additional hardware and middleware to support HPC.
Problem 2: the cost and time overheads of learning how to prepare an HPC cloud and properly install and configure applications in the underlying HPC facilities.
Conclusion: if discipline specialists want to use HPC clouds for scientific discovery, they must also become system administrators and good programmers.
Slide 15
Clouds and HPC
A response to Problems 1 and 2 faced by discipline specialists lies in cloud computing.
These days clouds can support some HPC workloads.
Clouds are oriented to support High Scalability Computing (HSC) rather than HPC.
Note: with the improvement of communication performance, clouds are becoming a major tool for HPC.
Question: what kind of HPC applications could be executed on a cloud?
Slide 16
HPC Clouds vs. Applications
Slide 17
HPC Clouds vs. Discipline Specialists
Most HPC clouds are based on IaaS clouds enhanced by additional hardware and middleware to support HPC.
Problem 3 again: the cost and time overheads of learning how to prepare an HPC cloud and its applications remain a problem.
  HPC cloud users are presented with a set of virtual and physical servers and are required to put the servers together to form the HPC facilities on which to run their software applications.
  The software applications must be properly installed and configured in the underlying HPC facilities.
Conclusion: if discipline specialists want to use HPC clouds for scientific discovery, they must also become system administrators and good programmers.
Slide 18
Web-based Software Tools/Packages
In many areas of science, discipline specialists benefit from Web-based software tools.
Software tools are easy to use and attractive to specialists through their discipline-oriented interfaces:
  scientific workflow systems (Galaxy),
  web portals for accessing grid resources (P-GRADE),
  web portals of scientific gateways such as HubZero.
Observation: specialists appreciate easy-to-use, Web-based, discipline-oriented interfaces!
Plenary "Cloud in Action" CLOUD 2013 panel
Slide 19
HPC Applications Exposed as Services in SaaS Clouds
Conclusion: discipline specialists could benefit most from the execution of their HPC applications if the applications are exposed as services in SaaS clouds and accessed through discipline (tool-based) interfaces.
Use of clouds (ChangeWave Research)
Slide 20
Merging SaaS Cloud Services and Web Tools
Question: are we on the right track? Yes, we are!
Giving users faster turnaround times on their experiments by using clouds is one of the major issues that a new version of the AGAVE software tool promises to address.
AGAVE is one of the well-known and widely used Web-based software tools.
AGAVE delivers science-as-a-service.
Data are processed using analytics provided as SaaS services.
Slide 21
Direct Research Questions
How can scientists be enabled to deploy software applications in clouds?
How can clouds be made easy for discipline researchers to use when running HPC applications?
How can the customization and reuse of HPC applications in clouds be supported?
These three questions form the current research scope of our Lab.
Our research aim again: develop a technology that automatically
  creates a virtual machine (VM),
  exposes an application as a service,
  deploys it on the VM,
  generates an easy-to-use interface (a Web form).
Slide 22
Initial Lab Research
Web services, which are used to develop services, are stateless.
  Our response: stateful Web services.
Service discovery and selection is a major barrier to the application of cloud computing (only simple catalogues are in use).
  Our response: a dynamic broker based on attributed names.
The application of HPC is unaffordable to small and medium research groups and institutions.
  Our response: the CaaS framework, which exposes a cluster as a service and makes it available within private and public clouds.
Slide 23
From IaaS/PaaS to SaaS with a Broker (M. Brock)
Slide 24
From IaaS/PaaS to SaaS with a Broker (M. Brock)
The RVWS Framework
  Allows the current activity and characteristics of resources to be exposed as services via WSDL documents.
  A compatible extension to existing Web standards.
The Dynamic Broker
  A discovery service that uses stateful WSDL documents.
CaaS Infrastructure
  Web service-based middleware for easy publishing, discovery and use of clusters.
HPCynergy
  A prototype private cloud built using CaaS for easy access to HPC resources and applications.
HPC Hybrid Deakin (H2D) Cloud
  Able to discover suitable resources from both public and private clouds to execute single applications too large for single clusters.
  All tasks such as parameter modification, data file break-up and monitoring of multiple applications are handled on behalf of the user.
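The discovery idea behind the Dynamic Broker (resources publish both their fixed characteristics and their current activity, and requests are matched against both) can be illustrated with a small sketch. This is not the RVWS or Dynamic Broker implementation: the class names, attribute names and the "meets every minimum" matching rule are all hypothetical, and plain Python dictionaries stand in for stateful WSDL documents.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    """Hypothetical stand-in for a published, stateful service description."""
    name: str
    characteristics: dict                       # fixed attributes, e.g. total cores
    state: dict = field(default_factory=dict)   # dynamic attributes, e.g. free cores


class DynamicBroker:
    """Toy discovery service that matches requests against resource attributes."""

    def __init__(self):
        self._records = {}

    def publish(self, record):
        self._records[record.name] = record

    def update_state(self, name, **state):
        # Resources report their current activity; later queries see the change.
        self._records[name].state.update(state)

    def discover(self, **minimums):
        """Return the names of resources whose attributes meet every minimum."""
        matches = []
        for record in self._records.values():
            attrs = {**record.characteristics, **record.state}
            if all(attrs.get(key, 0) >= value for key, value in minimums.items()):
                matches.append(record.name)
        return sorted(matches)


broker = DynamicBroker()
broker.publish(ResourceRecord("private-cluster", {"cores": 64}, {"free_cores": 8}))
broker.publish(ResourceRecord("public-cluster", {"cores": 128}, {"free_cores": 96}))

before = broker.discover(free_cores=32)           # only the public cluster qualifies
broker.update_state("private-cluster", free_cores=48)
after = broker.discover(free_cores=32)            # both clusters qualify now
print(before, after)
```

The point of the sketch is that the same query gives a different answer once a resource reports new state, which a static catalogue cannot do.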
Slide 25
SaaS Cloud Supporting HPC Science Applications
Transforming complicated HPC applications into easy-to-use SaaS cloud services
Three steps:
  Deployment of HPC applications on IaaS clouds
  Exposure of HPC application services
  Access of HPC application services
[Figure: framework diagram. Labels: User; Web Form; Virtual Machine Image; HPC Application Service; SaaS Cloud; HPC Resources; IaaS Cloud; HPC Application; HPC Application Service Registry; HPC Application Deployment; HPC Application Service, Web Form Generation; Publishing; Accessing; Deploying; Service Discovery (No/Yes)]
Slide 26
Using the Framework
A discipline researcher who wants to conduct scientific discovery by executing HPC applications on clouds contacts the HPC Application Service Registry.
Scenario 1: the HPC application service of the researcher's interest is found.
  The researcher selects the cloud service.
  Resources are selected automatically and the application deployment service sets up and configures the cloud.
  The automated interface generation service constructs a user-friendly, discipline-specific interface for the requested HPC application service.
  The researcher accesses the cloud service through the provided interface.
Scenario 2: the HPC application service of the user's interest is not found, but the discipline researcher has programming and system administration skills and decides to deploy a new targeted HPC application in an IaaS cloud.
  The Automatic HPC Application Deployment System can automate parts of this process.
  The outcome is either a virtual machine image containing a copy of the properly installed and configured HPC application, or a software service (consisting of input/output, invocation information and hardware requirements) which can be deployed on a virtual machine.
  Stage 1: the cloud service published in the HPC Application Service Registry is readily accessible in the IaaS cloud. The new cloud service generated by the Automatic HPC Application Deployment System is stored for future use in the HPC Application Service Registry.
  Stage 2: the user can employ the Automatic HPC Application Service and Web Form Generation System to automate the formation of an HPC Application Service exposing the HPC application. The HPC Application Service is abstracted by a user-friendly, discipline-specific interface that is published in the HPC Application Service Registry (see Scenario 1).
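The registry lookup and Web form generation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Lab's system: the `ServiceDescription` fields simply mirror the slide's "input/output, invocation information and hardware requirements", and every class name, function name and the `/run/...` form action is hypothetical.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ServiceDescription:
    """Hypothetical record for a published HPC application service."""
    name: str
    inputs: dict       # parameter name -> default value shown in the form
    invocation: str    # command template, e.g. "aligner --in {query}"
    min_cores: int     # hardware requirement


class ServiceRegistry:
    """Toy stand-in for the HPC Application Service Registry."""

    def __init__(self):
        self._services = {}

    def publish(self, description):
        self._services[description.name] = description

    def find(self, name):
        # Returns None when the service has not been published (Scenario 2).
        return self._services.get(name)


def generate_web_form(description):
    """Build a plain HTML form with one text field per input parameter."""
    rows = []
    for param, default in description.inputs.items():
        rows.append(
            f'<label>{escape(param)} '
            f'<input name="{escape(param)}" value="{escape(str(default))}"></label>'
        )
    body = "\n".join(rows)
    # The action URL is a made-up endpoint used only for illustration.
    return (f'<form method="post" action="/run/{escape(description.name)}">\n'
            f'{body}\n<button type="submit">Run</button>\n</form>')


def request_service(registry, name):
    description = registry.find(name)
    if description is None:
        return "not found: deploy and publish the application first"
    return generate_web_form(description)


registry = ServiceRegistry()
registry.publish(ServiceDescription(
    "aligner", {"query": "input.fa", "processes": 4}, "aligner --in {query}", 8))

form = request_service(registry, "aligner")       # Scenario 1: service found
missing = request_service(registry, "assembler")  # Scenario 2: service not found
print(form)
```

Deriving the form from the service description means a newly published service gets an interface without any hand-written page, which is the property the framework relies on.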
Slide 27
Implementation of the HPC Cloud Framework (A. Wong)
Services provided at each layer of the cloud service stack:
  Bottom (IaaS layer): Amazon EC2 was used to provide cloud infrastructure services.
  Middle (HPCaaS layer): an HPC software library was used to expose and access Amazon EC2 services.
  Top (SaaS layer): an HPC application service was developed and exposed as a tool in the Galaxy server.
Slide 28
The Galaxy Web-based Platform (A. Wong)
Galaxy provides a powerful feature for tool integration, where each tool (application) is presented to users as a Web form.
Slide 29
An Interface to Access the HPC Cloud (A. Wong)
An HPC cluster was being constructed, in which the compute instances of the cluster would support mpiBlast execution.
Slide 30
An Interface to Access the HPC Cloud (A. Wong)
A cluster of 8 nodes was constructed on Amazon EC2.
Slide 31
An Interface to Access mpiBlast (A. Wong)
mpiBlast was accessed by supplying parameters: cluster name, number of processes and other typical parameters.
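How the form parameters might be turned into an mpiBlast run can be sketched as below. The slides do not show the actual invocation, so this assumes a conventional launch through `mpirun -np` with BLAST-style options (`-p`, `-d`, `-i`, `-o`) and an Open MPI-style `--hostfile` option. The function name and the mapping from cluster name to a hostfile path are hypothetical.

```python
import shlex


def build_mpiblast_command(cluster_name, processes, program, database, query, output):
    """Assemble an argument list for launching mpiBlast on a named cluster."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    # Hypothetical convention: one hostfile per cluster name.
    hostfile = f"/etc/clusters/{cluster_name}.hosts"
    return [
        "mpirun", "-np", str(processes), "--hostfile", hostfile,
        "mpiblast", "-p", program, "-d", database, "-i", query, "-o", output,
    ]


command = build_mpiblast_command(
    cluster_name="ec2-cluster", processes=8,
    program="blastn", database="nt", query="query.fa", output="results.txt",
)
# shlex.join quotes each argument so the line can be logged or pasted safely.
print(shlex.join(command))
```

Building an argument list rather than a single string keeps user-supplied form values from being interpreted by a shell.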
Slide 32
An Interface to mpiBlast (A. Wong)
mpiBlast execution finished on Amazon EC2; its result file was transferred automatically to the Galaxy server for post-processing.
Slide 33
Uncinus: Cloud Deployment (P. Church)
Supports
  Resource Allocation
  Workflow Orchestration
  Cloud Bursting
Genomics in the clouds
  Gene Discovery
  Personalized Genomics
Leverage EC2 to improve the speed and accuracy of analysis.
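Cloud bursting, one of the features listed above, means using local capacity first and sending only the overflow to a public cloud. The slides do not describe the policy Uncinus actually uses, so the following is a simple greedy illustration with hypothetical names and slot counts.

```python
def allocate_with_bursting(tasks, private_slots, public_slots):
    """Assign tasks to private slots first; burst the overflow to public slots."""
    placement = {"private": [], "public": [], "queued": []}
    for task in tasks:
        if len(placement["private"]) < private_slots:
            placement["private"].append(task)
        elif len(placement["public"]) < public_slots:
            # Private capacity is exhausted, so this task bursts to the public cloud.
            placement["public"].append(task)
        else:
            # No capacity anywhere; the task waits for a free slot.
            placement["queued"].append(task)
    return placement


steps = [f"step{i}" for i in range(1, 9)]
plan = allocate_with_bursting(steps, private_slots=3, public_slots=4)
print(plan)
```

With three private and four public slots, the first three steps stay local, the next four burst to the public cloud, and the eighth is queued.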
Slide 34
Uncinus: Case Study (P. Church)
To identify genes transferred upon digestion of dairy products (Mother -> Child).
An 8-step workflow was developed and run on Uncinus.
It was run on the following resources:
  Resources              #Nodes
  Amazon (cc1.4xlarge)   2
  Amazon (m1.Large)      2
  West-Lin Cluster       2
  Mamsap Server          1
Slide 35
Uncinus: Case Study (P. Church)
Cloud bursting improved performance.
Workflow mode reduced run time by 8 hours.
Slide 36
Uncinus: Case Study (P. Church)
Results from the workflow found genes active during lactation and during digestion of dairy.
Is this gene transfer or a reaction? Further work is needed.
Slide 37
Increasing Scalability
Hybrid Clouds
[Figure: hybrid cloud architecture. Labels: Storage Cloud; Compute Cloud; Publishing; Service Request; (Distributed) Service Broker (Broker 1, Broker N); Private Compute Cloud; Public Clouds]
Slide 38
Solutions from Hybrid/Federated Clouds
Hybrid/Federated Cloud Management (FCM) Architecture
  A recent work that provides a reference architecture consisting of brokering services.
  User requests are serviced by creating virtual appliances based on user request parameters; the appliances are run inside virtual machines.
  Appliances are stored in repositories and decomposed over time to support the creation of future appliances.
  As virtual appliances contain a software stack (from the operating system upwards), there are high data transfer costs.
Slide 39
Solutions from Hybrid/Federated Clouds
There is also an (unnamed) toolkit for VM migration between clouds.
  Users are able to transfer VMs between public and private clouds to control load (manually or automatically).
  However, the interface itself is primitive at best.
Slide 40
Conclusions
Clouds are moving from business to specialized research.
HPC on clouds promises scalability, faster turnaround times, lower costs and services on demand.
Discipline specialists should not be forced to become (good) programmers and system administrators.
Easy, discipline-oriented interfaces are very important.
Web tools offer discipline-oriented interfaces but are inflexible and do not support HPC widely.
Combining HPC clouds and Web tools is the way forward.
HPC applications exposed as services of a SaaS cloud and accessed using Web forms are the solution!
Hybrid clouds will grab the HPC market.