BIOS-ICGEB HPCC for Bioinformatics, June 1, 2015
BIOS-ICGEB: High Performance and Cloud Computing (HPCC) for Bioinformatics
King Jordan
Georgia Tech
Pedagogy is the discipline that deals with the theory and practice of education; it thus concerns the study and practice of how best to teach.
http://en.wikipedia.org/wiki/Pedagogy
What is it that we intend to teach in this course …
What is the approach that we will take to teach the course material …
Hint … active participation of students and interaction with faculty will be crucial to our success
Talk Outline
• About this course: Theme, Goals, Structure, Schedule
• Overview: High performance and cloud computing for bioinformatics (HPCC)
Course Theme
Much like Physics in the 20th century, Biology in the 21st century is rapidly becoming an information science
High throughput experimental techniques generate massive amounts of biological information – e.g. sequence, expression, interaction data
Information alone is essentially worthless
The modern biologist must be able to convert information into knowledge
To do this, we must rely on computers …
Course Theme
Bioinformatics occupies the intersection of the life sciences and computer science. The goal of bioinformatics is to convert information into knowledge via computational approaches.
High Performance Computing (HPC) is the use of supercomputers and parallel processing techniques for solving complex computational problems.
Cloud Computing means storing and accessing data and programs over the Internet instead of your computer's hard drive. The cloud is just a metaphor for the Internet.
How do HPC and Cloud Computing intersect and differ? (later)
Course Goals
Theory – students (participants) should understand the basic theoretical underpinnings of HPCC. Theory will be emphasized only to the extent that it informs practice.
Practice – students (participants) should be able to apply HPCC tools and techniques to their research immediately following the course. [Via a specific roadmap and a specific set of tools … a ‘hot start’ that can later be adopted or changed according to each student’s specific needs.]
Course Structure
AM: Research domain-specific lectures & instruction
• Faculty Talks
• Student (Participant) Talks
• Faculty-student discussion
Student-Faculty Interaction
PM: Practical exercises in HPC and Cloud Computing
• CIAT HPC tools
• Amazon cloud
• Ecloulds
• Illumina BaseSpace
• BIOS HPC tools
Course Structure
Our pedagogy is grounded in active learning
Student participation and student-faculty interaction will be critical to the success of the course
We need your help with this!
Be active, not passive, learners.
We are a relatively small group, so we should be able to do this!
HPC Overview: Supercomputer clusters
A computer cluster is a single logical unit consisting of multiple computers that are linked through a local area network (LAN). The networked computers essentially act as a single, much more powerful machine. A computer cluster provides much faster processing speed, larger storage capacity, better data integrity, superior reliability and wider availability of resources.
Computer clusters are, however, much more costly to implement and maintain. This results in much higher running overhead compared to a single computer. (This is where cloud computing comes in …)
http://www.techopedia.com/definition/6581/computer-cluster
HPC Overview: Parallel computing
Parallel computing is a type of computing architecture in which several processors execute or process an application or computation simultaneously. Parallel computing helps in performing large computations by dividing the workload between more than one processor, all of which work through the computation at the same time.
Most supercomputers employ parallel computing principles to operate. Parallel computing is also known as parallel processing.
http://www.techopedia.com/definition/8777/parallel-computing
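As a small illustration of dividing a workload between processors, the sketch below computes the GC content of several DNA sequences with a pool of worker processes from Python's standard `multiprocessing` module. It is only a minimal sketch: the function names `gc_content` and `parallel_gc`, the made-up reads, and the choice of two workers are assumptions for illustration and are not part of the course material.

```python
import multiprocessing as mp


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in one DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def parallel_gc(seqs, workers=2):
    """Split the sequences among worker processes; results keep input order."""
    # Assumption: prefer the "fork" start method where the platform offers it,
    # otherwise fall back to the platform default.
    methods = mp.get_all_start_methods()
    ctx = mp.get_context("fork") if "fork" in methods else mp.get_context()
    with ctx.Pool(processes=workers) as pool:
        # Each worker process handles a share of the sequences at the same time.
        return pool.map(gc_content, seqs)


if __name__ == "__main__":
    reads = ["ATGC", "GGCC", "ATAT"]  # made-up reads for illustration
    print(parallel_gc(reads))
```

For inputs this small the cost of starting processes outweighs any gain; the pattern only pays off when each unit of work is substantial, as with real sequencing data.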
What is Cloud Computing?
How is it related to HPC?
How does it differ from traditional HPC?
What is Cloud Computing (skeptical view)
https://www.youtube.com/watch?v=0FacYAI6DY0
Larry Ellison, CEO Oracle, OracleWorld 2008
"The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do. I can't think of anything that isn't cloud computing with all of these announcements. The computer industry is the only industry that is more fashion-driven than women's fashion. Maybe I'm an idiot, but I have no idea what anyone is talking about. What is it? It's complete gibberish. It's insane. When is this idiocy going to stop?"
Paul Hodor B|A|H
Moving towards a more specific definition of Cloud Computing
In 2011 the National Institute of Standards and Technology (NIST) issued Special Publication 800-145, "The NIST Definition of Cloud Computing"
Intended as a means for broad comparisons of cloud services and deployment strategies, and to provide a baseline for discussion on what cloud computing is and how it is used
Defines the following categories of concepts:
• Essential characteristics
• Service models
• Deployment models
Paul Hodor B|A|H
Essential characteristics of cloud computing (NIST)
• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service
Paul Hodor B|A|H
Service models of Cloud Computing (NIST)
Software as a Service (SaaS)
The capability to use the provider's applications remotely over the network. The user does not manage the server, operating system, storage, or even application capabilities.
Platform as a Service (PaaS)
The capability to deploy and use user-created or acquired applications on infrastructure made available by the provider. The user has control over deployed applications and their configuration, but does not manage servers, operating system, or storage.
Infrastructure as a Service (IaaS)
The capability to provision computing resources, storage, and networking on which to deploy arbitrary software. The user has virtual control over all resources, but does not have control over the physical infrastructure.
Paul Hodor B|A|H
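One way to keep the three service models apart is to record, for each model, which layers the user controls and which are left to the provider. The sketch below encodes the descriptions above as a small lookup table. The layer names, the `USER_CONTROLS` table and the `provider_manages` helper are illustrative assumptions, a simplified reading of the slide rather than wording taken from the NIST publication.

```python
# Layers mentioned on the slide, from the physical machines up to the application.
# Assumption: under IaaS, "servers" means the virtual servers the user provisions.
LAYERS = [
    "physical infrastructure",
    "servers",
    "operating system",
    "storage",
    "applications",
]

# Layers the *user* controls under each service model, per the slide text.
USER_CONTROLS = {
    "SaaS": set(),                      # user only uses the provider's applications
    "PaaS": {"applications"},           # user deploys and configures applications
    "IaaS": {"servers", "operating system", "storage", "applications"},
}


def provider_manages(model: str) -> list[str]:
    """Return the layers left to the provider, listed bottom-up."""
    controlled = USER_CONTROLS[model]
    return [layer for layer in LAYERS if layer not in controlled]


if __name__ == "__main__":
    for model in ("SaaS", "PaaS", "IaaS"):
        print(model, "-> provider manages:", ", ".join(provider_manages(model)))
```

Moving from SaaS to IaaS, the list of provider-managed layers shrinks until only the physical infrastructure remains.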
Deployment models of Cloud Computing (NIST)
• Private cloud
• Community cloud
• Public cloud
• Hybrid cloud
Paul Hodor B|A|H
Cloud Computing can also be considered as a kind of Commodity Computing
• Use of large numbers of already-available computing components for parallel computing, to get the greatest amount of useful computation at low cost
• Computing done in commodity computers as opposed to high-cost supercomputers or boutique computers
• Commodity computers are computer systems manufactured by multiple vendors, incorporating components based on open standards
• Such systems are said to be based on commodity components, since the standardization process promotes lower costs and less differentiation among vendors' products
http://en.wikipedia.org/wiki/Commodity_computing
Cloud Computing was made possible by the convergence of three existing technologies
The internet
• Research on packet networking funded in the 1960s
• TCP/IP introduced in the 1980s
• Opening to commercial traffic 1990-1995
Virtualization
• Early work by IBM in the 1960s
• Hardware virtualization becomes mainstream in the early 2000s
Parallel computing
• First multi-processor computers in the 1960s
• Birth of the Message Passing Interface (MPI) in 1992
• MapReduce paper published in 2004
Paul Hodor B|A|H
HPC versus Cloud Computing Models
Traditional HPC model (Physical data center)
• Buy a bunch of server boxes
• Add hard drives for storage
• Connect servers with cables into an intranet
• Install an operating system and applications
• Log in remotely and start working
Cloud Computing model (Virtual data center)
• Provision a bunch of instances
• Attach virtual volumes for storage
• Create a virtual private cloud
• Launch a machine image
• Log in remotely and start working
Paul Hodor B|A|H
Cloud computing: Available platforms
Lavanya Rishishwar GATech
• Amazon Web Services - http://aws.amazon.com/
• Microsoft Azure - http://azure.microsoft.com/en-us/
• Google App Engine - https://cloud.google.com/appengine/
• Illumina BaseSpace - https://basespace.illumina.com
• IBM Cloud Computing - http://www.ibm.com/cloud-computing/us/en/
• HP Eucalyptus - https://www.eucalyptus.com/
• HP Cloud - http://www.hpcloud.com/
• Rackspace Cloud - http://www.rackspace.com/cloud
• DigitalOcean - https://www.digitalocean.com/
• CenturyLink Cloud - https://www.centurylinkcloud.com/
• Verizon Cloud - http://cloud.verizon.com/
• Computer Sciences Corporation - http://www.csc.com/cloud
• Virtustream - http://www.virtustream.com/
• VMware - http://www.vmware.com/cloud-services/
• Fujitsu Cloud - http://www.fujitsu.com/global/solutions/cloud/
• Dimension Data Cloud - http://cloud.dimensiondata.com/am/en/
• GoGrid - http://www.gogrid.com/
• Joyent - https://www.joyent.com/
[Figure: Gartner Magic Quadrant of Cloud IaaS, 2014. Axes: Completeness of vision, Ability to execute]
Cloud computing: Performance comparison