June 1, 2015 BIOS-ICGEB HPCC for Bioinformatics 1 BIOS-ICGEB: High Performance and Cloud Computing (HPCC) for Bioinformatics King Jordan Georgia Tech

BIOS-ICGEB HPCC for Bioinformatics, June 1, 2015

BIOS-ICGEB: High Performance and Cloud Computing (HPCC) for Bioinformatics

King Jordan
Georgia Tech

http://jordan.biology.gatech.edu/

http://bioinformatics.gatech.edu/

http://panambioinfo.org/

Pedagogy is the discipline that deals with the theory and practice of education; it thus concerns the study and practice of how best to teach.

http://en.wikipedia.org/wiki/Pedagogy

What is it that we intend to teach in this course …

What is the approach that we will take to teach the course material …

Hint … active participation of students and interaction with faculty will be crucial to our success

Talk Outline

About this course
• Theme
• Goals
• Structure
• Schedule

Overview
• High performance and cloud computing for bioinformatics (HPCC)

Course Theme

Much like Physics in the 20th century, Biology in the 21st century is rapidly becoming an information science

High throughput experimental techniques generate massive amounts of biological information – e.g. sequence, expression, interaction data

Information alone is essentially worthless

The modern biologist must be able to convert information into knowledge

To do this, we must rely on computers …

Course Theme

Bioinformatics occupies the intersection of the life sciences and computer science. The goal of bioinformatics is to convert information into knowledge via computational approaches.

High Performance Computing (HPC) is the use of supercomputers and parallel processing techniques for solving complex computational problems.

Cloud Computing means storing and accessing data and programs over the Internet instead of your computer's hard drive. The cloud is just a metaphor for the Internet.

How do HPC and Cloud Computing intersect and differ? (covered later)

Course Goals

Theory – students (participants) should understand the basic theoretical underpinnings of HPCC. Theory will be emphasized only to the extent that it informs practice.

Practice – students (participants) should be able to apply HPCC tools and techniques to their research immediately following the course. [Via a specific roadmap and a specific set of tools … a ‘hot start’ which later can be adopted or changed according to student’s specific needs.]

Course Structure

AM: Research domain-specific lectures & instruction
• Faculty Talks
• Student (Participant) Talks
• Faculty-student discussion

Student-Faculty Interaction

PM: Practical exercises in HPC and Cloud Computing
• CIAT HPC tools
• Amazon cloud
• Eclouds
• Illumina BaseSpace
• BIOS HPC tools

Course Structure

Our pedagogy is grounded in active learning

Student participation and student-faculty interaction will be critical to the success of the course

We need your help with this!

Be active, not passive, learners.

We are a relatively small group, so we should be able to do this!

Course Schedule – Day 1

Course Schedule – Day 2

Course Schedule – Day 3

Course Schedule – Day 4

Course Schedule – Day 5

HPC Overview: Client-server architecture

HPC Overview: Supercomputer clusters

A computer cluster is a single logical unit consisting of multiple computers that are linked through a local area network (LAN). The networked computers essentially act as a single, much more powerful machine. A computer cluster provides much faster processing speed, larger storage capacity, better data integrity, superior reliability and wider availability of resources.

Computer clusters are, however, much more costly to implement and maintain. This results in much higher running overhead compared to a single computer. (This is where cloud computing comes in …)

http://www.techopedia.com/definition/6581/computer-cluster
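
The cost trade-off described above can be sketched as simple break-even arithmetic. This is only an illustration: the function name `break_even_hours` and every figure used with it are hypothetical, and are not taken from the course material or from any provider's price list.

```python
def break_even_hours(cluster_capital, cluster_hourly_overhead, cloud_hourly_rate):
    """Hours of use after which owning a cluster costs the same as renting.

    All three arguments are hypothetical inputs in the same currency:
      cluster_capital          up-front purchase cost of the cluster
      cluster_hourly_overhead  running cost per hour (power, staff, cooling)
      cloud_hourly_rate        rental price per hour for equivalent capacity
    Returns None when renting is never more expensive per hour than
    running the cluster, in which case owning never pays for itself.
    """
    hourly_saving = cloud_hourly_rate - cluster_hourly_overhead
    if hourly_saving <= 0:
        return None
    return cluster_capital / hourly_saving
```

For example, with a made-up capital cost of 100000, an overhead of 5 per hour and a cloud rate of 25 per hour, the cluster only pays for itself after 100000 / (25 - 5) = 5000 hours of actual use. A lab that computes in short bursts may never reach that point.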

HPC Overview: Parallel computing

Parallel computing is a type of computing architecture in which several processors execute or process an application or computation simultaneously. Parallel computing helps in performing large computations by dividing the workload between more than one processor, all of which work through the computation at the same time.

Most supercomputers employ parallel computing principles to operate. Parallel computing is also known as parallel processing.

http://www.techopedia.com/definition/8777/parallel-computing
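
As a minimal sketch of dividing a workload between processors, the Python example below computes the GC content of several DNA sequences, one sequence per task. The function names and the toy sequences are assumptions made for illustration; they are not part of the course tools.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def gc_content(seq):
    """Fraction of G and C bases in one sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def parallel_gc(seqs, workers=4, executor_cls=ProcessPoolExecutor):
    """Compute GC content for every sequence using a pool of workers.

    executor_cls defaults to ProcessPoolExecutor, which runs tasks in
    separate processes and so can use several CPU cores. Passing
    ThreadPoolExecutor gives the same results using threads in one process.
    """
    with executor_cls(max_workers=workers) as pool:
        # map() hands each sequence to a worker and
        # returns the results in the original input order.
        return list(pool.map(gc_content, seqs))
```

A call such as `parallel_gc(["ATGC", "GGCC", "ATAT"], workers=2)` should be placed under an `if __name__ == "__main__":` guard when the process-based pool is used in a script. On a real cluster the same idea is applied across many machines rather than across the cores of one.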

What is Cloud Computing?

How is it related to HPC?

How does it differ from traditional HPC?

What is Cloud Computing (skeptical view)

https://www.youtube.com/watch?v=0FacYAI6DY0

Larry Ellison, CEO Oracle, OracleWorld 2008

"The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do. I can't think of anything that isn't cloud computing with all of these announcements. The computer industry is the only industry that is more fashion-driven than women's fashion. Maybe I'm an idiot, but I have no idea what anyone is talking about. What is it? It's complete gibberish. It's insane. When is this idiocy going to stop?"

Paul Hodor B|A|H

Moving towards a more specific definition of Cloud Computing

In 2011 the National Institute of Standards and Technology (NIST) issued Special Publication 800-145, "The NIST Definition of Cloud Computing"

Intended as a means for broad comparisons of cloud services and deployment strategies, and to provide a baseline for discussion on what cloud computing is and how it is used

Defines the following categories of concepts
• Essential characteristics
• Service models
• Deployment models

Paul Hodor B|A|H

Essential characteristics of cloud computing (NIST)

• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service

Paul Hodor B|A|H

Service models of Cloud Computing (NIST)

Software as a Service (SaaS)
The capability to use the provider's applications remotely over the network. The user does not manage the server, operating system, storage, or even application capabilities.

Platform as a Service (PaaS)
The capability to deploy and use user-created or acquired applications on infrastructure made available by the provider. The user has control over deployed applications and their configuration, but does not manage servers, operating system, or storage.

Infrastructure as a Service (IaaS)
The capability to provision computing resources, storage, and networking on which to deploy arbitrary software. The user has virtual control over all resources, but does not have control over the physical infrastructure.

Paul Hodor B|A|H
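
The division of responsibility in the three service models can be summarized as a small lookup table. The sketch below is one reading of the definitions above, not an official NIST artifact; the layer names and the `provider_managed` helper are assumptions introduced for illustration.

```python
# Layers of a deployment, from the bottom up (names are illustrative).
LAYERS = [
    "physical infrastructure",
    "servers",
    "operating system",
    "storage",
    "applications",
]

# Layers the user controls under each service model,
# following the SaaS / PaaS / IaaS descriptions above.
USER_MANAGED = {
    "SaaS": set(),
    "PaaS": {"applications"},
    "IaaS": {"servers", "operating system", "storage", "applications"},
}


def provider_managed(model):
    """Return the layers left to the provider under the given model."""
    return [layer for layer in LAYERS if layer not in USER_MANAGED[model]]
```

Under this reading, `provider_managed("IaaS")` leaves only the physical infrastructure to the provider, while `provider_managed("SaaS")` returns every layer.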

Deployment models of Cloud Computing (NIST)

• Private cloud
• Community cloud
• Public cloud
• Hybrid cloud

Paul Hodor B|A|H

Cloud Computing can also be considered as a kind of Commodity Computing

• Use of large numbers of already-available computing components for parallel computing, to get the greatest amount of useful computation at low cost
• Computing done in commodity computers as opposed to high-cost supercomputers or boutique computers
• Commodity computers are computer systems manufactured by multiple vendors, incorporating components based on open standards
• Such systems are said to be based on commodity components, since the standardization process promotes lower costs and less differentiation among vendors' products

http://en.wikipedia.org/wiki/Commodity_computing

Cloud Computing was made possible by the convergence of three existing technologies

The internet
• Research on packet networking funded in the 1960s
• TCP/IP introduced in the 1980s
• Opening to commercial traffic 1990-1995

Virtualization
• Early work by IBM in the 1960s
• Hardware virtualization becomes mainstream in the early 2000s

Parallel computing
• First multi-processor computers in the 1960s
• Birth of the Message Passing Interface (MPI) in 1992
• MapReduce paper published in 2004

Paul Hodor B|A|H
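
To give a feel for the MapReduce style of parallel computing mentioned above, here is a single-machine sketch that counts k-mers in a set of reads. It only imitates the map, shuffle and reduce phases in plain Python; it is not the API of Hadoop or any other MapReduce framework, and the function names are hypothetical.

```python
from collections import defaultdict


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k):
    """Run the three steps in sequence on a list of reads."""
    pairs = [pair for read in reads for pair in map_kmers(read, k)]
    return reduce_counts(shuffle(pairs))
```

In a real framework the map calls run on many machines at once and the shuffle moves data across the network; here everything runs in one process so the data flow is easy to follow.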

HPC versus Cloud Computing Models

Traditional HPC model (Physical data center)
• Buy a bunch of server boxes
• Add hard drives for storage
• Connect servers with cables into an intranet
• Install an operating system and applications
• Log in remotely and start working

ssh [email protected]

Cloud Computing model (Virtual data center)
• Provision a bunch of instances
• Attach virtual volumes for storage
• Create a virtual private cloud
• Launch a machine image
• Log in remotely and start working

ssh [email protected]

Paul Hodor B|A|H

Cloud computing: Available platforms

• Amazon Web Services - http://aws.amazon.com/
• Microsoft Azure - http://azure.microsoft.com/en-us/
• Google App Engine - https://cloud.google.com/appengine/
• Illumina BaseSpace - https://basespace.illumina.com
• IBM Cloud Computing - http://www.ibm.com/cloud-computing/us/en/
• HP Eucalyptus - https://www.eucalyptus.com/
• HP Cloud - http://www.hpcloud.com/
• Rackspace Cloud - http://www.rackspace.com/cloud
• DigitalOcean - https://www.digitalocean.com/
• CenturyLink Cloud - https://www.centurylinkcloud.com/
• Verizon Cloud - http://cloud.verizon.com/
• Computer Sciences Corporation - http://www.csc.com/cloud
• Virtustream - http://www.virtustream.com/
• VMware - http://www.vmware.com/cloud-services/
• Fujitsu Cloud - http://www.fujitsu.com/global/solutions/cloud/
• Dimension Data Cloud - http://cloud.dimensiondata.com/am/en/
• GoGrid - http://www.gogrid.com/
• Joyent - https://www.joyent.com/

Lavanya Rishishwar GATech

Cloud computing: Performance comparison

[Figure: Gartner Magic Quadrant of Cloud IaaS, 2014. Horizontal axis: Completeness of vision. Vertical axis: Ability to execute.]

Welcome!

