The BioBox Initiative: Bio-ClusterGrid

Gilbert Thomas

Associate Engineer

Sun APSTC – Asia Pacific Science & Technology Center

04 December 2003 ©

Gilbert Thomas, Associate Engineer, Sun APSTC 2

Agenda

• Introduction: Bio-ClusterGrid
• Solaris 9 Operating Environment
• Sun Grid Engine (SGE)
• Grid Engine Portal (GEP)
• Applications on Bio-ClusterGrid
• Installation of Bio-ClusterGrid
• Current and Future Developments
• Questions and Answers

Introduction: Bio-ClusterGrid

• Grid-enabled Bioinformatics Package

• Consists of 4 major components
– Solaris 9 Operating Environment (April 2003 version)
– Collection of 28 Bioinformatics applications, pre-installed and pre-configured
– Sun Grid Engine
– Grid Engine Portal

Introduction: Bio-ClusterGrid

• Fast setup (2 ½ hours)
• Avoids the hassle of downloading, compiling and installing bioinformatics applications
• Applications optimized for SPARC

Note: Bio-ClusterGrid (BCG) setup takes 2 ½ to 3 hours. Manual installation of the OS, SGE, GEP and all the applications takes a week or more.

Solaris 9 Operating Environment

• Latest version of Sun Solaris

• Supports GNOME 2.0 Desktop Environment

• Improvements in Performance, Security

• Easy patch administration using Patch Manager

GNOME 2.0 Desktop Environment

Sun Grid Engine

• Distributed Resource Management Software

• Provides load balancing and resource management

• Supports running of parallel applications over a cluster
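As a rough illustration of how a batch job reaches Sun Grid Engine from the command line, the sketch below prints a small job script with embedded `#$` directives. The job name, file names, database name and the `blastall` invocation are assumptions chosen for illustration; they are not taken from the Bio-ClusterGrid distribution.

```shell
#!/bin/sh
# Sketch: print an SGE job script for a BLAST search.
# Job name, file names and database name are hypothetical examples.

make_blast_job() {
    # $1 = query FASTA file, $2 = BLAST database name
    cat <<EOF
#!/bin/sh
#\$ -N blast_demo
#\$ -S /bin/sh
#\$ -cwd
#\$ -j y
#\$ -o blast_demo.log
blastall -p blastn -d $2 -i $1 -o blast_demo.out
EOF
}

# Print the job script; redirect it to a file and pass that file to qsub to submit it.
make_blast_job query.fa nt
```

The directives name the job (`-N`), select the shell (`-S`), run the job in the submission directory (`-cwd`), and merge standard error into the log file (`-j y`, `-o`). On a host with SGE installed, something like `make_blast_job query.fa nt > blast_demo.sh` followed by `qsub blast_demo.sh` would hand the job to the scheduler, which then picks an execution node.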

Grid Engine Portal

• Integrated into Sun ONE Portal Server

• Provides a web interface to some applications running on Sun Grid Engine

• Remote access from anywhere, at any time, from any computer with a Java-enabled browser

• For users who dislike the command-line interface (CLI)

Grid Engine Portal

• Job submission is done through customised forms for each application

• Results of jobs can be viewed online and/or downloaded to the local machine

• Emails the user when the job is completed

Submitting a BLAST job using GEP

BLAST Job Output on GEP

Applications on Bio-ClusterGrid

1. Homology & Similarity Search

• Definition
– Sequence similarity is observable; homology is a hypothesis based on observation

• Applications
– BLAST
– FASTA
– GlimmerM
– Wise

2. Sequence Analysis

• Definition
– Use of bioinformatics methods to determine the biological function and structure of genes and the proteins they code for

• Applications
– ACT
– ClustalW
– EMBOSS
– HMMER
– IMAGE
– T-Coffee

3. Structural Prediction

• Definition
– Determines the 2D/3D structure of proteins

• Applications
– Dowser
– FastDNAml
– LOOPP
– Mapmaker/QTL
– PAML
– PHYLIP

4. Molecular Imaging/Modeling

• Definition
– Tools that allow the user to make predictions of the secondary structure of proteins arising from a given amino acid sequence

• Applications
– Artemis
– Cn3D
– GROMACS
– RasMol
– ReadSeq
– TribeMCL
– VMD

5. Development Tools

• Biojava
• Bioperl
• Biopython

6. Other Software

• Apache
• SQL
• GNU Compilers
• Sun ONE Compilers (trial licence)
• HPC ClusterTools (Sun's implementation of MPI)

Bio-ClusterGrid Installation

Bio-ClusterGrid Installation

1. Flash Archive Installation

2. Sun Grid Engine Installation

3. Grid Engine Portal Installation

4. Grid Installation for Cluster

1. Solaris 9 Flash Archive Installation

● The flash archive contains the entire OS image of the machine.

● All applications and files on the original machine are replicated on the clone machines upon installation.

● Installation of flash archive is much faster than a normal Solaris OE installation.

1. Solaris 9 Flash Archive Installation

● Installed using the Solaris 9 Installation CD 04/03 or later

● Can be installed from an FTP server, an HTTP server or a DVD.

2. Sun Grid Engine Installation

● Very fast; less than 5 minutes per host
● Run ./inst_sge -m -fast in the SGE directory
● Must be run by the root user

4. Cluster Grid Installation

● For every execution node, run "./inst_sge -x -auto" in the SGE directory.

● Installation time: less than 5 minutes
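On a cluster with more than a handful of execution nodes, the per-node step can be scripted. The sketch below only prints the command for each host by default. The host names, the SGE directory `/gridware/sge`, and the use of `ssh` with root access are all assumptions, not part of the original procedure.

```shell
#!/bin/sh
# Sketch: run the execution-host installer on each node in turn.
# SGE_DIR and the host names are hypothetical; adjust for the real cluster.

SGE_DIR=${SGE_DIR:-/gridware/sge}
DRY_RUN=${DRY_RUN:-yes}   # set DRY_RUN=no to actually run the commands over ssh

install_exec_hosts() {
    for host in "$@"; do
        cmd="cd $SGE_DIR && ./inst_sge -x -auto"
        if [ "$DRY_RUN" = "yes" ]; then
            # Dry run: show what would be executed on this host
            echo "ssh root@$host '$cmd'"
        else
            ssh "root@$host" "$cmd" || echo "install failed on $host" >&2
        fi
    done
}

install_exec_hosts node01 node02 node03
```

Printing the commands first makes it easy to review the host list before anything is run as root.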

4. Grid Installation: Requirements

● Users using SGE must have a Unix account on every execution node (e.g. by using NIS)
● Applications must be installed on all the nodes in the same path (e.g. by using an NFS share)
● The Sun Grid Engine and Grid Engine Portal root directories must be NFS-shared
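These requirements can be checked on a node before it joins the grid. The following is a minimal sketch that tests whether a user account resolves and whether the given paths exist on the local host. The function name and the example arguments are hypothetical, and it does not verify that a path is actually an NFS mount.

```shell
#!/bin/sh
# Sketch: check the per-node requirements on the local host.
# The user name and paths passed in are examples only.

check_node() {
    # $1 = user name, remaining arguments = paths that must exist
    user=$1
    shift
    rc=0
    if id "$user" >/dev/null 2>&1; then
        echo "ok: account $user"
    else
        echo "missing: account $user"
        rc=1
    fi
    for item in "$@"; do
        if [ -e "$item" ]; then
            echo "ok: $item"
        else
            echo "missing: $item"
            rc=1
        fi
    done
    return $rc
}

# Example call with values that exist on any Unix host
check_node root / || echo "node is not ready"
```

In practice the arguments would be a grid user's login name, the application directory and the SGE and GEP root directories, and the check would be run on every execution node.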

3. Grid Engine Portal Installation

● 3-step procedure
– Installation of Sun ONE Portal Server
– Installation of Gateway for Secure Access
– Installation of Grid Engine Portal

● Installation takes around 30-40 minutes

Current Developments

● Improvement to the GEP interface
– Make it easier and more comfortable for biologists to run their applications using GEP
– Biologists choose their application and immediately run their job

Future Developments

● Improvement to the GEP installation procedure
● Bio-Server
● Bio-Workstation

Questions?

For more queries [email protected]

