
Interoperability Achieved by GADU in using multiple Grids. OSG, Teragrid and ANL Jazz Presented by:...

Date post: 18-Jan-2018
Page 1: Interoperability Achieved by GADU in using multiple Grids. OSG, Teragrid and ANL Jazz Presented by: Dinanath Sulakhe Mathematics and Computer Science Division.

Interoperability Achieved by GADU in using multiple Grids.
OSG, Teragrid and ANL Jazz

Presented by: Dinanath Sulakhe

Mathematics and Computer Science Division
Argonne National Laboratory

Computational Institute
University of Chicago

Page 2

GADU Applications…

It's all about comparative analysis.

Insights into biology are gained by comparative analysis:
• Unknown genes are compared against known genes.
• Similar genes tend to perform the same functions.

Comparative analysis shows what is the same and what is different between two strains of an organism:
• Example: what differs between an organism living at boiling temperatures, such as 108 deg Celsius, and one living in extreme freezing conditions.
• The difference between pathogenic and non-pathogenic organisms. Mycobacterium tuberculosis, the pathogen that causes TB, differs by only 12 genes from the non-pathogenic BCG used as a vaccine against TB.

Tools
• BLAST, Blocks, Chisel, Interpro etc.
• An embarrassingly parallel workload.

Page 3

GADU’s evolution ..

GADU just evolved into what it is today.

• Chiba City at Argonne.
• Jazz Cluster at Argonne.
• Grid2003 to OSG
• Teragrid

All of them together.

Page 4

Some Results and Highlights

• GADU can successfully use OSG and Teragrid resources simultaneously.
• Individual clusters such as ANL Jazz are also used in parallel.
• Site selection and scheduling across multiple grids.
• Easily add a new site into the pool of sites.

Site Name        Site Test      MaxNodes   Gridcat
ASGC_OSG         18             199        Pass
FNAL_FERMIGRID   12             12         Pass
FNAL_GPFARM      266            749        Pass
GRASE-CCR-U2     114            2112       Pass
Nebraska         FAIL_TIMEOUT   252        Pass
OSG_LIGO_PSU     28             312        Pass
Purdue-ITaP      13             1224       Pass
Purdue-Physics   14             63         Pass
STAR-BNL         FAIL_TIMEOUT   672        Pass
UFlorida-PG      279            268        Pass
UMATLAS          FAIL_TIMEOUT   771        Pass
UTA_DPCC         18             154        Inactive
UWMadisonCMS     FAIL_TIMEOUT   90         Pass
grow-UNI-P       FAIL_TIMEOUT   17         Pass
TG_UC            44             316        NONE
TG_NCSA          55             1000       NONE
TG_PURDUE        FAIL_FTP       1024       NONE

Last Run .. (Last week)

Ran 38830 BLAST Jobs
• 70% OSG
• 30% Teragrid

Page 5

Grid Resources..
Open Science Grid and Teragrid.

Authentication..

OSG
• OSG: GADU VOMS Server.
• DOE Grid Certificates are automatically picked up by the sites.

TeraGrid
• Individual accounts via allocations.
• Manually adding DOE Grid certificates to each site (gx-map).

Application Deployment..

OSG
• The OSG variables $OSG_APP and $OSG_DATA are used to install GADU’s applications and pre-stage databases such as NR.

TeraGrid
• GADU has a community space available on each of the sites. Applications are installed within this community space.
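
The deployment rule above can be sketched as a small lookup, written here in Python. The function name `install_root`, the grid labels used as dictionary keys, and the `gadu` subdirectory are illustrative assumptions; only the variables `$OSG_APP` and `$TG_COMMUNITY` appear in the slides.

```python
import os
import posixpath

# Environment variable that points at the application area on each grid.
# $OSG_APP and $TG_COMMUNITY are named in the slides; the mapping itself
# and the "gadu" subdirectory are assumptions made for illustration.
APP_ROOT_VARIABLE = {
    "OSG": "OSG_APP",
    "TeraGrid": "TG_COMMUNITY",
}


def install_root(grid, env=None):
    """Return the directory where GADU applications would be installed."""
    env = os.environ if env is None else env
    variable = APP_ROOT_VARIABLE[grid]  # KeyError for an unknown grid
    if variable not in env:
        raise LookupError(f"${variable} is not set on this {grid} site")
    return posixpath.join(env[variable], "gadu")


if __name__ == "__main__":
    fake_env = {"OSG_APP": "/osg/app", "TG_COMMUNITY": "/tg/community"}
    print(install_root("OSG", fake_env))
    print(install_root("TeraGrid", fake_env))
```

Passing the environment as a dictionary keeps the sketch testable without touching a real grid site.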

Page 6

Resource Independent GADU.
GADU uses Pegasus based VDS and Condor-G

[Figure: GADU architecture. Components shown:
• GADU’s automated Analysis Server, expressing, executing and tracking the scientific workflows on Grid: Database, Controller, Query Interface.
• Submit Host: Abstract Workflow as VDL, tc.data, Pool.config, Pegasus, Condor Submit files, DAGMan, Condor-G.
• Globus GRAM Interface.
• Remote Resources (three shown): each with a Gatekeeper/JobManager, a Job management system and worker nodes (WN).
• Information Services.]

Page 7

Resource Independent GADU.
GADU uses Pegasus based VDS and Condor-G

The Workflow Generator in GADU is responsible for producing a workflow suitable for execution in the Grid environment. This task is accomplished through the use of the “virtual data language” (VDL).

Once the VDL for the workflow is written, VDS converts it into condor submit files and a DAG that can be submitted to the site selected by the site selector.

TR FileBreaker( input filename, none nodes, output sequences[], none species ) {
  argument = ${species};
  argument = ${filename};
  argument = ${nodes};
  profile globus.maxwalltime = "300";
}
TR BLAST( none OutPre, none evalue, input query[], none type ) {
  argument = ${OutPre};
  argument = ${evalue};
  profile globus.maxwalltime = "300";
}
DV jobNo_1_1separator->FileBreaker(
  filename=@{input:"inputfile.1"|rt},
  nodes="5",
  sequences=[ @{output:"job1.0":"tmp"}, @{output:"job1.1":"tmp"}, @{output:"job1.2":"tmp"}, @{output:"job1.3":"tmp"}, @{output:"job1.4":"tmp"} ],
  species="Aeropyrum_Pernix"
)
….

VDL for BLAST workflow
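
The FileBreaker derivation above lists one output file per node, so its text is easy to generate. The following Python sketch shows one way this might be done; the helper `filebreaker_dv` and its naming pattern are hypothetical and inferred from the single example on the slide, and this is not GADU's actual Workflow Generator.

```python
def filebreaker_dv(job, input_file, nodes, species):
    """Build the text of one FileBreaker derivation (DV) in VDL syntax.

    The name pattern jobNo_<job>_1separator and the output names
    job<job>.<i> are assumptions copied from the slide's one example.
    """
    outputs = ", ".join(
        f'@{{output:"job{job}.{i}":"tmp"}}' for i in range(nodes)
    )
    return (
        f"DV jobNo_{job}_1separator->FileBreaker(\n"
        f'  filename=@{{input:"{input_file}"|rt}},\n'
        f'  nodes="{nodes}",\n'
        f"  sequences=[ {outputs} ],\n"
        f'  species="{species}"\n'
        ")"
    )


if __name__ == "__main__":
    print(filebreaker_dv(1, "inputfile.1", 5, "Aeropyrum_Pernix"))
```

Generating the list of outputs from `nodes` keeps the `nodes` argument and the length of `sequences` in step, which is easy to get wrong when such a file is written by hand.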

Page 8

Resource Independent GADU.

Fig. Example of a DAG representing the workflow. Labels in the figure: 4 million sequences; 1000 sequences; ATGCATGCA.
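
The figure labels suggest that a large input is broken into smaller files before BLAST runs. The Python sketch below works through that arithmetic under two assumptions that the slide does not state outright: that 1000 is the number of sequences per batch, and that a single FileBreaker job is the parent of every BLAST job. The function names are illustrative.

```python
def batch_ranges(total_sequences, batch_size):
    """Yield (start, end) index pairs, end exclusive, one per batch."""
    for start in range(0, total_sequences, batch_size):
        yield start, min(start + batch_size, total_sequences)


def build_dag(total_sequences, batch_size):
    """Return {job_name: [parent_job_names]} for a split-then-BLAST workflow."""
    dag = {"FileBreaker": []}  # root job, no parents
    for number, _ in enumerate(batch_ranges(total_sequences, batch_size)):
        dag[f"BLAST_{number}"] = ["FileBreaker"]  # each BLAST waits for the split
    return dag


if __name__ == "__main__":
    dag = build_dag(4_000_000, 1000)
    print(len(dag) - 1, "BLAST jobs")
```

With these assumed numbers, 4,000,000 / 1000 gives 4000 independent BLAST jobs.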

Page 9

Resource Independent GADU.
Representing a Site and the applications on it..

tc.data

#SITE Transformation PFN TYPE
ANL_Jazz BLAST /soft/apps/BLAST/bin/blastall null
ANL_Jazz Blocks /soft/apps/run-Blocks.pl null
ANL_Jazz Chisel /soft/apps/chisel/runChisel.pl null
ANL_Jazz IPRSCAN /soft/apps/iprscan_wrapper.pl null
ANL_Jazz globus-url-copy /soft/apps/packages/globus-2.2.4/bin/globus-url-copy GLOBUS_LOCATION=/soft/apps/packages/globus-2.2.4/;LD_LIBRARY_PATH=/soft/apps/packages/globus-2.2.4/lib;PATH=/soft/apps/packages/globus-2.2.4/bin

pool.config

pool ANL_Jazz {
  lrc "rls://gnare.mcs.anl.gov"
  gridftp "gsiftp://jmayor1.lcrc.anl.gov:2812/soft/apps/gadu"
  gridlaunch "/soft/apps/gadu/bin/kickstart"
  workdir "/soft/apps/gadu/vdldata"
  universe vanilla "jmayor1.lcrc.anl.gov:2121/jobmanager-pbs"
  universe globus "jmayor1.lcrc.anl.gov:2121/jobmanager-pbs"
  universe transfer "jmayor1.lcrc.anl.gov:2812/jobmanager-fork"
}
….
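
A tc.data catalog like the one on this slide maps a site and a transformation name to the path of the executable on that site. The Python sketch below reads such a file and looks up an entry. It assumes four whitespace-separated columns per line, as in the slide, and is not the parser that VDS itself uses; `parse_tc_data` is a hypothetical name.

```python
def parse_tc_data(text):
    """Map (site, transformation) -> (pfn, fourth_column or None).

    Lines starting with "#" are treated as comments. The literal "null"
    in the fourth column is returned as None.
    """
    catalog = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split into at most four fields so the last column stays whole.
        site, transformation, pfn, extra = line.split(None, 3)
        catalog[(site, transformation)] = (pfn, None if extra == "null" else extra)
    return catalog


SAMPLE = """\
#SITE Transformation PFN TYPE
ANL_Jazz BLAST /soft/apps/BLAST/bin/blastall null
ANL_Jazz Blocks /soft/apps/run-Blocks.pl null
"""

if __name__ == "__main__":
    catalog = parse_tc_data(SAMPLE)
    print(catalog[("ANL_Jazz", "BLAST")][0])
```

Keying the catalog on the (site, transformation) pair means the same transformation can resolve to a different path on each site.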

Page 10

Resource Independent GADU.
GADU uses Pegasus based VDS and Condor-G

[Figure: GADU architecture. Components shown:
• GADU’s automated Analysis Server, expressing, executing and tracking the scientific workflows on Grid: Database, Controller, Query Interface.
• Submit Host: Abstract Workflow as VDL, tc.data, Pool.config, Pegasus, Condor Submit files, DAGMan, Condor-G.
• Globus GRAM Interface.
• Remote Resources (three shown): each with a Gatekeeper/JobManager, a Job management system and worker nodes (WN).
• Information Services.]

Page 11

Requirements ... Information Services.
A VDS-like system can provide an architecture-independent mechanism to use different sites (Grids).

In order to automatically add a new Grid site, we need information about the site:

Information Services at various levels
• Authentication: to check if the certs are valid at this site.
• Architecture: is it an ia-32 cluster or an ia-64?
• Gatekeeper, GridFTP server.
• Environment variables: $OSG_APP, $TG_COMMUNITY
• Number of CPUs
• Number of used CPUs.
• Number of idle CPUs.
• VO (user) specific jobs running at a given site.
• VO (user) specific jobs sitting in QUEUE at a given site (why?)

We need standards and protocols for these Information Services, and we need to identify more information variables that need to be published by the Grids.

Gridcat or MDS or something else.
Currently GADU uses GridCat to collect site-specific information for OSG and manually adds information for TeraGrid and Jazz. We are working on an MDS-based information interface on TeraGrid.
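
The information variables listed on this slide could be collected in one record per site. The Python sketch below is one possible shape for such a record. All field names are illustrative, and computing idle CPUs as total minus used is an assumption, since a real information service may report the idle count directly.

```python
from dataclasses import dataclass, field


@dataclass
class SiteInfo:
    """One site's entry, with one field per item on the slide (names assumed)."""
    name: str
    certificate_accepted: bool       # authentication check passed
    architecture: str                # e.g. "ia-32" or "ia-64"
    gatekeeper: str
    gridftp_server: str
    environment: dict = field(default_factory=dict)  # e.g. {"OSG_APP": "..."}
    total_cpus: int = 0
    used_cpus: int = 0
    vo_jobs_running: int = 0
    vo_jobs_queued: int = 0

    @property
    def idle_cpus(self):
        # Assumption: idle = total - used, never negative.
        return max(self.total_cpus - self.used_cpus, 0)

    def missing_fields(self):
        """Names of required entries that were left empty."""
        required = ("architecture", "gatekeeper", "gridftp_server")
        return [name for name in required if not getattr(self, name)]


if __name__ == "__main__":
    site = SiteInfo("example", True, "ia-64", "gk.example.org", "",
                    total_cpus=100, used_cpus=60)
    print(site.idle_cpus, site.missing_fields())
```

A check such as `missing_fields` shows why a common standard matters: a site can only be added automatically when every required entry is actually published.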

Page 12

Another Big Challenge.. Site Selection.
GADU has access to 60 OSG Sites and 5 TeraGrid Sites.

One challenge in using the Grid reliably for high-throughput analysis is monitoring the state of all Grid sites and how well they have performed for job requests from a given submit host. We view a site as “available” if our submit host can communicate with it, if it is responding to Globus job-submission commands, and if it will run our jobs promptly, with minimal queuing delays.

[Figure labels: GRID3, TeraGrid, JAZZ, PDSF, UBuffalo, ANL, SDSC, …..]

Test job for each site, run in parallel (forking).

site_tester.pl (each child process writes to the site status file below)

Site Status File: status | test-time* | site
1 10 jazz
0 FAIL pdsf
#1 80 sdsc – tg
….
* - time in secs.

Status values:
# - manually forced not to use
1 - working site
0 - site failed
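
The site status file can be read back with a few lines of code. The Python sketch below parses the three-column format shown on the slide and returns the sites that are working and not manually disabled. The names `SiteStatus`, `parse_status_line` and `usable_sites` are hypothetical, and the sample site names in the usage are simplified.

```python
from typing import NamedTuple, Optional


class SiteStatus(NamedTuple):
    site: str
    working: bool                # status column was "1"
    disabled: bool               # line started with "#"
    test_seconds: Optional[int]  # None when the test failed (e.g. "FAIL")


def parse_status_line(line):
    """Parse one 'status test-time site' line of the site status file."""
    line = line.strip()
    disabled = line.startswith("#")
    status, test_time, site = line.lstrip("#").split(None, 2)
    seconds = int(test_time) if test_time.isdigit() else None
    return SiteStatus(site, status == "1", disabled, seconds)


def usable_sites(lines):
    """Return names of sites that are working and not manually disabled."""
    entries = [parse_status_line(line) for line in lines if line.strip()]
    return [entry.site for entry in entries if entry.working and not entry.disabled]


if __name__ == "__main__":
    print(usable_sites(["1 10 jazz", "0 FAIL pdsf", "#1 80 sdsc"]))
```

Keeping the "#" flag separate from the status digit preserves the distinction the slide draws between a site that failed its test and a site that works but was switched off by hand.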

Blast/Blocks Server
• Request a site
• Get site with details

Site_selector.pl

get_all_working_sites
foreach working_site {
    get_condor_q details.
    if ( # of jobs in Q == 0 ) &&
       ( total # jobs on the host < max_allowed )
        select the site.
}
get_selected_site_details
return (@site_and_details)
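
The selection rule in the pseudocode above can be sketched in Python as follows. This is not a port of Site_selector.pl. It assumes that "jobs in Q" and "jobs on the host" are both counted per site, and that the first site satisfying both conditions is chosen; the slide leaves both points open. All names are illustrative.

```python
def select_site(working_sites, queued, total_on_host, max_allowed):
    """Return the first working site with no queued jobs and spare capacity.

    queued:        site -> number of our jobs waiting in the queue
    total_on_host: site -> total number of our jobs on that host
    Returns None when no site qualifies.
    """
    for site in working_sites:
        no_backlog = queued.get(site, 0) == 0
        under_limit = total_on_host.get(site, 0) < max_allowed
        if no_backlog and under_limit:
            return site
    return None


if __name__ == "__main__":
    sites = ["jazz", "ubuff"]
    print(select_site(sites, {"ubuff": 1}, {"jazz": 5}, max_allowed=10))
```

Skipping any site that still has jobs waiting in its queue matches the definition of an available site given earlier: one that will run jobs promptly, with minimal queuing delays.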

Site Info File: site | #max_nodes | nodes/batch | seqs/node
jazz 360 30 100
pdsf 500 40 100
sdsc 70 10 150
…..
Sequences/batch = nodes/batch x seqs/node
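
The batch-size formula can be checked against the three sample rows with a short Python calculation. The dictionary layout and function name are illustrative; the numbers are the ones in the Site Info File.

```python
SITE_INFO = {
    # site: (max_nodes, nodes_per_batch, seqs_per_node), values from the slide
    "jazz": (360, 30, 100),
    "pdsf": (500, 40, 100),
    "sdsc": (70, 10, 150),
}


def sequences_per_batch(site):
    """Sequences/batch = nodes/batch x seqs/node."""
    _max_nodes, nodes_per_batch, seqs_per_node = SITE_INFO[site]
    return nodes_per_batch * seqs_per_node


if __name__ == "__main__":
    for site in SITE_INFO:
        print(site, sequences_per_batch(site))
```

For the sample rows this gives 3000 sequences per batch on jazz, 4000 on pdsf and 1500 on sdsc.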

condor_q -global -globus
ID | .. .. | manager | ST | ..
1 jazz R blast..
1.1 jazz R blast..
2 Ubuff Q blast..
…..

GADU Server

OSG

Page 13

Another Big Challenge.. Site Selection.
GADU has access to 60 OSG Sites and 5 TeraGrid Sites.

Web Interface to Control the Selection of Sites for GADU:

http://compbio.mcs.anl.gov/sulakhe/cgi-bin/site_selection_new.pl?user=dina

Web Interface showing live status of usage:

http://compbio.mcs.anl.gov/gaduvo/gadu_jobs.cgi

Grid may not worry about this…

Page 14

Next Steps..

• Working with Teragrid Information Services group – MDS based interface.

• Continue to improve GADU’s implementation of Site Selection.

• Trying to generalize Site Selection using the Information Services such as MDS and Gridcat.

• Continue to deploy faster scientific applications for the Bioinformatics Group at Argonne.

Page 15

Acknowledgements

Bioinformatics Group: Natalia Maltsev, PI
• Alex Rodriguez
• Elizabeth Glass
• Mark D’Souza
• Mustafa Syed
• Yi Zhang

Globus and VDS
• Mike Wilde
• Nika Nefedova
• Jens Voeckler
• Ian Foster
• Rick Stevens

Open Science Grid
• Thanks to Ruth Pordes and the OSG team for their wonderful support

TeraGrid
• Charlie Catlett
• Special thanks to David O’Neal, Joseph Insley, and Sergiu Sanielevici

• VDT Support.
• Condor Support.
• Systems at MCS.

