GriPhyN Virtual Data System
Mike Wilde
Argonne National Laboratory
Mathematics and Computer Science Division
LISHEP 2004, UERJ, Rio de Janeiro, 13 Feb 2004
2LISHEP2004/UERJ www.griphyn.org/chimera 13 Feb 04
A Large Team Effort!
The Chimera Virtual Data System is the work of Ian Foster, Jens Voeckler, Mike Wilde, and Yong Zhao
The Pegasus Planner is the work of Ewa Deelman, Gaurang Mehta, and Karan Vahi
The applications described are the work of many people, including James Annis, Rick Cavanaugh, Rob Gardner, Albert Lazzarini, Natalia Maltsev, and their wonderful teams
Acknowledgements
GriPhyN, iVDGL, and QuarkNet (in part) are supported by the National Science Foundation
The Globus Alliance, PPDG, and QuarkNet are supported in part by the US Department of Energy, Office of Science; by the NASA Information Power Grid program; and by IBM
GriPhyN – Grid Physics Network Mission
Enhance scientific productivity through:
– Discovery and application of datasets
– Enabling use of a worldwide data grid as a scientific workstation
Virtual Data enables this approach by creating datasets from workflow “recipes” and recording their provenance.
Virtual Data System Approach
Producing data from transformations with uniform, precise data interface descriptions enables…
Discovery: finding and understanding datasets and transformations
Workflow: structured paradigm for organizing, locating, specifying, & producing scientific datasets
– Forming new workflow
– Building new workflow from existing patterns
– Managing change
Planning: automated to make the Grid transparent
Audit: explanation and validation via provenance
Virtual Data Scenario
[Figure: workflow graph. simulate writes file1 and file2; reformat turns file2 into file3, file4, file5; conv turns file2 into file6; summarize turns file6 into file7; psearch turns file3, file4, file5 into file8.]
Manage workflow
Update workflow following changes
On-demand data generation
Explain provenance, e.g. for file8:
psearch -t 10 -i file3 file4 file5 -o file8
summarize -t 10 -i file6 -o file7
reformat -f fz -i file2 -o file3 file4 file5
conv -l esd -o aod -i file2 -o file6
simulate -t 10 -o file1 file2
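The provenance question for file8 can be sketched in a few lines of Python. This is an illustrative model only, not Chimera code: the `DERIVATIONS` table and the `provenance` function are hypothetical names, and the input/output lists are read off the commands on this slide.

```python
# Each record: command -> (input files, output files), read off the slide.
DERIVATIONS = {
    "simulate":  ((), ("file1", "file2")),
    "reformat":  (("file2",), ("file3", "file4", "file5")),
    "conv":      (("file2",), ("file6",)),
    "summarize": (("file6",), ("file7",)),
    "psearch":   (("file3", "file4", "file5"), ("file8",)),
}

def provenance(target, derivations=DERIVATIONS):
    """Return the commands needed to produce `target`, producers first."""
    producer = {out: cmd for cmd, (_, outs) in derivations.items() for out in outs}
    ordered = []

    def visit(filename):
        cmd = producer.get(filename)
        if cmd is None or cmd in ordered:
            return  # raw input, or command already recorded
        for parent in derivations[cmd][0]:
            visit(parent)
        ordered.append(cmd)

    visit(target)
    return ordered

print(provenance("file8"))
```

Walking backwards from the requested file touches only the commands that file actually depends on, which is also what makes on-demand regeneration possible.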
Grid3 – The Laboratory
Supported by the National Science Foundation and the Department of Energy.
Grid3 – Cumulative CPU Days to ~25 Nov 2003
Grid2003: ~100 TB data processed to ~25 Nov 2003
Virtual Data System Capabilities
Producing data from transformations with uniform, precise data interface descriptions enables…
Discovery: finding and understanding datasets and transformations
Workflow: structured paradigm for organizing, locating, specifying, & producing scientific datasets
– Forming new workflow
– Building new workflow from existing patterns
– Managing change
Planning: automated to make the Grid transparent
Audit: explanation and validation via provenance
VDL: Virtual Data Language
Describes Data Transformations
Transformation
– Abstract template of program invocation
– Similar to "function definition"
Derivation
– “Function call” to a Transformation
– Store past and future:
> A record of how data products were generated
> A recipe of how data products can be generated
Invocation
– Record of a Derivation execution
Example Transformation
TR t1( out a2, in a1, none pa = "500", none env = "100000" ) {
  argument = "-p "${pa};
  argument = "-f "${a1};
  argument = "-x -y";
  argument stdout = ${a2};
  profile env.MAXMEM = ${env};
}
[Figure: transformation t1 reads $a1 and writes $a2]
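To make the "function definition" analogy concrete, here is a rough Python analogue of t1. It is a sketch under assumptions: `expand_t1` is a hypothetical helper, and it assumes the `argument` lines are concatenated in order, that `stdout` names a redirect target, and that the `profile env` line sets an environment variable.

```python
def expand_t1(a1, a2, pa="500", env="100000"):
    """Bind t1's formal arguments; return (argv, stdout_file, environment)."""
    argv = ["-p", pa, "-f", a1, "-x", "-y"]   # from the three argument lines
    return argv, a2, {"MAXMEM": env}          # stdout target and env profile

# Values taken from derivation d1 on the next slide.
argv, stdout_file, environment = expand_t1(
    a1="run1.exp15.T1932.raw",
    a2="run1.exp15.T1932.summary",
    pa="600",
    env="20000",
)
print(argv, ">", stdout_file, environment)
```

The defaults on `pa` and `env` mirror the `none pa = "500"` and `none env = "100000"` declarations: a derivation that omits them gets the template's values.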
Example Derivations
DV d1->t1 (
  env="20000", pa="600",
  a2=@{out:run1.exp15.T1932.summary},
  a1=@{in:run1.exp15.T1932.raw}
);
DV d2->t1 (
  a1=@{in:run1.exp16.T1918.raw},
  a2=@{out:run1.exp16.T1918.summary}
);
Workflow from File Dependencies
TR tr1(in a1, out a2) {
argument stdin = ${a1};
argument stdout = ${a2}; }
TR tr2(in a1, out a2) {
argument stdin = ${a1};
argument stdout = ${a2}; }
DV x1->tr1(a1=@{in:file1}, a2=@{out:file2});
DV x2->tr2(a1=@{in:file2}, a2=@{out:file3});
[Figure: file1 -> x1 -> file2 -> x2 -> file3]
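The idea that workflow order falls out of file dependencies alone can be sketched as follows. This is an illustration, not how Chimera is implemented; `DVS` and `infer_edges` are hypothetical names, and the file lists are copied from derivations x1 and x2.

```python
# Derivations from the slide: name -> (input files, output files).
DVS = {
    "x1": (["file1"], ["file2"]),
    "x2": (["file2"], ["file3"]),
}

def infer_edges(dvs):
    """Link producer -> consumer whenever one DV's output is another's input."""
    producer = {f: name for name, (_, outs) in dvs.items() for f in outs}
    return sorted(
        (producer[f], name)
        for name, (ins, _) in dvs.items()
        for f in ins
        if f in producer
    )

print(infer_edges(DVS))
```

No derivation names its predecessor explicitly; x2 follows x1 only because both mention file2.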
Example Invocation
Completion status and resource usage
Attributes of executable transformation
Attributes of input and output files
Example Workflow
Complex structure
– Fan-in
– Fan-out
– "left" and "right" can run in parallel
Uses input file
– Register with RC
Complex file dependencies
– Glues workflow
[Figure: diamond workflow. preprocess feeds two findrange jobs, which feed analyze.]
Workflow step "preprocess"
TR preprocess turns f.a into f.b1 and f.b2
TR preprocess( output b[], input a ) {
  argument = "-a top";
  argument = " -i "${input:a};
  argument = " -o " ${output:b};
}
Makes use of the "list" feature of VDL
– Generates 0..N output files.
– The number of files depends on the caller.
Workflow step "findrange"
Turns two inputs into one output
TR findrange( output b, input a1, input a2, none name="findrange", none p="0.0" ) {
  argument = "-a "${name};
  argument = " -i " ${a1} " " ${a2};
  argument = " -o " ${b};
  argument = " -p " ${p};
}
Uses the default argument feature
Can also use list[] parameters
TR findrange( output b, input a[], none name="findrange", none p="0.0" ) {
  argument = "-a "${name};
  argument = " -i " ${" "|a};
  argument = " -o " ${b};
  argument = " -p " ${p};
}
Workflow step "analyze"
Combines intermediary results
TR analyze( output b, input a[] ) {
  argument = "-a bottom";
  argument = " -i " ${a};
  argument = " -o " ${b};
}
Complete VDL workflow
Generate appropriate derivations
DV top->preprocess( b=[ @{out:"f.b1"}, @{out:"f.b2"} ], a=@{in:"f.a"} );
DV left->findrange( b=@{out:"f.c1"}, a2=@{in:"f.b2"}, a1=@{in:"f.b1"}, name="left", p="0.5" );
DV right->findrange( b=@{out:"f.c2"}, a2=@{in:"f.b2"}, a1=@{in:"f.b1"}, name="right" );
DV bottom->analyze( b=@{out:"f.d"}, a=[ @{in:"f.c1"}, @{in:"f.c2"} ] );
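The claim that "left" and "right" can run in parallel can be checked with a short Python sketch that groups the four derivations into stages. It assumes Python 3.9+ for the standard-library `graphlib`; `DVS` and `stages` are hypothetical names, and the file lists are copied from the derivations above.

```python
from graphlib import TopologicalSorter

# name -> (input files, output files), copied from the four derivations.
DVS = {
    "top":    (["f.a"], ["f.b1", "f.b2"]),
    "left":   (["f.b1", "f.b2"], ["f.c1"]),
    "right":  (["f.b1", "f.b2"], ["f.c2"]),
    "bottom": (["f.c1", "f.c2"], ["f.d"]),
}

def stages(dvs):
    """Group derivations into batches whose members have no mutual dependency."""
    producer = {f: n for n, (_, outs) in dvs.items() for f in outs}
    deps = {n: {producer[f] for f in ins if f in producer}
            for n, (ins, _) in dvs.items()}
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches

print(stages(DVS))
```

Each batch contains jobs that are ready at the same time, so any batch with more than one entry is a fan-out that a scheduler could run concurrently.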
Compound Transformations
Using compound TR
– Permits composition of complex TRs from basic ones
– Calls are independent
> unless linked through LFN
– A Call is effectively an anonymous derivation
> Late instantiation at workflow generation time
– Permits bundling of repetitive workflows
– Model: Function calls nested within a function definition
Compound Transformations (cont)
TR diamond bundles black-diamonds
TR diamond( out fd, io fc1, io fc2, io fb1, io fb2, in fa, p1, p2 ) {
  call preprocess( a=${fa}, b=[ ${out:fb1}, ${out:fb2} ] );
  call findrange( a1=${in:fb1}, a2=${in:fb2}, name="LEFT", p=${p1}, b=${out:fc1} );
  call findrange( a1=${in:fb1}, a2=${in:fb2}, name="RIGHT", p=${p2}, b=${out:fc2} );
  call analyze( a=[ ${in:fc1}, ${in:fc2} ], b=${fd} );
}
Compound Transformations (cont)
Multiple DVs allow easy generator scripts:
DV d1->diamond( fd=@{out:"f.00005"}, fc1=@{io:"f.00004"}, fc2=@{io:"f.00003"}, fb1=@{io:"f.00002"}, fb2=@{io:"f.00001"}, fa=@{io:"f.00000"}, p2="100", p1="0" );
DV d2->diamond( fd=@{out:"f.0000B"}, fc1=@{io:"f.0000A"}, fc2=@{io:"f.00009"}, fb1=@{io:"f.00008"}, fb2=@{io:"f.00007"}, fa=@{io:"f.00006"}, p2="141.42135623731", p1="0" );
...
DV d70->diamond( fd=@{out:"f.001A3"}, fc1=@{io:"f.001A2"}, fc2=@{io:"f.001A1"}, fb1=@{io:"f.001A0"}, fb2=@{io:"f.0019F"}, fa=@{io:"f.0019E"}, p2="800", p1="18" );
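A generator script of the kind mentioned here might look like the Python sketch below. It is not the script used for the talk: `diamond_dv` is a hypothetical helper, the file-name pattern (six consecutive 5-digit uppercase hex numbers per derivation) is inferred from d1, d2, and d70, and the p1/p2 values are passed in by the caller because the slide does not say how they were computed.

```python
def diamond_dv(i, p1, p2):
    """Return the DV statement for the i-th diamond (1-based).

    File names follow the pattern visible on the slide: six consecutive
    5-digit uppercase hex numbers per derivation, fa lowest and fd highest.
    """
    base = 6 * (i - 1)

    def name(k):
        return f"f.{base + k:05X}"

    return (
        f'DV d{i}->diamond( fd=@{{out:"{name(5)}"}}, '
        f'fc1=@{{io:"{name(4)}"}}, fc2=@{{io:"{name(3)}"}}, '
        f'fb1=@{{io:"{name(2)}"}}, fb2=@{{io:"{name(1)}"}}, '
        f'fa=@{{io:"{name(0)}"}}, p2="{p2}", p1="{p1}" );'
    )

print(diamond_dv(1, p1="0", p2="100"))
print(diamond_dv(70, p1="18", p2="800"))
```

Because the compound TR hides the four internal calls, the script only has to emit one line per diamond.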
Virtual Data Example: Galaxy Cluster Search
Sloan Data
[Figure: DAG for the cluster search]
[Figure: galaxy cluster size distribution. Number of clusters (axis ticks 1 to 100000) against number of galaxies (axis ticks 1 to 100).]
Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, University of Chicago
Cluster Search Workflow Graph and Execution Trace
[Figure: workflow jobs vs time]
Virtual Data Application: High Energy Physics Data Analysis
[Figure: tree of derived datasets, each node labeled with its parameters:]
mass = 200
mass = 200, decay = WW
mass = 200, decay = ZZ
mass = 200, decay = bb
mass = 200, plot = 1
mass = 200, event = 8
mass = 200, decay = WW, stability = 1
mass = 200, decay = WW, stability = 3
mass = 200, decay = WW, plot = 1
mass = 200, decay = WW, event = 8
mass = 200, decay = WW, stability = 1, LowPt = 20, HighPt = 10000
mass = 200, decay = WW, stability = 1, event = 8
mass = 200, decay = WW, stability = 1, plot = 1
Work and slide by Rick Cavanaugh and Dimitri Bourilkov, University of Florida
Virtual Data Grid Vision
[Figure: virtual data grid architecture. Labels recovered from the diagram:
People and activities: Production Manager, Researcher, Grid Operations, Science Review; planning, discovery, composition, simulation, analysis, sharing, derivation.
Computing Grid: workflow planner, request planner, workflow executor (DAGman), request executor (Condor-G, GRAM), request predictor (Prophesy), Grid Monitor.
Data Grid: storage elements, replica location service, virtual data catalogs, virtual data index, Data Transport, Storage Resource Mgmt.
Data sources: detector (raw data), simulation data.]
GriPhyN/PPDG Data Grid Architecture
[Figure: layered architecture. Labels recovered from the diagram:
Components: Application, Planner, Executor, Catalog Services, Info Services, Policy/Security, Monitoring, Repl. Mgmt., Reliable Transfer Service, Compute Resource, Storage Resource.
Data passed between layers: DAG (abstract), DAG (concrete).
Technologies named: DAGMAN, Kangaroo; GRAM; GridFTP, GRAM, SRM; GSI, CAS; MDS; MCAT, GriPhyN catalogs; GDMP; Globus.]
Executor Example: Condor DAGMan
Directed Acyclic Graph Manager
Specify the dependencies between Condor jobs using DAG data structure
Manage dependencies automatically
– (e.g., “Don’t run job B until job A has completed successfully.”)
Each job is a “node” in DAG
Any number of parent or children nodes
No loops
[Figure: DAG with Job A at the top, Job B and Job C in the middle, Job D at the bottom]
Slide courtesy Miron Livny, U. Wisconsin
Executor Example: Condor DAGMan (Cont.)
DAGMan acts as a “meta-scheduler”
– holds & submits jobs to the Condor queue at the appropriate times based on DAG dependencies
If a job fails, DAGMan continues until it can no longer make progress and then creates a “rescue” file with the current state of the DAG
– When failed job is ready to be re-run, the rescue file is used to restore the prior state of the DAG
[Figure: DAGMan holding jobs A, B, C, D and submitting them to the Condor job queue]
Slide courtesy Miron Livny, U. Wisconsin
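The continue-then-rescue behavior can be illustrated with a toy Python model. This is not DAGMan's implementation and does not use its rescue-file format; `PARENTS` and `run_dag` are hypothetical names, and the diamond shape (A above B and C, D below) is assumed from the figure on the previous slide.

```python
# Toy model only: not DAGMan's implementation or its rescue-file format.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}  # assumed diamond

def run_dag(parents, run_job, done=()):
    """Run every job whose parents succeeded; return (done, failed).

    `done` plays the role of the rescue state: jobs listed there are skipped.
    """
    done, failed = set(done), set()
    progress = True
    while progress:
        progress = False
        for job in sorted(parents):
            if job in done or job in failed:
                continue
            if all(p in done for p in parents[job]):
                (done if run_job(job) else failed).add(job)
                progress = True
    return done, failed

# First attempt: job B fails, C still runs, D stays blocked.
done, failed = run_dag(PARENTS, lambda job: job != "B")
print(sorted(done), sorted(failed))

# Re-run from the saved state: A and C are not executed again.
done2, failed2 = run_dag(PARENTS, lambda job: True, done=done)
print(sorted(done2), sorted(failed2))
```

Saving the set of completed jobs is what lets the second run resume instead of starting over.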
Virtual Data in CMS
Virtual Data Long Term Vision of CMS: CMS Note 2001/047, GRIPHYN 2001-16
CMS Data Analysis
[Figure: for each of Event 1, Event 2, Event 3, a graph linking raw data (simulated or real) and calibration data through the reconstruction algorithm to reconstructed data, and on to jet finder 1, jet finder 2, tag 1, and tag 2 (produced by physics analysis jobs). Object size labels: 100b, 200b, 5K, 7K, 50K, 100K, 200K, 300K. Legend: uploaded data, virtual data, algorithms.]
Dominant use of Virtual Data in the Future
Production Pipeline: GriPhyN-CMS Demo
SC2001 Demo Version: 1 run = 500 events
Stage        pythia      cmsim     writeHits  writeDigis
CPU          2 min       8 hours   5 min      45 min
Data         0.5 MB      175 MB    275 MB     105 MB
Output file  truth.ntpl  hits.fz   hits.DB    digis.DB
[Figure: pipeline diagram; granularity labels "1 run" (four times) and "1 event"]
Pegasus: Planning for Execution in Grids
Maps from abstract to concrete workflow
– Algorithmic and AI-based techniques
Automatically locates physical locations for both components (transformations and data)
– Uses the Globus Replica Location Service and the Transformation Catalog
Finds appropriate resources to execute
– Via the Globus Monitoring and Discovery Service
Reuses existing data products where applicable
Publishes newly derived data products
– RLS, Chimera virtual data catalog
[Figure: Pegasus within the Grid. Labels recovered from the diagram: detector, raw data; Application Models; Information and Models; Virtual Data Language, Chimera; Abstract Workflow; Request Manager; Workflow Planning; Workflow Reduction; Replica and Resource Selector; Data Management; Data Publication; Submission and Monitoring System; Concrete Workflow; Execution, workflow executor (DAGman); Globus Replica Location Service (replica location); Globus Monitoring and Discovery Service (available resources, dynamic information); Grid.]
Replica Location Service
Pegasus uses the RLS to find input data
Pegasus uses the RLS to register new data products
[Figure: one RLI and three LRCs, with the computation querying and updating them]
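The two uses of the RLS, finding inputs and registering new products, can be pictured with a small in-memory stand-in. This is not the Globus RLS client API: `ToyRLS`, the site names, and the example URLs are all hypothetical. The sketch assumes the usual split where a Local Replica Catalog (LRC) maps logical names to physical names at one site and a Replica Location Index (RLI) records which sites know a logical name.

```python
class ToyRLS:
    """In-memory stand-in for a two-level replica service (not the Globus API)."""

    def __init__(self):
        self.lrcs = {}  # site -> {logical file name -> set of physical names}
        self.rli = {}   # logical file name -> set of sites holding a replica

    def register(self, site, lfn, pfn):
        """Record a new replica in the site's catalog and in the index."""
        self.lrcs.setdefault(site, {}).setdefault(lfn, set()).add(pfn)
        self.rli.setdefault(lfn, set()).add(site)

    def lookup(self, lfn):
        """Ask the index which sites know the file, then query each catalog."""
        return sorted(
            pfn
            for site in self.rli.get(lfn, ())
            for pfn in self.lrcs[site][lfn]
        )

rls = ToyRLS()
rls.register("siteA", "f.b1", "gsiftp://siteA.example.org/data/f.b1")
rls.register("siteB", "f.b1", "gsiftp://siteB.example.org/data/f.b1")
print(rls.lookup("f.b1"))
print(rls.lookup("f.c1"))  # no replica registered yet
```

An empty lookup result is the planner's signal that the file has to be derived rather than fetched.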
Use of MDS in Pegasus
MDS provides up-to-date Grid state information
– Total and idle job queue length on a pool of resources (Condor)
– Total and available memory on the pool
– Disk space on the pools
– Number of jobs running on a job manager
Can be used for resource discovery and selection
– Developing various task-to-resource mapping heuristics
Can be used to publish information necessary for replica selection
– Developing replica selection components
Abstract Workflow Reduction
[Figure: DAG of jobs a through i. Key: original node, input transfer node, registration node, output transfer node, node deleted by reduction algorithm.]
The output jobs for the DAG are all the leaf nodes
– i.e. f, h, i
Each job requires 2 input files and generates 2 output files.
The user specifies the output location.
Optimizing from the point of view of Virtual Data
[Figure: same DAG and key as the previous slide]
Jobs d, e, f have output files that have been found in the Replica Location Service.
Additional jobs are deleted.
All jobs (a, b, c, d, e, f) are removed from the DAG.
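The reduction step can be sketched in Python as below. This is an illustration, not the Pegasus algorithm itself: `reduce_dag` is a hypothetical name, and the edges in `CHILDREN` are invented so that f, h, and i are the leaves, because the figure's actual edges are not recoverable from the text. It also assumes a job's outputs are needed only by its children.

```python
# Hypothetical edges: the slide's figure is not recoverable from the text.
CHILDREN = {
    "a": ["b", "c"], "b": ["d"], "c": ["e", "f"], "d": ["g"],
    "e": ["h"], "f": [], "g": ["i"], "h": [], "i": [],
}

def reduce_dag(children, already_materialized):
    """Return the jobs that still have to run.

    A job is dropped if its outputs already exist, or if every job that
    would consume its outputs has itself been dropped.
    """
    deleted = set(already_materialized)
    changed = True
    while changed:
        changed = False
        for job, kids in children.items():
            if job not in deleted and kids and all(k in deleted for k in kids):
                deleted.add(job)
                changed = True
    return sorted(set(children) - deleted)

print(reduce_dag(CHILDREN, {"d", "e", "f"}))
```

Leaf jobs are never dropped by the second rule, since a job with no children has no deleted consumers to justify removing it.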
Planner picks execution and replica locations
Plans for staging data in
– Adding transfer nodes for the input files for the root nodes
[Figure: same DAG and key as the previous slide]
Staging data out and registering new derived products in the RLS
Staging and registering for each job that materializes data (g, h, i).
Transferring the output files of the leaf job (f) to the output location.
[Figure: same DAG and key as the previous slide]
Input DAG
The final executable DAG
[Figure: the input DAG (jobs a through i) and the final executable DAG (jobs g, h, i). Key: original node, input transfer node, registration node, output transfer node.]
Pegasus Components
Concrete Planner and Submit file generator (gencdag)
– The Concrete Planner of the VDS makes the logical to physical mapping of the DAX taking into account the pool where the jobs are to be executed (execution pool) and the final output location (output pool).
Java Replica Location Service Client (rls-client & rls-query-client)
– Used to populate and query the Globus Replica Location Service.
Pegasus Components (cont’d)
XML Pool Config generator (genpoolconfig)
– The Pool Config generator queries the MDS as well as local pool config files to generate an XML pool config, which is used by Pegasus.
– MDS is preferred for generating the pool configuration because it provides much richer information about the pool, including queue statistics and available memory.
The following catalogs are looked up to make the translation
– Transformation Catalog (tc.data)
– Pool Config File
– Replica Location Services
– Monitoring and Discovery Services
Transformation Catalog (Demo)
Consists of a simple text file.
– Contains mappings of logical transformations to physical transformations.
Format of the tc.data file
#poolid  logical tr  physical tr              env
isi      preprocess  /usr/vds/bin/preprocess  VDS_HOME=/usr/vds/;
All the physical transformations are absolute path names.
Environment string contains all the environment variables required in order for the transformation to run on the execution pool.
DB based TC in testing phase.
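A reader for this file format might be sketched as follows. It is not the VDS parser: `parse_tc` is a hypothetical name, and the sketch assumes four whitespace-separated columns, `#` comment lines, and an environment column made of `;`-terminated KEY=VALUE pairs, which is all the single example row shows.

```python
def parse_tc(text):
    """Parse tc.data-style lines into {(pool, logical): (physical, env)}.

    Assumes four whitespace-separated columns, '#' comment lines, and an
    environment column of ';'-terminated KEY=VALUE pairs.
    """
    catalog = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pool, logical, physical, env = line.split(None, 3)
        pairs = (item.split("=", 1) for item in env.split(";") if item.strip())
        catalog[(pool, logical)] = (physical, {k.strip(): v for k, v in pairs})
    return catalog

SAMPLE = """#poolid logical tr physical tr env
isi preprocess /usr/vds/bin/preprocess VDS_HOME=/usr/vds/;
"""

print(parse_tc(SAMPLE))
```

Keying the catalog on (pool, logical name) reflects that the same logical transformation can be installed at a different path on each pool.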
Pool Config (Demo)
Pool Config is an XML file which contains information about various pools on which DAGs may execute.
Some of the information contained in the Pool Config file is
– Specifies the various job-managers that are available on the pool for the different types of Condor universes.
– Specifies the GridFTP storage servers associated with each pool.
– Specifies the Local Replica Catalogs where data residing in the pool has to be cataloged.
– Contains profiles like environment hints which are common site-wide.
– Contains the working and storage directories to be used on the pool.
Pool config
Two ways to construct the Pool Config File
– Monitoring and Discovery Service
– Local Pool Config File (Text Based)
Client tool to generate Pool Config File
– The tool genpoolconfig is used to query the MDS and/or the local pool config file/s to generate the XML Pool Config file.
Gvds.Pool.Config (Demo)
This file is read by the information provider and published into MDS.
Format
gvds.pool.id : <POOL ID>
gvds.pool.lrc : <LRC URL>
gvds.pool.gridftp : <GSIFTP URL>@<GLOBUS VERSION>
gvds.pool.gridftp : gsiftp://sukhna.isi.edu/nfs/asd2/gmehta@2.4.0
gvds.pool.universe : <UNIVERSE>@<JOBMANAGER URL>@<GLOBUS VERSION>
gvds.pool.universe : transfer@columbus.isi.edu/jobmanager-fork@2.2.4
gvds.pool.gridlaunch : <Path to Kickstart executable>
gvds.pool.workdir : <Path to Working Dir>
gvds.pool.profile : <namespace>@<key>@<value>
gvds.pool.profile : env@GLOBUS_LOCATION@/smarty/gt2.2.4
gvds.pool.profile : vds@VDS_HOME@/nfs/asd2/gmehta/vds
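The `key : value` lines with `@`-separated fields can be read with a few lines of Python. This is a sketch, not the information provider's code: `parse_pool_config` is a hypothetical name, and the sample lines are the concrete examples from this slide.

```python
def parse_pool_config(text):
    """Collect 'key : value' lines; keys that repeat accumulate in a list."""
    entries = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            entries.setdefault(key.strip(), []).append(value.strip())
    return entries

SAMPLE = """\
gvds.pool.gridftp : gsiftp://sukhna.isi.edu/nfs/asd2/gmehta@2.4.0
gvds.pool.universe : transfer@columbus.isi.edu/jobmanager-fork@2.2.4
gvds.pool.profile : env@GLOBUS_LOCATION@/smarty/gt2.2.4
gvds.pool.profile : vds@VDS_HOME@/nfs/asd2/gmehta/vds
"""

config = parse_pool_config(SAMPLE)
# <GSIFTP URL>@<GLOBUS VERSION>: split on the last '@' only.
url, globus_version = config["gvds.pool.gridftp"][0].rsplit("@", 1)
# <UNIVERSE>@<JOBMANAGER URL>@<GLOBUS VERSION>
universe, jobmanager, jm_version = config["gvds.pool.universe"][0].split("@")
# <namespace>@<key>@<value>
profiles = [tuple(p.split("@", 2)) for p in config["gvds.pool.profile"]]
print(url, globus_version, universe, jobmanager, jm_version, profiles)
```

Values are kept in lists because keys such as gvds.pool.profile legitimately appear more than once.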
Properties (Demo)
The properties file defines and modifies the behavior of Pegasus.
Properties set in $VDS_HOME/properties can be overridden by defining them either in $HOME/.chimerarc or by giving them on the command line of any executable.
– e.g. Gendax -Dvds.home=path to vds home……
Some examples follow, but for more details please read the sample.properties file in the $VDS_HOME/etc directory.
Basic Required Properties
– vds.home : This is auto set by the clients from the environment variable $VDS_HOME
– vds.properties : Path to the default properties file
> Default : ${vds.home}/etc/properties
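The override rule can be modeled with Python's `ChainMap`. This is a sketch of layered lookup, not Pegasus code: `effective_properties` and the example values are hypothetical, and the ordering between $HOME/.chimerarc and the command line is an assumption (command line first), since the slide only says both override the defaults.

```python
from collections import ChainMap

def effective_properties(defaults, user_rc, command_line):
    """Return a view in which the first mapping that defines a key wins.

    Assumed order: command line, then $HOME/.chimerarc, then the defaults.
    """
    return ChainMap(command_line, user_rc, defaults)

props = effective_properties(
    defaults={"vds.home": "/usr/vds", "vds.properties": "/usr/vds/etc/properties"},
    user_rc={"vds.home": "/home/user/vds"},  # hypothetical user override
    command_line={},
)
print(props["vds.home"], props["vds.properties"])
```

Keys that no higher layer sets fall through to the default file unchanged.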
Concrete Planner Gencdag (Demo)
The Concrete Planner takes the DAX produced by Chimera and converts it into a set of Condor DAG and submit files.
Usage : gencdag --dax <dax file> --p <list of execution pools> [--dir <dir for o/p files>] [--o <outputpool>] [--force]
You can specify more than one execution pool. Execution will take place on the pools on which the executable exists. If the executable exists on more than one pool, the pool on which it will run is selected randomly.
Output pool is the pool where you want all the output products to be transferred to. If not specified, the materialized data stays on the execution pool.
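The pool selection rule described here amounts to a filter followed by a random choice, as in this Python sketch. It is not gencdag's code: `pick_pool` and the pool names other than "isi" are hypothetical, and the set of pools that have the executable would in practice come from the Transformation Catalog.

```python
import random

def pick_pool(requested_pools, pools_with_executable, rng=random):
    """Choose at random among requested pools that have the executable."""
    candidates = [p for p in requested_pools if p in pools_with_executable]
    if not candidates:
        raise LookupError("executable not installed on any requested pool")
    return rng.choice(candidates)

# Hypothetical example: three pools requested, executable present on two.
chosen = pick_pool(["isi", "poolB", "poolC"], {"isi", "poolC"},
                   rng=random.Random(0))
print(chosen)
```

Passing a seeded `random.Random` makes the choice repeatable in tests while keeping the default behavior random.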
Future Improvements
A sophisticated concrete planner with AI technology
A sophisticated transformation catalog with a DB backend
Smarter scheduling of workflows by deciding whether the workflow is compute intensive or data intensive
In-time planning
Using resource queue information and network bandwidth information to make a smarter choice of resources
Reservation of disk space on remote machines
Pegasus Portal
Tutorial Outline
Introduction: Grids, GriPhyN, Virtual Data (5 minutes)
The Chimera system (25 minutes)
The Pegasus system (25 minutes)
Summary (5 minutes)
Summary: GriPhyN Virtual Data System
Using Virtual Data helps in reducing time and cost of computation.
Services in the Virtual Data Toolkit
– Chimera. Constructs a virtual plan
– Pegasus. Constructs a concrete grid plan from this virtual plan.
Some current applications of the Virtual Data Toolkit:
Astronomy
Montage (NASA and NVO) (B. Berriman, J. Good, G. Singh, M. Su)
– Deliver science-grade custom mosaics on demand
– Produce mosaics from a wide range of data sources (possibly in different spectra)
– User-specified parameters of projection, coordinates, size, rotation and spatial sampling
[Figure: mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the TeraGrid]
Montage Workflow
1202 nodes
BLAST: set of sequence comparison algorithms that are used to search sequence databases for optimal local alignments to a query
Led by Veronika Nefedova (ANL) as part of the PACI Data Quest Expedition program
2 major runs were performed using Chimera and Pegasus:
1) 60 genomes (4,000 sequences each), processed in 24 hours
– Genomes selected from DOE-sponsored sequencing projects
– 67 CPU-days of processing time delivered
– ~10,000 Grid jobs
– >200,000 BLAST executions
– 50 GB of data generated
2) 450 genomes processed
Speedups of 5-20 times were achieved because the compute nodes were used efficiently by keeping the submission of jobs to the compute cluster constant.
For further information
Globus Project: www.globus.org
Chimera : www.griphyn.org/chimera
Pegasus: pegasus.isi.edu
MCS: www.isi.edu/~deelman/MCS