© 2006 Open Grid Forum
Everything you always wanted to know
about SAGA*
*but were afraid to ask
Thilo Kielmann
Shantenu Jha, Andre Merzky
Why SAGA?
• Why are there so few grid applications out there?
• Is there a simple, stable, integrated and uniform high-level
programming interface that provides the most common grid
programming abstractions?
• Need to hide underlying complexities, varying semantics,
heterogeneities and changes from the application program(mer)
• Measure(s) of success:
– Does SAGA enable quick development of “new” grid applications?
– Does it enable greater functionality using less code?
Copy a File: Globus GASS
int copy_file (char const* source, char const* target)
{
  globus_url_t source_url;
  globus_io_handle_t dest_io_handle;
  globus_ftp_client_operationattr_t source_ftp_attr;
  globus_result_t result;
  globus_gass_transfer_requestattr_t source_gass_attr;
  globus_gass_copy_attr_t source_gass_copy_attr;
  globus_gass_copy_handle_t gass_copy_handle;
  globus_gass_copy_handleattr_t gass_copy_handleattr;
  globus_ftp_client_handleattr_t ftp_handleattr;
  globus_io_attr_t io_attr;
  int output_file = -1;
  /* parse the source URL and check that its protocol is supported */
  if ( globus_url_parse (source, &source_url) != GLOBUS_SUCCESS ) {
    printf ("can not parse source_URL \"%s\"\n", source);
    return (-1);
  }
  if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_FTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS ) {
    printf ("can not copy from %s - wrong prot\n", source);
    return (-1);
  }
  /* initialize the copy handle and its attributes */
  globus_gass_copy_handleattr_init (&gass_copy_handleattr);
  globus_gass_copy_attr_init (&source_gass_copy_attr);
  globus_ftp_client_handleattr_init (&ftp_handleattr);
  globus_io_fileattr_init (&io_attr);
  globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
  globus_gass_copy_handleattr_set_ftp_attr (&gass_copy_handleattr,
                                            &ftp_handleattr);
  globus_gass_copy_handle_init (&gass_copy_handle,
                                &gass_copy_handleattr);
  /* set protocol-specific attributes for the source */
  if ( source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
       source_url.scheme_type == GLOBUS_URL_SCHEME_FTP ) {
    globus_ftp_client_operationattr_init (&source_ftp_attr);
    globus_gass_copy_attr_set_ftp (&source_gass_copy_attr,
                                   &source_ftp_attr);
  }
  else {
    globus_gass_transfer_requestattr_init (&source_gass_attr,
                                           source_url.scheme);
    globus_gass_copy_attr_set_gass (&source_gass_copy_attr,
                                    &source_gass_attr);
  }
  output_file = globus_libc_open ((char*) target,
                                  O_WRONLY | O_TRUNC | O_CREAT,
                                  S_IRUSR | S_IWUSR | S_IRGRP |
                                  S_IWGRP);
  if ( output_file == -1 ) {
    printf ("could not open the file \"%s\"\n", target);
    return (-1);
  }
  /* convert the output file descriptor to a globus_io_handle */
  if ( globus_io_file_posix_convert (output_file, 0,
                                     &dest_io_handle)
       != GLOBUS_SUCCESS) {
    printf ("Error converting the file handle\n");
    return (-1);
  }
  result = globus_gass_copy_register_url_to_handle (
             &gass_copy_handle, (char*) source,
             &source_gass_copy_attr, &dest_io_handle,
             my_callback, NULL);
  if ( result != GLOBUS_SUCCESS ) {
    printf ("error: %s\n", globus_object_printable_to_string
            (globus_error_get (result)));
    return (-1);
  }
  globus_url_destroy (&source_url);
  return (0);
}
Copy a File: SAGA
#include <iostream>
#include <string>
#include <saga/saga.hpp>
void copy_file(std::string source_url, std::string target_url)
{
  try {
    saga::file f(source_url);
    f.copy(target_url);
  }
  catch (saga::exception const &e) {
    std::cerr << e.what() << std::endl;
  }
}
What SAGA is
• Simple API for Grid Applications:
• For grid-aware applications:
• Dealing with “the Grid” explicitly
• High-level (= application-level) abstractions
• Hides details of underlying middleware(s)
What SAGA is NOT
• SAGA does NOT hide the grid:
• It still exposes that resources
(like files or jobs) can be remote
• But it hides all those details that you never
wanted to deal with...
• SAGA is NOT a service (management)
interface
SAGA: In action
A SAGA engine can talk to many middlewares at
the same time, with dynamic selection/loading
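One way to picture the dynamic selection is a registry that maps a URL scheme to a middleware adaptor. The following is a minimal sketch under that assumption, not the SAGA engine's actual code: `file_adaptor`, `adaptor_registry`, `make_default_registry` and the registered scheme names are all hypothetical.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical adaptor interface: one implementation per middleware.
struct file_adaptor {
    virtual ~file_adaptor() = default;
    virtual std::string name() const = 0;
};

struct local_adaptor : file_adaptor {
    std::string name() const override { return "local"; }
};

struct gridftp_adaptor : file_adaptor {
    std::string name() const override { return "gridftp"; }
};

// Extract the scheme, e.g. "gridftp" from "gridftp://host/path".
inline std::string url_scheme(const std::string& url) {
    const std::string::size_type pos = url.find("://");
    if (pos == std::string::npos) {
        throw std::invalid_argument("URL has no scheme: " + url);
    }
    return url.substr(0, pos);
}

// Hypothetical registry: picks an adaptor at run time from the URL scheme.
class adaptor_registry {
public:
    using factory = std::function<std::unique_ptr<file_adaptor>()>;

    void add(const std::string& scheme, factory f) {
        factories_[scheme] = std::move(f);
    }

    std::unique_ptr<file_adaptor> select(const std::string& url) const {
        const auto it = factories_.find(url_scheme(url));
        if (it == factories_.end()) {
            throw std::runtime_error("no adaptor for URL: " + url);
        }
        return it->second();
    }

private:
    std::map<std::string, factory> factories_;
};

inline adaptor_registry make_default_registry() {
    adaptor_registry r;
    r.add("file", [] {
        return std::unique_ptr<file_adaptor>(new local_adaptor);
    });
    r.add("gridftp", [] {
        return std::unique_ptr<file_adaptor>(new gridftp_adaptor);
    });
    return r;
}
```

The application only ever passes a URL; which adaptor serves it is decided inside the registry. A real engine would additionally load adaptors from shared libraries, which this sketch leaves out.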
The SAGA Interface Hierarchy
Let us first have a quick tour around the API.
Then, we will look at specific (small) examples.
Errors and Exceptions
SAGA defines a hierarchy of exceptions (and allows implementations to fill in specific details).
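What a hierarchy buys the application is the choice between catching one specific error and catching "anything that went wrong on the grid". The sketch below illustrates that idea with stand-in classes; `grid_error`, `does_not_exist`, `permission_denied` and `classify` are hypothetical names, not the exception classes of the SAGA API.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical base class for all errors raised by the API.
class grid_error : public std::runtime_error {
public:
    explicit grid_error(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical specific errors; the message carries backend-specific detail.
class does_not_exist : public grid_error {
public:
    explicit does_not_exist(const std::string& msg) : grid_error(msg) {}
};

class permission_denied : public grid_error {
public:
    explicit permission_denied(const std::string& msg) : grid_error(msg) {}
};

// Runs an operation and reports which level of the hierarchy caught it.
// Handlers are ordered from most specific to most general.
inline std::string classify(const std::function<void()>& op) {
    try {
        op();
        return "ok";
    } catch (const does_not_exist& e) {
        return std::string("does not exist: ") + e.what();
    } catch (const permission_denied& e) {
        return std::string("permission denied: ") + e.what();
    } catch (const grid_error& e) {
        return std::string("grid error: ") + e.what();
    }
}
```

An application that does not care about the cause can catch only the base class, as the `copy_file` example earlier does with `saga::exception`.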
Session, Context, Permissions
Only needed if you wish to handle multiple credentials. Otherwise, your default context is used.
Application Monitoring/Steering
A metric defines application-level data structure(s) that can be monitored and modified (steered).
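The monitor/steer idea can be sketched as a named value with change callbacks: observers are notified whenever the value is modified, whether by the application or by an external steering tool. This is an illustration only; the `metric` class template below and its member names are hypothetical and are not the SAGA metric interface.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical metric: a named value that can be read (monitored)
// and written (steered), notifying registered callbacks on change.
template <typename T>
class metric {
public:
    using callback = std::function<void(const std::string&, const T&)>;

    metric(std::string name, T initial)
        : name_(std::move(name)), value_(std::move(initial)) {}

    // Monitoring side: register interest in changes.
    void add_callback(callback cb) { callbacks_.push_back(std::move(cb)); }

    const T& get() const { return value_; }

    // Steering side: update the value and notify all observers.
    void set(const T& v) {
        value_ = v;
        for (const auto& cb : callbacks_) {
            cb(name_, value_);
        }
    }

private:
    std::string name_;
    T value_;
    std::vector<callback> callbacks_;
};
```

For example, a simulation could expose `metric<int>("iterations", 100)` and pick up a new iteration count in its callback when the value is steered.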
Asynchronous Operations, Tasks
Most calls can be synchronous, asynchronous, or tasks (which need an explicit start).
Jobs
• job_service uses job_description to create a job
• job_description attributes are based on JSDL
• State model is based on BES
• job_self represents the SAGA application
Files, Directories, Name Spaces
Both for physical and replicated (“logical”) files
Is SAGA Simple?
• Well, it depends:
• It is certainly not simple to implement; much of the pain of using the middleware goes into the SAGA engine and adaptors.
• But it is simple to use (see next slides):
• Look & Feel vs. Functional Packages
• Somewhat like MPI: most users only need a very small subset of calls
File Management
saga::directory dir ("any://remote.host.net//data/");
if ( dir.exists ("a") && dir.is_file ("a") ) {
  dir.copy ("a", "b", Overwrite);
}
list <string> names = dir.find ("*-[123].txt");
saga::directory tmp = dir.open_dir ("tmp/", Create);
saga::file file = dir.open ("tmp/data.txt");
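The pattern passed to `find` reads like a shell wildcard. As a rough illustration of what such a pattern selects, the sketch below filters a list of names with the POSIX `fnmatch` function. That SAGA matches names with these exact semantics is an assumption; `find_matching` is a hypothetical helper, and the code requires a POSIX system.

```cpp
#include <fnmatch.h>

#include <string>
#include <vector>

// Hypothetical helper: returns the names that match a shell-style pattern.
// Uses POSIX fnmatch(), which returns 0 on a match.
inline std::vector<std::string> find_matching(
        const std::vector<std::string>& names,
        const std::string& pattern) {
    std::vector<std::string> hits;
    for (const auto& name : names) {
        if (fnmatch(pattern.c_str(), name.c_str(), 0) == 0) {
            hits.push_back(name);
        }
    }
    return hits;
}
```

With the pattern `"*-[123].txt"`, a name such as `run-1.txt` is selected, while `run-4.txt` and `run-2.dat` are not.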
Job Submission
saga::job_description jd;
saga::job_service js ("gram://remote.host.net");
saga::job j = js.create_job (jd);
j.run ();
cout << "Job State: " << j.get_state () << endl;
j.wait ();
cout << "Retval " << j.get_attribute ("ExitCode") << endl;
Jobs (cont.)
saga::job j = js.create_job (jd);
j.run ();
j.suspend ();
j.resume ();
j.checkpoint ();
j.migrate (jd);
j.signal (SIGUSR1);
j.cancel ();
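Calls such as `run`, `suspend`, `resume` and `cancel` only make sense in certain job states. The sketch below models that as a small state machine. The state names and the allowed transitions are assumptions made for this illustration; the normative state model is the one defined in the SAGA specification, and `mock_job` is a hypothetical class that talks to no middleware.

```cpp
#include <stdexcept>
#include <string>

// Assumed state names, for illustration only.
enum class job_state { New, Running, Suspended, Done, Failed, Canceled };

// Hypothetical in-memory job that only tracks its state.
class mock_job {
public:
    job_state state() const { return state_; }

    void run()     { transition(job_state::New, job_state::Running, "run"); }
    void suspend() { transition(job_state::Running, job_state::Suspended, "suspend"); }
    void resume()  { transition(job_state::Suspended, job_state::Running, "resume"); }

    // Assumption of this sketch: only an active job can be canceled.
    void cancel() {
        if (state_ != job_state::Running && state_ != job_state::Suspended) {
            throw std::logic_error("cancel: job is not active");
        }
        state_ = job_state::Canceled;
    }

private:
    // Performs a transition, or throws if the job is in the wrong state.
    void transition(job_state from, job_state to, const char* op) {
        if (state_ != from) {
            throw std::logic_error(std::string(op) + ": not allowed in current state");
        }
        state_ = to;
    }

    job_state state_ = job_state::New;
};
```

Calling an operation in the wrong state raises an error instead of silently doing nothing, which is what lets an application rely on `get_state` between calls.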
Job Migration
saga::job self = js.get_self ();
self.checkpoint ();
self.migrate (jd);
self.signal (SIGUSR1);
self.cancel ();
Checking all my Jobs
vector<string> ids = js.list ();
while ( ids.size () ) {
  string id = ids[ids.size () - 1];
  saga::job j = js.get_job (id);
  cout << id << " : " << j.get_state () << endl;
  ids.pop_back ();
}
JSDL-based Job Descriptions
saga::job_description jd;
jd.set_attribute ("Executable", "/bin/tail");
jd.set_attribute ("Arguments", "-n, 20, -f, all.log");
jd.set_attribute ("Environment", "TMPDIR=/tmp/");
jd.set_attribute ("WorkingDirectory", "data/");
jd.set_attribute ("FileTransfer", "last.log >> all.log");
jd.set_attribute ("Cleanup", "False");
Name Spaces
• Management of entities in name spaces
• Files, replicas, information, resources, steering parameters, checkpoints...
• Manages hierarchy (mkdir, cd, ls, ...)
• Managed entries are opaque (copy, move, delete...)
Files
• Implements name space interface
• POSIX oriented: read, write, seek
• Grid (not so simple) optimizations:
– Scattered I/O
– Pattern based I/O
– Extended I/O (a la GridFTP)
Replicas
• Implements name space interface, adds name space entries
• O/REP oriented: list, add, remove replicas, manage meta data
• Grid optimizations are hidden (replica placement, consistency, ...)
Name Spaces
saga::ns_dir dir ("gridftp://remote.host.net//data/");
if ( dir.is_entry ("a") && ! dir.is_dir ("a") ) {
  dir.copy ("a", "../b");
  dir.link ("../b", "a", Overwrite);
}
list <string> names = dir.find ("*-{123}.text.");
saga::ns_dir tmp = dir.open_dir ("tmp/", DeReference);
saga::ns_entry entry = dir.open ("tmp/data.txt");
entry.copy ("data.bak", Overwrite);
Files
saga::file f ("gridftp://remote.host.net/data/data.bin");
char buf[100];
if ( f.get_size () >= 223 ) {
  int pos = f.seek (123, Current);
  int len = f.read (saga::buffer (buf), 100);
}
Tasks: sync and async operations
saga::file file ("gsiftp://remote.host.net/data/data.bin");
// normal, synchronous
file.copy ("data.bak");
// async versions
saga::task t1 = file.copy <saga::task::Sync> ("data.bak.1");
saga::task t2 = file.copy <saga::task::Async> ("data.bak.2");
saga::task t3 = file.copy <saga::task::Task> ("data.bak.3");
// t1: Done
// t2: Running
// t3: New
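The three creation modes differ only in the state the task is in when the call returns. The sketch below models just those state transitions. It is deliberately thread-free: the work is executed in the calling thread when `wait` is called, whereas a real implementation would run it in the background. `mock_task`, `task_mode` and `task_state` are hypothetical names, not the SAGA task API.

```cpp
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>

enum class task_state { New, Running, Done };
enum class task_mode { Sync, Async, Task };

// Hypothetical task that models state transitions only (no real concurrency).
template <typename T>
class mock_task {
public:
    mock_task(std::function<T()> work, task_mode mode)
        : work_(std::move(work)) {
        if (mode != task_mode::Task) run();   // Sync and Async start at once
        if (mode == task_mode::Sync) wait();  // Sync also completes at once
    }

    // Explicit start, needed for tasks created in Task mode.
    void run() {
        if (state_ != task_state::New) {
            throw std::logic_error("task already started");
        }
        state_ = task_state::Running;
    }

    // In this sketch the work is performed here, in the calling thread.
    void wait() {
        if (state_ == task_state::New) {
            throw std::logic_error("task not started");
        }
        if (state_ == task_state::Running) {
            result_.reset(new T(work_()));
            state_ = task_state::Done;
        }
    }

    task_state state() const { return state_; }

    // Waits for completion if necessary, then returns the stored result.
    const T& get_result() {
        wait();
        return *result_;
    }

private:
    std::function<T()> work_;
    std::unique_ptr<T> result_;
    task_state state_ = task_state::New;
};
```

Created this way, a `Sync` task is already `Done`, an `Async` task is `Running`, and a `Task` stays `New` until `run` is called, matching the three states listed in the slide.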
Tasks: getting results
saga::file file ("gsiftp://remote.host.net/data/data.bin");
// normal, synchronous
ssize_t size_0 = file.get_size ();
// async versions
saga::task <ssize_t> t1 = file.get_size <saga::task::Sync> ();
saga::task <ssize_t> t2 = file.get_size <saga::task::Async> ();
saga::task <ssize_t> t3 = file.get_size <saga::task::Task> ();
// wait...
ssize_t size_1 = t1.get_result ();
ssize_t size_2 = t2.get_result ();
ssize_t size_3 = t3.get_result ();
Upcoming API Extensions
• MessageBus
• Structured data transfer, also many-to-many
• Service Discovery
• Based on GLUE schema
• Adverts
• Persistent storage of application-level data
• Checkpointing/Recovery
• Based on GridCPR
Implementations (late) 2007
• LSU/VUA: C++ engine
• Local, GT4/preWS, OMII-UK GridSAM,
XtreemOS
• LSU: C++ “light”
• Local, GT4/preWS, OMII-UK GridSAM
• LSU: C, Perl, Python, Fortran
• Wrappers for C++ engine
Implementations (late) 2007 (cont.)
• VUA: Java engine
• Local, GT2/3/4, ssh, OMII-UK GridSAM,
XtreemOS, PBS, SGE, gLite(?)
(many via Java-GAT)
• DEISA: Java library
• DEISA (Unicore) files and jobs
• NAREGI: Java library
• NAREGI services