www.cactuscode.org www.gridlab.org
Developing Applications on Today’s Grids
Tom Goodale
Max Planck Institute for Gravitational Physics
[email protected]
Grid Apps 2
Grid Projects
Globus
GrADS
Condor
GridLab
DataGrid
Many more, see for example:
  http://www-fp.mcs.anl.gov/~foster/grid-projects/
  http://www.gridcomputing.com/
Globus
http://www.globus.org
Large and established project which has contributed much Grid middleware
Based at Argonne National Laboratory (USA)
Globus is the most widely deployed software for Grid computing
“The Globus Project is developing fundamental technologies needed to build computational grids.”
GrADS
http://hipersoft.cs.rice.edu/grads/
Grid Application Development Software
“The goal of the Grid Application Development Software (GrADS) Project is to simplify distributed heterogeneous computing in the same way that the World Wide Web simplified information sharing over the Internet. The GrADS project will explore the scientific and technical problems that must be solved to make grid application development and performance tuning for real applications an everyday practice.”
Condor
http://www.cs.wisc.edu/condor/
“The goal of the Condor Project is to develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributively owned computing resources. Guided by both the technological and sociological challenges of such a computing environment, the Condor Team has been building software tools that enable scientists and engineers to increase their computing throughput.”
GridLab
http://www.gridlab.org
“The GridLab project will develop an easy-to-use, flexible, generic and modular Grid Application Toolkit (GAT), enabling todays applications to make innovative use of global computing resources. The project is grounded by two principles, (i) the co-development of infrastructure with real applications and user communities, leading to working scenarios, and (ii) dynamic use of grids, with self-aware simulations adapting to their changing environment.”
DataGrid
http://eu-datagrid.web.cern.ch/eu-datagrid/
“DataGrid is a project funded by European Union. The objective is to build the next generation computing infrastructure providing intensive computation and analysis of shared large-scale databases, from hundreds of TeraBytes to PetaBytes, across widely distributed scientific communities.”
...
There are many more projects; the preceding was just a sample
More projects are starting all the time
The “Grid” is gaining interest...
Grid Infrastructure
There is already a lot of infrastructure out there to help one run applications on grids
Not so much infrastructure so far for tailoring applications to run on grids, but that doesn't stop existing legacy applications being able to run in grid environments.
Lots of effort currently underway to develop portals to facilitate use of the existing infrastructure with existing applications.
Globus
The Globus project has developed the most widely deployed grid infrastructure
This infrastructure splits into
  Security
  Data management
  Resource management
  Information systems
Additionally the project has developed an MPI implementation which helps MPI applications to run across multiple computational resources.
GSI
The Globus Toolkit uses the Grid Security Infrastructure for enabling secure authentication and communication over an open network. GSI provides a number of useful services for Grids, including mutual authentication and single sign-on.
GSI uses public key encryption and X.509 certificates, along with SSL, with extensions to allow single sign on and credential delegation.
The Globus implementation is GSS-API compliant
Data Management
The Globus Toolkit includes various data management components
GridFTP
  GSI enabled FTP including multiple parallel streams to increase overall throughput
Data Replication
  Multiple copies of data distributed to allow faster access
  Toolkit provides replica catalogue and replica management software
GASS
  Global Access to Secondary Storage
  Can access data from anywhere with a URL
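The idea behind multiple parallel streams can be sketched as dividing one file into byte ranges, one per stream. This is only an illustration of the principle, not how GridFTP is implemented; the function name split_into_streams is hypothetical.

```python
def split_into_streams(file_size, n_streams):
    """Return (offset, length) pairs that together cover the whole file.

    Hypothetical helper: each pair is the byte range one stream would carry.
    """
    if n_streams < 1:
        raise ValueError("need at least one stream")
    base, extra = divmod(file_size, n_streams)
    ranges = []
    offset = 0
    for i in range(n_streams):
        # The first `extra` streams carry one byte more than the rest.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # fewer bytes than streams: leave this stream unused
        ranges.append((offset, length))
        offset += length
    return ranges


ranges = split_into_streams(10, 3)
```

Each range can then be transferred independently, so that the total throughput is not limited by a single stream.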
Resource Management
Globus Resource Allocation Manager (GRAM)
GRAM processes the requests for resources for remote application execution, allocates the required resources, and manages the active jobs. It also returns updated information regarding the capabilities and availability of the computing resources to the Monitoring and Discovery Service (MDS).
GRAM provides an API for submitting and canceling a job request, as well as checking the status of a submitted job. The job specification is written by the user in the Resource Specification Language (RSL), and is processed by GRAM as part of the job request.
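As a rough illustration of what a job specification looks like, the sketch below builds a simple RSL conjunction of the form &(attribute=value)(attribute=value) from a Python dictionary. The helper to_rsl is hypothetical and only produces the string; it does not talk to GRAM, and it rejects values that would need RSL quoting rather than trying to quote them.

```python
def to_rsl(spec):
    """Build a simple RSL conjunction string from attribute/value pairs.

    Hypothetical helper for illustration; submission to GRAM is left out.
    """
    parts = []
    for name, value in spec.items():
        text = str(value)
        # Values with spaces, quotes or parentheses need RSL quoting,
        # which this sketch does not attempt.
        if any(ch in text for ch in ' ()"'):
            raise ValueError("value needs quoting: " + text)
        parts.append("(%s=%s)" % (name, text))
    return "&" + "".join(parts)


job = to_rsl({"executable": "/bin/date", "count": 4})
```

Here job holds a request to run /bin/date with a count of four.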
Information Systems
Monitoring and Discovery Service (MDS)
The MDS contains static and dynamic information about compute resources, as well as static and dynamic information about the network performance between compute resources.
  LDAP based database
  Hierarchical
MPICH-G2
Can be used to run across multiple distributed parallel resources
Based on the widely available MPICH MPI implementation from Argonne; in fact it is a standard MPICH device which may be built if Globus is installed on the system
May use vendor's native MPI implementation for intra-machine communication
Uses Globus infrastructure to launch jobs on remote resources
Condor
Condor converts collections of distributively owned workstations and dedicated clusters into a distributed high-throughput computing facility.
Uses ClassAds to specify resource requirements for jobs.
Contains checkpointing and process migration
Can use Globus to act as a batch system across multiple resources
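The matching of job requirements against resources can be sketched as follows. This is plain Python, not ClassAd syntax, and every attribute name (arch, memory_mb, owner) is made up for illustration. It assumes a two-sided match, where the job and the machine must each accept the other.

```python
def matches(job, machine):
    """Two-sided match: each ad's requirements must accept the other ad."""
    return job["requirements"](machine) and machine["requirements"](job)


def find_matches(job, machines):
    """Names of all machines that match the job."""
    return [m["name"] for m in machines if matches(job, m)]


# Illustrative ads; the attribute names are assumptions, not Condor's.
job = {
    "owner": "alice",
    "requirements": lambda m: m["arch"] == "x86" and m["memory_mb"] >= 512,
}
machines = [
    {"name": "ws1", "arch": "x86", "memory_mb": 256,
     "requirements": lambda j: True},
    {"name": "ws2", "arch": "x86", "memory_mb": 1024,
     "requirements": lambda j: j["owner"] != "bob"},
    {"name": "ws3", "arch": "sparc", "memory_mb": 2048,
     "requirements": lambda j: True},
]

matched = find_matches(job, machines)
```

Only ws2 satisfies the job's requirements while also accepting the job's owner.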
How Do Applications Use These ?
To use Condor or MPICH-G2 no changes need to be made to the application to make use of new distributed features
So simple applications used in their current mode may be able to use infrastructure transparently.
More complicated applications have more needs...
What is an Application ?
Sometimes causes much confusion in conversations
Is an application a single process, or many processes collaborating to perform some task ?
For the purposes of this tutorial I will define an application as the latter
An application is one or more processes which perform a particular task such as a simulation or a calculation on behalf of a user
  E.g. all processes in an MPI Job
Application Developers
What do Application Developers Need to Think About in Grid Environments ?
This is very similar to the requirements for an application to be able to run on many different architectures
Need now to also think that not all processes in an application are necessarily running on the same resource or even the same architecture
Not all processes have access to the same environment, and they may not be able to reach the same set of remote resources
IO
As discussed for frameworks, files must be in some format which is readable on all architectures
Not all processes may have access to the same file systems, so may need to use communication technologies to access files remotely
The user may not ordinarily have access to any of the filespaces accessible to the application, so there must be some way to migrate files to and from the space available to the application.
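One way to get a file format which is readable on all architectures is to fix the byte order and field sizes explicitly instead of relying on the native layout. A minimal sketch using Python's struct module; the record layout (a step number and a value) is an assumption chosen for illustration.

```python
import struct

# "<" selects little-endian with standard sizes and no padding, so the
# layout is the same on every architecture: int32 followed by float64.
RECORD = struct.Struct("<id")


def pack_record(step, value):
    """Encode one (step, value) record in the fixed layout."""
    return RECORD.pack(step, value)


def unpack_record(data):
    """Decode one record written by pack_record on any machine."""
    return RECORD.unpack(data)


encoded = pack_record(1, 0.5)
decoded = unpack_record(encoded)
```

In practice an established self-describing format does this work for you; the point is only that the on-disk layout must not depend on the machine which wrote it.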
Parallel Issues
If using MPI, must be an MPI version which can run heterogeneously, e.g. MPICH-G2, PACX.
When running across multiple resources, communication between processes on different resources has much higher latency and much lower bandwidth than communication between processes on a single resource
  Need to think about communication patterns: is it possible to reduce the amount of communication by, for example, buffering data for longer and sending larger batches of data ?
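The benefit of sending larger batches can be estimated with a simple cost model in which every message pays the latency once and then its size divided by the bandwidth. The model and all the numbers below are assumptions for illustration, not measurements.

```python
def transfer_time(n_messages, bytes_per_message, latency_s, bandwidth_bytes_per_s):
    """Total time if each message costs latency plus size / bandwidth."""
    per_message = latency_s + bytes_per_message / bandwidth_bytes_per_s
    return n_messages * per_message


# Assumed wide-area link: 50 ms latency, 1 MB/s bandwidth.
LATENCY = 0.05
BANDWIDTH = 1.0e6

# The same 1 MB of data, sent as many small or a few large messages.
unbatched = transfer_time(1000, 1000, LATENCY, BANDWIDTH)
batched = transfer_time(10, 100000, LATENCY, BANDWIDTH)
```

With these assumed figures the unbatched transfer is dominated by latency (roughly 51 seconds in total), while the batched one takes about 1.5 seconds for the same data.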
Inter Process Communication
Need to locate other processes in application
  These may be on remote resources
  Remote resources may be firewalled
Portability
Need to be able to compile and run in heterogeneous environments
Not all resources have the same sets of software available
When starting a distributed application, how does one make sure that there is a suitable executable there ?
Should base code on standards, not on individual compiler vendors' specific features.
Firewalls
In the modern world a lot of resources are protected by firewalls. These restrict the ports which may be accessed from the outside world, and often the locations in the outside world from which these ports may be opened.
Not generally a problem for an application running on this resource
A real problem for monitoring such an application
A real problem for running an application across multiple such resources
Testbeds - What and Why ?
A testbed is a (heterogeneous) set of machines which you may test your application on.
May or may not have a uniform distribution of grid infrastructure.
Why use one ?
  Set of resources which you can find out about and have accounts on.
  Can ask the sysadmins what went wrong.
  Can request installation of other software
  Can thus test your application in a Grid environment with less pain than on a random set of machines
Grid Programming Tools
While there are many Grid projects, and much grid middleware, there is, to date, very little in the way of toolkits which make it easy for an application developer to write an application which makes full use of the possibilities of the Grid.
Both MPICH-G2 and Condor allow specific classes of applications to make use of the power of the grid to run distributed applications. However, access to resource and data management is still hard to do from an application, and IPC for distributed applications is still hard.
GAT - What ?
The Grid Application Toolkit (GAT), which is currently being developed by the GridLab project, aims to make it easier for an application developer to write applications which use the Grid
The GAT aims to develop an API to enable application developers to make use of the best Grid infrastructure when and as it becomes available
The GAT API allows access to “fundamental grid operations”
The GAT abstracts these operations allowing access to alternative implementations or instances of entities providing these operations
GAT - Why ?
People want to use the Grid
However they don't want to have to learn all about the various Grid technologies
  Users want to just submit a job and get results back
  Application developers want to be able to write applications which can access Grid resources and run in a Grid environment; they don’t want to have to rewrite parts of their application when new technologies come along
Want to be able to have applications developed today, so they can use the Grid as it emerges.
Provides a “buffer zone” between applications and the Grid.
The Grid is complex …
[Figure: Cactus asks “Is there a better resource I could be using?” and faces many separate pieces: GLOBUS, Application Manager, Monitoring, Resource Management, Information, Security, Data Management, Logging, Notification, Migration, Profiling, SOAP, WSDL, Corba, OGSA, and other Grid infrastructure.]
… need to make it easier to use
[Figure: Cactus asks “Is there a better resource I could be using?” by calling GAT_FindResource( ); the GAT sits between Cactus and the Grid.]
GridLab Architecture
The GridLab architecture is split into several pieces:
  The application itself
  A library which interfaces between the application and the Grid middleware
  The Grid middleware.
The GridLab project aims to develop the library (the GAT Engine) and a set of middleware (GridLab services).
GAT: What is It?
GAT: Grid Application Toolkit
  Implements the GAT-API
    Used by applications (different languages)
  GAT Adaptors
    Connect to capabilities/services
  GAT Engine
    Provides the function bindings for the GAT-API
The GAT Engine
This is a library which applications link against to make use of Grid infrastructure
It provides stub calls for the basic Grid operations
  Applications can always make calls to any of these operations, and will get an error back if an operation is not available.
  Thus an application need not be re-written, recompiled or re-linked to make use of new middleware.
The actual access to middleware is provided by dynamically loadable modules which provide access to specific implementations of these grid operations
GAT Engine
When an application makes a GAT-API call, the engine searches through an internal database of adaptors for the requested capability and calls it
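The lookup described here can be sketched as a small registry. This is not the GAT Engine itself (a library which applications link against), only a Python illustration of the idea; the names Engine, load_adaptor, call and GATError are all hypothetical.

```python
class GATError(Exception):
    """Raised when no loaded adaptor provides the requested operation."""


class Engine:
    def __init__(self):
        # Internal database: operation name -> adaptor callables, in load order.
        self._adaptors = {}

    def load_adaptor(self, operation, func):
        """Register an adaptor function for one operation at run time."""
        self._adaptors.setdefault(operation, []).append(func)

    def call(self, operation, *args):
        """Look up the operation and call the first adaptor providing it."""
        candidates = self._adaptors.get(operation, [])
        if not candidates:
            raise GATError("operation not available: " + operation)
        return candidates[0](*args)


engine = Engine()

# The application can always make the call; with no adaptor it gets an error.
try:
    engine.call("find_resource")
    first_attempt = "ok"
except GATError as err:
    first_attempt = str(err)

# Loading an adaptor later makes the same call succeed, without any
# change to the calling code.
engine.load_adaptor("find_resource", lambda: "localhost")
second_attempt = engine.call("find_resource")
```

The calling code is identical before and after the adaptor is loaded, which is what lets an application pick up new middleware without being rewritten.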
GAT Adaptor
Interface between GAT Engine and one or more capabilities
  Translates user requests to appropriate interface syntax for a capability provider
  Active adaptors change dynamically
  Includes “security context”
  Returns appropriate error codes
Examples
  OGSA adaptor (provides many capabilities)
  GRAM adaptor (talks directly to gatekeepers)
  Adaptors for each GridLab service provider
  “Local” adaptors (GAT_MoveFile => “cp”, GAT_FindResource => “localhost”)
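A “local” adaptor of the kind listed in the examples can be sketched as follows: the file operation is translated into a local copy (mirroring the mapping to “cp”), and resource discovery simply answers “localhost”. The class name, method names and error codes are hypothetical and are not taken from the GAT API.

```python
import shutil

# Hypothetical error codes for illustration.
GAT_SUCCESS = 0
GAT_FAILURE = 1


class LocalAdaptor:
    """Provides grid operations using only the local machine."""

    def move_file(self, source, destination):
        # Translate the request into a local copy, as "cp" would do.
        try:
            shutil.copy(source, destination)
        except OSError:
            return GAT_FAILURE
        return GAT_SUCCESS

    def find_resource(self):
        # With no Grid available, the only resource is this machine.
        return GAT_SUCCESS, "localhost"
```

An adaptor like this lets the same application run unchanged on a machine with no Grid middleware at all.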
[Figures: GAT Adaptor; GAT Adaptor Initialisation; GAT Adaptor Call (I); GAT Adaptor Call (II)]
Current Status
Many GridLab services in development or available in prototype or alpha release
  Data management, resource brokering, application monitoring, information systems, access for mobile devices
GridLab portal under development
GAT Engine available in prototype form
  Usable with a test API
  Allows access to available GridLab services and some other Grid middleware
Grid Operations for GAT API being identified and codified and actual API being developed.
The Same Application …
[Figure: the same Application/GAT stack shown three times, on a laptop, on the Grid, and on a supercomputer, annotated “No network!” and “Firewall issues!”]
Getting Ready For The Grid
Grid Toolkits and middleware aren't a magic wand which will 'grid enable' your application. You still need to think !
Unless your application is very simple, or makes use of a framework, you are very likely to need to modify it.
Use standards ! This is basic to any portable application
If you want to locate data you need to be able to describe it.
  The grid will not magically decrease data processing time for data intensive applications unless the data can be described adequately.
Getting Ready ...
As said before, simple applications can be 'grid enabled' in a basic way by use of MPICH-G2 or Condor.
In fact any basic MPI application is 'grid enabled' !
However it may need modification to run optimally in a Grid environment.
Various Scenarios
Application steering
  This should be done in some standard way so that in the future it can be replaced by some actuators from some toolkit which gives authentication and authorisation
Checkpointing and IO
  Should be in some standard file format
  May want to advertise files to some data management system
Visualisation
  Again standard file or data formats allow grid middleware to operate
  Can be linked to file advertising
What Frameworks and Toolkits Give You
Frameworks such as Cactus give you a lot of these things. Using such a framework frees you as an application developer from having to worry about a lot of issues; the framework developers have hopefully done it instead.
Similarly toolkits such as the GAT free you from having to worry about specific Grid infrastructure. All the access to Grid middleware, and worrying about how it is deployed can be delegated to the toolkit.
Using frameworks and toolkits frees you from having to worry about a lot of generic things, leaving you more time and energy to work on application specific things.
Example
A simple example of a GAT enabled application will be presented and run.
This will be available from http://www.gridlab.org/WorkPackages/wp-1/Examples