DEFINITION
Grid computing is the federation of computer resources from multiple administrative domains to reach a common goal.
A computational grid is a hardware and software infrastructure that provides dependable, pervasive, and inexpensive access to high-end computational capabilities.
What is Grid Computing?
Who Needs It?
An Illustrative Example
Grid Users
Current Grids
What is Grid Computing?
Computational Grids
    Homogeneous (e.g., clusters)
    Heterogeneous (e.g., with one-of-a-kind instruments)
Cousins of Grid Computing
Methods of Grid Computing
Computational Grids
A network of geographically distributed resources including computers, peripherals, switches, instruments, and data.
Each user should have a single login account to access all resources.
Resources may be owned by diverse organizations.
Computational Grids
Grids are typically managed by gridware.
Gridware can be viewed as a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance, availability).
Cousins of Grid Computing
Parallel Computing
Distributed Computing
Peer-to-Peer Computing
Many others: Cluster Computing, Network Computing, Client/Server Computing, Internet Computing, etc.
Distributed Computing
People often ask: Is Grid Computing a fancy new name for the concept of distributed computing?
In general, the answer is no. Distributed Computing is most often concerned with distributing the load of a program across two or more processes.
Peer-to-Peer Computing
Sharing of computer resources and services by direct exchange between systems.
Computers can act as clients or servers depending on which role is most efficient for the network.
Distributed Supercomputing
Combines multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.
Tackles problems that cannot be solved on a single system.
High-Throughput Computing
Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
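The high-throughput pattern described above can be sketched on a single machine, where a pool of workers stands in for idle grid nodes. This is only an analogy under stated assumptions: `screen_compound`, its scoring formula, and `run_batch` are hypothetical names invented for illustration, and a real grid would dispatch the tasks to remote resources rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_compound(compound_id: int) -> tuple[int, int]:
    """Hypothetical independent task: score a single compound."""
    score = (compound_id * 37) % 101  # stand-in for a real computation
    return compound_id, score


def run_batch(compound_ids, workers: int = 4) -> dict[int, int]:
    """Run every task independently; no task needs another task's result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each task to whichever worker is free next.
        return dict(pool.map(screen_compound, compound_ids))


if __name__ == "__main__":
    print(run_batch(range(10)))
```

Because the tasks share no state, the number of workers can change without affecting the results, which is what makes this kind of workload a good fit for otherwise unused processors.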
On-Demand Computing
Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
Models real-time computing demands.
Collaborative Computing
Concerned primarily with enabling and enhancing human-to-human interactions.
Applications are often structured in terms of a virtual shared space.
Logistical Networking
Global scheduling and optimization of data movement.
Contrasts with traditional networking, which does not explicitly model storage resources in the network.
Called "logistical" because of the analogy it bears with the systems of warehouses, depots, and distribution channels.
Who Needs Grid Computing?
A chemist may utilize hundreds of processors to screen thousands of compounds per hour.
Teams of engineers worldwide pool resources to analyze terabytes of structural data.
Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.
An Illustrative Example
Tiffany Moisan, a NASA research scientist, collected microbiological samples in the tidewaters around Wallops Island, Virginia.
She needed the high-performance microscope located at the National Center for Microscopy and Imaging Research (NCMIR), University of California, San Diego.
She sent the samples to San Diego and used NPACI's Telescience Grid and NASA's Information Power Grid (IPG) to view and control the output of the microscope from her desk on Wallops Island. Thus, in addition to viewing the samples, she could move the platform holding them and make adjustments to the microscope.
An Illustrative Example (contd.)
The microscope produced a huge dataset of images.
This dataset was stored using a storage resource broker on NASA's IPG.
Moisan was able to run algorithms on this very dataset while watching the results in real time.
Grid Users
Grid developers
Tool developers
Application developers
End users
System administrators
Grid Developers
Very small group.
Implementers of a grid protocol who provide the basic services required to construct a grid.
Tool Developers
Implement the programming models used by application developers.
Implement basic services similar to conventional computing services:
    User authentication/authorization
    Process management
    Data access and communication
Application Developers
Construct grid-enabled applications for end users, who should be able to use these applications without concern for the underlying grid.
Provide programming models that are appropriate for grid environments and services that programmers can rely on when developing (higher-level) applications.
System Administrators
Balance local and global concerns.
Manage grid components and infrastructure.
Some tasks are still not well delineated due to the high degree of sharing required.
ADVANTAGES
Can solve larger, more complex problems in a shorter time.
Easier to collaborate with other organizations.
Makes better use of existing hardware.
DISADVANTAGES
Grid software and standards are still evolving.
Learning curve to get started.
Non-interactive job submission.
The Grid: Present, Past, Future
Grid computing has a number of derivatives. They share resources but differ in architecture.
1. Compute Grids
2. Data Grids
3. Science Grids
4. Access Grids
5. Knowledge Grids
6. Cluster Grids
7. Terra Grids
8. Commodity Grids
1. Compute Grids vendors:
    GridGain - Professional Open Source
    JPPF - Open Source
2. Data Grids vendors:
    Oracle Coherence - Commercial
    GemStone - Commercial
    GigaSpaces - Commercial
    JBossCache - Professional Open Source
    EhCache - Open Source
Data
Functional data requirements for Grid Computing applications are:
Integration of multiple distributed, heterogeneous, and independently managed data sources.
Data transfer mechanisms.
Data caching and/or replication mechanisms to minimize network traffic.
Data discovery mechanisms.
Data encryption and integrity.
Backup/restore mechanisms and policies.
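Two of the requirements listed above, caching to reduce network traffic and integrity checking, can be illustrated together in a small sketch. Everything here is an assumption made for illustration: `CachingFetcher`, `fetch_remote`, and the use of SHA-256 digests are not taken from any particular grid toolkit.

```python
import hashlib
import hmac
from typing import Callable, Dict


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a payload."""
    return hashlib.sha256(data).hexdigest()


class CachingFetcher:
    """Hypothetical client-side cache in front of a remote data source."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote  # assumed transfer function
        self._cache: Dict[str, bytes] = {}
        self.remote_fetches = 0  # counts real network transfers

    def get(self, name: str, expected_digest: str) -> bytes:
        # Cache hit: no network traffic at all.
        if name in self._cache:
            return self._cache[name]
        data = self._fetch_remote(name)
        self.remote_fetches += 1
        # Integrity check: refuse data whose digest does not match.
        if not hmac.compare_digest(sha256_digest(data), expected_digest):
            raise ValueError(f"integrity check failed for {name!r}")
        self._cache[name] = data
        return data
```

Only verified data enters the cache, so a corrupted transfer is never served to later requests.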
Computation
Functional computational requirements for grid applications are:
Independent management of computing resources.
Intelligent and transparent selection of computing resources.
Availability and dynamic resource configuration.
Failure detection and failover mechanisms.
Secure resource management, access, and integrity.
Computational and Data Grids
Data requirements in the early grid solutions:
The ability to discover data.
Access to databases, utilizing metadata and other attributes of the data.
The provisioning of computing facilities for high-speed data movement.
Flexible data access and data filtering capabilities.
Current Grid Activities
Present-day grids can share different kinds of resources:
1. Computing power
2. Data
3. Hardware
4. Software
5. Network services
Dynamic benefits of coordinated resource sharing in a virtual organization.
The usage patterns found within each of the virtual organizations:
A virtual organization for weather prediction. For example, this virtual organization requires resources such as weather prediction software applications to perform the mandatory environmental simulations associated with predicting weather.
A virtual organization for financial modeling. For example, this virtual organization requires resources such as software modeling tools for performing a multitude of financial analytics, virtualized blades to run the above software, and access to data storage facilities for storing and accessing data.
Requirements for Grid Computing Architecture
The requirements fall into three categories:
1. Resources
2. Virtual organization
3. Users/applications
Resources must provide facilities for the following scenarios:
Dynamic discovery of computing resources, based on their capabilities and functions.
Immediate allocation and provisioning of these resources, based on their availability and the user demands or requirements.
The management of these resources to meet the required service level agreements (SLAs).
The provisioning of multiple autonomic features for the resources, such as self-diagnosis, self-healing, self-configuring, and self-management.
The provisioning of secure access methods to the resources, and bindings with the local security mechanisms based upon the autonomic control policies.
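The first scenario above, discovering resources by capability and availability, amounts to filtering a registry of resource descriptions. The sketch below shows one way that could look. The `Resource` fields and the `discover` function are hypothetical names chosen for illustration, and the in-memory list stands in for whatever information service a real grid would query.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Resource:
    name: str                     # hypothetical resource identifier
    capabilities: FrozenSet[str]  # e.g. {"linux", "gpu"}
    free_cpus: int                # CPUs currently available


def discover(resources: Iterable[Resource],
             required: FrozenSet[str],
             min_cpus: int) -> List[Resource]:
    """Return resources offering every required capability and enough
    free CPUs, with the most available resource first."""
    matches = [
        r for r in resources
        if required <= r.capabilities and r.free_cpus >= min_cpus
    ]
    return sorted(matches, key=lambda r: r.free_cpus, reverse=True)
```

Sorting by free CPUs is just one possible ranking; an SLA-driven system would rank by whatever attribute the agreement constrains.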
A virtual organization must be capable of providing facilities for:
Virtual task forces, or groups, to solve specific problems associated with the virtual organization.
Dynamic collection of resources from heterogeneous providers based upon users' needs and the sophistication levels of the problems.
Dynamic identification and automatic resolution of a wide variety of problems, with automation of event correlation, linking the specific problems to the required resource and service providers.
The dynamic provisioning and management of the resources required to meet the SLAs.
The formation of a secured federation (or governance model) and a common management model for all of the resources respective to the virtual organization.
The secure delegation of user credentials and identity mapping to the local domain(s).
The management of resources, including utilization and allocation, to meet a budget and other economic criteria.
Users/applications typically found in Grid Computing environments must exhibit the following characteristics:
The clear and unambiguous identification of the problem.
The identification and mapping of the resources.
The ability to sustain the required levels of QoS, while adhering to the anticipated and necessary SLAs.
The capability to collect feedback regarding resource status, including updates for the environment's respective applications.
GRID APPLICATIONS
Grid computing applications can be aligned around a common set of needs:
Application partitioning, which involves breaking the problem into discrete pieces.
Discovery and scheduling of tasks and workflow.
Data communication: distributing the problem data where and when it is required.
Provisioning and distributing application code to specific system nodes.
GRID APPLICATIONS (contd.)
Results management: assisting in the decision processes of the environment.
Autonomic features such as self-configuration, self-optimization, self-recovery, and self-management.
Let us explore some of these grid applications and their usage patterns.
Schedulers
Responsible for the management of jobs, such as allocating the resources needed for any specific job, parallel execution of tasks, data management, and service-level management.
Schedulers form a hierarchical structure, with meta-schedulers as the root and other schedulers as the leaves.
Meta-schedulers or cluster schedulers are used for parallel execution.
The scheduler hierarchy embodies local, meta-level, and cluster schedulers.
[Diagram: USER, two META SCHEDULERs, a LOCAL SCHEDULER, and a CLUSTER SCHEDULER, connected by arrows labeled JOB.]
Schedulers (contd.)
Jobs submitted to the grid computing applications are evaluated based on their service-level requirements.
This involves complex workflow management and data movement activities that occur on a regular basis.
Schedulers (contd.)
Schedulers must provide capabilities for areas such as:
Advanced resource reservation
SLA validation and enforcement
Monitoring job execution and status
Rescheduling and corrective action
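The hierarchy described in the Schedulers slides, with a meta-scheduler at the root and other schedulers as leaves, can be sketched as a small tree in which each level forwards a job to the child with the most free capacity. The class names, the `slots` capacity model, and the "most free slots" policy are assumptions made for illustration and do not describe any specific scheduler product.

```python
from typing import List, Union


class LocalScheduler:
    """Leaf scheduler: queues jobs on one resource with fixed capacity."""

    def __init__(self, name: str, slots: int) -> None:
        self.name = name
        self.slots = slots
        self.queue: List[str] = []

    def free_slots(self) -> int:
        return self.slots - len(self.queue)

    def submit(self, job: str) -> str:
        if self.free_slots() <= 0:
            raise RuntimeError(f"{self.name} has no free slots")
        self.queue.append(job)
        return self.name  # report where the job landed


class MetaScheduler:
    """Inner node: delegates each job to the child with most free slots."""

    def __init__(self, name: str,
                 children: List[Union["MetaScheduler", LocalScheduler]]) -> None:
        self.name = name
        self.children = children

    def free_slots(self) -> int:
        return sum(child.free_slots() for child in self.children)

    def submit(self, job: str) -> str:
        # On ties, max() keeps the first child in list order.
        best = max(self.children, key=lambda child: child.free_slots())
        return best.submit(job)
```

A user only talks to the root; where the job finally runs is decided level by level, which is the point of the hierarchy.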
Resource Broker
It provides the pairing service between the service requester and the service provider.
This pairing enables the selection of the best available resource.
It collects information from the respective resources and uses this information for pairing purposes.
Resource Broker
[Diagram: USER, RESOURCE BROKER, SCHEDULER, RESOURCE 1, and RESOURCE 2, with arrows labeled SELECT RESOURCE, SELECT SCHEDULER, INFORMATION, and EXECUTE TASK.]
Resource Broker (contd.)
The resource broker provides feedback to users on the available resources.
The resource broker may select a suitable scheduler for the execution of tasks.
Resource Broker (contd.)
The pairing process in a resource broker involves allocation and support functions such as:
Allocating the appropriate resource for task execution.
Supporting users' deadline and budget constraints for scheduling optimizations.
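The pairing step with deadline and budget constraints can be sketched as a filter followed by a choice of the cheapest feasible offer. The `Offer` fields, the `pair` function, and the "cheapest first, then fastest" rule are all assumptions for illustration; a real broker would gather the offers from the resources themselves and could optimize for something else.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offer:
    resource: str         # hypothetical resource name
    cost_per_hour: float  # price quoted by the provider
    est_hours: float      # estimated run time on this resource

    @property
    def total_cost(self) -> float:
        return self.cost_per_hour * self.est_hours


def pair(offers: Iterable[Offer],
         deadline_hours: float,
         budget: float) -> Optional[Offer]:
    """Pick the cheapest offer that meets both the deadline and the
    budget; break cost ties in favor of the faster offer."""
    feasible = [
        o for o in offers
        if o.est_hours <= deadline_hours and o.total_cost <= budget
    ]
    if not feasible:
        return None  # the broker reports that no resource qualifies
    return min(feasible, key=lambda o: (o.total_cost, o.est_hours))
```

Returning `None` rather than the "least bad" offer keeps the user's constraints hard, which matches the idea of feeding availability back to the user.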
Load balancing
Load balancing features must always be integrated into any system in order to avoid processing delays and overcommitment of resources.
Load balancing may be built in connection with the resource broker and schedulers.
Load balancing involves partitioning of jobs, identifying the resources, and queueing of the jobs.
Load balancing (contd.)
It is used to run jobs in parallel.
It supports failure detection and management.
It redistributes jobs to other resources if needed.
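Queueing jobs on the least-loaded resource and redistributing them after a failure can be sketched in a few lines. The function names `assign` and `redistribute`, and the use of queue length as the load measure, are assumptions for illustration; failure detection itself is outside the sketch and is represented only by the name of the failed node.

```python
from typing import Dict, List


def assign(jobs: List[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Queue each job on the node whose queue is currently shortest."""
    queues: Dict[str, List[str]] = {node: [] for node in nodes}
    for job in jobs:
        target = min(nodes, key=lambda node: len(queues[node]))
        queues[target].append(job)
    return queues


def redistribute(queues: Dict[str, List[str]],
                 failed: str) -> Dict[str, List[str]]:
    """Return new queues with the failed node's jobs moved to survivors."""
    survivors = {n: list(q) for n, q in queues.items() if n != failed}
    if not survivors:
        raise RuntimeError("no surviving node to take over the jobs")
    for job in queues[failed]:
        target = min(survivors, key=lambda n: len(survivors[n]))
        survivors[target].append(job)
    return survivors
```

For example, `redistribute(assign(["j1", "j2", "j3"], ["a", "b"]), "a")` leaves every job queued on node `b`.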
Grid portals
Grid portals are like web portals: they provide uniform access to grid resources.
Grid portals provide:
1. Resource access
2. Scheduling capabilities
3. Monitoring of status information
Grid portals (contd.)
Some examples of grid portal capabilities are:
1. Querying databases
2. File transfer facilities
3. Managing jobs through job status feedback
4. Security management
5. Providing personalized solutions
Integrated solutions
An integrated solution is the combination of existing advanced middleware and application functionalities to provide high-performance results across the grid computing environment.
It supports more complex utilization of the grid, such as coordinated and optimized resource sharing, enhanced security management, cost optimization, etc.
It achieves this level of flexibility by utilizing the infrastructure provided by application and middleware frameworks.
GRID INFRASTRUCTURE
The grid infrastructure forms the core foundation for successful grid applications.
Grid computing infrastructure components must address several potentially complicated areas in many stages of implementation:
1. Security
2. Resource management
3. Information services
4. Data management
[Diagram: GRID APPLICATIONS at the top; GRID MIDDLEWARE on either side of SECURITY, RESOURCE MANAGEMENT, INFORMATION SERVICES, and DATA MANAGEMENT; HOSTING ENVIRONMENT at the bottom.]
Security
The heterogeneous nature of resources and their complicated policies lead to complex security schemes.
These computing resources are hosted in differing security domains and on heterogeneous platforms.
Security requirements: data integrity, confidentiality, and information privacy.
Security (contd.)
Grid computing data exchange must be protected using secure communication channels, including SSL/TLS.
Secure message exchange mechanisms such as WS-Security.
Security infrastructure: Grid Security Infrastructure (GSI).
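As a minimal illustration of the secure communication channel mentioned above, the sketch below opens a TLS-protected socket with Python's standard `ssl` module. It shows plain TLS only; it is not GSI and not WS-Security, and the helper names `make_client_context` and `open_secure_channel` as well as the choice of TLS 1.2 as the minimum version are assumptions made for illustration.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_secure_channel(host: str, port: int,
                        timeout: float = 10.0) -> ssl.SSLSocket:
    """Connect to host:port and wrap the connection in TLS."""
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # server_hostname enables hostname checking against the certificate.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

Data sent over the returned socket is encrypted and integrity-protected in transit, which addresses the confidentiality and integrity requirements for the channel but not authorization or credential delegation.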
Resource management
The resource management area covers the selection of the correct resource from the grid resource pool.
Selection is based entirely on the SLA.
Information services
Provide valuable information respective to grid computing infrastructure resources.
These services depend entirely on resource availability, capacity, and utilization.
The information is valuable and mandatory feedback respective to resource managers.
Grid solutions can be constructed to reflect portals.
Metrics are helpful in managing SLAs.
Data management
Data forms the single most important asset in a grid computing system.
Data may be input to a resource or output from a resource.
Data must be near the computation where it is used.
Data storage mechanisms: Storage Area Network (SAN), network file system, virtual database.
Data management (contd.)
Developers and providers must factor several considerations into selecting the most appropriate data management mechanism for the grid computing infrastructure. These include:
1. Size of the data repositories
2. Geographical distribution of resources
3. Security requirements
4. Schemes for replication
5. Caching facilities