Colorado Software Summit: October 24 – 29, 2004 © Copyright 2004, IBM Corporation
Paul Giangarra — Grid Computing, SAO, and Autonomic Computing Page 1
Grid Computing, SAO, and Autonomic Computing
Paul Giangarra
Sr. Technical Staff Member
e-mail: [email protected]
Agenda
• Grid Computing, a Brief Introduction
• Grid Computing Core Concepts
• Grid Computing Standards and Architecture
• Information and Grid Computing
• Autonomic Computing and Grid Computing
• Service Oriented Architecture and Grid Computing
• (Now What do I do With All This?)
• The Realm of the Possible
• Summary and Questions
What’s the Problem?
Grid Problem:
• Provide for flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions & resources (a.k.a. virtual organizations)
• This includes unique authentication, authorization, resource access, and resource discovery
Grid Challenge:
• Create an architecture and solution set based on open standards that exploits existing technologies, where they exist, to solve this
See: The Anatomy of the Grid by Foster, Kesselman, Tuecke
What Is NOT a Grid?
• The 8:00 AM rush hour (that’s gridlock)
• A bunch of PCs on a network (it’s a lot more than that)
• A cluster, a network attached storage device, a scientific instrument, a network, etc. (each is an important component of a Grid, but by itself each does not constitute a Grid)
KEY: Grid Computing is NOT a silver bullet!
So, What Is a Grid?
More correctly, what is Grid Computing?
• Based on services-oriented architecture
• Based on standard, open, general-purpose protocols and interfaces
Grid Computing, Services, and Technologies:
• Help coordinate and manage disparate and possibly heterogeneous resources that are not subject to centralized control
• Can be used to deliver non-trivial quantities of service
• Can be used to aggregate disparate IT elements such as compute resources, data storage and filing systems to create a single, unified virtual system
What Is Grid Computing?
[Figure: Microcosm, a pre-Internet “system” made up of applications, data, storage, processing, I/O, and an operating system]
What Is Grid Computing?
[Figure: Macrocosm, distributed resources and applications (applications, data, storage, processing, I/O, operating system) presented as a single unified image]
Grid Computing Enables
Distributed computing across networks using open standards supporting heterogeneous resources by providing facilities for:
• Virtualized Sharing of Resources
• Virtual Organizations & Collaboration
• Autonomic Management of Resources
• Quality of Service & Optimization
• Secure Reliable Access to Resources
• On Demand Computing and Utility Models
Grid Computing, SAO, and Autonomic Computing
Grid Computing Core Concepts
Grid Computing Value Proposition
3 Models and Unique Value Propositions
• On Demand: “Access data & processing capabilities in a utility-like fashion… Make vs. Buy”
• Processing: “Aggregate processing power from a distributed collection of heterogeneous systems”
• Data: “Secure access and sharing of distributed data & information in a collaborative fashion”
• Resiliency: “Improve the quality of service of distributed systems, despite unplanned events”
Results:
• Increased: resource use, flexibility, productivity, reliability/availability
• Decreased: complexity, total cost of ownership
Grid Computing Resources & Types
Grid Resources
• Computation
• Storage
• Data
• Applications
• Communication (I/O)
• Software & Licenses
• Special equipment, capacities, architectures, & policies
Grid Types
• Collaboration Grid
• Compute Grids
  • Desktop Scavenging
  • Server
• Data/Information Grids
  • Content
  • Data
  • File
  • Storage
Grid Resources Virtualized Across the Grid Types
Grid Deployment Options
A Function of Business Need, Technology and Organizational Flexibility
1. Intra-Grids
2. Extra-Grids
3. Inter-Grids
[Figure: three grids, each with its own NAS/SAN storage, connected through a VPN; the slide builds up from intra-grids to extra-grids to inter-grids]
Motivations for Grid Computing
• Support Heterogeneous Systems
• Enable Collaboration
• Reduce Time to Results
• Increase Capacity
• Improve Efficiency / Reduce Costs
• Provide Reliability & Availability
Motivations for Grid Computing
Increase Capacity
Exploit distributed resources to provide capacity for high-demand applications
• Existing applications that cannot be run effectively on a single processor
• New large-scale applications that provide strategic business advantages
Motivations for Grid Computing
Increase Capacity
• Exploit distributed resources to provide capacity for high-demand applications
Improve Efficiency / Reduce Costs
• Reduce infrastructure cost associated with over-provisioned resources
• Reduce the cost of manpower to manage and configure resources
Motivations for Grid Computing
Provide Reliability / Availability
• Use distributed resources
• Monitor work progress
• Restart failed jobs
[Figure: a job scheduler dispatches Job 1, Job 2, and Job 3 to IBM servers; Job 1 hits a timeout and goes through recovery / restart]
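The monitor-and-restart behaviour on this slide can be sketched in a few lines. This is a minimal illustration, not any particular scheduler's API: the `Job`, `JobTimeout`, and `schedule` names, the node names, and the round-robin retry policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


class JobTimeout(Exception):
    """Raised by a (simulated) node when a job does not finish in time."""


@dataclass
class Job:
    name: str
    run: Callable[[str], str]  # takes a node name, returns a result


def schedule(jobs: List[Job], nodes: List[str], max_attempts: int = 3) -> Dict[str, str]:
    """Run each job, moving it to the next node whenever an attempt times out."""
    results: Dict[str, str] = {}
    for index, job in enumerate(jobs):
        for attempt in range(max_attempts):
            node = nodes[(index + attempt) % len(nodes)]
            try:
                results[job.name] = job.run(node)
                break  # job finished, stop retrying
            except JobTimeout:
                continue  # recovery: restart the job on another node
        else:
            raise RuntimeError(f"{job.name} failed after {max_attempts} attempts")
    return results


def flaky(node: str) -> str:
    # Hypothetical job that always times out on node-a.
    if node == "node-a":
        raise JobTimeout(node)
    return f"done on {node}"


def steady(node: str) -> str:
    return f"done on {node}"


results = schedule(
    [Job("JOB 1", flaky), Job("JOB 2", steady), Job("JOB 3", steady)],
    ["node-a", "node-b", "node-c"],
)
```

Here JOB 1 times out on its first node and is restarted on the next one, while the other jobs complete on their first attempt.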
Motivations for Grid Computing
Reduce “Time to Results”
• Exploit opportunities for parallel computing to allow business critical computation to be completed in a timely fashion
• Gain competitive advantage by allowing computation to be executed more frequently and on customer demand
• Deliver real-time results to internal and external customers
[Figure: numbered work units (1 to 11) shown as serial execution versus parallel execution on a calendar running from March 27 to March 29]
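The serial versus parallel contrast can be shown with independent work units. This is only a sketch: `run_unit` is a hypothetical stand-in for a business computation, and a local thread pool stands in for grid nodes (on a real grid each unit would go to a separate machine, and CPU-bound Python threads would not by themselves run faster).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_unit(unit: int) -> int:
    """Hypothetical independent work unit."""
    return sum(i * i for i in range(unit * 1000))


work_units: List[int] = list(range(1, 12))  # eleven units, as in the figure

# Serial execution: one unit after another.
serial_results = [run_unit(u) for u in work_units]

# Parallel execution: units are handed to a pool of workers.
# pool.map returns results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_results = list(pool.map(run_unit, work_units))
```

Because the units do not depend on each other, both schedules produce the same results; only the elapsed time differs.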
Motivations for Grid Computing
Provide Reliability / Availability
• Use distributed resources
• Monitor work progress
• Restart failed jobs
Support Heterogeneous Systems
• Different hardware, system platforms, and available middleware
• Specialized equipment
[Figure: IBM servers running Linux / z/OS and AIX / Linux, including a row of pSeries servers]
Motivations for Grid Computing
Enable Collaborations
• Enable collaboration across applications to integrate results
• Support large multi-disciplinary collaborations
• Both within a single organization and between partners
[Figure: Air Force, Army, and Navy collaborating on C2C and Mission Planning]
Grid Computing, SAO, and Autonomic Computing
Grid Computing Standards and Architecture
The Value of Open Standards
• Networking: The Internet (TCP/IP)
• Communications: e-mail (POP3, SMTP, MIME)
• Information: World-wide Web (HTML, HTTP, J2EE, XML)
• Applications: Web Services (SOAP, WSDL, UDDI)
• Distributed Computing: Grid (Globus / OGSA)
• Operating System: Linux
Cooperation on Standards
[Figure: logos of the cooperating vendors; only the word “Microsystems” survives in the extracted text]
Core Web Services Technologies
• WSDL: describes what the service is and how to use it (XML document)
• UDDI (optional): yellow pages for web services (Universal Description, Discovery and Integration directory)
• SOAP: connects to the service (“the envelope”)
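To make “the envelope” concrete, here is a minimal sketch that builds a SOAP 1.1 request with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the `GetQuote` operation, and the `symbol` parameter are hypothetical and would normally come from the service's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/quote"  # hypothetical service namespace


def build_request(symbol: str) -> bytes:
    """Wrap a hypothetical GetQuote call in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{SERVICE_NS}}}GetQuote")
    ET.SubElement(request, f"{{{SERVICE_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


message = build_request("IBM")

# A receiver parses the envelope and reads the payload out of the Body.
parsed = ET.fromstring(message)
symbol_path = f"{{{SOAP_ENV}}}Body/{{{SERVICE_NS}}}GetQuote/{{{SERVICE_NS}}}symbol"
symbol = parsed.find(symbol_path).text
```

The envelope itself says nothing about transport; the same bytes could travel over HTTP, SMTP, or a message queue.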
Making Web Services Work
Open standards-based technology for flexible integration
Value Proposition
• Increase business flexibility through standardized services
• Enabling the ecosystem
• Extend IT Infrastructure to suppliers and Business Partners
• Radical reduction in complexity of integration
• Leverage existing investments and skills
IBM provides the industry's broadest support for Web services
• Development Lifecycle
• Transaction Services
• Information Integration
• Collaboration Services
• Management Services
IBM Software Activities
• Drive definition, adoption and interoperability of Web services
Basic Profile 1.1 - Final Specification published August 24, 2004
Open Grid Services Architecture (OGSA)
Objectives:
• Manage resources across distributed heterogeneous platforms
• Deliver seamless QoS
• Provide a common base for autonomic management solutions
• Define open, published interfaces
• Exploit industry-standard integration technologies
  • Web Services: SOAP, XML, WSDL, WS-Security, UDDI…
• Integrate with existing IT resources
Web Services “Stack”
[Figure: the Web services stack, top to bottom]
• Components: Composite (BPEL4WS), Atomic
• Quality of Service: WS-Coordination, WS-Transactions, WS-Security, WS-Reliable Messaging
• Description: WSDL, WS-Policy
• Messaging: SOAP; RMI/IIOP, JMS, …
• Transport: HTTP(S), SMTP, FTP, BEEP, TCP/IP, …
• Alongside the layers: UDDI, WS-Addressing, WS-Inspection
Grid Protocol vs. Internet Protocol
[Figure: the two layered architectures side by side, each listed top to bottom]
• Grid Protocol Architecture: Applications, Collective, Resource, Connectivity, Fabric
• Internet Protocol Architecture: Applications, Transport, Internet, Link
Grid Computing Protocol Architecture
The layered Grid Computing protocol architecture is based on Open Standards
Layers, top to bottom: Applications, Collective, Resource, Connectivity, Fabric
• Resource and Connectivity protocols, which facilitate the sharing of resources
• Build on capabilities provided by lower layers
Design goals:
• Place few constraints on implementation
• Focus on small set of core abstractions
• Emphasize identification and definition of protocols and services
• Identify and define APIs and SDKs
• Provide for a Secure Environment
Grid Protocol – Fabric
• Provides the resources to which shared access is mediated by Grid protocols
• Examples include computational resources, storage systems, catalogs, or network resources
• Includes logical resources such as distributed file systems and clusters
• Resources implement inquiry mechanisms that permit discovery of their structure, state, and capabilities
Grid Protocol – Connectivity
• Defines core communication and authentication protocols required for Grid-specific network transactions
• Communication protocols enable the exchange of data between Fabric layer resources
  • Transport, Routing, Naming
• Authentication protocols build on communication services
  • Provide cryptographically secure mechanisms for verifying the identity of users and resources
  • Asymmetric cryptography
  • Single Sign On, Delegation, Security Integration, Trust Relationships
Grid Protocol – Resource
• Builds on Connectivity layer communication and authentication protocols
• Defines protocols (and APIs and SDKs) for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources
• Concerned entirely with individual resources
• Ignores issues of global state and atomic actions across distributed collections
[Figure: API/SDK surrounded by Negotiation, Initiation, Monitor, Control, Accounting, and Payment]
Grid Protocol – Collective
Protocols and services (and APIs and SDKs) that are not associated with any one specific resource but rather are global in nature and capture interactions across collections of resources
• Directory services
• Co-allocation, scheduling, and brokering services
• Monitoring and diagnostic services
• Data replication services
• Grid-enabled programming systems
• Workload management
• Community authorization and accounting
• Software discovery services
• Collaborative services
Grid Protocol – Application Layer
Putting it all together
• Application layer: Grid-enabled, secure, and scalable Virtual Organizations
• Collective layer: global interactions and services
• Resource layer: resource management services
• Connectivity layer: security, transport, routing
• Fabric layer: physical resources
Example applications: Interagency Collaborative Data Grid; Compute Intensive Simulation; Weather Simulation and Modeling; Utility compute providers; HA Operational Support Systems; B2B Hubs and trading networks
OGSA – Open Grid Services Architecture
Open Standards Based Architecture: 2003
[Figure: the Open Grid Services Architecture (OGSA) stack, top to bottom, with the slide's annotations]
• Applications: applications & systems built on standards
• OGSA Architected Services (Grid Core Services, Grid Data Services, Grid Program Execution Services, Domain Specific Services): open and value-added vendor implementations
• OGSI – Open Grid Services Infrastructure: open architecture for interoperability
• Web Services: support for web services on a variety of platforms, languages and protocols
• Enabled “general purpose” middleware (each OGSA Enabled): Messaging, Directory, File Systems, Database, Workflow, Security
• Enabled Hardware and Operating System Platforms (each OGSA Enabled): Network, Storage, Servers
WS-RF & WS-Notification and OGSA
• OGSA services can be defined and implemented as Web services
• OGSA can take advantage of other Web services standards
• OGSA can be implemented using standard Web services development tools
• Grid applications will NOT require special Web services infrastructure
WS-Resource Framework & WS-Notification are an evolution of OGSI
[Figure: layered stack, top to bottom: Applications; OGSA Architected Services; OGSI – Open Grid Services Infrastructure; Web Services; OGSA Enabled Messaging, Directory, File Systems, Database, Workflow, and Security; OGSA Enabled Network, Storage, and Servers]
[Figure: Modeling Stateful Resources with Web Services: WS-ServiceGroup, WS-RenewableReferences, WS-Notification, WS-BaseFaults, WS-ResourceProperties, WS-ResourceLifetime]
OGSA Structure
Web Services: dynamic, addressable, stateful, manageable
• WS-Addressing, WS-Policy, WS-Coordination, WS-Security, WS-Trust
OGSA Architected Services: Grid Core Services, Grid Program Execution Services, Grid Data Services
Domain Specific Services
Grid Core Services
• Service Management
  • Registries and Discovery Services (SG)
  • Attribute Propagation and Query
  • Service Domain
  • Service Orchestration
  • Metering & Accounting
  • Installation & Deployment
• Service Communication
  • Messaging and Queuing Services
  • Event Services
  • Distributed Secure Logging Service
• Policy Management
  • Policy Service Manager
  • Policy Agent
  • Policy Transformation Service
  • Policy Resolution Service
  • Policy Validation Service
  • Policy Administration Services and Negotiation Framework
• Security
  • Authentication
  • Authorization & Access Control
  • Credential Validation & Transformation
  • Trust Broker
Grid Program Execution Services
• Job Scheduler & Queuing Services
• Resource Reservation Services
• Workload Managers and Micro-Scheduling Services
Grid Data Services
• Data Access Services
• Data Transformation & Federation Services
• Data Replication Service
• Data Caching Service
• MetaData Catalog Services
Meta OS Grid Services: Service Collections, Job Scheduling, File Transfer, Data Replication, Provisioning, Logging, Problem Determination, Resource Management, Cluster Management, Policy
Security APIs
• globus_gss_assist - simplifies the use of the GSSAPI in the Globus environment [1.1.x, 2.0]
• GSS API - the Generic Security Service API C bindings (IETF draft) [version 2]
Information Service APIs
• OpenLDAP - an API for the LDAP protocol used by MDS (developed by the OpenLDAP Project) [version 1.2]
Communication APIs
• globus_io - provides high-performance I/O with integrated security and a socket-like interface [1.1.x, 2.0]
• globus_nexus - provides multithreaded, asynchronous, thread-safe multiprotocol communication facilities [1.1.x, 2.0]
• globus_nexus_fd - provides NEXUS-based support for file descriptors and timed events (This API is obsolete as of release 1.1.2. We recommend use of globus_io instead.) [1.1.1]
Data Access APIs
• globus_ftp_control - provides low-level services for implementing FTP client and servers [2.0]
• globus_ftp_client - provides a convenient way of accessing files on remote FTP servers [2.0]
• globus_gass_copy - provides a uniform interface for accessing files using a variety of protocols [2.0]
• globus_gass - provides clients with access to remote files [1.1.x]
• globus_gass_transfer - provides an API for clients and servers involved in GASS data transfer
• globus_gass_cache - manages the local GASS cache on a client system [1.1.x, 2.0]
• globus_gass_server_ez - provides a simple set of GASS server capabilities [1.1.x, 2.0]
• globus_gass_server - provides GASS server functionality (This API is obsolete as of release 1.1.2. We recommend use of globus_gass_transfer instead.) [1.1.1]
• globus_gass_client - allows clients to get and put remote files via several protocols (This API is obsolete as of release 1.1.2. We recommend use of globus_gass_transfer instead.) [1.1.1]
Data Management APIs
• globus_replica_catalog - provides an interface to a catalog of data collections, logical files, and physical locations [2.0]
• globus_replica_management - allows clients to manage files within a file replication system [2.0]
Resource Management APIs
• globus_gram_client - provides remote job submission and management capabilities [1.1.x, 2.0]
• globus_gram_myjob - provides a basic communication mechanism for processes within a GRAM job [1.1.x, 2.0]
• globus_gram_jobmanager - provides a simple, consistent way to interact locally with a variety of schedulers such as LSF, LoadLeveler, PBS, Condor, etc. [1.1.x, 2.0]
• globus_duroc - provides resource coallocation services for starting distributed jobs [1.1.x, 2.0]
Fault Detection APIs
• globus_hbm_client - allows a client process to be monitored by a Heartbeat Monitor system [1.1.x]
• globus_hbm_datacollector - allows clients to monitor multiple processes and enables the notification of exceptions [1.1.x]
Portability APIs
• globus_module - provides a mechanism for activating and deactivating software modules [1.1.x, 2.0]
• globus_libc - provides a portable implementation of libc [1.1.x, 2.0]
• globus_thread - implements threads and synchronization mechanisms [1.1.x, 2.0]
• globus_dc - provides cross-platform data conversion services
• globus_utp - supports the use of timers for monitoring applications and other programs [1.1.x, 2.0]
• globus_list - support for linked lists [1.1.x, 2.0]
• globus_fifo - supports first-in-first-out queues [1.1.x, 2.0]
• globus_hashtable - supports hash tables [1.1.x, 2.0]
• globus_url - supports URL strings [2.0]
• globus_error - provides an abstract error type for function return codes
• globus_poll - supports polling on I/O channels
see: http://www.globus.org/developer/api-reference.html
Recent Developments (Jan 20, 2004)
WS-Resource Framework & WS-Notification announced January 20th, 2004 at Globus World in San Francisco
• Proposals to extend Web services
  • Modeling Stateful Resources with Web Services
• Driven by requirements from:
  • Grid computing
  • Systems Management
  • Business computing
[Figure: Modeling Stateful Resources with Web Services: WS-ServiceGroup, WS-RenewableReferences, WS-Notification, WS-BaseFaults, WS-ResourceProperties, WS-ResourceLifetime]
What Was Announced
• A family of Web services specification proposals
• Introduces a design pattern to specify how to use Web services to access “stateful” components
• Introduces message-based publish-subscribe to Web services
[Figure: Modeling Stateful Resources with Web Services: WS-ServiceGroup, WS-RenewableReferences, WS-Notification, WS-BaseFaults, WS-ResourceProperties, WS-ResourceLifetime; each specification is marked either “Introduced in Jan” or “To be developed”]
What Was Announced
WS-Notification
• Provides a publish-subscribe messaging capability for Web Services
WS-Resource Framework
• There are many possible ways Web services might model, access and manage state
• WS-RF is a family of Web services specifications that clarify how “state” and Web services combine
Both:
• Build upon existing Web services specifications and technology
• Help align Grid computing, Systems Management and Web services
Contributed to by:
• WS-Resource Framework: IBM, Globus, HP
• WS-Notification: IBM, Globus, Akamai, HP, SAP, Tibco, Sonic
The WS-Resource Framework Model
What is a WS-Resource?
Examples of WS-Resources:
• Physical entities (e.g. processor, communication link, disk drive) or logical constructs (e.g. agreement, running task, subscription)
• Real or virtual
• Static (long-lived, pre-existing) or dynamic (created and destroyed as needed)
• Simple (one), or compound (collection)
Unique: has a distinguishable identity and lifetime
The WS-Resource Framework Model
Architecture rationale
WS-Resource framework exploits WS-Addressing
• Web services and WS-Resources are referenced using an “Endpoint Reference”
• Services that create or locate WS-Resources return Endpoint References
Web service and WS-Resource are separate:
• A Web service is stateless
• A WS-Resource provides a context / mechanism for stateful execution
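The separation between a stateless service and the stateful resources it acts on can be sketched as follows. This is an illustration of the pattern only, not the WS-RF or WS-Addressing wire format: the `EndpointReference` fields, the `CounterService` class, the example address, and the in-memory dictionary standing in for the resource store are all assumptions.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EndpointReference:
    address: str      # where the Web service listens
    resource_id: str  # identifies one WS-Resource behind that service


class CounterService:
    """Stateless service: every operation is told which resource to act on."""

    def __init__(self, address: str, store: Dict[str, int]) -> None:
        self._address = address
        self._store = store  # resource state lives outside the service logic

    def create(self) -> EndpointReference:
        """Create a new resource and return a reference to it."""
        resource_id = uuid.uuid4().hex
        self._store[resource_id] = 0
        return EndpointReference(self._address, resource_id)

    def increment(self, epr: EndpointReference) -> int:
        self._store[epr.resource_id] += 1
        return self._store[epr.resource_id]

    def destroy(self, epr: EndpointReference) -> None:
        """End the resource's lifetime."""
        del self._store[epr.resource_id]


store: Dict[str, int] = {}
service = CounterService("http://example.org/counter", store)

first = service.create()
second = service.create()
service.increment(first)
service.increment(first)
service.increment(second)
first_value = store[first.resource_id]
second_value = store[second.resource_id]
service.destroy(first)
```

Each resource has its own identity and lifetime, while the service itself holds nothing between calls beyond a handle to the store.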
WS-Notification
Brings enterprise quality publish and subscribe messaging to Web services
• Loosely coupled, asynchronous messaging in a Web services context
• Composes with other Web services technologies
• Facilitates integration between different messaging middleware environments
Exploits WS-Resource Framework and Web services technologies
Standardizes the role of Brokers, Publishers, Subscribers and Consumers
Provides two forms of publish/subscribe: direct publishing and brokered publishing
Standardizes Web service message exchanges for publishing, subscribing and notification delivery
Defines XML model of Topics and TopicSpaces to categorize and organize notification messages
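The two forms of publish/subscribe can be sketched with plain objects. This shows the roles only, not the WS-Notification message formats: the `NotificationProducer` class, the topic names, and the callback signature are assumptions for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Consumer = Callable[[str, str], None]  # receives (topic, message)


class NotificationProducer:
    """Keeps subscriptions per topic and delivers notifications to consumers."""

    def __init__(self) -> None:
        self._subscriptions: DefaultDict[str, List[Consumer]] = defaultdict(list)

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        self._subscriptions[topic].append(consumer)

    def notify(self, topic: str, message: str) -> None:
        for consumer in self._subscriptions.get(topic, []):
            consumer(topic, message)


received: List[Tuple[str, str]] = []


def log_consumer(topic: str, message: str) -> None:
    received.append((topic, message))


# Direct publishing: the subscriber registers with the publisher itself.
scheduler = NotificationProducer()
scheduler.subscribe("jobs/failed", log_consumer)
scheduler.notify("jobs/failed", "JOB 1 timed out")

# Brokered publishing: publisher and subscriber only know the broker.
broker = NotificationProducer()
broker.subscribe("jobs/done", log_consumer)


def publish_via_broker(topic: str, message: str) -> None:
    broker.notify(topic, message)


publish_via_broker("jobs/done", "JOB 2 finished")
publish_via_broker("jobs/ignored", "nobody subscribed to this topic")
```

In the brokered case the publisher needs no knowledge of who is listening, which is what keeps the parties loosely coupled.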
Open Grid Services Infrastructure (OGSI)
Grid Service Implementation Independence
[Figure: the abstract service interface remains the same across implementations, whatever the hosting environment, other middleware, operating system, and hardware beneath it]
Open Grid Services Infrastructure (OGSI)
Grid Service Implementation – Examples
[Figure: the abstract service interface remains the same; example implementations include a File Transfer Service in a J2EE hosting environment, a file system, a storage system (NAS/SAN), and a database (DB2), layered over other middleware, the operating system, and hardware]
Grid Computing, SAO, and Autonomic Computing
Information…… and Grid Computing
Managing Information at Different Levels
Data
• Global Naming
• Meta-data and catalog
• Federation and Transformation
File
• Distributed File Systems / Remote Access
• File Transfer / Data Replication
• Caching
Storage
• NAS / SAN “Storage Cluster”
• Automatic or Dynamic provisioning of storage
• Support for hierarchy management
IBM Products for an Information Grid

Level: Data
Product: DB2 Information Integrator
  Features: Federated data server, replication server
  Benefits: Query and access distributed data without requiring central repository. Supports movement of data from mixed relational data sources.
Product: DB2 UDB
  Features: Relational database that runs on Linux, Unix, Windows, z/OS, and OS/390
  Benefits: Manageability features, Integrated Information capabilities via Web Services, Integrated business intelligence, and more
Product: Avaki Data Grid 5.0*
  Features: Data catalog, data provisioning, reusable data integrations, caching capabilities.
  Benefits: Provisioning, access, and integration of data from multiple, heterogeneous, distributed sources.

Level: File
Product: GPFS (General Parallel File System)
  Features: Cluster based, shared disk, parallel file system. Data and metadata can flow to all nodes and all disks in parallel. Featured in HPC environments. Available on pSeries and Linux clusters.
  Benefits: Not a client-server file system like NFS, DFS, or AFS: no single server bottleneck, no protocol overhead for data transfer.
Product: NFS v4
  Features: Provides scalable access to GPFS from outside cluster. GPFS + NFSv4 provides the performance of a SAN File System scalable to a WAN.
  Benefits: Security and access control in a grid environment.
Product: SAN File System
  Features: Provides a common file system specifically designed for storage networks. Manages the metadata on the storage network instead of within individual network servers.
  Benefits: Provides high performance access to data and enables sharing across heterogeneous application servers. Allows applications on any server within the SAN to access any file in the network without making changes to the application.

Level: Storage
Product: SAN Volume Controller
  Features: Creates pools of managed disks spanning multiple storage subsystems. Includes dynamic data-migration function.
  Benefits: Centralized point of control for volume management. Allows administrators to migrate storage from one device to another without taking it offline.
Product: Tivoli Storage Resource Manager
  Features: Enterprise wide reporting, file level analysis, subsystem reporting, automated capacity provisioning
  Benefits: Storage on demand for file systems. Reclaim wasted space consumed by non-essential files. Ensure storage used efficiently for future capacity.
Product: Tivoli Storage Manager
  Features: Data backup/restore, data archive and retrieve
  Benefits: Centralized protection leading to faster backups and restores with less resources needed.

* Avaki is an IBM business partner
Grid Computing, SAO, and Autonomic Computing
Autonomic Computing… …and Grid Computing
Autonomic Computing Is
A continuously evolving and dynamic state that establishes the correct balance between what is managed by a person and what is managed by the system
Focus on business, not infrastructure
Why Autonomic Computing?
Heterogeneity
Large state space
Unpredictable human element
Unpredictable scalability
Continuous Change
Open-endedness
Connectedness
The interconnected characteristics of a complex system need systems level understanding with certain component and system characteristics:
Real-time
Self-adaptive
Self-organizing
Self-healing
Self-forming
Self-testing
Resilient
Self-managing Systems Deliver:
Increased Responsiveness: Adapt to dynamically changing environments
Business Resiliency: Discover, diagnose, and act to prevent disruptions
Operational Efficiency: Tune resources and balance workloads to maximize use of IT resources
Secure Information and Resources: Anticipate, detect, identify, and protect against attacks
“Autonomic computing allows companies to operate more efficiently and achieve more from their existing IT environments, enabling increased responsiveness, business continuance and availability.” — Rick Sturm
The Autonomic Element: Sense & Respond
An autonomic element contains a continuous control loop that monitors activities and takes action
Autonomic elements learn from past experience to build action plans
Managed elements are consistently monitored
[Figure: The autonomic computing control loop. Monitor, Analyze, Plan, and Execute surround shared Knowledge and connect to the managed Element through Sensors and Effectors.]
“IBM’s autonomic approach to automation goes well beyond integration to the truly intelligent, responsive and proactive capabilities needed to deliver e-business on demand.”
— Mark Hydar
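The monitor, analyze, plan, execute loop around shared knowledge can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the managed element is a hypothetical worker pool, its sensor reports a queue length, and the names `ManagedPool`, `AutonomicManager`, and `target_per_worker` are invented for the example and do not come from any IBM toolkit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ManagedPool:
    """Hypothetical managed element: a pool of workers serving a queue."""
    workers: int = 2
    queue_length: int = 0

    def sense(self) -> dict:
        # Sensor: expose the current state to the manager.
        return {"workers": self.workers, "queue": self.queue_length}

    def effect(self, workers: int) -> None:
        # Effector: apply a change requested by the manager.
        self.workers = workers


@dataclass
class AutonomicManager:
    element: ManagedPool
    target_per_worker: int = 10                          # assumed policy
    knowledge: List[dict] = field(default_factory=list)  # shared history

    def monitor(self) -> dict:
        reading = self.element.sense()
        self.knowledge.append(reading)
        return reading

    def analyze(self, reading: dict) -> int:
        # Workers needed to keep the queue at the target (ceiling division).
        return max(1, -(-reading["queue"] // self.target_per_worker))

    def plan(self, reading: dict, needed: int) -> Optional[int]:
        # Plan a change only when the current size differs from the need.
        return needed if needed != reading["workers"] else None

    def execute(self, change: Optional[int]) -> None:
        if change is not None:
            self.element.effect(change)

    def run_once(self) -> None:
        reading = self.monitor()
        self.execute(self.plan(reading, self.analyze(reading)))


pool = ManagedPool(workers=2, queue_length=45)
manager = AutonomicManager(pool)
manager.run_once()
print(pool.workers)
```

A real autonomic manager would run this loop continuously and would use the accumulated knowledge when planning; here the history is only recorded.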
Levels of Automation
Evolution not revolution
Level 1, Basic: Manual analysis and problem solving
Level 2, Managed: Centralized tools, manual actions
Level 3, Predictive: Cross-resource correlation and guidance
Level 4, Adaptive: System monitors, correlates and takes action
Level 5, Autonomic: Dynamic business policy based management
“Autonomic computing is a vision that will take several years to realize, but with the model that IBM has outlined, there are benefits attainable at every step, which pay you back... fairly quickly for the investments you make.”
— Mike Gilpin
Autonomic Computing: Self Managing Systems
Self-configuring: Adapt automatically to the dynamically changing environments
Self-healing: Discover, diagnose, and react to disruptions
Self-optimizing: Monitor and tune resources automatically
Self-protecting: Anticipate, detect, identify, and protect against attacks from anywhere
Autonomic Capabilities
OGSA Structure + Autonomic Backplane
Adaptive Grid
Grid Computing and the oDOE
Open: Linux, XML, WSDL, SOAP, OGSA
Autonomic: Self-protecting, Self-healing, Self-optimizing, Self-configuring
Virtualized
Integrated
Service-Oriented Architecture Evolution
Web Services: Standards-based info management framework
Complex Event Processing: Warfighter events pattern recognition
Enterprise Infrastructure: Distributed collaborative processing with discovery
Component Orchestration: Orchestration of C4ISR components
Semantic Web: Intelligent M2M collaboration
Service Oriented Architecture
Change of Paradigm at the core of Grid Computing
Services “encapsulate” heterogeneous resources
Services provide a compose-able, orchestrable, extensible base Common Resource Model (CRM) for abstractions key to manageability of resources
Simple Rules:
Any function is implemented once and once only as a Service
Services can be runtime or deployment-time re-used
Service providers and requesters are loosely bound:
• Each service is defined by an implementation independent interface.
• Services are defined in terms of common business function and data models.
• Communication protocols that emphasize interoperability and location transparency are used to mediate service interactions
Service “contract” can come with a QoS “clause” (SLA)
Anatomy of a Service Interface
Interface by contract
An explicit interface definition or contract is used to bind a service requestor and a service provider
Specifies explicitly only the mutual behaviour; specifies nothing about the implementation of the requestor or the provider
Allows either to change implementation or identity freely
Interface granularity
Based on Service Type
Examples:
• Business Process Services
• Business Transaction Services
• Business Function Services
• Technical Function Services
[Figure: SYSTEM 1 and SYSTEM 2 each keep their internal code and processes behind interface code; the two are bound by a CONTRACT of shared process and interface definitions]
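As a rough illustration of interface by contract, the sketch below uses a Python `Protocol` as the shared contract. The service name `QuoteService`, its `quote` method, and both providers are hypothetical and chosen only for the example. The requestor depends on the contract alone, so either provider can be swapped in without changing it.

```python
from typing import Dict, Protocol


class QuoteService(Protocol):
    """The contract: the only thing requestor and provider share."""

    def quote(self, symbol: str) -> float:
        ...


class TableQuoteProvider:
    """One provider implementation: looks prices up in a table."""

    def __init__(self, prices: Dict[str, float]) -> None:
        self._prices = dict(prices)

    def quote(self, symbol: str) -> float:
        return self._prices[symbol]


class FixedQuoteProvider:
    """A different implementation behind the same contract."""

    def __init__(self, price: float) -> None:
        self._price = price

    def quote(self, symbol: str) -> float:
        return self._price


def portfolio_value(service: QuoteService, holdings: Dict[str, int]) -> float:
    """Requestor: written against the contract, not a concrete provider."""
    return sum(service.quote(sym) * qty for sym, qty in holdings.items())


holdings = {"A": 10, "B": 1}
print(portfolio_value(TableQuoteProvider({"A": 2.0, "B": 3.0}), holdings))
print(portfolio_value(FixedQuoteProvider(1.5), holdings))
```

In a Grid or Web Services setting the contract would more likely be a WSDL description than a language-level type, but the separation of concerns is the same.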
Refactoring: Things to Deal With
Many Existing Applications are Monolithic or Tightly Coupled
Need to Re-Factor Applications
Some things to worry about are:
• Distributed threads
• Data locking
• Latency
Re-Hosting Applications
Exploit Meta-OS services
Achieve platform independence
Re-Factor for distributed parallel execution
Need for Re-Hosted Middleware
Ability to Exploit Grid computing services, e.g. Distributed Provisioning
Manage (and exploit) Quality of Service across the Grid
Challenge: Move to and Exploit Services Oriented Architecture
Can Your Application Benefit from Grid Computing?
How do you know if your application can benefit from Grid computing? Ask these questions:
Q. Is the application computationally intensive?
Q. Does it serve a distributed or collaborative community?
Q. Can the tasks or jobs the application performs run in parallel?
Q. Does the application do pattern matching?
Q. Does it have a reasonable network bandwidth profile?
A. If the answer to any or all of these is yes, then Grid-enablement is feasible.
Q. What is the application processing type (e.g., serial or batch)?
A. Batch is currently more amenable to Grid enablement.
Q. Do the operations within the task have time and/or sequencing dependencies?
A. The fewer dependencies, the better.
Q. What are the bottlenecks in the existing use of the application (e.g., single processor performance, scalability, memory, data output volume, pre/post processing)?
A. Grid can potentially address these bottlenecks.
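CPU – Make Execution Parallel
Rearranging computations to execute in parallel on Grid
[Figure: processors versus time. The serial application computes one long sum, 223+837+383+662+121+554+123+816+228+772+452+827+972+274+...+832+971+753+981+2282+23, on a single processor. The parallel application splits it into partial sums on separate processors (223+...+772, 452+...+845, 183+...+559, 884+...+121, 314+...+265, 271+...+173, 491+...+23), followed by a final sum, 2443+...+9772. "Parallel application done" is reached well before "Serial application done".]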
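The rearrangement in the figure can be sketched as follows: split the list of addends into contiguous chunks, sum each chunk independently, then add the partial results. This is a minimal sketch, not a Grid implementation. A local thread pool stands in for Grid nodes (and, in CPython, threads do not actually speed up CPU-bound work), and the helper names `chunked` and `parallel_sum` are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, n_chunks):
    """Split values into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_sum(values, n_workers=4):
    """Sum each chunk on its own worker, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum, chunked(values, n_workers)))
    return sum(partials)


# First ten addends from the slide's serial sum.
numbers = [223, 837, 383, 662, 121, 554, 123, 816, 228, 772]
print(parallel_sum(numbers, n_workers=3))
```

Addition is associative, so the grouping does not change the result; that property is what makes this computation easy to spread across processors.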
CPU – Programming Code Control Graph
[Figure: control graph of a program, with nodes labeled Sequence, if, and Loop]
Rearranging computations
Separate subgraphs to run in parallel
Consider data dependencies
Change algorithms
Compute & Data Intensive Application
Video conversion problem
Capture video tape onto computer hard drive
• About 200 Megabytes per minute
• 25 Gigabytes for a 2 hour tape
Compress video and audio
• Can take days at higher quality level
Write VCD, SVCD, or DVD disk (650 MB to 4.7 GB)
Compute & Data Intensive Application
Single stream:
[Figure: VCR to HD capture (2 hours, 24 GB), then compression (30 hours) to HD (4.7 GB), then a final 10 minute step]
Using a Grid:
[Figure: VCR to HD capture (2 hours, 24 GB), data transfer to the hard drives of the Grid nodes (45 minutes at 100 Mb/s), compression on several nodes in parallel (much less than 30 hours), data transfer back to HD (9 minutes at 100 Mb/s, 4.7 GB), then a final 10 minute step]
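The sizes and transfer times in these figures can be checked with simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 Mb/s = 10^6 bits per second) and reads the link speed as 100 megabits per second; the function names are invented for the example. At full line rate the 24 GB and 4.7 GB transfers would take about 32 and 6.3 minutes, so the 45 and 9 minutes on the slide correspond to roughly 70 Mb/s of effective throughput. That gap may be an allowance for overhead, but the slide does not say.

```python
def capture_size_gb(mb_per_minute: float, minutes: float) -> float:
    """Size of the captured video in decimal gigabytes."""
    return mb_per_minute * minutes / 1000.0


def transfer_minutes(size_gb: float, link_mbps: float) -> float:
    """Ideal transfer time at full line rate, ignoring all overhead."""
    bits = size_gb * 1e9 * 8
    return bits / (link_mbps * 1e6) / 60


def effective_mbps(size_gb: float, minutes: float) -> float:
    """Throughput implied by moving size_gb in the given time."""
    return size_gb * 1e9 * 8 / (minutes * 60) / 1e6


print(capture_size_gb(200, 120))               # 2 hour tape at 200 MB/minute
print(round(transfer_minutes(24, 100), 1))     # ideal time for 24 GB
print(round(transfer_minutes(4.7, 100), 1))    # ideal time for 4.7 GB
print(round(effective_mbps(24, 45), 1))        # implied by the slide's 45 minutes
print(round(effective_mbps(4.7, 9), 1))        # implied by the slide's 9 minutes
```

Note that 200 MB per minute over 120 minutes gives 24 GB, which matches the diagram; the bullet text rounds this to about 25 GB.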
Compute & Data Intensive Application
Overlapping data transfer with capture and computing:
[Figure: VCR to HD capture (2 hours, 24 GB). While capture continues, data is transferred in pieces to the hard drives of the Grid nodes, each node compresses its piece, and the results are transferred back to HD (4.7 GB), followed by a final 10 minute step]
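The benefit of overlapping stages can be estimated with a simple pipeline model. This is a sketch under stated assumptions, not the presenter's calculation: the work is cut into equal chunks, each stage handles one chunk at a time, and the stage totals used here (120 minutes of capture, 45 minutes of transfer) are taken from the earlier figures. The chunk count of 12 and the function names are hypothetical.

```python
def sequential_minutes(stage_totals):
    """Each stage starts only after the previous stage has fully finished."""
    return sum(stage_totals)


def pipelined_minutes(stage_totals, n_chunks):
    """Equal chunks flow through the stages; stages overlap on different chunks.

    The first chunk passes through every stage once; every later chunk then
    finishes one slowest-stage interval after the one before it.
    """
    per_chunk = [total / n_chunks for total in stage_totals]
    return sum(per_chunk) + (n_chunks - 1) * max(per_chunk)


stages = [120, 45]  # capture, transfer (minutes)
print(sequential_minutes(stages))
print(pipelined_minutes(stages, n_chunks=12))
```

With these numbers the transfer is almost entirely hidden behind the capture: the overlapped run takes little longer than the capture alone. The same model extends to a compression stage once a per-chunk compression time is known.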
Six Strategies for Grid Application Enablement
Strategy 1: Batch Anywhere
Only the grid (not the application, the client, the user, or anything else) decides which node to use for the job
The machine submitting the job might not be a node in the grid
Example application: a query to determine whether a given number, x, is a prime number. More than one node in the grid can submit the same query. The grid returns the correct results to the submitter.
Strategy 2: Independent Concurrent Batch
Multiple independent instances of the same application run concurrently and independently without interference.
Independent jobs are common. For example, Job X for Account A can run concurrently with Job X for Account B. Databases and other resources don't have hot spots or deadlocks.
Strategy 3: Parallel Batch
Take each user's batch work, subdivide it, disperse it out to multiple nodes, collect it, and then aggregate the results.
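Strategy 1 can be illustrated with the prime-number query from the example. In the sketch below the submitter hands a job to the grid and gets back only the answer; the choice of node is made inside the grid. Everything here is a toy stand-in: `Node`, `Grid`, and the least-assigned placement rule are invented for the illustration, and the job runs locally rather than on a remote machine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    name: str
    assigned: int = 0  # jobs dispatched to this node so far


class Grid:
    """Toy grid: it alone decides which node gets each job."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def submit(self, job: Callable[[int], bool], arg: int) -> bool:
        # Assumed placement policy: the node with the fewest jobs so far.
        node = min(self.nodes, key=lambda n: n.assigned)
        node.assigned += 1
        # A real grid would run the job remotely on `node`.
        return job(arg)


def is_prime(x: int) -> bool:
    """Trial division; adequate for a small illustration."""
    if x < 2:
        return False
    if x < 4:
        return True
    if x % 2 == 0:
        return False
    i = 3
    while i * i <= x:
        if x % i == 0:
            return False
        i += 2
    return True


grid = Grid([Node("a"), Node("b"), Node("c")])
print([grid.submit(is_prime, x) for x in (97, 91, 2)])
print([(n.name, n.assigned) for n in grid.nodes])
```

The submitter never names a node, which is the defining property of Batch Anywhere. Strategy 3 would add a step that subdivides one job before submission and aggregates the partial results afterwards.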
Six Strategies for Grid Application Enablement
Strategies 4, 5, & 6 use services on the grid in order to get jobs done.
Strategy 4: Service
Focus on the transition from a batch to a service-oriented architecture
A follow-on to Independent Concurrent Batch
It is not assumed that each client subdivides its work and spreads it over multiple service instances
Strategy 5: Parallel Services
Service with the subdivided work model of Parallel Batch.
Provides multiple service instances
Permits these instances to be invoked in parallel on the client's behalf
Strategy 6: Tightly Coupled Parallel Programs
The domain of specialized applications in engineering, physics, and biological modeling, such as finite state analysis
Provides intense communications and synchronization between client and services and among services
From Enablement to Exploitation
Three Stages for Implementation
Run
Strategies 1 and 2, and the simplest form of Strategy 3, focus on the ability of an application to run in a grid.
Adapt
The more complex form of Strategy 3 as well as Strategies 4 and 5 significantly adapt the function and value of the business application by enabling it to use a grid without requiring many changes that are specific to grid middleware. The same application could be structured to run in a non-grid environment.
Exploit
Applications at Strategy 6 exploit the grid or cluster infrastructure for their operation because they were written from the start with a grid in mind. Strategy 6 applications cannot finish in a timely and successful manner without running in a grid.
See: http://www-106.ibm.com/developerworks/grid/library/gr-enable/
It’s Not Just Limited to Applications
Middleware
• Application Servers
• GPFS, Database, Transaction Managers
• Systems Management Software
• Collaborative Software
• …
Resources
• Processors
• Storage
• Network
• …
And more…
Example: GPFS Parallel Access
Parallel Cluster File System
Cluster – fabric-interconnected nodes (IP, SAN, …)
Shared disk – all data and metadata on fabric-attached disk
Parallel – data and metadata flow from all of the nodes to all of the disks in parallel under control of a distributed lock manager.
Fine grain locks – efficient sharing of individual files
[Figure: GPFS File System Nodes connected through a Switching fabric (System or storage area network) to Shared disks (SAN-attached or network block device)]
GPFS: Information Management For The Grid
Goal: sharing GPFS file systems over the WAN
WAN adds 10-60 ms latency… but under load, storage latency is much higher than this anyway!
New GPFS feature
GPFS NSD now allows both SAN and IP access to storage
SAN-attached nodes go direct
Non-SAN nodes use NSD over IP
Award winning demo at SC03
Work in progress
[Figure: SDSC, NCSA, and Sc2003 (Sc03) sites, each with Compute Nodes, NSD Servers, and a SAN, connected by Scinet, plus Visualization. File systems shown: /SDSC GPFS File System, /NCSA GPFS File System, /Sc2003 GPFS File System, /SDSC over SAN, /SDSC over WAN, /NCSA over SAN, /NCSA over WAN]
Some Important Infrastructure Considerations
Security
• Authentication/authorization
• Client and server concerns
Information services
• What resources exist, what is their state and how do I access them?
Data management
• How do I access, move, replicate data to where I need it?
Resource management
• How do I run a job and monitor its state?
Grid Computing, SAO, and Autonomic Computing
The Realm of the Possible
Why Are Customers Implementing Grid Computing Solutions?
Accelerate Business Processes
Grids provide the ability to shorten application run-times without upgrading existing servers. (e.g., Charles Schwab, MassMutual, RBC Insurance, Nippon Life Insurance, Royal Dutch Shell, EADS)
Ability to run new High Performance Computing (HPC) applications
Grid computing provides the opportunity to run new applications due to the cost effective grid virtual computing environment. (e.g., AIST, UMass, FNMOC, TeraGrid)
Data Sharing & Collaboration
Grid architecture provides the ability to store, share and analyze large volumes of data. (e.g., eDiamond, NDMA, WestGrid, CERN, European DataGrid, Kansai Electric)
Accelerate Research & Development
Grids provide Life Science companies the ability to speed up drug research & development. (e.g., Smallpox Grid, Aventis, Novartis)
I/T Optimization & Resiliency – Virtualization of Servers & Storage
Grid Computing – Industry Applications
Unique by Industry with Common Characteristics
[Figure: applications by industry on a shared Grid Infrastructure, with a Primary Focus marker.
Industries: Energy, Financial Services, Manufacturing, Life Sciences, Telco & Media, Government & Higher Education.
Applications: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis, Batch Throughput, Product Design, Process Simulation, Finite Element Analysis, Failure Analysis, Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing, Collaborative Research, Weather Analysis, High Energy Physics, Seismic Analysis, Reservoir Analysis, Bandwidth Consumption, Digital Rendering, Multiplayer Gaming]
IBM Grid Focus Areas and Information Grid
Virtualization of Compute and Information Resources
Sectors: Financial Services, Public, Industrial
Government Development Grid
Description: Create large-scale IT infrastructures to drive economic development and/or enable new government services
Information Grid: Provide large scale data sharing infrastructure for industry and scientific collaboration
Enterprise Optimization Grid
Description: Optimize computing and data assets to improve utilization, efficiency and business continuity
Information Grid: Virtualized distributed storage and data resources
Business Analytics Grid
Description: Enable faster and more comprehensive business planning and analysis through the sharing of data and computing power
Information Grid: Facilitating access to large scale data marts and broadly distributed client data.
Engineering and Design Grid
Description: Share data and computing power, for computing intensive engineering and scientific applications, to accelerate product design
Information Grid: Sharing design data across large multi-party projects.
Research and Development Grid
Description: Accelerate and enhance the R&D process by enabling the sharing of data and computing power seamlessly for research intensive applications
Information Grid: Sharing of public data sources. Also supports use of shared compute resources.
National Digital Mammographic Archive
Sponsored by: University of Pennsylvania
[Figure: network map. Labels shown: MREN, STARTAP, MAGPI, NCNI, SOX, ESNet, CANet*2, U Toronto, U Penn, U NC, Oak Ridge, U CHI, NGIX Chicago Peering Point, Indianapolis GigaPop, Atlanta GigaPop, New York GigaPop. Legend: Abilene Peer Network, Abilene Connector, Project Site]
NDMA: National Digital Mammographic Archive
Research & Development
Combines Grid Computing with Radiology to make breast cancer diagnosis faster and treatment more effective
IBM assisted with implementing a Grid infrastructure across the hospitals to manage and retrieve digital mammograms
Secure transmission of all patient records
Grid solution architecture includes:
IBM pSeries, xSeries, Linux, DB2, GPFS, Globus
WebSite: http://nscp.upenn.edu/NDMA
The TeraGrid – Extensible Terascale Facility
National Science Foundation Grid Computing project ($90M):
Connects nine (9) major supercomputing sites: NCSA, SDSC, Argonne NL, CalTech, PSC, UTexas, IndianaU, PurdueU, Oak Ridge NL
40 gigabit network backbone connecting the sites
20 Teraflops of computing power
1 Petabyte of disk accessible data storage
Accessible to thousands of scientists working on advanced research
Applications include:
• Real Time Brain Mapping
• Earthquake Modeling
• Molecular Dynamics simulation
• Mcell – Monte Carlo simulation of cellular micro physiology
• Encyclopedia of Life – Protein catalog
IBM project team and solution includes:
• IBM High Performance Computing (HPC) expertise
• IBM GPFS expertise
• IBM Linux Clusters – Itanium2 processors
• IBM Power4 processors – p690 Regattas
• IBM Grid Computing & Linux consulting services
UK eScience Grid
[Figure: map. Labels shown: CERN, Cambridge, Newcastle, Edinburgh, US Sites, EU, Glasgow, Cardiff, Southampton, London, Belfast, Dublin, Oxford, Manchester]
Multiple Grid Applications including:
• High Energy Physics
• Aircraft Engine Maintenance
• Combinatorial Chemistry
• Oceanographic study
• Particle Physics & Astronomy
• Biomolecular analysis
• Environmental simulation
Heterogeneous Grid:
• IBM, Sun, HP servers
• Linux, Globus, Condor, SRB
The SMALLPOX Research Grid
A massive distributed computing grid running a computational chemistry application to help fight the smallpox virus:
• Screened 35 million potential drug molecules
• Two (2) million computer processors in 200 countries were connected to this grid
The Grid architecture will reduce the time required to develop a commercial drug by several years:
• “In-Silico” Research
IBM collaborated with:
• United Devices
• Accelrys
• Evotec OAI
• US Department of Defense
• Oxford University
IBM provided the hardware and software for storing and analyzing the molecule screening results:
• p690, AIX, DB2, Linux
Butterfly.net
The Butterfly Grid:
• an end-to-end solution designed to support up to one million simultaneous users
• based on IBM WebSphere Application Server, DB2 and the Globus Toolkit
• running on IBM eServer xSeries clusters at an IBM e-business Hosting Center
Modeling and Simulation platform
The Butterfly Grid: Service Provider Program
Package:
• Butterfly server software suite
• Butterfly game admin apps
• Globus, provisioning, policy mgt, billing
Shared Grid or Dedicated
• Gamers
• Industrial
• Military
Notes
• Inter-node resource sharing
• Value-added broadband package
• SLAs, QoS guarantees, ratings/certification
Japan AIST (National Institute of Advanced Industrial Science & Technology)
Challenge
AIST, Japan’s largest national research organization, needed to provide an on-demand computing infrastructure which dynamically adapts to support various research requirements of its collaborators focusing on grid computing, life sciences, and nanotechnology.
Solution
Linux Cluster
• 2116 CPU AMD Opteron Cluster
• 520 CPU Intel Madison Cluster
Globus Toolkit 3.0 (OGSA)
One of the world’s most powerful Linux-based supercomputers
More than 11 trillion calculations per second
More powerful than the current third most powerful supercomputer in the world
[Figure: Advanced Computing Center connected over LAN and Internet. Labels shown: Collaborations, Government, Academia, Corporations, Other Research Institutes, Grid Technology, Life Science, Nanotechnology]
Grid Computing @ IBM
27 different geographic locations
137 end user teams
66 Grid applications
Heterogeneous platforms:
- Linux on x, z, p series
- AIX on pSeries
Globus 2.2 & Globus 3.0
[Figure: world map of locations]
Charlotte (1) 3
RTP (2) 7
Cambridge (2) 4
Hawthorne (2) 4
Poughkeepsie (4) 4
Somers (1) 1
Southbury (4) 13
Yorktown Heights (7) 28
Markham (2) 14
San Jose (7) 13
San Mateo (2) 7
Hursley (1) 3
London (1) 2
Montpellier (2) 10
Uithorn (1) 2
Boeblingen (2) 2
Zurich (1) 6
Haifa (1) 2
Austin (9) 58
Roanoke (2) 4
Bangalore (1) 1
Chiba (1) 2
Tokyo (2) 2
Taipei (2) 2
Rochester (3) 15
Chicago (1) 1
Sapporo (1) 4
Beijing (2) 7
IBM Grid Middleware – Product Roadmap
Grid Capabilities
Grid Services (OGSA) & Web Services
Scheduling
Information Virtualization
Provisioning
Workload Management
Billing and Metering
Transaction Management
TotalStorage
GridXpert
IBM Grid Toolbox
Grid Deployment Options
A Function of Business Need, Technology and Organizational Flexibility
1. Intra-Grids
2. Extra-Grids
3. Inter-Grids
[Figure: Grids with NAS/SAN storage, linked by VPN]
… Look Familiar?
… How About This?
Summary
Grid Computing is still evolving
It is built on existing and new open computing standards
It exploits existing components and technologies
It can be, and is being, used today
There are many ways and places to exploit Grid Computing
Make decisions based on “business” needs
IBM is leading with both products and services for Grid Computing
Thank You
Questions?
References (Articles and Publications)
M. Mitchell Waldrop, Grid Computing, MIT Technology Review, May 2002, pp. 30-37
I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid, http://www.globus.org/research/papers/anatomy.pdf
I. Foster, C. Kesselman, J. Nick, S. Tuecke, The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration, http://www.globus.org/research/papers/ogsa.pdf
I. Foster, C. Kesselman, eds., The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, San Francisco, Calif. (1999)
IBM Redbook: Introduction to Grid Computing with Globus, http://www.ibm.com/redbooks/
References (URLs)
IBM Grid Web Site: http://www.ibm.com/grid/
Globus: http://www.globus.org/
OGSA (Open Grid Services Architecture): http://www.globus.org/ogsa/
Global Grid Forum: http://www.gridforum.org
Grid Computing Planet: http://www.gridcomputingplanet.com
Grid Today Newsletter: http://www.gridtoday.com
NASA's Information Power Grid: http://www.ipg.nasa.gov
DOE Science Grid: http://www.doesciencegrid.org
Particle Physics Data Grid, PPDG: http://www.ppdg.net/
National Digital Mammographic Archive: http://www.isi.edu/us-uk.gridworkshop/presentations/hollebeek.pdf
NSF TeraGrid: http://www.teragrid.org/
NASA Information Power Grid: http://www.nas.nasa.gov/About/IPG/ipg.html
UK eScience Program: http://www.research-councils.ac.uk/escience/
UK e-Science Grid Program: http://www.escience-grid.org.uk/
e-Diamond: http://www.gridoutreach.org.uk/docs/pilots/ediamond.htm
European Union DataGrid Project: http://www.eu-datagrid.org/
Additional IBM Grid Information: Red Paper & Red Book
Download from http://www.redbooks.ibm.com
IBM RedBook: Grid Enabling Applications
Download from www.redbooks.ibm.com
References: http://www.ibm.com/developerworks/grid/library/gr-visual
developerWorks Journal, November 2003 Issue
Good reference for IBM and customer technical people
Covers some of the same material as this presentation
References: (Continued)
http://www.varbusiness.com/sections/news/breakingnews.asp?articleid=45311
varBusiness, October 27, 2003 Issue