http://www.cs.utk.edu/netsolve
NetSolve Happenings
A Progress Report of the NetSolve Grid Computing System
Cluster and Computational Grids for Scientific Computing
September 24-27, 2000
Le Château de Faverges de la Tour, Lyon, France.
Outline
• The Grid.
• NetSolve Overview.
• The Key to Success:
  – Interoperability.
  – Applications.
• Interoperability, Applications and NetSolve.
Current Trends in HPC
• Highlights of the TOP500 list (June 2000):
  – #1: 9632-processor, Intel-based “ASCI Red” at Sandia National Laboratories. 2379.6 Gflops (74.2% of peak).
  – #2 & #3: 2144 Gflops & 1608 Gflops (55%, 52%).
  – Others in the top 10: LLNL, LANL, Leibniz Rechenzentrum (Munich), University of Tokyo.
  – #10: 815.1 Gflops. 1324 procs, Cray T3E900 (68.4%).
  – #250: 58.68 Gflops. 256 procs, Hitachi-based arch. (76.2%).
  – #500: 43.82 Gflops. 64 procs, SunHPC (400 MHz) (85.6%).
Computational Grids
• Motivation
  – Regardless of the number and capacity of computational resources available, there will always be a need or desire for more computational power.
  – Innovations increase computational capacity not only through hardware, but through software infrastructures as well.
  – Resources (data, storage facilities, computational servers, human users, etc.) are often distributed, even globally.
  – There is a need for technology that reliably manages large collections of distributed computational resources, efficiently scheduling and allocating their services to meet the needs of users while providing robustness, high availability and quality of service.
Vision for the Grid
• Uniform, location independent, and transient access to the resources of science and engineering to facilitate the solution of large scale, complex, multi-institutional, multidisciplinary data and computational based problems.
• Resources can be:
  – Hardware (networks, CPU, storage, etc.)
  – Software (libraries, modules, source code, etc.)
  – Human collaborators
Attack of the Grid
[Figure: a cloud of Grid-related project names.]
NetSolve, AppLeS, NWS, IBP, Habanero, Cumulvs, Harness, WebOS, TeraWeb, PVM, Ninf, Globus, Condor, JINI, Legion, Electronic Notebook, UniCore, Ninja, NEOS, PUNCH, Everyware, NCSA Workbench, Webflow, Gateway, JiPANG, LoCI, IPG NAG-NASA, SinRG
The NetSolve Grid Environment
- Brief Overview of the NetSolve System.
NetSolve Overview
• More than just a “not very well-defined user-level protocol!”
• Problem Solving Environment Toolkit.
• Client/Agent/Server system.
• Remote access to hardware AND software.
• “Robust, fault-tolerant, flexible, heterogeneous environment that provides dynamic management and allocation policies for distributed computational resources.”
Is That Your Final Answer?
NetSolve - The Big Picture
[Figure: the Client, the Agent (Information Service, Scheduling) and the Computational Resources, connected by arrows labelled Query, Service and Results.]
Client: “Dude, I need more computer power. …AND my software selection totally sucks! What’s the name of that rocking system again?”
Answer: “NetSolve!”
NetSolve Infrastructure
[Figure: layered architecture.]
• PSEs and Applications: C, Fortran, Matlab, SCIRun, Custom.
• Middleware: NetSolve proxy, Globus proxy, Ninf proxy, Legion proxy. Services: Resource Discovery, System Management, Resource Scheduling, Fault Tolerance.
• Metacomputing Resources: NetSolve, Globus, Ninf, Legion.
NetSolve Credits
• Sudesh Agrawal
• Dorian Arnold
• Dieter Bachmann
• Susan Blackford
• Henri Casanova
• Jack Dongarra
• Yuang Hang
• Karine Heydemann
• Michelle Miller
• Keith Moore
• Terry Moore
• Ganapathy Raman
• Keith Seymour
• Sathish Vadhiyar
• Tinghua Xu
The Problem
• The goal of the grid: “enable and maintain the controlled sharing of distributed resources to solve multidisciplinary problems of common interest to different groups or organizations.”
• A hodgepodge of systems, each with its own perspective AND, UNFORTUNATELY, its own custom protocols and components.
Why The Problem?
• Sociological:
  – Of course, mine is bigger, better, … Even if not, I cannot admit that, dismiss my efforts and use yours.
• Technical:
  – Immaturity
  – Doesn’t exactly fit needs
  – Software problems
• Economic:
  – Reinvesting time and effort, throwing away existing code to incorporate new code.
  – I’ve been funded for this, so …
The Problem (cont’d)
• No single system will emerge as the Grid computing system of choice:
  – Each has unique characteristics that appeal to different classes of users:
    • Ease of install/administration/maintenance
    • Stringent security
    • Ease of integration
    • Performance
    • Interface
    • Services provided
    • Code robustness/system maturity
    • …
Q & A
IF interoperability is indeed desirable, necessary or both for the success of the Grid
AND
the consensus is an unwillingness to change existing custom protocols, objects, etc.
THEN
are we stuck?
Current Solutions
• Laborious integration efforts that only work between specific systems, typically under specialized circumstances.
[Figure: pairwise integrations between specific systems.]
  – Globus and Condor: Condor-G
  – NetSolve and Globus: NetSolve Proxies
  – Condor and NetSolve: Condor-Servers
  – Ninf and NetSolve: Ninf Proxies
Current Solutions (cont’d)
• Computing portals as front-ends to sweep the dirt of non-interoperable systems under the cover.
[Figure: groups of systems behind front-ends: NetSolve, Globus, Legion, … behind EveryWare; Globus, PBS and NPACI Resources behind HotPage; Globus, Ninf and NetSolve behind JiPANG; Legion; “STOP” and “CAUTION” signs.]
A Better Solution?
• Representation standards for objects, protocols, services, etc. would be ideal.
• Is there a possibility of using _____ to allow us to keep our customizations while allowing other systems to translate/interpret them?
XML?
NetSolve Interoperability
• XML PDFs (problem description files)
  – Use XML as the language to implement the description of software services.
  – A proliferation of XML tools and parsers to exploit.
  – Collaboration with the Ninf project to establish a standardized IDL.
• Investigate XML representation for “standard” Grid components: machines, storage, etc.
• Standard objects/languages allow systems to share information. There still need to be some commonly understood protocols to allow inter-system transactions.
NetSolve Interoperability
• Within the current NetSolve framework:
  – Publishing the client-proxy interface allows other metacomputing systems to easily leverage NetSolve resources.
  – Implementing new proxies allows NetSolve client users to leverage other metacomputing systems.
Client Proxies
• Negotiate for metacomputing services on behalf of the client.
• Allow the client to be more lightweight.
• Provide a translation between the “language” of the client and the “language” of the underlying services, e.g. NetSolve, Globus, etc.
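The translation role of a client proxy can be sketched in C as a small dispatch table. This is only an illustration under stated assumptions: `client_request`, `client_proxy`, `proxy_translate` and the two back-end names are hypothetical, and the message formats are placeholders rather than the wire protocol of NetSolve, Globus or any other system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical back-end-neutral request, as the client would phrase it. */
typedef struct {
    const char *problem;  /* name of the requested service */
    int         n_args;   /* number of arguments supplied  */
} client_request;

/* Hypothetical proxy: turns a client request into the message format of
 * one back end. The formats below are placeholders, not real protocols. */
typedef struct {
    const char *backend;
    int (*translate)(const client_request *req, char *out, size_t cap);
} client_proxy;

static int to_backend_a(const client_request *req, char *out, size_t cap)
{
    return snprintf(out, cap, "A:solve %s argc=%d", req->problem, req->n_args);
}

static int to_backend_b(const client_request *req, char *out, size_t cap)
{
    return snprintf(out, cap, "<job name=\"%s\" args=\"%d\"/>",
                    req->problem, req->n_args);
}

/* One entry per back end; supporting a new system means adding a row. */
static const client_proxy proxies[] = {
    { "backend-a", to_backend_a },
    { "backend-b", to_backend_b },
};

/* Looks up the proxy registered for `backend` and lets it translate the
 * request. Returns the message length, or -1 if no proxy is registered. */
int proxy_translate(const char *backend, const client_request *req,
                    char *out, size_t cap)
{
    assert(backend != NULL && req != NULL && out != NULL);
    for (size_t i = 0; i < sizeof proxies / sizeof proxies[0]; i++)
        if (strcmp(proxies[i].backend, backend) == 0)
            return proxies[i].translate(req, out, cap);
    return -1;
}
```

The client only ever calls `proxy_translate`, which is what keeps it lightweight: all back-end-specific knowledge lives in the table entries.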
NetSolve Infrastructure
[Figure, shown again: layered architecture.]
• PSEs and Applications: C, Fortran, Matlab, SCIRun, Custom.
• Middleware: NetSolve proxy, Globus proxy, Ninf proxy, Legion proxy. Services: Resource Discovery, System Management, Resource Scheduling, Fault Tolerance.
• Metacomputing Resources: NetSolve, Globus, Ninf, Legion.
Applications for the Grid
• Heterogeneous application types/classes
  – Independent parallelism and pipeline simulations may represent a key class of applications that can perform efficiently on a globally distributed computational infrastructure.
Data Persistence
• Chain together a sequence of requests.
• Analyze parameters to determine data dependencies. Essentially, a DAG is created where nodes represent computational modules and arcs represent data flow.
• Transmit the superset of all input/output parameters and make it persistent near the server(s) for the duration of sequence execution.
• Schedule individual request modules for execution.
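The dependency analysis in the steps above can be sketched in C as follows. This is a minimal illustration, not NetSolve's implementation: the `request` struct, `MAX_ARGS`, `depends_on` and `build_dag` are hypothetical names, and each request is assumed to list its input and output parameters as plain pointers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4   /* hypothetical upper bound on parameters per request */

/* Hypothetical record of one request in a sequence. */
typedef struct {
    const char *name;
    const void *in[MAX_ARGS];   /* input parameters  */
    size_t      n_in;
    const void *out[MAX_ARGS];  /* output parameters */
    size_t      n_out;
} request;

/* Returns 1 if `later` reads a parameter that `earlier` writes. */
static int depends_on(const request *later, const request *earlier)
{
    for (size_t i = 0; i < later->n_in; i++)
        for (size_t j = 0; j < earlier->n_out; j++)
            if (later->in[i] == earlier->out[j])
                return 1;
    return 0;
}

/* Fills the n-by-n adjacency matrix `arc` (row-major):
 * arc[p * n + c] == 1 means data flows from module p to module c.
 * Only read-after-write dependencies on earlier requests are recorded.
 * Returns the number of arcs found. */
size_t build_dag(const request *seq, size_t n, unsigned char *arc)
{
    size_t count = 0;
    assert(seq != NULL && arc != NULL);
    for (size_t p = 0; p < n; p++)
        for (size_t c = 0; c < n; c++) {
            arc[p * n + c] =
                (unsigned char)(c > p && depends_on(&seq[c], &seq[p]));
            count += arc[p * n + c];
        }
    return count;
}
```

For a three-request chain in which the second request reads the first one's output and the third reads the second's, `build_dag` reports two arcs: module 0 to module 1, and module 1 to module 2.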
Request Sequencing
• Goals:
  – Transmit no unnecessary (redundant) data parameters.
  – Ensure all necessary data parameters are transmitted.
  – Execute modules simultaneously whenever possible.
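The first two goals amount to sending each client-side input exactly once and never sending a value that an earlier request in the sequence will produce. A minimal C sketch of that selection follows; the `request` struct, `MAX_ARGS` and `inputs_to_send` are hypothetical names chosen for illustration, and parameters are assumed to be identified by their addresses.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4   /* hypothetical upper bound on parameters per request */

/* Hypothetical record of one request in a sequence. */
typedef struct {
    const void *in[MAX_ARGS];
    size_t      n_in;
    const void *out[MAX_ARGS];
    size_t      n_out;
} request;

/* Returns 1 if `p` is an output of any of the first `k` requests. */
static int produced_before(const request *seq, size_t k, const void *p)
{
    for (size_t r = 0; r < k; r++)
        for (size_t j = 0; j < seq[r].n_out; j++)
            if (seq[r].out[j] == p)
                return 1;
    return 0;
}

/* Returns 1 if `p` is already among the first `n` entries of `set`. */
static int already_listed(const void **set, size_t n, const void *p)
{
    for (size_t i = 0; i < n; i++)
        if (set[i] == p)
            return 1;
    return 0;
}

/* Collects into `send` every input that must travel from the client:
 * inputs not produced by an earlier request, each listed only once.
 * `send` needs room for n * MAX_ARGS entries. Returns the count. */
size_t inputs_to_send(const request *seq, size_t n, const void **send)
{
    size_t count = 0;
    assert(seq != NULL && send != NULL);
    for (size_t k = 0; k < n; k++)
        for (size_t i = 0; i < seq[k].n_in; i++) {
            const void *p = seq[k].in[i];
            if (!produced_before(seq, k, p) &&
                !already_listed(send, count, p))
                send[count++] = p;
        }
    return count;
}
```

For three chained requests with inputs (A, B), (A, C) and (D, E), where C and D are outputs of the first and second request, the function selects A, B and E: A is sent once, and C and D never leave the server side.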
Request Sequencing Interface
Without sequencing:
…
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
…

With sequencing:
…
netsl_begin_sequence( );
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
netsl_end_sequence(C, D);
…
DAG Construction
• “C” implementation.
• Analyze all input/output references in the request sequence.
• Two references are equal if they refer to the same memory address.
• Size parameters are checked for “subset” objects.
• Only NetSolve “Matrices” and “Vectors” are checked.
• The constructed DAG is scheduled for execution at a NetSolve server.
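The address and size checks listed above could look roughly like this in C. The sketch is an assumption about how such a comparison might be written, not the NetSolve source: `ns_object`, `ref_relation` and `compare_refs` are hypothetical names, objects are assumed to be contiguous arrays of `double`, and partial overlaps are simply reported as distinct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for a matrix or vector argument. */
typedef struct {
    const double *data;   /* start address       */
    size_t        count;  /* number of elements  */
} ns_object;

typedef enum { REF_DISTINCT, REF_SAME, REF_SUBSET } ref_relation;

/* Classifies `a` relative to `b`:
 *   REF_SAME     same start address and same size,
 *   REF_SUBSET   a lies entirely inside b's address range,
 *   REF_DISTINCT otherwise (including partial overlaps). */
ref_relation compare_refs(const ns_object *a, const ns_object *b)
{
    uintptr_t a_lo, a_hi, b_lo, b_hi;

    assert(a != NULL && b != NULL);
    /* Compare as integers so that unrelated arrays can be ordered. */
    a_lo = (uintptr_t)a->data;
    a_hi = a_lo + a->count * sizeof(double);
    b_lo = (uintptr_t)b->data;
    b_hi = b_lo + b->count * sizeof(double);

    if (a_lo == b_lo && a->count == b->count)
        return REF_SAME;
    if (a_lo >= b_lo && a_hi <= b_hi)
        return REF_SUBSET;
    return REF_DISTINCT;
}
```

A reference classified as `REF_SAME` or `REF_SUBSET` points at data that is already part of another parameter, so a DAG builder can add a dependency instead of treating it as new data.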
DAG for Example Sequence
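…
netsl_begin_sequence( );
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
netsl_end_sequence(C, D);
…
[Figure: DAG for the sequence. Inputs A and B feed command1, which produces C; A and C feed command2, which produces D; D and E feed command3, which produces F.]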
Data Persistence (cont’d)
Without sequencing:
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
[Figure: three client/server round trips. The client sends command1(A, B) and receives result C; sends command2(A, C) and receives result D; sends command3(D, E) and receives result F.]

With sequencing:
netsl_begin_sequence( );
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
netsl_end_sequence(C, D);
[Figure: one exchange. The client sends sequence(A, B, E). On the server side, input A and intermediate output C, then intermediate output D and input E, are passed between modules. The client receives result F.]
Enhanced Sequencing
• Multiple NetSolve server sequencing.
  – Currently only a single NetSolve server can be used to service an entire sequence.
  – If no single server possesses all the required software, the requests cannot be executed as a sequence.
  – Truly parallel execution is possible only on SMPs like the SGI server used.
• Investigate whether graph scheduling heuristics and algorithms for parallel machines can apply to distributed resources as well.
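As one example of the kind of heuristic meant here, a simple level-based scheme groups DAG modules so that everything in one level can run at the same time. The C sketch below is an illustration only, not something the slides claim NetSolve uses: `assign_levels` and the adjacency-matrix layout are hypothetical, and modules are assumed to be numbered in request order so that every arc points from a lower index to a higher one.

```c
#include <assert.h>
#include <stddef.h>

/* arc[p * n + c] != 0 means module c consumes data produced by module p.
 * Writes into level[c] the earliest step at which module c can run:
 * 0 for modules with no predecessors, otherwise one more than the
 * highest level among its predecessors.
 * Returns the number of levels (the length of the critical path). */
size_t assign_levels(const unsigned char *arc, size_t n, size_t *level)
{
    size_t depth = 0;

    assert(arc != NULL && level != NULL);
    for (size_t c = 0; c < n; c++) {
        level[c] = 0;
        for (size_t p = 0; p < c; p++)   /* arcs only go forward */
            if (arc[p * n + c] && level[p] + 1 > level[c])
                level[c] = level[p] + 1;
        if (level[c] + 1 > depth)
            depth = level[c] + 1;
    }
    return depth;
}
```

Modules that share a level have no data dependencies on each other, so they are candidates for dispatch to different servers. A purely chained sequence yields one module per level and therefore no parallelism.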
Data Logistics and Distributed Storage Infrastructures
• Expand the Data Persistence model to multiple servers, using Distributed Storage Infrastructures (DSIs) to conveniently cache data parameters near all involved servers.
• Example DSIs: IBP, GASS, …
• By using remote storage as request parameters, users can pre-allocate data to expedite services or use already-remote data in NetSolve requests.
Multiple Server Sequencing and DSIs
[Figure: a client, its sequence parameters, DSI data caches, two servers and a server cluster.]
Conclusion
• There is a small likelihood that any single system will emerge as the Grid system of choice. Therefore, the interoperability of systems and the standardization of protocols and object representations become highly desirable.
• The Grid community should continue to develop the concepts and technologies necessary to facilitate a seamless Grid environment that is easy to use, highly available and highly efficient.
• However, it should promote more cooperation and less competition in an effort to establish a global heterogeneous GC fabric that makes supercomputing power available to the masses.