Experiences in the Grid: ‘the grid, the bad, and the ugly’
Surrey University e-Science Day, 2nd December 2002
Prof Simon Cox, Technical Director,
Southampton Regional e-Science Centre, School of Engineering Sciences,
University of Southampton
What is the Grid?
IT Drivers: Moore’s Law (i)
“Moore’s Law, the prediction that transistor density would double every 18 to 24 months, has become a self-fulfilling prophecy. Computers have been gaining 10
times the processing power every five years. The exponential growth of chip transistor density will continue at least another decade.”
“By 2010, the typical desktop computer will have a 30-GHz processor that performs 1 trillion instructions per second. Handheld computers will run at clock speeds of 5 GHz, faster than today's high-end systems.”
Pat Gelsinger, Intel Corp. CTO, at the FOSE 2002 trade show, Washington
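A quick arithmetic check of the growth figures quoted above (a minimal sketch in Python; the 18-24 month doubling period and the five-year horizon are taken directly from the quote):

# Check: how much growth does doubling every 18-24 months give over 5 years?
for months_per_doubling in (18, 24):
    doublings = 60 / months_per_doubling          # 5 years = 60 months
    factor = 2 ** doublings
    print(f"doubling every {months_per_doubling} months -> x{factor:.1f} over 5 years")
# Prints roughly x10.1 for 18 months and x5.7 for 24 months, so the
# "10 times the processing power every five years" figure matches the
# faster end of the quoted doubling period.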
IT Drivers: Moore’s Law (ii)
• Moore’s law: highly functional end-systems for compute and data
• Network exponentials produce dramatic changes in geometry and geography: bandwidth doubles every 9 months, twice the rate of Moore’s law!
• New modes of working and problem solving emphasize teamwork, computation
• New business models and technologies facilitate outsourcing
IT environment
Where we were: compute/data and network too expensive; proprietary; homogeneous; couldn’t interoperate; couldn’t collaborate
What’s new: commodity + open standards (Web Services, Grid middleware, databases/XML, W3C)
Grid Computing
The Grid Problem: “Flexible and secure sharing of resources among dynamic collections of individuals within and across organisations”
• Resources = assets, capabilities, and knowledge: capabilities (e.g. application codes, analysis tools), Compute Grids (PC cycles, commodity clusters, HPC), Data Grids, experimental instruments, knowledge services, Virtual Organisations, utility services
Grid middleware mediates between these resources
Opportunity
• The Grid aims to do for corporate IT what the web did for information: unify and coordinate resources (compute, data, …)
• The Grid paradigm: lower TCO by reducing complexity; facilitate new ways of operating; leverage existing infrastructure; seamlessly integrate new infrastructure; interoperate intra-company and inter-company
A Brief History of Grid
The Grid: A Brief History
• Early 90s: gigabit testbeds, metacomputing
• Mid to late 90s: early experiments (e.g. I-WAY), academic software projects (e.g. Globus, Legion), application experiments
• 2002: dozens of application communities & projects; major infrastructure deployments; significant technology base (especially the Globus Toolkit™); growing industrial interest; Global Grid Forum: ~500 people, 20+ countries
The Grid World: Current Status
• Dozens of major Grid projects in scientific & technical computing, research & education: deployment, application, technology
• Some consensus on key concepts and technologies: the open-source Globus Toolkit™ is a de facto standard for major protocols & services; far from complete or perfect, but out there, evolving rapidly, and with a large tool/user base
• Global Grid Forum a significant force
• Industrial interest emerging rapidly
http://www.gridforum.org
Summary
• The Grid problem: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
• Grid architecture: protocol and service definitions for interoperability & resource sharing
• Grid middleware: the Globus Toolkit is a source of protocol and API definitions, and reference implementations; the Open Grid Services Architecture represents the next step in its evolution; Condor provides high-throughput computing; Web Services & W3C leverage e-business
• e-Science projects apply Grid concepts to applications
Grid Architecture
Distributed Terascale Facility
UK e-Science Grid
Sites: Edinburgh, Glasgow, Newcastle, Belfast, Manchester, Daresbury Laboratory (DL), Cambridge, Oxford, Rutherford Appleton Laboratory (RAL), Hinxton, Cardiff, London, Southampton
UK e-Science Programme (organisation)
DG Research Councils; e-Science Steering Committee; Director (Tony Hey), with both a management role and an awareness and co-ordination role; Grid TAG
Generic Challenges: EPSRC (£15m), DTI (£15m)
Industrial Collaboration (£40m); £80m of collaborative projects
Academic Application Support Programme: Research Councils (£74m), DTI (£5m), comprising PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m), ESRC (£3m), EPSRC (£17m), CLRC (£5m)
Architecture of a Grid (layered diagram)
Science Portals and Scientific Workflow Management Systems
Applications (simulations, data analysis, etc.); Web Services and Portal Toolkits
Application Toolkits (visualization, data publication/subscription, etc.); execution support and frameworks (Globus MPI, Condor-G, CORBA-G)
Grid Common Services (standardized services and resources interfaces): Grid Information Service, Uniform Resource Access, Brokering, Global Queuing, Global Event Services, Co-Scheduling, Data Management, Uniform Data Access, Collaboration and Remote Instrument Services, Network Cache, Communication Services, Authentication, Authorization, Security Services, Auditing, Fault Management, Monitoring (= operational services: Globus, SRB)
High Speed Communication Services
Distributed Resources: Condor pools of workstations, network caches, tertiary storage, scientific instruments, clusters, national supercomputer facilities
Combining Grid and Web Services (layered diagram)
Clients: web browser, X Windows, PDA, Grid ssh
Application Portals: discipline/application-specific portals (e.g. SDSC TeleScience), problem solving environments (AVS, SciRun, Cactus), environment management (LaunchPad, HotPage); built on Apache Tomcat / WebSphere / Cold Fusion (JVM + servlet instantiation + routing), JSPs and composition frameworks (e.g. XCAT); reached over http, https, etc.
Web Services: job submission/control, file transfer, data management, credential management, monitoring, events, workflow management, and other services (visualization, interface builders, collaboration tools, numerical grid generators, etc.); Grid Web Service Description (WSDL) & Discovery (UDDI); CoG Kits implementing Web Services in servlets, servers, etc. (Python, Java, …), Apache SOAP, .NET, etc.; XML/SOAP over the Grid Security Infrastructure
Grid Services (collective and resource access): GRAM, GridFTP, Data Replica and Metadata Catalog, Grid Monitoring Architecture, Grid Information Service, Grid X.509 Certification Authority, SRB/Metadata Catalogue, Condor-G, CORBA, MPI, secure reliable group communication; carried over Grid protocols and the Grid Security Infrastructure
Resources: compute (many), storage (many), communication, instruments (various)
Grid Middleware
Grid Middleware (coordinates and authenticates use of Grid services)
• Globus (and GGF grid-computing protocols): Grid Security Infrastructure (GSI), Resource Allocation Manager (GRAM), Resource Information Service (GRIS), Index Information Service (GIIS), GridFTP, Metadirectory Service (MDS 2.0+) coupled to an LDAP server
• Condor (distributed high-throughput computing system): Condor-G lets us dispatch jobs to our Globus system; Condor development started in 1985 at the University of Wisconsin (Miron Livny)
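For illustration only (not taken from the talk): a minimal sketch of dispatching a job through Condor-G to a Globus GRAM gatekeeper, using Condor submit-description syntax of that era. The gatekeeper host and executable names are hypothetical, and it assumes a working Condor-G installation with valid GSI credentials on the submitting machine.

import os
import subprocess
import tempfile

# Hypothetical Condor-G submit description: the "globus" universe asks
# Condor-G to forward the job to the named remote GRAM jobmanager.
submit_description = """\
universe        = globus
globusscheduler = gatekeeper.example.ac.uk/jobmanager-pbs
executable      = my_analysis
arguments       = input.dat
output          = job.out
error           = job.err
log             = job.log
queue
"""

with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
    f.write(submit_description)
    submit_file = f.name

# condor_submit is the standard Condor command-line tool (assumed on PATH).
subprocess.run(["condor_submit", submit_file], check=True)
os.unlink(submit_file)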
The Globus Project: Making Grid computing a reality
• Close collaboration with real Grid projects in science and industry
• Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure
• Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing
• The Globus Toolkit: Open source, reference software base for building grid infrastructure and applications
• Global Grid Forum: Development of standard protocols and APIs for Grid computing
http://www.gridforum.org
http://www.globus.org
http://www.cs.wisc.edu/condor
The Condor Project (established 1985): Distributed High Throughput Computing research performed by a team of ~25 faculty, full-time staff and students who:
• face software engineering challenges in a distributed UNIX/Linux/NT environment,
• are involved in national and international collaborations,
• actively interact with academic and commercial users,
• maintain and support a large distributed production environment,
• and educate and train students.
Funding: US Govt. (DoD, DoE, NASA, NSF), AT&T, IBM, Intel, Microsoft, UW-Madison
Web Services
• Increasingly popular standards-based framework for accessing network applications; W3C standardization: Microsoft (.NET), IBM (WebSphere), Sun (J2EE), etc.
• XML and XML Schema: representing data in a portable format
• WSDL (Web Services Description Language): interface definition language for Web services
• SOAP (Simple Object Access Protocol): XML-based RPC protocol; common WSDL target
• WS-Inspection: conventions for locating service descriptions
• UDDI (Universal Description, Discovery, & Integration): directory for Web services
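To make the SOAP and WSDL terminology concrete, here is an illustrative sketch (not from the slides) of an XML-based RPC call using only the Python standard library. The endpoint URL, namespace and operation name are invented for the example; in practice a client stub would be generated from the service's WSDL description, located via UDDI or WS-Inspection.

import urllib.request

# Hypothetical service endpoint and operation (would normally come from WSDL).
endpoint = "http://example.org/services/TemperatureService"
soap_envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getTemperature xmlns="http://example.org/weather">
      <city>Southampton</city>
    </getTemperature>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    endpoint,
    data=soap_envelope.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": "http://example.org/weather/getTemperature",
    },
)
# The response is itself a SOAP envelope carrying the XML-encoded result.
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))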
XML Web Services Framework (diagram; status ranges from W3C Recommendation through W3C Working Group to future work)
Syntax: XML; Structure: XML Schemas
Wire: envelope & extensibility (SOAP), attachments, security, routing, reliability
Description: service description (WSDL), process orchestration (XLANG, later BPEL4WS)
Discovery: inspection (DISCO), directory (UDDI)
Grid Services
• Grid Services are defined in terms of Web Services Description Language (WSDL) interfaces, and provide the mechanisms to create and compose sophisticated distributed systems in a Web services environment: lifetime management, reliable and secure remote invocation, change management, credential management, notification
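Purely as a conceptual sketch (this is not the OGSA or Globus API; class and method names are invented), the following illustrates what a transient Grid Service instance adds on top of a plain Web service: a handle, a negotiable lifetime, and notification alongside the ordinary operation.

import time
import uuid

class GridServiceInstance:
    """Illustrative stand-in for an OGSA-style transient service instance."""

    def __init__(self, lifetime_seconds):
        # Each instance gets a handle and an initial termination time.
        self.handle = f"gsh://example.org/instances/{uuid.uuid4()}"
        self.termination_time = time.time() + lifetime_seconds
        self.subscribers = []

    def request_termination_time(self, extra_seconds):
        # Lifetime management: clients negotiate an extension (or reduction).
        self.termination_time += extra_seconds
        return self.termination_time

    def subscribe(self, callback):
        # Notification: interested parties register for state-change events.
        self.subscribers.append(callback)

    def invoke(self, payload):
        # The service operation itself, with subscribers notified of the result.
        result = f"processed {payload}"
        for notify in self.subscribers:
            notify(result)
        return result

service = GridServiceInstance(lifetime_seconds=3600)
service.subscribe(lambda event: print("notified:", event))
print(service.invoke("experiment-42"))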
Grid Applications: ‘e-Science’
Scientific Grid Applications
• A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
• 1,000 physicists worldwide pool resources for peta-op analyses of petabytes of data
• Civil engineers collaborate to design, execute, & analyze shake table experiments
• Climate scientists visualize, annotate, & analyze terabyte simulation datasets
• An emergency response team couples real-time data, weather model, population data
Online Access to Scientific Instruments (DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago)
Advanced Photon Source: real-time collection, tomographic reconstruction, archival storage, and wide-area dissemination to desktop & VR clients with shared controls
CERN’s Large Hadron Collider: 1800 physicists, 150 institutes, 32 countries
100 PB of data by 2010; 50,000 CPUs?
Grid Communities & Applications: Data Grids for High Energy Physics (tiered architecture diagram)
Tier 0: CERN Computer Centre, fed by the Online System at ~100 MBytes/sec (the detector itself produces ~PBytes/sec) together with an Offline Processor Farm of ~20 TIPS
Tier 1: regional centres, e.g. FermiLab (~4 TIPS) and the France, Italy and Germany regional centres, connected at ~622 Mbits/sec (or air freight, deprecated)
Tier 2: centres of ~1 TIPS each (e.g. Caltech ~1 TIPS), connected at ~622 Mbits/sec
Institutes (~0.25 TIPS) with a physics data cache; Tier 4: physicist workstations, connected at ~1 MBytes/sec
There is a “bunch crossing” every 25 nsecs; there are 100 “triggers” per second; each triggered event is ~1 MByte in size
Physicists work on analysis “channels”; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server
1 TIPS is approximately 25,000 SpecInt95 equivalents
www.griphyn.org www.ppdg.net www.eu-datagrid.org
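A quick back-of-the-envelope check using only the numbers quoted on the slide (a sketch; the per-year figure assumes continuous running):

# Figures quoted on the slide.
triggers_per_second = 100        # selected events per second
event_size_bytes = 1_000_000     # each triggered event is ~1 MByte
seconds_per_year = 365 * 24 * 3600

rate = triggers_per_second * event_size_bytes
print(f"triggered-event data rate ~ {rate / 1e6:.0f} MBytes/sec")   # ~100 MBytes/sec

petabytes_per_year = rate * seconds_per_year / 1e15
print(f"~{petabytes_per_year:.1f} PB of triggered data per year of continuous running")
# Roughly 3 PB per year; the slide's ~100 PB by 2010 presumably aggregates
# several experiments, years of running and derived data.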
Network for Earthquake Engineering Simulation
• NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
• On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
Business Grid Applications
• Engineers at a multinational company collaborate on the design of a new product
• A multidisciplinary analysis in aerospace couples code and data in four companies
• An insurance company mines data from partner hospitals for fraud detection
• An application service provider offloads excess load to a compute cycle provider
• An enterprise configures internal & external resources to support e-Business workload
NASA Information Power Grid
• Vision: to revolutionize how computing is used in NASA’s science and engineering by providing the middleware services for routinely building large-scale, dynamically constructed, transient problem-solving environments from distributed, heterogeneous resources
• A persistent computing and data grid
William E. Johnston, Project Manager
NASA Advanced Supercomputing (NAS) Division, NASA Ames Research Center
http://www.ipg.nasa.gov
Multi-disciplinary Simulations: Aviation Safety Example
Virtual National Air Space (VNAS)
• FAA Ops Data • Weather Data • Airline Schedule Data • Digital Flight Data • Radar Tracks • Terrain Data • Surface Data
The vision for VNAS is that whole system simulated aircraft are inserted into a realistic environment. This requires integrating many types of operations data as drivers for the simulations.
Future Aviation Safety Systems
Information Power Grid managed compute and data management resources (diagram)
Simulation models coupled through an application framework: airframe models, human models, landing gear models (LaRC), stabilizer models, wing models (ARC), engine models (GRC)
Grid Services, giving uniform access to distributed resources: Grid information service, uniform resource access, brokering, global queuing, global event services, co-scheduling, data cataloguing, uniform data access, collaboration and remote instrument services, network cache, communication services, authentication, authorization, security services, auditing, fault management, monitoring
Distributed resources at ARC, SDSC, LaRC, GSFC, KSC, JSC, Boeing, CMU, GRC, NCSA, EDC, JPL and MSFC, linked by NGIX, NREN and NTON-II/SuperNet, including a 300-node Condor pool
Compute and data management requests draw on operational data feeds: West Coast TRACON/Center Data (Performance Data Analysis & Reporting System (PDARS), AvSP/ASMM ARC), Atlanta Hartsfield International Airport (Surface Movement Advisor, AATT Project), NOAA weather database (ATL terminal area) and airport digital video (Remote Tower Sensor System), feeding the NAS Data Warehouse
Aviation Safety
Multiple sub-systems, e.g. a wing lift model operating at NASA Ames and a turbo-machine model operating at NASA Glenn, are
combined using GRC’s NPSS (Numerical Propulsion System Simulation) application framework that manages the interactions
of multiple models and uses IPG services to coordinate computing and data storage systems across NASA.
Grid Enabled Optimisation and Design Search for Engineering (GEODISE)
Southampton, Oxford and Manchester
Simon Cox: Grid/W3C technologies and high performance computing; Global Grid Forum Apps Working Group
Andy Keane: Director of the Rolls Royce/BAE Systems University Technology Partnership in Design Search and Optimisation
Mike Giles: Director of the Rolls Royce University Technology Centre for Computational Fluid Dynamics
Carole Goble: ontologies and DARPA Agent Markup Language (DAML) / Ontology Inference Language (OIL)
Nigel Shadbolt: Director of the Advanced Knowledge Technologies (AKT) IRC
Partners: BAE SYSTEMS (engineering), Rolls-Royce (engineering), Fluent (computational fluid dynamics), Microsoft (software/Web Services), Intel (hardware), Compusys (systems integration), Epistemics (knowledge technologies), Condor (Grid middleware)
Design
Modern engineering firms are global and distributed
“Not just a problem of using HPC”
CAD and analysis tools, user interfaces, PSEs, and Visualization
Optimisation methods
Data archives (e.g. design/ system usage)
Knowledge repositories & knowledge capture and reuse tools.
Management of distributed compute and data resources
How to … ?
… improve design environments
… cope with legacy code / systems
… integrate large-scale systems in a flexible way
… produce optimized designs
… archive and re-use design history
… capture and re-use knowledge
Design Challenges
Geodise will provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools,
industrial strength analysis codes, and distributed computing & data resources
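As an illustrative sketch only (not Geodise code; the analysis function and its inputs are placeholders for an industrial-strength code that Geodise would run on remote Grid resources), a design-search loop of the kind such an environment automates might look like this:

import random

def run_analysis(design):
    """Placeholder for an expensive analysis code (e.g. a CFD run) that would
    be dispatched to distributed compute resources rather than run locally."""
    span, chord = design
    # Toy objective standing in for a real figure of merit such as drag.
    return (span - 30.0) ** 2 + (chord - 3.5) ** 2

def design_search(bounds, evaluations=50, seed=0):
    """Minimal random design search: sample candidate designs, evaluate each,
    keep the best, and archive every evaluation so the design history can be
    stored and re-used later."""
    rng = random.Random(seed)
    archive = []
    best_design, best_value = None, float("inf")
    for _ in range(evaluations):
        design = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        value = run_analysis(design)
        archive.append((design, value))      # design/usage archive
        if value < best_value:
            best_design, best_value = design, value
    return best_design, best_value, archive

best, value, history = design_search(bounds=[(20.0, 40.0), (2.0, 5.0)])
print("best design:", best, "objective:", round(value, 3), "archived runs:", len(history))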
GEODISE architecture (diagram)
GEODISE PORTAL: used by the engineer; Intelligent Application Manager, session database, traceability, visualization
OPTIMISATION: OPTIONS system, optimisation archive, design archive, knowledge repository; ontology for engineering, computation, and optimisation and design search
COMPUTATION / APPLICATION SERVICE PROVIDER: Intelligent Resource Provider built on Globus, Condor and OGSA; parallel machines, clusters and internet resource providers (pay-per-use); licenses and code; reliability, security, QoS
CAD systems: CADDS, IDEAS, ProE, CATIA, ICAD; analysis: CFD, FEM, CEM
Knowledge Technologies: Advanced Knowledge Technologies IRC (Soton)
Access Grid
• Collaborative work among large groups
• ~80 sites worldwide
Node setup (photo): ambient mic (tabletop), presenter mic, presenter camera, audience camera
Access Grid: Argonne, others www.accessgrid.org
An environment that enables geographically distributed scientists to achieve research goals more effectively, while enabling their results to be used in
developments elsewhere
myGrid: Manchester, Newcastle, Nottingham, Sheffield, Southampton, IT Innovation Centre, European Bioinformatics Institute, AstraZeneca, GlaxoSmithKline, Merck KGaA, Epistemics Ltd, GeneticXchange, Network Inference, IBM, Sun
myGrid Middleware: building personalised extensible environments for data-intensive in silico experiments in biology
http://www.mygrid.org.uk
myGrid components (diagram): portal; metadata (ontology, service type); workflow repository; directory; repository, ontology and workflow clients; personal repository
myGrid demo
Grid ENabled Integrated Earth System Model (GENIE) Reading, Southampton, Bristol, CEH (Wallingford and Edinburgh), Hadley Centre,
Imperial, UEA
Paul Valdes- Professor of Earth System Science, Reading
Simon Cox- Technical Director, Southampton e-Science Centre
Melvin Cannell - Director of CEH Edinburgh
John Darlington- Director of London e-Science Centre
Richard Harding- Head of Global Processes Section, CEH Wallingford
Tony Payne- Reader in glaciology, Bristol
John Shepherd- Professor of Marine Sciences, Southampton
Andrew Watson- Professor of Environmental Sciences, UEA.
Tim Lenton- Science Coordinator
Bob Marsh- Collaborator
Peter Cox- Collaborator (Hadley Centre)
Intel- Hardware
Compusys- Systems Integration
Condor- Grid Middleware
Earth System Science
“By taking the ‘whole systems’ approach, we are more likely to find sustainable solutions to environmental problems.”
Earth System Model
Atmosphere Model: non-transient eddy resolving planetary wave
Ocean Model: 3-D, non-eddy-resolving, frictional geostrophic
Cryosphere Model: dynamic ice sheet including fast flow; simple sea ice
Terrestrial Carbon Cycle Model: simplified TRIFFID
Ocean Carbon and Nutrient Cycle Model: including sediments
Terrestrial Hydrology Model: simplified MOSES scheme
“What are the sources, sinks and transportation processes of carbon within the Earth system?”
Reality Grid (www.realitygrid.org)
A tool for investigating condensed matter and materials
The Investigators
• Principal Investigator: Prof. P.V. Coveney, Centre for Computational Science, Queen Mary, Univ. of London
• Edinburgh Co-investigator: Prof. M.E. Cates, Department of Physics, Univ. of Edinburgh
• Imperial College Co-investigator: Prof. J. Darlington, IC Parallel Computing Centre
• Loughborough Co-investigator: Prof. R. Kalawsky, Dept. of Computer Science & VR Centre, Loughborough Univ.
• Manchester Co-investigators: Dr J.M. Brooke, Manchester Computing, and Prof. J. Gurd, Centre for Novel Computing, Univ. of Manchester
• Oxford Investigator: Prof. A. Sutton, Dept. of Materials, Univ. of Oxford
ICENI
• IC e-Science Networked Infrastructure
• Developed by LeSC Grid Middleware Group
• Collect and provide relevant Grid meta-data
• Use to define and develop higher-level services
• Interaction with other frameworks: OGSA, Jxta etc.
The Iceni, under Queen Boudicca, united the tribes of South-East England in a revolt against the occupying Roman forces in AD60.
Componentised Steering & Visualisation: a service-oriented architecture (diagram)
An application component feeds a visualisation server holding datasets A & B; rendering engine 1 serves dataset A to visualisation clients 1 and 2 (the same view of dataset A), while rendering engine 2 serves dataset B to visualisation client 3 (a view of dataset B)
Issues
• Systems Architecture and Platform Strategies
• Algorithms, Methods and Libraries
• Distributed Computing and Resources
• Distributed Data Management
• Visualization
• Problem Solving Environments
• Collaborative tools and frameworks for managing collaboratories
• Knowledge Technologies
Grid/ e-Science
Grid Summary
• The Grid problem: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
• Grid architecture: protocol and service definitions for interoperability & resource sharing
• Open standards middleware: Web Services & W3C leverage e-business; the Open Grid Services Architecture represents the next step in evolution
• e-Science projects apply Grid concepts to applications
Questions