SC06, Tampa, FL, November 11-17, 2006
Science Gateways on the TeraGrid
Powerful Beyond Imagination!
Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways
San Diego Supercomputer Center
Questions I Hope to Answer Today
• What is the TeraGrid?
• What are Science Gateways?
• Why TeraGrid and Gateways?
• Initial Strategy
• Implementation Details
– Issues to address when using TeraGrid
• Some Gateway Highlights
• Future growth
What is the TeraGrid?
• NSF-funded facility to offer high end compute, data and visualization resources to the nation’s academic researchers
TeraGrid Technology
• Data: 18.8 petabytes of storage
• Memory-intensive resources
• Computation: 100+ teraflops
• Visualization
• 40 gigabit/second cross-country network
Over 100 Tflops in Computing Power
[Chart: computing power in Tflops per system, with labeled values from 55 down to 0.17, for TACC Lonestar, IU Big Red, SDSC DataStar, Purdue Radon, NCSA Mercury, NCSA Tungsten, PSC BigBen, Purdue Lear, NCSA Cobalt, PSC Lemieux, NCAR Frost, SDSC BlueGene, SDSC IA64, IU IA-32, NCSA Copper, UC ANL IA64, ORNL IA32, PSC Rachel, and IU Tiger]
Powerful Remote Visualization Capabilities
• Maverick UltraSPARC IV
– 64 UltraSPARC IV processors, 256 GB memory; 2 nodes (128 processors, 512 GB memory)
– Unique configuration that has resulted from a joint project between Sun Microsystems and the Texas Advanced Computing Center to provide terascale remote visualization
– Leverages the vast capabilities of Sun’s E25K enterprise server with the addition of commodity graphics
• UC/ANL's IA-32 TeraGrid Linux Visualization Cluster consists of
– 96 nodes with dual Intel Xeon 2.4 GHz processors and 4 GB of memory
– One nVidia GeForce 6600GT AGP graphics card per node
– 16 TB local high-performance GPFS, and access to the TeraGrid-wide GPFS-WAN file system
Extensive Data Collections and Storage Space
• Data Collections
– ~100 collections
– Wide variety of disciplines
– http://www.teragrid.org/userinfo/data/collections.php
• Storage Space
– 18 PB rotating disk
– Many PBs archival capacity
– 220 TB global filesystem
– Collection management software
All of These Resources Available to Researchers at No Cost
• TeraGrid creates integrated, persistent, and pioneering computational resources that significantly improve our nation’s ability and capacity to gain new insights into our most challenging research questions and societal problems.
• Proposal-based access; researchers can use resources at no cost
– Collaborative opportunities, but Principal Investigators must be from the U.S.
TeraGrid PIs by Institution as of May 2006
[Map: TeraGrid PIs by institution. Legend: Blue: 10 or more PIs; Red: 5-9 PIs; Yellow: 2-4 PIs; Green: 1 PI]
Gateways are part of TeraGrid’s 3-pronged strategy to further science
• DEEP Science: Enabling Terascale Science
– Make science more productive through an integrated set of very-high capability resources
  • Advanced Support for TeraGrid Applications (ASTA) projects
• WIDE Impact: Empowering Communities
– Bring TeraGrid capabilities to the broad science community
  • Science Gateways
• OPEN Infrastructure, OPEN Partnership
– Provide a coordinated, general purpose, reliable set of services and resources
  • Grid interoperability working group
Science Gateways
A new initiative for the TeraGrid
• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
  • Resources
  • Users – from expert to K-12
  • Software stacks, policies
• Science Gateways
– Provide “TeraGrid Inside” capabilities
– Leverage community investment
• Three common forms:
– Web-based portals
– Application programs running on users' machines but accessing services in TeraGrid
– Coordinated access points enabling users to move seamlessly between TeraGrid and other grids
[Figure: Workflow Composer]
Gateways are growing in numbers
• 10 initial projects as part of TG proposal
• >20 Gateway projects today
• No limit on how many gateways can use TG resources
– Prepare services and documentation so developers can work independently
• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE)
• National Virtual Observatory (NVO)
• Linked Environments for Atmospheric Discovery (LEAD)
• Computational Chemistry Grid (GridChem)
• Computational Science and Engineering Online (CSE-Online)
• GEON (GEOsciences Network)
• Network for Earthquake Engineering Simulation (NEES)
• SCEC Earthworks Project
• Network for Computational Nanotechnology and nanoHUB
• GIScience Gateway (GISolve)
• Biology and Biomedicine Science Gateway
• Open Life Sciences Gateway
• The Telescience Project
• Grid Analysis Environment (GAE)
• Neutron Science Instrument Gateway
• TeraGrid Visualization Gateway, ANL
• BIRN
• Gridblast Bioinformatics Gateway
• Earth Systems Grid
• Astrophysical Data Repository (Cornell)
• Many others interested
– SID Grid
– HASTAC
What Did We Learn About Common Gateway Requirements?
• Accounting
– Support for accounts with differing capabilities
– Ability to associate a compute job with an individual portal user
– Scheme for portal registration and usage tracking
– Dynamic accounts
• Security
– Community account privileges
– Need to identify the human responsible for a job for incident response
– Acceptance of other grid certificates
• Web Services
– Many will build on the Globus Toolkit, but additional interfaces may be needed
– Web Service security
– Interfaces to scheduling and account management are common requirements
• Software
– Interoperability of software stacks between TeraGrid and peer grids
– Software installations for gateways across all TG sites
– Community software areas
– Management (pacman, other options)
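The accounting and security items above both depend on a gateway knowing which portal user is behind each job it runs under a shared community account. A minimal Python sketch of that bookkeeping follows. It is an illustration under stated assumptions, not TeraGrid's actual mechanism: the `AttributionLog` class, its method names, and the sample site names, job ids, and users are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    resource: str      # site or machine the job ran on (hypothetical names below)
    job_id: str        # scheduler job id on that resource
    portal_user: str   # individual gateway user who requested the job
    cpu_hours: float   # usage charged to the shared community account


class AttributionLog:
    """Hypothetical gateway-side log tying community-account jobs to portal users."""

    def __init__(self):
        self._by_job = {}  # (resource, job_id) -> JobRecord

    def record(self, resource, job_id, portal_user, cpu_hours=0.0):
        """Store who requested a job at submission time."""
        rec = JobRecord(resource, job_id, portal_user, cpu_hours)
        self._by_job[(resource, job_id)] = rec
        return rec

    def responsible_user(self, resource, job_id):
        """Incident response: find the human behind a job, or None if unknown."""
        rec = self._by_job.get((resource, job_id))
        return rec.portal_user if rec else None

    def usage_by_user(self):
        """Usage tracking: total CPU hours per portal user."""
        totals = defaultdict(float)
        for rec in self._by_job.values():
            totals[rec.portal_user] += rec.cpu_hours
        return dict(totals)


log = AttributionLog()
log.record("siteA", "1001", "alice", cpu_hours=12.5)
log.record("siteA", "1002", "bob", cpu_hours=3.0)
log.record("siteB", "77", "alice", cpu_hours=4.5)

print(log.responsible_user("siteA", "1002"))
print(log.usage_by_user())
```

Keying the log on the (resource, job id) pair reflects that job ids are only unique per resource; a production gateway would persist these records rather than hold them in memory.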
Gateway Web Services Needs
• Interfaces provided by the TeraGrid
The list of services that have been identified by the gateway developers includes:
– Resource Status Service (both polling and pub/sub)
– Job Submission Interface
  • The gateways expect this to be provided by WS-GRAM
– Job Tracking Interface (both polling and pub/sub)
– File/Data Staging Interface
– Retrieve Usage Information
– Retrieve Inca Info
– Advanced Reservation Interface
– Cross-site Run Interface
– Pushing DN to an RP interface
• Interfaces provided by the Gateways
The list of services that have been identified by the gateway developers and the TeraGrid Security group includes:
– Retrieve user information for a job
– Retrieve accounting information/statistics
– Provides the necessary means to track down problem job submissions and identify malicious users
  • Don't submit jobs from the user who submitted job (resource, job id), until we say it's OK.
– The accounting interface requires no information, but returns sufficient accounting information and statistics to report to funding agencies, program managers, etc.
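The resource status and job tracking interfaces listed above are requested in both polling and pub/sub form. The Python sketch below contrasts the two access styles with a small in-memory tracker. It is only an illustration: `JobTracker`, its methods, and the status strings are hypothetical and do not represent WS-GRAM or any other TeraGrid service API.

```python
from collections import defaultdict


class JobTracker:
    """Hypothetical in-memory job tracker offering polling and pub/sub access."""

    def __init__(self):
        self._status = {}                      # job_id -> latest status string
        self._subscribers = defaultdict(list)  # job_id -> list of callbacks

    def poll(self, job_id):
        """Polling: the caller asks for the current status whenever it likes."""
        return self._status.get(job_id, "UNKNOWN")

    def subscribe(self, job_id, callback):
        """Pub/sub: the caller registers once and is notified on each change."""
        self._subscribers[job_id].append(callback)

    def update(self, job_id, status):
        """Called by the resource side when a job changes state."""
        if self._status.get(job_id) == status:
            return  # no change, so nothing is published
        self._status[job_id] = status
        for callback in self._subscribers[job_id]:
            callback(job_id, status)


tracker = JobTracker()
events = []
tracker.subscribe("job-42", lambda job_id, status: events.append((job_id, status)))

tracker.update("job-42", "QUEUED")
tracker.update("job-42", "RUNNING")
tracker.update("job-42", "RUNNING")  # duplicate update, not re-published
tracker.update("job-42", "DONE")

print(tracker.poll("job-42"))
print(events)
```

Polling suits a portal page that refreshes on demand, while pub/sub lets a workflow engine react to state changes without repeatedly querying the resource.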
National Virtual Observatory
Facilitating Scientific Discovery
• Access to telescope images from around the world
• NVO provides access to combined sky surveys
– Different views of the same cosmological phenomenon can reveal new insights
• New science enabled by enhancing access to data and computing resources
– Data correlation
– Understanding of physical processes
– Identification of new phenomena
• NVO is a set of tools used to exploit the data avalanche
Linked Environments for Atmospheric Discovery
• Providing tools that are needed to make accurate predictions of tornadoes and hurricanes
• Meteorological data
• Forecast models
• Analysis and visualization tools
• Data exploration and Grid workflow
Special Priority and Urgent Computing Environment
spruce.teragrid.org
NCAR Earth System Grid
• Science Gateway for climate research
– Enabling analysis and understanding gained from global Earth System computational models
• ESG was originally a distributed data management/access system, but it has evolved into more.
• User registration, authorization controls, and metrics tracking
• CCSM model source, initialization datasets, post-processing codes, and analysis and visualization tools.
• Prototypes of model-submission environments
– Eventually real-time tracking of model status along with references to available output datasets.
• Expect to see more model runs at higher resolution and with greater component scope.
Did I Answer Your Questions?
• What is the TeraGrid?
• What are Science Gateways?
• Why TeraGrid and Gateways?
• Initial Strategy
• Implementation Details
– Issues to address when using TeraGrid
• Some Gateway Highlights
• Future growth
Would development of a gateway help your research?
• Think about your current bottlenecks
– What would you like to explore if only you had
  • Lots of disk
  • Lots of compute resources
  • Powerful analysis capabilities
  • A nice interface to information
• www.teragrid.org
• Nancy Wilkins-Diehr, [email protected]