NCSA 2015 Strategic Planning Process
April 21, 2010
José L. Muñoz, (Acting) Director, OCI
(thanks to Blatecky, Parashar, and Pennington)
Outline
- OCI
- CF21
- SOFTWARE
- HPC
National Science Foundation
- National Science Board
- Director / Deputy Director

Directorates (budgets presented are the FY 2010 Request):
- Biological Sciences ($733M)
- Computer & Information Sci & Eng ($633M)
- Engineering ($764M)
- Geosciences ($909M)
- Mathematical & Physical Sciences ($1,380M)
- Social, Behavioral & Economic Sciences ($257M)
- Education & Human Resources

Offices:
- Office of Cyberinfrastructure ($219M)
- Office of Polar Programs
- Office of International Sci & Engr
- Office of Integrated Activities
- Office of Budget, Finance & Award Management
- Office of Information Resource Management
OCI FY09 Budget Breakdown: $279M (includes ARRA)
OCI FY10 Budget Breakdown: $219M

Cyberinfrastructure Framework for 21st Century Science and Engineering (CF21)
Five Crises
- Computing Technology: multicore, where the processor is the new transistor; programming models, fault tolerance, etc.; new models (clouds, grids, GPUs, …) where appropriate (a parallelism sketch follows this list)
- Data, provenance, and viz: we are generating more data than in all of human history: preserve, mine, share? How do we create "data scientists"?
- Software: complex applications on coupled compute-data-networked environments need tools. Modern apps run to 10^6+ lines, many groups contribute, and they take decades to build
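To make the programming-model point concrete, here is a minimal, illustrative Python sketch of the multicore shift: the same embarrassingly parallel loop written serially and then farmed out across cores with the standard-library multiprocessing module. The toy workload and function name are hypothetical, not from the talk.

```python
# Minimal sketch: serial vs. multicore formulations of the same computation.
from multiprocessing import Pool

def simulate_cell(i: int) -> float:
    """Hypothetical stand-in for one independent unit of scientific work."""
    return sum(x * x for x in range(i, i + 1000)) ** 0.5

if __name__ == "__main__":
    cells = range(10_000)

    # Serial formulation: one core, the historical programming model.
    serial = [simulate_cell(i) for i in cells]

    # Parallel formulation: same science, restructured so that independent
    # units can be distributed across all available cores.
    with Pool() as pool:
        parallel = pool.map(simulate_cell, cells)

    assert serial == parallel  # same answer, different execution model
```

The point of the sketch is that the science does not change, but the code must be restructured around independent units of work, which is exactly the programming-model burden the slide names.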
Five Crises (cont'd)
- Organization for Multidisciplinary Computational Science: "Universities must significantly change organizational structures: multidisciplinary & collaborative research are needed [for US] to remain competitive in global science." "Itself a discipline, computational science advances all science… inadequate/outmoded structures within the Federal government and the academy do not effectively support this critical multidisciplinary field."
- Education: The CI environment is running away from us! How do we develop a workforce to work effectively in this world? How do we help universities transition?
What is Needed? An ecosystem, not components…
NSF-wide CI Framework for 21st Century Science & Engineering
People, Sustainability, Innovation, Integration
- Expertise: research and scholarship; education; learning and workforce development; interoperability and operations; cyberscience
- Discovery, collaboration, education
- Maintainability, sustainability, and extensibility

Cyberinfrastructure Ecosystem
- Software: applications, middleware; software development and support; cybersecurity: access, authorization, authentication
- Networking: campus, national, and international networks; research and experimental networks; end-to-end throughput; cybersecurity
- Data: databases, data repositories; collections and libraries; data access; storage, navigation, management, and mining tools; curation (a provenance sketch follows this list)
- Organizations: universities, schools; government labs, agencies; research and medical centers; libraries, museums; virtual organizations; communities
- Computational resources: supercomputers; clouds, grids, clusters; visualization; compute services; data centers
- Scientific instruments: large facilities, MREFCs, telescopes; colliders, shake tables; sensor arrays (ocean, environment, weather, buildings, climate, etc.)
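As an illustration of the data access/curation item above, here is a minimal Python sketch that records provenance metadata alongside a derived data product so it can be preserved, shared, and audited later. The record fields, file layout, and names are assumptions for illustration, not a DataNet or NSF format.

```python
# Minimal sketch: write a data product plus a sidecar provenance record.
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path

def write_with_provenance(out_path: str, payload: bytes,
                          inputs: list, step: str) -> None:
    Path(out_path).write_bytes(payload)
    record = {
        "output": out_path,
        "sha256": hashlib.sha256(payload).hexdigest(),  # integrity check
        "inputs": inputs,            # upstream files this product derives from
        "processing_step": step,     # the transformation that produced it
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "host": platform.node(),
    }
    Path(out_path + ".prov.json").write_text(json.dumps(record, indent=2))

# Hypothetical usage: a calibrated spectrum derived from a raw sensor dump.
write_with_provenance("spectrum.dat", b"1.0 2.0 3.0\n",
                      inputs=["raw_sensor_dump.bin"], step="calibrate+bin")
```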
CF21: A Goal of Virtual Proximity
- As though you are one with your resources
- Continue to collapse the barrier of distance and remove geographic location as an issue
- ALL resources (including people) are virtually present, accessible, and secure: instruments, HPC, vis, data, software, expertise, VOs, etc.
End-to-End Integrated Cyberinfrastructure
Science, throughput, and usefulness become the metrics
Driving Forces
- Need to support the efficient pursuit of S&E: multi-domain, multi-disciplinary, multi-location; leading-edge CI network capabilities; seamless integration
- Need to connect researcher to resource: access to major scientific resources and instruments; CI resource availability, at speed and in real time (HPC, MREFC, data centers, vis centers, clouds, etc.)
- Campus environment, including intra-campus; state, regional, national, and international network and infrastructure transparency
CF21: Cyberinfrastructure Framework…
- High-end computation, data, visualization, and networks for transformative science; facilities/centers as hubs of innovation
- MREFCs and collaborations, including large-scale NSF collaborative facilities and international partners
- Software, tools, science applications, and VOs critical to science, integrally connected to instruments
- Campuses fundamentally linked end-to-end; clouds, loosely coupled campus services, and policy to support them
- People: a comprehensive approach to workforce development for 21st century science and engineering
ACCI Task Forces
- Campus Bridging: Craig Stewart, IU (BIO)
- Computing: Thomas Zacharia, ORNL/UTK (DOE)
- Grand Challenge Communities/VOs: Tinsley Oden, Austin (ENG)
- Education & Workforce: Alex Ramirez, CEOSE
- Software: David Keyes, Columbia/KAUST (MPS)
- Data & Viz: Shenda Baker, Harvey Mudd (MPS); Tony Hey (CISE)

Status:
- Completion by end of year
- Advising NSF
- Conducting workshop(s)
- Recommendations: input to NSF informs CF21 programs and the 2011-12 CI Vision Plan
CF21 Plan
- Existing task forces: recommendations and input
- Need to establish a CF21 group at NSF, with a CI lead from each Directorate
- Creation of the CF21 document is the goal; early draft by January 2011
- CF21 Colloquium (C2) – in process
- Need to have a budget-building exercise for CF21 for FY12
SOFTWARE
Software is Critical CI – Unprecedented Complexity, Challenges
- Software is essential to every aspect of CI – “the glue”: drivers, middleware, runtime, programming systems/tools, applications, …
- This software is different… in its nature, who builds it, how it is built, where it runs, its lifetime, etc.
- Software crisis? Software complexity is impeding the use of CI
  - Science apps have 10^3 to 10^6+ lines, and they have bugs
  - Developed over decades – long lifecycles (~35 years)
- Software/systems design/engineering issues: emergent rather than by design
- Quality of science is in question
Software Grand Challenge
- SW as the modality for CF21 and computational science in the 21st century
- Sustainable SW as a CI resource: What SW to sustain? How to sustain it?
- Fundamental grand challenge: robust, sustainable, and manageable software at CI scale: repeatability, reliability, performance, usability, energy efficiency, … (a repeatability sketch follows this list)
- Sustainability, manageability, etc., are NOT add-ons; they have to be integrated into the design
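One concrete reading of "repeatability" above: pin every source of randomness so a stochastic computation can be re-run and produce identical results. A minimal Python sketch, with a hypothetical Monte Carlo toy standing in for a science code:

```python
# Minimal sketch: an explicit, recorded seed makes a stochastic run repeatable.
import random

def estimate_pi(n_samples: int, seed: int) -> float:
    rng = random.Random(seed)  # local generator, no hidden global state
    hits = sum(
        1 for _ in range(n_samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * hits / n_samples

# Same seed in, same estimate out, across runs (and machines).
assert estimate_pi(100_000, seed=42) == estimate_pi(100_000, seed=42)
```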
Many complex aspects…
- Building the right software: application involvement; understanding requirements, scales, types of software, and target user communities
- Building software right: teams, reward structures, processes, metrics, verification/testing (a test sketch follows this list)
- Protecting investments: active management, sustainability, leverage/reuse, ownership, business models
- Building trust: the user community must be able to depend on the availability of a robust and reliable software infrastructure!
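To illustrate the verification/testing item above, a minimal sketch of unit tests for a numerical kernel, written to run under pytest (plain asserts in test_ functions). The trapezoid integrator is a hypothetical stand-in for a science kernel.

```python
# Minimal sketch: verification tests for a numerical kernel.
import math

def trapezoid(f, a: float, b: float, n: int) -> float:
    """Composite trapezoid rule on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * total

def test_exact_on_linear():
    # The rule is exact for linear integrands: integral of 2x over [0, 1] is 1.
    assert math.isclose(trapezoid(lambda x: 2 * x, 0.0, 1.0, 4), 1.0)

def test_converges_on_sine():
    # Verification against a known answer: integral of sin over [0, pi] is 2.
    assert abs(trapezoid(math.sin, 0.0, math.pi, 1000) - 2.0) < 1e-5
```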
Software Infrastructure for Sustained Innovation (SI2) – Mechanisms
- Create a software ecosystem that scales from individuals or small groups of software innovators to large hubs of software excellence
- Three interlocking levels of funding
- Focus on innovation; focus on sustainability
Sustained Long-Term Investment in Software
- Transform innovations into sustainable software that is an integral part of a comprehensive cyberinfrastructure: robust, efficient, resilient, repeatable, manageable, sustainable, community-based, etc.
- Catalyze and nurture multidisciplinary software as a symbiotic “process” with ongoing evolution: domain and computational scientists, software technologists
- Address all aspects, layers, and phases of software
- Systematic approaches:
  - Theory validated by empirical trials
  - Tools that embody and support processes
  - Metrics, validation mechanisms, governance structures
- Amortized over large (global) user communities
- Support for maintenance and user support
Sustained Long-Term Investment in Software
- Significant multiscale, long-term program: envisions $200-300M over a decade; connected institutes, teams, and investigators; integrated into the CF21 framework
- Many individuals with short-term grants
- Numerous teams of scientists and computational and computer scientists with longer-term grants
- 3-6 centers, 5+5 years, for critical mass and sustainability
[Diagram: the surrounding CI ecosystem – sensor nets; experiments/instruments; data archives (DataNet); infrastructure (XD); visualization/analytics]
Software Infrastructure for Sustained Innovation (SI2): Details
- Letters of Intent (required) – May 10, 2010: title, team, synopsis (science/engineering drivers, target user community, specific software elements)
- Full proposals – June 14, 2010:
  - SSE: ~2 PIs + 2 GAs, 3 years
  - SSI: ~3-4 PIs, 3-4 GAs, 1-2 senior personnel, 3-5 years
  - No S2I2 in FY10
- Note: proposal preparation instructions, supplementary document requirements, and additional review criteria
- Please do read the solicitation!
- Email questions to [email protected]
High Performance Computing
HPC Task Force General Questions
- Access to advanced computing resources: 2011-2015 time frame
- Applications development and support: development, maintenance, and support
- Computer science and engineering: innovations to advance development and use
- Integration of research and education: pre-college through post-graduate
- Training: preparation of the scientific workforce
Gathering Community Input On:
- How do we best address sustainability and user requirements in HPC?
- Revisions to the current acquisition model?
- What is the proper balance between production and experimental systems?
- How to build on TG and XD for integration and advanced services in the future: CF21
- Alternate models of computing: clouds, grids, etc.; commercial providers? pay per service?
- Exascale and beyond: already jointly sponsoring workshops with DOE on advanced software for exascale; how do we advance the applications community for this?; partnerships with DOE and other agencies moving forward
Workshop #1, Dec 2010
- Held in Arlington, VA
- Position papers and report available at: http://www.nics.tennessee.edu/workshop
- Resultant set of recommendations being used to help shape programs and longer-term strategy
- Follow-on workshops on applications and software
- Two major recommendations
Recommendations
By 2015–2016, academic researchers should have access to a rich mix of HPC systems that:
- deliver sustained performance of 20–100 petaflops on a broad range of science and engineering codes (a rough scale note follows this list);
- are integrated into a comprehensive, national cyberinfrastructure environment; and
- are supported at national, regional, and/or campus levels.
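For scale: sustained performance is peak performance discounted by an application efficiency factor. The 5-10% efficiency range below is an illustrative assumption about full science applications, not a figure from the report:

$$P_{\text{sustained}} = \epsilon \, P_{\text{peak}}, \qquad \epsilon \approx 0.05\text{--}0.10 \;\Rightarrow\; P_{\text{sustained}} = 20\ \text{PF} \text{ requires } P_{\text{peak}} \approx 200\text{--}400\ \text{PF}.$$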
Recommendations (cont'd)
To sustain and promote the stability of resources, NSF should direct the evolution of its supercomputing program in a sustainable way, allowing researchers and HPC centers to select the best value in computational and data platforms and enabling centers to offer continuous service to the community.

NSF should:
- Commit to stable and sustained funding for HPC centers, allowing them to recruit and develop the expertise needed to maximize the potential offered by NSF’s hardware investments. Rigorous review and oversight processes can be developed and implemented to provide assurance that centers meet NSF expectations for performance.
- Encourage HPC centers to build long-term relationships with vendors, thus providing researchers with the benefits of a planned road map for several generations of chip technology upgrades and with continuity in architecture and software environments. Results-oriented acquisition strategies can be applied to ensure that vendor performance meets center and NSF needs.
End