ASCR-BER Requirements • March 29, 2016 Department of Energy • Biological and Environmental Research
Office of Science
Office of Biological and Environmental Research
May 15, 2017
BASC: Next Generation Computing: Needs and Opportunities for Weather, Climate, and Atmospheric Sciences
Department of Energy, Office of Science, Office of Biological and Environmental Research
Gary Geernaert, Division Director, Climate and Environmental Sciences
Dorothy Koch, Program Manager, Earth System Modeling
Department of Energy - Office of Science
Steve Binkley Associate Director
Deputy Director
• High Energy Physics
• Biological and Environmental Research (BER): Sharlene Weatherwax, Associate Director
  - Biological Systems Science
  - Climate and Environmental Sciences: Gary Geernaert, Director
• Basic Energy Sciences
• Fusion Energy Sciences
• Advanced Scientific Computing Research (ASCR): Barb Helland, Associate Director
• Nuclear Physics
Current and planned HPC-intensive activities related to DOE climate and atmospheric sciences
• LASSO: Leveraging the Southern Great Plains ARM site with Large-Eddy Simulation
• ACME: High-resolution (25 km) coupled Earth System Modeling targeting DOE Leadership Computing Facilities
• An end-to-end approach to exascale systems: from libraries to algorithms to applications to hardware. Includes next-generation ACME computing
• ASCR (Computing)-BER (Climate) partnership program
• Science and model development projects
• IDEAS: Interoperable Design of Extreme-scale Applications Software
BASC • May 2017 Department of Energy • Biological and Environmental Research
Accelerated Climate Model for Energy
ACME is a modeling project launched by DOE in July 2014 to develop a branch of the CESM to:
❖ Advance a set of science questions that demand major computational power and advanced software: “water cycle”, “biogeochemistry” and “cryosphere-ocean”
❖ Provide high-resolution coupled climate simulations (15-25 km), with regionally refined grids <10 km
❖ Focus on a near-term time horizon: 1970-2050
❖ Design codes to effectively utilize the next and successive generations of DOE Leadership Class computers, both hybrid and multi-core, through exascale
V1 (version 1) is currently in production simulation (to be released in late 2017).
V2-V3 are under development (release in 2020-2023).
ACME will be an open-source Earth system model ready to run on DOE computers, including DOE’s NERSC.
Examples of current (v1) computational performance, in simulated years per day (SYPD):
- Edison: 12 SYPD (100 km coupled)
- Cori KNL: 6 SYPD (100 km coupled), 3 SYPD (25 km atmosphere-only)
- Titan: 1.4 SYPD (25 km)
- Mira: 0.33 SYPD (25 km)
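SYPD figures translate directly into wall-clock time: a run of N simulated years needs N / SYPD days of continuous running. A minimal sketch using the rates quoted above; the 100-year run length and the helper name `wallclock_days` are illustrative assumptions, not part of the presentation, and queue waits and allocation limits are ignored.

```python
# Throughput figures (simulated years per wall-clock day) quoted on the slide.
SYPD = {
    "Edison, 100 km coupled": 12.0,
    "Cori KNL, 100 km coupled": 6.0,
    "Cori KNL, 25 km atmosphere-only": 3.0,
    "Titan, 25 km": 1.4,
    "Mira, 25 km": 0.33,
}


def wallclock_days(simulated_years: float, sypd: float) -> float:
    """Days of continuous running needed to simulate the given number of years."""
    if sypd <= 0:
        raise ValueError("sypd must be positive")
    return simulated_years / sypd


if __name__ == "__main__":
    run_length = 100.0  # hypothetical run length in simulated years
    for config, rate in SYPD.items():
        print(f"{config}: {wallclock_days(run_length, rate):.1f} days")
```

At the 25 km rates, a century-long run occupies a machine for months, which is why throughput rather than peak speed is the figure of merit here.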
Future directions: 5-10 years
Earth System Modeling (ACME): non-hydrostatic atmosphere (down to 100 m with regional refinement); eddy-resolving ocean (down to 100 m for coastal modeling and inundation); fully integrated dynamic land ice; above- and below-ground hydrology and BGC; dynamic vegetation; sub-grid orography
Model analysis: calibration, testing and analysis on very large ensembles; better model-observation integration methods; embedded UQ and diagnostics
“Integrated” modeling: interoperable framework that includes ESM, IAM and IAV for various sectors (including energy), at appropriate scales and configurations to solve particular problems
Subsurface modeling: watershed and genome-enabled BGC models; community infrastructure for interoperable hydrology and BGC
LES modeling: integrates ARM data for model parameterization development and testing of remote retrievals
DOE’s Office of Science Computing Facilities and Programs
ASCR Computing Facilities:
• OLCF (hybrid, CPU-GPU)
  - Titan (27 PF Cray XK7 hybrid)
  - Summit (200 PF)
• ALCF (many-core)
  - Mira (10 PF IBM Blue Gene/Q)
  - Aurora (2018; 180 PF Knights Hill Xeon Phi)
• NERSC (many-core)
  - Edison (2.6 PF Cray XC40/30 Intel Xeon)
  - Cori (31 PF Cray XC40 Intel Xeon Phi KNL)
Programs for computer allocation awards:
• INCITE: open science competition; must effectively use the machine; ALCF and OLCF
  - 5.8 billion hours to 55 projects
• ALCC: DOE-Science-relevant awards, to all 3 systems
  - 3 billion processor hours to 49 projects
• ERCAP: DOE-SC Office program, to NERSC
  - 640 awards
Climate applications share and compete for these resources with the DOE and outside community.
Smaller dedicated resources are sometimes purchased for quick turn-around simulations.
Detailed specs for current and next systems
Exascale: ASCR facilities are undergoing an extensive procurement process
Applications like ACME are actively exploring heterogeneous computing in both its many-core (Cori/Theta) and GPU-accelerator (Titan) forms.
The DOE strategy is to target a wide cross-section of grand-challenge-scale computing applications in exascale design.
As part of this, ASCR has convened workshops and solicited input/reports from each of the Offices within the Office of Science:
• Advanced Scientific Computing Research
• Basic Energy Sciences
• Biological and Environmental Research
• Fusion Energy Sciences
• High Energy Physics
• Nuclear Physics
The 10-year milestone for DOE climate modeling is linked to the design and deployment of a DOE exascale computing facility in the mid-2020s.
Workshop in March 2016; report is 366 pp: http://exascaleage.org/ber/
Computational challenges to DOE climate modeling in the next decade
DOE climate modeling is committed to using DOE machines, which are challenging due to low-power, low-bandwidth trends in hardware.
An added challenge is to effectively use both many-core and hybrid architectures, while also preparing for an unknown exascale platform.
DOE projects have the significant advantage of co-location with facilities and other HPC-intensive research projects within the DOE Laboratory system. However, the projects do not have significant dedicated resources.
Being on the “bleeding edge” of computing is painful. The systems evolve quickly, and significant resources go toward ongoing updating of the codes.
The ACME project is committed to simulating the coupled climate system. Performance of each system (ocean, atmosphere and coupling) must be considered.
In addition to the “capability” computing, DOE also needs a good mid-size “capacity” resource that is efficient in data processing for model analysis.
Future “reality-check”, strategies
Reality:
Although ACME has substantial allocations on DOE machines, it can perform fewer than 200 years of coupled simulation at 25 km resolution in one calendar year.
At exascale, assuming the ability to simulate on a next-generation low-power machine, one could perform an ensemble at 25 km, or further increase resolution to 10 km and perform a 200-year simulation.
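The trade-off between an ensemble at 25 km and a single run at 10 km can be roughed out with a common rule of thumb, which is an assumption here and not a figure from the presentation: refining horizontal grid spacing by a factor r multiplies the number of grid columns by r² and, if the time step shrinks in proportion to the grid spacing, the total cost by roughly r³ (vertical levels held fixed). The function name and default exponent below are illustrative.

```python
def relative_cost(dx_coarse_km: float, dx_fine_km: float, exponent: float = 3.0) -> float:
    """Cost of the fine-grid run relative to the coarse-grid run.

    Assumes cost scales as (dx_coarse / dx_fine) ** exponent. The default
    exponent of 3 is a rule of thumb (two horizontal dimensions plus a
    proportionally shorter time step), not a measured ACME figure.
    """
    if dx_coarse_km <= 0 or dx_fine_km <= 0:
        raise ValueError("grid spacings must be positive")
    return (dx_coarse_km / dx_fine_km) ** exponent


if __name__ == "__main__":
    factor = relative_cost(25.0, 10.0)
    print(f"One 10 km run costs about {factor:.1f}x a 25 km run of equal length")
```

Under this assumption, one 10 km simulation costs about as much as 15 or 16 simulations at 25 km of the same length, which is the choice the slide describes.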
Strategies:
DOE has interests in new strategies for science/computation:
• Algorithms with high flop-to-memory ratio
• Algorithms with large sub-grid work for GPUs, like MMF
• Methods to get statistics from simulations (besides brute-force ensemble methods)
• Use of in-situ diagnostics to reduce I/O
• New/better methods to initialize the coupled system, to avoid long spin-up
• More extensive and invasive parallelization of code, e.g. task-based
• For portability: memory/pattern “abstractions” of code, use of portable libraries, programming models
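The flop-to-memory ratio in the first strategy is often called arithmetic intensity: floating-point operations per byte moved to or from memory. A minimal sketch comparing two textbook kernels, assuming 8-byte floats and idealized traffic in which each array is read or written exactly once; the function names are illustrative and the kernels are not taken from ACME.

```python
BYTES_PER_FLOAT = 8  # double precision


def axpy_intensity(n: int) -> float:
    """y = a*x + y on length-n vectors: 2n flops; read x, read y, write y."""
    flops = 2 * n
    bytes_moved = 3 * n * BYTES_PER_FLOAT
    return flops / bytes_moved


def matmul_intensity(n: int) -> float:
    """C = A @ B on n-by-n matrices: about 2n^3 flops; A, B, C each moved once."""
    flops = 2 * n ** 3
    bytes_moved = 3 * n * n * BYTES_PER_FLOAT
    return flops / bytes_moved


if __name__ == "__main__":
    print(f"axpy:   {axpy_intensity(1_000_000):.3f} flops/byte")
    print(f"matmul: {matmul_intensity(1200):.1f} flops/byte")
```

The vector update stays at 1/12 flop per byte regardless of size, so it is limited by memory bandwidth, while the matrix product's intensity grows with n. Low-bandwidth hardware favors kernels of the second kind.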
AXICCS workshop in Sep 2016; report is 228 pp: https://science.energy.gov/ber/community-resources/
Thank you!
BER: https://science.energy.gov/ber
ASCR: https://science.energy.gov/ascr
ASCR facilities: https://science.energy.gov/user-facilities/
ACME: https://climatemodeling.science.energy.gov/projects/accelerated-climate-modeling-energy