Applied Math Issues in FACETS
Speaker: Lois Curfman McInnes, ANL
Core: Alexander Pletzer, Tech-X
– John Cary, Johan Carlsson, Tech-X: Core solver
– Srinath Vadlamani, Tech-X: Turbulent flux computation via FMCFM
– Ammar Hakim, Mahmood Miah, Tech-X: FACETS infrastructure
– Allen Malony, Alan Morris, Sameer Shende, Paratools: Performance analysis
– Greg Hammett, PPPL: Suggesting stable time stepping schemes
– Alexei Pankin, Lehigh University: Providing core transport benchmark against ASTRA
Edge: Tom Rognlien, LLNL
– John Cary et al., Tech-X: FACETS integration
– Ron Cohen, LLNL: Edge physics, scripting
– Hong Zhang, ANL: Nonlinear solvers
– Satish Balay, ANL: Portability and systems issues via TOPS, no FACETS funding
– Maxim Umansky, LLNL: BOUT physics, no FACETS funding
– Sean Farley, LSU: Math grad student, summer 2008 at ANL + ongoing: BOUT/PETSc interface
– Mike McCourt, IIT and Cornell: Applied math grad student, summer 2007 at ANL: UEDGE/PETSc interface
Coupling: Don Estep, CSU
– Du Pham, Simon Tavener, CSU: Analysis of stability and accuracy issues in coupling
– Ron Cohen, Tom Rognlien, LLNL: Physics issues in coupling
FACETS SciDAC Review, May 14, 2009
Nonlinear PDEs pervade FACETS components
• Initial focus: Fully implicit Newton methods in
– Core (via new core solver, Tech-X)
– Edge (via UEDGE and BOUT, LLNL)
• Discussion emphasizes
– PDE representation of physics
– Parallelization and performance analysis
– Stability and accuracy issues in coupling
– Collaborations with SciDAC CETs and Institutes
• Future work
– Core-edge coupling as we move to implicit coupling
– Possibly kinetic models in edge physics via Edge Simulation Laboratory (ESL)
– Possibly wall and sources components
TOPS provides enabling technology to FACETS; FACETS motivates enhancements to TOPS
TOPS Overview
– TOPS (Towards Optimal Petascale Simulations) develops, demonstrates, and disseminates robust, quality-engineered solver software for high-performance computers
– TOPS institutions: ANL, LBNL, LLNL, SNL, Columbia U, Southern Methodist U, U of California - Berkeley, U of Colorado - Boulder, U of Texas - Austin
– PI: David Keyes, Columbia Univ.
– www.scidac.gov/math/TOPS.html
Figure labels: CS, Math, Applications, TOPS, FACETS (fusion)
Overall scope of TOPS
• Design and implementation of “solvers”
– Linear solvers: $Ax = b$
– Eigensolvers: $Ax = \lambda Bx$
– Nonlinear solvers (with sensitivity analysis): $F(x, p) = 0$
– Time integrators (with sensitivity analysis): $f(\dot{x}, x, t, p) = 0$
– Optimizers: $\min_u \phi(x, u)$ s.t. $F(x, u) = 0$, $u \ge 0$
• Software integration
• Performance optimization
Figure: diagram of the Optimizer, Sensitivity Analyzer, Time integrator, Nonlinear solver, Eigensolver, and Linear solver boxes; arrows indicate dependence. Package labels in the diagram: SUNDIALS, Trilinos; TAO, Trilinos; PARPACK, SuperLU, Trilinos; hypre, PETSc, SuperLU, Trilinos; PETSc, Trilinos; PETSc, SUNDIALS, Trilinos.
Primary emphasis of TOPS numerical software
Nonlinear PDEs in Core and Edge Components
Dominant computation of each can be expressed as nonlinear PDE: Solve F(u) = 0, where u represents the fully coupled vector of unknowns
Core: 1D conservation laws:
$$\frac{\partial q}{\partial t} + \nabla \cdot F = s$$
where q = {plasma density, electron energy density, ion energy density}
F = fluxes, including neoclassical diffusion, electron/ion temperature gradient induced turbulence, etc.
s = particle and heating sources and sinks
Challenges: highly nonlinear fluxes
Edge: 2D conservation laws: Continuity, momentum, and thermal energy equations for electrons and ions:
$$\frac{\partial n_{e,i}}{\partial t} + \nabla \cdot (n_{e,i} v_{e,i}) = S^{p}_{e,i}$$
where $n_{e,i}$ & $v_{e,i}$ are electron and ion densities and mean velocities
$$m_{e,i} n_{e,i} \frac{\partial v_{e,i}}{\partial t} + m_{e,i} n_{e,i} v_{e,i} \cdot \nabla v_{e,i} = \nabla p_{e,i} + q n_{e,i} (E + v_{e,i} \times B / c) - \nabla \cdot \Pi_{e,i} - R_{e,i} + S^{m}_{e,i}$$
where $m_{e,i}, p_{e,i}, T_{e,i}$ are masses, pressures, temperatures; $q, E, B$ are particle charge, electric & mag. fields; $\Pi_{e,i}, R_{e,i}, S^{m}_{e,i}$ are viscous tensors, thermal forces, source
$$\frac{3}{2} n_{e,i} \frac{\partial T_{e,i}}{\partial t} + \frac{3}{2} n_{e,i} v_{e,i} \cdot \nabla T_{e,i} + p_{e,i} \nabla \cdot v_{e,i} = -\nabla \cdot q_{e,i} - \Pi_{e,i} \cdot \nabla v_{e,i} + Q_{e,i}$$
where $q_{e,i}, Q_{e,i}$ are heat fluxes & volume heating terms
Also neutral gas equation
Challenges: extremely anisotropic transport, extremely strong nonlinearities, large range of spatial and temporal scales
FACETS/TOPS collaboration focuses on nonlinearly implicit methods
• Popular nonlinear solution approaches
– Explicit Methods
  • Splitting of coupled variables
    – Often by equation or by coordinate direction
    – Motivated by desire to solve complicated problems with limited computer resources
– Semi-Implicit Methods
  • Maintain some variable couplings
– Fully Implicit Methods
  • Maintain all variable couplings
  • For example, preconditioned Newton-Krylov methods
• Implicit algorithms have demonstrated efficient and scalable solution for many magnetic fusion energy problems
Newton-Krylov methods are efficient and robust
• Newton: Solve $F'(u^{l-1})\,\delta u^{l} = -F(u^{l-1})$; Update $u^{l} = u^{l-1} + \lambda\,\delta u^{l}$
• Krylov: Projection methods for solving linear systems, Ax=b, using the Krylov subspace $K_j = \mathrm{span}(r_0, A r_0, A^2 r_0, \ldots, A^{j-1} r_0)$
– Popular methods: GMRES, TFQMR, BiCGStab, CG, etc.
• Preconditioning: In practice, typically needed
– Transform Ax=b into an equivalent form: $B^{-1} A x = B^{-1} b$ or $(A B^{-1})(B x) = b$, where the inverse action of B approximates that of A, but at a smaller cost
• Matrix-free: Newton-like convergence without the cost of computing/storing the true Jacobian, F’(u)
– Krylov: Compute only Jacobian-vector products, F’(u) v
– Preconditioning: Typically use ‘cheaper’ approx. for F’(u) or its inverse action
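The Newton and Krylov steps above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the PETSc/SNES implementation used in FACETS: the helper names (`gmres`, `newton_krylov`), the finite-difference step size, the absence of a preconditioner, and the two-equation test system are all hypothetical choices made for the example. The Jacobian is never formed; GMRES only sees products F'(u) v approximated by a finite difference of F.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def gmres(matvec, b, tol=1e-10, max_iter=50):
    """Unrestarted, unpreconditioned GMRES for A x = b with x0 = 0."""
    n = len(b)
    beta = norm(b)
    if beta == 0.0:
        return [0.0] * n
    V = [[bi / beta for bi in b]]      # orthonormal Krylov basis vectors
    H, cs, sn = [], [], []             # rotated Hessenberg columns, Givens rotations
    g = [beta]                         # rotated right-hand side
    for j in range(min(max_iter, n)):
        w = matvec(V[j])
        h = []
        for i in range(j + 1):         # Arnoldi (modified Gram-Schmidt)
            hij = dot(w, V[i])
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, V[i])]
        hnext = norm(w)
        h.append(hnext)
        for i in range(j):             # apply earlier rotations to the new column
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        r = math.hypot(h[j], h[j + 1])
        c, s = h[j] / r, h[j + 1] / r
        cs.append(c)
        sn.append(s)
        h[j], h[j + 1] = r, 0.0
        g.append(-s * g[j])
        g[j] = c * g[j]
        H.append(h)
        if abs(g[j + 1]) <= tol * beta:
            break
        V.append([wk / hnext for wk in w])
    k = len(H)
    y = [0.0] * k
    for i in range(k - 1, -1, -1):     # back substitution on the triangular system
        y[i] = (g[i] - sum(H[m][i] * y[m] for m in range(i + 1, k))) / H[i][i]
    return [sum(y[m] * V[m][row] for m in range(k)) for row in range(n)]

def newton_krylov(F, u0, tol=1e-10, max_newton=20, lam=1.0):
    """Matrix-free Newton: solve F'(u) du = -F(u) with GMRES, then u <- u + lam*du."""
    u = list(u0)
    for _ in range(max_newton):
        Fu = F(u)
        if norm(Fu) < tol:
            break
        def jac_vec(v, u=u, Fu=Fu):
            # Finite-difference approximation of the product F'(u) v
            eps = 1e-7 * (1.0 + norm(u)) / max(norm(v), 1e-30)
            Fp = F([ui + eps * vi for ui, vi in zip(u, v)])
            return [(a - b) / eps for a, b in zip(Fp, Fu)]
        du = gmres(jac_vec, [-f for f in Fu])
        u = [ui + lam * di for ui, di in zip(u, du)]
    return u

def F(u):
    # Hypothetical test system with a root at (1, 2)
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]

sol = newton_krylov(F, [1.5, 1.5])
print(sol)
```

In a production setting the linear solve would normally be preconditioned, as the slide notes; the sketch omits that to keep the matrix-free idea visible.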
PETSc provides parallel Newton-Krylov solvers via SNES
• PETSc: Portable, Extensible Toolkit for Scientific computation
– www.mcs.anl.gov/petsc
– Targets parallel solution of large-scale PDE-based problems
• SNES: Scalable Nonlinear Equations Solvers
– Emphasizes Newton-Krylov methods
– Uses high-level abstractions for matrices, vectors, linear solvers
  • Easy to customize and extend
  • Supports matrix-free methods
  • Facilitates algorithmic experimentation
– Jacobians available via application, Finite Differences (FD) and Automatic Differentiation (AD)
Core and Edge components use PETSc flexibility via SNES
Solve F(u) = 0: Fully implicit matrix-free Newton-Krylov methods
$$F'(u^{l-1})\,\delta u^{l} = -F(u^{l-1}), \qquad u^{l} = u^{l-1} + \lambda\,\delta u^{l}$$
– Can choose from among a variety of algorithms and parallel data structures
– UEDGE now has access to many more parallel solver options
Figure: software layering of application code and PETSc code
– Application code: UEDGE + Core Solver Drivers (+ Timestepping + Parallel Partitioning); Application Initialization, Function Evaluation, Jacobian Evaluation, Post-Processing
– Jacobian: from the application or from PETSc (finite differencing)
– PETSc code: Nonlinear Solvers (SNES)
– Krylov Solvers: GMRES, TFQMR, BCGS, CGS, BCG, Others…
– Preconditioners: SSOR, ILU, B-Jacobi, ASM, Multigrid, Others…
– Matrices: AIJ, B-AIJ, Diagonal, Dense, Matrix-free, Others…
– Vectors: Sequential, Parallel, Others…
– Legend: Options originally used by UEDGE
Challenges in nonlinear solvers for core
• Plasma core is the region well inside the separatrix
• Transport along field lines >> perpendicular transport, leading to homogenization in poloidal direction
• Core satisfies set of 1D conservation laws:
$$\frac{\partial q}{\partial t} + \nabla \cdot F = s$$
q = {plasma density, electron energy density, ion energy density}
F = highly nonlinear fluxes including neoclassical diffusion, electron/ion temperature gradient induced turbulence, etc.
s = particle and heating sources and sinks
− New FACETS capability: get s from NUBEAM via core/sources coupling
Figure labels: separatrix, hot plasma core
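One way to see how the core conservation law becomes a system of the form F(u) = 0 is to discretize it implicitly. The block below is a sketch under assumptions not stated on the slide: backward Euler (BDF1) in time, a uniform finite-volume grid in a slab-like radial coordinate x with geometric factors omitted, and the symbols $R_i$, $\Delta t$, $\Delta x$, $q_i^{n}$, and $F_{i\pm 1/2}$ introduced only for illustration. The residual is written as R to avoid a clash with the flux F.

```latex
R_i\!\left(q^{n+1}\right)
  = \frac{q_i^{n+1} - q_i^{n}}{\Delta t}
  + \frac{F_{i+1/2}\!\left(q^{n+1}\right) - F_{i-1/2}\!\left(q^{n+1}\right)}{\Delta x}
  - s_i
  = 0,
  \qquad i = 1, \ldots, N
```

Collecting the $R_i$ for all cells and all components of q gives the nonlinear system handed to the Newton solver at each time step; because the fluxes are evaluated at the new time level, their strong nonlinearity enters the Jacobian directly.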
Implicit core solver applies nested iteration with parallel flux computation
• Extremely nonlinear fluxes lead to stiff profiles (can be numerically challenging)
– Implicit time stepping for stability
– Coarse-grain solution easier to find
– Nested iteration used to obtain fine-grain solution
– Flux computation typically very expensive, but problem dimension relatively small
– Parallelization of flux computation across “workers”; “manager” solves nonlinear equations on 1 proc using PETSc/SNES
• Runtime flexibility in assembly of time integrator (including any diagonally implicit Runge-Kutta scheme) for improved accuracy
Figure label: Nonlinear solve
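The nested iteration idea (solve on a coarse grid first, then use the interpolated result as the initial guess on the next finer grid) can be sketched as follows. The model problem, $-u'' + u^3 = s$ on (0, 1) with $u = 0$ at both ends, is a hypothetical stand-in chosen for brevity, not the FACETS core equations, and all function names are illustrative. Parallel flux evaluation by workers is not shown.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system; a, b, c are the sub-, main, and super-diagonals."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def residual(u, h, s):
    """Residual of (-u[i-1] + 2u[i] - u[i+1])/h^2 + u[i]^3 - s with zero boundary values."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append((-left + 2.0 * u[i] - right) / h ** 2 + u[i] ** 3 - s)
    return r

def newton(u, h, s, tol=1e-8, max_it=50):
    """Newton iteration with an exact tridiagonal Jacobian; returns (solution, iterations)."""
    u = list(u)
    for it in range(max_it):
        r = residual(u, h, s)
        if max(abs(x) for x in r) < tol:
            return u, it
        n = len(u)
        off = [-1.0 / h ** 2] * n
        diag = [2.0 / h ** 2 + 3.0 * ui ** 2 for ui in u]
        du = thomas(off, diag, off, [-x for x in r])
        u = [ui + di for ui, di in zip(u, du)]
    raise RuntimeError("Newton did not converge")

def prolong(uc):
    """Linear interpolation of interior values from N intervals to 2N intervals."""
    full = [0.0] + list(uc) + [0.0]
    fine = []
    for j in range(len(full) - 1):
        fine.append(full[j])
        fine.append(0.5 * (full[j] + full[j + 1]))
    fine.append(full[-1])
    return fine[1:-1]

def nested_iteration(n_coarse, levels, s):
    """Solve on successively refined grids, reusing each solution as the next guess."""
    N = n_coarse
    u = [0.0] * (N - 1)
    for level in range(levels):
        u, its = newton(u, 1.0 / N, s)
        if level < levels - 1:
            u = prolong(u)
            N *= 2
    return u, its, N

s = 50.0
u_nested, its_nested, N = nested_iteration(8, 4, s)
u_cold, its_cold = newton([0.0] * (N - 1), 1.0 / N, s)
print("fine-grid Newton iterations: nested =", its_nested, ", cold start =", its_cold)
```

The comparison at the end contrasts the Newton iteration count on the finest grid when starting from the interpolated coarse solution with a start from zero. Most of the nested work happens on small grids, which matters when each residual evaluation involves an expensive flux computation.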
Flexibility of FACETS framework allows users to explore time stepping schemes with no change to source code
• Nested iteration improves robustness of nonlinear solver
• Explicit method is unstable
• Crank-Nicolson is marginally stable
• Use BDF1 for stability and accuracy
• Other schemes, e.g., various IMEX SSP schemes, can be coded at runtime
Figure: ion temperature vs. radial coordinate; annotations: “Stable to ETG modes”, “Sharp kink develops”, “(unstable)”
Ref: A. Pletzer et al., "Benchmarking the parallel FACETS core solver," Poster presented at the 50th Annual Meeting of the Division of Plasma Physics, Dallas, TX, November 17-21, 2008.
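The stability ranking of the three schemes can be illustrated on the linear scalar test equation $dq/dt = -\lambda q$, which is an assumption made here for illustration and not the nonlinear FACETS core model. For each scheme the sketch evaluates the per-step amplification factor as a function of $z = \lambda \Delta t$; a magnitude above 1 means growth, and a magnitude close to 1 for large z means stiff modes are barely damped.

```python
def amplification(scheme, z):
    """Per-step growth factor for dq/dt = -lam*q, with z = lam*dt >= 0."""
    if scheme == "explicit_euler":
        return 1.0 - z
    if scheme == "crank_nicolson":
        return (1.0 - 0.5 * z) / (1.0 + 0.5 * z)
    if scheme == "bdf1":
        return 1.0 / (1.0 + z)
    raise ValueError("unknown scheme: " + scheme)

schemes = ("explicit_euler", "crank_nicolson", "bdf1")
for z in (0.5, 10.0, 1000.0):
    factors = {name: amplification(name, z) for name in schemes}
    print("z =", z, factors)
```

For stiff modes (large z) explicit Euler amplifies, Crank-Nicolson approaches -1 so the mode oscillates with almost no damping, and BDF1 damps strongly, which is consistent with the qualitative behavior listed on the slide.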
Participation of Paratools identified performance bottleneck in core solver
Paratools (A. Malony et al.), affiliated with the SciDAC Performance Engineering Research Institute (PERI)
• Load imbalance responsible for lack of scalability at high processor count (128)
• Also, careful profiling identifies redundant flux computation at low processor count (8)
Challenges in nonlinear solvers for edge
• UEDGE Issues
– Strong nonlinearities
– Parallel Jacobian computations
• UEDGE Features
– Multispecies plasma; variables $n_{i,e}$, $u_{\parallel i,e}$, $T_{i,e}$ for particle density, parallel momentum, and energy balances
– Reduced Navier-Stokes or Monte Carlo neutrals
– Multi-step ionization and recombination
– Finite volume discretization; non-orthogonal mesh
– Steady-state or time dependent
More complete parallel Jacobian data enables robust solution for problems with strong nonlinearities
• New capability: Computing parallel Jacobian matrix using matrix coloring for finite differences
– More complete parallel Jacobian data enables more robust parallel preconditioners
• Impact
– Enables inclusion of neutral gas equation (difficult for highly anisotropic mesh, not possible in prior parallel UEDGE approach)
– Useful for cross-field drift cases
• 8 proc: Matrix-free Newton w. GMRES: Block Jacobi stagnates; complete Jacobian data enables convergence
• 5 equations: ion density, ion velocity, gas density diffusion, electron temp, ion temp
Figure: UEDGE parallel partitioning (axis: poloidal distance); labels: “Previous parallel UEDGE Jacobian (Block Jacobi only)”, “Recent progress: Complete parallel Jacobian data”, “Missing Jacobian elements”
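The idea behind matrix coloring for finite-difference Jacobians can be sketched as follows: columns that never share a nonzero row receive the same color, and all columns of one color are perturbed together, so a sparse Jacobian needs only one extra function evaluation per color. This is a serial, pure-Python illustration with a simple greedy coloring; the function names and the tridiagonal test function are hypothetical, and it is not the PETSc or UEDGE implementation.

```python
def color_columns(sparsity, n):
    """Greedy coloring; sparsity[i] is the set of columns with nonzeros in row i."""
    conflicts = [set() for _ in range(n)]
    for cols in sparsity:
        for j in cols:
            conflicts[j].update(cols - {j})   # columns sharing a row must differ
    colors = [-1] * n
    for j in range(n):
        used = {colors[k] for k in conflicts[j] if colors[k] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[j] = c
    return colors

def fd_jacobian_colored(F, u, sparsity, eps=1e-7):
    """Finite-difference Jacobian using one F evaluation per color."""
    n = len(u)
    colors = color_columns(sparsity, n)
    ncolors = max(colors) + 1
    F0 = F(u)
    J = [dict() for _ in range(n)]            # sparse rows: J[i][j]
    for c in range(ncolors):
        up = [ui + eps if colors[j] == c else ui for j, ui in enumerate(u)]
        Fp = F(up)
        for i in range(n):
            for j in sparsity[i]:
                if colors[j] == c:            # at most one such column per row
                    J[i][j] = (Fp[i] - F0[i]) / eps
    return J, ncolors

def F(u):
    # Hypothetical tridiagonal test function: 2u_i - u_{i-1} - u_{i+1} + u_i^2
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * u[i] - left - right + u[i] ** 2)
    return out

n = 10
sparsity = [{j for j in (i - 1, i, i + 1) if 0 <= j < n} for i in range(n)]
u = [0.1 * i for i in range(n)]
J, ncolors = fd_jacobian_colored(F, u, sparsity)
print("colors used:", ncolors)
```

For this tridiagonal pattern the number of colors does not grow with n, so the cost of the Jacobian in function evaluations stays fixed as the mesh is refined.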
Computational experiments explore efficient and robust edge solvers
Matrix-free Newton with GMRES, 8-proc case for LU preconditioner:
• 57% time: UEDGE parallel setup (17 sec)
• 43% time: parallel nonlinear solver (13 sec)
  – 8%: Create Jacobian data structure, determine parallel coloring, scaling
  – 6%: Compute Jacobian: FD approx via coloring, including 40 f(u) computations
  – 4%: Compute f(u) for RHS + line search
  – 25%: Linearized Newton solve: GMRES/LU via MUMPS (hold Jacobian/PC fixed for 5 Newton iterations)
Problem size: 24,576 (128x64 mesh w. 3 unknowns per mesh point)
Computational environment: Jazz @ ANL: Myrinet2000 network, 2.4 GHz Pentium Xeon procs with 1-2 GB of RAM
New work with BOUT uses both SUNDIALS (integrators) and PETSc (preconditioners)
BOUT (BOUndary Turbulence), LLNL
• Motivation and physics
– Radial transport driven by plasma turbulence; BOUT (C++) to provide fundamental edge model
  • 2D UEDGE approx. turbulent diffusion
  • 3D BOUT models turbulence in detail
– Ion and electron fluids; electromagnetic
– Full tokamak edge cross-section
• Numerics and tools
– Finite-difference; 2D parallel partitioning
– Time dep; implicit PVODE/CVODE
– Can couple turbulent fluxes to UEDGE
• Current status within FACETS
– Parallel BOUT/PETSc/SUNDIALS verified against original BOUT
– Transitioning to BOUT++
– Experimenting with preconditioners
Figure: BOUT edge density turbulence, $\delta n_i / n_i$
Preliminary investigation of model problems reveals stability issues arising from coupling
Explicit coupling
• Implicit Euler for each component solve
• “Nonoverlapping” coupling strategy
• 512 cells in each component
Simple model problem
Weak instability
• There is a weak instability for equal diffusion constants
Numerical analysis tasks over the next year
• Devise and analyze a sequence of model problems
– The model problems should have increasing complexity
  • Two coupled heat equations in one dimension with various coupling strategies
  • Coupled one-dimensional and two-dimensional heat equations
  • Add strong inhomogeneous behavior “parallel” to the interface boundary
  • Add complications: rapid changes in diffusion in the interior, nonlinear diffusion, multirate time integration
– Conduct numerical studies using “manufactured” solutions with realistic behavior for various coupling strategies
– Carry out rigorous stability analysis for various coupling strategies and general solutions
– Carry out analogous tests for FACETS codes
• Extend a posteriori analysis techniques to finite volume methods for coupled problems
– Apply to nonlinear problems with realistic discretizations by computing stability
FACETS motivates new PETSc capabilities that benefit the general community
• New features included in Dec 2008 release of PETSc-3.0
– SNES: limit Newton updates based on application-defined criteria for maximum allowable step
  • Needed by UEDGE
– MatFD: parallel interface to matrix coloring for sparse finite difference Jacobian estimation
  • Needed by UEDGE
• New research: FACETS core-edge coupling inspires support for strong coupling between models in nonlinear solvers
– multi-model algebraic system specification
– multi-model algebraic system solution
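Limiting a Newton update by an application-defined criterion can be illustrated with a simple rule: scale the step so that no unknown changes by more than a chosen fraction of its current magnitude. The criterion, the function name `limit_step`, and its parameters are hypothetical and only illustrate the concept; they do not reproduce the PETSc interface or the rule UEDGE actually applies.

```python
def limit_step(u, du, max_rel_change=0.2, floor=1e-12):
    """Return a factor lam in (0, 1] so that |lam*du_i| <= max_rel_change*|u_i| for all i."""
    lam = 1.0
    for ui, di in zip(u, du):
        allowed = max_rel_change * max(abs(ui), floor)   # largest permitted change
        if abs(di) * lam > allowed:
            lam = allowed / abs(di)
    return lam

u = [1.0, 2.0]
du = [0.5, -0.1]                 # full Newton step would change u[0] by 50%
lam = limit_step(u, du)
u_new = [ui + lam * di for ui, di in zip(u, du)]
print(lam, u_new)
```

A rule of this kind can keep physically positive quantities such as densities and temperatures from being driven negative by an overly aggressive Newton step.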
FACETS/TOPS work inspires new research for SciDAC CS/math teams
• General Challenge: How to make sound choices during runtime among available implementations and parameters, suitably compromising among accuracy, performance, algorithmic robustness, etc.
• FACETS Challenge: How to select and parameterize preconditioned Newton-Krylov algorithms at runtime based on problem instance and computational environment?
• Research in Computational Quality of Service (CQoS)
– Goal: Develop general-purpose infrastructure for dynamic component adaptivity, i.e., composing, substituting, and reconfiguring running component applications in response to changing conditions
– Collaboration among SciDAC math/CS teams
  • Center for Technology for Advanced Scientific Component Software (TASCS), Paratools, Performance Engineering Research Institute (PERI), and TOPS
– FACETS-specific capabilities can leverage this infrastructure
FACETS collaborations on ‘solvers’ with SciDAC math/CS teams & CSU are essential
• TOPS, CSU, Paratools, PERI, and TASCS provide enabling technology to FACETS
– TOPS: Parallel solvers via numerical libraries
– CSU: Insights into stability/accuracy in coupling
– PERI/Paratools: Performance analysis and tuning
– TASCS: Component technology (ref: T. Epperly)
• FACETS motivates new work by CSU, TOPS, Paratools, PERI, and TASCS
– New CSU research on stability & accuracy issues
– New TOPS library features + algorithmic research
– New capabilities in TASCS/PERI/Paratools for CQoS