Page 1
A dynamic parallel coupler
E-Mail : [email protected]
URL: http://www.cerfacs.fr/~palm
Andrea Piacentini Rennes - November 2007
Page 2
Overview
• The genesis of the PALM project
• Code coupling issues
• PALM main features
• How to set up a PALM coupled application
• Some PALM applications
Page 3
The genesis of the PALM project
The PALM project originated in 1996 with the operational ocean forecasting project MERCATOR.
A data assimilation suite can be designed and implemented as a coupling.
The different tasks (running a forecast, applying the observation operator, computing the misfit, approximating the error statistics, inverting matrices, minimizing a cost function and so on) are thought of as independent pieces of code to be assembled within a portable, flexible and efficient framework.
Page 4
The genesis of the PALM project
The PALM Team at CERFACS was in charge of the development of the software backbone of the MERCATOR project.
First, a fully featured prototype was implemented using the technology available at the end of the 1990s. At the end of this phase, the first version of the coupler was released under the name PALM_RESEARCH.
Based on the expertise developed on the prototype, the final MPI-2 based version of PALM was designed and implemented. It is currently released under the name PALM_MP.
Thanks to the flexibility of the approach, the non-specific formalism used to describe the coupling interfaces and the portability of the software, PALM established itself as a natural choice for all sorts of dynamic parallel coupling and code assembling projects.
Page 6
Code coupling issues
Why couple scientific computing codes?
• To model a system as a whole
- Ocean - atmosphere
- Fluid - structure
Example: climate coupled modelling, with flux exchanges between the ocean (NEMO) and the atmosphere (ARPEGE)
• To set up new applications from existing codes
- Parallel post-processing
- Optimisation: algorithm on top of a model (search for the optimal position of the intakes in a combustion chamber)
• To build modular applications by assembling their elementary components (data assimilation…)
- Split the problem in order to reach a better task scheduling, taking advantage of the intrinsic parallelism of the application
Page 7
Code coupling issues
[Diagram: Code 1 and Code 2 exchanging data]
What does it imply?
- Manage executions (serial or parallel)
- Drive the information exchanges between codes
Commitments:
- Easy set-up: non-intrusive interfaces
- Efficiency: no loss of performance
- Portability: standard technical solutions
- Flexibility: test different configurations with a few changes
- Genericity: reuse components in other couplings
Page 8
First solution: code merging
prog1 & prog2
Program prog2
Subroutine sub_prog2(in,out)
…
end
Program prog1
…
Call sub_prog2(in, out)
…
end
• Very efficient in computing time (data exchange by reference, …)
BUT:
• Integration problems (the independence of the codes is not respected): common blocks, logical units, compilation options, languages, …
• No flexibility (problems in maintenance, evolution, …)
• Potential memory waste
• Task parallelism cannot be implemented
N.B. This is merging and not coupling (parallelism not taken into account)
Code coupling issues: Implementation
Page 9
Second solution: use a communication mechanism (PVM, MPI, CORBA, UNIX pipes, files, …)
Prog1 Prog2
Program prog2
…
Call XXX_recv(prog1, data)
end
Program prog1
…
Call XXX_send(prog2, data)
end
• More or less efficient depending on the mechanism
BUT:
• Coupling is not generic
• Good experience in parallel computing required
• Lack of flexibility (parallelism, interpolation, …)
• Not always portable depending on the chosen mechanism
• Too complex with more than two codes and many exchanges
Code coupling issues: Implementation
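The hard-wired exchange of this second solution can be sketched in a few lines. This is only an illustration in Python, with threads and an in-memory queue standing in for the real communication mechanism; `xxx_send`, `xxx_recv` and the `channels` table are hypothetical stand-ins for the XXX_send / XXX_recv placeholders of the slide, not calls from any real library.

```python
import queue
import threading

# One channel per (source, destination) pair, fixed in advance (assumption:
# an in-memory queue plays the role of PVM, MPI, pipes, files, ...).
channels = {("prog1", "prog2"): queue.Queue()}

def xxx_send(src, dest, data):
    """Hypothetical stand-in for XXX_send: the sender names the receiver."""
    channels[(src, dest)].put(data)

def xxx_recv(src, dest, timeout=5.0):
    """Hypothetical stand-in for XXX_recv: the receiver names the sender."""
    return channels[(src, dest)].get(timeout=timeout)

received = []

def prog1():
    field = [1.0, 2.0, 3.0]
    xxx_send("prog1", "prog2", field)   # prog1 must know that prog2 exists

def prog2():
    received.append(xxx_recv("prog1", "prog2"))  # prog2 must know prog1

threads = [threading.Thread(target=prog2), threading.Thread(target=prog1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each program carries the name of its partner in its own source, which is why this kind of coupling is not generic: adding a third code means editing both programs.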
Page 10
Third solution: choose a coupler (e.g. OASIS, PALM)
• The coupler starts the executables in sequence or in parallel
• Data exchanges are invoked by general primitives PUT/GET
• The coupler uses a portable and efficient communication protocol (MPI)
• The coupler provides interpolation tools, data redistributions, …
Huge gain in ease of integration, performance, maintenance and evolution of the coupled application.
[Diagram: Prog1 and Prog2 connected through PALM]
Code coupling issues: Implementation
Page 12
PALM main features
First function of the coupler:
Managing the components execution
Page 13
Spawning and execution
Static coupling (classical couplers)
• Both codes are started at the beginning of the simulation and exit together at the end of the application
• Example: Ocean/Atmosphere coupling in climate modelling
Dynamic coupling (PALM coupler)
• Parallel or serial executions
• Loops and conditional executions
• Complex algorithms can be easily and efficiently implemented
• Dynamic resources management (processors and memory)
[Diagram: along the time axis, static coupling runs Code1 and Code2 side by side; dynamic coupling launches Code1, Code2, Code3 and Code4 on demand, inside a loop and under a condition]
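As a rough illustration of dynamic coupling, the sketch below launches toy components on demand inside a loop and under a condition. It is plain Python with a thread pool, not PALM: the unit names, the `threshold` test and the `run_dynamic_coupling` function are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def run_dynamic_coupling(n_steps, threshold):
    """Toy driver: returns the list of (unit name, step) launches."""
    launched = []

    def unit(name, step):
        # Stand-in for a real component; it only records that it ran.
        launched.append((name, step))
        return step

    with ThreadPoolExecutor(max_workers=2) as pool:
        unit("code1", 0)                      # started once, before the loop
        for step in range(1, n_steps + 1):    # loop over time
            # code2 and code3 run concurrently within each iteration
            futures = [pool.submit(unit, "code2", step),
                       pool.submit(unit, "code3", step)]
            results = [f.result() for f in futures]
            if sum(results) >= threshold:     # conditional execution
                unit("code4", step)           # launched only when needed
    return launched

launched = run_dynamic_coupling(n_steps=3, threshold=4)
```

In a static coupling every component would be alive for the whole run; here code4 only consumes resources in the iterations where the condition holds.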
Page 14
How to describe complex algorithms?
A Graphic User Interface for:
Representing the components
Handling the parallelism
Describing loops and conditions
Describing the communications (exchanges)
With all the benefits of a G.U.I., for instance:
Coherency checks on the exchanged data
Pictorial flow chart representation of the algorithm
Performance analysis, debugging and run-time monitoring
Spawning and execution
Page 15
Snapshot of the G.U.I.
Page 16
PALM glossary
• Unit: computational component
- Code
- Module of a code
- Data loaders
- Pre-defined units
(F77/F90 subroutines, C or C++ functions, Java, Python, Matlab and so on through F90 or C/C++ interfaces)
• Parallel unit
Page 17
• Branch: sequence of units and instructions => algorithm
PALM glossary
Page 18
PALM main features: Two levels of parallelism
Distributed components: units
Task parallelism: branches
Components assembling: blocks
Parallel computing by “simply drawing”
Page 19
Second function of the coupler:
Driving the data exchanges
PALM main features
Page 20
PALM main features
“End point” communication paradigm
In the sources of the units: potential sending or reception
Announce data production: PROVIDER
CALL PALM_Put (…)
Register a data request: CLIENT
CALL PALM_Get (…)
NO explicit indication of the receiving units on the producer side, NOR of the producer on the receiver side
Page 21
• Object: data chunk produced or received by a unit
• Communication: exchange of an object between two units
PALM glossary
N.B.: “Packed data” for coherent data batches can simplify the data flow representation
Page 22
Actual communications are described by connecting the plugs in the graphic interface
Handshaking between clients and providers
PALM main features: Communication mechanism
Page 23
PALM main features: Communication mechanism
A communication takes place only if:
• A PALM_Put has been issued on the provider side
• A PALM_Get has been issued on the client side
• A “tube” has been drawn between the two plugs
PALM main features: Independent units
Parallel data exchanges
– Automatic remapping of the objects if the distribution on the provider side is different from the distribution on the client side
[Figure: a 4 × 2 object (elements 00, 10, 20, 30, 01, 11, 21, 31) distributed over two processes (0), (1) on one side and over four processes (0), (1), (2), (3) on the other]
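To make the remapping idea concrete, here is a minimal sketch for a one-dimensional object with regular block distributions on both sides. It is an assumption-laden illustration in Python, not PALM's algorithm: `block_bounds` and `remap_plan` are hypothetical helpers, and real distributions may be multi-dimensional or custom.

```python
def block_bounds(n, nproc, rank):
    """Half-open range [start, stop) owned by `rank` in a regular block split."""
    base, extra = divmod(n, nproc)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def remap_plan(n, n_providers, n_clients):
    """List of (provider rank, client rank, start, stop) pieces to transfer."""
    plan = []
    for p in range(n_providers):
        p0, p1 = block_bounds(n, n_providers, p)
        for c in range(n_clients):
            c0, c1 = block_bounds(n, n_clients, c)
            lo, hi = max(p0, c0), min(p1, c1)   # overlap of the two blocks
            if lo < hi:
                plan.append((p, c, lo, hi))
    return plan

# An 8-element object produced on 2 processes and requested on 4.
plan = remap_plan(8, 2, 4)
```

Each entry of the plan is one point-to-point message; the units themselves only see their local block.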
Page 25
PALM main features: Communication mechanism
• Direct Send (MPI)
• Storage in a local mailbox (memory to memory copies)
• Storage in a mailbox on the driver (MPI)
• Optional use of files for objects that are too large
Page 27
Preliminary step:
How does a code become a PALM unit?
How to set up a PALM coupled application
Page 28
Pre-requisites:
• Code sources should be available (it is possible to couple black boxes, but efficiency is much lower)
• Codes have to be written in a compiled language (C, C++, FORTRAN 77 or 90) or interfaced with such languages (Python, Java)
• Codes must run under Linux/Unix
How a code becomes a PALM unit
Page 29
For all kinds of units
• Replace PROGRAM (FORTRAN) by SUBROUTINE, or main (C) by a function name
For a parallel unit
• Skip the calls to MPI_INIT and MPI_FINALIZE
• Replace MPI_COMM_WORLD by PL_COMM_EXEC
• Replace the calls to STOP, EXIT by PALM_Abort
How a code becomes a PALM unit
Page 30
Create an identity card with four kinds of entries (unit, space, object, distributor):
!PALM_UNIT \
! -name unit name \
! -functions { function to be invoked } \
! -comment { optional comment }
!
!PALM_SPACE \
! -name space name \
! -shape (.,., … ,.) \
! -element_size PL_[INTEGER, REAL, DOUBLE_PRECISION, …] \
! -comment { optional comment }
!
!PALM_OBJECT \
! -name object name \
! -space name of its space \
! -intent [IN, OUT, INOUT] \
! -time [ON, NO] \
! -tag [ON, NO] \
! -comment { optional comment }
!
!PALM_DISTRIBUTOR \
! -name distributor name \
! -type [regular, custom] \
! -nbproc number of processes in the distribution \
! -function name of the distribution function \
! -comment { optional comment }
How a code becomes a PALM unit
Page 31
How a code becomes a PALM unit
Add the PALM primitives:
CALL PALM_Put ('space1', 'obj1', time, tag, array, error)
CALL PALM_Get ('space2', 'obj2', time, tag, array, error)
Arguments, in order:
- String with the name of the object space
- String with the name of the object as it is declared in the i.d. card
- Integer time stamp for the current instance of the object
- Integer further tag for the current instance of the object
- Variable pointing to the data associated to the object
- Error code
Page 32
Pre-defined PALM algebra units
Definition: pre-defined units interfacing common mathematical libraries (BLAS, LAPACK, …), applied on the fly to objects exchanged between units (useful for unit conversion, grid to grid interpolation, …).
Load the unit from the Algebra Toolbox
Users can compose (and share) their specific algebra units.
Execution and communications are managed as for a user unit
Page 33
Main step:
Once the units have been defined, the coupled application (algorithm and communications) is described in the graphic user interface PrePALM
How to set up a PALM coupled application
Page 34
Description of the coupling algorithm
Model Branch Observations Branch
In the graphic interface the user defines the execution sequences (branches), with control structures
Page 35
Description of the communication fluxes
The actual communications are described in the graphic interface by linking the plugs corresponding to PALM_Put and PALM_Get
KEY POINT: Analysis of the algorithm data flow and coherent definition of the interfaces between sub-models
N.B.: PALM handles time interpolations for objects produced by units with a different time step.
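What a time interpolation between two time-stamped instances of an object can look like is sketched below. The slide does not say which scheme PALM applies, so linear interpolation is an assumption here, and `interpolate_in_time` is a hypothetical helper written for the example.

```python
def interpolate_in_time(t, t0, field0, t1, field1):
    """Linear interpolation of a field between time stamps t0 and t1.

    Assumption: linear weights; the scheme actually used by the coupler
    is not specified in the slides.
    """
    if t1 == t0:
        raise ValueError("the two time stamps must differ")
    w = (t - t0) / (t1 - t0)
    return [(1.0 - w) * a + w * b for a, b in zip(field0, field1)]

# The provider produced the object at t=10 and t=20;
# the client, running with a different time step, asks for t=15.
field_at_15 = interpolate_in_time(15, 10, [0.0, 2.0], 20, [10.0, 4.0])
```

The client unit simply requests the time stamp it needs; the two producer instances bracketing that time are combined on its behalf.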
Page 36
Compile and run a PALM_MP application
[Diagram: build chain. Inputs: the user-defined unit sources (.f, .c) and the .ppl file of the PrePALM graphic interface. PrePALM generates a .pil file and source files (.f, .c, …). These are linked with the PALM library (.a) and with the MPI and algebra libraries (.a) into two executables, palm_main and entity_exe.]
mpirun -np 1 palm_main
Page 37
Debugging, analysis and monitoring tools
• A PALM output file per branch with different verbosity levels for different classes of messages
• A user defined debugging function, called on the PALM_Put and PALM_Get sides of an active communication to check the coherency of the objects
• A performance analyser accounting for the elapsed and the CPU time
• A graphic post-mortem replay or run-time monitoring of the application
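The distinction between elapsed and CPU time made by the performance analyser can be illustrated with the Python standard library. This is only a sketch of the measurement itself, not of the PALM tool; the `timed` context manager and the `report` dictionary are names invented for the example.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(name, report):
    """Record wall-clock (elapsed) and CPU time spent inside the block."""
    wall0 = time.perf_counter()   # elapsed time, includes waiting
    cpu0 = time.process_time()    # CPU time of this process only
    try:
        yield
    finally:
        report[name] = {
            "elapsed": time.perf_counter() - wall0,
            "cpu": time.process_time() - cpu0,
        }

report = {}
with timed("unit_a", report):
    total = sum(i * i for i in range(100_000))   # toy workload
```

A unit that mostly waits for data shows a large elapsed time and a small CPU time, which is the kind of imbalance such an analyser helps to spot.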
Page 38
Performance analyser in PrePALM
Page 39
Post-mortem replay and run-time monitoring
Page 40
To sum up
PALM is a software tool for couplings and for complex applications built from sub-modules, with no loss of performance.
Many pros:
Evolution and maintenance of a complex application and its components are easier
A framework for the integration of existing independent codes in an application (sub-models assembling, multi physics coupling, collaborative development, …)
Takes full advantage of the intrinsic parallelism of an application
A friendly and simple G.U.I. to describe, implement and supervise applications.
Page 42
PALM in the SEVE project
Sol Eau Végétation Energie: integrated modelling of land surface processes at different space and time scales (water and carbon cycles)
Pool of actors: INRA CSE (Avignon) , INRA EPHYSE (Bordeaux) , LISAH (Montpellier) , CESBIO (Toulouse) , CNRM/GMME (Toulouse), LTHE, CEMAGREF, CERMICS
Ongoing actions:
Complex landscapes description and integrated interactions modelling: segmentation of landscapes; heterogeneity, different scales, discontinuities and linear structures
Multi-physics and multi-scale coupling, dynamic time stepping
3D transfers modelling in different compartments at a landscape scale
Models enhancement: data assimilation from remote sensing
Data bases
Page 43
PALM for NitroScape ??
[Diagram: NitroScape components. Controller programme; landscape database & GIS; farm management model; multi-ecosystem model; nested atmospheric model; distributed hydrological model. Landscape elements: farm buildings & manure stores, grass fields, arable fields, forest & shrubland, non-agricultural sources. Inputs: landscape inventory, soils & hydrology, meteorological data. Outputs: maps, datasets & scenarios. Links (model data exchange, N & C flux): emission / deposition, leaching / uptake, lateral transfer in catchments, local and regional atmospheric dispersion.]
Page 44
PALM in the ANR corayl project
Coupling combustion and radiation
AVBP: CFD - LES. Reactive filtered Navier-Stokes equations. Parallel model.
DOMASIUM: Radiative model. Radiation equations in complex geometry. Parallel model.
Exchanged fields: energy sources and thermal fluxes on the walls; temperature and species.
Two levels of parallelism. Independent coupled models.
Coupling scheme: n CFD time steps for each radiative step
Combustion chamber: very high temperatures, dominant radiative phenomena, massively parallel computation.
CERFACS, EM2C (Ecole Centrale Paris), LGPSD (Ecole des mines d’Albi)
PhD Jorge Amaya
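The coupling scheme "n CFD time steps for each radiative step" can be sketched as a nested loop. The code below is a toy in Python with scalar stand-ins for the fields; `cfd_step`, `radiative_step` and their formulas are invented for illustration and have nothing to do with the physics of the actual solvers. The direction of the exchanges (temperature towards the radiative model, energy source back to the CFD) is also an assumption, and the two models are run one after the other here rather than in parallel.

```python
def cfd_step(temperature, energy_source, dt=1.0):
    """Toy CFD step: the temperature evolves under the radiative source."""
    return temperature + dt * energy_source

def radiative_step(temperature):
    """Toy radiative model: a cooling source proportional to temperature."""
    return -0.1 * temperature

def run(n_radiative, n_cfd, temperature=1000.0):
    """n_cfd CFD steps are taken for each radiative step."""
    source = 0.0
    history = []
    for _ in range(n_radiative):
        for _ in range(n_cfd):
            temperature = cfd_step(temperature, source)
            history.append("cfd")
        # Exchange point: CFD state goes to the radiative model,
        # which returns an updated energy source.
        source = radiative_step(temperature)
        history.append("radiation")
    return temperature, history

final_temperature, history = run(n_radiative=2, n_cfd=3)
```

The radiative source is held constant during the n CFD steps that follow each exchange, which is what keeps the cost of the radiative model under control.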
Page 45
Model nesting: the COMODO-PALM project
IMAG/CNRS - Laurence Viry
[Figure: nested OPA configurations: 1/3° NATL3 (North Atlantic) and 1/15° BABY15 (Bay of Biscay)]
Before PALM
• The components communicate by message passing (MPI)
• Very intrusive method, with deep specific changes in the source codes: rewriting at every model or algorithm update
• Implementation of the two levels of parallelism rather cumbersome
With PALM
• Independent coupled components (MPI-2)
• Flexibility, modularity, performance
- Run-time components spawning
- Loops around parts of the application (iterative algorithms of domain decomposition)
• Coupled units do not depend on the nesting algorithm
• Automatic parallel communications
• Code reuse in new developments
Page 46
Temperature forecast in combustion chambers
Eric Mercier (SNECMA Moteur), Florent Duchaine
[Figures: Iteration 1 and Iteration 3]
Page 47
Shape optimisation MIPTO (INTELLECT D.M.)
Florent Duchaine CERFACS - TURBOMECA
Page 48
RANS – LES coupling - Florent Duchaine
Code reuse
Page 49
PALM in MERCATOR
Operational oceanography
[Figure: model configurations at resolutions 2°, 1/3°, 1/4° and 1/15°]
From regional to global, from OI to variational, from R&D to operational
• All the data assimilation suites have been developed with PALM
• The same methods are used for the different resolutions
• The upgrades of the direct model do not impact the assimilation
• The PALM_MP coupled implementation is less memory consuming than a standard implementation
Page 50
PALM in the ADOMOCA project
Assimilation de DOnnées dans les MOdèles de Chimie Atmosphérique
Pool of the main actors in French atmospheric chemistry research: SA, LA, LSCE, LPCE, LMD, IPSL, CERFACS, L3AB, CNRM, ACRI, CNES
4D-VAR
The assimilation algorithm is edited in the graphic interface PrePALM: no changes in the source codes.
In this case, new units are plugged in: e.g. the tangent linear and adjoint models
Choice of a common tool for sharing:
components: models and operators
methods: algorithms and diagnostics
CERFACS CNRM
3D-FGAT
Page 51
PALM in the ADONIS project
Assimilation de DOnnées NeutronIqueS
• Aim: best use of the measurements collected in the nuclear reactors for the tuning and improvement of the kernel modelling
• 3 applications:
– Manara: best estimate of the physical fields by variational assimilation
– Kafeine: optimal tuning of the EDF neutronic numerical model
– PhD thesis: comparison of different data assimilation algorithms without the constraints of industrial direct application
Page 52
PALM is portable on most platforms (Linux PCs, Unix workstations, mainframes, supercomputers).
Web page: http://www.cerfacs.fr/~palm
Training: at CERFACS 2-3 times/year, or on demand
E-mail: [email protected]@cerfacs.fr
More info