
IEEE TRANSACTIONS ON MAGNETICS, VOL. 29, NO. 2, MARCH 1993

Towards a Distributed Finite Element Package for Electromagnetic Field Computation

H. Magnin and J.L. Coulomb
Laboratoire d'Electrotechnique de Grenoble, URA 355 CNRS
ENSIEG, BP 46, 38402 Saint-Martin d'Hères, FRANCE

Abstract - The recent evolution of both hardware and software techniques now provides the possibility of interconnecting various computers, thus allowing them to communicate and cooperate. On the other hand, three-dimensional computation of electromagnetic fields with finite elements involves time-expensive calculations and needs various software and hardware tools: parallel computing, efficient visualization (specialized graphic processors, libraries), etc.

This paper presents the transformation of an existing finite element program into a distributed application, allowing its users to handle the “remote” computational power available on a heterogeneous network of computers. Many difficulties are discussed, such as the localization of resources, the replicated database management, the cost of network communications, and parallel distributed computing.

I. INTRODUCTION

The hardware context is a network of 20 Hewlett-Packard/Apollo workstations, which also comprises an Alliant FX/80 mini-supercomputer. These computational resources are used to execute electromagnetic field computing software, and in particular FLUX3D, a 3D finite element package. Our first aim was to transform it into a distributed application providing the user with transparent access to the Alliant's computational power while working on any workstation of the network. The advantages of such a configuration are well known: the "number cruncher" is used only to perform computations for which it is efficient, and is relieved of the other operations (for example graphics), which are performed locally by the workstation(s).

The "network level" tool used to build this distributed application is the Network Computing System (NCS), provided by Hewlett-Packard [1]. NCS is a Remote Procedure Call (RPC) facility [2], chosen as the standard by the Open Software Foundation. Knowing the network address of a remote host, it allows a running process (called the client) to call a procedure executed by another process (called the server) on the remote host. It ensures transparent transmission of parameters and recovery of results, even between heterogeneous computers. The effects of a remote procedure call are the same as if the procedure had been executed locally by the client process.
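To make the mechanism concrete, here is a minimal, self-contained sketch of the client-stub / server-stub pattern (illustrative only: real NCS generates such stubs from an interface definition, and the "network" below is simulated by a direct function call; all names are ours).

    /* RPC pattern in miniature: marshal parameters, "transmit", unmarshal. */
    #include <stdio.h>
    #include <string.h>

    typedef struct { char buf[256]; int len; } message_t;

    /* --- server side: the procedure actually executed remotely --- */
    static double dot(const double *a, const double *b, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += a[i] * b[i];
        return s;
    }

    /* server stub: unmarshal parameters, run the procedure, marshal result */
    static message_t server_dispatch(const message_t *req) {
        int n; double a[8], b[8];
        memcpy(&n, req->buf, sizeof n);
        memcpy(a, req->buf + sizeof n, n * sizeof(double));
        memcpy(b, req->buf + sizeof n + n * sizeof(double), n * sizeof(double));
        double r = dot(a, b, n);
        message_t rep; memcpy(rep.buf, &r, sizeof r); rep.len = sizeof r;
        return rep;
    }

    /* client stub: to the caller, this looks like a local procedure */
    static double remote_dot(const double *a, const double *b, int n) {
        message_t req;
        memcpy(req.buf, &n, sizeof n);
        memcpy(req.buf + sizeof n, a, n * sizeof(double));
        memcpy(req.buf + sizeof n + n * sizeof(double), b, n * sizeof(double));
        req.len = sizeof n + 2 * n * sizeof(double);
        message_t rep = server_dispatch(&req);   /* stands in for the network */
        double r; memcpy(&r, rep.buf, sizeof r);
        return r;
    }

    int main(void) {
        double a[3] = {1, 2, 3}, b[3] = {4, 5, 6};
        printf("dot = %g\n", remote_dot(a, b, 3)); /* prints dot = 32 */
        return 0;
    }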

Manuscript received June 1, 1992

Fig. 1 - The remote procedure call mechanism

II. DIFFICULTIES

In our client/server model, the client process is "naturally" attached to the user-end of the application (generally on a workstation), while the server processes are considered as "slaves" answering requests from the client. Even with such a simple scheme, many difficulties appear, the most representative of which are listed below.

- Access to global data. In the RPC "philosophy", all the input data and address space a remote procedure needs to work correctly must be specified in its interface. Unfortunately, many procedures in a Finite Element (FE) package need global data for their execution: Fortran COMMONs, C globals, or database entities (a minimal illustration is sketched after this list of difficulties).

- Exclusive cooperation between client and server. Trying to respect the "no side effect" model for all server procedures in a distributed FE package issued from an already existing program like FLUX3D (250000 lines of Fortran code) is not realistic. It would imply, in particular, a cascade of procedure interface changes, resulting in the need to maintain different versions of these procedures. Moreover, the interfaces of these subroutines would certainly become complicated, increasing the programming complexity and the amount of data to transmit over the network at each call (which is expensive). So, a mechanism to transfer global data between client and server must be designed. Unfortunately, in this case, each server loses its multi-user capacity: during all the time it cooperates with a client (probably many calls), it must not be called by another one, because of the conflicting access by clients to the server's global address space. During a whole "cooperating session", client and server processes must be exclusively bound together; this is what we call the exclusive allocation of the server to the client.

- Localization of servers. At each remote procedure call, the client process must specify the identity of the server to which it is assigned. So, a client process willing to call a procedure remotely must be able to localize it, i.e. to know which servers on the network are available to execute this procedure, to have one of them exclusively allocated, and to obtain an identifier representing this server. RPC facilities often provide broadcast mechanisms (the first available server executes the call), but because of our need for exclusive allocation, such a mechanism cannot be employed.

- Identification of client and server procedures. In fact, the most difficult work is identifying the procedures which are candidates for remote execution. For each one of these, the computational domain must be determined: it is made of all the data the subroutine needs for execution, and the address space required for it to store its results. When dealing with global data, this task may become infeasible because of program complexity. Moreover, once this computational domain is known, some (sometimes many) extra remote procedures must be written, with the sole aim of transferring this global information from the client's address space to the server's (and vice-versa).
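As an illustration of the global-data difficulty above, the following hypothetical C fragment (all names are ours) shows a procedure whose result depends on a global variable, the C analogue of a Fortran COMMON block. An RPC interface listing only its explicit parameters would silently miss this hidden input, which is why a global-data transfer mechanism is needed.

    /* Hypothetical illustration: mu_r is global data, invisible in the
     * procedure's interface, yet required for correct remote execution. */
    #define MAX_REGIONS 16
    double mu_r[MAX_REGIONS];   /* global: relative permeability per region */

    /* Element stiffness contribution: (elem, region, out) is the visible
     * interface, but the result also depends on the global mu_r. */
    void element_matrix(int elem, int region, double out[4][4]) {
        (void)elem;                          /* geometry omitted in this sketch */
        double coef = 1.0 / mu_r[region];    /* hidden input, not a parameter   */
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                out[i][j] = coef * (i == j ? 2.0 : -1.0);  /* dummy stencil */
    }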

III. METHODOLOGY

When making a distributed application from a large software package like FLUX3D, a need for transparency appears. First, the use of the distributed version of the program should not differ from that of the "classical" version. Above all, program development must not be made more complicated: those who write FE procedures (new formulations, models, ...) must be able to ignore the fact that the program can be distributed. This remark leads us to distinguish four types of people working with a distributed finite element package, as in [3].

- The user. He employs the software to compute electromagnetic fields in some devices, without dealing with internal operating details.

- The FE programmer. He implements new calculation functionalities using the existing code as an environment. He need not know about distributed computing facilities.

- The system programmer. He deals with the elaboration and evolution of specific tools for the construction of the distributed application. In particular, he responds to requests from the distributed application programmer (see below) so that the tools suit his requirements.

- The distributed application programmer. He uses the tools provided by the system programmer to build the distributed application. He identifies server-side procedures, constructs client and server programs, and manages the servers running on the network hosts.

In this paper, we take the role of the system programmer, and the next sections deal with the tools which were developed to solve the difficulties mentioned in section II.

IV. THE LOCATING SERVER

To call a server for remote computations, a client process must know which servers on the network are available to do the work. Then, one of these servers must be exclusively allocated to the client during their collaboration.

With this aim, we designed a special server that holds in a database the information about all servers running on the network. Each server is identified in a unique way by its functionalities, the host on which it is running, and the numbers of the I/O ports on which it "listens". In our model, we identify the functionalities of a server by a name, generally corresponding to the name of the server program.

As a server is made of a collection of procedures, one of our tools is a standard main program to be linked with them. In particular, this main program contacts the locating server and sends it the identification (name, host, port number) of the new running server, before listening for requests. When a server is stopped or interrupted, our environment ensures that the locating server is notified of its liberation.

To have a server allocated, a client process uses a single call specifying as input the name of the requested server. The first name-matching and available (free) server is allocated, and the client is returned an integer identifier representing that server (rather than letting the client directly manage host names and port numbers). This identifier is the means for the client to contact the server in our environment.

Fig. 2 illustrates the communication between a client process and the locating server (holding its database) during the allocation phase. After use of a server, the client must explicitly free it. Nevertheless, our tools ensure that all the servers allocated to a client are automatically liberated when the client process exits or stops.
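The following self-contained sketch (hypothetical names and data layout; the real locating server is a separate remote process) mimics the database of Host / Name / Port / State entries and the register / allocate / free protocol just described.

    #include <stdio.h>
    #include <string.h>

    typedef enum { FREE, ALLOCATED } state_t;
    typedef struct { char host[32]; char name[32]; int port; state_t state; } entry_t;

    static entry_t db[16];   /* the locating server's database */
    static int     nb = 0;

    /* called by the standard server main program at start-up */
    static void register_server(const char *host, const char *name, int port) {
        entry_t e; strcpy(e.host, host); strcpy(e.name, name);
        e.port = port; e.state = FREE; db[nb++] = e;
    }

    /* exclusive allocation: first free, name-matching server wins;
     * the returned index plays the role of the integer identifier */
    static int allocate_server(const char *name) {
        for (int i = 0; i < nb; i++)
            if (db[i].state == FREE && strcmp(db[i].name, name) == 0) {
                db[i].state = ALLOCATED;
                return i;
            }
        return -1;           /* no matching free server */
    }

    static void free_server(int id) { db[id].state = FREE; }

    int main(void) {
        register_server("alliant", "fe_solver", 5001);
        register_server("apollo7", "fe_solver", 5002);
        int id = allocate_server("fe_solver");        /* gets the first one  */
        printf("allocated %s:%d (id %d)\n", db[id].host, db[id].port, id);
        free_server(id);                              /* explicit liberation */
        return 0;
    }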

Fig. 2 - Exclusive allocation of a server to a client (the locating server holds a table of Host / Name / Port / State entries and returns a server identifier to the client)

The problem of identifying an object given its name is called name resolution. Name resolution in distributed systems is the subject of much research [4,5], and our single-server implementation of this naming service will certainly have to be improved in the future. In particular, a distributed multi-server model as in [2] would enhance fault tolerance, for example in case of locating-server crashes. However, our centralized model makes the implementation of mutual exclusion mechanisms easier.

V. THE FULLY REPLICATED DATABASE

V.1 - Justification and operating mode

We expect the construction and evolution of FLUX3D as a distributed application to be as easy as possible. In particular, we want to provide the distributed application programmer (see III) with an efficient tool avoiding the very difficult task of determining the computational domain of each procedure to be executed remotely. Fortunately, FLUX3D is an integrated modeling software package organized around a database. This means that all the global data accessed by its subroutines are stored in this database, except for some well-known and easily located database item identifiers. Given these conditions, we designed a mechanism allowing us to share this database between client and server processes. It is implemented inside DB-LIB [6], the Database Management System (DBMS) of FLUX3D, and can be employed by any program using this DBMS.

Potentially, the constitutive procedures of a server may require access to any (maybe all) data in the base. On the other hand, network communications are time-expensive when compared with virtual-memory access times, yet we expect distribution to speed up the execution of the program. For these reasons, we decided to fully replicate the database contents between the client and the servers of the application.

After having obtained the allocation of a server needing access to the database, the client must send it a complete copy, using a single call of the form:

    DB_replicate (server_id)

Of course, such a replication takes a non-negligible amount of time. For this reason, servers that export treatments requiring database access should execute sufficiently time-expensive algorithms, so that the time saved by the speedup exceeds the delay of replication.
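To make this trade-off concrete (our reformulation, not a formula from the paper): offloading pays off whenever

\[ T^{\mathrm{local}}_{\mathrm{solve}} \;>\; T_{\mathrm{replicate}} + T_{\mathrm{update}} + T^{\mathrm{remote}}_{\mathrm{solve}} . \]

With the figures of Table I (section VI), the linear case gives 49' locally against 60" + 5" + 5' 10" = 6' 15" remotely, so the criterion is largely satisfied.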

Thus this full replication needs to be achieved only once (just after the server allocation): an incremental database update mechanism between client and server has been implemented. Once client and server(s) each dispose of their own copy of the database, only the changes (modifications) made by one or the other have to be transferred. In practice, every remote procedure call to a server will require three instructions, looking like:

    DB_send_changes (server_id)
    RPC_CALL (server_id, procedure_to_call, parameters ...)
    DB_get_changes (server_id)

where
- DB_send_changes updates the server replica of the database, according to client modifications since the last update,
- DB_get_changes updates the client replica, according to server modifications since the last update.
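For concreteness, a complete cooperating session built from these calls might look like the following self-contained C sketch (hypothetical bindings; the stub functions only trace the protocol described above).

    #include <stdio.h>

    typedef int server_id_t;

    static server_id_t allocate_server(const char *name) {
        printf("locating server: allocated '%s' -> id 1\n", name); return 1;
    }
    static void DB_replicate(server_id_t s)    { printf("full DB copy to server %d\n", s); }
    static void DB_send_changes(server_id_t s) { printf("client changes -> server %d\n", s); }
    static void DB_get_changes(server_id_t s)  { printf("server %d changes -> client\n", s); }
    static void RPC_CALL(server_id_t s, const char *proc) {
        printf("remote call of %s on server %d\n", proc, s);
    }
    static void free_server(server_id_t s)     { printf("server %d liberated\n", s); }

    int main(void) {
        server_id_t solver = allocate_server("fe_solver"); /* exclusive allocation */
        DB_replicate(solver);              /* one full replication per session     */
        for (int step = 0; step < 2; step++) {   /* e.g. non-linear iterations     */
            DB_send_changes(solver);       /* incremental update, client -> server */
            RPC_CALL(solver, "SOLVE");     /* the time-expensive computation       */
            DB_get_changes(solver);        /* incremental update, server -> client */
        }
        free_server(solver);               /* explicit liberation                  */
        return 0;
    }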

V.2 - Implementation

With each entity in the database is associated an integer (a modification state) memorizing the actions it has been subject to. Basically, three actions modify an object:

(1) Creation
(2) Modification (write)
(3) Suppression

During the execution of an algorithm, some objects are subject to modifying operations composed of an ordered set of these actions. Each action on an object makes its modification state evolve, following a finite state automaton. At update time, each modified object is replicated again in the obsolete copy of the database, according to the value of its modification state. When the update is complete, the states of all objects are reset to "not-modified".

Since all actions on entities of the database are performed exclusively through procedures made for this purpose, the management of modification states has been implemented quite transparently inside these subroutines. This way, neither the distributed application programmer nor the finite element programmer sees any difference. Likewise, neither the procedures which are candidates for remote execution nor the client subroutines need any changes concerning access to the database. A sketch of such a state automaton is given below.
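This is our own illustrative reconstruction (hypothetical names, and our guess at the transition rules, not the paper's exact automaton): each database access routine updates the state as a side effect, so callers never see the mechanism.

    #include <stdio.h>

    typedef enum { UNCHANGED, CREATED, MODIFIED, DELETED } mod_state_t;

    /* finite state automaton: next state given the action applied */
    static mod_state_t next_state(mod_state_t s, char action) {
        switch (action) {
        case 'C': return CREATED;                       /* creation            */
        case 'M': return (s == CREATED) ? CREATED       /* write on a fresh    */
                                        : MODIFIED;     /* object stays CREATED */
        case 'S': return (s == CREATED) ? UNCHANGED     /* created then deleted: */
                                        : DELETED;      /* nothing to transfer   */
        default:  return s;
        }
    }

    typedef struct { double value; mod_state_t state; } db_entity_t;

    /* the only write-access path: the state bookkeeping is hidden here */
    static void db_write(db_entity_t *e, double v) {
        e->value = v;
        e->state = next_state(e->state, 'M');
    }

    int main(void) {
        db_entity_t mu = { 1.0, UNCHANGED };
        db_write(&mu, 1000.0);            /* FE code just writes...              */
        printf("state = %d\n", mu.state); /* ...but the entity is now MODIFIED,
                                             so the next update transfers it    */
        return 0;
    }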

VI. THE DISTRIBUTED APPLICATION

From now on, we take the place of the distributed application programmer, using the environment described above. In a previous work [7,8], the whole solving library of FLUX3D (integration of element submatrices, assembly of the global system of equations, solving of that system) was efficiently adapted to vector and parallel computation on the Alliant FX/80. We want this vector-parallel finite element solver to be implemented as a server, accessible remotely by users working with FLUX3D on workstations of the network.

The server is made of all the procedures necessary to solve a finite element problem, while the client handles all interactive commands (fig. 3). With the tools developed by the "system programmer", this task is rather straightforward to achieve. Table I compares the times obtained by solving, locally on an Apollo DN3000 workstation or remotely on the Alliant, a 3D magnetostatic problem formulated with the magnetic vector potential A. The number of unknowns is about 11000, and the finite element mesh is made of 2000 quadratic tetrahedra. The four columns correspond to the following cases:

(1) The problem is linear, and the resolution is made locally on the workstation.
(2) The same linear problem is solved remotely on the Alliant, implying the initial full database replication.
(3) The problem is modified: the ferromagnetic parts are set to saturable, and the resolution is made again locally on the workstation.
(4) The previous non-linear problem is solved remotely on the Alliant, the server replica of the database needing only an incremental update related to the non-linear data.

The variation of the speedup ratio between solving (without database update) on the Alliant and on the workstation (ratio 10 to 20) is explained by the fact that the supercomputer is a heavily used resource, employed simultaneously by multiple users. In consequence, its performance per process varies.

TABLE I - SOLVING TIMES

Step               (1)     (2)       (3)       (4)
Update server DB    0      60"        0         6"
Update client DB    0       5"        0         5"
Solve FE problem   49'     5' 10"   3 h 30'   11'
TOTAL TIME         49'     6' 15"   3 h 30'   11' 11"


Fig. 3 - The client / server decomposition (the server, on the Alliant FX/80, holds a database replica; the client, on a workstation, comprises the interactive resolution, problem descriptor, and mesh generator modules)

VII. PARALLEL DISTRIBUTED COMPUTING

In its basic "philosophy", the RPC mechanism does not allow parallel computing, because at each remote call the client process waits for the server to complete execution of the procedure. Nevertheless, we implemented non-blocking remote procedure calls, giving the possibility of issuing many calls "simultaneously" (just like independent forked Unix processes), and of waiting for their completion.
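A minimal sketch of this issue/wait pattern (hypothetical API of ours, not the paper's tool; each remote call is simulated by a worker thread, and rpc_wait is the rendezvous; compile with -lpthread):

    #include <pthread.h>
    #include <stdio.h>

    typedef struct { pthread_t tid; int server; double result; } rpc_handle_t;

    static void *worker(void *arg) {             /* stands in for a remote server */
        rpc_handle_t *h = (rpc_handle_t *)arg;
        h->result = 100.0 * h->server;           /* dummy "partial contribution"  */
        return NULL;
    }

    /* issue the call and return immediately */
    static void rpc_call_nonblocking(rpc_handle_t *h, int server) {
        h->server = server;
        pthread_create(&h->tid, NULL, worker, h);
    }

    /* rendezvous: block until the given call has completed */
    static void rpc_wait(rpc_handle_t *h) { pthread_join(h->tid, NULL); }

    int main(void) {
        rpc_handle_t calls[3];
        for (int s = 0; s < 3; s++)
            rpc_call_nonblocking(&calls[s], s);  /* three "servers" run in parallel */
        double total = 0.0;
        for (int s = 0; s < 3; s++) {            /* wait for all, then combine      */
            rpc_wait(&calls[s]);
            total += calls[s].result;
        }
        printf("assembled total = %g\n", total); /* 0 + 100 + 200 = 300 */
        return 0;
    }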

Using this facility, we experimented with a strategy in which many servers (executed by different workstations) could, in parallel, integrate a subset of the element submatrices of a FE model and assemble them into their own "incomplete global linear system" [9]. After all servers have completed their task (rendezvous), the client process gets back the servers' macro-contributions, adds them together to build the complete linear system, and solves it. Note that with this strategy the servers only require read access to the database, and therefore no conflicting write operations can occur.

Unfortunately, when using many servers requiring the database, the initial full data replication has to be performed many times (once per server), and becomes more and more time-expensive. In consequence, the speedup obtained for the construction of the global system of equations never exceeded a ratio of 2 (using 3 servers in parallel). Moreover, using more than 3 servers made this speedup ratio decrease, because of the time spent in network transfers.
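As a rough illustration of why the speedup saturates (our model, not the paper's): the client performs the p full replications one after the other, so if T_a denotes the sequential integration/assembly time, T_r the cost of one full replication, and T_c(p) the remaining communication time (growing with p), the achievable speedup is approximately

\[ S(p) \;\approx\; \frac{T_a}{\,p\,T_r + T_a/p + T_c(p)\,} . \]

The p T_r term grows linearly while T_a/p shrinks, so S(p) peaks at a small number of servers, consistent with the maximum ratio of 2 observed here with 3 servers.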

A way to minimize the time due to network transfers is to use some data compression in the transmissions between processors. At this time, such procedures have not yet been tested by the authors.

VIII. CONCLUSION

Transforming a finite element program into a distributed application is a difficult task. Although some powerful and transparent tools like remote procedure call facilities are now available, many difficulties have to be faced, in particular sharing of global data between cooperating processes running on different computers of a network.

After having designed a set of tools to facilitate the client/server decomposition and implementation for a finite element package like FLUX3D, we showed that it is possible to improve the performance of such an application by the use of distributed computing.

The model in which a single remote server running on a powerful host is employed to perform coarse-grained computations seems to give the best performance. Because of communication times, parallel distributed computing is less efficient. Moreover, it requires a real programming effort (leading to little transparency) when being implemented in a steady-state finite element package.

REFERENCES

[1] Network Computing System (NCS) Reference, Apollo Computer Inc., 1987

[2] A.D. Birrell, B.J. Nelson, "Implementing Remote Procedure Calls", ACM Trans. on Computer Systems, vol. 2 (1), Feb. 1984, pp. 39-59

[3] B.N. Bershad, H.M. Levy, "A Remote Computation Facility for a Heterogeneous Environment", IEEE Computer, May 1988, pp. 50-60

[4] D.B. Terry, "Structure-free Name Management for Evolving Distributed Environments", Proc. 6th Int. Conf. on Distributed Comput. Syst., 1986, pp. 502-508

[5] A.D. Birrell, R. Levin, R.M. Needham, M.D. Schroeder, "Grapevine: An Exercise in Distributed Computing", Comm. ACM, vol. 25 (4), April 1982, pp. 260-275

[6] J.Ph. Iafrate, O. Santana, J.L. Coulomb, "DB-LIB: A Tool Box of Data Description and Management", IEEE Trans. Magn., vol. 24 (1), 1988, pp. 342-345

[7] H. Magnin, J.L. Coulomb, "A Parallel and Vectorial Implementation of Basic Linear Algebra Subroutines in Iterative Solving of Large Sparse Linear Systems of Equations", IEEE Trans. Magn., vol. 25 (4), pp. 2895-2897, 1989

[8] H. Magnin, J.L. Coulomb, R. Perrin-Bit, "Parallel and Vectorial Solving of Finite Element Problems on a Shared-Memory Multiprocessor", IEEE Trans. Magn., vol. 28 (2), pp. 1712-1715, 1992

[9] G. Mahinthakumar and S.R.H. Hoole, "A Parallelized Element by Element Jacobi Conjugate Gradients Algorithm for Field Problems and a Comparison with other Schemes", Int. J. App. Electromag. in Materials, vol. 1 (1), pp. 15-28, 1990

Henri Magnin was born in Thonon-les-Bains, France, in 1963. He graduated in electrical engineering from the "Ecole Centrale de Lyon" in 1985. He received the PhD degree in electrical engineering in 1991 from the "Institut National Polytechnique de Grenoble".

Jean-Louis Coulomb was born in Nîmes, France, in 1949. He graduated in electrical engineering in 1972 from the "Ecole Nationale Supérieure d'Electricité et de Génie Physique de Grenoble". He received the "Doctorat de 3e cycle" and the "Doctorat d'Etat" degrees in electrical engineering in 1975 and 1981 from the "Institut National Polytechnique de Grenoble".

He is presently Professor at the INPG and works at the "Laboratoire d'Electrotechnique de Grenoble". His current interests are: numerical field computation, finite element mesh generation, shape optimization and CAD software.

