00048072

8/11/2019 00048072

1/6

A Concept for Distributed Control Systems

M. Glaser, C. Kordecki, U. Rembold

University of Karlsruhe
Institute for Real-Time Computer Systems and Robotics
P.O. Box 6980, D-7500 Karlsruhe, FRG

This paper describes the HEROS system (Hierarchically Extendible Real-Time Operating System), which was especially developed and implemented for the control and supervision of robots. It allows the dynamic creation of processes and their management. The processes in HEROS have no knowledge of their mutual existence and possess no global kernel routines or variables. Interprocess communication and synchronization are accomplished through a channel mechanism. For this purpose, simple and effective function calls are made available to the programmer. The uncoupling of the processes attained through the channel concept enables the tasks to be independently defined and implemented and to be operated in parallel. The channel concept also allows the processes to be allocated to the processors independently of each other. The HEROS system is composed of several clusters connected through a local area network (LAN). Each cluster consists of several conventional single board computers (SBC), a global memory and a network controller. Within such a cluster, the hard real-time requirements of a robot can be fulfilled. HEROS tasks can also communicate with other computer systems through the LAN, whereby the specific network characteristics and the response time of the services determine the real-time behavior.

1. Introduction

The most important requirement of a hard real-time environment is the observance of time limits. Information which is supplied too early or too late is useless and can lead to an undefined state in a control system. To assure real-time operation it is necessary to have high computer performance (in the extreme case one computer for each sensor and actuator) and to provide easy mechanisms for interaction between the participating processes. Additionally, the whole system must be hierarchically constructed to ensure efficient control on the different system levels. Dynamic process reconfiguration is also needed for automatic adaptation to changing external environmental conditions.

Typical applications of hard real-time systems are flight control, traffic control, computer integrated manufacturing (CIM) and robotics.

The HEROS system (Hierarchically Extendible Real-Time Operating System), developed by the Institute for Real-Time Computer Control Systems and Robotics of the University of Karlsruhe, is a prototype for the control and supervision of industrial robots which operate in flexible assembly cells. One of the aims of the project was the development of a distributed multiprocessor system for real-time applications under the following conditions:

- Only hardware components which are state of the art in industry should be used to build up and integrate the system.
- A well-known and available programming environment should be used for system and user software development.
- The multiprocessor system should be programmed with standard languages like C or Pascal, and with new programming tools for parallel computer architectures.
- The multiprocessor system must be designed to control and supervise robots, especially in an industrial environment where hard real-time requirements are usually encountered.

Another research area was dynamic process allocation, which is necessary for process reconfiguration during system operation, whereby the independence between processes and processors must be ensured. Communication capabilities between the multiprocessor system and an external system through a local area network or a special channel were also to be investigated.

In this paper, the HEROS system architecture and its kernel are described first. Next, the mechanism for process communication, which is an essential component of the system, is explained. In conclusion, the implementation of the system and the main objectives for future work are discussed.


0073-1129/89/0000/0667 $01.00 © 1989 IEEE


2. Process model

In HEROS a process is composed of a local data structure and a sequential program. The data structure can be manipulated only by the process's own program. The processes communicate through message passing, and a process can only send a message to another process if it has a channel to it. This model assures that the process environment is a local one. It was proposed by R. Strom and S. Yemini [2] and used in the distributed systems programming language NIL [3].

Processes can be created and activated by other processes at any node of the system. Since there is no global environment, the information for the new process is passed by parameters at creation time. A process can extend its knowledge about the environment through subsequent communication [4].

At creation time each process gets a system-wide unique identification number (PID), which is composed of the cluster number, the node number and the processor number within the node. This enables fast process localization. Each process can be terminated by itself, by others or by the system if a fatal error has happened. At termination time all resources which the process has used, like channels, memory or I/O devices, are returned to the system.
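The composition of the PID can be illustrated with a small C sketch. The paper names the three fields but gives neither their widths nor their order, so the 8/8/16-bit layout, the type name `heros_pid_t` and all function names here are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths: the paper names the fields, not their sizes. */
#define PID_CLUSTER_BITS 8u
#define PID_NODE_BITS    8u
#define PID_LOCAL_BITS  16u

typedef uint32_t heros_pid_t;   /* assumed representation */

/* Pack cluster number, node number and the number within the node
 * into one system-wide unique identifier. */
heros_pid_t pid_make(uint32_t cluster, uint32_t node, uint32_t local)
{
    assert(cluster < (1u << PID_CLUSTER_BITS));
    assert(node < (1u << PID_NODE_BITS));
    assert(local < (1u << PID_LOCAL_BITS));
    return (cluster << (PID_NODE_BITS + PID_LOCAL_BITS))
         | (node << PID_LOCAL_BITS)
         | local;
}

uint32_t pid_cluster(heros_pid_t pid)
{
    return pid >> (PID_NODE_BITS + PID_LOCAL_BITS);
}

uint32_t pid_node(heros_pid_t pid)
{
    return (pid >> PID_LOCAL_BITS) & ((1u << PID_NODE_BITS) - 1u);
}

uint32_t pid_local(heros_pid_t pid)
{
    return pid & ((1u << PID_LOCAL_BITS) - 1u);
}

/* Fast localization: a process is on this node when both the cluster
 * and the node field match, which needs no table lookup. */
int pid_is_local(heros_pid_t pid, uint32_t my_cluster, uint32_t my_node)
{
    return pid_cluster(pid) == my_cluster && pid_node(pid) == my_node;
}
```

Because the location is encoded in the identifier itself, a kernel that follows this scheme could decide where a process lives with two shifts and a comparison.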

3. System architecture

The proposed distributed multiprocessor system consists of clusters, which are loosely coupled through a local area network (LAN). A cluster is composed of several tightly coupled nodes connected through a bus with a partly common address space (Fig. 1). Each cluster contains a global memory for process communication between nodes and a LAN controller, which makes the connection to the outside. Each node is composed of a microprocessor unit and a local memory.

This architecture simplifies the development and test of user applications. The programs can be developed on the host computer with available programming tools and languages like C and Pascal. The object code of these programs is linked with the HEROS run-time library and downloaded to the multiprocessor system through the LAN. A further advantage of this architecture is that the HEROS processes can use external services (Fig. 2.a). The user can access data from remote systems and use them in his process. For example, construction plans can be taken directly from a CAD system and used for industrial production.

In addition to the communication with external systems via the LAN, the HEROS system provides the possibility of integrating microcontrollers for sensors into the multiprocessor system. The microcontrollers are connected to the local bus of a cluster (Fig. 2.b). The communication with the multiprocessor system is done through a statically defined memory space in the global memory. Since the microcontrollers do not have a HEROS kernel, each user must define and implement his own communication and synchronization mechanisms in order to integrate the controller into the system. Of course, the same mechanisms as in HEROS can be used in this case.

Figure 1: The HEROS architecture

A very short reaction time, as required in a hard real-time environment, can be ensured only within clusters. This means that related processes requiring closed loop control with critical reaction times must be installed in the same cluster. Reaction times outside the clusters depend on the response time of the external service and on the network specification. For example, with an Ethernet the transport time of the packets in the network cannot be predicted because of the CSMA/CD method used. Admittedly, at low network load the response times are better than in token rings. On the other hand, processes of closed loop controls which have long dead times can be distributed over different clusters without adverse effect on the required time limits.

Figure 2: Communication with external systems (a: through the local network; b: through the local bus of a cluster)

4. HEROS kernel

Every node of the multiprocessor system has an identical operating system kernel (Fig. 3). It is composed of two components, the HEROS Local Kernel (HLK) and the HEROS Node Server (HNS). Additionally, each cluster contains the HEROS Cluster Server (HCS) in a special node. Its purpose is to manage the communication and synchronization of the cluster with the outside. HNS and HCS remain hidden from the user; he can only make calls to the local kernel. In this manner a process becomes independent of its location, and that location is transparent to it. That means that processes which send information to their communication partner need not know whether the receiver is in the same node, the same cluster or somewhere else. This task is taken over by the system.

If a user call received by the local kernel cannot be processed locally, it is forwarded to the node server or to the cluster server, as appropriate. This passing of calls goes on until it is possible to process the call. The results and error messages are sent back to the calling process in the same way. The separation of the local HEROS kernel into two well-defined components makes it possible to correct or extend the implementation of the server or kernel without changing the user applications.
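The forwarding decision can be pictured with the following C sketch. It is only an illustration: the paper does not say how the kernel decides where a call must go, so deciding by a comparison of cluster and node numbers, as well as the names `location_t`, `route_t` and `route_call`, are assumptions.

```c
/* Hypothetical location of a process: cluster number and node number. */
typedef struct {
    unsigned cluster;
    unsigned node;
} location_t;

/* Which kernel component has to take care of a call. */
typedef enum {
    HANDLED_BY_HLK,   /* target is on the caller's node: local kernel    */
    FORWARD_TO_HNS,   /* other node, same cluster: node server           */
    FORWARD_TO_HCS    /* other cluster: cluster server, out via the LAN  */
} route_t;

/* Decide how far a call has to be passed on before it can be processed.
 * The user process never sees this decision; it only calls the local kernel. */
route_t route_call(location_t caller, location_t target)
{
    if (target.cluster != caller.cluster)
        return FORWARD_TO_HCS;
    if (target.node != caller.node)
        return FORWARD_TO_HNS;
    return HANDLED_BY_HLK;
}
```

Keeping this decision inside the kernel is what lets the servers be changed without touching user applications.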

Figure 3: Structure of the local HEROS kernel

4.1. HEROS node server

The HEROS node server (HNS) is a system process with the task of forwarding user calls which cannot be served locally to another node of a cluster. The HNS is the representative of the local processes in a cluster. It is limited to sending the user's call to the other clusters and waiting for the results. These are finally returned to the local process.

The HEROS node server also receives messages from the other HNS of the cluster, changes them into local kernel calls and sends the results back. The communication mechanism of the node server is the same as that of process communication and is explained further on.

4.2. HEROS local kernel

In principle, the local kernel has the same task as a common operating system kernel. Since it works under real-time conditions, special and well-defined algorithms have to be applied for its development and implementation. The kernel functions must be very compact to keep the system overhead as small as possible. Besides, the execution time must be exactly calculable so that the latency of the system can be determined accurately. In case the execution time of a kernel function cannot be kept within a defined range, it must be possible to divide it into atomic code blocks, that is, into non-interruptible program pieces [5]. Thus, the processor can execute one atomic block within the required time range and right after that execute another process with higher priority or an interrupt service routine. Simple and fast data structures for process management are also very important for the reduction of the execution time. For example, the different queues for process management must be simply organized and easily reached. Also, the ordering of the processes within the queues must be done with little effort.

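Process management    Memory management    Process communication
create_process        lallocate            create_channel
deblock_process       lfree                set_channel
block_process                              put_channel
kill_process                               get_channel
change_priority                            select_channel

Table 1: HEROS system calls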

To keep the local HEROS kernel small and to maintain a clear overview of the system, only a minimum of function calls was defined and implemented (Table 1). It contains, among others, functions for process and memory management as well as for process communication. The system call create_process is the only one which contains several atomic blocks and has no exactly calculable execution time. For that reason it is discussed here as an example. The function call in C syntax is as follows:

    create_process(cluster, cpu, prio, stacksize, argc, argv, filename, &pid)

The parameters have the following meaning:

- cpu is the number of the node within the cluster where the new process shall be allocated and started.
- prio is the process priority.
- stacksize is the size of the user stack.
- argc is the number of arguments that will be passed.
- argv is the pointer to the argument list.
- filename is a string and gives the path of the object code on the host filesystem which shall be loaded through the local area network and allocated in the corresponding node.
- pid points to the process identification number returned from the function create_process.
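A call with these parameters might look as follows in C. The parameter order follows the call shown above; everything else is an assumption made for illustration: the parameter types, the `int` return value with 0 for success, the stub body that stands in for the real kernel call, and the hypothetical object file path and helper `start_gripper_controller`.

```c
#include <stddef.h>

typedef unsigned long heros_pid_t;   /* assumed type of a PID */

/* Stub standing in for the kernel call so that the example compiles.
 * It only checks its arguments and hands back a placeholder PID; the real
 * call would create a PCB, allocate memory and load the code via the LAN. */
int create_process(int cluster, int cpu, int prio, size_t stacksize,
                   int argc, char *argv[], const char *filename,
                   heros_pid_t *pid)
{
    (void)prio;
    (void)argv;
    if (filename == NULL || pid == NULL || argc < 0 || stacksize == 0)
        return -1;                          /* assumed error convention */
    *pid = ((heros_pid_t)cluster << 16) | ((heros_pid_t)cpu << 8) | 1u;
    return 0;
}

/* Start a (hypothetical) gripper controller on node 2 of cluster 1. */
int start_gripper_controller(heros_pid_t *pid)
{
    static char arg0[] = "gripper";
    static char arg1[] = "7";               /* e.g. a channel number */
    char *args[] = { arg0, arg1 };

    return create_process(1,                /* cluster   */
                          2,                /* cpu       */
                          5,                /* prio      */
                          4096,             /* stacksize */
                          2, args,          /* argc, argv */
                          "/heros/bin/gripper.o",   /* made-up host path */
                          pid);
}
```

Since there is no global environment, the argument list is the only way the creator can hand initial information to the new process.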

The local kernel which receives the create_process call sends a message to the corresponding node if it cannot execute the function locally. Otherwise, it creates a process control block (PCB), checks with a remote call whether the file exists on the host and, after a successful call, receives the size of the object code. With this data it allocates the necessary memory space on the node. Then it sends the host the command to link the object code at the desired address. Subsequently, the object code is loaded into the reserved memory space through the LAN and the process is ready to start. From the above discussion it is easy to see that the execution time of the system call for the creation of a process cannot be calculated exactly, because the response time of the host and the network specification influence it. The other functions of Table 1 use only local resources; thus their execution times can be exactly calculated.

The memory functions serve to administrate the local memory area on the node. These functions can be used by the system as well as by the users. For instance, the kernel must reserve local memory (lallocate) for code, data and stacks when it creates a new process. Similar functions for the global memory management were also defined and implemented, with the difference that they are not visible to the user. These functions are used only by the kernel to administrate channels and other global resources on the cluster. The functions for process communication are described in the next section.

The process control block in the local kernel contains only the absolutely necessary information, namely the current process state, the priority, the pointer to the used resources and the context, that is, the contents of all processor registers at the time of the process switch. A process can assume the states blocked, ready, running, and interrupted. State transitions can be executed only by system calls or interrupt service routines. A process can change to the ready state only through the function deblock_process and to the blocked state only through the function block_process. After creation each process is in the state blocked by default. A process can only get into the state interrupted if it was running at the time of the interrupt.

The PCBs in the HEROS local kernel are organized in doubly linked lists. A process priority is attached to each of these lists; in other words, there are exactly as many queues as there are process priorities. This structure reduces the effort of selecting the next process to be run by the processor after a process switch. At a process switch, only the PCBs in the queues which have the same or a higher priority than the currently running process have to be checked.
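The queue organization described above can be sketched in C as one doubly linked list per priority level. This is a minimal illustration, not the HEROS implementation: the number of priority levels, the convention that a larger number means a higher priority, and all type and function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIORITIES 8   /* assumed; the paper gives no number */

/* Reduced PCB: only the fields needed for queueing. */
typedef struct pcb {
    int prio;                   /* 0 = lowest (assumed convention) */
    struct pcb *prev, *next;    /* doubly linked list links        */
} pcb_t;

typedef struct { pcb_t *head, *tail; } queue_t;

/* One ready queue per priority level. */
typedef struct { queue_t ready[NUM_PRIORITIES]; } sched_t;

void sched_init(sched_t *s)
{
    for (int i = 0; i < NUM_PRIORITIES; ++i)
        s->ready[i].head = s->ready[i].tail = NULL;
}

/* Append at the tail of the queue of the PCB's priority: O(1). */
void sched_enqueue(sched_t *s, pcb_t *p)
{
    assert(p->prio >= 0 && p->prio < NUM_PRIORITIES);
    queue_t *q = &s->ready[p->prio];
    p->next = NULL;
    p->prev = q->tail;
    if (q->tail) q->tail->next = p; else q->head = p;
    q->tail = p;
}

/* Unlink a PCB from its queue: O(1) thanks to the double links. */
void sched_remove(sched_t *s, pcb_t *p)
{
    queue_t *q = &s->ready[p->prio];
    if (p->prev) p->prev->next = p->next; else q->head = p->next;
    if (p->next) p->next->prev = p->prev; else q->tail = p->prev;
    p->prev = p->next = NULL;
}

/* At a process switch, look only at queues whose priority is the same as
 * or higher than that of the running process. Returns NULL when no such
 * process is ready, in which case the current process keeps the CPU. */
pcb_t *sched_pick(sched_t *s, int current_prio)
{
    assert(current_prio >= 0 && current_prio < NUM_PRIORITIES);
    for (int prio = NUM_PRIORITIES - 1; prio >= current_prio; --prio) {
        pcb_t *p = s->ready[prio].head;
        if (p) {
            sched_remove(s, p);
            return p;
        }
    }
    return NULL;
}
```

With this layout the cost of a selection is bounded by the number of priority levels, not by the number of processes, which fits the requirement that kernel execution times be calculable.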

5. The communication mechanism

The communication between processes, especially between those allocated on different nodes, is very important for the performance of parallel systems. With increasing communication the design and development of processes gets more complex. All relations between the processes must be defined, coordinated and synchronized at runtime. The degree of parallelism is also strongly influenced by the form of the interprocess communication. For example, with synchronous communication a process must wait for the explicit readiness of the other process before it can receive or send data. Especially for multiprocessor systems, this means that the nodes of the communicating processes remain inactive for a relatively long time even though they are capable of performing work. The HEROS system solves the above-mentioned problems with the aid of a special channel mechanism [1]. Thus, the multiprocessor architecture can be used fully. Normally a HEROS process keeps running until it has to wait for a certain event or until it cannot receive or send any more data. That means the process blocks only when it is absolutely necessary.

The HEROS channel represents a one-way connection between two processes. It consists of a descriptor, data objects and a state automaton. The descriptor contains the process identification numbers (PID) of the communicating processes, the access rights to the ports and the channel state.

The HEROS communication mechanism works asynchronously. The data object to be sent is stored in the channel and the sender can continue its work without delay; that is, it does not have to wait until the partner process is ready to receive the data. The resulting uncoupling can get lost under certain circumstances, namely if the sender produces data faster than the receiver can consume them. After some time the producer then cannot write additional data into the channel because the consumer has not read the earlier data yet. The sender process must then be blocked and cannot continue with its work. To counter this problem, the data objects are buffered in the HEROS channel. A capacity is attached to each channel, which is defined by the user at channel creation time. The capacity states how many data objects the channel may contain. Thus, the user can choose the degree of uncoupling of his communicating processes.

The most important characteristics of the HEROS channel are:

- The channel can be seen from the outside only through the sender and receiver ports. With this channel view, processes do not need to know anything about their communicating partners. This also enables an independent development of user processes and their independent allocation on any node of the system.
- The channel access is protected because each channel is attached to the sending and receiving processes through their unique PIDs. An unauthorized access through a function call is not executed and an error value is returned.
- The size of each data object administrated by the channel is arbitrary and only has to be defined by the user at channel creation time. The user can attach his own data type to the object. He must only assure that the sender and receiver work with the same data structure. Through this data transparency the development and programming of communicating processes can be considerably simplified. For instance, if the user wants to transfer a matrix, he only needs to send it as an object and not as a byte stream.
- In HEROS there are two possibilities for channel access. In ALTERNATE mode a channel data object must be read by the receiver before it can be overwritten by a new one. This kind of access has the same behavior as a pipe: all data written into the channel must be read by the receiver. If the number of unread data objects is equal to the user-defined channel capacity, any attempt to write into the channel blocks the producer. Only after the consumer has read at least one object from the channel, and so emptied at least one data container, is the sender deblocked. With RANDOM access, data objects deposited in the channel and not yet read can be overwritten by newer data. That means the producer can write into the channel at all times without blocking. RANDOM access is of interest for sensor data processing. For instance, a producer can periodically put the currently measured data of a sensor into the channel. The evaluating process can read data from this channel with the guarantee that it always corresponds to the current system state. Thus, the consumer neither has to synchronize itself with the sensor process nor has to issue periodic commands for measurements.
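The difference between the two access modes can be made concrete with a small single-threaded C model of a channel buffer. It is a sketch under several assumptions: blocking is modeled by a return code instead of a real process switch, RANDOM mode is assumed to overwrite the oldest unread object, a read is assumed to consume the object, and the fixed limits and all names (`channel_t`, `ch_put`, `ch_get`) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CAPACITY 8     /* illustrative limits, not from the paper */
#define MAX_OBJ_SIZE 64

typedef enum { CH_ALTERNATE, CH_RANDOM } ch_mode_t;
typedef enum { CH_OK = 0, CH_WOULD_BLOCK = 1 } ch_status_t;

typedef struct {
    ch_mode_t mode;
    size_t capacity;       /* number of data objects, set at creation */
    size_t obj_size;       /* size of one data object, set at creation */
    size_t head;           /* slot of the oldest unread object */
    size_t count;          /* number of unread objects */
    unsigned char slots[MAX_CAPACITY][MAX_OBJ_SIZE];
} channel_t;

void ch_init(channel_t *ch, ch_mode_t mode, size_t capacity, size_t obj_size)
{
    assert(capacity >= 1 && capacity <= MAX_CAPACITY);
    assert(obj_size >= 1 && obj_size <= MAX_OBJ_SIZE);
    ch->mode = mode;
    ch->capacity = capacity;
    ch->obj_size = obj_size;
    ch->head = 0;
    ch->count = 0;
}

/* Write one object. A full ALTERNATE channel reports that the producer
 * would be blocked; a full RANDOM channel overwrites the oldest object. */
ch_status_t ch_put(channel_t *ch, const void *obj)
{
    if (ch->count == ch->capacity) {
        if (ch->mode == CH_ALTERNATE)
            return CH_WOULD_BLOCK;
        ch->head = (ch->head + 1) % ch->capacity;   /* drop oldest unread */
        ch->count--;
    }
    size_t tail = (ch->head + ch->count) % ch->capacity;
    memcpy(ch->slots[tail], obj, ch->obj_size);
    ch->count++;
    return CH_OK;
}

/* Read the oldest unread object; an empty channel would block the reader. */
ch_status_t ch_get(channel_t *ch, void *obj)
{
    if (ch->count == 0)
        return CH_WOULD_BLOCK;
    memcpy(obj, ch->slots[ch->head], ch->obj_size);
    ch->head = (ch->head + 1) % ch->capacity;
    ch->count--;
    return CH_OK;
}
```

A RANDOM channel with capacity 1 then behaves like a mailbox that always holds the most recent sensor value, while an ALTERNATE channel behaves like a bounded pipe in which no object is lost.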

Name         State
virgin       Channel created and not written
written      Channel has been written
inspected    Channel has been read
closed       Channel access is not further allowed

Table 2: Channel states

To ensure and control the access discipline for the data objects, the history of the accesses to the objects must be stored. For this purpose a channel can assume one of the states shown in Table 2. State transitions can only be executed by channel function calls. After its creation a channel always goes into the state virgin.

Figure 4: State transitions for a channel with capacity 1

A read operation on a channel in the state virgin always leads to a veto signal, which forces the process to block. First there must be a write operation, by which the channel gets its initial value. The states written and inspected describe the state after the last operation. The state closed indicates that the channel should not be used further. Write operations in this state lead directly to exception handling. For read operations this is true only after the last contents of the channel have been read. The possible state transitions for a channel with capacity 1 are shown in Fig. 4.
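The rules in this paragraph can be written down as a small state machine in C for a channel with capacity 1 in ALTERNATE mode. Only the transitions stated in the text are taken from the paper. Three points are assumptions of this sketch: a write in state written and a read in state inspected are both answered with a veto, and closing an already closed channel is treated as an exception. All names are hypothetical.

```c
typedef enum { ST_VIRGIN, ST_WRITTEN, ST_INSPECTED, ST_CLOSED } ch_state_t;
typedef enum { OP_WRITE, OP_READ, OP_CLOSE } ch_op_t;
typedef enum { RES_OK, RES_VETO, RES_EXCEPTION } ch_result_t;

typedef struct {
    ch_state_t state;
    int unread;          /* 1 while the channel holds an unread object */
} ch_automaton_t;

/* Apply one channel operation and return its outcome.
 * RES_VETO means the calling process would be blocked. */
ch_result_t ch_step(ch_automaton_t *a, ch_op_t op)
{
    switch (a->state) {
    case ST_VIRGIN:
        if (op == OP_READ)
            return RES_VETO;             /* nothing written yet */
        if (op == OP_WRITE) {
            a->state = ST_WRITTEN;       /* channel gets its initial value */
            a->unread = 1;
            return RES_OK;
        }
        break;
    case ST_WRITTEN:
        if (op == OP_READ) {
            a->state = ST_INSPECTED;
            a->unread = 0;
            return RES_OK;
        }
        if (op == OP_WRITE)
            return RES_VETO;             /* assumed: full, producer blocks */
        break;
    case ST_INSPECTED:
        if (op == OP_WRITE) {
            a->state = ST_WRITTEN;
            a->unread = 1;
            return RES_OK;
        }
        if (op == OP_READ)
            return RES_VETO;             /* assumed: no new data to read */
        break;
    case ST_CLOSED:
        if (op == OP_READ && a->unread) {
            a->unread = 0;               /* last contents may still be read */
            return RES_OK;
        }
        return RES_EXCEPTION;            /* write, stale read, second close */
    }
    /* Only OP_CLOSE on an open channel reaches this point. */
    a->state = ST_CLOSED;
    return RES_OK;
}
```

Storing the state in the channel rather than in the processes is what allows the kernel to enforce the access discipline without the two partners knowing about each other.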

Figure 5: The HEROS channel (labels: father process, node n, node m)

With the help of the mechanisms discussed, the participating processes block themselves at the execution of a channel operation. This happens during the first access to a data object in the channel. A process keeps running as long as possible and is controlled by the data stream. The kernel functions in the HEROS system which support this channel mechanism are shown in Table 1. Because there is no global environment in the multiprocessor system, the processes which want to exchange information through a channel must receive the address of the joint channel with the aid of a third process, the so-called father process. The father process establishes a channel with the function create_channel and gets its identification number back. Afterwards, it creates the sender and receiver processes with the system call create_process (Fig. 5). In doing so, it passes the identification number of the joint channel to the child processes as a parameter. Now the father or the child processes can set the access discipline of the channel with the function set_channel. The sender can write a message into the channel with the system call put_channel and the receiver can pick it up with the call get_channel.

6. Implementation and results

The HEROS system was developed on a SUN workstation under Unix BSD 4.2. The largest part of the system kernel was implemented in the programming language C. Only a few parts, such as driver routines and switch_context, had to be written in assembler. For the HEROS cluster several MC68000 single board computer cards were connected with the VME bus. The Ethernet on the campus of the University was used as an example of a local area network. The protocols used for the communication between clusters and the SUN workstations were TCP/IP.

Some experiments with the first HEROS prototype were made in order to measure the system performance. For this purpose MC68000 single board computers running at 10 MHz were used. Table 3 gives an overview of the average time of some system calls and of switch_context on one node.

Function                 Average time
switch_context           ...
block_process            ...
deblock_process          5
put_channel (integer)    282
get_channel (integer)    8

Table 3: Execution time of system calls

Acknowledgement

The HEROS system is being developed at the Institute for Real-Time Computer Control Systems and Robotics, Faculty of Informatics, University of Karlsruhe, 7500 Karlsruhe, Federal Republic of Germany. The work was funded by the Deutsche Forschungsgemeinschaft.

In order to operate and control the multiprocessor from a SUN workstation, several extra tools had to be developed. A menu-controlled program running on the workstation makes it possible to create, block, deblock and terminate HEROS processes on every node of the system. Control information, such as the process state on the different nodes, can also be obtained from the workstation. Conversely, HEROS processes can use the window facilities of the SUN. For this purpose a client was installed in HEROS and a server for window handling on the workstation. A process that wants to send data to the workstation only needs to create a window with the aid of the client and to send its data to the client. Thus, the user can supervise and control the whole system on a central screen.

Control programs often need access to data stored in files. For this reason, a service following the client-server principle was installed. It supports Unix file operations. A HEROS process can access the data on the host computer through the functions open, read, write and close. Access to other file systems will also be possible with a newly developed HEROS server, which will support the standard NFS (Network File System) protocols.

To simplify the use of sensors a resource manager is being planned. Its task will be to install, initialize and handle the driver processes for the different sensors of the system. The driver processes need only be implemented once for each sensor type and will be able to preprocess the sensor data. Each user process which wants to use a sensor can access it through predefined functions under the control of the resource manager. A sensor can be seen as a logical entity which can be reached only through operations on a data structure. This enables a clear separation between the technical implementation and the actual sensor data processing.

Finally, tools are in development which make it possible to measure HEROS parameters such as the processor balance, the length of the process queues and the memory demand. Thus, there is hope that the system performance can be improved in the near future. In the first phase, optimization is done by the user; in the second phase it is done automatically while the system is running.

References

[1] Kordecki C.: "Concept for a Distributed Multiprocessor System", PhD dissertation, University of Karlsruhe, 1987

[2] Strom R. and Yemini S.: "NIL: An Integrated Language and System for Distributed Programming", Proceedings, ACM SIGPLAN Conference on Programming Languages Issues in Systems, June 1983, pp. 73-82

[3] Strom R. and Yemini S.: "The NIL Distributed Systems Programming Language: A Status Report", SIGPLAN Notices, Vol. 20, Nr. 5, May 1985, pp. 36-44

[4] Ciciani B. and Cioffi G.: "A Proposal for Autonomous and Dynamical Cooperating Processes", IFIP Conference on Distributed Processing, Amsterdam, October 1987

[5] Parnas D. and Faulk S.: "On Synchronization in Hard-Real-Time Systems", Communications of the ACM, Vol. 31, Nr. 3, March 1988, pp. 274-287
