8/11/2019 00048072
1/6
A Concept for Distributed Control Systems
M. Glaser, C. Kordecki, U. Rembold
University of Karlsruhe
Institute for Real-Time Computer Systems and Robotics
P.O. Box 6980, D-7500 Karlsruhe, FRG
This paper describes the HEROS system (Hierarchically Extendible
Real-Time Operating System), which was especially developed and
implemented for the control and supervision of robots. It allows the
dynamic creation of processes and their management. The processes in
HEROS have no knowledge of their mutual existence and possess no
global kernel routines or variables. The interprocess communication
and synchronization is accomplished through a channel mechanism. For
this operation, simple and effective function calls are made
available to the programmer. The uncoupling of the processes which is
attained through the channel concept enables the tasks to be
independently defined, implemented and operated in parallel. The
channel concept allows the processes to be allocated independently of
each other to the processors. The HEROS system is composed of several
clusters connected through a local area network (LAN). Each cluster
consists of several conventional single board computers (SBC), a
global memory and a network controller. Within such a cluster, the
hard real-time requirements of a robot can be fulfilled. HEROS tasks
can also communicate with other computer systems through the LAN,
whereby the specific network characteristics and the response time of
the services determine the real-time behavior.
1. Introduction
The most important requirement of a hard real-time environment is the
observance of time limits. Information which is supplied too early or
too late is useless and can lead to an undefined state in a control
system. To assure real-time operation it is necessary to have a high
computer performance (in the extreme case one computer for each sensor
and actuator) and to provide easy mechanisms for interaction between
the participating processes. Additionally, it is required that the
whole system is hierarchically constructed to ensure an efficient
control on the different system levels. A dynamic process
reconfiguration is also needed for automatic adaptation to changing
external environmental conditions.
Typical applications of hard real-time systems are
flight control, traffic control, computer integrated
manufacturing (CIM) and robotics.
The HEROS system (Hierarchically Extendible Real-Time Operating
System), developed by the Institute for Real-Time Computer Control
Systems and Robotics of the University of Karlsruhe, is a prototype
for the control and supervision of industrial robots which operate in
flexible assembly cells. One of the aims of the project was the
development of a distributed multiprocessor system for real-time
applications under the following conditions:
- Only hardware components which are the state of the art in the
  industry should be used to build up and integrate the system.
- A well-known and available programming environment should be used
  for the system and user software development.
- The multiprocessor system should be programmed with standard
  languages like C or Pascal, and with new programming tools for
  parallel computer architectures.
- The multiprocessor system must be designed to control and supervise
  robots, especially in an industrial environment where hard
  real-time requirements are usually encountered.
Another research area was the dynamic process allocation which is
necessary for process reconfiguration during system operation,
whereby the independence between processes and processors must be
ensured. Communication capabilities between the multiprocessor system
and an external system through a local area network or a special
channel were also to be investigated.
This paper first describes the HEROS system architecture and its
kernel. Next, the mechanism for process communication, which is an
essential component of the system, is explained. In conclusion, the
implementation of the system and the main objectives for future work
are discussed.
0073-1129/89/0000/0667 $01.00 © 1989 IEEE
2. Process model
In HEROS a process is composed of a local data structure and a
sequential program. The data structure can be manipulated only by the
process's own program. The processes communicate through message
passing, and a process can send a message to another only if it has a
channel to it. This model assures that the process environment is a
local one. It was proposed by R. Strom and S. Yemini [2] and used in
the distributed systems programming language NIL [3].
Processes can be created and activated by other processes at any node
of the system. Since there is no global environment, the information
for the new process is passed by parameters at creation time. A
process can extend its knowledge about the environment through
subsequent communication [4].
At creation time each process gets a system-wide unique
identification number (PID), which is composed of the cluster number,
the node number and the processor number within the node. This
enables fast process localization. Each process can be terminated by
itself, by others or by the system if a fatal error has happened. At
termination time all resources which the process has used, such as
channels, memory or I/O devices, are returned to the system.
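The way a structured PID supports fast localization can be sketched in C. This is a minimal illustration, not the HEROS layout: the paper does not give field widths, so the 8/8/16-bit split, the type name heros_pid_t and the helper names pid_make, pid_cluster, pid_node, pid_local and pid_is_local are all assumptions.

```c
#include <stdint.h>

/* Hypothetical PID layout (field widths are assumed, not taken from HEROS):
 * bits 31..24 cluster number, bits 23..16 node number,
 * bits 15..0  number within the node. */
typedef uint32_t heros_pid_t;

/* Compose a PID from its three location fields. */
heros_pid_t pid_make(uint8_t cluster, uint8_t node, uint16_t local)
{
    return ((heros_pid_t)cluster << 24) | ((heros_pid_t)node << 16) | local;
}

/* Extract the individual fields again. */
uint8_t  pid_cluster(heros_pid_t pid) { return (uint8_t)(pid >> 24); }
uint8_t  pid_node(heros_pid_t pid)    { return (uint8_t)((pid >> 16) & 0xFFu); }
uint16_t pid_local(heros_pid_t pid)   { return (uint16_t)(pid & 0xFFFFu); }

/* A process is local if its PID names the caller's own cluster and node,
 * so locating it needs two comparisons and no table lookup. */
int pid_is_local(heros_pid_t pid, uint8_t my_cluster, uint8_t my_node)
{
    return pid_cluster(pid) == my_cluster && pid_node(pid) == my_node;
}
```

With such an encoding, a kernel could decide from the PID alone whether a request concerns its own node or has to be passed on.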
3. System architecture
The proposed distributed multiprocessor system consists of clusters
which are loosely coupled through a local area network (LAN). A
cluster is composed of several tightly coupled nodes connected
through a bus with a partly common address space (Fig. 1). Each
cluster contains a global memory for the process communication
between nodes and a LAN controller, which makes the connection to the
outside. Each node is composed of a microprocessor unit and a local
memory.
This architecture simplifies the development and test of user
applications. The programs can be developed on the host computer with
available programming tools and languages like C and Pascal. The
object code of these programs is linked with the HEROS run-time
library and downloaded to the multiprocessor system through the LAN.
A further advantage of this architecture is that the HEROS processes
can use external services (Fig. 2.a). The user can access data from
remote systems and use them in his process. For example, construction
plans can be taken directly from a CAD system and used for industrial
production.
In addition to the communication with external systems via the LAN,
the HEROS system provides the possibility of integrating
microcontrollers for sensors into the multiprocessor system. The
microcontrollers are connected to the local bus of a cluster
(Fig. 2.b). The communication with the multiprocessor system is done
through a statically defined memory space in the global memory. Since
the microcontrollers do not have a HEROS kernel, each user must
define and implement his own communication and synchronization
mechanisms in order to integrate the controller into the system. Of
course, the same mechanisms as in HEROS can be used in this case.
Figure 1: The HEROS architecture
A very short reaction time, as required in a hard real-time
environment, can be ensured only within clusters. This means that
related processes requiring a closed loop control with critical
reaction times must be installed in the same cluster. Reaction times
outside the clusters depend on the response time of the external
service and on the network specification. For example, with an
Ethernet the transport time of the packets in the network cannot be
predicted because of the CSMA/CD method used. Admittedly, at low
network load the response times are better than in token rings. On
the other hand, processes of closed loop controls which have long
dead times can be distributed over different clusters without adverse
effect on the required time limits.
Figure 2: Communication with external systems (panel a: through the local network)
4. HEROS kernel
Every node of the multiprocessor system has an identical operating
system kernel (Fig. 3). It is composed of two components, the HEROS
Local Kernel (HLK) and the HEROS Node Server (HNS). Additionally,
each cluster contains in a special node the HEROS Cluster Server
(HCS). Its purpose is to manage the communication and synchronization
of the cluster with the outside. For the user, HNS and HCS remain
hidden; he can only make calls to the local kernel. In this manner
the independence and the transparency of a process with respect to
its location can be attained. That means that processes which send
information to their communication partner need not know whether the
receiver is in the same node, the same cluster or somewhere else.
This task is taken over by the system.
If a user call to the kernel cannot be processed locally, it is
forwarded to the node server or, in turn, to the cluster server. This
passing of calls goes on until it is possible to process the call.
The results and error messages are sent back the same way to the
calling process. The separation of the local HEROS kernel into two
well-defined components permits corrections or extensions to the
implementation of the server or kernel without changing the user
applications.
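The forwarding of a call from the local kernel to the node server and on to the cluster server can be sketched as a small chain in C. This is only an illustration under stated assumptions: the types site and call, the functions can_serve and dispatch, and the rule that a call is identified by a target cluster and node are hypothetical and not part of the HEROS interface.

```c
/* Levels of the forwarding chain: HLK, then HNS, then HCS. */
enum level { LOCAL_KERNEL, NODE_SERVER, CLUSTER_SERVER };

/* Hypothetical addressing: where the caller runs and where the call
 * has to be served. */
struct site { int cluster; int node; };
struct call { int target_cluster; int target_node; };

/* Can the given level serve the call without passing it on? */
int can_serve(enum level lv, const struct site *self, const struct call *c)
{
    switch (lv) {
    case LOCAL_KERNEL:
        return c->target_cluster == self->cluster &&
               c->target_node == self->node;
    case NODE_SERVER:
        return c->target_cluster == self->cluster;
    case CLUSTER_SERVER:
        return 1;   /* last level: reaches other clusters through the LAN */
    }
    return 0;
}

/* Pass the call upwards until some level can process it.
 * The loop ends at the latest at CLUSTER_SERVER, which serves everything. */
enum level dispatch(const struct site *self, const struct call *c)
{
    enum level lv = LOCAL_KERNEL;
    while (!can_serve(lv, self, c))
        lv = (enum level)(lv + 1);
    return lv;
}
```

The user process only ever calls the first level; which level finally serves the call stays hidden, which is the location transparency described above.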
Figure 3: Structure of the local HEROS kernel
4.1. HEROS node server
The HEROS node server (HNS) is a system process with the task of
forwarding user calls which cannot be served locally to another node
of the cluster. The HNS is the representative of the local processes
in a cluster. It is limited to sending the user's call to the other
clusters and waiting for the results. These are finally returned to
the local process.
The HEROS node server also receives messages from the other HNS of
the cluster, changes them into local kernel calls and sends the
results back. The communication mechanism of the node server is the
same as that of the process communication and is explained further
on.
4.2. HEROS local kernel
In principle, the local kernel has the same task as a common
operating system kernel. Since it works under real-time conditions,
special and well-defined algorithms have to be applied for its
development and implementation. The kernel functions must be very
compact to keep the system overhead as small as possible. Besides,
the execution time must be exactly calculable so that the latency of
the system can be determined accurately. In case the execution time
of a kernel function cannot be kept within a defined range, it must
be possible to divide it into atomic code blocks, that is, into
non-interruptible program pieces [5]. Thus, it is possible for the
processor to execute one atomic block within the required time range
and right after that to execute another process with higher priority
or an interrupt service routine. Simple and fast data structures for
process management are also very important for the reduction of the
execution time. For example, the different queues for the process
management must be simply organized and easily reached. Also, the
ordering of the processes within the queues must be done with little
effort.
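Process management    Memory management    Process communication
create_process        lallocate            create_channel
deblock_process       lfree                set_channel
block_process                              put_channel
kill_process                               get_channel
change_priority                            select_channel
Table 1: HEROS system calls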
To keep the local HEROS kernel small and to maintain a clear overview
of the system, only a minimum of function calls were defined and
implemented (Table 1). It contains, among others, functions for
process and memory management as well as for process communication.
The system call create_process is the only one which contains several
atomic blocks and has no exactly calculable execution time. For that
reason it is discussed here as an example. The function call in C
syntax is as follows:
    create_process(cluster, cpu, prio, stacksize, argc, argv, filename, &pid)
The parameters have the following meaning:
- cpu is the number of the node within the cluster where the new
  process shall be allocated and started.
- prio is the process priority.
- stacksize is the size of the user stack.
- argc is the number of arguments that will be passed.
- argv is the pointer to the argument list.
- filename is a string and gives the path of the object code on the
  host filesystem which shall be loaded through the local area
  network and allocated in the corresponding node.
- pid points to the process identification number returned from the
  function create_process.
The local kernel which receives the create_process call sends a
message to the corresponding node if it cannot execute the function
locally. Otherwise, it creates a process control block (PCB), checks
with a remote call whether the file exists on the host and, after a
successful call, receives the size of the object code. With this data
it allocates the necessary memory space on the node. Then it sends
the host the command to link the object code at the desired address.
Subsequently, the object code is loaded into the reserved memory
space through the LAN and the process is ready to start. From the
above discussion it is easy to see that the execution time of the
system call for the creation of a process cannot be calculated
exactly, because the response time of the host and the network
specification influence it. The other functions of Table 1 use only
local resources; thus their execution times can be exactly
calculated.
The memory functions serve to administer the local memory area on the
node. These functions can be used by the system as well as by the
users. For instance, the kernel must reserve local memory (lallocate)
for code, data and stacks when it creates a new process. Similar
functions for the global memory management were also defined and
implemented, with the difference that they are not visible to the
user. These functions are used only by the kernel to administer
channels and other global resources on the cluster. The functions for
process communication are described in the next section.
The process control block in the local kernel contains only the
absolutely necessary information, namely the current process state,
the priority, the pointer to the used resources and the context. The
context consists of the contents of all processor registers at the
time of the process switch. A process can assume the states blocked,
ready, running and interrupted. State transitions can be executed
only by system calls or interrupt service routines. A process can
change to the state ready only through the function deblock_process
and to the state blocked only through the function block_process.
After creation each process is by default in the state blocked. A
process can get into the state interrupted only if it was running at
the interrupt time.
PCBs
in the HEROS local kernel are structured
in doubly linked lists. To each of these lists a process
priority is attached, in other words there a re exactly
as many queues as there are process priorities. This
kind of structure reduces the effort of selecting the
next process t o be run by the processor after the pro-
cess switching. A t process switching only the
PCBs
in the queues have to be checked which have the
same or a higher priority as the actual running pro-
cess.
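One doubly linked queue per priority can be sketched in C as follows. The number of priority levels (NPRIO), the convention that 0 is the highest priority, and the names pcb, sched, enqueue, unlink_pcb and select_next are assumptions of this sketch; the real PCB also holds state, resources and context, which are left out here.

```c
#include <stddef.h>

#define NPRIO 8   /* assumed number of priority levels; 0 is the highest */

struct pcb {
    int pid;
    int prio;                  /* 0 .. NPRIO-1 */
    struct pcb *prev, *next;   /* links within the queue of this priority */
};

struct queue { struct pcb *head, *tail; };
struct sched { struct queue q[NPRIO]; };   /* one queue per priority */

void sched_init(struct sched *s)
{
    for (int i = 0; i < NPRIO; i++)
        s->q[i].head = s->q[i].tail = NULL;
}

/* Append a PCB to the tail of its priority queue in constant time. */
void enqueue(struct sched *s, struct pcb *p)
{
    struct queue *q = &s->q[p->prio];
    p->next = NULL;
    p->prev = q->tail;
    if (q->tail) q->tail->next = p; else q->head = p;
    q->tail = p;
}

/* Remove a PCB from its queue; the double links make this constant time. */
void unlink_pcb(struct sched *s, struct pcb *p)
{
    struct queue *q = &s->q[p->prio];
    if (p->prev) p->prev->next = p->next; else q->head = p->next;
    if (p->next) p->next->prev = p->prev; else q->tail = p->prev;
    p->prev = p->next = NULL;
}

/* Check only the queues with the same or a higher priority than the
 * currently running process and return the first PCB found, if any. */
struct pcb *select_next(const struct sched *s, int current_prio)
{
    for (int pr = 0; pr <= current_prio && pr < NPRIO; pr++)
        if (s->q[pr].head)
            return s->q[pr].head;
    return NULL;
}
```

The cost of select_next is bounded by the number of priority levels and does not depend on how many processes are queued, which keeps its execution time calculable.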
5. The communication mechanism
The communication between processes, especially between those
allocated on different nodes, is very important for the performance
of parallel systems. With increasing communication the design and
development of processes gets more complex. All relations between the
processes must be defined, coordinated and synchronized at runtime.
The degree of parallelism is also strongly influenced by the form of
the interprocess communication. For example, with synchronous
communication a process must wait for the explicit readiness of the
other process before it can receive or send data. Especially for
multiprocessor systems, this means that the nodes of the
communicating processes remain inactive for a relatively long time
even though they are capable of performing. The HEROS system solves
the above-mentioned problems with the aid of a special channel
mechanism [1]. Thus, the multiprocessor architecture can be used
fully. Normally a HEROS process keeps running until it has to wait
for a certain event or until it cannot receive or send any more data.
That means the process blocks only when it is absolutely necessary.
The HEROS channel represents a one-way connection between two
processes. It consists of a descriptor, data objects and a state
automaton. The descriptor contains the process identification numbers
(PID) of the communicating processes, the access rights to the ports
and the channel state.
The HEROS communication mechanism works asynchronously. The data
object to be sent is stored in the channel and the sender can
continue its work without delay; that is, it does not have to wait
until the partner process is ready to receive the data. The resulting
uncoupling can get lost under certain circumstances, namely if the
sender produces data faster than the receiver can consume it. After
some time the producer can then no longer write additional data into
the channel because the consumer has not yet read the earlier data.
The sender process must then be blocked and cannot continue with its
work. To mitigate this problem, the data objects are buffered in the
HEROS channel. A capacity is attached to each channel, which is
defined by the user at channel creation time. The capacity specifies
how many data objects the channel may contain. Thus, the user can
choose the degree of uncoupling of his communicating processes.
The most important characteristics of the HEROS
channel are:
- The channel can be seen from the outside only through the sender
  and receiver ports. With this channel view, processes do not need
  to know anything about their communicating partners. This also
  enables an independent development of user processes and their
  independent allocation on any node of the system.
- The channel access is protected because each channel is attached to
  the sending and receiving processes by their unique PIDs. An
  unauthorized access with a function call is not executed and an
  error value is returned.
- The size of each data object administered by the channel is
  arbitrary; it must only be defined by the user at channel creation
  time. The user can attach his own data type to the object. He must
  only assure that the sender and receiver work with the same data
  structure. Through this data transparency the development and
  programming of communicating processes can be considerably
  simplified. For instance, if the user wants to transfer a matrix,
  he only needs to send it as an object and not as a byte stream.
- In HEROS there are two possibilities for channel access. In the
  ALTERNATE mode a channel data object must be read by the receiver
  before it can be overwritten by a new one. This kind of access has
  the same behavior as a pipe: all data written into the channel must
  be read by the receiver. If the number of unread data objects is
  equal to the user-defined channel capacity, any attempt to write
  into the channel blocks the producer. Only after the consumer has
  read at least one object from the channel, and so emptied at least
  one data container, is the sender deblocked. With the RANDOM access
  the data objects deposited in the channel and not yet read can be
  overwritten by newer data. That means the producer can write into
  the channel at all times without blocking. The RANDOM access is of
  interest for sensor data processing. For instance, a producer can
  periodically put the current measured data of a sensor into the
  channel. The evaluating process can read data from this channel
  with the guarantee that it always corresponds to the current system
  state. Thus, the consumer neither has to synchronize itself with
  the sensor process nor has to issue periodic commands for
  measurements.
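The difference between the ALTERNATE and RANDOM access modes can be illustrated with a small bounded buffer in C. This is a sketch, not the HEROS implementation: the names channel_init, channel_put and channel_get, the fixed storage limit CH_MAX_BYTES, and the return code CH_WOULD_BLOCK (standing in for the kernel blocking the caller) are assumptions. It is also assumed that RANDOM mode discards the oldest unread object when the channel is full and that every read consumes one object.

```c
#include <stddef.h>
#include <string.h>

enum ch_mode   { CH_ALTERNATE, CH_RANDOM };
enum ch_result { CH_OK, CH_WOULD_BLOCK };

#define CH_MAX_BYTES 256   /* assumed storage limit of this sketch */

struct channel {
    enum ch_mode mode;
    size_t obj_size;   /* size of one data object, fixed at creation */
    size_t capacity;   /* maximum number of unread objects */
    size_t count;      /* current number of unread objects */
    size_t head;       /* slot index of the oldest unread object */
    unsigned char buf[CH_MAX_BYTES];
};

/* Returns 0 on success, -1 if the requested objects do not fit. */
int channel_init(struct channel *ch, enum ch_mode mode,
                 size_t obj_size, size_t capacity)
{
    if (obj_size == 0 || capacity == 0 || capacity > CH_MAX_BYTES / obj_size)
        return -1;
    ch->mode = mode;
    ch->obj_size = obj_size;
    ch->capacity = capacity;
    ch->count = 0;
    ch->head = 0;
    return 0;
}

enum ch_result channel_put(struct channel *ch, const void *obj)
{
    size_t tail;
    if (ch->count == ch->capacity) {
        if (ch->mode == CH_ALTERNATE)
            return CH_WOULD_BLOCK;   /* the producer would be blocked here */
        /* RANDOM: give up the oldest unread object to make room */
        ch->head = (ch->head + 1) % ch->capacity;
        ch->count--;
    }
    tail = (ch->head + ch->count) % ch->capacity;
    memcpy(ch->buf + tail * ch->obj_size, obj, ch->obj_size);
    ch->count++;
    return CH_OK;
}

enum ch_result channel_get(struct channel *ch, void *obj)
{
    if (ch->count == 0)
        return CH_WOULD_BLOCK;       /* the consumer would be blocked here */
    memcpy(obj, ch->buf + ch->head * ch->obj_size, ch->obj_size);
    ch->head = (ch->head + 1) % ch->capacity;
    ch->count--;
    return CH_OK;
}
```

With capacity 1 and RANDOM mode the channel always holds the newest value only, which matches the sensor use case; with ALTERNATE mode the capacity sets how far the producer may run ahead of the consumer.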
Name         State
virgin       Channel created and not written
written      Channel has been written
inspected    Channel has been read
closed       Channel access is not further allowed
Table 2: Channel states
To enforce and control the access discipline for the data objects,
the history of the accesses to the objects must be stored. For this
purpose a channel can assume one of the states shown in Table 2.
State transitions can only be executed by channel function calls.
After its creation a channel always goes into the state virgin.
Figure 4: State transitions for a channel with capacity 1
A read operation on a channel in the state virgin always leads to a
veto signal, which forces the process to block. First there must be a
write operation, by which the channel gets its initial value. The
states written and inspected describe the state after the last
operation. The state closed indicates that the channel should not be
used further. Write operations in this state lead directly to
exception handling. For read operations this is true only after the
last contents of the channel have been read. The possible state
transitions for a channel with capacity 1 are shown in Fig. 4.
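The state rules described above can be sketched as a small automaton in C for a channel with capacity 1. Since the transition diagram itself is not reproduced here, several details are assumptions of this sketch: the operation and outcome names, the extra unread flag used to let one last read succeed after closing, the overwrite flag for RANDOM access, and the choice that a second read without an intervening write yields a veto.

```c
enum ch_state   { CH_VIRGIN, CH_WRITTEN, CH_INSPECTED, CH_CLOSED };
enum ch_op      { OP_WRITE, OP_READ, OP_CLOSE };
enum ch_outcome { OUT_OK, OUT_VETO, OUT_EXCEPTION };

struct ch_fsm {
    enum ch_state state;
    int unread;      /* 1 if the single slot holds an object not yet read */
    int overwrite;   /* 1 for RANDOM access, 0 for ALTERNATE (assumed flag) */
};

/* Apply one channel operation.
 * OUT_VETO stands for "the calling process is blocked",
 * OUT_EXCEPTION for "exception handling is started". */
enum ch_outcome ch_step(struct ch_fsm *c, enum ch_op op)
{
    switch (op) {
    case OP_WRITE:
        if (c->state == CH_CLOSED)
            return OUT_EXCEPTION;        /* no writes after closing */
        if (c->unread && !c->overwrite)
            return OUT_VETO;             /* ALTERNATE: slot still unread */
        c->unread = 1;
        c->state = CH_WRITTEN;
        return OUT_OK;
    case OP_READ:
        if (!c->unread)                  /* virgin or already read */
            return c->state == CH_CLOSED ? OUT_EXCEPTION : OUT_VETO;
        c->unread = 0;
        if (c->state != CH_CLOSED)
            c->state = CH_INSPECTED;
        return OUT_OK;                   /* last contents may still be read */
    case OP_CLOSE:
        c->state = CH_CLOSED;
        return OUT_OK;
    }
    return OUT_EXCEPTION;
}
```

A freshly created channel would start as { CH_VIRGIN, 0, 0 } for ALTERNATE access, so that the first read is vetoed until a writer has supplied the initial value.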
Figure 5: The HEROS channel
With the help of the discussed mechanisms the participating processes
block themselves at the execution of a channel operation. This
happens during the first access to a data object in the channel. A
process keeps running as long as possible and is controlled by the
data stream. The kernel functions in the HEROS system which support
this channel mechanism are shown in Table 1. Because there is no
global environment in the multiprocessor system, the processes which
want to exchange information through a channel must receive the
address of the joint channel with the aid of a third process, the
so-called father process. The father process establishes a channel
with the function create_channel and gets its identification number
back. Afterwards, it creates the sender and receiver processes with
the system call create_process (Fig. 5). Thereby, it gives the
identification number of the joint channel to the child processes
through passed parameters. Now the father or the child processes can
set the access discipline of the channel with the function
set_channel. The sender can write a message into the channel with the
system call put_channel and the receiver can pick it up with the call
get_channel.
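How a channel identification number might travel from the father to a child through the passed parameters can be sketched in C. The paper does not specify the argument format, so this sketch assumes that the arguments are C strings as for main(argc, argv) and that the channel number is passed in argv[1]; the helper names channel_id_to_arg and channel_id_from_args are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Father side: format the channel id as a string argument.
 * Returns 0 on success, -1 if the buffer is too small. */
int channel_id_to_arg(int chan_id, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%d", chan_id);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Child side: recover the channel id from argv[1] (assumed position).
 * Returns 0 on success, -1 if the argument is missing or not a number. */
int channel_id_from_args(int argc, char *argv[], int *chan_id)
{
    char *end;
    long v;
    if (argc < 2)
        return -1;
    v = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0')
        return -1;
    *chan_id = (int)v;
    return 0;
}
```

In this scheme the father would fill one argument string per channel before calling create_process for the sender and the receiver, and each child would parse its arguments once at start-up.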
6. Implementation and results
The HEROS system was developed on a SUN workstation under Unix BSD
4.2. The largest part of the system kernel was implemented in the
programming language C. Only a few parts, such as driver routines and
switch_context, had to be written in assembler. For the HEROS cluster
several MC68000 single board computer cards were connected through
the VME bus. The Ethernet on the campus of the university was used as
an example of a local area network. The protocols used for the
communication between clusters and the SUN workstations were TCP/IP.
Some experiments with the first HEROS prototype were made in order to
measure the system performance. For this purpose MC68000 single board
computers running at 10 MHz were used. Table 3 gives an overview of
the average time of some system calls and of switch_context on one
node.
Function                 Average time
switch_context
block_process
deblock_process          5
put_channel (integer)    282
get_channel (integer)    8
Table 3: Execution time of system calls
In order to operate and control the multiprocessor with a SUN
workstation, several extra tools had to be developed. A
menu-controlled program running on the workstation makes it possible
to create, block, deblock and terminate HEROS processes on every node
of the system. Control information, such as the process state on the
different nodes, can also be obtained from the workstation. On the
other hand, HEROS processes have the possibility to use the window
facilities on the SUN. For this purpose a client was installed in
HEROS and a server for window handling on the workstation. A process
that wants to send data to the workstation only needs to create a
window with the aid of the client and to send its data to the client.
Thus, the user can supervise and control the whole system on a
central screen.
Control programs often need access to data stored in files. For this
reason, a service according to the client-server principle was
installed. It supports Unix file operations. A HEROS process can
access the data on the host computer with the functions open, read,
write and close. Access to other file systems will also be possible
with a newly developed HEROS server, which will support the standard
NFS (Network File System) protocols.
To simplify the use of sensors, a resource manager is being planned.
Its task will be to install, initialize and handle the driver
processes for the different sensors of the system. The driver
processes need only be implemented once for each sensor type and will
be able to prepare the sensor data. Each user process which wants to
use a sensor can access it by predefined functions under the control
of the resource manager. A sensor can be seen as a logical entity
which can be reached only through operations on a data structure.
This enables a clear separation between the technical implementation
and the actual sensor data processing.
Finally, tools are in development which make the measurement of HEROS
parameters possible, such as the processor balance, the length of the
process queues and the memory demand. Thus, there is hope that the
system performance can be improved in the near future. In the first
phase, optimization is done by the user; in the second phase it is
done automatically while the system is running.
Acknowledgement
The HEROS system is being developed at the Institute for Real-Time
Computer Control Systems and Robotics, Faculty of Informatics,
University of Karlsruhe, 7500 Karlsruhe, Federal Republic of Germany.
The work was funded by the Deutsche Forschungsgemeinschaft.
References
[1] Kordecki, C.: "Concept for a Distributed Multiprocessor System",
    PhD dissertation at the University of Karlsruhe, 1987
[2] Strom R. E. and Yemini S.: "NIL: An Integrated Language and
    System for Distributed Programming", Proceedings, ACM SIGPLAN
    Conference on Programming Languages Issues in Systems, June 1983,
    pp. 73-82
[3] Strom R. E. and Yemini S.: "The NIL Distributed Systems
    Programming Language: A Status Report", SIGPLAN Notices, Vol. 20,
    Nr. 5, May 1985, pp. 36-44
[4] Ciciani B. and Cioffi G.: "A Proposal for Autonomous and
    Dynamical Cooperating Processes", IFIP Conference on Distributed
    Processing, Amsterdam, October 1987
[5] Parnas D. and Faulk S.: "On Synchronization in Hard-Real-Time
    Systems", Communications of the ACM, Vol. 31, Nr. 3, March 1988,
    pp. 274-287