Date post: | 21-May-2015 |
Category: |
Technology |
Upload: | andrea-tino |
Riccardo Pulvirenti, Giuseppe Ravidà & Andrea Tino; Marco Buzzanca & Davide Giuseppe Monaco
Università degli Studi di Catania - A.A. 2010/11
Corso di Laurea Specialistica in Ing. Informatica DIIT @ Università di Catania
Prof. Eng. A. Di Stefano, Eng. G. Morana
Distributed Systems 2010-11
TaskMan-Middleware 2011: A Standard C++ distributed middleware for workflow management over a P2P INET TCP/UDP IP PUSH/PULL-Sending-Model oriented network
Development Project for Distributed Systems
TaskMan-Middleware 2011 by M. Buzzanca, D. G. Monaco, R. Pulvirenti, G. Ravidà and A. Tino Distributed Systems Project 2011
The project was started jointly by both teams, which worked together through the initial steps. The subsequent core design and development stages were completed separately, yielding two software products with different characteristics and implementation solutions (patterns).
Group components
Project requirements: C++ middleware kernel using a PUSH sending model in a P2P-oriented network using the TCP/IP protocol.
Group components
Project requirements: C++ middleware kernel using a PULL sending model in a P2P-oriented network using the UDP/IP protocol.
Riccardo Pulvirenti
Giuseppe Ravidà
Andrea Tino
Marco Buzzanca
Davide Giuseppe Monaco
More about teams: Description of team components
Two teams were involved in the development of a C++ middleware kernel for workflow management and execution. The teams worked together during some stages; the team workflow is described below.
Requirement analysis (both teams): Requirements for communications, protocols, task management, queue management, routing management.
Initial design (both teams): Design of common components: synchronized queues, descriptors for tasks and workflows.
Core design (each team separately): Design of specific (non-common) components.
Common components development (both teams): Development of common components: synchronized queues, parts and structures of common classes.
Core development (each team separately): Development of specific (non-common) components.
Releases: Release 1 and Release 2.
TaskMan-Middleware Teaming: An overview of teams involved in the C++ project branch
We now describe and discuss the main characteristics of both projects, focusing on design choices and implementation solutions (always trying to provide accurate rationales). The members of the two teams will take turns presenting their respective parts.
Part 1: Introducing the main actors and all services implemented by the application. We’ll focus on basic details of the software, providing a brief introduction to the most important elements.
Part 2: Introducing the technology, compiler, language and other tools used to develop the applications.
Part 3: Introducing and describing the data flows in the applications, in order to understand how data are exchanged between the distributed entities of the software.
Part 4: Providing all the development details. In this part, the most important development solutions and patterns will be explained and described (rationales will be provided too).
Part 5: Conclusions. The most important pros and cons of the applications will be summarized, pointing out the best approaches for solving open issues in the future.
Presentation flow: Presenting solutions, patterns and choices
Both projects were developed with the precise purpose of obtaining a fast, low-level system. Given these requirements, language and architecture had to be chosen very carefully.
Target language: Standard C++ (g++). Rationale: Low (medium) level language, object oriented, high performance.
Target architecture: Unix/x86. Rationale: High performance, Unix compatibility.
External resources: Boost Libraries 1.45.0. Rationale: C++ high level library for networking, interprocess, threading, functors, binding. Used libraries: boost::serialization, boost::asio, boost::filesystem, boost::interprocess, boost::thread.
Development & coding: Language and technical features
TaskMan-Middleware is a distributed kernel intended to support higher-level applications that manage large numbers of tasks over a network of peers exposing their available computational resources.
By taking advantage of Infrastructure-as-a-Service (IaaS), we can provide a higher-level application with APIs to manage tasks and their execution over a P2P network, while always guaranteeing a fair share of the shared computational resources. This approach might even lead to computing systems usable by anyone in the world.
A collection of tasks (a workflow) can be submitted to a peer for execution. A workflow contains a variable number of tasks, which are processed by a local entity.
The local routing entity is the Manager: it decides which peer is the best destination for each task. Manager embodies complex logic.
Manager cannot decide on its own; Discovery helps it select the best peer.
Once a peer has been selected, the task can be sent to it. When a peer receives a task, it executes it.
Conceptualizing on the fly...
TaskMan-Middleware Overview: Getting started with basic dynamics
When managing a distributed kernel, there are many elements to deal with. Let’s consider some basic aspects and definitions that will accompany us throughout the exploration of the system.
Peer: A machine of the network, that is, a node of our computational network. Every machine can produce its own workflows, but must always let all other machines use its computational resources to execute tasks that do not belong to it. This is needed to guarantee fair share.
Task: An entity to be executed. There are two types: bash command tasks and executable tasks. The latter need binaries in order to be executed.
Workflow: A collection of independent tasks to be executed.
Network: A collection of machines/peers connected to one another through the TCP/IP protocol according to any possible scheme (no topology constraints applied).
Connection: A one-way link from one machine to another. It states that the first machine knows the second. In our system a one-way connection implicitly becomes a bidirectional link because of the implemented knowledge dynamics.
Setting a common ground: Definitions and assumptions
Every peer embodies four important entities, each meant to serve a specific purpose. These entities interact through local connections and network connections.
UI: The User Interface produces workflows and sends them to Manager for execution.
Manager: Manager works to get all tasks in a workflow executed. It must also submit a task to its Worker (residing in the same peer) when the task arrives from the network. Manager works in symbiosis with Discovery to find the correct peer to send each task to.
Discovery: This component communicates locally with Manager to indicate the correct peer to execute a given task. It also communicates with Workers (through the network) to receive all their status notifications. This enables more intelligent routing, based on a performance index evaluated from the workers’ conditions.
Worker: This component executes a task when it arrives from its Manager. It also notifies the Discovery of every neighbour peer when its status changes.
Another component should be considered too: CORE. It initializes all components in the peer, allowing them to run independently as separate threads. CORE also configures the peer based on a configuration file.
Main actors: Introducing UI, Manager, Discovery and Worker
TaskMan-Middleware implements many services in order to reach its goals:
Task routing: To let the network execute a task, all workflows are managed and each task is assigned to a specific worker on a different peer, where it will later be executed.
Peers’ status notification: To route a task correctly to the most suitable peer (suitable here means “the peer with the current best time-variant performance”), all workers periodically send their status to all neighbours.
Multithreaded peers’ status management: When status notifications are received from a peer, a thread is created to manage each notification.
Multithreaded queue management: All queues are managed using threads. When a new workflow or task is dequeued, a new thread is created to manage it.
PUSH data sending model: When an executable task is sent, its binaries are sent immediately afterwards. This requires every worker to maintain a local collection of data to use when a task must be executed.
Safe sending model: Whenever something must be sent, a control loop handles the unreachable-destination exception.
Logging: A robust logging system is used to audit every operation occurring in the kernel.
Main services: Introducing functionalities and capabilities
Several issues had to be solved while developing our system. Having introduced the most important services in TaskMan-Middleware, let’s now map each issue to the service implemented to solve it.
Fast computation: Multithread
Robustness: Attempts to send
PUSH sending model: Collected table for data
Intelligent routing: Notification system
Event tracing: Synchronized logging
Configurability: Configuration system
Flexibility: Proxy + Factory
Problems/Requirements to handle: Solving issues: easy to say, a little harder to do
Development model and architecture were decided at the beginning of the development process.
Development model: Development had to be completed very quickly. Considering the time constraints, technology and requirements of the application, a “light” and easy-to-manage workflow was adopted. For this reason the team selected a prototype-based model, in particular a spiral model in which the creation and specialisation of a single initial prototype determined the evolution of the final software. Many branches were created from the original prototype, and others abandoned, in order to reach the final working application.
Software architecture: Given the constraints and requirements, the final application was designed with Scalability as its first important target. Flexibility was also considered an important requirement, as was Modularity. Following a chain-of-responsibility pattern, the team chose to develop the middleware according to a modular/multilevel plain architecture. Thanks to this architecture, the final software defines different modules (components) able to operate as single entities with the lowest possible interaction with the others (loose coupling); furthermore, thanks to proxies and factories, all communications are transparent and easy to manage.
Development process: Development model and architecture
When running, the kernel exchanges data with all other peers. To picture the global configuration, let’s first consider what types of communication occur among the components.
WorkerDescriptor flows: All flows occurring between peers exchanging a worker descriptor. These flows occur during kernel execution and typically involve the peers’ Manager, Discovery and Worker.
    Network flows: Exchanges between different peers. Typically involve Discovery/Worker.
    Local flows: Exchanges within the same peer. Typically involve Manager/Discovery.
TaskDescriptor flows: All flows occurring between peers exchanging a task descriptor. These flows occur during kernel execution and typically involve the peers’ Manager and Discovery only.
    Network flows: Exchanges between different peers. Typically involve Manager/Manager.
    Local flows: Exchanges within the same peer. Typically involve Manager/Discovery.
Data flows: Binaries of executable tasks, exchanged between peers after a task has been routed.
Data flows: Describing data exchanged among components
All three types of flow are represented: line styles define the type of flow. The large dashed line represents data flows (binaries to be sent for exec tasks).
This diagram shows all flows occurring among peers. Only two peers are shown here (for simplicity).
[Diagram: two peers, each containing UI, MANAGER, DISCOVERY and WORKER. Arrows labelled "Send" carry a Task Descriptor; arrows labelled "Notify" carry the state (Worker Descriptor). Legend: proxies, output port, input port, T = Task, W = Worker.]
Data flows (2): Data flows at a closer look
Having introduced the flows, we are ready to analyze them more closely and inspect the dynamics that make all communications among peers possible.
Every flow goes from one peer’s component to another. ALL COMMUNICATIONS happen through an intermediate entity: a proxy, whose purpose is to hide low-level communication dynamics so that communicating entities do not need to know how they are communicating (over the network or locally). By doing so (and taking advantage of the factory creation pattern) it is possible to extend our model to a Client/Server one by just modifying the proxies. This approach provides simplicity, flexibility and scalability.
[Diagram: Src Comp -> PROXY -> Dst Comp (communication flow).]
Inspecting interactions: Getting inside flows
Let’s consider the first steps occurring when a workflow is created and then submitted for its execution.
Steps in time order (WF = workflow, TD = task descriptor, Addr = worker address):
Step 1 (UI): A new workflow is generated with a random number of tasks inside it.
Step 2 (UI): The workflow is sent to Manager through a local proxy.
Step 3 (MAN): After the workflow has been dequeued, a new thread is created to manage it.
Step 4 (MAN): Each task inside the workflow is considered.
Step 5 (DISC): A new request for a worker descriptor arrives from Manager. A task descriptor is provided along with this call.
Step 6 (DISC): A worker address is returned to Manager.
Step 7 (MAN): Manager is ready to send the task to the obtained worker.
Legend: local communication, network communication.
TaskDescriptor flow: Before sending
Manager now knows the destination peer. Suppose Manager also determines that a command task must be sent; let’s see what happens.
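Steps in time order (TD = task descriptor; MAN1 is the sending Manager, MAN2 and W2 are the Manager and Worker on the destination peer):
Step 1 (MAN1): The task to send is recognized to be a command task.
Step 2 (MAN2): The task arrives at Manager on the other peer.
Step 3 (MAN2): The task is enqueued by a specific thread in the task queue residing on Worker.
Step 4 (W2): The task is dequeued and a new thread is created to manage it.
Step 5 (W2): The task type is evaluated: it is a command task.
Step 6 (W2): The task is EXECUTED.
Legend: local communication, network communication.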
TaskDescriptor flow (2.a): After sending - sending a command task
What if the task were an executable one? Manager has to perform a different activity, and a new dynamic comes into play.
Steps in time order (TD = task descriptor, Data = binaries; MAN1 is the sending Manager, MAN2 and W2 are the Manager and Worker on the destination peer):
Step 1 (MAN1): The task to send is recognized to be an executable task.
Step 2 (MAN2): The task arrives at Manager on the other peer.
Step 3 (MAN2): The task is sent to the local Worker: a specific thread enqueues it in the task queue residing on Worker.
Step 4 (MAN1): Binaries (data) are sent directly to the destination peer’s Worker.
Step 5 (W2): The data is handled by a specifically created thread and inserted in a local collection.
Step 6 (W2): The task is dequeued and a new thread is created to manage it.
Step 7 (W2): The task is recognized to be an executable task, so its data must be retrieved.
Step 8 (W2): The internal collection of task data is searched (many attempts are performed). When the data is found, the task is EXECUTED.
Legend: local communication, network communication.
TaskDescriptor flow (2.b): After sending - sending an executable task
Let us focus on a Worker. Every Worker has an internal control loop that detects status changes. Status notifications are sent to neighbour peers.
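Steps in time order (WD = worker descriptor; W1 is the notifying Worker, DISC2 the Discovery of a neighbour peer):
Step 1 (W1): The status control loop activates.
Step 2 (W1): A status change is detected.
Step 3 (W1): A status notification is formed.
Step 4 (W1): The status notification is sent to ALL neighbours.
Step 5 (DISC2): The notification reaches the Discovery of one neighbour peer.
Step 6 (DISC2): A new thread is created to manage the notification.
Step 7 (DISC2): The status of the peer that sent the notification is updated or added.
Legend: local communication, network communication.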
WorkerDescriptor flow: Sending and receiving notifications
As the kernel runs and the network is working, a peer can communicate with its neighbours using some flows. Each flow involves two different components on the same peer or on different peers, or two instances of the same component on different peers.
Summarizing everything, we can say the following:
Three flows ensure that a task can be successfully executed with the lowest possible effort. This is guaranteed by taking advantage of network communications and worker status notifications.
All communications are performed using proxies. In particular, each proxy is created by its factory.
When trying to send something, every component calls a proxy. All sending procedures in proxies are safe: the possibility that the destination peer is temporarily unreachable is taken into account, and several attempts are made to deliver the payload. The sending process is aborted only after a maximum number of attempts.
Discovery can make the correct decision when choosing a peer thanks to the notification system. To find the best peer, the PIs (Performance Indexes) are compared. The PI ensures that the peer with the current best performance is selected to execute a given task.
Summarizing flows: To get the ball rolling...
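The safe sending procedure summarized above (retry until a maximum number of attempts, then abort) might look roughly like the following sketch. The name `send_with_retries` and the `try_send` callback are hypothetical and stand in for whatever a proxy actually does to deliver a payload; the project's real proxies are not shown in these slides.

```cpp
#include <functional>

// Hypothetical helper, not the project's code.
// `try_send` performs one delivery attempt and reports success or failure.
// Returns the number of attempts used, or 0 if every attempt failed.
inline int send_with_retries(const std::function<bool()>& try_send,
                             int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_send()) return attempt;  // payload delivered
        // A real proxy would probably wait here before retrying.
    }
    return 0;  // aborted: the destination stayed unreachable
}
```

A caller would pass a lambda wrapping the actual send call and treat a return value of 0 as "destination unreachable".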
Let’s have a much closer look at everything we’ve introduced up to now. We are going to consider the most important elements, focusing on the strategies and solutions chosen to face all issues and avoid code bloat.
ADDRESSING
RUNNABLE CLASSES
SYNQUEUE(ING)
MULTITHREADING
PROXY(ING)
TDC GCOLLECT(ING)
NOTIFICATION SYSTEM
SYNLOGGING
CONFIGURATION
PERFORMANCE INDEX
Deep diving into implementations: Getting started with code, design and algorithms
Every peer communicates with all its neighbours through three specific TCP ports on a common IP address. The IP address and the three ports together define the peer network interface.
IP addr: Specifies the unique network location at which the peer is reachable over the INet.
Man2Man port: Specifies the TCP port on which the current peer’s Manager listens for incoming TaskDescriptors to be executed by the local Worker.
Man2W port: Specifies the TCP port on which the current peer’s Worker listens for incoming data (bins) sent by another peer’s Manager after sending an executable task.
W2Disc port: Specifies the TCP port on which the current peer’s Discovery listens for incoming WorkerDescriptors in order to update the performances of all its neighbours.
The network interface is set by the CORE class at initialization time, based on the settings inside the configuration file.
The Address class is responsible for holding all the information needed for the peer network interface. This class is an associated member of the Worker class.
ADDRESSING
namespace middleware {
typedef string InetIpAddr;
typedef _uint InetPort;

class Address {
    bool operator==(..) {..}
    bool operator!=(..) {..}
    InetIpAddr _ip;
    InetPort _port;
    InetPort _port_disc;
    InetPort _port_w;
};
}
Peer addressing: Network interfaces of every single peer
Every component runs as a thread in the context of the main application. Every component also creates its own threads in order to perform all tasks in the fastest possible way, with due care for concurrency.
MULTITHREADING
[Diagram: thread tree. CORE starts the component threads UI (Create/Submit WF), MAN (Dequeue WF), DISC (WorkerDescriptor arrival listener) and W (Dequeue TD). Further threads shown: Manage WF, TaskManager, DataSend, Manage WD, Manage Task, DataSender, Worker Status Manager. Legend: each box is a thread; * marks threads of which more instances are created.]
namespace middleware {
typedef struct {
    string man_ip_to_man_bind;
    string man_port_toman_bind;
    Worker* ptrto_worker;
    WorkerDiscovery* ptrto_discovery;
    string log_postfix;
} ManagerConfig;

class Manager : public Runner {
    void exec() {..}
    void exec_taskmanager() {..}
    void enqueuer(..) {..}
public:
    void run() {..}
    void join() {..}
};
}
Multithreaded components: Where a thread could be used... it was
The kernel is initialized from a configuration file. If no configuration file is provided, the kernel fails to start and quits.
The main actor is the configuration file. It is a plain text file with a very simple line-oriented syntax. All settings for the current peer are stored in the file. The CORE class reads the configuration file and sets up all components using the settings in it. Thanks to this cascading configuration flow, the application gains scalability and flexibility. Typically, configuration files are named using the .config extension.
CONFIGURATION
C:Every unrecognized sequence (before :) is treated as a comment (by def, C).
C:Every line is a configuration entry, recognized sequences are processed.
C:--------------------------------------------------------------------------------
C:Configuring the current peer
CONFIG_ADDRME_IP:127.0.0.1
CONFIG_ADDRME_MAN_PORT:1040
CONFIG_ADDRME_DISC_PORT:1041
CONFIG_ADDRME_W_PORT:1042
C:--------------------------------------------------------------------------------
CONFIG_OTHERPEERS_NUM:1
C:--------------------------------------------------------------------------------
C:Configuring neighbour knowledge
CONFIG_ADDRPEER_1_IP:127.0.0.1
CONFIG_ADDRPEER_1_TASKS_PORT:2040
CONFIG_ADDRPEER_1_WRKRS_PORT:2041
CONFIG_ADDRPEER_1_DATA_PORT:2042
Configuring the kernel: Configuration file and syntax
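To make the line-oriented syntax concrete, here is a minimal parsing sketch. It is not the CORE class's actual parser: `ConfigMap`, `parse_config` and `parse_config_text` are hypothetical names, and the rule that a key is "recognized" when it starts with `CONFIG_` is an assumption drawn from the sample file above.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not the project's parser.
// Assumption: a key is "recognized" when it starts with "CONFIG_";
// any other prefix before ':' (for example "C") marks a comment line.
typedef std::map<std::string, std::string> ConfigMap;

inline ConfigMap parse_config(std::istream& in) {
    ConfigMap entries;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;         // no separator: skip
        const std::string key = line.substr(0, colon);
        if (key.compare(0, 7, "CONFIG_") != 0) continue;  // comment line
        entries[key] = line.substr(colon + 1);            // text after first ':'
    }
    return entries;
}

// Convenience overload for parsing an in-memory string.
inline ConfigMap parse_config_text(const std::string& text) {
    std::istringstream in(text);
    return parse_config(in);
}
```

Values are kept as strings here; converting ports to integers and validating `CONFIG_OTHERPEERS_NUM` against the number of peer entries would be a natural next step.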
To achieve good programming style and keep the code easy to read, all classes meant to run as a thread are provided with two methods inherited from an interface (a C++ class with only pure virtual methods, that is, an abstract class).
The interface.hpp file defines a nice trick to let C++ “recognize” the interface keyword, so that all interfaces can be declared using the interface keyword. Interfaces are implemented simply by using the colon notation, as in ordinary inheritance. The Runner interface defines the two methods needed to make a class runnable.
The interface keyword definition.
The Runner interface definition. Definition of Worker class, note how to make it runnable.
RUNNABLE CLASSES
// interface.hpp
#ifndef _INTERFACE_HPP_
#define _INTERFACE_HPP_

// The trick
#define interface class

#endif

// runner.hpp
#include "interface.hpp"
namespace middleware {
interface Runner {
public:
    // Runs thread
    virtual void run() = 0;
    // Joins thread
    virtual void join() = 0;
};
}

// Worker class definition
#include "runner.hpp"
namespace middleware {
class Worker : public Runner {
    // Runnable class body
};
}
Making things runnable: Patterns used to develop runnable classes
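As an illustration of how a runnable class might wrap its thread, here is a small sketch. It swaps in `std::thread` (C++11) for boost::thread so that it is self-contained, and it uses a plain abstract class in place of the `interface` macro; `Counter` is a hypothetical component invented for the example, not one of the middleware's classes.

```cpp
#include <atomic>
#include <thread>

// Plain abstract class standing in for the deck's `interface Runner`.
class Runner {
public:
    virtual ~Runner() {}
    virtual void run() = 0;   // starts the thread
    virtual void join() = 0;  // waits for the thread to finish
};

// Hypothetical runnable component, for illustration only.
class Counter : public Runner {
    std::thread _thread;
    std::atomic<int> _count;
    int _limit;
    // Thread body: counts up to the configured limit.
    void exec() { for (int i = 0; i < _limit; ++i) ++_count; }
public:
    explicit Counter(int limit) : _count(0), _limit(limit) {}
    ~Counter() { if (_thread.joinable()) _thread.join(); }
    void run() override { _thread = std::thread(&Counter::exec, this); }
    void join() override { if (_thread.joinable()) _thread.join(); }
    int count() const { return _count.load(); }
};
```

The private `exec()` plus public `run()`/`join()` split mirrors the shape of the Manager class shown earlier in the deck.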
As said before, all communications are performed by means of proxies; this enables the application to achieve scalability and flexibility.
When a component needs to communicate with another one located in the same peer or in a different one, a proxy is used. This hides the communication implementation/logic from that component, which never knows whether the communication is a local call or a remote connection.
ALL PROXIES are created via a corresponding factory. ALL FACTORIES ARE FRIENDS OF THE CORRESPONDING PROXIES; this ensures that the proxy will be properly instantiated. As a consequence, ALL PROXIES HAVE PRIVATE CONSTRUCTORS.
THE POINTER-SAFE FACTORY PATTERN IS USED FOR FACTORIES. This means that a factory returns the pointer to the constructed proxy, which is dynamically allocated by the factory itself. ALL FACTORIES HOLD THE POINTER TO THE CREATED PROXY; in this way, WHEN A FACTORY GOES OUT OF SCOPE (its destructor is called), THE PROXY IS DESTROYED TOO. A proxy must therefore never outlive its factory.
PROXY(ING)
namespace middleware {
class TaskProxyFactory {
    TaskManagerProxy* _proxy;
public:
    // Constructors
    TaskProxyFactory();
    TaskProxyFactory(..);
    // Destructor
    ~TaskProxyFactory() {
        if (this->_proxy != 0) delete this->_proxy;
    }
};
}
Communications through proxies: Managing communications using proxy pattern
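The friend-factory arrangement described above can be sketched as follows. All names (`EchoProxy`, `EchoProxyFactory`, `getProxy`, `send`) are hypothetical; only the private constructor, the friend declaration and the factory-owned pointer are taken from the text, and `send` merely returns a string instead of performing real communication.

```cpp
#include <string>

// Hypothetical proxy: only its factory may construct it.
class EchoProxy {
    friend class EchoProxyFactory;
    std::string _peer;
    explicit EchoProxy(const std::string& peer) : _peer(peer) {}
public:
    // Stand-in for a real send: the caller cannot tell whether the
    // destination is local or remote.
    std::string send(const std::string& payload) const {
        return _peer + "<-" + payload;
    }
};

// Hypothetical factory: allocates the proxy and owns it.
class EchoProxyFactory {
    EchoProxy* _proxy;
    // Non-copyable, so the proxy is never deleted twice.
    EchoProxyFactory(const EchoProxyFactory&);
    EchoProxyFactory& operator=(const EchoProxyFactory&);
public:
    explicit EchoProxyFactory(const std::string& peer)
        : _proxy(new EchoProxy(peer)) {}
    ~EchoProxyFactory() { delete _proxy; }  // proxy dies with its factory
    EchoProxy* getProxy() { return _proxy; }
};
```

Because the factory owns the proxy, callers must keep the factory alive for as long as they use the pointer returned by `getProxy()`.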
All peers have many time-variant pieces of information, for example the number of tasks in the queue waiting to be executed. A PI is assigned to every peer based on this information; the best peer to send a task to is the one with the highest PI.
NETWORK FACTOR: Defines a non-linear dependency, with negative second-order derivative, between bandwidth and hop distance.
SPEED FACTOR: Defines an algebraic dependency among CPU speed, number of processors and current queue size.
CAPACITY FACTOR: A non-linear factor that keeps cores and memory from weighing too much on the final result.
PERFORMANCE INDEX
Evaluating performance index: How to decide which peer is the best to execute a task?
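The exact PI formula is not reproduced here, so the sketch below does not attempt to compute one. It only illustrates the selection step stated above (the best peer is the one with the highest PI), assuming Discovery keeps a table of last known PI values per neighbour. `PiTable` and `best_peer` are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical table: neighbour identifier -> last known PI.
typedef std::map<std::string, double> PiTable;

// Returns the identifier of the neighbour with the highest PI,
// or an empty string if the table is empty.
inline std::string best_peer(const PiTable& table) {
    std::string best;
    bool found = false;
    double best_pi = 0.0;
    for (PiTable::const_iterator it = table.begin(); it != table.end(); ++it) {
        if (!found || it->second > best_pi) {
            best = it->first;
            best_pi = it->second;
            found = true;
        }
    }
    return best;
}
```

On ties, this sketch keeps the first peer encountered in key order; a real implementation might prefer another tie-breaking rule.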
One idea: gather in a single object everything needed to collect many entities together (preserving their order) and to let many threads operate on the collection without causing data inconsistency.
SynQueue is an object developed to manage tasks and workflows enqueued in a sequence collection that stays consistent even when many threads operate on it.
SynQueue is generic and accepts any type as input. An internal mutex and an internal condition variable are used to keep the collection consistent.
SYNQUEUE(ING)
using namespace middleware::queueing;
// Max 10 task descriptors
SynQueue<TaskDescriptor, 10> q;
// Infinite capacity queue
SynQueue<TaskDescriptor> qq;
Synchronized queue: A class to store tasks and workflows minding concurrency
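The real SynQueue interface is not shown, so the following is only a sketch of how a mutex and condition variables can keep such a queue consistent. It uses the C++11 standard library rather than Boost, the names `SketchQueue`, `enqueue` and `dequeue` are hypothetical, and treating a capacity of 0 as "unbounded" is an assumption made for the example.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>

// Sketch only, not the project's SynQueue.
// Assumption: Capacity == 0 means the queue is unbounded.
template <typename T, std::size_t Capacity = 0>
class SketchQueue {
    std::queue<T> _items;
    mutable std::mutex _mutex;
    std::condition_variable _not_empty;
    std::condition_variable _not_full;
public:
    // Blocks while a bounded queue is full.
    void enqueue(const T& item) {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_full.wait(lock, [this] {
            return Capacity == 0 || _items.size() < Capacity;
        });
        _items.push(item);
        _not_empty.notify_one();
    }
    // Blocks while the queue is empty; returns items in FIFO order.
    T dequeue() {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_empty.wait(lock, [this] { return !_items.empty(); });
        T item = _items.front();
        _items.pop();
        _not_full.notify_one();
        return item;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _items.size();
    }
};
```

With this sketch, `SketchQueue<int, 10>` would be a bounded queue and `SketchQueue<int>` an unbounded one, mirroring the two declarations in the slide.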
When data (binaries of an executable task) are sent to Worker, they are stored in a temporary place (the Task Data Collection, TDC), waiting to be extracted when the corresponding task descriptors arrive.
The TDC is prone to data inconsistency because some binary data may never be extracted (for example when a task descriptor is lost over the network after sending). For this reason, a control loop is needed to periodically check for old entries. This loop runs in a thread created by Worker at initialization time. Every entry of the table is given a TimeToLive (TTL), initialized to a value and decreased at every cycle. When the TTL reaches 0, the entry is removed.
TDC GCOLLECT(ING)
TTL values before one cycle: 100, 11, 51, 1, 120, 23, 110, 25
TTL values after one cycle: 99, 10, 50, 0 (REMOVED), 119, 22, 109, 24
Task Data Collection GCollecting: Garbage collection policy for TDC in Worker
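One pass of the TTL policy could be sketched as below. The entry layout and the names `TdcEntry`, `TaskDataCollection` and `sweep` are hypothetical; the slides only state that each entry carries a TTL that is decreased every cycle and that the entry is removed at 0. A real Worker would also need to protect the collection with a lock, which is left out here.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical entry layout, for illustration only.
struct TdcEntry {
    std::string data;  // binaries waiting for their task descriptor
    int ttl;           // remaining cycles before eviction
};
typedef std::map<std::string, TdcEntry> TaskDataCollection;

// One cycle of the control loop: decrement every TTL and remove the
// entries whose TTL reaches 0. Returns the number of removed entries.
inline std::size_t sweep(TaskDataCollection& tdc) {
    std::size_t removed = 0;
    for (TaskDataCollection::iterator it = tdc.begin(); it != tdc.end(); ) {
        if (--it->second.ttl <= 0) {
            tdc.erase(it++);  // advance before the erased iterator is invalidated
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Applied to entries with TTLs 100, 11 and 1, one call leaves TTLs 99 and 10 and removes the third entry, as in the figure above.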
Inside Worker, a dedicated thread manages the Worker status. When a change in the Worker status (a change of PI) is detected, the system reacts by sending the new WorkerDescriptor to all neighbour peers, so that they update their knowledge and the PIs of their neighbours. The notification is sent to every peer’s Discovery.
Notifications sent by a Worker reach the Discovery of each destination peer. Discovery, in each peer, has a control loop listening on a configured port and waiting for incoming WorkerDescriptors. When a WorkerDescriptor arrives, it means that a neighbour has sent a notification and that the recorded status of that peer must be changed.
The notification loop can be reprogrammed to send notifications regardless of status changes.
Notifications ensure that Discovery can provide a Worker for a task (to Manager, when asked) by choosing the most suitable peer. This makes the overall system more intelligent. If a peer has a very long queue of tasks to execute, the current task should perhaps be dispatched to a different peer. But if that peer has a very powerful CPU and many cores, that long queue may be executed quickly.
The notification system and the PI provide the kernel with an intelligent way to dispatch tasks among peers.
NOTIFICATION SYSTEM
Notifying status change: Intelligent P2P
A well-structured logging system makes it possible to trace all the most important events in the system. The system is based on files.
SynLogging is a very versatile system for creating logs for one or more runs of the kernel. It is meant to create four log files, one for each component in the application. Many peers might run on the same machine, so a postfix is used to tell one peer from another and avoid collisions.
Logging can be performed in separate or “dense” mode: it is possible to create a log for each component, or a common log for all of them. Furthermore, if no postfix is specified for the current peer, all non-postfixed peers will operate on the same files and log their content to a common location. This happens because the file writing policy is “open/append or create new”.
SYNLOGGING
[Diagram: the application (APP) writes either four separate logs (MAN LOG, W LOG, DISC LOG, UI LOG) or one COMMON LOG.]
Logging system: Auditing and tracking events
One of the most important differences between our projects lies in the Internet protocols used for communication among peers.
According to the project requirements, we implemented our system's communications using the UDP/IP protocol. Being connectionless allows the system to be more flexible and fault tolerant when responding to node crashes or connection failures. On the other hand, a well-designed protocol to manage reliability had to be implemented, which increased the complexity of development.
Another source of difference between both projects was the sending model used to transfer files and binaries of executable tasks. Rather than using a PUSH model, a PULL model was considered.
The PULL sending model can reach high performance and reliability because binaries are downloaded only when needed. This avoids wasting bandwidth and spares the receiver from keeping an internal collection of stored data (with all the related problems, such as garbage collection).
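The idea of downloading a binary only when it is needed can be sketched as below. This is an in-memory illustration, not the project's protocol: `TaskDescriptor`, `FetchFn` and `PullReceiver` are hypothetical names, and the network request is replaced by a plain callable.

```cpp
#include <functional>
#include <string>

// Hypothetical task descriptor: it carries only a reference to the binary,
// never the binary itself.
struct TaskDescriptor {
    std::string binary_id;   // name of the executable to run
    std::string owner_peer;  // peer that can serve the binary on request
};

// Stand-in for the network request "send me this binary".
// A real peer would issue it over UDP; here it is any callable.
typedef std::function<std::string(const std::string& peer,
                                  const std::string& binary_id)> FetchFn;

class PullReceiver {
public:
    explicit PullReceiver(FetchFn fetch) : fetch_(fetch), fetch_count_(0) {}

    // Called right before execution: the binary is requested only now,
    // so tasks that are never executed here cost no transfer at all.
    std::string obtain_binary(const TaskDescriptor& task) {
        ++fetch_count_;
        return fetch_(task.owner_peer, task.binary_id);
    }

    int fetch_count() const { return fetch_count_; }

private:
    FetchFn fetch_;
    int fetch_count_;
};
```

Receiving a `TaskDescriptor` triggers no transfer by itself; the request is issued only when `obtain_binary` is called, so the receiver does not have to store binaries it may never run.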
Differences on INet Protocols
UDP and file transfer mode
Running processes and handling the filesystem raises the question of cross-platform compliance, which was a delicate issue to manage.
boost::process is not officially part of the Boost Project. For this reason, no cross-platform operations were available to run processes (received tasks) and access the filesystem.
We considered using the standard POSIX libraries. This implies the need to develop platform-dependent solutions in order to execute processes and to handle file permissions.
Implementation-wise, the execve() and chmod() calls were used; these are available only on POSIX-compliant systems.
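A minimal POSIX-only sketch of how a received task could be made executable and run with these calls. The helper names `make_executable` and `run_task`, the chosen permission bits and the empty environment are assumptions; the project's actual code is not shown here.

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstddef>
#include <string>
#include <vector>

// Marks a received file as readable, writable and executable by its owner.
bool make_executable(const std::string& path) {
    return chmod(path.c_str(), S_IRUSR | S_IWUSR | S_IXUSR) == 0;
}

// Runs the program at `path` in a child process and returns its exit code,
// or -1 if the child could not be created or did not exit normally.
int run_task(const std::string& path, const std::vector<std::string>& args) {
    // execve() expects null-terminated arrays of C strings.
    std::vector<char*> argv;
    argv.push_back(const_cast<char*>(path.c_str()));
    for (std::size_t i = 0; i < args.size(); ++i)
        argv.push_back(const_cast<char*>(args[i].c_str()));
    argv.push_back(nullptr);
    char* const envp[] = { nullptr };  // assumption: run with an empty environment

    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execve(path.c_str(), argv.data(), envp);
        _exit(127);  // reached only if execve() failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would typically invoke `make_executable` on the downloaded file and then pass the same path to `run_task`. Neither function compiles on non-POSIX platforms, which is exactly the portability limit discussed above.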
Crossing platforms
Executing tasks and managing cross-platform issues
TaskMan-Middleware is not a complete piece of software: it is meant to be scaled in the future and provided with more functionality. There are also some issues to be aware of. The most important information follows.
Licensing system: GNU GPL (General Public License) v. 3.0
Rationale: Possibility to extend the current implementations, add new ones and solve current issues.
Code location: Hosted on Google Code Project @ http://code.google.com/p/taskman-middleware.
Most important scaling target: The possibility to act on the kernel in order to support both P2P and Client/Server models by simply operating on proxies. Thanks to the proxy pattern it is possible to re-implement all proxies in order to let them make the entire system work as a client or a server or a peer.
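The scaling target above can be illustrated with a small sketch in which the kernel depends only on an abstract proxy, so that the role of the node changes by swapping the proxy implementation. All class and method names here are hypothetical and the behaviour is reduced to two flags; the project's real proxies are not reproduced.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface the kernel talks to instead of the network.
class CommunicationProxy {
public:
    virtual ~CommunicationProxy() {}
    virtual std::string role() const = 0;
    virtual bool accepts_incoming() const = 0;  // can others send tasks to us?
    virtual bool sends_outgoing() const = 0;    // can we send tasks to others?
};

class PeerProxy : public CommunicationProxy {
public:
    std::string role() const { return "peer"; }
    bool accepts_incoming() const { return true; }
    bool sends_outgoing() const { return true; }
};

class ClientProxy : public CommunicationProxy {
public:
    std::string role() const { return "client"; }
    bool accepts_incoming() const { return false; }
    bool sends_outgoing() const { return true; }
};

class ServerProxy : public CommunicationProxy {
public:
    std::string role() const { return "server"; }
    bool accepts_incoming() const { return true; }
    bool sends_outgoing() const { return false; }
};

// The kernel never names a concrete proxy, so it stays unchanged
// whichever model is selected.
class Kernel {
public:
    explicit Kernel(std::unique_ptr<CommunicationProxy> proxy)
        : proxy_(std::move(proxy)) {}
    std::string describe() const { return "kernel running as " + proxy_->role(); }
    bool can_receive_tasks() const { return proxy_->accepts_incoming(); }
    bool can_submit_tasks() const { return proxy_->sends_outgoing(); }

private:
    std::unique_ptr<CommunicationProxy> proxy_;
};
```

Constructing `Kernel` with a `ServerProxy` instead of a `PeerProxy` is the only change needed to switch roles in this sketch.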
Final considerations
Enhancements, applications and issues
TaskMan-Middleware suffers from some issues due to time constraints and development resources.
Code style: Some stylistic issues remain to be solved, especially regarding printing. No class supports the << operator for output yet. Logging classes use a print() method instead of a stream such as “cout” combined with a properly overloaded << operator.
Inheritance: Only some classes define an internal environment ready for future subclassing. Most classes do not use protected members and are therefore not designed for extension by new implementations.
Tasks: At present, tasks can be commands or executables but, for the purposes of the project, task execution is only simulated and never actually performed. The system can be extended with little effort to actually execute a command or a binary when it reaches its final destination.
Task generation: At present, workflows are created randomly by the UI. Users cannot directly assemble a workflow and submit it through a proper user interface. In the future, a front end could be created to let users create and submit workflows.
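The code style issue listed above could be addressed by overloading the << operator for the descriptor classes, as in this sketch. `TaskInfo` and its fields are hypothetical stand-ins for the project's descriptors, and `to_log_line` is an illustrative helper.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical descriptor; the field names are illustrative only.
struct TaskInfo {
    std::string name;
    int priority;
};

// Streams a TaskInfo to any std::ostream (cout, a file, a string stream).
std::ostream& operator<<(std::ostream& os, const TaskInfo& t) {
    return os << "Task[name=" << t.name << ", priority=" << t.priority << "]";
}

// Formats any streamable value as a string, e.g. to pass it to a logger.
template <typename T>
std::string to_log_line(const T& value) {
    std::ostringstream oss;
    oss << value;
    return oss.str();
}
```

With such an overload, a logging class could accept any streamable object instead of requiring a dedicated print() method on each class.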
Issues and little pathologies
Elements to be solved
TaskMan-Middleware offers many possibilities for scaling and extensibility. The current implementation can be modified to support new functionality and new services.
GUI: There are many C++ libraries for graphical user controls and interface elements. Many of them are not open (like Borland, DevExpress), but many others are free and open. A GUI could be created for the kernel to present events in tables and to take advantage of interactive controls for a better user experience.
Charting: Many companies specialise in charting and develop solutions and APIs that provide rendering and charting controls to developers (like DevExpress for Borland applications). The logging classes could be used to implement data structures for rendering all logs and obtaining statistics about network conditions, throughput, statistical approximation and much more.
QoS: The proxies could be extended to provide information about the quality of service, especially regarding network communication, packet loss, turnaround time, timings, events and much more.
Applications and future projections
What can it be...
THANK YOU DANKE GRAZIE ありがとう HVALA