
Ain Shams University

Faculty of Engineering

Computer and Systems Engineering Department

Graduation Project

2013/2014

Decentralized Peer to Peer

Cloud System

Supervisor

Prof.Dr.Hoda Korashy Mohamed

Submitted in Partial Fulfillment of the Requirement for the B.Sc. Degree,

Computer and Systems Engineering Department, Faculty of Engineering,

Ain Shams University, Cairo, Egypt


Ain Shams University

Faculty of Engineering

Computer and Systems Engineering Department

Graduation Project

2013/2014

Decentralized Peer to Peer

Cloud System

Submitted By:

Ahmed Hassan Mohammed Shaaban

Ahmed Mahmoud Hussien Ali Baibars

Ziyad Sameh Ali El Sayed

Shady Magdy Abd El Ghany Abdo

Mohamed Ahmed Salem Sayed

Mohammed Marie El Minshawy


Table of Contents

1-Introduction
1.1 Purpose
1.2 List of Definitions
1.3 Scope
1.4 Overview
2-Background
2.1 Problem Statement
2.2 Problem Formulation
2.3 Concept Synthesis
2.3.1 Literature Review
2.3.1.1 Alchemi
2.3.1.2 JXTA
2.3.1.3 JXSE
2.3.1.4 Techniques for Overcoming NATs and Firewalls: Hole Punching, PNRP, Teredo, etc.
2.3.1.4.1 Teredo Tunneling
2.3.1.4.2 PNRP (Peer Name Resolution Protocol)
2.3.1.4.3 TCP Hole Punching
2.3.1.4.4 UDP Hole Punching
2.4 Concept Generations
2.5 Concept Reduction
3-Design & Analysis
3.1 General Description
3.1.1 System Requirements Specifications
3.1.2 General Capabilities
3.1.3 Key Technical Issues
3.1.4 Assumptions and Dependencies
3.3 Use-Case Diagram with Narrative Description
3.4 Detailed Class Diagram
3.5 Interaction (Sequence) Diagram
3.6 Packages’ Dependencies
4-Implementation and Tools
4.1 Flow-chart (Block Diagram)
4.2 Program Features
4.3 Other Resources Needed
4.4 Testing
4.4.1 Class Level Testing
4.4.2 System Level Testing (Black Box Testing)
4.4.3 Performance Testing
4.4.4 Security Testing
5-Results and Conclusions
5.3 Performance Analysis
5.4 Cost Analysis
5.5 Bill of Materials
5.6 Hazards and Failure Analysis
5.7 Recommendations and Further Improvements
6-User Guideline
7-Appendix
7.1 Java Remote Method Invocation
7.2 Exception in Remote Method Invocations
7.3 Distributed Operating System
7.4 Detailed Class Diagram
7.5 Detailed Sequence Diagram
7.6 List of Readings


1-Introduction

1.1 Purpose

This document is written with the intent of describing the use, operation and design of our

software product through the use of manuals, listings, diagrams, charts, statistics and other

written and graphical materials. The intended readers of this document are the project’s evaluation committee and any researchers, professors or students interested in our software project.

1.2 List of Definitions

1) OS (operating system): software that manages computer hardware resources and provides

common services for computer programs.

2) Distributed system: A distributed system is a software system in which components located on

networked computers communicate and coordinate their actions by passing messages. The

components interact with each other in order to achieve a common goal.

3) Distributed computing: the use of distributed systems to solve computational problems. In

distributed computing, a problem is divided into many tasks, each of which is solved by one or

more computers, which communicate with each other by message passing

4) peer to peer network/system: a type of decentralized and distributed network architecture in

which individual nodes in the network (called "peers") act as both suppliers and consumers of

resources, in contrast to the centralized client–server model where client nodes request access to

resources provided by central servers.

5) Client-server model: The client–server model of computing is a distributed application structure

that partitions tasks or workloads between the providers of a resource or service, called servers,

and service requesters, called clients.

6) BitTorrent: a protocol supporting the practice of peer-to-peer file sharing that is used to

distribute large amounts of data over the Internet.


7) Cloud computing: a term used to refer to a model of network computing where a program or

application runs on a connected server or servers rather than on a local computing device such

as a PC, tablet or smartphone.

8) Hash Table: a data structure used to implement an associative array, a structure that can map

keys to values. A hash table uses a hash function to compute an index into an array of buckets or

slots, from which the correct value can be found.

9) DHT (distributed hash table): a class of decentralized distributed system that provides a lookup

service similar to a hash table; (key, value) pairs are stored in a DHT, and any participating node

can efficiently retrieve the value associated with a given key.

10) SDK (software development kit): a set of software development tools.

11) IDE (integrated development environment): a software application that provides comprehensive

facilities to computer programmers for software development. An IDE normally consists of a

source code editor, build automation tools and a debugger. Most modern IDEs offer intelligent

code completion features.

12) Mandelbrot set: a mathematical set of points whose boundary is a distinctive and easily

recognizable two-dimensional fractal shape. The set is closely related to Julia sets (which include

similarly complex shapes) and is named after the mathematician Benoit Mandelbrot, who

studied and popularized it.

13) TSP (travelling salesman problem): asks the following question: Given a list of cities and the

distances between each pair of cities, what is the shortest possible route that visits each city

exactly once and returns to the origin city? It is an NP-hard problem in combinatorial

optimization, important in operations research and theoretical computer science.

14) FIB (Fibonacci numbers): the first two numbers in the Fibonacci sequence are 1 and 1, or 0 and 1,

depending on the chosen starting point of the sequence, and each subsequent number is the

sum of the previous two.

15) Prime number: is a natural number greater than 1 that has no positive divisors other than 1 and

itself.

16) RMI (remote method invocation): a Java API that performs the object-oriented equivalent of

remote procedure calls (RPC), with support for direct transfer of serialized Java classes and

distributed garbage collection.


17) JXTA (Juxtapose): an open source peer-to-peer protocol specification begun by Sun

Microsystems in 2001. The JXTA protocols are defined as a set of XML messages which allow any

device connected to a network to exchange messages and collaborate independently of the

underlying network topology. JXTA™ is a set of open, generalized peer-to-peer (P2P) protocols

that allow any networked devices, sensors, cell phones, PDAs, laptops, workstations, servers and

supercomputers — to communicate and collaborate mutually as peers.

18) JXSE: the same protocol specification as JXTA, but continued and developed by volunteer developers after Oracle officially announced its withdrawal from the JXTA projects in November 2010.

19) Grid computing: the collection of computer resources from multiple locations to reach a

common goal. The grid can be thought of as a distributed system with non-interactive workloads

that involve a large number of files.

20) Grid: geographically distributed platforms for computations accessible to their users via a single

interface.

21) Parallel computing: a form of computation in which many calculations are carried out

simultaneously, operating on the principle that large problems can often be divided into smaller

ones, which are then solved concurrently ("in parallel").

22) NAT (Network Address Translation): a methodology of modifying network address information

in Internet Protocol (IP) datagram packet headers while they are in transit across a traffic

routing device for the purpose of remapping one IP address space into another.

23) API (application programming interface): specifies how some software components should

interact with each other.

24) PNRP (Peer Name Resolution Protocol): a peer-to-peer protocol designed by Microsoft. PNRP

enables dynamic name publication and resolution, and requires IPv6.

25) TCP (Transmission Control Protocol): one of the core protocols of the Internet protocol

suite (IP), and is so common that the entire suite is often called TCP/IP. TCP provides reliable,

ordered and error-checked delivery of a stream of octets between programs running on


computers connected to a local area network, intranet or the public Internet. It resides at the

transport layer.

26) UDP (User Datagram Protocol): one of the core members of the Internet protocol suite (the set

of network protocols used for the Internet). With UDP, computer applications can send

messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP)

network without prior communications to set up special transmission channels or data paths.

The protocol was designed by David P. Reed in 1980 and formally defined in RFC 768.

27) NAT traversal: a general term for techniques that establish and maintain Internet protocol

connections traversing network address translation (NAT) gateways, which break end-to-end

connectivity. Intercepting and modifying traffic can only be performed transparently in the

absence of secure encryption and authentication.

28) STUN (Session Traversal Utilities for NAT): a standardized set of methods and a network protocol

to allow an end host to discover its public IP address if it is located behind a NAT. It is used to

permit NAT traversal for applications of real-time voice, video, messaging, and other interactive

IP communications.

29) ICE (Interactive Connectivity Establishment): a technique used in computer networking

involving network address translators (NATs) in Internet applications of Voice over Internet

Protocol (VoIP), peer-to-peer communications, video, instant messaging and other interactive

media. In such applications, NAT traversal is an important component to facilitate

communications involving hosts on private network installations, often located behind firewalls.

30) VoIP (Voice over Internet Protocol): a methodology and group of technologies for the delivery

of voice communications and multimedia sessions over Internet Protocol (IP) networks, such as

the Internet. Other terms commonly associated with VoIP are IP telephony, Internet telephony,

voice over broadband (VoBB), broadband telephony, IP communications, and broadband phone

service.

31) LAN (local area network): a computer network that interconnects computers within a limited

area such as a home, school, computer laboratory, or office building, using network media.


1.3 Scope

The project name is Decentralized Peer to Peer Cloud System. The project will implement a “Peer to Peer Decentralized Cloud System” by sharing computer resources (computing power, storage, etc.) between peers. It relies on using the idle time of peers and their available resources to provide, mainly, very large computing power to other peers for free. The project scope is to deliver this cloud computing power through a software service installed on peers, using some virtualization concepts to share the processing power.

1.4 Overview

The rest of the document is organized as follows:

In chapter 2 of this document we introduce some background about the project: we learn about the problem that made the developers of this project think about a software solution for it, we introduce some key literature that was researched and used in the design effort, and we learn about the decision-making process that was used to reduce the number of possible conceptual solutions to a single (optimal) solution.

In chapter 3, we will go through the Design & Analysis of the project, as we will introduce the

System Requirements’ Specifications of the project along with a number of software design

diagrams like the use case, class, state and interaction diagrams.

In chapter 4, we cover the implementation of the project and how it works, with some descriptive diagrams such as the block diagram along with a detailed step-by-step description supported by snapshots of the operation of the software. We also introduce some tools and techniques used during the project, besides the full testing plan applied during the project.

In chapter 5, we introduce the results of our work in the project, the conclusions obtained, the

learned lessons, cost analysis, failure analysis and some recommendation for further improvements

of the project.

In chapter 6, we provide a detailed user guide describing how to run the software project, supported by a number of snapshots describing the steps to compile and run the project.

In chapter 7, we introduce short explanations of some terms and techniques mentioned in this document for further understanding, along with some links for further reading about certain topics mentioned in this document, and finally a list of references that were used in this document.


2-Background

2.1 Problem Statement:

There are many computers around the world that sit idle most of the day. For example, in the U.S. alone there are 250 million PCs, and most of us let those computers sit idle most of the time. Even when we do use them, the average person uses about five percent or less of the CPU’s capabilities. In addition, many universities, research organizations and, lately, enterprises are experiencing the following recent trends:

performance of desktop computers increases dramatically; new technological advancements stimulate

use of computing applications with extreme requirements for computational power; use of computing,

simulations, visualizations, and optimization in various research fields and practical applications is

accelerating and leads to very high demands on computing power; and the pace of development of high-performance servers barely keeps up with these trends, and only at very high financial cost. The computing

power of a single desktop computer is insufficient for running complex algorithms, and, traditionally,

large parallel supercomputers or dedicated clusters were used for this job. However, very high initial

investments and maintenance costs limit the availability of such systems. We therefore aim for anyone with a PC to be able to become a provider of cloud computing power through this virtualized platform.

Given the problem and needs just mentioned, our objectives are:

1. Reducing the power consumption and CO2 emissions of large data centers and servers.

2. Giving end users the ability to run programs and apps that need very high computing specs using only this software on normal PCs.

3. Expanding the computing power available to end users, independent of old servers.

4. Reducing the cost and space needed for traditional data centers.


5. Mainly targeting very complicated simulations and programs in science and engineering that

may take hours or even days on normal PCs to process; our software aims to finish that in

minutes.

We are facing some constraints and challenges that we are trying to solve, and they are mainly two. First, peer capacities are heterogeneous, so how do we find suitable peers? Second, the lifetime of peers is limited, so how do we support long-term “reservations”? There is also the problem of sending and receiving over the Internet, since there is, until now, no common, trusted and reliable way to broadcast information between computers over the Internet that can be used with even minimum efficiency for a start.

2.2 Problem Formulation:

The problem: there are many computers around the world that sit idle most of the day. And even when we do use them, the average person uses about five percent or less of the CPU’s capabilities.

On the other side, new technological advancements stimulate use of computing applications with

extreme requirements for computational power. Use of computing, simulations, visualizations, and

optimization in various research fields and practical applications is accelerating and leads to very high

demands on computing power.

A thesis: we can use the idle time of computers and their available resources to provide, mainly, very large computing power to demanding applications for free.

A method: implement the technology of a “Peer to Peer Decentralized Cloud System” by sharing computer resources (computing power, storage, etc.) between peers. We will deliver this cloud computing power through a software service installed on peers, using some virtualization concepts to share the processing power.

A point of view: this suggested solution will harness thousands of idle CPU cycles across many computers and provide a suitable application through which you can get extra processing power for free.


2.3 Concept Synthesis:

2.3.1: Literature Review

During the design of our software we gained knowledge of the terminologies used in distributed, cloud and parallel computing and in peer-to-peer networks. We studied many research papers, tutorials and books trying to understand the algorithms behind distributing work among different computers and how a peer-to-peer network is constructed and maintained, and we studied many projects similar to ours to broaden our knowledge of how to design our software and how it will operate. Finally, we searched and studied many different protocols for distributing work among computers and constructing a peer-to-peer network with no central place holding knowledge about the peers in the network; some of these were already implemented and some were only proposed in detail. Here we present only the most important literature in a summarized way, leaving the full details to the appendix and mentioning almost all the literature used in the list of references. We focus here on four main sources of literature, as follows:

2.3.1.1 Alchemi

• Enterprise grid framework and runtime machinery to create a high-throughput computing environment by harnessing distributed resources
  – .NET-based (Windows)
  – Voluntary execution (cycle stealing) or dedicated execution
  – LAN or Internet
• Programming environment
  – Independent grid threads (.NET API)
  – File-based jobs (input, executable, output)
• Web service for interoperability with other grid middleware
  – File-based jobs
• Monitoring and administration tools

Alchemi follows the master-worker parallel programming paradigm in which a central component

dispatches independent units of parallel execution to workers and manages them. This smallest unit of

parallel execution is a grid thread, which is conceptually and programmatically similar to a thread object

(in the object-oriented sense) that wraps a "normal" multitasking operating system thread. A grid

application is defined simply as an application that is to be executed on a grid and that consists of a


number of grid threads. Grid applications and grid threads are exposed to the grid application developer

via the object-oriented Alchemi .NET API. Figure 1 shows its architecture:

Figure 1: the architecture of Alchemi


Alchemi offers four distributed components, designed to operate under three usage patterns. The four

components are as follows:

1-Manager

The Manager manages the execution of grid applications and provides services associated with

managing thread execution. Threads received from the Owner are placed in a pool and scheduled to be

executed on the various available Executors. A priority for each thread can be explicitly specified when it

is created within the Owner, but is assigned the highest priority by default if none is specified. Threads

are scheduled on a Priority and First Come First Served (FCFS) basis, in that order. The Executors return

completed threads to the Manager which are subsequently passed on or collected by the respective

Owner.
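The Manager’s scheduling policy described above (priority first, then first-come-first-served among equal priorities) can be sketched in a few lines of code. The sketch below is only an illustration of the policy in Java, not Alchemi’s actual .NET code; the GridThreadInfo fields and the convention that a larger number means a higher priority are assumptions made for the example.

import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of a "priority, then FCFS" thread pool, similar in spirit
// to the Manager's scheduling described above (not Alchemi's real code).
class GridThreadInfo {
    final int priority;      // larger number = higher priority (assumption for this sketch)
    final long arrivalOrder; // used to break ties in first-come-first-served order
    final Runnable work;

    GridThreadInfo(int priority, long arrivalOrder, Runnable work) {
        this.priority = priority;
        this.arrivalOrder = arrivalOrder;
        this.work = work;
    }
}

class ManagerPool {
    private final AtomicLong arrivalCounter = new AtomicLong();
    // Order by priority first (highest first), then by arrival order among equal priorities.
    private final PriorityQueue<GridThreadInfo> pool = new PriorityQueue<>(
            Comparator.comparingInt((GridThreadInfo t) -> t.priority)
                      .reversed()
                      .thenComparingLong(t -> t.arrivalOrder));

    synchronized void submit(int priority, Runnable work) {
        pool.add(new GridThreadInfo(priority, arrivalCounter.getAndIncrement(), work));
    }

    // Called when an Executor asks the Manager for the next thread to run.
    synchronized GridThreadInfo nextThreadForExecutor() {
        return pool.poll(); // null if no thread is waiting
    }
}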

2- Executor

The Executor accepts threads from the Manager and executes them. An Executor can be configured to

be dedicated, meaning the resource is centrally managed by the Manager, or non-dedicated, meaning

that the resource is managed on a volunteer basis via a screen saver or by the user. For non-dedicated

execution, there is one-way communication between the Executor and the Manager. In this case, the

resource that the Executor resides on is managed on a volunteer basis since it requests threads to

execute from the Manager. Where two-way communication is possible and dedicated execution is desired, the Manager explicitly instructs the Executor to execute threads, resulting in centralized management of the resource where the Executor resides.

Alchemi’s execution model provides the dual benefit of:

• Flexible resource management i.e. centralized management with dedicated execution vs.

decentralized management with non-dedicated execution.

• Flexible deployment under network constraints i.e. the component can be deployed as non-dedicated where two-way communication is not desired or not possible (e.g. when it is behind a firewall or NAT/proxy server).

Thus, dedicated execution is more suitable where the Manager and Executor are on the same Local Area

Network while non-dedicated execution is more appropriate when the Manager and Executor are to be

connected over the Internet.


3-Owner

Grid applications created using the Alchemi API are executed on the Owner component. The Owner

provides an interface with respect to grid applications between the application developer and the grid.

Hence it “owns” the application and provides services associated with the ownership of an application

and its constituent threads. The Owner submits threads to the Manager and collects completed threads

on behalf of the application developer via the Alchemi API.

4- Cross platform manager

The Cross-Platform Manager, an optional sub-component of the Manager, is a generic web services

interface that exposes a portion of the functionality of the Manager in order to enable Alchemi to

manage the execution of platform independent grid jobs (as opposed to grid applications utilizing the

Alchemi grid thread model). Jobs submitted to the Cross-Platform Manager are translated into a form

that is accepted by the Manager (i.e. grid threads), which are then scheduled and executed as normal in

the fashion described above. Thus, in addition to supporting the grid-enabling of existing applications,

the Cross-Platform Manager enables other grid middleware to interoperate with and leverage Alchemi

on any platform that supports web services (e.g. Gridbus Grid Service Broker).

The three usage patterns are as follows:

The components discussed above allow Alchemi to be utilized to create different grid configurations:

desktop cluster grid, multi-cluster grid, and cross-platform grid (global grid).


1-Cluster (Desktop Grid):

The basic deployment scenario – a cluster (shown in Figure 2) - consists of a single Manager and multiple

Executors that are configured to connect to the Manager. One or more Owners can execute their

applications on the cluster by connecting to the Manager. Such an environment is appropriate for

deployment on Local Area Networks as well as the Internet. The operation of the Manager, Executor and

Owner components in a cluster is as described above.

Figure 2: cluster architecture

2- Multi-Cluster

A multi-cluster environment is created by connecting Managers in a hierarchical fashion (Figure 3). As in

a single-cluster environment, any number of Executors and Owners can connect to a Manager at any

level in the hierarchy. An Executor and Owner in a multi-cluster environment connect to a Manager in

the same fashion as in a cluster and correspondingly their operation is no different from that in a

cluster. The key to accomplishing multi-clustering in Alchemi's architecture is the fact that a Manager

behaves like an Executor towards another Manager since the Manager implements the interface of the

Executor. A Manager at each level except for the topmost level in the hierarchy is configured to connect

to a higher level Manager as an “intermediate” Manager and is treated by the higher level-Manager as

an Executor. Such an environment is more appropriate for deployment on the Internet. Once Owners

have submitted grid applications to their respective Managers, each Manager has “local” grid threads

waiting to be executed. As discussed, threads are assigned the highest priority by default (unless the

priority is explicitly specified during creation) and threads are scheduled and executed as normal by the

Manager’s local Executors. Note that an ‘Executor’ in this context could actually be an intermediate

Manager, since it is treated as an Executor by the higher-level Manager. In this case after receiving a


thread from the higher-level Manager, it is scheduled locally by the intermediate Manager with a

priority reduced by one unit and is executed as normal by the Manager’s local ‘Executors’ (again, any of

which could be intermediate Managers). In addition, at some point the situation may arise when a

Manager wishes to allocate a thread to one of its local Executors (one or more of which could an

intermediate Manager), but there are no local threads waiting to be executed. In this case, if the

Manager is an intermediate Manager, it requests a thread from its higher-level Manager, reduced the

priority by one unit and schedules it locally. In both of these cases, the effect of the reduction in priority

of a thread as it moves down the hierarchy of Managers is that the “closer” a thread is submitted to an

Executor, the higher is the priority that it executes with. This allows a portion of an Alchemi grid that is

within one administrative domain (i.e. a cluster or multi-cluster under a specific “administrative domain

Manager”) to be shared with other organizations to create a collaborative grid environment without

impacting on its utility to local users. As with an Executor, an intermediate Manager must be configured

for either dedicated or non-dedicated execution. Not only does its operation in this respect mirror that

of an Executor, the same benefits of flexible resource management and deployment under network

constraints apply.

Figure 3: Multi-cluster architecture

3-Cross-Platform Grid

The Cross-Platform Manager can be used to construct a grid conforming to the classical global grid

model (Figure 4). A grid middleware component such as a broker can use the Cross-Platform Manager

web service to execute cross-platform applications (jobs within tasks) on an Alchemi node (cluster or

multi-cluster) as well as resources grid-enabled using other technologies such as Globus.

Figure 4: Cross-Platform grid architecture


Design and Implementation of Alchemi:

The .NET Framework offers two mechanisms for execution across application domains – Remoting and

web services (application domains are the unit of isolation for a .NET application and can reside on

different network hosts). .NET Remoting allows a .NET object to be “remoted” and expose its

functionality across application domains. Remoting is used for communication between the four

Alchemi distributed grid components as it allows low-level interaction transparently between .NET

objects with low overhead (remote objects are configured to use binary encoding for messaging).

Web services were considered briefly for this purpose, but were decided against due to the relatively

higher overheads involved with XML-encoded messages, the inherent inflexibility of the HTTP protocol

for the requirements at hand and the fact that each component would be required to be configured

with a web services container (web server). However, web services are used for the Cross-Platform

Manager’s public interface since cross-platform interoperability was the primary requirement in this

regard.

Alchemi APIs: there are 2 models as follows:

1-Grid Thread Model

Alchemi simplifies the development of grid applications by providing a programming model that is

object oriented and that imitates traditional multi-threaded programming. The atomic unit of parallel

execution is a grid thread with many grid threads comprising a grid application (hereafter, ‘applications’

and ‘threads’ can be taken to mean grid applications and grid threads respectively, unless stated

otherwise). Developers deal only with application and thread objects and any other custom objects,

allowing him/her to concentrate on the grid application itself without worrying about the "plumbing"

details. Furthermore, this kind of abstraction allows the use of programming language constructs such

as events between local and remote code. All of this is offered via the Alchemi .NET API. The additional

benefit of this approach is that it does not limit the developer to applications that are completely or

“embarrassingly” parallel. Indeed, it allows development of grid applications where thread

communication is required. While Alchemi currently only supports completely parallel threads, support

for inter-thread communication is planned for the future. Finally it should be noted that grid

applications utilizing the Alchemi .NET API can be written in any .NET-supported language e.g. C#,

VB.NET, Managed C++, J#, JScript.NET.


2-Grid Job Model

Traditional grid implementations have offered a high-level abstraction of the "virtual machine", where

the smallest unit of parallel execution is a process (typically referred to as a job, with many jobs

constituting a task). Although writing software for the “grid job” model involves dealing with processes,

an approach that can be complicated and inflexible, Alchemi’s architecture supports it via the web services interface for two reasons: grid-enabling existing applications, and cross-platform interoperability with grid middleware that can leverage Alchemi. Grid tasks and grid jobs are represented internally as grid applications and grid threads respectively.

2.3.1.2 JXTA

Sun Microsystems formed Project JXTA (pronounced juxtapose or juxta), a small development team

under the guidance of Bill Joy and Mike Clary, to design a solution to serve all P2P applications. At its

core, JXTA is simply a set of protocol specifications, which is what makes it so powerful. Anyone who

wants to produce a new P2P application is spared the difficulty of properly designing protocols to handle

the core functions of P2P communication.

What Does JXTA Mean?

The name JXTA is derived from the word juxtapose, meaning to place two entities side by side or in

proximity. By choosing this name, the development team at Sun recognized that P2P solutions would

always exist alongside the current client/server solutions rather than replacing them completely.

The JXTA v1.0 Protocols Specification defines the basic building blocks and protocols of P2P networking:

• Peer Discovery Protocol—Enables peers to discover peer services on the network

• Peer Resolver Protocol—Allows peers to send and process generic requests

• Rendezvous Protocol—Handles the details of propagating messages between peers

• Peer Information Protocol—Provides peers with a way to obtain status information from other

peers on the network

• Pipe Binding Protocol—Provides a mechanism to bind a virtual communication channel to a peer endpoint
• Endpoint Routing Protocol—Provides a set of messages used to enable message routing from a source peer to a destination peer.


The JXTA protocols are language-independent, defining a set of XML messages to coordinate some

aspect of P2P networking. Although some developers in the P2P community protest the use of such a

verbose language, the choice of XML allows implementers of the JXTA protocols to leverage existing

toolsets for XML parsing and formatting. In addition, the simplicity of the JXTA protocols makes it

possible to implement P2P solutions on any device with a “digital heartbeat,” such as PDAs or cell

phones, further expanding the number of potential peers.

Core JXTA Design Principles:

Note: the implemented programming language for JXTA protocols is Java.

While designing the protocol suite, the Project JXTA team made a conscious decision to design JXTA in a

manner that would address the needs of the widest possible set of P2P applications. The design team

stripped the protocols of any application-specific assumptions, focusing on the core P2P functionality

that forms the foundation of all types of P2P applications. One of the most important design choices was

not to make assumptions about the type of operating system or development language employed by a

peer. By making this choice, the Project JXTA team hoped to enable the largest number of potential

participants in any JXTA-enabled P2P networking application. The JXTA Protocols Specification expressly

states that network peers should be assumed to be any type of device, from the smallest embedded

device to the largest supercomputer cluster. In addition to eliminating barriers to participation based on

operating system, computing platform, or programming language, JXTA makes no assumptions about

the network transport mechanism, except for a requirement that JXTA must not require broadcast or

multicast transport capabilities. JXTA assumes that peers and their resources might appear and

disappear spontaneously from the network and that a peer’s network location might change

spontaneously or be masked by Network Address Translation (NAT) or firewall equipment.

Apart from the requirements specified by the JXTA Protocols Specification, the specification makes

several important recommendations. In particular, the specification recommends that peers cache

information to reduce network traffic and provide message routing to peers that are not directly

connected to the network.


Figure 5: JXTA main architecture


The Logical Layers of JXTA

The JXTA platform can be broken into three layers, as shown in Figure 6

Figure 6: Logical layers of JXTA

Building a Generic Framework for Distributed Computing using JXTA:

We build a generic distributed computational framework capable of utilizing the idle CPU cycles of any

Internet computer that the peer is installed on. This framework can be easily extended with your own

computational code. To accomplish this, we need to build the following components:

1. Computation code—The code that will be executed by the remote peers

2. Master peer—A peer that remote peers can contact to obtain the computation code as well as

data to work with using the code

3. Worker peer—A peer that resides on multiple remote computers, requests work, solves it, and

sends the results back to the master peer


The master peer is responsible for accepting messages from the worker peers. The messages have two

functions: to request code and data to work on and to deliver the results from the computation. As long

as the master has data available, it will provide that information upon request.

The worker peer requests work as well as data from the master. In our example, the worker receives an

object for the actual computation, and the necessary data is embedded in the object. The worker calls

an appropriate method of the object and returns the results.

The computation code component is a class used to build objects for computation. This component

follows a framework necessary for serialization and reconstruction on the worker peer. To show how

the computational objects can be used to execute real problems, the Mandelbrot algorithm is used as

an example.

Master Code: The master code for our generic distributed computation framework provides

functionality for the following:

• Launching into the JXTA network

• Building an input pipe to receive work requests

• Building and instantiating work objects

• Gathering work results

The master peer will advertise a pipe for receiving work and result requests. The peer expects to receive

a JXTA message with the type element defined. If the type element has a value of results, the message

will also contain an element called results. This element could contain a value, a string, or a serialized

object—it all depends on the complexity of the results. If the type element has any other value, the

master will assume that the remote peer is looking for new work. In this case, the message from the

worker peer should contain an element called pipe that contains an input pipe advertisement for the

worker. The worker will receive the work from the master over this pipe. The return message from the

master will contain the element work. The content of the element consists of a serialized work class

object.
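To make the message handling above concrete, the following rough sketch shows how the master’s pipe listener might dispatch on the type element. It assumes the JXSE 2.x Message, MessageElement and PipeMsgListener APIs, and recordResults() and sendWorkTo() are hypothetical helpers introduced only for the example; the project’s actual master code may differ.

import net.jxta.endpoint.Message;
import net.jxta.endpoint.MessageElement;
import net.jxta.pipe.PipeMsgEvent;
import net.jxta.pipe.PipeMsgListener;

// Sketch of the master peer's input-pipe listener (assumes JXSE 2.x APIs;
// recordResults() and sendWorkTo() are hypothetical helpers, not project code).
public class MasterListener implements PipeMsgListener {

    public void pipeMsgEvent(PipeMsgEvent event) {
        Message msg = event.getMessage();
        MessageElement typeElement = msg.getMessageElement(null, "type");
        if (typeElement == null) {
            return; // message without a type element: ignore it
        }

        if ("results".equals(typeElement.toString())) {
            // The message carries a "results" element with the computed values.
            recordResults(msg.getMessageElement(null, "results"));
        } else {
            // Any other type is treated as a work request; the "pipe" element holds
            // the worker's input pipe advertisement, used to send the work object back.
            sendWorkTo(msg.getMessageElement(null, "pipe"));
        }
    }

    private void recordResults(MessageElement resultsElement) {
        // gather the results for the application (omitted)
    }

    private void sendWorkTo(MessageElement pipeAdvElement) {
        // serialize a work object and send it over the advertised pipe (omitted)
    }
}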


Worker Code: The worker in the framework has two primary responsibilities: to request work and to

return results to the master peer. Work is requested through an output pipe, which the worker peer

discovers and then connects with. The worker will send a message with a type element having a value of

work. Also included in the message is an advertisement for a pipe the worker has created to receive the

work message from the master. Both the pipe and the advertisement are created dynamically when the

worker peer is executed.

The worker peer has two important parts: setup and work.

1-Setup

Once the peer has been launched into the JXTA network, it will attempt to find and connect to the pipe

advertised by the master peer. Before the connection is made to the pipe, no additional activity can

occur on the worker. Within the run() method, the code uses a loop and the sleep() method to wait until

the myOutputPipe variable has a value other than null. When the variable has an instantiated object

associated with it, the worker will attempt to obtain and execute while there is work available. The code

is designed to execute one piece of work at a time. This is accomplished by checking for a variable called

notDone. If notDone is true, this indicates that the object sent from the master is still executing and that

another one should not be obtained.
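A minimal sketch of the waiting loop just described is shown below. The field and method names (myOutputPipe, notDone, getWork()) follow the description above, but the surrounding class is an assumed reconstruction for illustration, not the project’s actual source.

import net.jxta.pipe.OutputPipe;

// Assumed reconstruction of the worker's setup loop described in the text.
public abstract class WorkerLoop implements Runnable {
    protected volatile OutputPipe myOutputPipe; // set once the master's pipe has been resolved
    protected volatile boolean notDone;         // true while a received work object is still executing

    protected abstract void getWork();          // requests a new work object from the master

    public void run() {
        try {
            // Block until discovery has resolved the pipe advertised by the master peer.
            while (myOutputPipe == null) {
                Thread.sleep(1000);
            }
            while (true) {
                if (!notDone) {      // nothing currently executing: ask for another piece of work
                    getWork();
                }
                Thread.sleep(500);   // avoid busy-waiting between checks
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}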

2-Work

The worker peer obtains work by calling the getWork() method. In this method, the peer sends a

message to the master that has a type element with the value work and a pipe element that contains

the pipe advertisement of the input pipe to the peer. The master will use the pipe to send the work to

the worker peer. With the message sent to the master, the variable notDone is set to true, causing the

loop in the run() method to execute, and preventing the system from requesting another piece of work.

The work is received by the listener for the worker peer’s input pipe. When a message is received from

the master, the worker peer extracts the string from the work element and converts it back into a Java

object using an ObjectInputStream and the readObject() method. The instantiated and reincarnated

object is executed by calling the run() method. Calling this method should cause the object to perform

calculations on values provided to the object by the master. The run() method should block until it has

populated the appropriate result attributes of the work object. When control returns from the run()

method, a call is made to the sendResults() method defined in lines 191 through 204. This method builds

a message with the type, results, x, and y elements. The type element holds a value of results to let the

master peer know that it is receiving results from a peer, and that the other elements hold the actual

results. The current framework returns just three results to the master. We must modify the sendResults() method if more than these integers have to be provided to the master peer. A further enhancement might be the inclusion of a result class which the worker peer could populate and send to the master. If the result class were defined appropriately and generically, the work class we discuss next might be able to populate it, thus allowing the peer to instantiate the result and send it.
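As an illustration of the deserialization and result-reporting steps described above, the sketch below rebuilds the work object with ObjectInputStream.readObject() and packages a results message carrying the type, results, x and y elements. It assumes the serialized object was Base64-encoded into the work element and that the JXSE StringMessageElement API is used; both are assumptions made for the example rather than the project’s actual encoding.

import java.io.ByteArrayInputStream;
import java.io.ObjectInputStream;
import java.util.Base64;

import net.jxta.endpoint.Message;
import net.jxta.endpoint.StringMessageElement;

// Sketch of reconstructing the work object from the "work" element and of
// building a results message (Base64 encoding is an assumption for the example).
public class WorkHandling {

    static Runnable decodeWork(String workElementContent) throws Exception {
        byte[] bytes = Base64.getDecoder().decode(workElementContent);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            // The reincarnated work object; its run() method performs the calculation.
            return (Runnable) in.readObject();
        }
    }

    static Message buildResultsMessage(int x, int y, int value) {
        // The "type" element tells the master this message carries results;
        // the remaining elements hold the actual result values.
        Message msg = new Message();
        msg.addMessageElement(null, new StringMessageElement("type", "results", null));
        msg.addMessageElement(null, new StringMessageElement("x", Integer.toString(x), null));
        msg.addMessageElement(null, new StringMessageElement("y", Integer.toString(y), null));
        msg.addMessageElement(null, new StringMessageElement("results", Integer.toString(value), null));
        return msg;
    }
}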

3-Computational Code:

The distributed computation framework requires a class that can be distributed to idle and willing

machines on the Internet. The class designed for this distributed framework is called work. The work class is very

basic, and can be expanded as needed for a specific application such as distributed.net or SETI. The class

is designed to be instantiated into an object; the object is then initialized and sent to a worker peer in a

JXTA message. The worker peer receives the object and executes the run()method. The results from the

work object are packaged into a JXTA message and sent to a master peer that gathers all the results.

(The master peer’s input pipe could have been sent in the original message with the work object or

hardcoded into an idle peer’s code.) The process of sending the instantiated work object to a worker

peer sounds simple, but there are a few things to keep in mind. First, the work class must implement the

Java Serializable interface. In order to implement the interface, the text “implements Serializable” must

be found after the class’s definition line. If there are no special requirements for serialization of the

class, no additional work has to occur in order for the Java system to be able to serialize an object of the

class.

The Mandelbrot algorithm is executed, and each peer is responsible for calculating a single point.

Obviously, this consumes quite a bit of network traffic. A more realistic work object would require much

more in the way of computation. The work is performed in the doMandel() method. The

values needed in the algorithm can be found in the object’s private variables. All of the variables were

set when the work object was instantiated by the master peer.
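A hedged sketch of such a serializable work class is shown below: it computes the escape-time value for a single Mandelbrot point in a doMandel() method, with all inputs held in private fields set at construction time. The field names, the iteration limit and the use of the Runnable interface are illustrative assumptions, not the project’s actual class.

import java.io.Serializable;

// Illustrative sketch of a serializable work object computing one Mandelbrot point.
public class Work implements Serializable, Runnable {
    private static final long serialVersionUID = 1L;

    // Set by the master peer when the work object is instantiated.
    private final int x, y;        // pixel coordinates reported back with the result
    private final double cRe, cIm; // the point of the complex plane to test
    private final int maxIterations = 1000;

    private int result;            // populated by run(); read back by the worker peer

    public Work(int x, int y, double cRe, double cIm) {
        this.x = x;
        this.y = y;
        this.cRe = cRe;
        this.cIm = cIm;
    }

    public void run() {
        result = doMandel();
    }

    // Standard escape-time iteration: z = z^2 + c, counting iterations until |z| > 2.
    private int doMandel() {
        double zRe = 0.0, zIm = 0.0;
        int i = 0;
        while (i < maxIterations && zRe * zRe + zIm * zIm <= 4.0) {
            double newRe = zRe * zRe - zIm * zIm + cRe;
            zIm = 2.0 * zRe * zIm + cIm;
            zRe = newRe;
            i++;
        }
        return i;
    }

    public int getX() { return x; }

    public int getY() { return y; }

    public int getResult() { return result; }
}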


2.3.1.3 JXSE

JXSE is the implementation of the JXTA protocols in the Java programming language. The JXSE acronym

is more and more used to distinguish the protocols from their implementation in Java.

The main difference between the old JXTA implemented using JAVA and the new JXSE is the modified

protocols’ structures and some different and new APIs used. We will mention some major differences in

the following pages.

Note: the words JXTA and JXSE will be used interchangeably in this part, meaning the same thing: the new and updated form of JXTA which has been implemented in Java under the name JXSE.

The Three Layer Cake

Conceptually, JXTA is made of three logical layers:

Figure 7: JXSE layers

1. Platform: This layer is the base of JXTA and contains the implementation of the minimal and essential

functionalities required to perform P2P networking. Ideally, JXTA-enabled peers will implement all JXTA

functionalities, although they are not required to. This layer is also known as the core layer.

2. Services – This layer contains additional services that are not absolutely necessary for a P2P system to

operate, but which might be useful. For example: file sharing, PKI infrastructures, distributed files

systems, etc... These services are not part of the set of services defined by JXTA.


3. Applications – P2P applications are built on top of the service layer. However, if I develop a file sharing application and let other JXTA-based applications make requests to my application, the other applications will perceive me as a service. Therefore, the border between a service and an application depends on one’s perspective.

The Protocols

JXTA defines 6 protocols, as shown in Figure 8.

Figure 8: JXSE protocols

The core protocols are the Peer Resolver Protocol (PRP) and the Endpoint Routing Protocol (ERP).

The standard protocols are the Peer Discovery Protocol (PDP), the Peer Information Protocol (PIP), the

Pipe Binding Protocol (PBP) and the Rendezvous Protocol (RVP). Higher protocols use lower protocols to

delegate work. Some lower protocols can sometimes call higher protocols.

Every JXTA peer must implement at least the Endpoint Routing Protocol and the Message Propagating

Protocol, which is a sub-protocol of the Peer Resolver Protocol, in order to participate in a JXTA network. We will describe each JXTA protocol using a bottom-up approach, after describing the role of messages in JXTA.


2.3.1.4 Techniques for Overcoming NATs and Firewalls: Hole Punching, PNRP, Teredo, etc.

2.3.1.4.1 Teredo Tunneling:

In computer networking, Teredo is a transition technology that gives full IPv6 connectivity for IPv6-

capable hosts which are on the IPv4 Internet but which have no direct native connection to an IPv6

network. Compared to other similar protocols its distinguishing feature is that it is able to perform its

function even from behind network address translation (NAT) devices such as home routers.

Teredo operates using a platform independent tunneling protocol designed to provide IPv6 (Internet

Protocol version 6) connectivity by encapsulating IPv6 datagram packets within IPv4 User Datagram

Protocol (UDP) packets. These datagrams can be routed on the IPv4 Internet and through NAT devices.

Other Teredo nodes elsewhere called Teredo relays that have access to the IPv6 network then receive

the packets, unencapsulate them, and route them on.

Teredo is designed as a last resort transition technology and is intended to be a temporary measure: in

the long term, all IPv6 hosts should use native IPv6 connectivity. Teredo should therefore be disabled

when native IPv6 connectivity becomes available.

Teredo was developed by Christian Huitema at Microsoft, and was standardized in the IETF as RFC 4380.

The Teredo server listens on UDP port 3544.

Purpose

6to4, the most common IPv6 over IPv4 tunneling protocol, requires the tunnel endpoint to have a public

IPv4 address. However, many hosts are currently attached to the IPv4 Internet through one or several

NAT devices, usually because of IPv4 address shortage. In such a situation, the only available public IPv4

address is assigned to the NAT device, and the 6to4 tunnel endpoint needs to be implemented on the

NAT device itself. Many NAT devices currently deployed, however, cannot be upgraded to implement

6to4, for technical or economic reasons.


Teredo alleviates this problem by encapsulating IPv6 packets within UDP/IPv4 datagrams, which most

NATs can forward properly. Thus, IPv6-aware hosts behind NATs can be used as Teredo tunnel

endpoints even when they don't have a dedicated public IPv4 address. In effect, a host implementing

Teredo can gain IPv6 connectivity with no cooperation from the local network environment.

Teredo is intended to be a temporary measure: in the long term, all IPv6 hosts should use native IPv6

connectivity. The Teredo protocol includes provisions for a sunset procedure: Teredo implementation

should provide a way to stop using Teredo connectivity when IPv6 has matured and connectivity

becomes available using a less brittle mechanism.

The Teredo protocol performs several functions:

1. Diagnoses UDP over IPv4 (UDPv4) connectivity and discovers the kind of NAT present (using a

simplified replacement to the STUN protocol)

2. Assigns a globally routable unique IPv6 address to each host using it

3. Encapsulates IPv6 packets inside UDPv4 datagrams for transmission over an IPv4 network (this

includes NAT traversal)

4. Routes traffic between Teredo hosts and native (or otherwise non-Teredo) IPv6 hosts

2.3.1.4.2 PNRP (Peer Name Resolution Protocol):

Peer Name Resolution Protocol (PNRP) is a peer-to-peer protocol designed by Microsoft. PNRP enables

dynamic name publication and resolution, and requires IPv6.

PNRP was first mentioned during a presentation at a P2P conference in November 2001. It appeared in

July 2003 in the Advanced Networking Pack for Windows XP, and was later included in the Service Pack 2

for Windows XP. PNRP 2.0 was introduced with Windows Vista and is available for download for

Windows XP Service Pack 2 users. PNRP 2.1 is included in Windows Vista SP1, Windows Server 2008 and

Windows XP SP3. PNRP v2 is not available for Windows XP Professional x64 Edition or any edition of

Windows Server 2003.


Windows Remote Assistance in Windows 7 uses PNRP when connecting using the Easy Connect option.

The design of PNRP is covered by US Patent #7,065,587, issued on June 20, 2006.

PNRP services

PNRP is a distributed name resolution protocol allowing Internet hosts to publish "peer names" and

corresponding IPv6 addresses and optionally other information. Other hosts can then resolve the peer

name, retrieve the corresponding addresses and other information, and establish peer-to-peer

connections.

With PNRP, peer names are composed of an "authority" and a "qualifier". The authority is identified by a

secure hash of an associated public key, or by a place-holder (the number zero) if the peer name is

"unsecured". The qualifier is a string, allowing an authority to have different peer names for different

services.

If a peer name is secure, the PNRP name records are signed by the publishing authority, and can be

verified using its public key. Unsecured peer names can be published by anybody, without possible

verification.
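As an illustration only (PNRP's actual hash derivation and identifier length differ from this sketch), a peer name of the form authority.qualifier could be derived from a public key roughly like this; the SHA-256 algorithm and 8-byte truncation are arbitrary choices made for the example:

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;

public class PeerNameSketch {
    // Illustrative only: PNRP uses its own hashing rules; SHA-256 and the
    // 8-byte truncation here are arbitrary choices for the sketch.
    public static String peerName(KeyPair identity, String qualifier) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(identity.getPublic().getEncoded());
        StringBuilder authority = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            authority.append(String.format("%02x", digest[i]));
        }
        return authority + "." + qualifier;   // e.g. "3fa9c1d2e47b0a51.FileShare"
    }

    public static void main(String[] args) throws Exception {
        KeyPair keys = KeyPairGenerator.getInstance("RSA").generateKeyPair();
        System.out.println(peerName(keys, "FileShare"));
    }
}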

Multiple entities can publish the same peer name. For example, if a peer name is associated with a

group, any group member can publish addresses for the peer name.

Peer names are published and resolved within a specified scope. The scope can be a local link, a site (e.g.

a campus), or the whole Internet.

PNRP and Distributed Hash Tables

Internally, PNRP uses an architecture similar to distributed hash table systems such as Chord or Pastry.

The peer name is hashed to produce a 128-bit peer identifier, and a DHT-like algorithm is used to

retrieve the location of the host publishing that identifier. There are however some significant

differences.

DHT systems like Chord or Pastry store the indices of objects (hashes) at the node whose identifier is

closest to the hash, and the routing algorithm is designed to find that node. In contrast, PNRP always stores the hash on the node that publishes the identifier. A node will thus have as many entries in the routing system as the number of identifiers that it publishes. The PNRP design arguably trades increased

security and robustness for higher routing cost.

Most DHT systems assume that only one node publishes a specific index. In contrast, PNRP allows

multiple hosts to publish the same name. The internal index is in fact composed of the 128-bit hash of

the peer name and a 128-bit location identifier, derived from an IPv6 address of the node.

PNRP does not use a routing table, but rather a cache of PNRP entries. New cache entries are acquired

as a side effect of ongoing traffic. The cache maintenance algorithm ensures that each node maintains

adequate knowledge of the "cloud". It is designed to ensure that the time to resolve a request varies as

the logarithm of the size of the cloud.

2.3.1.4.3 TCP hole punching:

TCP NAT traversal and TCP hole punching refer to the case where two hosts behind a NAT are trying to

connect to each other with outbound TCP connections. Such a scenario is particularly important in the

case of peer-to-peer communications, such as Voice-over-IP (VoIP), file sharing, teleconferencing, chat

systems and similar applications.

TCP hole punching is a commonly used NAT traversal technique for establishing a TCP connection

between two peers behind a NAT device in an Internet computer network. The term NAT traversal is a

general term for techniques that establish and maintain TCP/IP network and/or TCP connections

traversing network-address-translation (NAT) gateways.

Description

NAT traversal, through TCP hole punching, is a method for establishing bidirectional TCP connections

between Internet hosts in private networks using NAT. It does not work with all types of NATs, as their

behavior is not standardized. When two hosts are connecting to each other in TCP, both via outbound

connections, they are in the "simultaneous TCP open" case of the TCP state machine diagram.


Network Drawing

Figure 9:Network Drawing

Types of NAT

The availability of the TCP-hole-punching technique depends on the type of port allocation

used by the NAT. For two peers behind a NAT to connect to each other via TCP simultaneous open, they

need to know a little bit about each other. One thing that they absolutely need to know is the "location"

of the other peer, or the remote endpoint. The remote endpoint is the combination of the IP address and port that the peer will connect to. So when two peers, A and B, initiate TCP connections by binding to local

ports Pa and Pb, respectively, they need to know the remote endpoint port as mapped by the NAT in

order to make the connection. Here comes the crux of the problem: if both peers are behind a NAT, how

does one guess what the public remote endpoint of the other peer is? This problem is called NAT port

prediction. All TCP NAT traversal and Hole Punching techniques have to solve the port prediction

problem.

A NAT port allocation can be one of two types:

1-predictable: the gateway uses a simple algorithm to map the local port to the NAT port. Most of the

time a NAT will use port preservation, which means that the local port is mapped to the same port on

the NAT.

2- non-predictable: the gateway uses an algorithm that is either random or too impractical to predict.

Table 1:NAT Port Allocation


Depending on whether the NATs exhibit a predictable or non-predictable behavior, it will be possible or

not to perform the TCP connection via a TCP simultaneous open, as shown below by the connection

matrix representing the different cases and their impact on end-to-end communication.

Techniques

Methods of Port Prediction (with predictable NATs)

Here are some of the methods used by NATs to allow peers to perform port prediction:

1-The NAT assigns sequential external ports to sequential internal ports. If the remote peer has the information of one mapping, it can guess the value of subsequent mappings. The TCP connection will happen in two steps: first, the peers make a connection to a third party and learn their mappings. In the second step, both peers can then guess what the NAT port mapping will be for all subsequent connections, which solves port prediction. This method requires making at least two consecutive connections for each peer and requires the use of a third party. This method does not work properly in the case of Carrier-grade NAT with many subscribers behind each IP address, as only a limited number of ports is available and allocating consecutive ports to the same internal host might be impractical or impossible.

2-The NAT uses the port preservation allocation scheme: the NAT maps the source port of the

internal peer to the same public port. In this case, port prediction is trivial, and the peers simply

have to exchange the port to which they are bound through another communication channel

(such as UDP, or DHT) before making the outbound connections of the TCP simultaneous open.

This method requires only one connection per peer and does not require a third party to

perform port prediction.

3-The NAT uses "endpoint independent mapping": two successive TCP connections coming from

the same internal endpoint are mapped to the same public endpoint. With this solution, the

peers will first connect to a third party server that will save their port mapping value and give to

both peers the port mapping value of the other peer. In a second step, both peers will reuse the

same local endpoint to perform a TCP simultaneous open with each other. This unfortunately

requires the use of the SO_REUSEADDR option on the TCP sockets, and such use violates the TCP standard and can lead to data corruption. It should only be used if the application can protect

itself against such data corruption.

Details of a typical TCP connection instantiation with TCP Hole Punching

We assume here that port prediction has already taken place through one of the methods outlined

above, and that each peer knows the remote peer endpoint. Both peers make a POSIX connect call to

the other peer endpoint. TCP simultaneous open will happen as follows:

1- Peer A sends a SYN to Peer B

Peer B sends a SYN to Peer A

2- When NAT-a receives the outgoing SYN from Peer A, it creates a mapping in its state machine.

When NAT-b receives the outgoing SYN from Peer B, it creates a mapping in its state machine.

3-Both SYNs cross somewhere along the network path: the SYN from Peer A reaches NAT-b and the SYN from Peer B reaches NAT-a. Since each NAT has already seen a matching outgoing SYN, it lets the incoming SYN through (the exact behaviour depends on the timing of these events, i.e. where in the network the SYNs cross).

4-Upon receipt of the SYN, the peer sends a SYN+ACK back and the connection is established.
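A minimal Java sketch of this sequence, assuming port prediction has already supplied the local port we advertised and the peer's predicted public endpoint (hypothetical parameters); both peers run the same code at roughly the same time:

import java.net.InetSocketAddress;
import java.net.Socket;

public class TcpSimultaneousOpen {
    // localPort is the port we advertised; peerHost/peerPort is the other
    // peer's predicted public endpoint (both learned during port prediction).
    public static Socket punch(int localPort, String peerHost, int peerPort) throws Exception {
        Socket socket = new Socket();
        socket.setReuseAddress(true);                   // allow rebinding the advertised port
        socket.bind(new InetSocketAddress(localPort));  // use the port the NAT already mapped
        // The outgoing SYN creates a mapping on our NAT; when the peer's SYN
        // arrives it matches that mapping, so the simultaneous open completes.
        socket.connect(new InetSocketAddress(peerHost, peerPort), 10_000);
        return socket;
    }
}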

Interoperability requirements on the NAT for TCP Hole Punching:

Other requirements on the NAT to comply with TCP simultaneous open:

For the TCP simultaneous open to work, the NAT should:

• not send an RST as a response to an incoming SYN packet that is not part of any mapping

• accept an incoming SYN for a public endpoint when the NAT has previously seen an outgoing

SYN for the same endpoint

This is enough to guarantee that NATs behave nicely with respect to the TCP simultaneous open.


TCP Hole Punching and Carrier-grade NAT (CGN)

The technique described above works fine within a CGN. A CGN can also make use of a port overloading

behavior, which means that distinct internal endpoints with the same port value can be mapped to the

same public endpoint. This does not break the uniqueness of the 5-tuple {Protocol, public address, public port, remote address, remote port} and is thus acceptable. TCP port preservation can also lead to cases where the CGN ports are overloaded; this is not an issue for protocol soundness. Port overloading for

TCP allows the CGN to fit more hosts internally while preserving TCP end-to-end communication

guarantees.

2.3.1.4.4 UDP hole punching:

UDP hole punching is a commonly used technique employed in network address translator (NAT)

applications for maintaining User Datagram Protocol (UDP) packet streams that traverse the NAT. NAT

traversal techniques are typically required for client-to-client networking applications on the Internet

involving hosts connected in private networks, especially in peer-to-peer, Direct Client-to-Client (DCC)

and Voice over Internet Protocol (VoIP) deployments.

UDP hole punching establishes connectivity between two hosts communicating across one or more

network address translators. Typically, third party hosts on the public transit network are used to

establish UDP port states that may be used for direct communications between the communicating

hosts. Once port state has been successfully established and the hosts are communicating, port state

may be maintained by either normal communications traffic, or in the prolonged absence thereof, by so-

called keep-alive packets, usually consisting of empty UDP packets or packets with minimal non-intrusive

content.

UDP hole punching is a method for establishing bidirectional UDP connections between

Internet hosts in private networks using network address translators. The technique is not

applicable in all scenarios or with all types of NATs, as NAT operating characteristics are not

standardized.

Hosts with network connectivity inside a private network connected via a NAT to the Internet

typically use the Session Traversal Utilities for NAT (STUN) method or Interactive Connectivity

Establishment (ICE) to determine the public address of the NAT that its communications peers

require. In this process another host on the public network is used to establish port mapping

and other UDP port state that is assumed to be valid for direct communication between the


application hosts. Since UDP state usually expires after short periods of time in the range of

tens of seconds to a few minutes, and the UDP port is closed in the process, UDP hole punching

employs the transmission of periodic keep-alive packets, each renewing the life-time counters

in the UDP state machine of the NAT.

UDP hole punching will not work with symmetric NAT devices (also known as bi-directional

NAT) which tend to be found in large corporate networks. In symmetric NAT, the NAT's

mapping associated with the connection to the well known STUN server is restricted to

receiving data from the well-known server, and therefore the NAT mapping the well-known

server sees is not useful information to the endpoint.

In a somewhat more elaborate approach both hosts will start sending to each other, using

multiple attempts. On a Restricted Cone NAT, the first packet from the other host will be

blocked. After that the NAT device has a record of having sent a packet to the other machine,

and will let any packets coming from this IP address and port number through. This technique is

widely used in peer-to-peer software and Voice over Internet Protocol telephony. It can also be

used to assist the establishment of virtual private networks operating over UDP. The same

technique is sometimes extended to Transmission Control Protocol (TCP) connections, with less

success due to the fact that TCP connection streams are controlled by the host OS, not the

application, and sequence numbers are selected randomly; thus any NAT device that performs

sequence number checking will not consider the packets to be associated with an existing

connection and drop them.

Flow

Let A and B be the two hosts, each in its own private network; N1 and N2 are the two NAT

devices with globally reachable IP addresses P1 and P2 respectively; S is a public server with a

well-known globally reachable IP address.

1. A and B each begin a UDP conversation with S; the NAT devices N1 and N2 create UDP

translation states and assign temporary external port numbers X and Y

2. S examines the UDP packets to get the source port used by N1 and N2 (the external NAT ports X

and Y)

3. S passes P1:X to B and P2:Y to A

4. A sends a packet to P2:Y.

5. N1 examines A's packet and creates the following tuple in its translation table: {Source-IP-

A,X,P2,Y}

6. B sends a packet to P1:X

7. N2 examines B's packet and creates the following tuple in its translation table: {Source-IP-

B,Y,P1,X}


8. Depending on the state of N1's translation table when B's first packet arrives (i.e. whether the

tuple {Source-IP-A,X,P2,Y} has been created by the time of arrival of B's first packet), B's first

packet is dropped (no entry in translation table) or passed (entry in translation table has been

made).

9. Depending on the state of N2's translation table when A's first packet arrives (i.e. whether the

tuple {Source-IP-B,Y,P1,X} has been created by the time of arrival of A's first packet), A's first

packet is dropped (no entry in translation table) or passed (entry in translation table has been

made).

10. At worst, the second packet from A reaches B; at worst the second packet from B reaches A.

Holes have been "punched" in the NAT and both hosts can communicate.

• If both hosts have Restricted cone NATs or Symmetric NATs, the external NAT ports will differ

from those used with S. On some routers, the external ports are picked sequentially making it

possible to establish a conversation through guessing nearby ports.
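A minimal sketch of the flow above from host A's point of view, using hypothetical names; it assumes S has already reported B's public mapping P2:Y to A:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class UdpHolePunchSketch {
    // serverAddr is the rendezvous server S; peerAddr is B's public mapping P2:Y.
    public static void punch(DatagramSocket socket,
                             InetSocketAddress serverAddr,
                             InetSocketAddress peerAddr) throws Exception {
        byte[] hello = "hello".getBytes(StandardCharsets.UTF_8);

        // Step 1: talk to S so our NAT creates a mapping and S learns it.
        socket.send(new DatagramPacket(hello, hello.length, serverAddr));

        // Steps 4-5: send to B's public endpoint; even if B's first packet is
        // dropped, this opens the hole on our NAT for B's later packets.
        socket.send(new DatagramPacket(hello, hello.length, peerAddr));

        // Keep-alives so the UDP mapping does not expire after tens of seconds.
        byte[] keepAlive = new byte[0];
        for (int i = 0; i < 3; i++) {
            Thread.sleep(15_000);
            socket.send(new DatagramPacket(keepAlive, keepAlive.length, peerAddr));
        }
    }
}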


2.4 Concept Generations

Figure 10:Concept Generation


2.5 Concept Reduction

The criteria we used during our decision-making process to choose the most suitable and optimal conceptual design for our project were as follows:

1. We must use a peer to peer network structure for communication between computers (peers).

2. The design must have low delay, meaning that the chosen design must produce low delay when implemented and the application must add minimal delay on the computer it is running on.

3. The design must have a fault tolerance feature, meaning that if one computer in the network fails, the work it was doing must be continued by some means.

4. The design is recommended to be easy to implement and well structured, to give a chance for further improvements in the future.

Using these criteria, the decision-making process we went through to choose our final design was as follows:

Firstly, we considered implementing a distributed operating system: a virtual layer above the running operating system that divides the processing of any application into objects that can be sent over a network (a LAN or the Internet), where these objects are processed and the results returned to the virtual layer; there the parts are collected and handed back to the operating system layer to continue processing. The problems with this design are that it is very complicated to build, it is operating-system dependent so we could not implement a general design, it was not clear how to build a peer-to-peer structured network on top of this virtual layer so that the design works correctly, the design would produce high delay because of the OS, hardware, and virtual-layer overheads, and finally we had no clear way to implement a fault tolerance algorithm for this design.

Secondly, we considered using a distributed computing middleware like Al-chemi: a program that acts as a platform on which applications run, with the platform distributing the processing of each application over the computers in the network. This design also had problems, since most solutions built this way use a client-server model and cannot be extended to a peer-to-peer network, so we rejected it.


Thirdly, we considered using a distributed computing framework such as JXTA and JXSE, meaning we implement a design using a predefined set of protocols with a predefined API for each protocol. Here we faced the complexity of the protocols and their hard-to-use APIs, along with the fact that most of these tools are old, out of support, and no longer an efficient basis for a new design.

Fourthly, we considered implementing a client-server model, then upgrading it to a server/many-clients model and then to a many-servers/many-clients model, thereby approaching something close to the peer-to-peer model. Again the problem was the complexity of the design, with no solution for most of the features we wanted, such as fault tolerance or how to distribute work among the clients; the main problem with this solution is that it remains a theoretical design with no real development and no complete design schema for building this style of system from scratch in distributed computing.

Fifthly and finally, the peer-to-peer computing design using remote method invocation is our chosen design and the most successful: it is based on a peer-to-peer network with an efficient fault tolerance feature and low-delay response, and it is easy to implement with room for future improvement. This is the final and most optimal design according to the mentioned criteria.


3-Design & Analysis

3.1 General Description

3.1.1 System Requirements Specifications

•The network starts with a single peer which starts its own network. Other peers can connect to the

network by sending a connect message to any peer in the network given by IP and port. There is no

notion of a tracker or a centralized system for handling connects and failures.

•The failure of a peer will be detected when a communication attempt fails and an RMI exception is caught (a minimal sketch of this detection follows this list).

•It is possible to share objects between the peers in the system in order to solve more complex

problems.

•Workload will be distributed evenly between peers and their workers by keeping a short work queue

and making peers request more work from random peers in the net. Test by spawning large TSP task,

make sure that all peers have enough work at all times. The goal is to make this system utilize processing

power better than the system that was made in homework 5.

•The system is fail-safe. Any number of peers can crash without breaking the network or the current computation, provided that two peers do not crash at the same time. Test by running a large task and then forcing some peers to disconnect while computing tasks.

•This project is realized as an API which people can use to distribute their computations on a distributed

cluster.
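A minimal sketch of the failure-detection requirement above, with hypothetical names rather than the project's actual classes: any RemoteException raised by a communication attempt marks the peer as crashed.

import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical remote interface; the real project exposes its own methods.
interface RemotePeer extends Remote {
    String giveTask() throws RemoteException;   // task payload simplified to a String
}

public class PeerProxy {
    private final RemotePeer remote;
    private volatile boolean alive = true;

    public PeerProxy(RemotePeer remote) { this.remote = remote; }

    // Any failed communication attempt marks the peer as crashed.
    public String stealWork() {
        try {
            return remote.giveTask();
        } catch (RemoteException e) {
            alive = false;   // detected failure: caller removes this peer and reassigns its tasks
            return null;
        }
    }

    public boolean isAlive() { return alive; }
}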

3.1.2. General Capabilities.

A distributed P2P compute engine is implemented that does not fail if one or multiple peers crash or

disconnects. The system supports multi-core computations, Divide And Conquer (DAC), shared variables,

it is scalable and does not suffer from bottlenecks due to network communication.


3.1.3. Key Technical Issues

•Making the system consistent when faced with failing machines and randomly disconnecting peers

•Distributing workload as efficiently as possible

•Sharing variables between the peers

•Fault Tolerance.

•Detecting Failures.

•Back up of Tasks.

•Resuming Computing.

•Clearing out broken Peers.

3.1.4. Assumptions and Dependencies.

• There is a single natural object-oriented design for a given application, regardless of the context in

which that application will be deployed.

• Failure and performance issues are tied to the implementation of the components of an application,

and consideration of these issues should be left out of an initial design.

• The interface of an object is independent of the context in which that object is used.


3.3 Use-Case Diagram with Narrative Description

Overview:

The whole system consists of peers, the network between them, and users. Each peer has its own user; also, each peer acts as a server and a client at the same time.

Actors of the system:

1) User

2) Peer client

3) Peer server

4) Other peers' clients

5) Other peers' servers


Figure 11:Use Case Diagram


1) Initial configuration: the user should be able to view his network private IP address (and configure it if needed) and set a private port number (inner port) that he wants other peers to use when communicating with his computer.

2) Take job: each peer should be able to receive jobs from other peers acting as servers. This is done by requesting work from other peers: the peer broadcasts a message asking other peers for work (stealWork()), and the other peers respond with tasks and the information of their owner peers.

3) Create job: each peer should be able to create a job and distribute it over the peers in the network. This job can be one of three: a) Fibonacci, b) Mandelbrot, c) TSP.

4) Distribute Fibonacci: each peer should be able to distribute the Fibonacci computation over other peers, and the other peers should be ready to receive the distributed computations, calculate them, and send the results back to the owner peer.

5) Distribute Mandelbrot: each peer should be able to distribute Mandelbrot computations, and the other peers should be ready to receive the distributed computations, calculate them, and send the result back to the owner peer.

6) Distribute TSP: each peer should be able to distribute TSP computations, and the other peers should be ready to receive the distributed computations, calculate them, and send the result back to the owner peer.


3-4: Detailed Class Diagram

Figure 12: Detailed Class Diagram

Note: For Full Detailed Class diagram check the appendices.


3-5: Interaction (Sequence) Diagram

Figure 13:Detailed Sequence Diagram

Note: For Full Detailed Sequence diagram check the appendices.


3-6 Packages’ Dependencies

Figure 14: Packages’ Dependencies


4-Implementation and Tools

4-1: Flow-chart (block Diagram):

Figure 15: Block Diagram of the project


4-2 Program Features:

• The network starts with a single peer which starts its own network. Other peers can connect to the network by sending a connect message to any peer in the network given by IP and port. There is no notion of a tracker or a centralized system for handling connects and failures.

• The failure of a peer will be detected when a communication attempt fails and an RMI exception is caught.

• It is possible to share objects between the peers in the system in order to solve more complex problems.

• Workload will be distributed evenly between peers and their workers by keeping a short work queue and making peers request more work from random peers in the net. Test by spawning a large TSP task; make sure that all peers have enough work at all times. The goal is to make this system utilize processing power better than a single computer.

• The system is fail-safe. Any number of peers can crash without breaking the network or the current computation, provided that two peers do not crash at the same time. Test by running a large task and then forcing some peers to disconnect while computing tasks.

• This project is realized as an API which people can use to distribute their computations on a distributed cluster.


4-3: Other resources needed:

1- Any computer that wants to volunteer in this network.

2- Router to make LAN network between these computers.

3- Ant build tool.

4- JDK (JAVA development kit).

- For further improvement:

1- Azure cloud service, which will have static IP addresses.

2- A database hosted on it.

4-4 Testing:

4.4.1 Class level testing:

1- task.java:

We have made a method called clone that clones an object before sending it to the space, to guard against unwanted mutation, as Tasks must be clonable so they can be distributed and duplicated over the network.
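As a hedged sketch (not the project's actual task.java), a clonable, serializable task could look roughly like this:

import java.io.Serializable;

// Minimal sketch of a task that can be serialized for RMI and cloned before
// it is put into the space, so the sender's copy is never mutated remotely.
public abstract class Task<R> implements Serializable, Cloneable {
    @SuppressWarnings("unchecked")
    @Override
    public Task<R> clone() {
        try {
            return (Task<R>) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: we implement Cloneable
        }
    }

    // Executed on whichever peer eventually receives the task.
    public abstract R execute();
}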

2- MandelbrotClient.java:

We handle an exception while waiting for the task to finish and taking it, then print the result.

3- TspClient.java:

We handle an exception while waiting for the task to finish and taking it, then print the result.

4- FibClient.java:

We handle an exception while waiting for the task to finish and taking it, then print the result.

5- GetRemoteQueueMessage.java:

We handle an exception if an error occurs while registering the remote queue (RMI exception).


6- GetTaskMessage.java:

We handle an exception if giving a task fails (RMI exception).

7- Message.java:

We handle an exception if a message cannot be broadcast, and if a message cannot be cloned (RMI exception).

8- Executor.java:

We handle an InterruptedException if sleep fails, and an exception if a task cannot be removed from the remote queue.


4.4.2 System Level Testing: (Black Box Testing)

- First test case:

Input: Enter "ant peer" to compile the project in the directory of the project , then the project

ask us to enter the IP address of the remote peer so I will input any name and not the IP

address , then it will ask us about the port number and I will repeat what we wrote in the IP

address of the remote peer.

Output: it will accept what we wrote in the host name of the remote peer , as it can be a

name, but as we enter a name in the port number section , it will be breaking the sequence of

the program.

- Second test case :

Input: Enter the IP address of the remote peer as "192.168.1.3", the local port number "1099" and the remote port number "1099", then wait for a response from any other peer that wants to volunteer in this network.

Note: the port number of the remote peer is blocked and the packets will be dropped from this peer to the other peer.

Output: the program waits until the TTL ends, then the program sequence breaks.

- Third test case:

Input: Enter the IP address of the remote peer as "192.168.1.3", the local port number "1099" and the remote port number "1099", then wait for a response from any other peer that wants to volunteer in this network.

Note: the second peer is down.

Output: the program waits until the TTL ends, then the program sequence breaks.


4.4.3 Performance Testing:

- First test case:

Input: Enter the IP address of the remote peer as "192.168.1.3", the local port number "1099" and the remote port number "1099", then wait for a response from any other peer that wants to volunteer in this network.

Note: the port number of the remote peer is blocked and the packets will be dropped from this peer to the other peer.

Output: the program waits until the TTL ends, then the program sequence breaks. (Performance bug)

- Second test case:

Input: Enter the IP address of the remote peer as "192.168.1.3", the local port number "1099" and the remote port number "1099", then wait for a response from any other peer that wants to volunteer in this network.

Note: the second peer is down.

Output: the program waits until the TTL ends, then the program sequence breaks. (Performance bug)

4.4.4 Security Testing:

We did not implement any security algorithm on the peers, or to secure the network, or even to secure the ports that we send on; we simply opened all firewalls on the peers, as our project runs on a LAN only (see Further Improvements).


5-Results and Conclusions

After executing all three applications, we get these results:

1- Mandelbrot:

Figure 16:Mandelbrot Result


2-TSP:

Figure 16:TSP result


3-fibonacci:

Figure 17: Fibonacci Result

5.3 Performance Analysis:

Performance Requirements

The performance of the system will be tested by running three kinds of tasks; TSP, Mandelbrot and

Fibonacci computations. The goal is that the peer to peer system should compute these tasks faster than

what has been achieved in the homework tasks in the course. The performance of the homework

compute system is well documented and should be easily comparable to the measurement from the

peer to peer system.

When we increase the number of machines and cores in the network, we expect to see better results than from the homework, since the space became a bottleneck in that system. The peer to peer network has no such bottleneck and should be able to handle the massive amount of message passing better.

Performance Architecture

• The peer to peer system has no central authority, all peers are equal.

• All peers can talk to all other peers whenever they want to. It is possible to send messages to

specific peers or simply a broadcast to all peers. All peers keep references to all the other peers

in the network.

• Workload distribution is implemented as random work stealing. That is, whenever the task

queue of a peer is shorter than some limit, the peer will start asking random peers for more

work (a minimal sketch of this policy follows this list).

• The task system supports DAC and the ability to share a variable between the workers.

• The shared variable will be broadcasted to all peers which will update their variable if it is

"newer" than the one they already have.


• Failure tolerance is achieved by storing copies of the task queues as a remote queue in another

peer so that the data isn't lost in case the peer fails. Each time a peer connects or loses its

remote queue it will ask if some peer in the network can keep a copy of its tasks. Each peer can

store as many as two queues for other peers. When a peer detects that the peer it is holding a

copy queue for has failed it will merge the copy-queue into its own task queue.
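A minimal sketch of the random work-stealing policy described above, with hypothetical names and an assumed queue-length threshold:

import java.util.Deque;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ConcurrentLinkedDeque;

public class WorkStealer {
    // Hypothetical view of a remote peer: returns a task payload or null.
    public interface RemotePeerView { String requestTask(); }

    private static final int LOW_WATERMARK = 4;   // assumed queue-length threshold
    private final Deque<String> localQueue = new ConcurrentLinkedDeque<>();
    private final Random random = new Random();

    // Called periodically by the executor loop: top up a short queue
    // by asking one randomly chosen peer for more work.
    public void maybeSteal(List<RemotePeerView> peers) {
        if (localQueue.size() >= LOW_WATERMARK || peers.isEmpty()) {
            return;                               // enough local work already
        }
        RemotePeerView victim = peers.get(random.nextInt(peers.size()));
        String stolen = victim.requestTask();      // in the real system this is an RMI call
        if (stolen != null) {
            localQueue.add(stolen);
        }
    }
}

In the real system the requestTask() call would be an RMI invocation on the chosen peer, and a null return simply means that peer had no spare work.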

Experiments

When the results are compared with those from homework 5, it seems that the system is faster at some tasks but slower at others. Typically, tasks with a lot of message passing have become faster, while tasks which need to send a lot of data become slower, since the P2P implementation moves data around a bit more than the homework system. One reason for the reduced execution time of some tasks may be that a different data structure is used for holding the task queues in the P2P implementation vs. the homework implementation. This data structure could possibly have been used in the homework as well and given it a significant speed up.

One Mandelbrot visualization computed by the system can be seen below.

Figure 18:Mandelbrot

The failure tolerance of the system was tested. It handles random disconnects very well, as long

as there is only one at a time. It even reassigns tasks and updates closure links between the

different peers seamlessly such that the DAC functionality works without problems. Only tasks which are currently being executed by the failing peer have to be recomputed, and those tasks

are typically small.


Below, a figure of the measured execution times can be seen. The execution times measured are the average of ten samples.

Figure 19: Performance Analysis for Mandelbrot and TSP

Figure 20: Performance Analysis for Fibonacci

[Bar charts: execution time in ms, comparing TSP and Mandelbrot on a single computer vs. the P2P system (Figure 19), and Fibonacci on a single computer vs. the P2P system (Figure 20).]

5.4 Cost Analysis

All the software and libraries used during our project are free and open source, and there was no need to buy a license of any form for any software, library or tool used, so the project has zero budget and zero cost.

5.5 Bill of Materials:

No Hardware parts were bought so there is no bill of materials.

5.6 Hazards and Failure Analysis:

We follow a set of software development guidelines to develop the applications we want to be distributed, and we want anyone who develops other applications on our platform to follow them as well, to develop greener solutions:

1. Choose Faster Code over Future Hardware. Yes, if you wait 2 years and run

your code on a future device, it will run twice as fast as it runs today, but that’s

exactly the wrong attitude. To be a green developer, don’t use the fastest machine

you can find as your acceptable performance benchmarking system. Instead, use a

machine from 2 or 4 years ago with less memory, slower CPU and smaller hard

drive. If it can run on that machine with acceptable performance, your

contribution to the rapid replacement of hardware is significantly less.

2. Include Environmental Costs in Your Cost Analysis. You might find that writing slower code and buying more servers costs a lot less than the cost of another month of development, but don’t forget the hidden costs of more servers. These days the data center electricity alone that a server uses in a given year can cost as much or more than the server itself, not to mention rack costs, cooling costs, bandwidth costs, software costs and the environmental impact of all the materials and manufacturing processes that goes into creating the server itself.

3. Be Memory Conscious. With most developer machines having 2 or 4GB of memory these days, developers often forget that there are hundreds of millions of PCs out there with just 256MB of memory (or less!). Even 256MB is an unbelievable amount of RAM. I couldn’t imagine having that much memory on a computer at the start of this decade and now I have 2GB! It’s because developers use memory as if it was free. The argument is that memory doesn’t cost anything. You can pickup 1GB sticks for under $100! The problem is for a large number of


machines that are not capable of having 1GB of memory, you need to replace the entire machine and guess what happens to the old machine?

4. Build in-house When Possible. Using 3rd party components and external

controls can be a great way to save time in the development of an application, but

make sure you’re not using 100MB worth of components for 1MB of gain. If the

ratio of things you use in your 3rd party controls to things you don’t use is 1:100,

that’s bad. It’s not just bad for the environment; it’s even bad for your end-user

experience.

5. Go back and solve it again. Every developer knows at least a dozen areas of

their software that they could have coded more efficiently. Often the performance

of re-solving critical areas can gain an order of magnitude in performance

improvements. But we never go back to do it again because we’re busy adding the

next new feature. Going back and solving problems we have already solved might

seem like a waste of time, but it’s often very satisfying to cut down hundreds of

lines of code to just a few lines and make it much much faster.

5.7 Recommendations and further improvements:

1. Use Azure Cloud Service with the available academic license for testing performance

and comparing, also can support the software with a database hosting to react as a

tracker for going through the internet.

2. Build a distributed database that is distributed among all peers sharing peers identities

and information.

3. Use different encryption and decryption techniques to improve security.

4. Support the software with static IPs or whenever IPv6 is available to go smoothly

through the internet.

5. Use the 4G and other fast available internet connections like Google fiber whenever

available for end user


6-User Guideline:

First: open cmd.exe and ensure that you have Ant installed to build our software, by typing "ant" in the command prompt window.

Second: change the directory to the directory of the program like this

Figure 21: changing the directory

Third: to build and compile our project just type "ant peer" and it will do what we write in the xml file

under the target name "peer", so it will compile the source files with "javac" and execute our software, then

ask you to type in the hostname of the remote peer.

Figure 22: enter hostname of remote peer

Here we have 2 options: First option … execute the applications on the same machine ("localhost").

1-Type in "localhost" to choose that our project will execute on our machine and will not execute on another peer; "we will execute our program in this case just to make statistics".

- Type in a remote port; it is 1099 by default and we will not use it here as we will do our operation on

the same machine.


- Type in a local port; it is 1096 by default.

Figure 23: enter an application to execute

-press enter after each step.

2-First we will type "mandelbrot" to execute Mandelbrot set algorithm and what we have will be like

that:

Figure 24: enter mandelbrot

- It will indicate us that "New job started", time to execute that job and the "Job completed!"


3-And will show us the Mandelbrot set like that

Figure 25: Mandelbrot set

4-You can do another application like TSP "Travelling Salesman Problem" by typing "tsp".

- It will show us the time to execute that job; after the job is done, we will have the length of the shortest path and the tour between cities.

5-We will have a graph of all cities as we enter in the code and edges connect the cities like that:

Figure 23: tsp time and length of shortest path and tour between cities


Figure 26: TSP

6-You can do another application "Fibonacci series of 25 numbers, type in "fib" and we have all of this.

Figure 27: fibonacci series

7-The project indicates that a new job started; after some milliseconds the job completes, then we have the time for one try of this job, the average time to do it, and finally the result of the Fibonacci.


8-We can know the size of the network (the number of peers in the network) by typing "size".

- And in this case we have only one peer.


Figure 28: size of the network

9-we can also know the peer id by typing "id".

-It’s a unique id number to every peer.

Figure 29: id of the peer

Figure 30: exit from the network

10-We can also exit from the network by typing "exit".

-It will erase the ready queue and the wait map.

11-We can finally terminate the network by typing "terminate", and this option is only on the peer that

distributes all the jobs.

Second option … execute the applications on multiple peers; here we will connect two peers with

each other.

1-Type in the ip address of the remote machine "192.168.1.4" to choose the other peer that will execute

the applications with you.

Figure 31: terminate the network


Figure 32: enter the remote peer ip and port number of local and remote peer

-Type in a remote port and it will be 1099 .

- Type in a local port and it will be 1099 .

- press enter after each step.

2-After that go to the second machine and type in the ip address of the first peer "192.168.1.3" .

- Type in a remote port and it will be 1099.

- Type in a local port and it will be 1099 .

- press enter after each step

Figure 33: enter the remote ip and port number of local and remote peer

3-After that return to the first peer and you will get the list to choose which application you want to

execute on the two peers.

Figure 34: enter the application to execute

4-First we will type "mandelbrot" to execute Mandelbrot set algorithm.

- It will indicate us that "New job started", time to execute that job and the "Job completed!”


5-And will show us the Mandelbrot set like that

Figure 35: mandelbrot set

6-You can do another application like TSP "Travelling Salesman Problem" by typing "tsp".

- It will show us the time to execute that job; after the job is done, we will have the length of the shortest path and the tour between cities.


7-Appendix

7.1: Java Remote Method Invocation

The Java Remote Method Invocation (Java RMI) is a Java API that performs the object-oriented

equivalent of remote procedure calls (RPC), with support for direct transfer of serialized Java classes and

distributed garbage collection.

The original implementation depends on Java Virtual Machine (JVM) class representation mechanisms

and it thus only supports making calls from one JVM to another. The protocol underlying this Java-only

implementation is known as Java Remote Method Protocol (JRMP).

In order to support code running in a non-JVM context, a CORBA version was later developed.

Usage of the term RMI may denote solely the programming interface or may signify both the API and

JRMP, whereas the term RMI-IIOP (read: RMI over IIOP) denotes the RMI interface delegating most of

the functionality to the supporting CORBA implementation.

Generalized code

The programmers of the original RMI API generalized the code somewhat to support different

implementations, such as a HTTP transport. Additionally, the ability to pass arguments "by value" was

added to CORBA in order to be compatible with the RMI interface. Still, the RMI-IIOP and JRMP

implementations do not have fully identical interfaces.

RMI functionality comes in the package java.rmi, while most of Sun's implementation is located in the

sun.rmi package. Note that with Java versions before Java 5.0 developers had to compile RMI stubs in a

separate compilation step using rmic. Version 5.0 of Java and beyond no longer require this step.

Jini version

Jini offers a more advanced version of RMI in Java. It functions similarly but provides more advanced

searching capabilities and mechanisms for distributed object applications.


Example

The following classes implement a simple client-server program using RMI that displays a message.

RmiServer class — listens to RMI requests and implements the interface which is used by the client to

invoke remote methods.

import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.rmi.registry.*;

public class RmiServer extends UnicastRemoteObject implements RmiServerIntf {
    public static final String MESSAGE = "Hello World";

    public RmiServer() throws RemoteException {
        super(0); // required to avoid the 'rmic' step, see below
    }

    public String getMessage() {
        return MESSAGE;
    }

    public static void main(String args[]) throws Exception {
        System.out.println("RMI server started");
        try { // special exception handler for registry creation
            LocateRegistry.createRegistry(1099);
            System.out.println("java RMI registry created.");
        } catch (RemoteException e) {
            // do nothing, error means registry already exists
            System.out.println("java RMI registry already exists.");
        }
        // Instantiate RmiServer
        RmiServer obj = new RmiServer();
        // Bind this object instance to the name "RmiServer"
        Naming.rebind("//localhost/RmiServer", obj);
        System.out.println("PeerServer bound in registry");
    }
}


RmiServerIntf interface — defines the interface that is used by the client and implemented by the server.

RmiClient class — this is the client which gets the reference (a proxy) to the remote object living on

the server and invokes its method to get a message. If the server object implemented java.io.Serializable

instead of java.rmi.Remote, it would be serialized and passed to the client as a value.

import java.rmi.Remote;
import java.rmi.RemoteException;

public interface RmiServerIntf extends Remote {
    public String getMessage() throws RemoteException;
}

import java.rmi.Naming;

public class RmiClient {
    public static void main(String args[]) throws Exception {
        RmiServerIntf obj = (RmiServerIntf) Naming.lookup("//localhost/RmiServer");
        System.out.println(obj.getMessage());
    }
}


7.2 Exception in Remote Method Invocations

Exceptions During Remote Object Export

When a remote object class is created that extends UnicastRemoteObject, the object is exported,

meaning it can receive calls from external Java virtual machines and can be passed in an RMI call as

either a parameter or return value. An object can either be exported on an anonymous port or on a

specified port. For objects not extended from UnicastRemoteObject, the

java.rmi.server.UnicastRemoteObject.exportObject method is used to explicitly export the object.
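As a hedged sketch reusing the RmiServerIntf interface shown earlier (and assuming an RMI registry is already running on the default port), explicitly exporting an object that does not extend UnicastRemoteObject looks roughly like this:

import java.rmi.Naming;
import java.rmi.server.UnicastRemoteObject;

public class PlainServer implements RmiServerIntf {
    public String getMessage() { return "Hello World"; }

    public static void main(String[] args) throws Exception {
        PlainServer impl = new PlainServer();
        // Export on an anonymous port (0); the returned stub is what clients use.
        RmiServerIntf stub = (RmiServerIntf) UnicastRemoteObject.exportObject(impl, 0);
        // Assumes an RMI registry is already running on port 1099.
        Naming.rebind("//localhost/PlainServer", stub);
    }
}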

Exception Context

java.rmi.StubNotFoundException:
1. Class of stub not found.
2. Name collision with a class of the same name as the stub causes one of these errors:
   o Stub can't be instantiated.
   o Stub not of correct class.
3. Bad URL due to wrong codebase.
4. Stub not of correct class.

java.rmi.server.SkeletonNotFoundException:
1. Class of skeleton not found.
2. Name collision with a class of the same name as the skeleton causes one of these errors:
   o Skeleton can't be instantiated.
   o Skeleton not of correct class.
3. Bad URL due to wrong codebase.
4. Skeleton not of correct class.

java.rmi.server.ExportException: The port is in use by another VM.

Exceptions During RMI Call

Exception Context

java.rmi.UnknownHostException Unknown host.

java.rmi.ConnectException Connection refused to host.

java.rmi.ConnectIOException I/O error creating connection.

java.rmi.MarshalException I/O error marshaling transport header, marshaling call header, or

marshaling arguments.

java.rmi.NoSuchObjectException Attempt to invoke a method on an object that is no longer available.

java.rmi.StubNotFoundException Remote object not exported.


Exceptions or Errors during Return

Exception Context

java.rmi.UnmarshalException

1. Corrupted stream leads to either an I/O or protocol error

when:

o Marshaling return header.

o Checking return type.

o Checking return code.

o Unmarshaling return.

2. Return value class not found.

java.rmi.UnexpectedException

An exception not mentioned in the method signature occurred,

including runtime exceptions on the client. An exception object

contains the actual exception.

java.rmi.ServerError Any error that occurs while the server is executing a remote method.

java.rmi.ServerException

Any remote exception that occurs while the server is executing a

remote method. For examples, see Possible Causes of

java.rmi.ServerException.

java.rmi.ServerRuntimeException

Any runtime exception that occurs while the server is executing a

method, even if the exception is in the method signature. This

exception object contains the underlying exception.

Possible Causes of java.rmi.ServerException

These are the underlying exceptions which can occur on the server when the server is itself executing a

remote method invocation. These exceptions are wrapped in a java.rmi.ServerException; that is

the java.rmi.ServerException contains the original exception for the client to extract. These

exceptions are wrapped by ServerException so that the client will know that its own remote method

invocation on the server did not fail, but that a secondary remote method invocation made by the

server failed.
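As a hedged illustration (not code from this project), a client can unwrap the original server-side exception from a java.rmi.ServerException roughly as follows:

import java.rmi.ServerException;

public class ServerExceptionDemo {
    // 'remote' is a hypothetical stub for the RmiServerIntf interface shown earlier.
    static void callRemote(RmiServerIntf remote) {
        try {
            remote.getMessage();
        } catch (ServerException wrapped) {
            // ServerException wraps the exception raised by the server's own
            // secondary remote call; getCause() holds the original exception.
            Throwable original = wrapped.getCause();
            System.err.println("Server-side remote call failed: " + original);
        } catch (java.rmi.RemoteException other) {
            System.err.println("Our own remote call failed: " + other);
        }
    }
}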


Exception Context

java.rmi.server.SkeletonMismatchException Hash mismatch of stub and skeleton.

java.rmi.UnmarshalException I/O error unmarshaling call header. I/O error unmarshaling

arguments.

java.rmi.MarshalException Protocol error marshaling return.

java.rmi.RemoteException Method number out of range due to corrupted stream.

Naming Exceptions

The following table lists the exceptions specified in methods of the java.rmi.Naming class and the

java.rmi.registry.Registry interface.

Exception Context

java.rmi.AccessException Operation disallowed. The registry restricts bind, rebind, and unbind

to the same host. The lookup operation can originate from any host.

java.rmi.AlreadyBoundException Attempt to bind a name that is already bound.

java.rmi.NotBoundException Attempt to look up a name that is not bound.

java.rmi.UnknownHostException Attempt to contact a registry on an unknown host.


Other Exceptions

Exception Context

java.rmi.RMISecurityException A security exception that is thrown by the

RMISecurityManager.

java.rmi.server.ServerCloneException Clone failed.

java.rmi.server.ServerNotActiveException

Attempt to get the client host via the

RemoteServer.getClientHost method when the remote server

is not executing in a remote method.

java.rmi.server.SocketSecurityException Attempt to export object on an illegal port.


7.3 Distributed Operating System

A distributed operating system is software that runs over a collection of independent, networked,

communicating, and physically separate computational nodes. Each individual node holds a specific

software subset of the global aggregate operating system. Each subset is a composite of two distinct

service provisioners. The first is a ubiquitous minimal kernel, or microkernel, that directly controls that

node’s hardware. Second is a higher-level collection of system management components that

coordinate the node's individual and collaborative activities. These components abstract microkernel

functions and support user applications.

The microkernel and the management components collection work together. They support the system’s

goal of integrating multiple resources and processing functionality into an efficient and stable

system. This seamless integration of individual nodes into a global system is referred to as transparency,

or single system image; describing the illusion provided to users of the global system’s appearance as a

single computational entity.

Description

A distributed OS provides the essential services and functionality required of an OS, adding attributes and particular configurations to allow it to support additional requirements such as increased scale and availability. To a user, a distributed OS works in a manner similar to a single-node, monolithic operating system. That is, although it consists of multiple nodes, it appears to users and applications as a single-node.

Separating minimal system-level functionality from additional user-level modular services provides a “separation of mechanism and policy.” Mechanism and policy can be simply interpreted as "how something is done" versus "why something is done," respectively. This separation increases flexibility and scalability.


Overview

1-The Kernel

At each locale (typically a node), the kernel provides a minimally complete set of node-level utilities necessary for operating a node's underlying hardware and resources. These mechanisms include allocation, management, and disposition of a node's resources, processes, communication, and input/output management support functions. Within the kernel, the communications sub-system is of foremost importance for a distributed OS.

In a distributed OS, the kernel often supports a minimal set of functions, including low-level address space management, thread management, and inter-process communication (IPC). A kernel of this design is referred to as a microkernel. Its modular nature enhances reliability and security, essential features for a distributed OS. It is common for a kernel to be identically replicated over all nodes in a system and therefore that the nodes in a system use similar hardware. The combination of minimal design and ubiquitous node coverage enhances the global system's extensibility, and the ability to dynamically introduce new nodes or services.

2-System management components

System management components are software processes that define the node's policies. These

components are the part of the OS outside the kernel. These components provide higher-level

communication, process and resource management, reliability, performance and security. The

components match the functions of a single-entity system, adding the transparency required in a

distributed environment.

The distributed nature of the OS requires additional services to support a node's responsibilities to the

global system. In addition, the system management components accept the "defensive" responsibilities

of reliability, availability, and persistence. These responsibilities can conflict with each other. A

consistent approach, balanced perspective, and a deep understanding of the overall system can assist in

identifying diminishing returns. Separation of policy and mechanism mitigates such conflicts.


3-Working together as an operating system

The architecture and design of a distributed operating system must realize both individual node and

global system goals. Architecture and design must be approached in a manner consistent with

separating policy and mechanism. In doing so, a distributed operating system attempts to provide an

efficient and reliable distributed computing framework allowing for an absolute minimal user awareness

of the underlying command and control efforts.

The multi-level collaboration between a kernel and the system management components and in turn

between the distinct nodes in a distributed operating system is the functional challenge of the

distributed operating system. This is the point in the system that must maintain a perfect harmony of

purpose, and simultaneously maintain a complete disconnect of intent from implementation. This

challenge is the distributed operating system's opportunity to produce the foundation and framework

for a reliable, efficient, available, robust, extensible, and scalable system. However, this opportunity

comes at a very high cost in complexity.

4-The price of complexity

In a distributed operating system, the exceptional degree of inherent complexity could easily render the

entire system an anathema to any user. As such, the logical price of realizing a distributed operating

system must be calculated in terms of overcoming vast amounts of complexity in many areas, and on

many levels. This calculation includes the depth, breadth, and range of design investment and

architectural planning required in achieving even the most modest implementation.[11]

These design and development considerations are critical and unforgiving. For instance, a deep

understanding of a distributed operating system’s overall architectural and design detail is required at

an exceptionally early point.[1] An exhausting array of design considerations are inherent in the

development of a distributed operating system. Each of these design considerations can potentially

affect many of the others to a significant degree. This leads to a massive effort in balanced approach, in

terms of the individual design considerations, and many of their permutations. As an aid in this effort,

most rely on documented experience and research in distributed computing.

Design considerations


Transparency

Transparency or single-system image refers to the ability of an application to treat the system on which

it operates without regard to whether it is distributed and without regard to hardware or other

implementation details. Many areas of a system can benefit from transparency, including access,

location, performance, naming, and migration. The consideration of transparency directly affects

decision making in every aspect of design of a distributed operating system. Transparency can impose

certain requirements and/or restrictions on other design considerations.

Systems can optionally violate transparency to varying degrees to meet specific application

requirements. For example, a distributed operating system may present a hard drive on one computer

as "C:" and a drive on another computer as "G:". The user does not require any knowledge of device

drivers or the drive's location; both devices work the same way, from the application's perspective. A

less transparent interface might require the application to know which computer hosts the drive.

Transparency domains:

• Location transparency: Location transparency comprises two distinct aspects of transparency,

naming transparency and user mobility. Naming transparency requires that nothing in the

physical or logical references to any system entity should expose any indication of the entity's

location, or its local or remote relationship to the user or application. User mobility requires the

consistent referencing of system entities, regardless of the system location from which the

reference originates.

• Access transparency: Local and remote system entities must remain indistinguishable when

viewed through the user interface. The distributed operating system maintains this perception

through the exposure of a single access mechanism for a system entity, regardless of that entity

being local or remote to the user. Transparency dictates that any differences in methods of

accessing any particular system entity—either local or remote—must be both invisible to, and

undetectable by the user.

• Migration transparency: Resources and activities migrate from one element to another

controlled solely by the system and without user/application knowledge or action.


• Replication transparency: The process or fact that a resource has been duplicated on another

element occurs under system control and without user/application knowledge or intervention.

• Concurrency transparency: Users/applications are unaware of and unaffected by the

presence/activities of other users.

• Failure transparency: The system is responsible for detection and remediation of system

failures. No user knowledge/action is involved other than waiting for the system to resolve the

problem.

• Performance transparency: The system is responsible for the detection and remediation of local or global performance shortfalls. Note that system policies may prefer some users/user classes/tasks over others. No user knowledge or interaction is involved.

• Size/Scale transparency: The system is responsible for managing its geographic reach, number of nodes, and level of node capability without any required user knowledge or interaction.

• Revision transparency: The system is responsible for upgrades and revisions and changes to

system infrastructure without user knowledge or action.

• Control transparency: The system is responsible for providing all system information, constants,

properties, configuration settings, etc. in a consistent appearance, connotation, and denotation

to all users and applications.

• Data transparency: The system is responsible for providing data to applications without user

knowledge or action relating to where the system stores it.


• Parallelism transparency: The system is responsible for exploiting any ability to parallelize task

execution without user knowledge or interaction. Arguably the most difficult aspect of

transparency, and described by Tanenbaum as the "Holy grail" for distributed system designers.

Inter-process communication

Inter-Process Communication (IPC) is the implementation of general communication, process

interaction, and dataflow between threads and/or processes both within a node, and between nodes in

a distributed OS. The intra-node and inter-node communication requirements drive low-level IPC design,

which is the typical approach to implementing communication functions that support transparency. In

this sense, IPC is the greatest underlying concept in the low-level design considerations of a distributed

operating system.
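A minimal, purely illustrative sketch of inter-node IPC follows (it is not this project's transport): two "nodes" running in one JVM exchange a request and an acknowledgement over a plain TCP socket. A real distributed OS layers naming, marshalling, and the transparency described above on top of such a primitive.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class IpcDemo {
        public static void main(String[] args) throws Exception {
            // "Node A": listens on any free port, reads one message and echoes an acknowledgement.
            final ServerSocket server = new ServerSocket(0);
            Thread nodeA = new Thread(new Runnable() {
                public void run() {
                    try {
                        Socket s = server.accept();
                        BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
                        PrintWriter out = new PrintWriter(s.getOutputStream(), true);
                        out.println("ACK: " + in.readLine());
                        s.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
            nodeA.start();

            // "Node B": sends a request to node A and waits for the reply.
            Socket s = new Socket("localhost", server.getLocalPort());
            PrintWriter out = new PrintWriter(s.getOutputStream(), true);
            BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
            out.println("run task 42");
            System.out.println(in.readLine());   // prints: ACK: run task 42
            s.close();

            nodeA.join();
            server.close();
        }
    }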

Process management

Process management provides policies and mechanisms for effective and efficient sharing of resources

between distributed processes. These policies and mechanisms support operations involving the

allocation and de-allocation of processes and ports to processors, as well as mechanisms to run,

suspend, migrate, halt, or resume process execution. While these resources and operations can be

either local or remote with respect to each other, the distributed OS maintains state and

synchronization over all processes in the system.

As an example, load balancing is a common process management function. Load balancing monitors

node performance and is responsible for shifting activity across nodes when the system is out of

balance. One load balancing function is picking a process to move. The kernel may employ several

selection mechanisms, including priority-based choice. This mechanism chooses a process based on a

policy such as 'newest request'. The system implements the policy.
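The sketch below illustrates, with invented names, how such a selection policy can be kept separate from the migration mechanism: the policy ranks candidate processes by the arrival time of their newest request, and the mechanism (here just a print statement) acts on whatever the policy chooses.

    import java.util.ArrayList;
    import java.util.List;

    public class LoadBalancerDemo {

        // Bookkeeping the kernel might hold for each candidate process (names are illustrative).
        static class ProcessInfo {
            final int pid;
            final long requestTime;   // arrival time of the latest request, in milliseconds
            ProcessInfo(int pid, long requestTime) { this.pid = pid; this.requestTime = requestTime; }
        }

        // Policy: 'newest request' -- choose the process whose request arrived most recently.
        static ProcessInfo selectByNewestRequest(List<ProcessInfo> candidates) {
            ProcessInfo chosen = candidates.get(0);
            for (ProcessInfo p : candidates) {
                if (p.requestTime > chosen.requestTime) chosen = p;
            }
            return chosen;
        }

        // Mechanism: a real kernel would migrate the selected process; here we only report the decision.
        static void migrate(ProcessInfo p, String targetNode) {
            System.out.println("Migrating pid " + p.pid + " to " + targetNode);
        }

        public static void main(String[] args) {
            List<ProcessInfo> onOverloadedNode = new ArrayList<ProcessInfo>();
            onOverloadedNode.add(new ProcessInfo(101, 1000L));
            onOverloadedNode.add(new ProcessInfo(102, 5000L));
            onOverloadedNode.add(new ProcessInfo(103, 3000L));
            migrate(selectByNewestRequest(onOverloadedNode), "node-7");   // picks pid 102
        }
    }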

Resource management


System resources such as memory, files, and devices are distributed across the nodes of a system, and at any given moment any of these nodes may have a light or idle workload. Load sharing and load balancing require many policy-oriented decisions, ranging from finding idle CPUs to deciding when to move work and which work to move. Many algorithms exist to aid in these decisions; however, this calls for a second level of decision-making policy: choosing the algorithm best suited to the scenario and the conditions surrounding it.
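As a toy example of that second level of policy, the sketch below picks between two load-sharing strategies commonly distinguished in the literature, sender-initiated and receiver-initiated transfers, based on average node utilization; the 0.7 threshold and the method names are purely illustrative.

    public class AlgorithmSelectorDemo {

        enum Strategy { SENDER_INITIATED, RECEIVER_INITIATED }

        // Rule of thumb from the load-sharing literature: sender-initiated transfers work well when the
        // system is lightly loaded (idle receivers are easy to find), while receiver-initiated transfers
        // behave better under heavy load. The threshold below is illustrative only.
        static Strategy choose(double averageUtilization) {
            return averageUtilization < 0.7 ? Strategy.SENDER_INITIATED : Strategy.RECEIVER_INITIATED;
        }

        public static void main(String[] args) {
            System.out.println(choose(0.35));   // SENDER_INITIATED
            System.out.println(choose(0.90));   // RECEIVER_INITIATED
        }
    }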

Reliability

A distributed OS can provide the necessary resources and services to achieve high levels of reliability, or

the ability to prevent and/or recover from errors. Faults are physical or logical defects that can cause

errors in the system. For a system to be reliable, it must somehow overcome the adverse effects of

faults.

The primary methods for dealing with faults include fault avoidance, fault tolerance, and fault detection

and recovery. Fault avoidance covers proactive measures taken to minimize the occurrence of faults.

These proactive measures can be in the form of transactions, replication and backups. Fault tolerance is

the ability of a system to continue operation in the presence of a fault. In the event of a fault, the system should detect it and recover full functionality. In either case, any actions taken should make every effort to preserve the single system image.
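A minimal sketch of fault tolerance through replication follows, with invented interfaces: a read is simply retried against the next replica, so the caller never observes the fault and the single system image is preserved.

    import java.util.Arrays;
    import java.util.List;

    public class FailoverDemo {

        interface Replica {
            String read(String key) throws Exception;
        }

        // Tries replicas in order; the caller sees a normal result as long as one replica answers.
        static String readWithFailover(List<Replica> replicas, String key) throws Exception {
            Exception last = null;
            for (Replica r : replicas) {
                try {
                    return r.read(key);
                } catch (Exception e) {
                    last = e;   // fault detected on this replica; fail over to the next one
                }
            }
            throw (last != null ? last : new IllegalArgumentException("no replicas configured"));
        }

        public static void main(String[] args) throws Exception {
            Replica broken = new Replica() {
                public String read(String key) throws Exception { throw new Exception("node down"); }
            };
            Replica healthy = new Replica() {
                public String read(String key) { return "value-of-" + key; }
            };
            // The fault in the first replica is masked; prints: value-of-answer
            System.out.println(readWithFailover(Arrays.asList(broken, healthy), "answer"));
        }
    }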

Availability

Availability is the fraction of time during which the system can respond to requests.
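For steady-state estimates, availability is commonly expressed as MTTF / (MTTF + MTTR): the mean time to failure divided by the sum of the mean time to failure and the mean time to repair. For example, a node that runs on average 1,000 hours between failures and needs 10 hours to repair offers roughly 1000 / 1010, or about 99%, availability.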

Performance

Many benchmark metrics quantify performance: throughput, response time, job completions per unit time, system utilization, etc. With respect to a distributed OS, performance most often distills to a balance between process parallelism and IPC. Managing the task granularity of parallelism in a sensible relation to the messages required for its support is extremely effective. Also, identifying when it is more beneficial to migrate a process to its data, rather than to copy the data to the process, is effective as well.
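A toy heuristic for that migrate-versus-copy decision is sketched below: it simply moves whichever is smaller, the process state or the data. The byte counts and method names are invented for the example.

    public class MigrationDecisionDemo {

        // Move the smaller of the two: ship the process to the data, or the data to the process.
        static String decide(long processStateBytes, long dataBytes) {
            return processStateBytes < dataBytes ? "migrate process to data" : "copy data to process";
        }

        public static void main(String[] args) {
            System.out.println(decide(2000000L, 500000000L));   // migrate process to data
            System.out.println(decide(50000000L, 1000000L));    // copy data to process
        }
    }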

Synchronization

Cooperating concurrent processes have an inherent need for synchronization, which ensures that

changes happen in a correct and predictable fashion. Three basic situations define the scope of this need:

• One or more processes must synchronize at a given point for one or more other processes to continue.

• One or more processes must wait for an asynchronous condition in order to continue.

• A process must establish exclusive access to a shared resource.

Improper synchronization can lead to multiple failure modes including loss of atomicity, consistency,

isolation and durability, deadlock, livelock and loss of serializability.
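The third situation, exclusive access to a shared resource, can be sketched on a single node with an ordinary lock, as below; in a distributed OS the same guarantee requires a distributed locking or token-passing protocol, which this toy example does not attempt.

    import java.util.concurrent.locks.ReentrantLock;

    public class MutualExclusionDemo {
        private static final ReentrantLock lock = new ReentrantLock();
        private static int sharedCounter = 0;

        public static void main(String[] args) throws InterruptedException {
            Runnable increment = new Runnable() {
                public void run() {
                    for (int i = 0; i < 10000; i++) {
                        lock.lock();              // establish exclusive access
                        try {
                            sharedCounter++;      // the critical section stays atomic and consistent
                        } finally {
                            lock.unlock();
                        }
                    }
                }
            };
            Thread t1 = new Thread(increment);
            Thread t2 = new Thread(increment);
            t1.start(); t2.start();
            t1.join();  t2.join();
            // Always prints 20000; without the lock, concurrent updates could be lost.
            System.out.println(sharedCounter);
        }
    }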

Flexibility

Flexibility in a distributed operating system is enhanced through the modular characteristics of the distributed OS, and by providing a richer set of higher-level services. The completeness and quality of the kernel/microkernel simplify the implementation of such services, and potentially allow a greater choice of providers for such services.


7.4 Detailed Class Diagram

Overall Class Diagram

https://www.academia.edu/7565496/Link_Share_Class_Diagram

Another view of the overall class diagram

https://www.academia.edu/7565497/Link_Share_Class_Diagram_another_view

Detailed Class Diagrams

https://drive.google.com/file/d/0B_0qbuF9dM3RVDhCcW53VkxOUTA/edit?usp=sharing

7.5 Detailed Sequence Diagram

Sequence Diagram for Main() Method

https://www.academia.edu/7565542/Lin_K_and_Shar_E_Sequence_Diagram_main

Full detailed sequence diagrams

https://drive.google.com/file/d/0B_0qbuF9dM3RYzk0UGw3LTdqN1E/edit?usp=sharing


7.6 List of Readings

C++ Network Programming, Volume 2: Systematic Reuse with ACE and Frameworks, by Douglas C. Schmidt and Stephen D. Huston

Foundations of Python Network Programming, by Brandon Rhodes and John Goerzen

.NET 4 for Enterprise Architects and Developers, by Sudhanshu Hate and Suchi Paharia

JXTA Java™ Standard Edition v2.5: Programmers Guide

JXSE v2.7, The JXTA Java™ Standard Edition Implementation Programmer's Guide, by Jérôme Verstrynge

Professional C# 2012 and .NET 4.5, by Christian Nagel, Bill Evjen, Jay Glynn, Karli Watson, and Morgan Skinner

P2P-RPC: Programming Scientific Applications on Peer-to-Peer Systems with Remote Procedure Call

Friend-to-Friend Computing: Instant Messaging Based Spontaneous Desktop Grid

JXTA in a Nutshell, O'Reilly, 2002

Peer-to-Peer Computing, by Dejan S. Milojicic, Vana Kalogeraki, Rajan Lukose, Kiran Nagaraja, Jim Pruyne, Bruno Richard, Sami Rollins, and Zhichen Xu, HP Laboratories Palo Alto

DuDE: A Prototype for a P2P-based Distributed Computing System

Dynamo: Amazon's Highly Available Key-value Store, by Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels

Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities


Design and Implementation of a Generic Library for P2P Streaming

Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall. A Note on Distributed Computing. SMLI TR-94-

29, Sun Microsystems Laboratories, M/S 29-01, 2550 Garcia Avenue Mountain View, CA 94043,

November 1994.

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and

Yuli Zhou. Cilk: An Efficient Multithreaded Runtime System, ACM SIGPLAN Symposium on Principles &

Practice of Parallel Programming (PPoPP '95), Santa Barbara CA, July 19 - 21, 1995.

Peer-to-Peer Grid Computing and a .NET-based Alchemi Framework Akshay Luther, Rajkumar Buyya,

Rajiv Ranjan, and Srikumar Venugopal

Alchemi: A .NET-based Grid Computing Framework and its Integration into Global Grids Akshay Luther,

Rajkumar Buyya, Rajiv Ranjan, and Srikumar Venugopal

JXTA: Java™ P2P Programming, by Daniel Brookshier, Darren Govoni, Navaneeth Krishnan, and Juan Carlos Soto

Mastering JXTA: Building Java Peer-to-Peer Applications, by Joseph D. Gradecki; John Wiley & Sons; ISBN: 0471250848

Extending JXTA for P2P File Sharing Systems by Bhanu Krushna Rout and Smrutiranjan Sahu

Structured Peer-to-Peer Systems : Fundamentals of Hierarchical Organization, Routing, Scaling, and

Security by Dmitry Korzun and Andrei Gurtov

Distributed .NET Programming in C# by TOM BARNABY

Design and Implementation of a P2P Cloud System by Ozalp Babaoglu, Moreno Marzolla and Michele

Tamburini


QuickPeer: A P2P Framework for Distributed Application Development Jun Wang and Xin Liu

State of Peer-to-Peer (P2P) Communication across Network Address Translators (NATs) by P.

Srisuresh and D. Kegel

The practice of peer-to-peer computing: Discovery: How peers locate one another, by Todd Sundsted, Vice President, Focus, Etcee LLC

Article Title: Virtualization in grid computing

Publication Title: Telecommunications Forum (TELFOR), 2011 19th

ISBN: 978-1-4577-1499-3

Posted Online Date: Thu Feb 02 00:00:00 EST 2012

Authors: Milic, P.; Ilic, S.; Bisevac, I.; Radoicic, S.

Article Title: Recent Advances in Trusted Grids and Peer-to-Peer Computing Systems

Publication Title: Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE

International

ISBN: 1-4244-0910-1

Posted Online Date: Mon Jun 11 00:00:00 EDT 2007

Authors: Kai Hwang


Article Title: On correlated availability in Internet-distributed systems

Publication Title: Grid Computing, 2008 9th IEEE/ACM International Conference on

ISBN: 978-1-4244-2578-5

Posted Online Date: Fri Oct 31 00:00:00 EDT 2008

Authors: Kondo, D.; Andrzejak, A.; Anderson, D.P.

Article Title: P2P-RPC: programming scientific applications on peer-to-peer systems with remote

procedure call

Publication Title: Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd

IEEE/ACM International Symposium on

ISBN: 0-7695-1919-9

Posted Online Date: Wed May 21 00:00:00 EDT 2003

Authors: Djilali, S.

Article Title: Peer-to-peer fault tolerance framework for a grid computing system

Publication Title: Computer Science and Software Engineering (JCSSE), 2012 International Joint

Conference on

ISBN: 978-1-4673-1920-1

Posted Online Date: Thu Aug 09 00:00:00 EDT 2012

Authors: Tangmankhong, T.; Siripongwutikorn, P.; Achalakul, T.


Article Title: Group-based dynamic computational replication mechanism in peer-to-peer grid

computing

Publication Title: Cluster Computing and the Grid, 2006. CCGRID 06. Sixth IEEE International

Symposium on

ISBN: 0-7695-2585-7

Posted Online Date: Tue May 30 00:00:00 EDT 2006

Authors: SungJin Choi; MaengSoon Baik; JoonMin Gil; ChanYeol Park; Soonyoung Jung;

ChongSun Hwang

Article Title: Emerging Era of Cooperative Empowerment: Grid, Peer-to-Peer, and Community

Computing

Publication Title: Information and Communication Technologies, 2005. ICICT 2005. First

International Conference on

ISBN: 0-7803-9421-6

Posted Online Date: Mon Feb 27 00:00:00 EST 2006

Authors: Khan, J.I.

Article Title: Power management in cloud computing using green algorithm

Publication Title: Advances in Engineering, Science and Management (ICAESM), 2012

International Conference on

ISBN: 978-1-4673-0213-5

Posted Online Date: Thu Jun 14 00:00:00 EDT 2012

Authors: Yamini, R.


Article Title: Predicting the Quality of Service of a Peer-to-Peer Desktop Grid

Publication Title: Cluster, Cloud and Grid Computing (CCGrid), 2010 10th IEEE/ACM

International Conference on

ISBN: 978-1-4244-6987-1

Posted Online Date: Thu Jun 24 00:00:00 EDT 2010

Authors: Carvalho, M.; Miceli, R.; Maciel, P.D.; Brasileiro, F.; Lopes, R.

Article Title: Agile computing: bridging the gap between grid computing and ad-hoc peer-to-

peer resource sharing

Publication Title: Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd

IEEE/ACM International Symposium on

ISBN: 0-7695-1919-9

Posted Online Date: Wed May 21 00:00:00 EDT 2003

Authors: Suri, N.; Bradshaw, J.M.; Carvalho, M.M.; Cowin, T.B.; Breedy, M.R.; Groth, P.T.;

Saavedra, R.

Article Title: Trust overlay networks for global reputation aggregation in P2P grid computing

Publication Title: Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th

International

ISBN: 1-4244-0054-6

Posted Online Date: Mon Jun 26 00:00:00 EDT 2006

Authors: Runfang Zhou; Kai Hwang


Article Title: Probabilistic Observation Prediction Model based E4 Scheduling Mechanism in

Peer to Peer Grid Computing

Publication Title: Grid and Cooperative Computing, 2006. GCC 2006. Fifth International

Conference

ISBN: 0-7695-2694-2

Posted Online Date: Tue Dec 19 00:00:00 EST 2006

Authors: EunJoung Byun; Hongsoo Kim; Sungjin Choi; ChongSun Hwang

Article Title: Research on User Authentication for Grid Computing Security

Publication Title: Semantics, Knowledge and Grid, 2006. SKG '06. Second International

Conference on

ISBN: 0-7695-2673-X

Posted Online Date: Mon Dec 11 00:00:00 EST 2006

Authors: Ronghui Wu; Renfa Li; Fei Yu; Guangxue Yue; Cheng Xu

Article Title: A Cloud Computing Platform Based on P2P

Publication Title: IT in Medicine & Education, 2009. ITIME '09. IEEE International Symposium on

ISBN: 978-1-4244-3928-7

Posted Online Date: Tue Sep 15 00:00:00 EDT 2009

Authors: Ke Xu; Meina Song; Xiaoqi Zhang; Junde Song


Article Title: Research on Remote Heterogeneous Disaster Recovery Technology in Grid

Computing Security

Publication Title: Semantics, Knowledge and Grid, 2006. SKG '06. Second International

Conference on

ISBN: 0-7695-2673-X

Posted Online Date: Mon Dec 11 00:00:00 EST 2006

Authors: Ronghui Wu; Renfa Li; Fei Yu; Guangxue Yue; Jigang Wen

Article Title: An Agent-based Peer-to-Peer Grid Computing Architecture

Publication Title: Semantics, Knowledge and Grid, 2005. SKG '05. First International Conference

on

ISBN: 0-7695-2534-2

Posted Online Date: Mon Mar 12 00:00:00 EDT 2007

Authors: Jia Tang; Minjie Zhang

Article Title: Ad hoc Grid: An Adaptive and Self-Organizing Peer-to-Peer Computing Grid

Publication Title: Computer and Information Technology (CIT), 2010 IEEE 10th International

Conference on

ISBN: 978-1-4244-7547-6

Posted Online Date: Thu Sep 16 00:00:00 EDT 2010

Authors: Tiburcio, P.G.S.; Spohn, M.A.


Article Title: Efficient Peer Selection in P2P JXTA-Based Platforms

Publication Title: Advanced Information Networking and Applications, 2008. AINA 2008. 22nd

International Conference on

ISBN: 978-0-7695-3095-6

Posted Online Date: Thu Apr 03 00:00:00 EDT 2008

Authors: Xhafa, F.; Daradoumis, T.; Barolli, L.; Fernandez, R.; Caballe, S.; Kolici, V.

Article Title: Storage@home: Petascale Distributed Storage

Publication Title: Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE

International

ISBN: 1-4244-0910-1

Posted Online Date: Mon Jun 11 00:00:00 EDT 2007

Authors: Beberg, A.L.; Pande, V.S.

Article Title: SETI@home-massively distributed computing for SETI

Publication Title: Computing in Science & Engineering

Posted Online Date: Wed Aug 07 00:00:00 EDT 2002

Authors: Korpela, E.; Werthimer, D.; Anderson, D.; Cobb, J.; Leboisky, M.


Article Title: Optimizing the data distribution layer of BOINC with BitTorrent

Publication Title: Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International

Symposium on

ISBN: 978-1-4244-1693-6

Posted Online Date: Tue Jun 03 00:00:00 EDT 2008

Authors: Costa, F.; Silva, L.; Fedak, G.; Kelley, I.

Article Title: Mapping Virtual Organizations in Grids to Peer-to-Peer Networks

Publication Title: Software Engineering and Advanced Applications, 2008. SEAA '08. 34th

Euromicro Conference

ISBN: 978-0-7695-3276-9

Posted Online Date: Mon Dec 22 00:00:00 EST 2008

Authors: Dornemann, K.; Meier, D.; Mathes, M.; Freisleben, B.

Article Title: Attic: A Case Study for Distributing Data in BOINC Projects

Publication Title: Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011

IEEE International Symposium on

ISBN: 978-1-61284-425-1

Posted Online Date: Thu Sep 01 00:00:00 EDT 2011

Authors: Elwaer, A.; Harrison, A.; Kelley, I.; Taylor, I.


Article Title: Experimentations and programming paradigms for matrix computing on peer to

peer grid

Publication Title: Grid Computing, 2004. Proceedings. Fifth IEEE/ACM International Workshop

on

ISBN: 0-7695-2256-4

Posted Online Date: Mon Jan 24 00:00:00 EST 2005

Authors: Aouad, L.; Petiton, S.

Article Title: An Efficient Search Algorithm without Memory for Peer-to-Peer Cloud Computing

Networks

Publication Title: Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011

IEEE International Symposium on

ISBN: 978-1-61284-425-1

Posted Online Date: Thu Sep 01 00:00:00 EDT 2011

Authors: Naixue Xiong; Yuhua Liu; Shishun Wu; Yang, L.T.; Kaihua Xu

Article Title: A Failure Detection Service for Internet-Based Multi-AS Distributed Systems

Publication Title: Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th International

Conference on

ISBN: 978-1-4577-1875-5

Posted Online Date: Mon Jan 02 00:00:00 EST 2012

Authors: Moraes, D.M.; Duarte, E.P.


Article Title: Peer to peer distributed storage and computing cloud system

Publication Title: Information Technology Interfaces (ITI), Proceedings of the ITI 2012 34th

International Conference on

ISBN: 978-1-4673-1629-3

Posted Online Date: Thu Sep 20 00:00:00 EDT 2012

Authors: Tomas, B.; Vuksic, B.

Article Title: A modeling language approach for the abstraction of the Berkeley Open

Infrastructure for Network Computing (BOINC) framework

Publication Title: Computer Science and Information Technology (IMCSIT), Proceedings of the

2010 International Multiconference on

ISBN: 978-1-4244-6432-6

Posted Online Date: Thu Jan 06 00:00:00 EST 2011

Authors: Ries, C.B.; Hilbig, T.; Schröder, C.

Article Title: Privacy-Preserving Energy Theft Detection in Smart Grids: A P2P Computing

Approach

Publication Title: Selected Areas in Communications, IEEE Journal on

Posted Online Date: Fri Aug 23 00:00:00 EDT 2013

Authors: Salinas, Sergio; Li, Ming; Li, Pan


Article Title: A demonstration of collaborative Web services and peer-to-peer grids

Publication Title: Information Technology: Coding and Computing, 2004. Proceedings. ITCC

2004. International Conference on

ISBN: 0-7695-2108-8

Posted Online Date: Tue Aug 24 00:00:00 EDT 2004

Authors: Minjun Wang; Fox, G.; Pallickara, S.

Article Title: Mining for statistical models of availability in large-scale distributed systems: An

empirical study of SETI@home

Publication Title: Modeling, Analysis & Simulation of Computer and Telecommunication

Systems, 2009. MASCOTS '09. IEEE International Symposium on

ISBN: 978-1-4244-4927-9

Posted Online Date: Mon Dec 28 00:00:00 EST 2009

Authors: Javadi, B.; Kondo, D.; Vincent, J.-M.; Anderson, D.P.

Article Title: A Novel P2P Network Model for Cloud Computing Based on Game Theory

Publication Title: Computer Science & Service System (CSSS), 2012 International Conference on

ISBN: 978-1-4673-0721-5

Posted Online Date: Mon Dec 31 00:00:00 EST 2012

Authors: ShouQing Qi; Lei Yang; Hai Mei Xu; LongChen Qi


Article Title: Virtualization in Grid

Publication Title: Advanced Computing and Communications, 2008. ADCOM 2008. 16th

International Conference on

ISBN: 978-1-4244-2962-2

Posted Online Date: Fri Jan 23 00:00:00 EST 2009

Authors: Thamarai Selvi, S.

Article Title: Study on Computing Grid Distributed Middleware and Its Application

Publication Title: Information Technology and Applications, 2009. IFITA '09. International Forum

on

ISBN: 978-0-7695-3600-2

Posted Online Date: Fri Sep 04 00:00:00 EDT 2009

Authors: Shengxian Luo; Xiaochuan Peng; Shengbo Fan; Peiyu Zhang

Article Title: Optimal Client-Server Assignment for Internet Distributed Systems

Publication Title: Parallel and Distributed Systems, IEEE Transactions on

Posted Online Date: Tue Jan 22 00:00:00 EST 2013

Authors: Nishida, H.; Nguyen, T.

Article Title: The comparison between cloud computing and grid computing

Publication Title: Computer Application and System Modeling (ICCASM), 2010 International

Conference on


ISBN: 978-1-4244-7235-2

Posted Online Date: Thu Nov 04 00:00:00 EDT 2010

Authors: Shuai Zhang; Xuebin Chen; Shufen Zhang; Xiuzhen Huo

Article Title: Adaptive Peer to Peer Resource Discovery in Grid Computing Based on

Reinforcement Learning

Publication Title: Software Engineering, Artificial Intelligence, Networking and

Parallel/Distributed Computing (SNPD), 2011 12th ACIS International Conference on

ISBN: 978-1-4577-0896-1

Posted Online Date: Mon Oct 31 00:00:00 EDT 2011

Authors: Jamali, M.A.J.; Sani, Y.

Article Title: Quality of Service of Grid Computing: Resource Sharing

Publication Title: Grid and Cooperative Computing, 2007. GCC 2007. Sixth International

Conference on

ISBN: 0-7695-2871-6

Posted Online Date: Mon Aug 27 00:00:00 EDT 2007

Authors: Xian-He Sun; Wu, Ming

Article Title: Grid computing as applied distributed computation: a graduate seminar on

Internet and Grid computing

Publication Title: Cluster Computing and the Grid, 2004. CCGrid 2004. IEEE International

Symposium on

ISBN: 0-7803-8430-X

Posted Online Date: Mon Sep 27 00:00:00 EDT 2004

Authors: Browne, J.C.


Article Title: Toward Internet distributed computing

Publication Title: Computer

Posted Online Date: Tue May 13 00:00:00 EDT 2003

Authors: Milenkovic, M.; Robinson, S.H.; Knauerhase, R.C.; Barkai, D.; Garg, S.; Tewari, V.;

Anderson, T.A.; Bowman, M.

Article Title: ET or EC? [SETI@Home project]

Publication Title: Antennas and Propagation Magazine, IEEE

Posted Online Date: Wed Aug 07 00:00:00 EDT 2002

Authors: Bansal, Rajeev

Article Title: Hierarchically distributed Peer-to-Peer architecture for computational grid

Publication Title: Green High Performance Computing (ICGHPC), 2013 IEEE International

Conference on

ISBN: 978-1-4673-2592-9

Posted Online Date: Mon Jun 17 00:00:00 EDT 2013

Authors: Gomathi, S.; Manimegalai, D.

Article Title: An Effective Self-adaptive Load Balancing Algorithm for Peer-to-Peer Networks

Publication Title: Parallel and Distributed Processing Symposium Workshops & PhD Forum

(IPDPSW), 2012 IEEE 26th International

ISBN: 978-1-4673-0974-5

Posted Online Date: Mon Aug 20 00:00:00 EDT 2012

Authors: Naixue Xiong; Kaihua Xu; Lilong Chen; Yang, L.T.; Yuhua Liu


Article Title: Discouraging free riding in a peer-to-peer CPU-sharing grid

Publication Title: High performance Distributed Computing, 2004. Proceedings. 13th IEEE

International Symposium on

ISBN: 0-7695-2175-4

Posted Online Date: Tue Aug 24 00:00:00 EDT 2004

Authors: Andrade, N.; Brasileiro, F.; Cirne, W.; Mowbray, M.

Article Title: A Scheduling Algorithm with Adaptive Allocation for Monte Carlo Simulation in P2P

Grid

Publication Title: Hybrid Information Technology, 2006. ICHIT '06. International Conference on

ISBN: 0-7695-2674-8

Posted Online Date: Mon Dec 11 00:00:00 EST 2006

Authors: Seok Myun Kwon; Jin Suk Kim

Article Title: Result verification and trust-based scheduling in peer-to-peer grids

Publication Title: Peer-to-Peer Computing, 2005. P2P 2005. Fifth IEEE International Conference

on

ISBN: 0-7695-2376-5

Posted Online Date: Mon Dec 12 00:00:00 EST 2005

Authors: Shanyu Zhao; Lo, V.; Dickey, C.G.


Article Title: Semantic P2P Networks: Future Architecture of Cloud Computing

Publication Title: Networking and Distributed Computing (ICNDC), 2011 Second International

Conference on

ISBN: 978-1-4577-0407-9

Posted Online Date: Mon Oct 17 00:00:00 EDT 2011

Authors: Huang, Lican

Article Title: A Network Substrate for Peer-to-Peer Grid Computing beyond Embarrassingly

Parallel Applications

Publication Title: Communications and Mobile Computing, 2009. CMC '09. WRI International

Conference on

ISBN: 978-0-7695-3501-2

Posted Online Date: Wed Mar 04 00:00:00 EST 2009

Authors: Schulz, S.; Blochinger, W.; Poths, M.

Article Title: An adaptive decentralized scheduling mechanism for peer-to-peer Desktop Grids

Publication Title: Computer Engineering & Systems, 2008. ICCES 2008. International Conference

on

ISBN: 978-1-4244-2115-2

Posted Online Date: Tue Feb 03 00:00:00 EST 2009

Authors: Azab, A.A.; Kholidy, H.A.


Article Title: How to Make BOINC-Based Desktop Grids Even More Popular?

Publication Title: Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011

IEEE International Symposium on

ISBN: 978-1-61284-425-1

Posted Online Date: Thu Sep 01 00:00:00 EDT 2011

Authors: Kacsuk, P.

Article Title: Trust based interoperability security protocol for grid and Cloud computing

Publication Title: Computing Communication & Networking Technologies (ICCCNT), 2012 Third

International Conference on

Posted Online Date: Mon Dec 31 00:00:00 EST 2012

Authors: Rajagopal, R.; Chitra, M.

Article Title: On the evaluation of services selection algorithms in multi-service P2P grids

Publication Title: Integrated Network Management-Workshops, 2009. IM '09. IFIP/IEEE

International Symposium on

ISBN: 978-1-4244-3923-2

Posted Online Date: Fri Aug 07 00:00:00 EDT 2009

Authors: Coelho, A.; Brasileiro, F.

Article Title: The Clouds distributed operating system

Publication Title: Computer

Posted Online Date: Tue Aug 06 00:00:00 EDT 2002

Authors: Dasgupta, P.; LeBlanc, R.J., Jr.; Ahamad, M.; Umakishore Ramachandran


Article Title: Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery

Publication Title: Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP

2007. Third International Conference on

ISBN: 978-0-7695-2994-1

Posted Online Date: Mon Feb 25 00:00:00 EST 2008

Authors: Zenggang Xiong; Yang Yang; Xuemin Zhang; Fu Chen; Li Liu

Article Title: Cluster, grid and cloud computing: A detailed comparison

Publication Title: Computer Science & Education (ICCSE), 2011 6th International Conference on

ISBN: 978-1-4244-9717-1

Posted Online Date: Mon Sep 26 00:00:00 EDT 2011

Authors: Sadashiv, N.; Kumar, S.M.D.

Article Title: Research of P2P architecture based on cloud computing

Publication Title: Intelligent Computing and Integrated Systems (ICISS), 2010 International

Conference on

ISBN: 978-1-4244-6834-8

Posted Online Date: Fri Dec 03 00:00:00 EST 2010

Authors: Zhao Peng; Huang Ting-lei; Liu Cai-xia; Xin Wang


Article Title: Grid Virtualization Engine: Design, Implementation, and Evaluation

Publication Title: Systems Journal, IEEE

Posted Online Date: Tue Jan 26 00:00:00 EST 2010

Authors: Lizhe Wang; von Laszewski, G.; Jie Tao; Kunze, M.

Article Title: Integration of cloud computing and P2P: A future storage infrastructure

Publication Title: Quality, Reliability, Risk, Maintenance, and Safety Engineering (ICQR2MSE),

2012 International Conference on

ISBN: 978-1-4673-0786-4

Posted Online Date: Mon Jul 23 00:00:00 EDT 2012

Authors: Hai-Mei Xu; Yan-Jun Shi; Yu-Lin Liu; Fu-Bing Gao; Tao Wan

Article Title: Increasing GP Computing Power for Free via Desktop GRID Computing and

Virtualization

Publication Title: Parallel, Distributed and Network-based Processing, 2009 17th Euromicro

International Conference on

ISBN: 978-0-7695-3544-9

Posted Online Date: Fri May 08 00:00:00 EDT 2009

Authors: Gonzalez, D.L.; de Vega, F.F.; Trujillo, L.; Olague, G.; Araujo, L.; Castillo, P.; Merelo, J.J.;

Sharman, K.



Article Title: Grid computing security mechanisms: State-of-the-art

Publication Title: Multimedia Computing and Systems, 2009. ICMCS '09. International

Conference on

ISBN: 978-1-4244-3756-6

Posted Online Date: Tue Sep 22 00:00:00 EDT 2009

Authors: Bendahmane, A.; Essaaidi, M.; El Moussaoui, A.; Younes, A.

Article Title: Approach of a UML profile for Berkeley Open Infrastructure for network computing

(BOINC)

Publication Title: Computer Applications and Industrial Electronics (ICCAIE), 2011 IEEE

International Conference on

ISBN: 978-1-4577-2058-1

Posted Online Date: Thu Mar 01 00:00:00 EST 2012

Authors: Ries, C.B.; Schroder, C.; Grout, V.


Article Title: Grid Computing Security: A Taxonomy

Publication Title: Security & Privacy, IEEE

Posted Online Date: Thu Feb 07 00:00:00 EST 2008

Authors: Chakrabarti, A.; Damodaran, A.; Sengupta, S.

Article Title: The Geographic Structure Query Language Based on the Peer to Peer and

Cooperating Computing Hybrid Discovering Model

Publication Title: Wireless Communications Networking and Mobile Computing (WiCOM), 2010

6th International Conference on

ISBN: 978-1-4244-3708-5

Posted Online Date: Thu Oct 14 00:00:00 EDT 2010

Authors: Chen, Z.L.; Wu, L.; Xie, Z.

Article Title: V-BOINC: The Virtualization of BOINC

Publication Title: Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM

International Symposium on

ISBN: 978-1-4673-6465-2

Posted Online Date: Mon Jun 24 00:00:00 EDT 2013

Authors: McGilvary, G.A.; Barker, A.; Lloyd, A.; Atkinson, M.


Article Title: Friend-to-Friend Computing - Instant Messaging Based Spontaneous Desktop Grid

Publication Title: Internet and Web Applications and Services, 2008. ICIW '08. Third

International Conference on

ISBN: 978-0-7695-3163-2

Posted Online Date: Fri Jun 20 00:00:00 EDT 2008

Authors: Norbisrath, U.; Kraaner, K.; Vainikko, E.; Batrasev, O.

Article Title: Research on cloud computing security problem and strategy

Publication Title: Consumer Electronics, Communications and Networks (CECNet), 2012 2nd

International Conference on

ISBN: 978-1-4577-1414-6

Posted Online Date: Thu May 17 00:00:00 EDT 2012

Authors: Wentao Liu

Article Title: Discovering Statistical Models of Availability in Large Distributed Systems: An

Empirical Study of SETI@home

Publication Title: Parallel and Distributed Systems, IEEE Transactions on

Posted Online Date: Mon Sep 26 00:00:00 EDT 2011

Authors: Javadi, B.; Kondo, D.; Vincent, J.-M.; Anderson, D.P.



Article Title: Cloud Computing: Distributed Internet Computing for IT and Scientific Research

Publication Title: Internet Computing, IEEE

Posted Online Date: Wed Sep 09 00:00:00 EDT 2009

Authors: Dikaiakos, M.D.; Katsaros, D.; Mehra, P.; Pallis, G.; Vakali, A.

Article Title: Scalability issues in cloud computing

Publication Title: Advanced Computing (ICoAC), 2012 Fourth International Conference on

ISBN: 978-1-4673-5583-4

Posted Online Date: Thu Jan 24 00:00:00 EST 2013

Authors: Somasundaram, T.S.; Prabha, V.; Arumugam, M.

