Date post: 21-Dec-2015
Distributed Scheduling in Sombrero, a Single Address Space Distributed Operating System

Milind Patil

Contents

• Distributed Scheduling
• Features of Sombrero
• Goals
• Related Work
• Platform for Distributed Scheduling
• Distributed Scheduling Algorithm (Simulation)
• Scaling of the Algorithm (Simulation)
• Initiation of Porting to Sombrero Prototype
• Testing
• Conclusion
• Future Work

Distributed Scheduling

• A distributed scheduling algorithm provides for sharing as well as better usage of resources across the system.

• The algorithm will allow threads in the distributed system to be scheduled among the different processors in such a manner that CPU usage is balanced.

Features of Sombrero

Distributed scheduling in Sombrero takes advantage of the distributed SASOS features:

• The shared memory inherent to a distributed SASOS provides an excellent mechanism to distribute load information of the nodes in the system (information policy).

• The ability of threads to migrate in a simple manner across machines has a potentially far-reaching effect on the performance of the distributed scheduling mechanism.

Features of Sombrero (contd.)

• The granularity of migration is a thread, not a process. This allows the distributed scheduling algorithm to have a flexible selection policy (the policy that determines which thread is to be transferred to achieve load balancing).

• This feature also reduces the software complexity of the algorithm.

Goals

• Platform for Distributed Scheduling

• Simulation of Distributed Scheduling Algorithm

• Scaling of the Algorithm (Simulation)

• Initiation of Porting to Sombrero Prototype

Related Work

Load-balancing algorithms for:

• Sprite
• PVM
• Condor
• UNIX

Requirements

• A working prototype of Sombrero is needed that has the ability to manage extremely large data sets across a network in a distributed single address space.

• A functional prototype is needed which implements essential features such as protection domains, Sombrero thread support, token tracking support, etc.

The prototype is under construction and is not yet available as a development platform. Windows NT is used instead, since the prototype is being developed on it.

Sombrero Node

[Figure: Architecture of Sombrero Nodes. Labels: Load Table; two Sombrero Nodes, each with Local Thread Information, Selection Policy, Communication Thread, and Distributed Scheduler; Thread Migration.]

Sombrero Clusters

[Figure: Clusters I, II, and III. Labels: RMOCBs 0x1000 through 0x7000; Routers 0x1 and 0x11; Load Tables at 0x1000, 0x2000, and 0x5000.]

The Sombrero system is organized into hierarchies of clusters for scalable distributed scheduling.

Sombrero Router

[Figure: Architecture of Sombrero Routers. Labels: I/O Completion Port; Service Threads; seven Sockets.]

Inter-node Communication

Sombrero nodes communicate with each other through the routers.

[Figure: Clusters I and II. Labels: RMOCBs 0x1000, 0x2000, and 0x3000; Routers 0x1 and 0x11.]

Router Tables

Router 0x1

      RMOCB                SUBNET MASK          REMOTE   IP ADDRESS
A:    0                    0xC000000000000000   NULL     …
B:    0x8000000000000000   0x8000000000000000   NULL     …
R3:   0x3                  xx                            …

Remote table for the R3 entry:

      RMOCB                SUBNET MASK
      0x4000000000000000   0xC000000000000000
      1                    xx

[Figure: nodes A and B attached to router R1; nodes C and D attached to router R3.]

Router Tables (contd.)

Router 0x3

      RMOCB                SUBNET MASK          REMOTE   IP ADDRESS
C:    0x4000000000000000   0xE000000000000000   NULL     …
D:    0x6000000000000000   0xE000000000000000   NULL     …
R1:   0x1                  xx                            …

Remote table for the R1 entry:

      RMOCB                SUBNET MASK
      0                    0xC000000000000000
      0x8000000000000000   0x8000000000000000
      1                    xx

[Figure: nodes A and B attached to router R1; nodes C and D attached to router R3.]
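One way to read the two router tables is as a masked-address match: an address belongs to an entry when the address ANDed with the entry's SUBNET MASK equals its RMOCB value. The sketch below is only an illustration of that reading. The `Route` type, the `next_hop` function, and the flattening of each remote table into the local list are assumptions made for the example, not the router's actual code.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    name: str   # hypothetical label for the destination (node or router)
    rmocb: int  # value from the RMOCB column
    mask: int   # value from the SUBNET MASK column


# Entries transcribed from the Router 0x1 slide; the R3 row uses the
# RMOCB/mask pair from its remote table (an assumed flattening).
ROUTER_0X1 = [
    Route("A", 0x0000_0000_0000_0000, 0xC000_0000_0000_0000),
    Route("B", 0x8000_0000_0000_0000, 0x8000_0000_0000_0000),
    Route("R3", 0x4000_0000_0000_0000, 0xC000_0000_0000_0000),
]

# Entries transcribed from the Router 0x3 slide, flattened the same way.
ROUTER_0X3 = [
    Route("C", 0x4000_0000_0000_0000, 0xE000_0000_0000_0000),
    Route("D", 0x6000_0000_0000_0000, 0xE000_0000_0000_0000),
    Route("R1", 0x0000_0000_0000_0000, 0xC000_0000_0000_0000),
    Route("R1", 0x8000_0000_0000_0000, 0x8000_0000_0000_0000),
]


def next_hop(table: list, address: int) -> Optional[str]:
    """Return the name of the first entry whose masked range contains address."""
    for route in table:
        if (address & route.mask) == route.rmocb:
            return route.name
    return None


print(next_hop(ROUTER_0X1, 0x5000_0000_0000_0000))
print(next_hop(ROUTER_0X3, 0x5000_0000_0000_0000))
```

Under this reading, an address in node C's range is forwarded by Router 0x1 to R3, which then resolves it to C locally.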

Address Space Allocation

This project implements an address space allocation mechanism to distribute the 2^64-byte address space amongst the nodes in the system.

Example: Consider a system of four Sombrero nodes (A, B, C and D). The nodes come online for the very first time in the order A, B, C and D.

[Figure: nodes A and B attached to router R1; nodes C and D attached to router R3.]

Address Space Allocation (contd.)

• The address space allocated for the nodes when A is initialized will be:
  A: 0x0000000000000000 – 0xffffffffffffffff

• The address space allocated for the nodes when B is initialized will be:
  A: 0x0000000000000000 – 0x7fffffffffffffff
  B: 0x8000000000000000 – 0xffffffffffffffff

Address Space Allocation (contd.)

• The address space allocated for the nodes when C is initialized will be:
  A: 0x0000000000000000 – 0x3fffffffffffffff
  B: 0x8000000000000000 – 0xffffffffffffffff
  C: 0x4000000000000000 – 0x7fffffffffffffff

• The address space allocated for the nodes when D is initialized will be:
  A: 0x0000000000000000 – 0x3fffffffffffffff
  B: 0x8000000000000000 – 0xffffffffffffffff
  C: 0x4000000000000000 – 0x5fffffffffffffff
  D: 0x6000000000000000 – 0x7fffffffffffffff
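The four-node example is consistent with a simple halving rule: an existing node keeps the lower half of its range and the newcomer receives the upper half. The sketch below reproduces the ranges listed above under that assumption. The `split` function and its `donor` parameter are hypothetical; the slides do not say how the node that gives up half of its range is chosen, so the donor is passed in explicitly.

```python
ADDRESS_SPACE_END = 2**64 - 1  # last byte of the 64-bit address space


def split(ranges: dict, donor: str, newcomer: str) -> dict:
    """Give newcomer the upper half of donor's range; donor keeps the lower half."""
    low, high = ranges[donor]
    mid = low + (high - low + 1) // 2
    updated = dict(ranges)
    updated[donor] = (low, mid - 1)
    updated[newcomer] = (mid, high)
    return updated


# Nodes come online in the order A, B, C, D. The donor for each step is
# picked by hand here to match the example (an assumption of this sketch).
ranges = {"A": (0, ADDRESS_SPACE_END)}
ranges = split(ranges, "A", "B")
ranges = split(ranges, "A", "C")
ranges = split(ranges, "C", "D")

for node in sorted(ranges):
    low, high = ranges[node]
    print(f"{node}: 0x{low:016x} - 0x{high:016x}")
```

Because every range produced this way is a power-of-two block aligned on its own size, each one can be described by a single base value and mask, which is the form the router tables use.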

Load Measurement

A node’s workload can be estimated based on some measurable parameters:

• Total number of threads on the node at the time of load measurement.
• Instruction mixes of these threads (I/O bound or CPU bound).

Load Measurement (contd.)

$$\text{Work Load} = \sum_i (p_i \times f_i)$$

p: processor utilization of a thread
f: heuristic factor (adjusts the importance of a thread depending on how it is being used)

The heuristic factor ‘f’ should have a large value for I/O intensive threads and a small value for CPU intensive threads. The values of the heuristic factor can be empirically determined by using a fully functional Sombrero prototype.
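As a small illustration of the work load formula, the sketch below sums the product of processor utilization and heuristic factor over a node's threads. The function name `work_load` and the sample values are invented for the example; the slides state that real values of f would have to be determined empirically on a working prototype.

```python
def work_load(threads):
    """Sum p_i * f_i over (p, f) pairs, one pair per thread on the node."""
    return sum(p * f for p, f in threads)


# Hypothetical sample: two threads as (processor utilization, heuristic factor).
sample = [(0.5, 2.0), (0.25, 4.0)]
print(work_load(sample))
```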

Load Measurement - Simulation

• In the simulation we assume that the processor utilization of all threads is the same. This is sufficient to prove the correctness of the algorithm.
• The measure of load at the node level is the number of Sombrero threads.
• A threshold policy has been defined:
  high: number of Sombrero threads ≥ HIGHLOAD
  low: number of Sombrero threads < MEDIUMLOAD
  medium: number of Sombrero threads < HIGHLOAD and number of Sombrero threads ≥ MEDIUMLOAD
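The threshold policy amounts to a three-way classification of the thread count. A minimal sketch follows, assuming placeholder values for MEDIUMLOAD and HIGHLOAD; the slides do not give the numbers used in the simulation.

```python
MEDIUMLOAD = 4  # placeholder value, not taken from the slides
HIGHLOAD = 8    # placeholder value, not taken from the slides


def load_level(thread_count, medium=MEDIUMLOAD, high=HIGHLOAD):
    """Classify a node by its number of Sombrero threads."""
    if thread_count >= high:
        return "high"
    if thread_count < medium:
        return "low"
    return "medium"


for count in (2, 4, 7, 8):
    print(count, load_level(count))
```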

Load Tables

• Shared memory is used to distribute load information. (In Sombrero the shared memory consistency is managed by the token tracking mechanism)

• One load table is needed for each cluster.

• Thresholds of load have been established to minimize the exchange of load information in the network. Only threshold crossings are recorded in the load table.
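To illustrate why recording only threshold crossings reduces traffic, the sketch below models a cluster load table that is written only when a node's level changes. The `LoadTable` class and its `report` method are hypothetical names for this example. In Sombrero the table would live in shared memory kept consistent by token tracking, which is not modelled here.

```python
class LoadTable:
    """Per-cluster table of node load levels, updated only on level changes."""

    def __init__(self):
        self.levels = {}  # node name -> "low", "medium", or "high"
        self.writes = 0   # number of updates actually recorded

    def report(self, node, level):
        """Record level for node; return True only if a threshold was crossed."""
        if self.levels.get(node) == level:
            return False  # same level as before, nothing to propagate
        self.levels[node] = level
        self.writes += 1
        return True


table = LoadTable()
for level in ["low", "low", "medium", "medium", "high", "high"]:
    table.report("node-1", level)
print(table.writes)
```

Six load measurements result in only three recorded updates, one per threshold crossing (counting the initial entry).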

Distributed Scheduling Algorithm

• Highly loaded nodes in minority: Sender Initiated Algorithm
• Lightly loaded nodes in minority: Receiver Initiated Algorithm
• Medium loaded nodes are not considered

[Figure legend: highly loaded nodes; lightly loaded nodes.]

Distributed Scheduling Algorithm

The algorithm used is dynamic, i.e. sender initiated at lower loads and receiver initiated at higher loads.

1. Nodes loaded in the medium range do not participate in load balancing.

2. Load balancing is not done if the node belongs to the majority (the larger of the groups of highly or lightly loaded nodes).

Distributed Scheduling Algorithm

3. Load balancing is done if the node belongs to the minority (the smaller of the groups of highly or lightly loaded nodes).

   • If the node is heavily loaded, the algorithm is sender initiated: a lightly loaded node is chosen at random and the RGETTHREADS message protocol is followed for thread migration.
   • If the node is lightly loaded, the algorithm is receiver initiated: a highly loaded node is chosen at random and the GETTHREADS message protocol is followed for thread migration.
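The three rules can be sketched as a single decision function that a node evaluates against its cluster's load table. This is an illustrative reading, not the simulator's code: the function name, the dictionary representation of the load table, and the returned tuple are assumptions. The slides also do not say what happens when the two groups are the same size, so the sketch arbitrarily lets the highly loaded side act in that case.

```python
import random


def choose_action(node, load_table, rng=random):
    """Return (protocol, target) if node should start load balancing, else None.

    load_table maps node names to "high", "medium", or "low".
    """
    own = load_table[node]
    high = [n for n, level in load_table.items() if level == "high"]
    low = [n for n, level in load_table.items() if level == "low"]

    # Rule 1: medium nodes never participate. With one group empty there is
    # also no partner to balance with.
    if own == "medium" or not high or not low:
        return None
    # Rule 3, sender initiated: highly loaded nodes are the minority.
    # Assumption: on a tie the highly loaded side acts.
    if own == "high" and len(high) <= len(low):
        return ("RGETTHREADS", rng.choice(low))
    # Rule 3, receiver initiated: lightly loaded nodes are the minority.
    if own == "low" and len(low) < len(high):
        return ("GETTHREADS", rng.choice(high))
    # Rule 2: the node belongs to the majority and stays passive.
    return None


table = {"n1": "high", "n2": "low", "n3": "low", "n4": "medium"}
print(choose_action("n1", table))
print(choose_action("n2", table))
```

Letting only the minority act bounds the number of nodes that generate messages at any one time, whichever way the overall load is skewed.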

Scaling the Algorithm

• Aggregating the clusters provides scalability.
• Thresholds for clusters are defined as follows:
  high: no cluster members are lightly loaded and at least one member is highly loaded
  low: no cluster members are highly loaded and at least one member is lightly loaded
  medium: all other cases of loads, where load balancing can occur within the cluster members or when all members of the cluster are medium loaded
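The cluster thresholds can be expressed as a function of the member counts, using the same [high, medium, low] ordering as the cluster notation in the test tables. The function name `cluster_level` is a hypothetical choice for this sketch.

```python
def cluster_level(high, medium, low):
    """Classify a cluster from its counts of high, medium, and low members."""
    if high > 0 and low == 0:
        return "high"
    if low > 0 and high == 0:
        return "low"
    # Either both extremes are present (balancing can happen inside the
    # cluster) or every member is medium loaded.
    return "medium"


print(cluster_level(1, 7, 0))
print(cluster_level(0, 0, 8))
print(cluster_level(0, 8, 0))
```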

Scaling the Algorithm

1. At any level of cluster, only the nodes belonging to the minority group at that level will be active.

2. Load balancing at an nth-level cluster will be attempted every (n × SOMECONSTANT) times the number of unsuccessful attempts at the node level.

3. A suitable nth-level target cluster is found through the corresponding load table and the TRANSFERREQUEST message protocol is followed for thread migration.

[Figure: cluster hierarchy with levels n=1, n=2, and n=3.]

Testing Eight Nodes

Table 1. Testing Eight Nodes

Cluster (Before Load Balancing)   Cluster (After Load Balancing)   Messages, O(n)
[1,0,7]                           [0,1,7]                          3
[2,0,6]                           [0,2,6]                          6
[3,0,5]                           [0,3,5]                          9
[4,0,4]                           [0,4,4]                          12
[5,0,3]                           [0,5,3]                          10
[6,0,2]                           [0,6,2]                          12
[7,0,1]                           [0,7,1]                          14

Cluster: [# of highly loaded nodes, # of medium loaded nodes, # of lightly loaded nodes]

Testing Three Clusters

[Figure: cluster hierarchy with levels n=1 and n=2.]

Table 5. Testing Three Clusters

          Before Load Balancing                 After Load Balancing
Load      Cluster I  Cluster II  Cluster III    Cluster I  Cluster II  Cluster III    Messages, O(n)
{L,L,L}   [2,0,6]    [2,0,6]     [2,0,6]        [0,2,6]    [0,2,6]     [0,2,6]        18
{L,L,M}   [2,0,6]    [2,0,6]     [0,8,0]        [0,2,6]    [0,2,6]     [0,8,0]        12
{L,M,M}   [2,0,6]    [0,8,0]     [0,8,0]        [0,2,6]    [0,8,0]     [0,8,0]        6
{L,L,H}   [0,0,8]    [0,0,8]     [4,4,0]        [0,0,8]    [0,0,8]     [0,8,0]        14
{L,M,H}   [0,0,8]    [0,8,0]     [1,7,0]        [0,0,8]    [0,8,0]     [0,8,0]        5
{L,H,H}   [0,0,8]    [8,0,0]     [1,7,0]        [0,0,8]    [0,8,0]     [0,8,0]        31
{M,H,H}   [0,8,0]    [8,0,0]     [8,0,0]        [0,8,0]    [8,0,0]     [8,0,0]        -

Testing Six Clusters at Two Levels

[Figure: cluster hierarchy with levels n=1, n=2, and n=3.]

Table 6. Testing Six Clusters at Two Levels

          Cluster A                            Cluster B
          I          II         III            IV         V          VI           Messages, O(n)
Before    [1,7,0] H  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M    5
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Before    [2,6,0] H  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M    8
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Before    [6,2,0] H  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M    20
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Before    [7,1,0] H  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M    23
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Before    [8,0,0] H  [8,0,0] H  [2,6,0] H      [0,0,8] L  [0,8,0] M  [0,8,0] M    60
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Before    [8,0,0] H  [8,0,0] H  [3,5,0] H      [0,0,8] L  [0,8,0] M  [0,8,0] M    63
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Before    [8,0,0] H  [8,0,0] H  [8,0,0] H      [0,0,8] L  [0,8,0] M  [0,8,0] M    81
After     [0,8,0] M  [0,8,0] M  [0,8,0] M      [0,0,8] L  [0,8,0] M  [0,8,0] M

Conclusion

•The testing of distributed scheduling using the simulator verifies that the algorithm functions correctly.

•It is observed that the increase in number of messages is proportional to the increase in number of heavily loaded nodes.

•The number of messages required for load balancing at the first level and above is the same if the ratio of heavily and lightly loaded nodes is kept constant at both levels.

Conclusion (contd.)

•Only one additional load table is required per additional cluster. Hence, the required number of messages is expected to increase by a small constant factor as the level of clustering increases.

•It can be concluded that the algorithm’s complexity is O(n) where n is the number of highly loaded nodes.

Future Work

•Porting of code from NT to Sombrero for the Sombrero node - communication code.

•Changing definition of load measurement to the more general formula.

•Reuse code from the Sombrero router.

•Adaptive cluster forming algorithm.

Acknowledgements

Dr. Donald Miller

Dr. Rida Bazzi

Dr. Bruce Millard

Mr. Alan Skousen

Mr. Raghavendra Hebbalalu

Mr. Ravikanth Nasika

Mr. Tom Boyd
