Job Allocation Schemes in Computational Grids based on Cost Optimization
Satish Penmatsa
Joint work with: Dr. A. T. Chronopoulos
in Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS05)
Job Scheduling: problem formulation
• Given a large number of jobs, find the allocation of jobs to computers that optimizes a given objective function (e.g., total execution time or total cost).
Talk Outline
• Introduction to Grid Computing
• Pricing Model
• System Model
• Price based Job Allocation Schemes
• Experimental results
• Conclusions
Grid Computing
• Grid is a type of parallel / distributed system.
– Enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime.
– Depends on the resource availability, capability, performance, cost, and users’ quality-of-service requirements.
• Difference between a Grid and a Cluster?
– The key distinction is in the way resources are managed.
• Computational grid: Tries to solve problems or run applications by allocating idle computing resources over a network or the internet.
• These computational resources have different owners, who can be engaged by the grid controllers through an automated negotiation mechanism.
Pricing Model
[Ghosh et al. '04]
• Incomplete-information, alternating-offer, non-cooperative bargaining game
• Players are the Grid Servers and the Computers
• Reserved valuations
• The server has to play an independent game with each computer associated with it to form the price per unit resource vector, $p_j$.
• In a system with m servers and n computers at time t, we have m × n bargaining games.
[Diagram: a Grid Server Pool (S1, S2, ..., Sm) facing a Computing Resource Pool (C1, C2, ..., Cn); every server-computer pair is linked by its own bargaining game, BG11 through BGmn.]
Sj: j-th Grid Server; Ci: i-th Computer; BGji: Bargaining Game between the j-th Server and the i-th Computer
Figure 1: Bargaining game mapping between the grid servers and computers
The Bargaining Protocol
• One of the players starts the game.
• If the server starts the game, it proposes an offer which will be much less than its own reserved valuation.
• If the offered price ≥ the computer’s standard price with the highest expected surplus, then the computer accepts the offer.
• Else, the computer makes a counter offer.
• If this counter offer ≤ the server’s standard price with the highest expected surplus, then the server accepts.
• Else the server counter offers again.
• This procedure continues until an agreement is reached.
• Grid Server: E[Surplus] = (reserved valuation of server − standard price of server) × probability(standard price)
• Computer: E[Surplus] = (standard price of computer − reserved valuation of computer) × probability(standard price)
• The standard prices are the candidate offer prices that a player uses to compute its expected surplus.
• probability(standard price) is the player's own estimate of the probability that the other player will accept that standard price.
Example
• Reserved valuation of the Grid server: $100;
• Reserved valuation of the Computer: $60;
• Let the Computer make an initial offer of $110;
Offered Price ($)   Probability   Expected Surplus ($)
40                  0.10           6
60                  0.40          16
80                  0.70          14
90                  0.90           9
100                 1.00           0
Table 1: Grid Server’s computation for making decision
Offered Price ($)   Probability   Expected Surplus ($)
60                  1.00           0
70                  0.90           9
80                  0.70          14
90                  0.40          12
110                 0.10           5
Table 2: Computer’s computation for making decision
Offered Price ($)   Updated Probability                   Expected Surplus ($)
40                  0.10 - 0.30 = 0.00 (floored at 0)     0
60                  0.40 - 0.30 = 0.10                    4
80                  0.70 - 0.30 = 0.40                    8
90                  0.90 - 0.30 = 0.60                    6
100                 1.00 - 0.30 = 0.70                    0
Table 3: Grid Server’s computation using modified probability for making decision
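The three tables can be reproduced with a few lines of Python. This is only a sketch of the expected-surplus arithmetic stated on the previous slides; the function names (`expected_surplus`, `surplus_table`, `best_standard_price`, `shift_probabilities`) are hypothetical, and the 0.30 probability reduction is simply the value that appears in Table 3, floored at zero.

```python
def expected_surplus(role, reserved, price, prob):
    """Expected surplus of one standard price for a 'server' or a 'computer'."""
    if role == "server":        # the server pays: it gains when the price is below its valuation
        margin = reserved - price
    elif role == "computer":    # the computer is paid: it gains when the price is above its valuation
        margin = price - reserved
    else:
        raise ValueError("role must be 'server' or 'computer'")
    return margin * prob


def surplus_table(role, reserved, offers):
    """Return (price, probability, expected surplus) rows for (price, probability) pairs."""
    return [(price, prob, expected_surplus(role, reserved, price, prob))
            for price, prob in offers]


def best_standard_price(role, reserved, offers):
    """Standard price with the highest expected surplus."""
    return max(surplus_table(role, reserved, offers), key=lambda row: row[2])[0]


def shift_probabilities(offers, delta):
    """Lower every acceptance probability by delta, never going below zero (as in Table 3)."""
    return [(price, max(0.0, round(prob - delta, 2))) for price, prob in offers]


SERVER_OFFERS = [(40, 0.10), (60, 0.40), (80, 0.70), (90, 0.90), (100, 1.00)]    # Table 1
COMPUTER_OFFERS = [(60, 1.00), (70, 0.90), (80, 0.70), (90, 0.40), (110, 0.10)]  # Table 2

if __name__ == "__main__":
    for price, prob, surplus in surplus_table("server", 100, SERVER_OFFERS):
        print(f"{price:>4} {prob:.2f} {surplus:6.2f}")
    target = best_standard_price("server", 100, SERVER_OFFERS)
    # The computer's opening offer of $110 is above the server's target price,
    # so under the protocol the server counter-offers instead of accepting.
    print("server accepts 110:", 110 <= target)
```

With the Table 1 data the server's best standard price is $60 (expected surplus 16), so the computer's opening offer of $110 is rejected; after the probabilities are reduced as in Table 3, the best standard price moves to $80.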
[Plot: Grid Server’s Expected Surplus (y-axis, -0.6 to 0.4) vs. Grid Server’s Offered Price (x-axis, 0 to 1); one curve for each of t=0, t=1, t=2, t=3.]
Figure 2: Expected surplus of the Grid server vs Offered prices
[Plot: Computer’s Expected Surplus (y-axis, -0.5 to 0.4) vs. Computer’s Offered Price (x-axis, 0 to 1); one curve for each of t=0, t=1, t=2, t=3.]
Figure 3: Expected surplus of the Computer vs Offered prices
[Diagram: jobs from the Grid Community arrive at the Grid Servers S1, S2, ..., Sm; each server Sj assigns its jobs to the Computers C1, C2, ..., Cn, with the job-assignment links labeled s11 through smn.]
Figure 4: Grid System Model
Notations & Assumptions
• m Grid Servers
• n Computers
• $\phi_j$ : Job arrival rate at server j; j = 1, ..., m
• $\Phi = \sum_{j=1}^{m} \phi_j$ : Total job arrival rate of the system
• $\mu_i$ : Average processing rate of computer i; i = 1, ..., n
• Each computer is modeled as an M/M/1 queuing system
• $\Phi < \sum_{i=1}^{n} \mu_i$
• $p_{ji}$ : Price per unit resource as agreed between server j and computer i
Notations & Assumptions (cont’d)
• $s_{ji}$ : Fraction of workload (jobs) that server j sends to computer i
• $s_j = (s_{j1}, s_{j2}, \ldots, s_{jn})$ denotes the workload fractions of server j
• The vector $s = (s_1, s_2, \ldots, s_m)$ denotes the load fractions of all the servers
• The expected response time at computer i is given by:
$$F_i(s) = \frac{1}{\mu_i - \sum_{j=1}^{m} s_{ji}\phi_j} \tag{1}$$
• Thus the overall expected cost of server j is given by:
$$D_j(s) = \sum_{i=1}^{n} k_i p_{ji} s_{ji} F_i(s) = \sum_{i=1}^{n} \frac{k_i p_{ji} s_{ji}}{\mu_i - \sum_{k=1}^{m} s_{ki}\phi_k} \tag{2}$$
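• The overall expected cost of the system (i.e. of all the servers) is given by:
$$D(s) = \frac{1}{\Phi} \sum_{j=1}^{m} \phi_j D_j(s) \tag{3}$$
which is equivalent to
$$D(s) = \frac{1}{\Phi} \sum_{j=1}^{m} \sum_{i=1}^{n} \frac{k_i p_{ji} \phi_j s_{ji}}{\mu_i - \sum_{k=1}^{m} s_{ki}\phi_k} \tag{4}$$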
subject to the constraints:
$$s_{ji} \ge 0, \quad i = 1, \ldots, n, \; j = 1, \ldots, m \tag{5}$$
$$\sum_{i=1}^{n} s_{ji} = 1, \quad j = 1, \ldots, m \tag{6}$$
$$\sum_{j=1}^{m} s_{ji}\phi_j < \mu_i, \quad i = 1, \ldots, n \tag{7}$$
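Equations (1) through (7) translate almost directly into code. The following is a minimal Python sketch, assuming the load fractions are stored as a list of rows `s[j][i]` and the prices as `p[j][i]`; the function names are hypothetical and the two-server, two-computer numbers are made up purely for illustration.

```python
def response_times(s, phi, mu):
    """Equation (1): F_i(s) = 1 / (mu_i - sum_j s[j][i] * phi[j]) for every computer i."""
    m, n = len(phi), len(mu)
    loads = [sum(s[j][i] * phi[j] for j in range(m)) for i in range(n)]
    return [1.0 / (mu[i] - loads[i]) for i in range(n)]


def server_cost(j, s, phi, mu, p, k):
    """Equation (2): expected cost D_j(s) of server j."""
    f = response_times(s, phi, mu)
    return sum(k[i] * p[j][i] * s[j][i] * f[i] for i in range(len(mu)))


def system_cost(s, phi, mu, p, k):
    """Equation (3): arrival-rate-weighted average of the server costs."""
    total_rate = sum(phi)
    return sum(phi[j] * server_cost(j, s, phi, mu, p, k)
               for j in range(len(phi))) / total_rate


def is_feasible(s, phi, mu, tol=1e-9):
    """Constraints (5)-(7): non-negative fractions, rows summing to 1, no overloaded computer."""
    m, n = len(phi), len(mu)
    nonneg = all(s[j][i] >= 0 for j in range(m) for i in range(n))
    rows_ok = all(abs(sum(s[j]) - 1.0) <= tol for j in range(m))
    stable = all(sum(s[j][i] * phi[j] for j in range(m)) < mu[i] for i in range(n))
    return nonneg and rows_ok and stable


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the experiments).
    phi = [4.0, 6.0]                 # job arrival rates of the two servers
    mu = [10.0, 20.0]                # processing rates of the two computers
    p = [[1.0, 1.0], [1.0, 1.0]]     # agreed prices p[j][i]
    k = [1.0, 1.0]                   # constants k_i
    s = [[0.5, 0.5], [0.5, 0.5]]     # every server splits its jobs evenly
    print(is_feasible(s, phi, mu), system_cost(s, phi, mu, p, k))
```

In this toy setting each computer receives 5 jobs/sec, so the response times are 1/5 and 1/15 and the system cost works out to 2/15.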
Price based Job Allocation Schemes
1. Global Optimal Scheme with Pricing (GOSP)
2. Nash Scheme with Pricing (NASHP)
1. Global Optimal Scheme with Pricing (GOSP)
• The load fractions (s) are obtained by solving the nonlinear optimization problem of minimizing D(s) in (4) subject to constraints (5)-(7), which gives the optimum expected cost of the system.
• Let $\mu_i^j = \mu_i - \sum_{k=1, k \ne j}^{m} s_{ki}\phi_k$ be the available processing rate at computer i as seen by server j.
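D(s) Solution:
Theorem 1: Assuming that computers are ordered in decreasing order of their available processing rates ($\mu_1^j \ge \mu_2^j \ge \ldots \ge \mu_n^j$), the load fractions for server j are given by:
$$s_{ji} = \begin{cases} \dfrac{1}{\phi_j}\left( \mu_i^j - \sqrt{k_i p_{ji} \mu_i}\; \dfrac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k}} \right) & \text{if } 1 \le i < c_j \\ 0 & \text{if } c_j \le i \le n \end{cases} \tag{8}$$
where $c_j$ is the minimum index that satisfies the inequality:
$$\mu_{c_j}^j \le \sqrt{k_{c_j} p_{jc_j} \mu_{c_j}}\; \frac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k}} \tag{9}$$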
Algorithm for solving D(s)
BEST-FRACTIONS(μ_1^j, ..., μ_n^j, φ_j, p_j1, ..., p_jn, k_1, ..., k_n)
Input: Available processing rates: μ_1^j, μ_2^j, ..., μ_n^j;
       Total arrival rate: φ_j
       The price per unit resource vector: p_j1, p_j2, ..., p_jn
       The constants vector: k_1, k_2, ..., k_n
Output: Load fractions: s_j1, s_j2, ..., s_jn;
1. Sort the computers in decreasing order of μ_i^j / sqrt(μ_i k_i p_ji), so that
   μ_1^j / sqrt(μ_1 k_1 p_j1) ≥ ... ≥ μ_n^j / sqrt(μ_n k_n p_jn);
2. t ← (Σ_{i=1..n} μ_i^j − φ_j) / (Σ_{i=1..n} sqrt(μ_i p_ji k_i))
3. while ( t ≥ μ_n^j / sqrt(μ_n k_n p_jn) ) do
       s_jn ← 0
       n ← n − 1
       t ← (Σ_{i=1..n} μ_i^j − φ_j) / (Σ_{i=1..n} sqrt(μ_i p_ji k_i))
4. for i = 1, ..., n do
       s_ji ← (1/φ_j) (μ_i^j − t sqrt(μ_i p_ji k_i))
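The BEST-FRACTIONS steps can be sketched in Python as follows. This is an illustrative transcription rather than the authors' code: the function name `best_fractions`, the explicit `mu` argument (the raw processing rates that appear under the square roots) and the sample numbers are assumptions, and the input is assumed to satisfy φ_j > 0.

```python
import math


def best_fractions(mu_avail, mu, phi_j, p_j, k):
    """Load fractions of one server, following the BEST-FRACTIONS steps.

    mu_avail[i]: processing rate of computer i still available to this server
    mu[i]:       raw processing rate of computer i
    phi_j:       job arrival rate of this server (must be positive)
    p_j[i]:      agreed price per unit resource at computer i
    k[i]:        constant of computer i
    """
    if phi_j <= 0:
        raise ValueError("phi_j must be positive")
    n = len(mu_avail)
    weight = [math.sqrt(mu[i] * k[i] * p_j[i]) for i in range(n)]

    # Step 1: order computers by decreasing mu_avail / sqrt(mu * k * p).
    active = sorted(range(n), key=lambda i: mu_avail[i] / weight[i], reverse=True)

    def level(indices):
        # Steps 2 and 3: the threshold t for the current set of computers.
        return (sum(mu_avail[i] for i in indices) - phi_j) / sum(weight[i] for i in indices)

    # Step 3: drop the least attractive computer while it would get no load.
    t = level(active)
    while len(active) > 1 and t >= mu_avail[active[-1]] / weight[active[-1]]:
        active.pop()
        t = level(active)

    # Step 4: fractions for the remaining computers; dropped ones stay at 0.
    s = [0.0] * n
    for i in active:
        s[i] = (mu_avail[i] - t * weight[i]) / phi_j
    return s


if __name__ == "__main__":
    # One server alone in the system, so the available rates equal the raw rates.
    print(best_fractions([10.0, 20.0], [10.0, 20.0], 6.0, [1.0, 1.0], [1.0, 1.0]))
    print(best_fractions([10.0, 20.0], [10.0, 20.0], 1.0, [1.0, 1.0], [1.0, 1.0]))
```

In the second call the arrival rate is low enough that the slower computer is dropped in step 3 and the faster one receives the whole load.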
A Distributed Algorithm
Server j, (j = 1, ..., m) executes:
1. Initialization:
       s_j^(0) ← 0;
       D_j^(0) ← 0;
       l ← 0;
       norm ← 1;
       sum ← 0;
       tag ← CONTINUE;
       left = [(j − 2) mod m] + 1;
       right = [j mod m] + 1;
2. while ( 1 ) do
       if (j = 1) {server 1}
           if (l ≠ 0)
               Recv(left, (norm, l, tag));
               if (norm < ε)
                   Send(right, (norm, l, STOP));
                   exit;
           sum ← 0;
           l ← l + 1;
       else {the other servers}
           Recv(left, (sum, l, tag));
           if (tag = STOP)
               if (j ≠ m) Send(right, (sum, l, STOP));
               exit;
       for i = 1, ..., n do
           Obtain μ_i^j by inspecting the run queue of each computer
           (μ_i^j ← μ_i − Σ_{k=1..m, k≠j} s_ki φ_k);
       s_j^(l) ← BEST-FRACTIONS(μ_1^j, ..., μ_n^j, φ_j, p_j1, ..., p_jn, k_1, ..., k_n);
       Compute D_j^(l);
       sum ← sum + |D_j^(l−1) − D_j^(l)|;
       Send(right, (sum, l, CONTINUE));
   endwhile
2. Nash Scheme with Pricing (NASHP)
• In this scheme each server tries to minimize the total cost of its jobs independently of the others.
• The load fractions are obtained by formulating the problem as a non-cooperative game among the servers.
• The goal of server j is to find a feasible job allocation strategy $s_j$ such that $D_j(s)$ in (2) is minimized.
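Dj(s) Solution:
Theorem 2: Assuming that computers are ordered in decreasing order of their available processing rates ($\mu_1^j \ge \mu_2^j \ge \ldots \ge \mu_n^j$), the solution $s_j$ of the optimization problem $D_j(s)$ is given by:
$$s_{ji} = \begin{cases} \dfrac{1}{\phi_j}\left( \mu_i^j - \sqrt{k_i p_{ji} \mu_i^j}\; \dfrac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k^j}} \right) & \text{if } 1 \le i < c_j \\ 0 & \text{if } c_j \le i \le n \end{cases} \tag{10}$$
where $c_j$ is the minimum index that satisfies the inequality:
$$\sqrt{\mu_{c_j}^j} \le \sqrt{k_{c_j} p_{jc_j}}\; \frac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k^j}} \tag{11}$$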
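One way to see Theorem 2 in action is to let the servers take turns computing their best replies until the costs stop changing, as in the distributed algorithm. The Python sketch below emulates that round-robin sequentially in a single process instead of passing messages around a ring; `best_reply`, `server_cost` and `nash_allocation` are hypothetical names, the computers are ordered by μ_i^j / sqrt(k_i p_ji μ_i^j) (an assumption mirroring step 1 of BEST-FRACTIONS), and the sample rates are invented for illustration.

```python
import math


def best_reply(mu_avail, phi_j, p_j, k):
    """Best reply of one server in the form of equation (10), given the rates left by the others."""
    n = len(mu_avail)
    weight = [math.sqrt(k[i] * p_j[i] * mu_avail[i]) for i in range(n)]
    active = sorted(range(n), key=lambda i: mu_avail[i] / weight[i], reverse=True)

    def level(indices):
        return (sum(mu_avail[i] for i in indices) - phi_j) / sum(weight[i] for i in indices)

    t = level(active)
    while len(active) > 1 and t >= mu_avail[active[-1]] / weight[active[-1]]:
        active.pop()                      # this computer would receive no load
        t = level(active)
    s = [0.0] * n
    for i in active:
        s[i] = (mu_avail[i] - t * weight[i]) / phi_j
    return s


def server_cost(j, s, phi, mu, p, k):
    """Equation (2): expected cost of server j under the allocation s."""
    m, n = len(phi), len(mu)
    cost = 0.0
    for i in range(n):
        load = sum(s[q][i] * phi[q] for q in range(m))
        cost += k[i] * p[j][i] * s[j][i] / (mu[i] - load)
    return cost


def nash_allocation(phi, mu, p, k, eps=1e-8, max_rounds=1000):
    """Round-robin best replies until the summed cost change of a round drops below eps."""
    m, n = len(phi), len(mu)
    s = [[0.0] * n for _ in range(m)]     # s_j^(0) = 0, as in the initialization step
    previous = [0.0] * m                  # D_j^(0) = 0
    for rounds in range(1, max_rounds + 1):
        norm = 0.0
        for j in range(m):
            # Rate of each computer still available to server j.
            mu_avail = [mu[i] - sum(s[q][i] * phi[q] for q in range(m) if q != j)
                        for i in range(n)]
            s[j] = best_reply(mu_avail, phi[j], p[j], k)
            cost = server_cost(j, s, phi, mu, p, k)
            norm += abs(previous[j] - cost)
            previous[j] = cost
        if norm < eps:
            return s, rounds
    return s, max_rounds                  # not converged within max_rounds


if __name__ == "__main__":
    phi = [10.0, 20.0]                    # illustrative arrival rates
    mu = [10.0, 20.0, 50.0]               # illustrative processing rates
    p = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
    k = [1.0, 1.0, 1.0]
    fractions, rounds = nash_allocation(phi, mu, p, k)
    print(rounds, fractions)
```

The sketch assumes positive arrival rates and a total arrival rate below the total processing rate, which keeps every available rate positive during the iteration. It makes no claim about convergence speed and simply stops after `max_rounds` if the threshold is not reached.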
Experimental Results
Performance metrics:
• Expected Response Time
• Fairness Index (I(C))
$$I(C) = \frac{\left[\sum_{j=1}^{m} C_j\right]^2}{m \sum_{j=1}^{m} C_j^2} \tag{12}$$
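Equation (12) is straightforward to evaluate. In the short Python sketch below, `fairness_index` is a hypothetical helper and `values` stands for the vector C = (C_1, ..., C_m); the slides do not spell out what C_j measures, so treating it as a per-server cost is an assumption.

```python
def fairness_index(values):
    """Equation (12): (sum of C_j)^2 / (m * sum of C_j^2)."""
    m = len(values)
    sum_of_squares = sum(c * c for c in values)
    if m == 0 or sum_of_squares == 0:
        raise ValueError("need at least one non-zero value")
    total = sum(values)
    return total * total / (m * sum_of_squares)


if __name__ == "__main__":
    print(fairness_index([2.0, 2.0, 2.0]))       # identical values give 1.0
    print(fairness_index([1.0, 0.0, 0.0, 0.0]))  # one non-zero value out of four gives 0.25
```

The index equals 1 when all C_j are equal and falls toward 1/m as the values become more unequal.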
System configuration
• 32 computers
• 20 servers
Relative µi       1    2    3    4    5    7    8    10
#computers        7    6    5    4    3    3    2    2
µi (jobs/sec)     10   20   30   40   50   70   80   100
ki                1    2    3    4    5    6    7    8
System utilization vs Expected Price
[Plot: Price (y-axis, 0 to 0.3) vs. System Utilization (%) (x-axis, 10 to 90); curves: Ascending, Descending, Random.]
System utilization vs Expected Response Time
[Plot: Expected Response Time (y-axis, 0.02 to 0.16) vs. System Utilization (%) (x-axis, 10 to 90); curves: GOSP, NASHP.]
System utilization vs Fairness Index
[Plot: Fairness Index (y-axis, 0.65 to 1) vs. System Utilization (%) (x-axis, 10 to 90); curves: GOSP, NASHP.]
Heterogeneity vs Expected Price
[Plot: Price (y-axis, 0 to 0.1) vs. Max Speed/Min Speed (x-axis, 2 to 20); curves: Ascending, Descending, Random.]
Heterogeneity vs Expected Response Time
[Plot: Expected Response Time (y-axis, 0.05 to 0.35) vs. Max Speed/Min Speed (x-axis, 2 to 20); curves: GOSP, NASHP.]
Conclusions
• We proposed two job allocation schemes based on pricing for computational grids.
• The GOSP scheme tries to minimize the cost of the entire grid system, so it is advantageous when the system optimum is required. However, it is not fair to the servers and hence not to their users.
• The NASHP scheme minimizes the cost for each server. This is fair to the servers and hence to their users.