7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster
International Journal of Advance Foundation and Research in Computer (IJAFRC)
Volume 2, Issue 12, December - 2015. ISSN 2348 - 4853, Impact Factor - 1.317
© 2015, IJAFRC All Rights Reserved www.ijafrc.org
A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster
Dr. E. Laxmi Lydia1, Dr. M. Ben Swarup2, Dr. Challa Narsimham3
1Associate Professor, Department of Computer Science and Engineering, Vignan's Institute Of Information Technology, Visakhapatnam, Andhra Pradesh, India.
2Professor, Computer Science and Engineering, Vignan's Institute Of Information Technology, Visakhapatnam, Andhra Pradesh, India.
3Professor, Computer Science and Engineering, Vignan's Institute Of Information Technology, Visakhapatnam, Andhra Pradesh, India.
elaxmi2002@yahoo.com, bforben@gmail.com
ABSTRACT
Big data storage management is one of the most challenging issues for Hadoop cluster environments, since a large number of data-intensive applications frequently involve a high degree of data access locality. In traditional approaches, high-performance computing relies on dedicated servers for data storage and data replication. Therefore, to solve the problems of Disparateness among the jobs and resources, a "Disparateness-Aware Scheduling algorithm" is proposed in the cluster environment. In this research work we present K-centroids clustering in a big data mechanism for the Hadoop cluster. This approach is mainly focused on the energy consumption in the Hadoop cluster, which helps to increase the system reliability. The Hadoop cluster consists of resources which are categorized, using the K-Centroids clustering algorithm, to minimize the scheduling delay in the Hadoop cluster. A novel provisioning mechanism is introduced along with the consideration of load, energy, and network time. By integrating these three parameters, the optimized fitness function is employed for Particle Swarm Optimization (PSO) to select the computing node. Failure may occur after completion of the successful execution in the network. To improve the fault tolerance service, the migration of the cluster is focused on the particular failure node: the node is recomputed by PSO and the corresponding optimal node is predicted. The experimental results exhibit better scheduling length, scheduling delay, speed up, failure ratio, and energy consumption than the existing systems.
Keywords: K-Centroids Clustering, Big data, Hadoop Cluster, data access locality, data replication, system reliability, particle swarm optimization
I. INTRODUCTION
In recent years, big data has rapidly developed into a hotspot that attracts great attention from academia, industry, and even governments around the world [1, 2]. Nature and Science have published special issues dedicated to discussing the opportunities and challenges brought by big data [3, 4]. McKinsey, the well-known management and consulting firm, alleged that big data has penetrated into every area of today's industry and business functions and has become an important factor in production [5]. Using and mining big data heralds a new wave of productivity growth and consumer impetus. O'Reilly Media even asserted that "the future belongs to the companies and people that turn data into products" [6].
What is big data? So far, there is no universally accepted definition. In Wikipedia, big data is defined as "an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications" [7]. From a macro perspective, big data can be regarded as a bond that subtly connects and integrates the physical world, the human society, and cyberspace. Here the physical world has a reflection in cyberspace, embodied as big data, through the Internet, the Internet of Things, and other information technologies, while human society generates its big data-based mapping in cyberspace by means of mechanisms like human-computer interfaces, brain-machine interfaces, and mobile Internet [8]. In this sense, big data can basically be classified into two categories: data from the physical world, which is usually obtained through sensors, scientific experiments and observations (such as biological data, neural data, astronomical data, and remote sensing data), and data from the human society, which is often acquired from such sources or domains as social networks, Internet, health, finance, economics, and transportation.
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It has been used by many big technology companies, such as Amazon, Facebook, Yahoo and IBM. Hadoop [1] is best known for MapReduce and its distributed file system (HDFS). The MapReduce idea is described in a Google paper [11]; put simply, MapReduce is another form of divide and conquer. Hadoop is aimed at problems that require examination of all the available data. For example, text analysis and image processing generally require that every single record be read, and often interpreted in the context of similar records. Hadoop uses a technique called MapReduce to carry out this exhaustive analysis quickly. HDFS provides the storage and support for the distributed computation. They are the two main subprojects of the Hadoop platform. Hadoop sets FIFO as its default scheduling algorithm. According to our study of scheduling algorithms for Hadoop, we found that FIFO is unable to satisfy the demands of users. We cannot keep only the idea of first come, first served: we need to consider the requests that have the higher priority, while at the same time keeping fairness to the other users. We therefore propose the K-Centroids Clustering algorithm for the big data Hadoop cluster.
1.1 HADOOP AND HDFS OVERVIEW
Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop applications. The Hadoop distributed file system is designed to handle large (multi-GB) files with sequential read/write operations. Each file is broken into chunks, which are stored across multiple data nodes as local OS files. The NameNode keeps track of the overall file directory structure and the placement of chunks. Each DataNode reports all its chunks to the NameNode at bootup. Each chunk has a version number which is increased on every update. Therefore, the NameNode knows if any of the chunks of a DataNode is stale; those stale chunks will be garbage collected at a later time. To read a file, the client API calculates the chunk index based on the offset of the file pointer and makes a request to the NameNode. The NameNode replies with the DataNodes that hold a copy of that chunk. From this point, the client contacts the DataNode directly without going through the NameNode.
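The chunk lookup in the read path can be illustrated with a minimal Python sketch. The function names and the 64 MB chunk size below are assumptions made for illustration only; they are not taken from the paper or from the Hadoop client API.

```python
# Illustrative sketch only: names and the chunk size are assumptions,
# not part of the Hadoop client API.

def chunk_index(offset: int, chunk_size: int) -> int:
    """Return the 0-based index of the chunk that holds byte `offset`."""
    if offset < 0 or chunk_size <= 0:
        raise ValueError("offset must be >= 0 and chunk_size must be > 0")
    return offset // chunk_size


def chunks_for_read(offset: int, length: int, chunk_size: int) -> list:
    """Return the indices of every chunk touched by a read of `length` bytes."""
    if length <= 0:
        return []
    first = chunk_index(offset, chunk_size)
    last = chunk_index(offset + length - 1, chunk_size)
    return list(range(first, last + 1))


ASSUMED_CHUNK_SIZE = 64 * 1024 * 1024  # assumed value for the example

# A file pointer 200 MB into the file falls into the chunk with index 3.
print(chunk_index(200 * 1024 * 1024, ASSUMED_CHUNK_SIZE))
```

The client would then ask the NameNode which DataNodes hold the chunks returned by `chunks_for_read` and fetch the data from those DataNodes directly.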
To write a file, the client API first contacts the NameNode, which designates one of the replicas as the primary by granting it a lease. The response of the NameNode states which replica is the primary and which are the secondary replicas. The client then pushes its changes to all DataNodes in any order, and each DataNode stores the change in a buffer. After the changes are buffered at all DataNodes, the client sends a "commit" request to the primary, which determines an update order and then pushes this order to all the secondaries. After all secondaries complete the commit, the primary responds to the client about the success. All changes of chunk distribution and all metadata changes are written to an operation log file at the NameNode. This log file maintains an ordered list of operations, which is important for the
NameNode to recover its view after a crash. The NameNode also maintains its persistent state by regularly checkpointing to a file. If the NameNode crashes, a new NameNode takes over after restoring the state from the last checkpoint file and replaying the operation log.
Figure 1: HDFS
1.2 MAPREDUCE OVERVIEW
The MapReduce framework [3] consists of a single master JobTracker and one slave TaskTracker per cluster node. The master is responsible for scheduling the jobs' component tasks on the slaves, monitoring them, and re-executing any failed tasks. The slaves execute the tasks as directed by the master. As mentioned, MapReduce applications are based on a master-slave model.
Figure 2: MapReduce
II. RELATED WORK
Hadoop's MapReduce operation is driven by the TaskTracker, which takes the initiative in requesting tasks from the JobTracker. The principle is similar to that of an ordinary non-preemptive operating system scheduler: a task cannot be interrupted once it has been assigned. From our study of Hadoop scheduling, there are four classic algorithms.
2.1 FIRST IN FIRST OUT (FIFO):
This expression describes the principle of a queue processing technique, or of servicing conflicting demands, by ordering processes by first-come, first-served behavior: what comes in first is handled first, and what comes in next waits until the first is finished. This is the default algorithm in Hadoop.
2.2 ROUND ROBIN (RR):
In computer operation, one method of having different program processes take turns using the resources of the computer is to limit each process to a certain short time period, then suspend that process to give another process a turn (or "time-slice"). This is often described as round-robin process scheduling.
2.3 HIGHEST PRIORITY FIRST (HPF):
Under this algorithm, each scheduling decision assigns the processor to the ready process with the highest priority. The priority number may be static or dynamic. A static priority number is set when the process is created, based on the initial characteristics of the process or on the user requirements, and cannot be changed during operation. A dynamic priority number is given an initial value when the process is created and is then adjusted while the process runs, as the characteristics of the process change.
2.4 WEIGHTED ROUND ROBIN (WRR):
Weighted Round Robin is a scheduling discipline. Each packet flow or connection has its own packet queue in a network interface card. It is the simplest approximation of generalized processor sharing (GPS). While GPS serves infinitesimal amounts of data from each nonempty queue, WRR serves a number of packets for each nonempty queue.
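The difference between plain round robin and its weighted variant can be seen in a short Python sketch. This is only an illustration under stated assumptions: the function name, the queue names, and the integer weights are hypothetical, and the paper does not give an implementation.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Serve queues cyclically; on each visit a queue may send up to
    weights[name] packets. Returns the order in which packets are served.

    `queues` maps a queue name to a list of packets and `weights` maps the
    same names to positive integers (both are assumptions of this sketch).
    """
    if any(weights[name] < 1 for name in queues):
        raise ValueError("every weight must be a positive integer")
    pending = {name: deque(packets) for name, packets in queues.items()}
    served = []
    while any(pending.values()):           # stop when every queue is empty
        for name, queue in pending.items():
            for _ in range(weights[name]):
                if not queue:              # empty queues are skipped
                    break
                served.append(queue.popleft())
    return served


flows = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"]}
print(weighted_round_robin(flows, {"a": 2, "b": 1}))
# ['a1', 'a2', 'b1', 'a3', 'b2']
print(weighted_round_robin(flows, {"a": 1, "b": 1}))  # plain round robin
# ['a1', 'b1', 'a2', 'b2', 'a3']
```

With all weights equal to 1 the function reduces to ordinary round robin, which is why WRR is usually described as its generalization.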
III. DISPARATENESS AWARE SCHEDULING APPROACH IN HADOOP CLUSTER ENVIRONMENT
This section describes the overall flow of the proposed Disparateness-aware scheduling approach in a cluster environment. Initially, the Disparateness cluster environment is created along with the properties of each resource, such as resource type, processing speed, and memory. In order to avoid scheduling delay, the system forms clusters using K-centroids clustering. Depending on higher priorities, a node moves to a cluster. Furthermore, the cosine similarity is computed to form the clusters. After the clusters are formed, the fitness function is estimated with the consideration of load, energy, and time for each cluster. The clusters are then scheduled and executed after the load is uploaded. Once any failure occurs during the process, the value must be recomputed using PSO and another optimal node predicted. Figure 3 depicts the overall flow diagram of the proposed methodology. The major components of the proposed system are briefly discussed as follows.
SYMBOL    DESCRIPTION
(i, r)    Time required for ith cluster completion in rth cluster resource
(i, r)    Load of rth cluster resource at ith cluster submission
(i, r)    Energy required for ith cluster in rth cluster resource
          Invoking Time of rth cluster resource
(i, r)    Executing Time of ith cluster in rth cluster resource
(i, r)    Retrieving Time of ith cluster in rth cluster resource
          ith Cluster Size
          Cluster Processing Speed of rth cluster resource
          jth Cluster Size
          Size of rth Cluster Resource
          Invoke Energy of rth cluster resource
(i, r)    Execution Energy of ith cluster in rth cluster resource
          rth Cluster Resource Processing Energy
Table 1. Symbols and their descriptions
3.1 K-CENTROIDS CLUSTERING
The well-known clustering problem can be solved by K-means, which is one of the simplest unsupervised learning algorithms. It assumes K clusters for classifying the given cluster processors in a simple and easy way. K-means clustering does not guarantee an optimal solution, as its performance depends on the initial centroids. Thus, the proposed system uses a partitioning clustering, namely K-centroids clustering, as described in the following algorithm. Table 1 shows the notation and the descriptions employed in the proposed system.
Figure 3: An overall flow diagram of the proposed K-centroids clustering in Hadoop Cluster.
[Flowchart steps: create Hadoop cluster; get resource properties (resource type, processing speed, memory); clustering (compute K-centroids, find cosine similarity, compute clusters); compute fitness function by load, energy and time; get minimum value node in each cluster; schedule; upload; execute all; if failure, recompute the nodes by PSO and predict the optimal node.]
Algorithm 1: K-Centroids Clustering
Input: Cluster Processors CPn, k value
Begin
  Initialize k centroids Ck
  For i = 0, 1, 2 ... n then            // Cluster Resource Property
    V1 = {RTi, CPSi, CPSii};
    For j = 0, 1, 2 ... k then          // K-Centroid's Property
      V2 = {RTj, CPSj, CPSij};
      Simij = CS(V1, V2) = (V1 . V2) / (|V1| . |V2|);
    End For j
    MinIndex = 0;
    Min = Simi0;
    CurrIndex = 0;
    While CurrIndex < k then
      If SimiCurrIndex < Min then
        Min = SimiCurrIndex
        MinIndex = CurrIndex;
      End If
      CurrIndex++;
    End While
    SetClusterID MinIndex;
  End For i;
End Begin;
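The assignment step of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, assuming that each processor and each centroid is already available as a numeric three-component vector (retrieving time, processing speed, size); the function names are hypothetical. The sketch follows the algorithm as written and picks the centroid with the minimum similarity value; since a larger cosine similarity means more alike vectors, a reader who wants nearest-centroid behavior would select the maximum instead.

```python
import math


def cosine_similarity(v1, v2):
    """CS(V1, V2) = (V1 . V2) / (|V1| * |V2|)."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def assign_cluster_ids(processors, centroids):
    """Return one cluster ID per processor vector.

    Mirrors the While loop of Algorithm 1: the ID is the index of the
    centroid whose similarity value is the minimum (first one on ties).
    """
    cluster_ids = []
    for v1 in processors:
        similarities = [cosine_similarity(v1, v2) for v2 in centroids]
        min_index = 0
        for curr_index in range(1, len(similarities)):
            if similarities[curr_index] < similarities[min_index]:
                min_index = curr_index
        cluster_ids.append(min_index)
    return cluster_ids


# Hypothetical property vectors: (retrieving time, processing speed, size)
processors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
centroids = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(assign_cluster_ids(processors, centroids))
```

How the k centroids are initialized and updated is not specified in the algorithm listing, so the sketch takes them as given.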
The inputs to the above K-centroids clustering algorithm are the cluster processors and the K value. At first, the K centroids are initialized and the vectors V1 and V2 are defined with respect to the retrieving time, the cluster processing speed, and the size of the cluster resource. The cosine similarity of the two vectors is then estimated. Next, the similarity values are scanned for the minimum: if the value at the current index is less than the current minimum, it becomes the new minimum. The check continues while the current index is less than the number of centroids k. The index of the minimum is set as the cluster ID. Further, the fitness function is estimated, until all jobs are scheduled, by the consideration of load, energy, and time for each cluster. It is defined as
\[ \mathrm{Fitness} = \mathrm{Time}(i, r) + \mathrm{Load}(i, r) + \mathrm{Energy}(i, r) \tag{1} \]
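3.1.1 TIME COMPUTATION
The time required for completion of the ith cluster in the rth cluster resource is described as
\[ \mathrm{Time}(i, r) = \mathrm{InvokingTime}(r) + \mathrm{ExecutingTime}(i, r) + \mathrm{RetrievingTime}(i, r) \tag{2} \]
where the executing time of the ith cluster in the rth cluster resource and the retrieving time of the ith cluster in the rth cluster resource are given in equation (3) and equation (4), respectively.
\[ \mathrm{ExecutingTime}(i, r) = \dots \tag{3} \]
\[ \mathrm{RetrievingTime}(i, r) = \frac{\dots}{\mathrm{Bandwidth}} + \mathrm{Delay} \tag{4} \]
Herein, Bandwidth is the bandwidth from the scheduler to the receiver and Delay denotes the delay that occurs between the scheduler and the receiver for network communication.
3.2.2 LOAD COMPUTATION
The load of the rth cluster resource at the ith cluster submission is as follows
\[ \mathrm{Load}(i, r) = \dots \tag{5} \]
3.2.3 ENERGY COMPUTATION
The energy required for the ith cluster in the rth cluster resource is defined as
\[ \mathrm{Energy}(i, r) = \mathrm{InvokeEnergy}(r) + \mathrm{ExecutionEnergy}(i, r) \tag{6} \]
where the execution energy of the ith cluster in the rth cluster resource is given as
\[ \mathrm{ExecutionEnergy}(i, r) = \dots \tag{7} \]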
Based on the consideration of time, load, and energy, the proposed estimation of the fitness function has the following advantages:
- The scheduling delay can be avoided in the initial stage.
- The computation cost is minimized.
- The execution time is decreased.
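How the three terms combine can be sketched in Python. This is a hedged illustration only: the class and function names are hypothetical, and the component values (executing time, retrieving time, load, execution energy) are taken as already-computed inputs because their individual formulas are not reproduced here. The sketch also assumes, following the "minimum value node" step of the flow diagram, that a lower fitness value is preferred.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Already-computed cost components of one (cluster i, resource r) pair."""
    invoking_time: float
    executing_time: float
    retrieving_time: float
    load: float
    invoke_energy: float
    execution_energy: float


def total_time(c: Costs) -> float:
    # Time = invoking time + executing time + retrieving time
    return c.invoking_time + c.executing_time + c.retrieving_time


def total_energy(c: Costs) -> float:
    # Energy = invoke energy + execution energy
    return c.invoke_energy + c.execution_energy


def fitness(c: Costs) -> float:
    # Fitness = time + load + energy
    return total_time(c) + c.load + total_energy(c)


def best_resource(costs_by_resource: dict) -> str:
    """Return the resource name with the lowest fitness (assumed preferable)."""
    return min(costs_by_resource, key=lambda name: fitness(costs_by_resource[name]))


candidates = {
    "r1": Costs(1.0, 2.0, 3.0, 0.5, 1.0, 2.0),  # fitness 9.5
    "r2": Costs(1.0, 1.0, 1.0, 0.5, 1.0, 1.0),  # fitness 5.5
}
print(best_resource(candidates))  # r2
```

In the proposed system this fitness value is the objective that PSO evaluates; the exhaustive `min` above merely stands in for that search on a tiny candidate set.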
3.2.4 PSO-BASED DISPARATENESS-AWARE SCHEDULING MODEL
Heuristic optimization algorithms are widely used for solving a wide variety of NP-complete problems. PSO is considered one of the latest evolutionary optimization techniques, and it has fewer algorithm parameters than the other approaches.
Algorithm 2: Disparateness-Aware Scheduling using PSO
Input: ClusterList Jn, Cluster Resource CRm
Output: Allocated Cluster List ALn
Begin
  For x = 0, 1, 2 ... n then
    TCR = {};                           // Temporary Cluster Resource list
    For y = 0, 1, 2 ... m then
      If Jx.Rtype().equals(CRy.Rtype()) then
        TCR.add(CRy)
      End If;
    End For y;
    SCR = PSO(TCR, Jx)                  // Selected Cluster Resource
    Jx.setClusterResource(SCR)
    ALx = SCR
  End For x;
End Begin;
The inputs to this model are the cluster list and the cluster resources. Initially, the temporary cluster resource list is empty. For each cluster in the list, the algorithm checks whether its resource type matches the resource type of each cluster resource. If the types match, the cluster resource is added to the temporary cluster resource list. PSO is then applied to the temporary cluster resource list and the cluster to obtain the selected cluster resource. The final output of this scheduling model is the allocated cluster list together with the selected cluster resources. When a failure occurs during this process, the selection can be recomputed by PSO and another optimal node is predicted. The advantages over the heterogeneous-aware scheduling model are:
- The number of failures is minimized.
- The resource utilization is increased.
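The type-matching loop of Algorithm 2 and the recomputation after a failure can be sketched in Python as below. This is an illustration under explicit assumptions: clusters and resources are plain dictionaries with hypothetical keys (`name`, `rtype`, `fitness`), and the `lowest_fitness` selector is a simple stand-in for the PSO step, not an implementation of PSO.

```python
def allocate(clusters, resources, select):
    """Map each cluster name to the name of its selected resource.

    For every cluster, build the temporary list of resources whose type
    matches, then let `select` (the stand-in for PSO) choose one of them.
    Clusters without a matching resource are mapped to None.
    """
    allocation = {}
    for cluster in clusters:
        temporary = [r for r in resources if r["rtype"] == cluster["rtype"]]
        chosen = select(temporary, cluster) if temporary else None
        allocation[cluster["name"]] = chosen["name"] if chosen else None
    return allocation


def lowest_fitness(candidates, cluster):
    """Stand-in selector (assumption): pick the candidate with lowest fitness."""
    return min(candidates, key=lambda r: r["fitness"])


def reallocate_after_failure(cluster, resources, failed_name, select):
    """Recompute the selection for one cluster, excluding the failed node."""
    survivors = [r for r in resources if r["name"] != failed_name]
    return allocate([cluster], survivors, select)[cluster["name"]]


clusters = [{"name": "J0", "rtype": "cpu"}, {"name": "J1", "rtype": "gpu"}]
resources = [
    {"name": "R0", "rtype": "cpu", "fitness": 3.0},
    {"name": "R1", "rtype": "cpu", "fitness": 1.5},
    {"name": "R2", "rtype": "gpu", "fitness": 2.0},
]
print(allocate(clusters, resources, lowest_fitness))
# {'J0': 'R1', 'J1': 'R2'}
print(reallocate_after_failure(clusters[0], resources, "R1", lowest_fitness))
# R0
```

Passing the selector as a parameter keeps the filtering logic independent of the optimizer, so a real PSO routine could be substituted without touching the allocation loop.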
IV. PERFORMANCE ANALYSIS
This section compares the performance of the proposed Disparateness-Aware Scheduling (DAS) algorithm with two existing scheduling algorithms, Highest Priority First (HPF) [13] and Weighted Round Robin (WRR) [13]. The performance metrics used for the analysis are system reliability, scheduling delay, scheduling length, speed up, energy consumption, and failure ratio with respect to the number of clusters. Because resource availability is dynamic, evaluating the behavior of scheduling algorithms on real cluster platforms is not practical, and experiments on real platforms are often non-reproducible. Simulation is therefore the best choice for testing and comparing the scheduling algorithms. Thus, an extensive simulation environment of the cluster system is built as in Table 2.
METRICS          PARAMETER                               VALUE
Computing Node   Number of Cluster Resource              5 - 50
                 Number of Host per Cluster Resource     5 - 10
                 Processing Speed (MIPS)                 1000 - 5000
                 Number of Resource Types                2 - 5
Communication    Bandwidth (MBPS)                        100 - 500
                 Latency (ms)                            5 - 10
Clusters         Number of Clusters                      10 - 100
                 Size of Clusters (MI)                   1000 - 5000
Table 2. Simulation parameters
4.1 SYSTEM RELIABILITY VS. NUMBER OF CLUSTERS
The reliability of the system can be calculated as the average reliability of all clusters and is mathematically defined as follows
\[ \mathrm{SystemReliability} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Reliability}(C_i) \tag{8} \]
Here, Reliability(C_i) is the distribution of the reliability probability of cluster C_i, and n is the number of clusters.
Fig. 4 describes the system reliability in terms of the number of clusters for the existing HPF and WRR and the proposed DAS model. The system reliability of the DAS model is greater than that of the other two existing approaches, and its value gradually decreases with respect to the number of clusters.
Figure 4: The result of system reliability with respect to the number of clusters
4.2 SCHEDULING LENGTH VS. NUMBER OF CLUSTERS
Schedule length is measured as
\[ \mathrm{ScheduleLength} = \max\{\mathrm{Time}(C_1), \mathrm{Time}(C_2), \ldots, \mathrm{Time}(C_n)\} \tag{9} \]
Figure 5: The result of schedule length with respect to the number of clusters.
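The two metrics defined in this part of the analysis reduce to a mean and a maximum, as the following Python sketch shows. The function names and the sample values are assumptions made for illustration; they are not simulation results from the paper.

```python
def system_reliability(cluster_reliabilities):
    """Average reliability over all clusters (n must be at least 1)."""
    if not cluster_reliabilities:
        raise ValueError("at least one cluster is required")
    return sum(cluster_reliabilities) / len(cluster_reliabilities)


def schedule_length(completion_times):
    """Schedule length: the largest completion time among the clusters."""
    if not completion_times:
        raise ValueError("at least one cluster is required")
    return max(completion_times)


# Hypothetical values for three clusters, for illustration only.
print(system_reliability([0.5, 1.0, 0.75]))   # 0.75
print(schedule_length([12.0, 30.5, 18.25]))   # 30.5
```

Because the schedule length is a maximum, a single slow cluster determines it, whereas the reliability average moves only gradually as clusters are added.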
Fig. 5 depicts the result of the schedule length in terms of the number of clusters for the existing HPF and WRR and the proposed DAS model. The schedule length of DAS is lower than that of the other two existing systems.
4.3 SPEED UP VS. NUMBER OF CLUSTERS
The speed up is the ratio of the sequential execution time to the schedule length of the output schedule. It is computed as follows
\[ \mathrm{SpeedUp} = \frac{\text{sequential execution time}}{\text{schedule length}} \tag{10} \]
Fig. 6 shows the result of speed up with respect to the number of clusters for the existing HPF and WRR and the proposed DAS model. The speed up of the DAS model is higher than that of the other two existing approaches, and its value gradually increases with the number of clusters.
Figure 6: The result of speed up with respect to the number of clusters.
4.4 SCHEDULING DELAY VS. NUMBER OF CLUSTER RESOURCE
Figure 7: The result of scheduling delay with respect to the number of cluster resources.
Fig. 7 shows the result of scheduling delay with respect to the number of cluster resources for the existing HPF and WRR and the proposed DAS model. The proposed system has a lower scheduling delay than the other two scheduling algorithms.
4.5 ENERGY CONSUMPTION VS. NUMBER OF CLUSTER RESOURCE
Fig. 8 shows the result of energy consumption with respect to the number of cluster resources for the existing HPF and WRR and the proposed DAS model. The proposed system consumes less energy than the other two scheduling algorithms. Its value gradually increases with the number of cluster resources.
Figure 8: The result of energy consumption with respect to the number of cluster resource.
4.6 FAILURE RATIO VS. NUMBER OF CLUSTER RESOURCE
The result of the failure ratio with respect to the number of cluster resources for the existing HPF and WRR and the proposed DAS model is shown in Fig. 9. The proposed system has a lower failure ratio than the other two existing scheduling algorithms.
Figure 9: The result of failure ratio with respect to the number of cluster resource.
V. CONCLUSION AND FUTURE WORK
This paper proposes a "Disparateness-Aware Scheduling algorithm" in the cluster environment. In this research work we present K-Centroids clustering in a big data mechanism for the Hadoop cluster. This approach is mainly focused on the energy consumption in the Hadoop cluster, which helps to increase the system reliability. The Hadoop cluster consists of resources which are categorized, using the K-Centroids clustering algorithm, to minimize the scheduling delay in the Hadoop cluster. A novel provisioning mechanism is introduced along with the consideration of load, energy, and network time. By integrating these three parameters, the optimized fitness function is employed for Particle Swarm Optimization (PSO) to select the computing node. Failure may occur after completion of the successful execution in the network. To improve the fault tolerance service, the migration of the cluster is focused on the particular failure node: the node is recomputed by PSO and the corresponding optimal node is predicted. The experimental results exhibit better scheduling length, scheduling delay, speed up, failure ratio, and energy consumption than the existing systems.
VI. REFERENCES
[1] V. Mayer-Schonberger, K. Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Houghton Mifflin Harcourt, 2013.
[2] A. Cuzzocrea, Privacy and security of big data: current challenges and future research perspectives, in: Proceedings of the First International Workshop on Privacy and Security of Big Data, PSBD '14, 2014.
[3] Big data, Nature 455 (7209) (2008) 1-136.