
Research Article

Application and Analysis of Multicast Blocking Modelling in Fat-Tree Data Center Networks

Guozhi Li, Songtao Guo, Guiyan Liu, and Yuanyuan Yang

College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China

Correspondence should be addressed to Songtao Guo; guosongtao@cqu.edu.cn

Received 23 May 2017; Accepted 8 November 2017; Published 11 January 2018

Academic Editor: Dimitri Volchenkov

Copyright © 2018 Guozhi Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Multicast can improve network performance by eliminating unnecessary duplicated flows in data center networks (DCNs). Thus it can significantly save network bandwidth. However, network multicast blocking may cause the retransmission of a large number of data packets and seriously influence the traffic efficiency in data center networks, especially in fat-tree DCNs with a multirooted tree structure. In this paper, we build a multicast blocking model and apply it to solve the problem of network blocking in fat-tree DCNs. Furthermore, we propose a novel multicast scheduling strategy. In the scheduling strategy, we select the uplink connecting to an available core switch whose remaining bandwidth is close to and greater than three times the multicast bandwidth request, so as to reduce the operation time of the proposed algorithm. Then the blocking probability of each downlink in the next time-slot is calculated in the multicast subnetwork by using Markov chain theory. With the obtained probability, we select the optimal downlink based on the available core switch. In addition, theoretical analysis shows that the multicast scheduling algorithm has close-to-zero network blocking probability as well as low time complexity. Simulation results verify the effectiveness of our proposed multicast scheduling algorithm.

1. Introduction

Recently, data center networks (DCNs) have been widely studied in both academia and industry due to the fact that their infrastructure can support various cloud computing services. The fat-tree DCN, as a special instance and variation of the Clos networks, has been widely adopted as the topology for DCNs since it can build large-scale traffic networks using only a few switches [1].

Multicast transmission is needed for efficient and simultaneous transmission of the same information copy to a large number of nodes, and is driven by many applications that benefit from execution parallelism and cooperation, such as MapReduce-type applications for processing data [2]. In fact, multicast is the parallel transmission of data packets in a complex network. For example, the Google File System (GFS) is a distributed file system for massive data-intensive applications that operates in a multicast transmission manner [3].

There have been some studies on multicast transmission in fat-tree DCNs. The stochastic load-balanced multipath routing (SLMR) algorithm selects the optimal path by obtaining and comparing the oversubscription probabilities of the candidate links, and it can balance traffic among multiple links by minimizing the probability that each link faces network blocking [4]. But the SLMR algorithm only considers unicast traffic. The bounded congestion multicast scheduling (BCMS) algorithm, an online multicast scheduling algorithm, is able to achieve bounded congestion as well as efficient bandwidth utilization even under worst-case traffic conditions in a fat-tree DCN [5]. Moreover, the scheduling algorithm fault rate (SAFR) reflects the efficiency of a scheduling algorithm: the larger the SAFR, the lower the efficiency of the scheduling algorithm. The SAFR in fat-tree DCNs increases faster with the network blocking rate (NBR) than in other DCNs, as shown in Figure 1. In fact, the NBR reflects the degree of network blocking [6].

The scheduling processes in the existing scheduling algorithms [4–6] are based on the network state at the current time-slot. They do not consider that the network state may change when data flows begin to transfer after the current scheduling

Hindawi Complexity, Volume 2018, Article ID 7563170, 12 pages, https://doi.org/10.1155/2018/7563170


[Figure 1 plots the scheduling algorithm fault rate, SAFR (%), against the network blocking rate, NBR (%), for fat-tree, DCell, and BCube DCNs.]

Figure 1: The relationship between network blocking rate (NBR) and scheduling algorithm fault rate (SAFR) in different DCNs.

process is finished. This may lead to network load imbalance because the bandwidth of a multicast connection has not been allocated dynamically [7]. Therefore, we develop an efficient multicast scheduling algorithm to achieve the scheduling of network flows at the network state of the next time-slot in fat-tree DCNs.

However, since the network state at the next time-slot is probabilistic rather than deterministic, it is difficult to predict the network state of the next time-slot from the present state with certainty and to find a deterministic strategy. Markov chains can be employed to predict the network state even though the state transition is probabilistic [8]. Thus the next network states can be assessed by a set of probabilities in a Markov process [9]. The evolution of this set of probabilities essentially describes the underlying dynamical nature of a network [10]. In [11], the authors proposed a scheme using Markov approximation, which aims at minimizing the maximum link utilization (i.e., the link utilization of the most blocked link) in data center networks. Moreover, the scheme provides two strategies that construct Markov chains with different connection relationships. The first strategy simply applies Markov approximation to data center traffic engineering. The second strategy is a local search algorithm that modifies Markov approximation.

In this paper, we adopt Markov chains to deduce the link blocking probability at the next time-slot and take it as the link weight in the multicast blocking model in fat-tree DCNs. Therefore, available links are selected based on the network state at the next time-slot, and the optimal downlink is selected by the link weight. In the downlink selection, we compare the blocking probabilities and choose the downlinks with the lowest blocking probability at the next time-slot, which avoids MSaMC failure due to delay error. In particular, we find that the remaining bandwidth of the selected uplinks should be close to and greater than three times the multicast bandwidth request, which can reduce the algorithm execution time and save bandwidth consumption. Theoretical analysis shows the correctness of the strategy, while simulation results show that MSaMC can achieve higher network throughput and lower average delay.

The contributions of the paper can be summarized as follows.

(i) We analyze why multicast blocking occurs in practical applications. Afterwards, we present a novel way of multicast transmission forecasting and the multicast blocking model in fat-tree DCNs.

(ii) We put forward a multicast scheduling algorithm (MSaMC) to select the optimal uplinks and downlinks. MSaMC not only ensures lower network blocking but also maximizes the utilization of network bandwidth resources.

(iii) Theoretical analysis shows that the link blocking probability is less than $1/3$ under our proposed MSaMC algorithm, and the multicast network can be nonblocking if the link blocking probability is less than $0.1$.

The rest of the paper is organized as follows. Section 2 describes the detrimental effects of multicast blocking in fat-tree DCNs. Section 3 establishes the multicast blocking probability model in fat-tree DCNs and deduces the link blocking probability at the next time-slot based on Markov chains. In Section 4, we propose the multicast scheduling algorithm with Markov chains (MSaMC), and we analyze the complexity of the MSaMC algorithm in Section 5. In Section 6, we evaluate the performance of MSaMC by simulation results. Finally, Section 7 concludes this paper.

2. Cause of Multicast Blocking

A fat-tree DCN, as shown in Figure 2, is represented as a triple $f(m, n, r)$, where $m$ and $r$ denote the number of core switches and edge switches, respectively, and $n$ indicates the number of servers connecting to an edge switch. In fat-tree DCNs, all links are bidirectional and have the same capacity. We define the uplink as the link from an edge switch to a core switch and the downlink as the link from a core switch to an edge switch. A multicast flow request $\omega$ can be abstracted as a triple $(i, D, \omega)$, where $i \in \{1, 2, \ldots, r\}$ is the source edge switch and $D$ denotes the set of destination edge switches of the multicast flow request $\omega$. The number of destination edge switches of multicast flow request $\omega$ is represented as $|D|$, $|D| \le r - 1$, which is denoted as the fanout $f$. Note that the servers connecting to the same edge switch can freely communicate with each other, and the intraedge-switch traffic can be ignored. Hence both the aggregation and edge layers can be treated as a single edge layer.
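The notation above can be sketched as plain data structures. The following is an illustrative sketch only; the class and field names (`FatTree`, `MulticastRequest`) are our own and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FatTree:
    """A fat-tree DCN as the triple f(m, n, r)."""
    m: int  # number of core switches
    n: int  # servers per edge switch
    r: int  # number of edge switches

@dataclass(frozen=True)
class MulticastRequest:
    """A multicast flow request as the triple (i, D, omega)."""
    source: int              # source edge switch i, with 1 <= i <= r
    destinations: frozenset  # set D of destination edge switches
    bandwidth: float         # requested bandwidth omega

    @property
    def fanout(self) -> int:
        """Fanout f = |D|, bounded above by r - 1."""
        return len(self.destinations)

net = FatTree(m=4, n=3, r=4)
req = MulticastRequest(source=1, destinations=frozenset({2, 3, 4}), bandwidth=50.0)
assert 1 <= req.fanout <= net.r - 1
```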

To illustrate the disadvantages of multicast blocking in fat-tree DCNs, a simple traffic pattern in a small fat-tree DCN is depicted in Figure 3. Suppose that there are two multicast flow requests $\omega_1$ and $\omega_2$, and every flow request looks for available links by an identical scheduling algorithm. Both flow $\omega_1$ and flow $\omega_2$ have a source server and two destination servers located at different edge switches, and the sum of both


[Figure 2 shows the three switch layers: Core (1 · · · m), Aggregation (1 · · · r), and Edge (1 · · · r).]

Figure 2: The topology of fat-tree DCNs.

[Figure 3 shows core switches 1 and 2, edge switches 1–3, and the two flows $\omega_1$ and $\omega_2$.]

Figure 3: The cause of multicast blocking.

is greater than the available link bandwidth. In particular, flow $\omega_1$ and flow $\omega_2$ are forwarded through core switch 1 at the same time and are routed from core switch 1 to edge switch 2 through the same link by the scheduling algorithm, which will cause heavy blocking at the links connected to core switch 1. Therefore, the available bandwidth for each flow will suffer further reduction if the scheduler cannot identify heavy multicast blocking in the fat-tree DCNs.

Figure 3 also explains the main cause of multicast blocking. We can see that multicast blocking has occurred at the link between core switch 1 and edge switch 2. Clearly, before the blocking at this link is alleviated, the other links cannot release their occupied bandwidth. This means that the links from edge switch 1 to core switch 1, from edge switch 1 to core switch 2, from core switch 2 to edge switch 3, and from edge switch 3 to core switch 1 are not released until the multicast blocking is alleviated. However, fat-tree DCNs cannot tolerate the long time needed to resolve the blocking due to the requirement for low latency.

In fat-tree DCNs, different source servers may execute the scheduling algorithm at the same time, so that they may occupy the same link, and multicast blocking will inevitably occur. Hence, multicast blocking is a common phenomenon in DCN applications that reduces network performance. In addition, many servers act as hotspots of user access, which may cause many-to-one data flow transfer. In fact, the key reason for multicast blocking is that the network link state at the next time-slot is not considered. Several works have been proposed to solve network blocking in the transmission of multicast packets in DCNs [12, 13]. As data centers usually adopt commercial switches that cannot guarantee network nonblocking, an efficient packet repairing scheme was proposed [12], which relies on unicast to retransmit dropped multicast packets caused by switch buffer overload or switching failure. Furthermore, a Bloom filter [13] was proposed to compress the multicast forwarding table in switches, which avoids multicast blocking in the data center network.

To the best of our knowledge, the existing multicast scheduling algorithms consider only the network state at the current time-slot in DCNs; thus the delay error between the algorithm execution time and the beginning of the data flow transfer will make the scheduling algorithm invalid. Based on this consideration, we focus on the study of multicast scheduling for the network state at the next time-slot based on Markov chains.

3. Model and Probability of Multicast Blocking

In this section, we first establish the multicast blocking model based on the topology of fat-tree DCNs using a similar approach. Then we deduce the blocking probability of the available downlinks at the next time-slot.

3.1. Multicast Subnetwork. A multicast bandwidth request corresponds to a multicast subnetwork in fat-tree DCNs, which consists of the available core switches and edge switches for the multicast bandwidth request. The multicast subnetwork in Figure 4 has $f$ destination edge switches, $x$ available core switches, and $n \times f$ servers, where $1 \le x \le m$. In the process of multicast connection, the link weight of the multicast subnetwork is denoted as the blocking probability


[Figure 4 shows a multicast subnetwork with Core (1, 2), Edge (1 · · · 4), Server (1 · · · 12), and a multicast flow.]

Figure 4: The multicast subnetwork.

at the next time-slot. Thus our goal is to obtain the link blocking probability for any type of multicast bandwidth request at the next time-slot.

It is known that the fat-tree DCN is a typical large-scale network in which many available links can meet a given multicast connection request. When a link is available for a multicast bandwidth request $\omega$, the blocking probability of the link at the current time-slot is given by $p = \omega / \mu$, where $\mu$ is the remaining bandwidth of the link.

A multicast connection can be represented by its destination edge switches. Given a multicast bandwidth request $\omega$ with fanout $f$ ($1 \le f < r$), $P(f)$ indicates the blocking probability of this multicast connection. We denote the blocking of the available uplinks as the events $u_1, u_2, \ldots, u_x$ and the blocking of the available downlinks between the available core switches and the $k$th ($1 \le k \le f$) destination edge switch as the events $d_{k1}, d_{k2}, \ldots, d_{kx}$. All available links form a multicast tree, rooted at the core switches, that can satisfy the multicast connection in the multicast network. Other notations used in the paper are summarized in Notations.

3.2. Multicast Blocking Model. In the multicast subnetwork, we employ $\epsilon$ to express the event that the request of a multicast connection with fanout $f$ cannot be satisfied in the network shown in Figure 4. We do not consider the links whose remaining bandwidth is less than the multicast bandwidth request $\omega$, since such a link is not available when the multicast data flow $\omega$ goes through it. We let $P(\epsilon \mid \phi)$ be the conditional blocking probability given state $\phi$ and $P(\phi)$ be the probability of state $\phi$. Then the blocking probability of the subnetwork for a multicast connection is given by

$$P(f) = P(\epsilon) = \sum_{\phi} P(\phi) P(\epsilon \mid \phi). \quad (1)$$

For the event $\phi$, the data traffic of the uplinks does not interfere with each other; that is, the uplinks are independent. Therefore, we have $P(\phi) = q^k p^{m-k}$, where $q = 1 - p$.

From the multicast blocking subnetwork in Figure 4, we can obtain the blocking property of the fat-tree DCNs; that is, the multicast bandwidth request $\omega$ from a source edge switch to distinct destination edge switches cannot be achieved if and only if there is no available downlink connecting all the destination edge switches.

In that way, we take $\epsilon'$ to denote the event that the multicast bandwidth request $\omega$ with fanout $f$ cannot be achieved over the available uplinks. Thus we can get

$$P(\epsilon') = P(\epsilon \mid u_1, u_2, \ldots, u_x). \quad (2)$$

An available downlink $d_{ij}$, where $1 \le i \le f$ and $1 \le j \le x$, represents a link from a core switch to the $i$th destination edge switch. The event $\epsilon'$ can be expressed by the events $d_{ij}$ as follows:

$$\epsilon' = (d_{11} \cap d_{12} \cap \cdots \cap d_{1x}) \cup \cdots \cup (d_{f1} \cap d_{f2} \cap \cdots \cap d_{fx}). \quad (3)$$

Afterwards, we define the blocking of the downlinks connecting to each destination edge switch as the events $A = \{A_1, A_2, \ldots, A_f\}$; moreover, we have $A_1 = (d_{11} \cap d_{12} \cap \cdots \cap d_{1x})$. Thus we get

$$\epsilon' = \bigcup_{i=1}^{f} A_i. \quad (4)$$

Based on the theory of combinatorics, the inclusion-exclusion principle (also known as the sieve principle) relates the size of a union of sets to the sizes of the sets and their intersections. For the general case of the principle in [14], let $A_1, A_2, \ldots, A_f$ be finite sets. Then we have

$$\left| \bigcup_{i=1}^{f} A_i \right| = \sum_{i=1}^{f} |A_i| - \sum_{1 \le i < j \le f} |A_i \cap A_j| + \sum_{1 \le i < j < h \le f} |A_i \cap A_j \cap A_h| - \cdots + (-1)^{f-1} |A_1 \cap A_2 \cap \cdots \cap A_f|. \quad (5)$$


For the events $A_1, A_2, \ldots, A_f$ in a probability space $(\Omega, F, P)$, we can obtain the probability of the event $\epsilon'$:

$$P(\epsilon') = \sum_{i=1}^{f} P(A_i) - \sum_{1 \le i < j \le f} P(A_i \cap A_j) + \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) - \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f), \quad (6)$$

where $P(A_i)$ denotes the probability of the event $A_i$.

Combining (1) and (2) with (6), the multicast blocking model for a multicast connection with fanout $f$ is given by

$$P(f) = \sum_{k=1}^{m} \binom{m}{k} p^k q^{m-k} \left( \sum_{i=1}^{f} P(A_i) - \sum_{1 \le i < j \le f} P(A_i \cap A_j) + \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) - \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f) \right). \quad (7)$$
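The inclusion-exclusion sum in (6)–(7) can be checked numerically. The sketch below assumes, purely for illustration, that the events $A_i$ are mutually independent (so every intersection probability factors into a product), and uses made-up link blocking probabilities $p_{ij}$; under that independence assumption the alternating sum collapses to $1 - \prod_i (1 - P(A_i))$, which the final assertion verifies.

```python
from itertools import combinations
from math import prod

def union_prob_inclusion_exclusion(p):
    """P(A_1 U ... U A_f) via the alternating inclusion-exclusion sum,
    assuming (for illustration only) the A_i are mutually independent,
    so that P(A_i ∩ A_j ∩ ...) = P(A_i) P(A_j) ..."""
    f = len(p)
    total = 0.0
    for k in range(1, f + 1):
        sign = (-1) ** (k - 1)
        for subset in combinations(p, k):
            total += sign * prod(subset)
    return total

# Each A_i is the event that all x downlinks toward destination i are
# blocked, so P(A_i) = prod_j p_ij; the p_ij values here are hypothetical.
p_links = [[0.09, 0.05], [0.04, 0.03], [0.06, 0.07]]
p_A = [prod(row) for row in p_links]

pe = union_prob_inclusion_exclusion(p_A)
# Under independence the alternating sum equals 1 - prod(1 - P(A_i)):
assert abs(pe - (1 - prod(1 - a for a in p_A))) < 1e-12
```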

From (6), since $\sum_{1 \le i < j \le f} P(A_i \cap A_j) \ge \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h)$, the following inequality can be derived:

$$\sum_{1 \le i < j \le f} P(A_i \cap A_j) - \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) + \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f) \ge 0. \quad (8)$$

Therefore, the minimum blocking probability of the event $\epsilon'$ is

$$P_{\min}(\epsilon') = \sum_{i=1}^{f} P(A_i), \quad (9)$$

where $P(A_1) = \prod_{k=1}^{x} p_{1k}$.

Afterwards, we define $P_{\min}(f)$ as the minimum blocking probability of the multicast subnetwork, where the number of available core switches is $x$. Thus we get

$$P_{\min}(f) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \left( \sum_{i=1}^{f} P(A_i) \right) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \left( \sum_{i=1}^{f} \prod_{j=1}^{x} p_{ij} \right), \quad (10)$$

where $x \le m$.

It is not difficult to see from (10) that the minimum blocking probability $P_{\min}(f)$ is increasing in the fanout $f$. In other words, it is more difficult to realize a multicast bandwidth request with a larger fanout, since the number of core switches is limited. Therefore, the minimum blocking probability with fanout $f$ reflects the state of the available links at the next time-slot.
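Equation (10) can be evaluated directly. The sketch below is ours, not the paper's code: it assumes $q = 1 - p$, uses made-up downlink probabilities $p_{ij}$, and illustrates the monotonicity in $f$ noted above.

```python
from math import comb, prod

def p_min(f, x, p, p_link):
    """Eq. (10): P_min(f) = sum_{k=1}^{x} C(x,k) p^k q^(x-k)
    * sum_{i=1}^{f} prod_{j=1}^{x} p_ij, with q = 1 - p.
    p_link[i][j] is a hypothetical blocking probability of downlink d_ij."""
    q = 1 - p
    inner = sum(prod(p_link[i][j] for j in range(x)) for i in range(f))
    outer = sum(comb(x, k) * p**k * q**(x - k) for k in range(1, x + 1))
    return outer * inner

# Made-up probabilities for up to f = 3 destinations and x = 2 core switches
p_link = [[0.09, 0.05], [0.04, 0.03], [0.06, 0.07]]

# P_min(f) grows with the fanout f, as noted below Eq. (10)
assert p_min(1, 2, 1/3, p_link) <= p_min(2, 2, 1/3, p_link) <= p_min(3, 2, 1/3, p_link)
```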

3.3. Link Blocking Probability at the Next Time-Slot. In this subsection, we calculate the blocking probability of an available link at the next time-slot based on Markov chain theory. We randomly select a link, denoted as the $i$th link, for the analysis.

In the multicast blocking model, we denote the current time-slot as $t$ and the next time-slot as $t+1$. $b_i$ is the occupied bandwidth of the $i$th link at time-slot $t$, that is, $y_i(t) = b_i$; $a(t)$ is the sum of the occupied bandwidth of all available downlinks at time-slot $t$, namely, $a(t) = \sum_{j=1}^{x} y_j(t)$; and $y_i(t+1)$ refers to the predicted occupied bandwidth of the $i$th link at time-slot $t+1$. In [15], a preference or uniform selection mechanism based on Markov chains is adopted for calculating the link blocking probability at the next time-slot. Based on this mechanism, the probability $P_i$ that the link receives a new flow at time-slot $t+1$ can be given by

$$P_i = P_{\min}(f) \cdot \frac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \frac{1}{f}, \quad (11)$$

where $1 \le f \le r$.

In addition, we do not consider the case that the bandwidth of an available link is decreasing; namely, the bandwidth of an available link is sufficient for the multicast bandwidth request. If a multicast bandwidth request selects the $i$th link at time-slot $t+1$, then $y_i(t+1)$ will increase by $\pi_i$, where $1 \le \pi_i \le M_i$ and $M_i$ is defined as the maximum number of added data flows. Then we let $P_{b_i}$ denote the probability that the flow of the $i$th link remains unchanged or increases at time-slot $t+1$; thus we can get

$$P_{b_i} = \sum_{\pi_i=0}^{M_i} \left[ P_{\min}(f) \cdot \frac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \frac{1}{f} \right]^{\pi_i} = \sum_{\pi_i=0}^{M_i} P_i^{\pi_i} = \frac{1 - P_i^{M_i+1}}{1 - P_i}, \quad (12)$$

where $i = 1, 2, \ldots, x$ and $\pi_i = 0, 1, \ldots, M_i$.

According to (12), we can calculate the one-step transition probability of a multicast flow, denoted as $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$, which is a Markov process:

$$P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \frac{1}{P_{b_i}} \cdot \left( P_{\min}(f) \cdot \frac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \frac{1}{f} \right)^{\pi_i} = \frac{(1 - P_i) \cdot P_i^{\pi_i}}{1 - P_i^{M_i+1}}, \quad (13)$$

where $i = 1, 2, \ldots, x$.

In fact, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ indicates the link blocking probability at time-slot $t+1$, which is determined by $P_i$ and $\pi_i$. The link blocking probability will be small when $\pi_i$ is small at time-slot $t+1$; otherwise, the link may be blocked at time-slot $t+1$. Therefore, the range of $\pi_i$ is very important to our proposed multicast scheduling algorithm. In this paper, we assume that the multicast bandwidth request $\omega$ is one data flow unit and $\pi_i$ is an integral multiple of the multicast bandwidth request $\omega$.
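Equations (11)–(13) translate directly into code. The sketch below uses hypothetical state values; the final check verifies that the one-step transition probabilities of (13) form a proper distribution over $\pi_i = 0, \ldots, M_i$, since the normalizer $P_{b_i}$ of (12) is exactly the geometric partial sum of the $P_i^{\pi_i}$ weights.

```python
def incoming_flow_prob(p_min_f, b_i, a_t, f):
    """Eq. (11): P_i = P_min(f) * b_i / a(t) + (1 - P_min(f)) * 1/f."""
    return p_min_f * b_i / a_t + (1 - p_min_f) / f

def transition_prob(P_i, pi_i, M_i):
    """Eq. (13): P(y_i(t+1) = b_i + pi_i | y_i(t) = b_i)
    = (1 - P_i) * P_i**pi_i / (1 - P_i**(M_i + 1)),
    where the normalizer 1/P_{b_i} comes from Eq. (12)."""
    return (1 - P_i) * P_i**pi_i / (1 - P_i**(M_i + 1))

# Hypothetical state: P_min(f) = 0.2, b_i = 100 M out of a(t) = 400 M, fanout f = 3
P_i = incoming_flow_prob(0.2, 100, 400, 3)
M_i = 5
# The normalized geometric weights P_i**pi_i sum to 1 over pi_i = 0..M_i
assert abs(sum(transition_prob(P_i, k, M_i) for k in range(M_i + 1)) - 1.0) < 1e-12
```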


Input: incoming flow $(i, D, \omega)$; link remaining bandwidth $\mu$; the number of destination edge switches $|D|$; $\pi_i = 3\omega$
Output: multicast links with the minimum blocking probability
(1) Step 1: identify available core switches
(2) for $i = 1$ to $m$ do
(3)   Select an uplink $u_i$
(4)   if $u_i^\mu \ge 3\omega$ and $|T| \le |D|$ then
(5)     Select the core switch $i$ and add it into the set $T$
(6)   end if
(7) end for
(8) Step 2: select appropriate core switches
(9) Calculate the blocking probability of the available downlinks at time-slot $t+1$, $P_i(t+1)$, by equation (13)
(10) for $j = 1$ to $|D|$ do
(11)   Find the core switch(es) in $T$ that are connected to a destination edge switch in $D$
(12)   if there are multiple core switches found then
(13)     Select the core switch with the minimum blocking probability and deliver it to the set of appropriate core switches $T'$
(14)   else
(15)     Deliver the core switch to the set $T'$
(16)   end if
(17)   Remove from $D$ the destination edge switches that the selected core switch can reach
(18)   Update the set of remaining core switches in $T$
(19) end for
(20) Step 3: establish the optimal paths
(21) Connect the links between the source edge switch and the destination edge switches through the appropriate core switches in the set $T'$
(22) Send configuration signals to the corresponding devices in the multicast subnetwork

Algorithm 1: Multicast scheduling algorithm with Markov chains (MSaMC).
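A minimal executable sketch of the three steps of Algorithm 1 follows. This is our illustration, not the authors' implementation: the data structures, names, and numbers are hypothetical, and the predicted downlink probabilities are taken as given rather than computed from (13).

```python
def msamc(dests, w, up_bw, down_p):
    """Sketch of MSaMC for one request (i, D, w).
    up_bw[c]     : remaining bandwidth of the uplink to core switch c
    down_p[c][d] : predicted blocking probability of downlink c -> d at t+1
    Returns a {destination: core switch} mapping for the chosen paths."""
    # Step 1: available core switches whose uplink keeps at least 3w spare
    T = [c for c in up_bw if up_bw[c] >= 3 * w]
    # Step 2: per destination, pick the reachable core switch whose downlink
    # has the minimum predicted blocking probability at the next time-slot
    paths = {}
    for d in sorted(dests):
        candidates = [c for c in T if d in down_p[c]]
        paths[d] = min(candidates, key=lambda c: down_p[c][d])
    # Step 3: the caller would now signal source -> core -> destination devices
    return paths

# Hypothetical bandwidths (M) and next-slot blocking probabilities
up_bw = {1: 100, 2: 400, 3: 600}
down_p = {1: {2: 0.08, 3: 0.02}, 2: {2: 0.05, 3: 0.06}, 3: {2: 0.03, 3: 0.04}}
print(msamc({2, 3}, w=50, up_bw=up_bw, down_p=down_p))  # -> {2: 3, 3: 3}
```

Core switch 1 is filtered out in Step 1 because its uplink keeps only 100 M, less than $3\omega = 150$ M.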

4. Multicast Scheduling Algorithm with Markov Chains

In this section, we propose a multicast scheduling algorithm with Markov chains (MSaMC) in fat-tree DCNs, which aims to minimize the blocking probability of the available links and improve the traffic efficiency of data flows in the multicast network. Then we give a simple example to explain the implementation process of MSaMC.

4.1. Description of the MSaMC. The core of MSaMC is to select the downlinks with the minimum blocking probability at time-slot $t+1$. Accordingly, the first step of the algorithm is to find the available core switches, denoted as the set $T$, $|T| \le f$. We take the remaining bandwidth of the $i$th uplink as $u_i^\mu$. Based on our theoretical analysis in Section 5, the multicast subnetwork may be blocked if this is less than $3\omega$; hence we require $u_i^\mu \ge 3\omega$.

The second step is to choose, in each iteration, the appropriate core switch that is connected to the downlink with the minimum blocking probability at time-slot $t+1$. At the end of each iteration, we transfer the chosen core switch from the set $T$ to the set $T'$. The iteration terminates when the set of destination edge switches $D$ is empty. Obviously, the core switches in the set $T'$ are connected to the downlinks with the minimum blocking probability, and the set $T'$ can satisfy an arbitrary multicast flow request in fat-tree DCNs [5].

Based on the above steps, we obtain a set of appropriate core switches $T'$. Moreover, each destination edge switch in $D$ can find one downlink from the set $T'$ to be connected with the minimum blocking probability at

Table 1: Link remaining bandwidth (M).

     C1   C2   C3   C4
E1   90   300  600  800
E2   600  700  800  200
E3   750  400  350  700
E4   500  200  150  500

time-slot $t+1$. The third step is to establish the optimal paths from the source edge switch to the destination edge switches through the appropriate core switches. The state of the multicast subnetwork will be updated after the source server sends the configuration signals to the corresponding forwarding devices. The main process of the MSaMC is described in Algorithm 1.

4.2. An Example of the MSaMC. For the purpose of illustration, in the following we give a scheduling example in a simple fat-tree DCN, as shown in Figure 5. Assume that we have obtained the network state at time-slot $t$ and made a multicast flow request $(1, \{2, 3, 4\}, 50\,\mathrm{M})$. The link remaining bandwidth $\mu$ and the link blocking probability $P$ at the next time-slot are shown in Tables 1 and 2, respectively. The symbol ✓ denotes an available uplink, and × indicates an unavailable link. For clarity, we select only two layers of the network and give the relevant links in each step.

As described in Section 4.1, the MSaMC is implemented in three steps. Firstly, we take the remaining bandwidth of the uplink as μ^u (μ_i^u ≥ 3 × 50 M) and find the set of available core switches, that is, T = {2, 3, 4}. Secondly, we evaluate the blocking probability of the relevant downlinks at time-slot t + 1.

Complexity 7

Table 2: The link blocking probability at the next time-slot (%).

      C1   C2   C3   C4
E1    ×    9    5    4
E2    ×    4    3    7
E3    ×    6    7    4
E4    ×    9   10    5

[Figure 5: An example of the MSaMC. (a) The links satisfying the multicast flow request (1, (2, 3, 4), ω); (b) the selected optimal paths by the MSaMC. Each panel shows core switches 1-4 above edge switches 1-4.]

In effect, the blocking probability at time-slot t + 1 of the downlink from core switch 2 to destination switch 2 is higher than that of the downlink from core switch 3 to destination switch 2; therefore we select the latter downlink for the optimal path. Subsequently, core switch 3 is put into the set T′. Similarly, we add core switch 4 to the set T′. Finally, the optimal paths are constructed, and the routing information is sent to source edge switch 1 and core switches 3 and 4.

In Figure 5(a), the remaining bandwidth of the link from edge switch 1 to core switch 1 is less than 150 M, so core switch 1 is excluded. In this way, we find the optimal path for each pair of source and destination edge switches: source edge switch 1 → core switch 3 → destination edge switch 2; source edge switch 1 → core switch 4 → destination edge switch 3; and source edge switch 1 → core switch 4 → destination edge switch 4, as shown in Figure 5(b).
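The first two steps of this example can be sketched in Python. The helper name `msamc_select` is illustrative (it is not the paper's Algorithm 1), and the numeric values are copied from Tables 1 and 2:

```python
def msamc_select(uplink_bw, down_block, dests, omega):
    """Sketch of the first two MSaMC steps for one multicast request."""
    # Step 1: available core switches T -- uplink remaining bandwidth >= 3*omega.
    T = {c for c, bw in uplink_bw.items() if bw >= 3 * omega}
    # Step 2: for each destination edge switch, pick the core in T whose
    # downlink has the minimum predicted blocking probability at t + 1.
    return T, {d: min(T, key=lambda c: down_block[c][d]) for d in dests}

# Values from Tables 1 and 2 for source edge switch 1, request (1, (2, 3, 4), 50 M).
uplink_bw = {1: 90, 2: 300, 3: 600, 4: 800}   # E1 row of Table 1
down_block = {2: {2: 4, 3: 6, 4: 9},          # C2 column of Table 2 (%)
              3: {2: 3, 3: 7, 4: 10},         # C3 column
              4: {2: 7, 3: 4, 4: 5}}          # C4 column
T, paths = msamc_select(uplink_bw, down_block, dests=(2, 3, 4), omega=50)
print(T)      # {2, 3, 4}: core switch 1 is excluded since 90 < 150
print(paths)  # {2: 3, 3: 4, 4: 4}: cores 3 and 4 carry the multicast
```

The output matches the text: destination 2 is reached through core switch 3, and destinations 3 and 4 through core switch 4.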

5. Theoretical Analysis

In this section, we analyze the performance of MSaMC. By (9), we derive the blocking probability bound of the multicast subnetwork, as shown in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than 1/3.

Proof. By the first step of Algorithm 1, we take the remaining bandwidth of the uplink to be no less than 3ω; thus the maximum value of the link blocking probability p is 1/3. In other words, the available link remaining bandwidth just satisfies the above condition when μ^u = 3ω.

From (9) and De Morgan's laws [16], we can obtain the probability of the event ε′:

$$P_{\min}(\varepsilon') = 1 - \prod_{i=1}^{f} P\left(\overline{d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix}}\right) = 1 - \prod_{i=1}^{f}\left(1 - P\left(d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix}\right)\right) = 1 - \prod_{i=1}^{f}\left(1 - \prod_{k=1}^{x} p_{d_{ik}}\right) = 1 - \left(1 - p^{x}\right)^{f}. \tag{14}$$

Therefore, based on (10), the subnetwork blocking probability is maximum when the number of uplinks is 1. Thus we can obtain

$$\max P_{\min}(f) = p \cdot \left(1 - \left(1 - p^{x_{\min}}\right)^{f}\right) = \frac{1}{3}\left(1 - \left(1 - \frac{1}{3}\right)^{f}\right). \tag{15}$$

Then we have max P_min(f) → 1/3 as f → ∞. This completes the proof.
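The limit in Lemma 1 is easy to check numerically; the snippet below evaluates (15) for growing f (a sanity check, not part of the proof):

```python
# max P_min(f) = (1/3) * (1 - (2/3)**f) from (15); it approaches 1/3 from below.
def max_p_min(f):
    p = 1.0 / 3.0  # worst-case link blocking probability from Lemma 1
    return p * (1.0 - (1.0 - p) ** f)

values = [max_p_min(f) for f in (1, 2, 10, 50)]
assert all(v < 1.0 / 3.0 for v in values)              # always below the 1/3 bound
assert all(a < b for a, b in zip(values, values[1:]))  # increasing in f
assert abs(max_p_min(50) - 1.0 / 3.0) < 1e-8           # converges to 1/3
```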

The result of Lemma 1 is not related to the number of ports of the switches. This is because the deduction of Lemma 1 is based on the link blocking probability p, where p = ω/μ. However, the multicast bandwidth ω and the link remaining bandwidth μ are not affected by the number of switch ports. Therefore, Lemma 1 still holds when the edge switches have more ports. Moreover, the size of the switch radix has no effect on the performance of MSaMC.

At time-slot t + 1, the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee the efficient transmission of multicast flows. Based on (7) and Lemma 1, we get max P_i = 1/3 when the numbers of uplinks and downlinks are both equal to 2. Clearly, this condition corresponds to the simplest multicast transmission model; in a real multicast network, P_i ≪ 1/3 is the general condition.

In addition, P_i is proportional to P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i); namely, the link blocking probability will increase as the multicast flow gets larger. Therefore, P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) is monotonically increasing in p_i.

Theorem 2. If the remaining bandwidth of an available link μ is no less than 3ω, the multicast flow can be transferred to f destination edge switches.

Proof. For each incoming flow, we adopt the preferred selection mechanism in selecting the i-th link. When π_i ≥ 1,


we compute the first-order derivative of (13) with respect to p_i, where i = 1, 2, …, x:

$$\frac{\partial}{\partial p_i} P\left(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i\right) = -\frac{P_i^{\pi_i}}{1 - P_i^{M_i+1}} + \frac{\pi_i \cdot \left(1 - P_i\right) \cdot P_i^{\pi_i}}{p_i \cdot \left(1 - P_i^{M_i+1}\right)} + \frac{\left(M_i + 1\right) \cdot \left(1 - P_i\right) \cdot P_i^{\pi_i} \cdot P_i^{M_i+1}}{p_i \cdot \left(1 - P_i^{M_i+1}\right)^{2}}. \tag{16}$$

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when π_i ≥ 3; hence we obtain ∂P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)/∂p_i > 0. Therefore, P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) is a monotonically increasing function of p_i when π_i ≥ 3. The multicast flow request ω is defined as one data unit, so evidently π_i ≥ 3ω. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request ω at time-slot t + 1 if μ ≥ 3ω. This completes the proof.
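Under the truncated-geometric transition law implied by (16) and (17) (an assumption made explicit here, with P_i abbreviated to P), the monotonicity claimed in Theorem 2 can be verified by a finite-difference sweep:

```python
# Transition probability implied by (16)-(17): a truncated geometric law
#   P(y(t+1) = b + pi | y(t) = b) = (1 - P) * P**pi / (1 - P**(M + 1)).
def trans_prob(P, pi, M):
    return (1.0 - P) * P ** pi / (1.0 - P ** (M + 1))

# Finite-difference check that the probability grows with P when pi >= 3,
# over the operating range P < 1/3 used in the paper.
M = 8
for pi in (3, 4, 5):
    probs = [trans_prob(P / 100.0, pi, M) for P in range(1, 33)]
    assert all(a < b for a, b in zip(probs, probs[1:]))  # strictly increasing
```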

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition μ ≥ 3ω not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, the downlink may receive data flows from other uplinks at any time-slot, which results in uncertainty of the downlink state at time-slot t + 1. Therefore, we take the minimum blocking probability at time-slot t + 1 as the selection target for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot t + 1. We therefore derive the expectation of the data flow on the i-th downlink, which connects to the j-th destination edge switch, at time-slot t + 1, denoted by e_i(t, b_i), j = 1, 2, …, f. Given that the data flow in the i-th downlink is b_i, we can obtain

$$e_i(t, b_i) = \sum_{\pi_i=0}^{M_i} \left(b_i + \pi_i\right) \cdot P\left(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i\right) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i}, \tag{17}$$

where P_{b_i} = (1 − P_i^{M_i+1})/(1 − P_i), i = 1, 2, …, x.

By (17), we conclude the following theorem, which explains the average increase rate of data flow at each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink is no more than two units on average at time-slot t + 1.

Proof. We consider Σ_{π_i=0}^{M_i} P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) = 1, which means that the flow increment of each link must be one element of the set {0, 1, …, M_i}.

Setting A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i} = P_i + Σ_{π_i=2}^{M_i} π_i · P_i^{π_i}, we get P_i · A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i+1} = Σ_{π_i=2}^{M_i} (π_i − 1) · P_i^{π_i} + M_i · P_i^{M_i+1}. Through the subtraction of the above two equations, we obtain (1 − P_i) · A = P_i + Σ_{π_i=2}^{M_i} P_i^{π_i} − M_i · P_i^{M_i+1}. Then we have A = (P_i − M_i · P_i^{M_i+1})/(1 − P_i) + (P_i^2 − P_i^{M_i+1})/(1 − P_i)^2. Substituting this into (17), we obtain

$$e_i(t, b_i) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = b_i + \frac{A}{P_{b_i}} = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^{2} - P_i^{M_i+1}}{\left(1 - P_i\right)\left(1 - P_i^{M_i+1}\right)}, \tag{18}$$

where P_i < 1/3. By relaxing the latter two terms of (18), e_i(t, b_i) can be bounded as

$$e_i(t, b_i) = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^{2} - P_i^{M_i+1}}{\left(1 - P_i\right)\left(1 - P_i^{M_i+1}\right)} < b_i + 2, \tag{19}$$

where i = 1, 2, …, x.

By merging (17) and (19), we have b_i < e_i(t, b_i) < b_i + 2, and then 1 < e_i(t, b_i) − b_i + 1 < 3. Hence, the downlink bandwidth will increase by at least one unit of data flow when the downlink is blocked.
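The closed form (18) and the bound of Theorem 3 can be cross-checked against the direct summation in (17); `e_direct` and `e_closed` are illustrative names, and the transition law is the truncated geometric form used above:

```python
# e_i(t, b_i) from (17) by direct summation, and the closed form from (18).
def e_direct(b, P, M):
    Pb = (1.0 - P ** (M + 1)) / (1.0 - P)  # normalizer P_{b_i}
    return b + sum(pi * P ** pi for pi in range(1, M + 1)) / Pb

def e_closed(b, P, M):
    return (b + (P - M * P ** (M + 1)) / (1.0 - P ** (M + 1))
              + (P ** 2 - P ** (M + 1)) / ((1.0 - P) * (1.0 - P ** (M + 1))))

for M in (2, 4, 8, 16):
    for P in (0.05, 0.1, 0.2, 1.0 / 3.0):
        d, c = e_direct(5, P, M), e_closed(5, P, M)
        assert abs(d - c) < 1e-12  # (18) agrees with the sum in (17)
        assert 5 < d < 5 + 2       # the bound of Theorem 3: b < e < b + 2
```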

When M_i < e_i(t, b_i) − b_i + 1, the number of increased data flows would be larger than M_i, which is not allowed by the definition of M_i; thus we obtain

$$P\left(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i\right) = 0. \tag{20}$$

When M_i ≥ e_i(t, b_i) − b_i + 1, we get

$$P\left(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i\right) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} P\left(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i\right) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} \frac{1}{P_{b_i}} \cdot P_i^{\pi_i} = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}}. \tag{21}$$

Equation (21) represents the downlink traffic capability at time-slot t + 1: when the value of (21) is large, the blocking probability of the downlink is high, and vice versa. To clarify that the downlink has a low blocking probability at the next time-slot, we give the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot t + 1 is less than 0.125.

[Figure 6: Downlink blocking probability (%) versus P_i (%) for M_i = 2, 4, 8, 16; the curves pass near a common zero point.]

Proof. Based on (21) and the requirement M_i ≥ e_i(t, b_i) − b_i + 1, we take the minimum admissible value of M_i, namely M_i = 3. Thus we get

$$P\left(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i\right) = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}} < \frac{(1/3)^{3} - (1/3)^{3+1}}{1 - (1/3)^{3+1}} = 0.025 < 0.125. \tag{22}$$

This completes the proof.

To show that the MSaMC yields a low downlink blocking probability at time-slot t + 1 under different values of M_i, we provide the comparison shown in Figure 6.

In Figure 6, P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i) indicates the downlink blocking probability, and its values are no more than 0.125 for the different M_i and P_i. Near the zero point, the blocking probability is close to zero unless P_i > 0.1. In a real network, the condition P_i > 0.1 rarely occurs. Therefore, the MSaMC has a very low blocking probability.
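The behavior shown in Figure 6 can be reproduced from (21); the sweep below assumes e_i(t, b_i) − b_i + 1 = 3, as in the proof of Theorem 4:

```python
# Tail probability from (21): P(y(t+1) > e | y(t) = b)
#   = (P**s - P**(M+1)) / (1 - P**(M+1)),  with s = e - b + 1.
def tail_prob(P, M, s=3):
    return (P ** s - P ** (M + 1)) / (1.0 - P ** (M + 1))

# Sweep M and P as in Figure 6: the tail stays under the 0.125 bound of
# Theorem 4 and is close to zero while P <= 0.1.
for M in (2, 4, 8, 16):
    for P in (0.01, 0.05, 0.1, 0.2, 0.3, 1.0 / 3.0):
        q = tail_prob(P, M)
        assert q < 0.125          # the bound of Theorem 4
        if P <= 0.1:
            assert q < 0.002      # near zero in the common operating range
```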

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes O(m) time to identify the available core switches. In the second step, MSaMC needs to find the appropriate core switches: it takes O(f · f) time to calculate the blocking probability of the available downlinks at time-slot t + 1 and to select the appropriate core switches into the set T′, where f ≤ r − 1. In the end, it takes O(f + f) time to construct the optimal paths from the source edge switch to the destination edge switches. Thus, the computational complexity of MSaMC is given by

$$O\left(m + f \cdot f + f + f\right) \le O\left(m + (r-1)^{2} + 2(r-1)\right) = O\left(r^{2} + m - 1\right). \tag{23}$$

Note that the complexity of the algorithm is polynomial in the number of core switches m and the number of edge switches r, which means that the computational complexity is rather low if the fanout f is small. Therefore, the algorithm is time-efficient in multicast scheduling.

Table 3: Parameter settings.

Parameter                  Value
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s

6. Simulation Results

In this section, we utilize the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links over different time-slots. Afterwards, we compare the performance of MSaMC with the SLMR algorithm under unicast traffic [4] and with the BCMS algorithm under multicast traffic [5].

6.1. Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are set in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed among all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the size of each multicast flow is equal [17, 18].

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) V is defined as the average, over the available links of a multicast subnetwork, of the summed transmission delay differences of adjacent packets; that is,

$$V = \frac{\sum_{i \in x} \sum_{j \in l} \left(T(t)_{ij} - T(t-1)_{ij}\right)}{x}, \tag{24}$$

where x is the number of available links, l is the number of packets in an available link, and T(t) indicates the transmission delay of a packet at time-slot t.

We take ADV as a metric for the network state of the multicast subnetwork: the smaller the ADV, the more stable the network state, and vice versa.
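A minimal sketch of computing the ADV of (24), assuming per-link lists of packet delays for two consecutive time-slots (the data below are invented for illustration):

```python
# ADV per (24): average over links of the summed per-packet delay differences
# between consecutive time-slots. delays[i][j] is the delay of packet j on
# link i at the current slot; prev holds the previous slot's delays.
def adv(delays, prev):
    x = len(delays)  # number of available links
    total = sum(cur_j - prev_j
                for cur, pre in zip(delays, prev)
                for cur_j, prev_j in zip(cur, pre))
    return total / x

prev = [[1.0, 1.1], [0.9, 1.0]]
cur  = [[1.2, 1.3], [0.9, 1.1]]
print(adv(cur, prev))  # (0.2 + 0.2 + 0.0 + 0.1) / 2, i.e. about 0.25
```

A value near zero means per-packet delays are stable between slots, matching the interpretation used in Figure 7.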

Figure 7 shows the average delay variance (ADV) of links as the number of packets grows. When the link remaining bandwidth μ is taken as ω or 2ω, the average delay variance shows large jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request ω at time-slot t + 1. The average delay variance is close to a straight line when the link remaining bandwidth is 3ω, which implies that the network state is very stable. Therefore, the simulation results indicate that the optimal value of the link remaining bandwidth μ is 3ω.

[Figure 7: Average delay variance (ADV) versus the number of packets for links with remaining bandwidth μ = ω, 2ω, and 3ω.]

[Figure 8: Average delay variance (ADV) versus the number of packets for the uplink and the downlink.]

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a full-bisection network; that is, the bandwidth of the uplink and downlink is equal. However, the downlink load is higher than the uplink load under multicast traffic; therefore the uplink state is more stable.

6.3. Total Network Throughput. In this subsection, we set the length of the time-slot t to ω/S and 2(ω/S). We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of the time-slot t is 2(ω/S). This is because MSaMC can quickly recover from network blocking, and thus it can achieve higher network throughput. In contrast, MSaMC cannot calculate the optimal path in real time when the length of the time-slot t is ω/S; in that case the SLMR algorithm provides higher throughput.

Figure 9(b) shows the throughput comparison of the MSaMC and BCMS algorithms under a mixed scheduling pattern. The throughput of the BCMS algorithm becomes lower as the simulation time increases. The multicast transmission of the BCMS algorithm needs a longer time to address network blocking; therefore the throughput decreases sharply if network blocking cannot be predicted. In contrast, the MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, the MSaMC obtains higher total network throughput.

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm with unicast traffic, and the BCMS algorithm with mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases, the MSaMC with t = 2(ω/S) has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms rely on more backtracking to eliminate multicast blocking; therefore they take more time to forward data flows to the destination edge switches. In addition, we find that when the length of the time-slot is 2(ω/S), our MSaMC has the minimum average delay. This is because a time-slot of length 2(ω/S) can just ensure that data are transmitted accurately to the destination switches. A shorter time-slot (less than 2(ω/S)) leads to incomplete data transmission, while a longer time-slot (more than 2(ω/S)) causes incorrect prediction of the traffic blocking status.

7. Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, for fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to improve network performance. We show that MSaMC guarantees low link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of MSaMC is determined by the number of core switches m and the number of destination edge switches f. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm called SLMR and a well-known adaptive multicast scheduling algorithm called BCMS. Experimental results show that MSaMC can achieve higher network throughput and lower average delay.

[Figure 9: Network throughput (Gb/s) versus simulation time (s): (a) SLMR compared with MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S); (b) BCMS compared with MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S).]

[Figure 10: Average delay versus simulation time (s): (a) SLMR compared with MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S); (b) BCMS compared with MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S).]

Notations

ω: Multicast bandwidth request of a data flow
b_i: The occupied bandwidth of the i-th link
μ: The remaining bandwidth of a link
a: The sum of occupied bandwidth
y: The value of the link weight
S: Link bandwidth
M: The maximum increase in the number of data flows
π: The increase in the number of data flows
T: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013–3028, 2016.

[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29–43, 2003.

[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536–541, USA, April 2014.

[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses - Gradworks, 2014.

[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761–765, 2011.

[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1–7, Shenzhen, China, August 2017.

[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711–1714, 2012.

[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681–699, 2006.

[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications (New York: John Wiley & Sons, 1979, 446 pp.)," Behavioural Science, vol. 26, no. 4, pp. 397–398, 1981.

[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173–178, Da Nang, Vietnam, January 2017.

[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56–60, China, April 2011.

[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class Bloom filter," in Proceedings of the 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266–275, Canada, October 2011.

[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898–904, 2012.

[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Review of Modern Physics, vol. 26, no. 1, 2002.

[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442–454, 1995.

[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with SIGCOMM 2009, pp. 65–72, Spain, August 2009.

[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6–16, 2003.



[Figure 1: The relationship between network blocking rate (NBR, %) and scheduling algorithm fault rate (SAFR, %) in different DCNs (fat-tree, DCell, BCube).]

process is finished. This may lead to network load imbalance because the bandwidth of a multicast connection has not been allocated dynamically [7]. Therefore, we develop an efficient multicast scheduling algorithm to achieve the scheduling of network flows based on the network state at the next time-slot in fat-tree DCNs.

However, since the network state at the next time-slot is probabilistic rather than deterministic, it is difficult to predict it from the present state with certainty and to find a deterministic strategy. Markov chains can be employed to predict the network state even though the state transition is probabilistic [8]. Thus, the next network states can be assessed by a set of probabilities in a Markov process [9], and the evolution of this set of probabilities essentially describes the underlying dynamical nature of a network [10]. In [11], the authors proposed a scheme using Markov approximation, which aims at minimizing the maximum link utilization (i.e., the link utilization of the most blocked link) in data center networks. Moreover, the scheme provides two strategies that construct Markov chains with different connection relationships. The first strategy directly applies Markov approximation to data center traffic engineering; the second is a local search algorithm that modifies the Markov approximation.
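The one-step prediction idea can be illustrated with a toy three-state link-load chain; the transition matrix below is invented for illustration and is not taken from the paper:

```python
# One-step Markov prediction: the next-slot state distribution is the
# current distribution multiplied by the transition matrix. The 3-state
# chain below (link lightly/moderately/heavily loaded) is a toy example.
def step(dist, P):
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.7, 0.2, 0.1],   # each row sums to 1: transition probabilities
     [0.3, 0.5, 0.2],
     [0.1, 0.3, 0.6]]
dist = [1.0, 0.0, 0.0]  # the link is currently lightly loaded
nxt = step(dist, P)
print(nxt)  # [0.7, 0.2, 0.1]
```

Scheduling decisions at time-slot t can then be based on `nxt` rather than on the current, soon-to-be-stale state, which is the core idea exploited by MSaMC.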

In this paper, we adopt Markov chains to deduce the link blocking probability at the next time-slot and take it as the link weight in the multicast blocking model in fat-tree DCNs. Therefore, the available links are selected based on the network state at the next time-slot, and the optimal downlinks are selected by the link weight. In the downlink selection, we compare the blocking probabilities and choose the downlinks with the lowest blocking probability at the next time-slot, which avoids MSaMC failure due to delay error. In particular, we find that the remaining bandwidth of the selected uplinks should be close to and greater than three times the multicast bandwidth request, which reduces the algorithm execution time and saves bandwidth consumption. Theoretical analysis shows the correctness of the strategy, while simulation results show that MSaMC can achieve higher network throughput and lower average delay.

The contributions of the paper can be summarized as follows:

(i) We analyze why multicast blocking occurs in practical applications. Afterwards, we present a novel way of multicast transmission forecasting and the multicast blocking model in fat-tree DCNs.

(ii) We put forward a multicast scheduling algorithm (MSaMC) to select the optimal uplinks and downlinks. MSaMC not only ensures lower network blocking but also maximizes the utilization of network bandwidth resources.

(iii) Theoretical analysis shows that the link blocking probability under our proposed MSaMC algorithm is less than 1/3, and that the multicast network can be nonblocking if the link blocking probability is less than 0.1.

The rest of the paper is organized as follows. Section 2 describes the detrimental effects of multicast blocking in fat-tree DCNs. Section 3 establishes the multicast blocking probability model in fat-tree DCNs and deduces the link blocking probability at the next time-slot based on Markov chains. In Section 4, we propose the multicast scheduling algorithm with Markov chains (MSaMC), and we analyze the complexity of the MSaMC algorithm in Section 5. In Section 6, we evaluate the performance of MSaMC by simulation results. Finally, Section 7 concludes this paper.

2. Cause of Multicast Blocking

A fat-tree DCN, as shown in Figure 2, is represented as a triple f(m, n, r), where m and r denote the numbers of core switches and edge switches, respectively, and n indicates the number of servers connecting to an edge switch. In fat-tree DCNs, all links are bidirectional and have the same capacity. We define the uplink as the link from an edge switch to a core switch and the downlink as the link from a core switch to an edge switch. A multicast flow request ω can be abstracted as a triple (i, D, ω), where i ∈ {1, 2, …, r} is the source edge switch and D denotes the set of destination edge switches of the multicast flow request ω. The number of destination edge switches with multicast flow request ω is represented as |D|, |D| ≤ r − 1, which is denoted as the fanout f. Note that the servers connecting to the same edge switch can freely communicate with each other, and the intra-edge-switch traffic can be ignored. Hence, the aggregation and edge layers can together be seen as a single edge layer.
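The request triple can be captured by a small validity check; `valid_request` is an illustrative helper, not from the paper:

```python
# A fat-tree DCN is the triple f(m, n, r): m core switches, r edge
# switches, n servers per edge switch; a request is (src, D, omega).
def valid_request(src, D, r):
    # The source is one of the r edge switches; destinations exclude the
    # source, so the fanout |D| is at most r - 1.
    return (1 <= src <= r and src not in D
            and all(1 <= d <= r for d in D) and len(D) <= r - 1)

assert valid_request(1, {2, 3, 4}, r=8)   # the request style used in Section 4.2
assert not valid_request(1, {1, 2}, r=8)  # the source cannot be a destination
```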

To illustrate the disadvantages of multicast blocking infat-treeDCNs a simple traffic pattern in a small fat-treeDCNis depicted in Figure 3 Suppose that there are two multicastflow requests 1205961 and 1205962 and every flow request looks foravailable links by identical scheduling algorithm Both flow1205961 and flow 1205962 have a source server and two destinationservers located at different edge switches and the sum of both

Complexity 3

Figure 2: The topology of fat-tree DCNs (core switches $1 \cdots m$, aggregation switches $1 \cdots r$, edge switches $1 \cdots r$).

Figure 3: The cause of multicast blocking (core switches 1-2, edge switches 1-3, flows $\omega_1$ and $\omega_2$).

is greater than the available link bandwidth. In particular, flow $\omega_1$ and flow $\omega_2$ forward through core switch 1 at the same time and are routed from core switch 1 to edge switch 2 through the same link by the scheduling algorithm, which will cause heavy blocking at the links connected to core switch 1. Therefore, the available bandwidth of each flow will suffer further reduction if the scheduler cannot identify heavy multicast blocking in the fat-tree DCNs.

Figure 3 also explains the main cause of multicast blocking. We can see that multicast blocking has occurred at the link between core switch 1 and edge switch 2. Clearly, before the blocking at the link is alleviated, the other links cannot release their occupied bandwidth. This means that the links from edge switch 1 to core switch 1, from edge switch 1 to core switch 2, from core switch 2 to edge switch 3, and from edge switch 3 to core switch 1 are not released until the multicast blocking is alleviated. However, the fat-tree DCNs cannot tolerate such a long time to resolve the blocking because of the requirement for low latency.

In the fat-tree DCNs, different source servers may execute the scheduling algorithm at the same time, so that they may occupy the same link and multicast blocking will inevitably occur. Hence multicast blocking is a common phenomenon in DCN applications, and network performance is degraded accordingly. In addition, many servers act as hotspots of user access, which may cause many-to-one data flow transfer. In fact, the key reason for multicast blocking is that the network link state at the next time-slot is not considered. Several works have been proposed to solve network blocking in the transmission of multicast packets in DCNs [12, 13]. As data centers usually adopt commercial switches that cannot guarantee nonblocking operation, an efficient packet repairing scheme was proposed [12], which relies on unicast to retransmit dropped multicast packets caused by switch buffer overload or switching failure. Furthermore, a bloom filter scheme [13] was proposed to compress the multicast forwarding table in switches, which avoids multicast blocking in the data center network.

To the best of our knowledge, the existing multicast scheduling algorithms only considered the network state at the current time-slot in DCNs; thus the delay error between the algorithm execution time and the beginning of the data flow transmission will make the scheduling algorithm invalid. Based on this consideration, we focus on the study of multicast scheduling in the network state at the next time-slot based on Markov chains.

3 Model and Probability of Multicast Blocking

In this section, we first establish the multicast blocking model based on the topology of fat-tree DCNs. Then we deduce the blocking probability of the available downlinks at the next time-slot.

3.1. Multicast Subnetwork. A multicast bandwidth request corresponds to a multicast subnetwork in fat-tree DCNs, which consists of the available core switches and edge switches for the multicast bandwidth request. The multicast subnetwork in Figure 4 has $f$ destination edge switches, $x$ available core switches, and $n \times f$ servers, where $1 \le x \le m$. In the process of multicast connection, the link weight in the multicast subnetwork is defined as the blocking probability

4 Complexity

Figure 4: The multicast subnetwork (2 core switches, 4 edge switches, 12 servers, and the multicast flow).

at the next time-slot. Thus our goal is to obtain the link blocking probability for any type of multicast bandwidth request at the next time-slot.

It is known that the fat-tree DCN is a typical large-scale network in which many available links can satisfy a multicast connection request. When a link is available for a multicast bandwidth request $\omega$, the blocking probability of the link at the current time-slot is given by $p = \omega/\mu$, where $\mu$ is the remaining bandwidth of the link.

A multicast connection can be represented by its destination edge switches. Given a multicast bandwidth request $\omega$ with fanout $f$ ($1 \le f < r$), $P(f)$ denotes the blocking probability of this multicast connection. We denote the blocking of the available uplinks as the events $u_1, u_2, \ldots, u_x$ and the blocking of the available downlinks between the available core switches and the $k$th ($1 \le k \le f$) destination edge switch as the events $d_{k1}, d_{k2}, \ldots, d_{kx}$. All available links form a multicast tree rooted at the core switches that can satisfy the multicast connection in the multicast network. Other notations used in the paper are summarized in Notations.

3.2. Multicast Blocking Model. In the multicast subnetwork, we employ $\epsilon$ to express the event that a multicast connection request with fanout $f$ cannot be satisfied in the network shown in Figure 4. We do not consider the links whose remaining bandwidth is less than the multicast bandwidth request $\omega$, since such a link is not available when the multicast data flow $\omega$ goes through it. We let $P(\epsilon \mid \phi)$ be the conditional blocking probability given state $\phi$ and $P(\phi)$ be the probability of state $\phi$. Then the blocking probability of the subnetwork for a multicast connection is given by

$$P(f) = P(\epsilon) = \sum_{\phi} P(\phi) P(\epsilon \mid \phi). \tag{1}$$

For the event $\phi$, the data traffic of the uplinks does not interfere with each other; that is, the uplinks are independent. Therefore, for a state $\phi$ with $k$ blocked uplinks, we have $P(\phi) = p^k q^{m-k}$.

From the multicast blocking subnetwork in Figure 4, we can obtain the blocking property of the fat-tree DCNs; that is, the multicast bandwidth request $\omega$ from a source edge switch to distinct destination edge switches cannot be satisfied if and only if there is no available downlink connecting every destination edge switch.

We then take $\epsilon'$ to denote the event that the multicast bandwidth request $\omega$ with fanout $f$ cannot be satisfied given the available uplinks. Thus we get

$$P(\epsilon') = P(\epsilon \mid u_1, u_2, \ldots, u_x). \tag{2}$$

An available downlink $d_{ij}$, where $1 \le i \le f$ and $1 \le j \le x$, represents a link from a core switch to the $i$th destination edge switch. The event $\epsilon'$ can be expressed by the events $d_{ij}$ as follows:

$$\epsilon' = (d_{11} \cap d_{12} \cap \cdots \cap d_{1x}) \cup \cdots \cup (d_{f1} \cap d_{f2} \cap \cdots \cap d_{fx}). \tag{3}$$

Afterwards, we define the blocking of the downlinks connecting to each destination edge switch as the events $A = \{A_1, A_2, \ldots, A_f\}$; moreover, we have $A_1 = (d_{11} \cap d_{12} \cap \cdots \cap d_{1x})$. Thus we get

$$\epsilon' = \bigcup_{i=1}^{f} A_i. \tag{4}$$

Based on combinatorics, the inclusion-exclusion principle (also known as the sieve principle) relates the size of a union of sets to the sizes of their intersections. For the general case of the principle in [14], let $A_1, A_2, \ldots, A_f$ be finite sets. Then we have
$$\left| \bigcup_{i=1}^{f} A_i \right| = \sum_{i=1}^{f} |A_i| - \sum_{1 \le i < j \le f} |A_i \cap A_j| + \sum_{1 \le i < j < h \le f} |A_i \cap A_j \cap A_h| - \cdots + (-1)^{f-1} |A_1 \cap A_2 \cap \cdots \cap A_f|. \tag{5}$$


For the events $A_1, A_2, \ldots, A_f$ in a probability space $(\Omega, F, P)$, we can obtain the probability of the event $\epsilon'$:
$$P(\epsilon') = \sum_{i=1}^{f} P(A_i) - \sum_{1 \le i < j \le f} P(A_i \cap A_j) + \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) - \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f), \tag{6}$$

where $P(A_i)$ denotes the probability of the event $A_i$. Combining (1) and (2) with (6), the multicast blocking model for a multicast connection with fanout $f$ is given by

$$P(f) = \sum_{k=1}^{m} \binom{m}{k} p^k q^{m-k} \left( \sum_{i=1}^{f} P(A_i) - \sum_{1 \le i < j \le f} P(A_i \cap A_j) + \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) - \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f) \right). \tag{7}$$

From (6), since $\sum_{1 \le i < j \le f} P(A_i \cap A_j) \ge \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h)$, the following inequality can be derived:
$$\sum_{1 \le i < j \le f} P(A_i \cap A_j) - \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) + \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f) \ge 0. \tag{8}$$

Therefore, the minimum blocking probability of the event $\epsilon'$ is
$$P_{\min}(\epsilon') = \sum_{i=1}^{f} P(A_i), \tag{9}$$

where $P(A_1) = \prod_{k=1}^{x} p_{1k}$. Afterwards, we define $P_{\min}(f)$ as the minimum blocking probability of the multicast subnetwork, where the number of available core switches is $x$. Thus we get

$$P_{\min}(f) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \left( \sum_{i=1}^{f} P(A_i) \right) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \left( \sum_{i=1}^{f} \prod_{j=1}^{x} p_{ij} \right), \tag{10}$$

where $x \le m$. It is not difficult to see from (10) that the minimum blocking probability $P_{\min}(f)$ is increasing in the fanout $f$. In other words, it is more difficult to realize a multicast bandwidth request with a larger fanout, since fewer core switches are available. Therefore, the minimum blocking probability with fanout $f$ reflects the state of the available links at the next time-slot.

3.3. Link Blocking Probability at the Next Time-Slot. In this subsection, we calculate the blocking probability of an available link at the next time-slot based on Markov chain theory. We randomly select a link, denoted as the $i$th link, for the analysis.

In the multicast blocking model, we denote the current time-slot as $t$ and the next time-slot as $t+1$; $b_i$ is the occupied bandwidth of the $i$th link at time-slot $t$, that is, $y_i(t) = b_i$; $a(t)$ is the sum of the occupied bandwidth of all available downlinks at time-slot $t$, namely, $a(t) = \sum_{j=1}^{x} y_j(t)$; and $y_i(t+1)$ refers to the predicted occupied bandwidth of the $i$th link at time-slot $t+1$. In [15], a preference or uniform selection mechanism based on Markov chains is adopted for calculating the link blocking probability at the next time-slot. Based on this mechanism, the probability $P_i$ that the link carries a new incoming flow at time-slot $t+1$ can be given by

$$P_i = P_{\min}(f) \cdot \frac{b_i}{a(t)} + (1 - P_{\min}(f)) \cdot \frac{1}{f}, \tag{11}$$

where $1 \le f \le r$. In addition, we do not consider the case that the bandwidth of an available link is decreasing; namely, the bandwidth of an available link is sufficient for the multicast bandwidth request. If a multicast bandwidth request selects the $i$th link at time-slot $t+1$, then $y_i(t+1)$ will increase by $\pi_i$, where $1 \le \pi_i \le M_i$ and $M_i$ is defined as the maximum number of added data flows. We let $P_{b_i}$ denote the probability that the flow of the $i$th link remains unchanged or increases at time-slot $t+1$; thus we can get

$$P_{b_i} = \sum_{\pi_i=0}^{M_i} \left[ P_{\min}(f) \cdot \frac{b_i}{a(t)} + (1 - P_{\min}(f)) \cdot \frac{1}{f} \right]^{\pi_i} = \sum_{\pi_i=0}^{M_i} P_i^{\pi_i} = \frac{1 - P_i^{M_i+1}}{1 - P_i}, \tag{12}$$

where $i = 1, 2, \ldots, x$ and $\pi_i = 0, 1, \ldots, M_i$. According to (12), we calculate the one-step transition probability of a multicast flow, denoted as $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$, which is a Markov process:

$$P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \frac{1}{P_{b_i}} \cdot \left( P_{\min}(f) \cdot \frac{b_i}{a(t)} + (1 - P_{\min}(f)) \cdot \frac{1}{f} \right)^{\pi_i} = \frac{(1 - P_i) \cdot P_i^{\pi_i}}{1 - P_i^{M_i+1}}, \tag{13}$$

where $i = 1, 2, \ldots, x$. In fact, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ indicates the link blocking probability at time-slot $t+1$, which is determined by $P_i$ and $\pi_i$. The link blocking probability will be small when $\pi_i$ is small at time-slot $t+1$; otherwise, the link may be blocked at time-slot $t+1$. Therefore, the range of $\pi_i$ is very important to our proposed multicast scheduling algorithm. In this paper, we assume that the multicast bandwidth request $\omega$ is one data flow unit and $\pi_i$ is an integral multiple of the multicast bandwidth request $\omega$.


Input: Incoming flow $(i, D, \omega)$, link remaining bandwidth $\mu$, the number of destination edge switches $|D|$, $\pi_i = 3\omega$
Output: Multicast links with the minimum blocking probability
(1) Step 1: identify available core switches
(2) for $i = 1$ to $m$ do
(3)   Select an uplink $u_i$
(4)   if $u_i^\mu \ge 3\omega$ and $|T| \le |D|$ then
(5)     Select the core switch $i$ and add it into the set $T$
(6)   end if
(7) end for
(8) Step 2: select appropriate core switches
(9) Calculate the blocking probability of the available downlinks at time-slot $t+1$, $P_i(t+1)$, by equation (13)
(10) for $j = 1$ to $|D|$ do
(11)   Find the core switch(es) in $T$ that are connected to a destination edge switch in $D$
(12)   if there are multiple core switches found then
(13)     Select the core switch with the minimum blocking probability and deliver it to the set of appropriate core switches $T'$
(14)   else
(15)     Deliver the core switch to the set $T'$
(16)   end if
(17)   Remove from $D$ the destination edge switches that the selected core switch can reach
(18)   Update the set of remaining core switches in $T$
(19) end for
(20) Step 3: establish the optimal paths
(21) Connect the links between the source edge switch and the destination edge switches through the appropriate core switches in the set $T'$
(22) Send configuration signals to the corresponding devices in the multicast subnetwork

Algorithm 1 Multicast scheduling algorithm with Markov chains (MSaMC)

4 Multicast Scheduling Algorithm withMarkov Chains

In this section, we propose a multicast scheduling algorithm with Markov chains (MSaMC) in fat-tree DCNs, which aims to minimize the blocking probability of the available links and improve the traffic efficiency of data flows in the multicast network. We then give a simple example to explain the implementation process of MSaMC.

4.1. Description of the MSaMC. The core of MSaMC is to select the downlinks with the minimum blocking probability at time-slot $t+1$. Accordingly, the first step of the algorithm is to find the available core switches, denoted as the set $T$, $|T| \le f$. We denote the remaining bandwidth of the $i$th uplink as $u_i^\mu$. Based on our theoretical analysis in Section 5, the multicast subnetwork may be blocked if the remaining bandwidth is less than $3\omega$; hence we require $u_i^\mu \ge 3\omega$.

The second step is to choose, in each iteration, the appropriate core switch that is connected to the downlink with the minimum blocking probability at time-slot $t+1$. At the end of each iteration, we transfer the selected core switch from the set $T$ to the set $T'$. The iteration terminates when the set of destination edge switches $D$ is empty. Obviously, the core switches in the set $T'$ are connected to the downlinks with the minimum blocking probability, and the set $T'$ can satisfy an arbitrary multicast flow request in fat-tree DCNs [5].

Based on the above steps, we obtain the set of appropriate core switches $T'$. Moreover, each destination edge switch in $D$ can find one downlink from the set $T'$ to be connected with the minimum blocking probability at

Table 1: Link remaining bandwidth (M)

     C1    C2    C3    C4
E1   90    300   600   800
E2   600   700   800   200
E3   750   400   350   700
E4   500   200   150   500

time-slot $t+1$. The third step is to establish the optimal paths from the source edge switch to the destination edge switches through the appropriate core switches. The state of the multicast subnetwork is updated after the source server sends the configuration signals to the corresponding forwarding devices. The main process of the MSaMC is described in Algorithm 1.

4.2. An Example of the MSaMC. For the purpose of illustration, in the following we give a scheduling example in a simple fat-tree DCN, as shown in Figure 5. Assume that we have obtained the network state at time-slot $t$ and made a multicast flow request $(1, (2, 3, 4), 50\,\mathrm{M})$. The link remaining bandwidth $\mu$ and the link blocking probability $P$ at the next time-slot are shown in Tables 1 and 2, respectively. The symbol $\surd$ denotes an available uplink and $\times$ indicates an unavailable link. For clarity, we select only two layers of the network and give the relevant links in each step.

As described in Section 4.1, the MSaMC is implemented in three steps. Firstly, we take the remaining bandwidth of the uplink as $u^\mu$ ($u_i^\mu \ge 3 \times 50\,\mathrm{M}$) and find the set of available core switches, that is, $T = \{2, 3, 4\}$. Secondly, we evaluate the blocking probability of the relevant downlinks at time-slot $t+1$. In


Table 2: The link blocking probability at the next time-slot (%)

     C1   C2   C3   C4
E1   ×    9    5    4
E2   ×    4    3    7
E3   ×    6    7    4
E4   ×    9    10   5

Figure 5: An example of the MSaMC. (a) The links satisfying the multicast flow request $(1, (2, 3, 4), \omega)$; (b) the optimal paths selected by the MSaMC.

effect, the blocking probability at time-slot $t+1$ of the downlink from core switch 2 to destination switch 2 is higher than that of the downlink from core switch 3 to destination switch 2; therefore we select the latter downlink as the optimal path. Subsequently, core switch 3 is put into the set $T'$. Similarly, we add core switch 4 to the set $T'$. Finally, the optimal paths are constructed, and the routing information is sent to source edge switch 1 and core switches 3 and 4.

In Figure 5(a), the remaining bandwidth of the link from edge switch 1 to core switch 1 is less than $150\,\mathrm{M}$, so core switch 1 is excluded. By the above procedure, we find that the optimal paths for the pairs of source and destination edge switches are: source edge switch 1 $\rightarrow$ core switch 3 $\rightarrow$ destination edge switch 2; source edge switch 1 $\rightarrow$ core switch 4 $\rightarrow$ destination edge switch 3; and source edge switch 1 $\rightarrow$ core switch 4 $\rightarrow$ destination edge switch 4, as shown in Figure 5(b).
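The example can be reproduced with a short script that encodes Tables 1 and 2 and applies the first two steps of Algorithm 1; the $T'$ bookkeeping and the $|T| \le |D|$ check are omitted for brevity:

```python
# Remaining bandwidth (Table 1) and next-slot blocking probability in %
# (Table 2) for links between edge switches E1-E4 and core switches C1-C4;
# None marks an unavailable link.
bandwidth = {1: [90, 300, 600, 800], 2: [600, 700, 800, 200],
             3: [750, 400, 350, 700], 4: [500, 200, 150, 500]}
blocking  = {1: [None, 9, 5, 4], 2: [None, 4, 3, 7],
             3: [None, 6, 7, 4], 4: [None, 9, 10, 5]}

src, dests, omega = 1, [2, 3, 4], 50   # request (1, (2, 3, 4), 50 M)

# Step 1: keep cores whose uplink from the source has at least 3*omega left.
T = [c for c in range(4) if bandwidth[src][c] >= 3 * omega]
assert [c + 1 for c in T] == [2, 3, 4]

# Step 2: for each destination pick the available core with the minimum
# next-slot downlink blocking probability.
paths = {d: min(T, key=lambda c: blocking[d][c]) + 1 for d in dests}
print(paths)   # {2: 3, 3: 4, 4: 4} -> core 3 for E2, core 4 for E3 and E4
```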

5 Theoretical Analysis

In this section, we analyze the performance of MSaMC. By (9), we derive the blocking probability bound of the multicast subnetwork, as shown in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than $1/3$.

Proof. We take the remaining bandwidth of the uplink to be no less than $3\omega$ in the first step of Algorithm 1, and thus the maximum value of the link blocking probability $p$ is $1/3$; in other words, the worst case is when the available link remaining bandwidth just satisfies this condition, that is, $u^\mu = 3\omega$.

From (9) and De Morgan's laws [16], we can obtain the probability of the event $\epsilon'$:
$$P_{\min}(\epsilon') = 1 - \prod_{i=1}^{f} P\left(\overline{d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix}}\right) = 1 - \prod_{i=1}^{f} \left(1 - P(d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix})\right) = 1 - \prod_{i=1}^{f} \left(1 - \prod_{k=1}^{x} p_{d_{ik}}\right) = 1 - (1 - p^x)^f. \tag{14}$$

Therefore, based on (10), the subnetwork blocking probability is maximal when the number of uplinks is 1. Thus we can obtain
$$\max P_{\min}(f) = p \cdot \left(1 - (1 - p^{x_{\min}})^f\right) = \frac{1}{3} \left(1 - \left(1 - \frac{1}{3}\right)^f\right). \tag{15}$$
Then we have $\max P_{\min}(f) \to 1/3$ as $f \to \infty$. This completes the proof.

The result of Lemma 1 is not related to the number of switch ports. This is because the deduction of Lemma 1 is based on the link blocking probability $p = \omega/\mu$, and the multicast bandwidth $\omega$ and the link remaining bandwidth $\mu$ are not affected by the number of switch ports. Therefore, Lemma 1 still holds when the edge switches have more ports. Moreover, the size of the switch radix has no effect on the performance of MSaMC.

At time-slot $t+1$, the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee the efficient transmission of multicast flows. Based on (7) and Lemma 1, we can get $\max P_i = 1/3$ when the numbers of uplinks and downlinks are both equal to 2. Clearly, this is the simplest multicast transmission model; in a real multicast network, $P_i \ll 1/3$ in general.

In addition, $P_i$ is proportional to $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$; namely, the link blocking probability increases as the multicast flow gets larger. Therefore, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ is monotonically increasing in $p_i$.

Theorem 2. If the remaining bandwidth of an available link $\mu$ is no less than $3\omega$, the multicast flow can be transferred to $f$ destination edge switches.

Proof. For each incoming flow, by adopting the preferred selection mechanism in selecting the $i$th link, when $\pi_i \ge 1$,


we compute the first-order derivative of (13) with respect to $p_i$, where $i = 1, 2, \ldots, x$:
$$\frac{\partial}{\partial p_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = -\frac{P_i^{\pi_i}}{1 - P_i^{M_i+1}} + \frac{\pi_i \cdot (1 - P_i) \cdot P_i^{\pi_i}}{p_i \cdot (1 - P_i^{M_i+1})} + \frac{(M_i + 1) \cdot (1 - P_i) \cdot P_i^{\pi_i} \cdot P_i^{M_i+1}}{p_i \cdot (1 - P_i^{M_i+1})^2}. \tag{16}$$

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when $\pi_i \ge 3$; hence we obtain $\partial P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)/\partial p_i > 0$. Therefore, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ is a monotonically increasing function of $p_i$ when $\pi_i \ge 3$. The multicast flow request $\omega$ is defined as one data unit; evidently $\pi_i \ge 3\omega$. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request $\omega$ at time-slot $t+1$ if $\mu \ge 3\omega$. This completes the proof.
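The monotonicity claim can also be checked numerically by evaluating (13) directly on a grid of values of $P_i$ (treating $P_i$ itself as the variable, with illustrative $M_i$ and $\pi_i$):

```python
# Numerical check of Theorem 2's monotonicity argument: for pi_i >= 3 the
# one-step transition probability (13) grows with P_i on (0, 1/3).
def trans(Pi, pi_i, M_i):
    return (1 - Pi) * Pi ** pi_i / (1 - Pi ** (M_i + 1))

M_i, pi_i = 8, 3                           # illustrative parameters
grid = [0.01 * k for k in range(1, 33)]    # P_i in (0, 1/3)
vals = [trans(Pi, pi_i, M_i) for Pi in grid]
assert all(a < b for a, b in zip(vals, vals[1:]))   # strictly increasing in P_i
```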

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition $\mu \ge 3\omega$ not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, the downlink may receive data flows coming from other uplinks at any time-slot, which results in the uncertainty of the downlink state at time-slot $t+1$. Therefore, we take the minimum blocking probability at time-slot $t+1$ as the selection target of the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot $t+1$. We therefore deduce the expected occupied bandwidth of the $i$th downlink connecting to the $j$th destination edge switch at time-slot $t+1$, denoted by $e_i(t, b_i)$, $j = 1, 2, \ldots, f$. Given that the data flow in the $i$th downlink is $b_i$, we can obtain

$$e_i(t, b_i) = \sum_{\pi_i=0}^{M_i} \left( (b_i + \pi_i) \cdot P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) \right) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i}, \tag{17}$$

where $P_{b_i} = (1 - P_i^{M_i+1})/(1 - P_i)$, $i = 1, 2, \ldots, x$. From (17) we conclude the following theorem, which characterizes the average increase rate of the data flow on each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink is no more than two units on average at time-slot $t+1$.

Proof. We consider $\sum_{\pi_i=0}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = 1$, which means that the flow increment of each link must be one element of the set $\{0, 1, \ldots, M_i\}$.

Setting $A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = P_i + \sum_{\pi_i=2}^{M_i} \pi_i \cdot P_i^{\pi_i}$, we can get $P_i \cdot A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i+1} = \sum_{\pi_i=2}^{M_i} (\pi_i - 1) \cdot P_i^{\pi_i} + M_i \cdot P_i^{M_i+1}$. Through the subtraction of the above two equations, we can obtain $(1 - P_i) \cdot A = P_i + \sum_{\pi_i=2}^{M_i} P_i^{\pi_i} - M_i \cdot P_i^{M_i+1}$. Then we have $A = (P_i - M_i \cdot P_i^{M_i+1})/(1 - P_i) + (P_i^2 - P_i^{M_i+1})/(1 - P_i)^2$. Substituting it into (17), we can obtain
$$e_i(t, b_i) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = b_i + \frac{A}{P_{b_i}} = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)(1 - P_i^{M_i+1})}, \tag{18}$$

where $P_i < 1/3$. By relaxing the latter two terms of (18), $e_i(t, b_i)$ can be bounded as
$$e_i(t, b_i) = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)(1 - P_i^{M_i+1})} < b_i + 2, \tag{19}$$

where $i = 1, 2, \ldots, x$. By merging (17) and (19), we have $b_i < e_i(t, b_i) < b_i + 2$ and then $1 < e_i(t, b_i) - b_i + 1 < 3$. Hence the downlink bandwidth will increase by at least one unit data flow when the downlink is blocked.
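The closed form of (18) and the bound of (19) can be verified against the direct sum of (17); the values of $b_i$, $P_i$, and $M_i$ below are illustrative:

```python
# Expected next-slot occupancy e_i(t, b_i): direct sum of (17) versus the
# closed form of (18); Theorem 3 bounds the mean increase by 2 units.
def e_direct(b_i, Pi, M_i):
    P_bi = (1 - Pi ** (M_i + 1)) / (1 - Pi)
    return sum((b_i + k) * Pi ** k / P_bi for k in range(M_i + 1))

def e_closed(b_i, Pi, M_i):
    return (b_i
            + (Pi - M_i * Pi ** (M_i + 1)) / (1 - Pi ** (M_i + 1))
            + (Pi ** 2 - Pi ** (M_i + 1)) / ((1 - Pi) * (1 - Pi ** (M_i + 1))))

b_i, Pi, M_i = 5, 0.3, 8
assert abs(e_direct(b_i, Pi, M_i) - e_closed(b_i, Pi, M_i)) < 1e-12
assert b_i < e_closed(b_i, Pi, M_i) < b_i + 2     # mean growth below 2 units
```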

When $M_i < e_i(t, b_i) - b_i + 1$, the number of increased data flows would be larger than $M_i$; however, this is not allowed by the definition of $M_i$. Thus we obtain
$$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = 0. \tag{20}$$

When $M_i \ge e_i(t, b_i) - b_i + 1$, we can get
$$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} \frac{1}{P_{b_i}} \cdot P_i^{\pi_i} = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}}. \tag{21}$$

Equation (21) represents the downlink traffic capability at time-slot $t+1$. When the value of (21) is large, the blocking probability of the downlink is high, and vice versa. To show that the downlink has a low blocking probability at the next time-slot, we have the following theorem.
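The tail probability of (21) is a truncated geometric tail and can be validated against the direct sum; the check below also evaluates it at the extreme point $P_i = 1/3$, $e_i(t, b_i) - b_i + 1 = 3$ used below (parameter values illustrative):

```python
# Tail of the truncated geometric distribution, equation (21): probability
# that the downlink exceeds its expected occupancy e_i(t, b_i) at t + 1.
def tail(Pi, a, M_i):
    # a = e_i(t, b_i) - b_i + 1, the smallest increment exceeding the mean
    P_bi = (1 - Pi ** (M_i + 1)) / (1 - Pi)
    direct = sum(Pi ** k / P_bi for k in range(a, M_i + 1))
    closed = (Pi ** a - Pi ** (M_i + 1)) / (1 - Pi ** (M_i + 1))
    assert abs(direct - closed) < 1e-12    # closed form matches the sum
    return closed

# With P_i at its maximum 1/3 and a = 3, the tail stays below 0.125.
assert tail(1 / 3, 3, 8) < 0.125
```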

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot $t+1$ is less than $0.125$.


Figure 6: Downlink blocking probability comparison for different $M_i$ ($M_i = 2, 4, 8, 16$; blocking probability (%) versus $P_i$ (%)).

Proof. Based on (21), we take the minimum value of $M_i$ as 3. Thus we get
$$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}} < \frac{(1/3)^3 - (1/3)^{3+1}}{1 - (1/3)^{3+1}} = 0.025 < 0.125. \tag{22}$$

This completes the proof.

In order to show that the MSaMC yields a low blocking probability of the downlink at time-slot $t+1$ under different values of $M_i$, we provide the comparison shown in Figure 6.

In Figure 6, $P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i)$ indicates the downlink blocking probability, and its values are not more than $0.125$ for different $M_i$ and $P_i$. At the zero point, the blocking probability is close to zero unless $P_i > 0.1$. In a real network, the condition $P_i > 0.1$ is rare. Therefore, the MSaMC has a very low blocking probability.

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes $O(m)$ time to identify the available core switches. In the second step, the MSaMC needs to find the appropriate core switches; we need $O(f \cdot f)$ time to calculate the blocking probability of the available downlinks at time-slot $t+1$ and select the appropriate core switches into the set $T'$, where $f \le r - 1$. In the end, we take $O(f + f)$ time to construct the optimal paths from the source edge switch to the destination edge switches. Thus the computational complexity of MSaMC is given by
$$O(m + f \cdot f + f + f) \le O(m + (r-1)^2 + 2(r-1)) = O(r^2 + m - 1). \tag{23}$$

Note that the complexity of the algorithm is polynomial in the number of core switches $m$ and the number of edge switches $r$, which means that the computational complexity is rather low if the fanout $f$ is small. Therefore, the algorithm is time-efficient in multicast scheduling.

Table 3: Parameter setting

Parameter                  Description
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s

6. Simulation Results

In this section, we utilize the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links with different time-slots. Afterwards, we compare the performance of MSaMC and the SLMR algorithm under unicast traffic [4] and present the comparison between MSaMC and the BCMS algorithm under multicast traffic [5].

6.1. Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are set in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed over all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the size of each multicast flow is equal [17, 18].

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) V is defined as the average of the sum of the transmission delay differences of two adjacent packets in a multicast subnetwork; that is,

$$V = \frac{\sum_{i\in x}\sum_{j\in l} \left(T(t)_{ij} - T(t-1)_{ij}\right)}{x}, \tag{24}$$

where x is the number of available links, l is the number of packets on an available link, and T(t) indicates the transmission delay of a packet at time-slot t.

We take the ADV as a metric for the network state of the multicast subnetwork. The smaller the ADV is, the more stable the network state is, and vice versa.
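For concreteness, the metric of Definition 5 can be computed as follows; this is a sketch, and the nested-list layout of the delay samples is our assumption rather than an interface from the paper.

```python
def average_delay_variance(delays_t, delays_prev):
    """Average delay variance (ADV) per Definition 5 and eq. (24).

    delays_t[i][j]    -- transmission delay of packet j on available link i
                         at time-slot t
    delays_prev[i][j] -- the same quantity at time-slot t - 1
    """
    x = len(delays_t)  # number of available links
    total = sum(d_t - d_prev
                for link_t, link_prev in zip(delays_t, delays_prev)
                for d_t, d_prev in zip(link_t, link_prev))
    return total / x

# Toy input: two links, three packets each; only link 0 sees growing delays.
prev = [[1.0, 1.1, 1.2], [0.9, 1.0, 1.1]]
curr = [[1.1, 1.2, 1.3], [0.9, 1.0, 1.1]]
assert abs(average_delay_variance(curr, prev) - 0.15) < 1e-9
```

A stable network yields an ADV near zero, while growing per-packet delays push it away from zero, which matches the interpretation above.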

Figure 7 shows the average delay variance (ADV) of links as the number of packets grows. When the link remaining bandwidth μ is taken as ω or 2ω, the average delay variance has larger jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request ω at time-slot t+1. The average delay variance is close to a straight line when the link remaining bandwidth is 3ω, which implies that the network state is very stable. Therefore, the simulation results indicate that the optimal value of the link remaining bandwidth μ is 3ω.

10 Complexity

[Figure 7: Average delay variance (ADV) comparison among links with different remaining bandwidth (μ = ω, 2ω, 3ω).]

[Figure 8: Average delay variance (ADV) comparison between uplink and downlink.]

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a bipartite network; that is, the bandwidth of the uplink and the downlink is equal. However, the downlink load is higher than the uplink load under multicast traffic; therefore, the uplink state is more stable.

6.3. Total Network Throughput. In this subsection, we set the length of time-slot t to ω/S and 2(ω/S). We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of time-slot t is 2(ω/S). This is because MSaMC can quickly recover from network blocking and thus achieve higher network throughput. However, MSaMC cannot calculate the optimal path in real time when the length of time-slot t is ω/S; in that case, the SLMR algorithm provides the higher throughput.

Figure 9(b) shows the throughput comparison of MSaMC and the BCMS algorithm under a mixed scheduling pattern. The throughput of the BCMS algorithm becomes lower as the simulation time increases. The multicast transmission of the BCMS algorithm needs a longer time to resolve network blocking; therefore, the throughput decreases sharply if network blocking cannot be predicted. In contrast, MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, MSaMC obtains a higher total network throughput.

[Figure 9: Network throughput comparison: (a) MSaMC (t = 2(ω/S), t = ω/S) versus SLMR; (b) MSaMC versus BCMS.]

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm with unicast traffic, and the BCMS algorithm with mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases, MSaMC with t = 2(ω/S) has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms use more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. We can also find that MSaMC has the minimum average delay when the length of the time-slot is 2(ω/S). This is because a time-slot of length 2(ω/S) can just ensure that data are transmitted accurately to the destination switches. A shorter time-slot (less than 2(ω/S)) leads to incomplete data transmission, while a longer time-slot (more than 2(ω/S)) causes incorrect prediction of the traffic blocking status.

[Figure 10: Average delay comparison: (a) MSaMC (t = 2(ω/S), t = ω/S) versus SLMR; (b) MSaMC versus BCMS.]

7. Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, in fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to improve network performance efficiently. We show that MSaMC can guarantee lower link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the running time of MSaMC is determined by the number of core switches m and the number of destination edge switches f. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm called SLMR and a well-known adaptive multicast scheduling algorithm called BCMS. Experimental results show that MSaMC achieves higher network throughput and lower average delay.

Notations

ω: Multicast bandwidth request of a data flow
b_i: The occupied bandwidth of the ith link
μ: The remaining bandwidth of a link
a: The sum of occupied bandwidth
y: The value of link weight
S: Link bandwidth
M: The maximum number of data flows that can be added
π: The number of added data flows
T: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013–3028, 2016.

[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29–43, 2003.

[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536–541, USA, April 2014.

[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses – Gradworks, 2014.

[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761–765, 2011.

[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1–7, Shenzhen, China, August 2017.

[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711–1714, 2012.

[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681–699, 2006.

[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models, and Applications, New York: John Wiley & Sons, 1979, 446 pp.," Behavioural Science, vol. 26, no. 4, pp. 397–398, 1981.

[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173–178, Da Nang, Vietnam, January 2017.

[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56–60, China, April 2011.

[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class bloom filter," in Proceedings of the 2011 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266–275, Canada, October 2011.

[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898–904, 2012.

[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Review of Modern Physics, vol. 26, no. 1, 2002.

[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442–454, 1995.

[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with SIGCOMM 2009, pp. 65–72, Spain, August 2009.

[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6–16, 2003.


[Figure 2: The topology of fat-tree DCNs: core switches (1 ⋯ m), aggregation switches (1 ⋯ r), and edge switches (1 ⋯ r).]

[Figure 3: The cause of multicast blocking: two flows ω1 and ω2 contending on the link from core switch 1 to edge switch 2.]

is greater than the available link bandwidth. In particular, flow ω1 and flow ω2 are forwarded through core switch 1 at the same time and are routed from core switch 1 to edge switch 2 through the same link by the scheduling algorithm, which causes heavy blocking on the links connected to core switch 1. Therefore, the bandwidth available to each flow suffers further reduction if the scheduler cannot identify heavy multicast blocking in the fat-tree DCNs.

Figure 3 also explains the main cause of multicast blocking. We can see that multicast blocking has occurred at the link between core switch 1 and edge switch 2. Clearly, before the blocking at this link is alleviated, the other links cannot release their occupied bandwidth. This means that the links from edge switch 1 to core switch 1, from edge switch 1 to core switch 2, from core switch 2 to edge switch 3, and from edge switch 3 to core switch 1 are not released until the multicast blocking is alleviated. However, fat-tree DCNs cannot accept such a long time to resolve the blocking, due to the requirement for low latency.

In fat-tree DCNs, different source servers may execute the scheduling algorithm at the same time, so they may occupy the same link, and multicast blocking will inevitably occur. Hence, multicast blocking is a common phenomenon in DCN applications and degrades network performance. In addition, many servers act as hotspots of user access, which may cause many-to-one data flow transfer. In fact, the key reason for multicast blocking is that the network link state at the next time-slot is not considered. Several works have been proposed to address network blocking in the transmission of multicast packets in DCNs [12, 13]. As data centers usually adopt commercial switches that cannot guarantee nonblocking operation, an efficient packet repairing scheme was proposed [12], which relies on unicast to retransmit dropped multicast packets caused by switch buffer overload or switching failure. Furthermore, a bloom filter approach [13] was proposed to compress the multicast forwarding table in switches, which avoids multicast blocking in the data center network.

To the best of our knowledge, the existing multicast scheduling algorithms consider only the network state at the current time-slot in DCNs; thus, the delay error between the algorithm execution time and the start of the data flow transmission will render the scheduling algorithm invalid. Based on this consideration, we focus on multicast scheduling using the network state at the next time-slot, predicted with Markov chains.

3. Model and Probability of Multicast Blocking

In this section, we first establish the multicast blocking model based on the topology of fat-tree DCNs by using a similar approach. Then we deduce the blocking probability of the available downlinks at the next time-slot.

3.1. Multicast Subnetwork. A multicast bandwidth request corresponds to a multicast subnetwork in fat-tree DCNs, which consists of the available core switches and edge switches for that request. The multicast subnetwork in Figure 4 has f destination edge switches, x available core switches, and n × f servers, where 1 ≤ x ≤ m. In the process of multicast connection, the link weight of the multicast subnetwork is denoted as the blocking probability at the next time-slot. Thus, our goal is to obtain the link blocking probability for any type of multicast bandwidth request at the next time-slot.

[Figure 4: The multicast subnetwork: core switches (1, 2), edge switches (1 ⋯ 4), servers (1 ⋯ 12), and a multicast flow.]

It is known that the fat-tree DCN is a typical large-scale network in which many available links can satisfy a given multicast connection request. When a link is available for a multicast bandwidth request ω, the blocking probability of the link at the current time-slot is given by p = ω/μ, where μ is the remaining bandwidth.

A multicast connection can be represented by its destination edge switches. Given a multicast bandwidth request ω with fanout f (1 ≤ f < r), P(f) denotes the blocking probability of this multicast connection. We denote the blocking of the available uplinks as the events u_1, u_2, …, u_x, and the blocking of the available downlinks between the available core switches and the kth (1 ≤ k ≤ f) destination edge switch as the events d_{k1}, d_{k2}, …, d_{kx}. All available links form a multicast tree, rooted at the core switches, that can satisfy the multicast connection in the multicast network. The other notations used in this paper are summarized in Notations.

3.2. Multicast Blocking Model. In the multicast subnetwork, we employ ε to denote the event that a multicast connection request with fanout f cannot be satisfied in the network shown in Figure 4. We do not consider links whose remaining bandwidth is less than the multicast bandwidth request ω, since such a link is not available when the multicast data flow ω goes through it. We let P(ε | φ) be the conditional blocking probability given the state φ and P(φ) be the probability of the state φ. Then the blocking probability of the subnetwork for a multicast connection is given by

$$P(f) = P(\epsilon) = \sum_{\phi} P(\phi)\, P(\epsilon \mid \phi). \tag{1}$$

For the event φ, the data traffic on the uplinks does not interfere with each other; that is, the uplinks are independent. Therefore, we have P(φ) = q^k p^{m−k}.

From the multicast blocking subnetwork in Figure 4, we can obtain the blocking property of the fat-tree DCNs; that is, the multicast bandwidth request ω from a source edge switch to distinct destination edge switches cannot be satisfied if and only if there is no available downlink connecting all the destination edge switches.

Accordingly, we take ε′ to denote the event that the multicast bandwidth request ω with fanout f cannot be achieved over the available uplinks. Thus we can get

$$P(\epsilon') = P(\epsilon \mid u_1, u_2, \ldots, u_x). \tag{2}$$

An available downlink d_{ij}, where 1 ≤ i ≤ f and 1 ≤ j ≤ x, represents a link from a core switch to the ith destination edge switch. The event ε′ can be expressed by the events d_{ij} as follows:

$$\epsilon' = \left(d_{11} \cap d_{12} \cap \cdots \cap d_{1x}\right) \cup \cdots \cup \left(d_{f1} \cap d_{f2} \cap \cdots \cap d_{fx}\right). \tag{3}$$

Afterwards, we define the blocking of the downlinks connecting to each destination edge switch as the events A = {A_1, A_2, …, A_f}; moreover, we have A_1 = (d_{11} ∩ d_{12} ∩ ⋯ ∩ d_{1x}). Thus we get

$$\epsilon' = \bigcup_{i=1}^{f} A_i. \tag{4}$$

Based on the theory of combinatorics, the inclusion–exclusion principle (also known as the sieve principle) is an equation relating the sizes of sets and their intersections. For the general case of the principle in [14], let A_1, A_2, …, A_f be finite sets. Then we have

$$\left|\bigcup_{i=1}^{f} A_i\right| = \sum_{i=1}^{f} |A_i| - \sum_{1\le i<j\le f} \left|A_i \cap A_j\right| + \sum_{1\le i<j<h\le f} \left|A_i \cap A_j \cap A_h\right| - \cdots + (-1)^{f-1} \left|A_1 \cap A_2 \cap \cdots \cap A_f\right|. \tag{5}$$


For the events A_1, A_2, …, A_f in a probability space (Ω, F, P), we can obtain the probability of the event ε′:

$$P(\epsilon') = \sum_{i=1}^{f} P(A_i) - \sum_{1\le i<j\le f} P\left(A_i \cap A_j\right) + \sum_{1\le i<j<h\le f} P\left(A_i \cap A_j \cap A_h\right) - \cdots + (-1)^{f-1} P\left(A_1 \cap A_2 \cap \cdots \cap A_f\right), \tag{6}$$

where P(A_i) denotes the probability of the event A_i. Combining (1) and (2) with (6), the multicast blocking model for a multicast connection with fanout f is given by

$$P(f) = \sum_{k=1}^{m} \binom{m}{k} p^k q^{m-k} \left( \sum_{i=1}^{f} P(A_i) - \sum_{1\le i<j\le f} P\left(A_i \cap A_j\right) + \sum_{1\le i<j<h\le f} P\left(A_i \cap A_j \cap A_h\right) - \cdots + (-1)^{f-1} P\left(A_1 \cap A_2 \cap \cdots \cap A_f\right) \right). \tag{7}$$
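As an illustration of the inclusion–exclusion sum in (6), consider the simplified case (our assumption, not made in the paper) in which every downlink is blocked independently with the same probability p, so that P(A_i) = p^x and any k-fold intersection has probability (p^x)^k:

```python
from math import comb

def blocking_prob_inclusion_exclusion(p, f, x):
    """Evaluate eq. (6) under the simplifying assumption that all downlinks
    are blocked independently with equal probability p; then P(A_i) = p**x
    and each of the C(f, k) k-fold intersections contributes (p**x)**k.
    """
    pA = p ** x  # probability that all x downlinks to one destination are blocked
    return sum((-1) ** (k - 1) * comb(f, k) * pA ** k for k in range(1, f + 1))

# Sanity check: under independence the alternating sum collapses, by the
# binomial theorem, to 1 - (1 - p**x)**f -- the closed form of eq. (14).
p, f, x = 1 / 3, 4, 2
assert abs(blocking_prob_inclusion_exclusion(p, f, x) - (1 - (1 - p ** x) ** f)) < 1e-12
```

The paper's model keeps distinct per-link probabilities p_ij, so this collapse is only a special-case check, but it shows why the alternating series in (6)–(7) stays bounded.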

From (6), Σ_{1≤i<j≤f} P(A_i ∩ A_j) ≥ Σ_{1≤i<j<h≤f} P(A_i ∩ A_j ∩ A_h), so the following inequality can be derived:

$$\sum_{1\le i<j\le f} P\left(A_i \cap A_j\right) - \sum_{1\le i<j<h\le f} P\left(A_i \cap A_j \cap A_h\right) + \cdots + (-1)^{f-1} P\left(A_1 \cap A_2 \cap \cdots \cap A_f\right) \ge 0. \tag{8}$$

Therefore, the minimum blocking probability of the event ε′ is

$$P_{\min}(\epsilon') = \sum_{i=1}^{f} P(A_i), \tag{9}$$

where P(A_1) = ∏_{k=1}^{x} p_{1k}.

Afterwards, we define P_min(f) as the minimum blocking probability of the multicast subnetwork, where the number of available core switches is x. Thus we get

$$P_{\min}(f) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \left( \sum_{i=1}^{f} P(A_i) \right) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \left( \sum_{i=1}^{f} \prod_{j=1}^{x} p_{ij} \right), \tag{10}$$

where x ≤ m.

It is not difficult to see from (10) that the minimum blocking probability P_min(f) is an increasing sequence in the fanout f. In other words, a multicast bandwidth request with a larger fanout is more difficult to realize, since fewer core switches remain. Therefore, the minimum blocking probability with fanout f reflects the state of the available links at the next time-slot.
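Equation (10) can be evaluated directly once the per-downlink blocking probabilities are known; the sketch below assumes they are supplied as a matrix (an input format of our choosing).

```python
from math import comb, prod

def p_min(f, x, p, p_down):
    """Minimum subnetwork blocking probability P_min(f), eq. (10).

    f      -- fanout (number of destination edge switches)
    x      -- number of available core switches (x <= m)
    p      -- blocking probability of an uplink (q = 1 - p)
    p_down -- p_down[i][j]: blocking probability of the downlink from core
              switch j to destination edge switch i (assumed input matrix)
    """
    q = 1.0 - p
    inner = sum(prod(p_down[i][j] for j in range(x)) for i in range(f))
    outer = sum(comb(x, k) * p ** k * q ** (x - k) for k in range(1, x + 1))
    return outer * inner

# P_min(f) grows with the fanout f, as stated below eq. (10).
down = [[0.1, 0.2]] * 3
assert p_min(2, 2, 1 / 3, down) < p_min(3, 2, 1 / 3, down)
```

Note that the outer sum telescopes to 1 − q^x, so the growth in f comes entirely from the inner sum over destinations, matching the monotonicity claim above.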

3.3. Link Blocking Probability at the Next Time-Slot. In this subsection, we calculate the blocking probability of an available link at the next time-slot based on Markov chain theory. We randomly select a link, denoted as the ith link, for analysis.

In the multicast blocking model, we denote the current time-slot as t and the next time-slot as t+1; b_i is the occupied bandwidth of the ith link at time-slot t, that is, y_i(t) = b_i; a(t) is the sum of the occupied bandwidth of all available downlinks at time-slot t, namely a(t) = Σ_{j=1}^{x} y_j(t); and y_i(t+1) refers to the predicted occupied bandwidth of the ith link at time-slot t+1. In [15], a preference or uniform selection mechanism based on Markov chains is adopted for calculating the link blocking probability at the next time-slot. Based on this mechanism, the probability P_i that the link admits an incoming new flow at time-slot t+1 is given by

$$P_i = P_{\min}(f) \cdot \frac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \frac{1}{f}, \tag{11}$$

where 1 ≤ f ≤ r.

In addition, we do not consider the case where the bandwidth of an available link decreases; namely, the bandwidth of an available link is always sufficient for the multicast bandwidth request. If a multicast bandwidth request selects the ith link at time-slot t+1, then y_i(t+1) will increase by π_i, where 1 ≤ π_i ≤ M_i and M_i is defined as the maximum number of data flows that can be added. Then we let P_{b_i} denote the probability that the flow on the ith link remains unchanged or increases at time-slot t+1; thus we can get

$$P_{b_i} = \sum_{\pi_i=0}^{M_i} \left[ P_{\min}(f) \cdot \frac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \frac{1}{f} \right]^{\pi_i} = \sum_{\pi_i=0}^{M_i} P_i^{\pi_i} = \frac{1 - P_i^{M_i+1}}{1 - P_i}, \tag{12}$$

where i = 1, 2, …, x and π_i = 0, 1, …, M_i.

According to (12), we calculate the one-step transition probability of a multicast flow, denoted as P(y_i(t+1) = b_i + π_i | y_i(t) = b_i), which is a Markov process:

$$P\left(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i\right) = \frac{1}{P_{b_i}} \cdot \left( P_{\min}(f) \cdot \frac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \frac{1}{f} \right)^{\pi_i} = \frac{(1 - P_i) \cdot P_i^{\pi_i}}{1 - P_i^{M_i+1}}, \tag{13}$$

where i = 1, 2, …, x.

In fact, P(y_i(t+1) = b_i + π_i | y_i(t) = b_i) indicates the link blocking probability at time-slot t+1, which is determined by P_i and π_i. The link blocking probability will be small when π_i is small at time-slot t+1; otherwise, the link may be blocked at time-slot t+1. Therefore, the range of π_i is very important to our proposed multicast scheduling algorithm. In this paper, we assume that the multicast bandwidth request ω is one data flow unit and that π_i is an integral multiple of the multicast bandwidth request ω.
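The one-step transition probability in (11)–(13) can be sketched as follows; the numeric arguments used in the check are arbitrary examples, not values from the paper.

```python
def transition_prob(pi_i, b_i, a_t, p_min_f, f, M_i):
    """One-step transition probability P(y_i(t+1) = b_i + pi_i | y_i(t) = b_i)
    from eqs. (11)-(13): a geometric-like distribution over the number of
    added flows pi_i, truncated at M_i.
    """
    # Eq. (11): probability that the i-th link attracts a new flow at t+1.
    P_i = p_min_f * (b_i / a_t) + (1.0 - p_min_f) * (1.0 / f)
    # Eq. (13), using the normalization P_{b_i} of eq. (12).
    return (1.0 - P_i) * P_i ** pi_i / (1.0 - P_i ** (M_i + 1))

# The truncated distribution over pi_i = 0, 1, ..., M_i sums to one.
probs = [transition_prob(k, b_i=30, a_t=120, p_min_f=0.2, f=3, M_i=3)
         for k in range(4)]
assert abs(sum(probs) - 1.0) < 1e-12
```

The factor (1 − P_i)/(1 − P_i^{M_i+1}) is exactly the normalization that makes the truncated geometric series of (12) a proper distribution, which is why the scheduler can read small π_i as "low blocking risk."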


Input: Incoming flow (i, D, ω), link remaining bandwidth μ, the number of destination edge switches |D|, π_i = 3ω
Output: Multicast links with the minimum blocking probability
(1) Step 1: identify available core switches
(2) for i = 1 to m do
(3)   Select an uplink u_i
(4)   if u_i^μ ≥ 3ω and |T| ≤ |D| then
(5)     Select the core switch i and add it into the set T
(6)   end if
(7) end for
(8) Step 2: select appropriate core switches
(9) Calculate the blocking probability of available downlinks at time-slot t+1, P_i(t+1), by equation (13)
(10) for j = 1 to |D| do
(11)   Find the core switch(es) in T that are connected to a destination edge switch in D
(12)   if there are multiple core switches found then
(13)     Select the core switch with the minimum blocking probability and deliver it to the appropriate set of core switches T′
(14)   else
(15)     Deliver the core switch to the set T′
(16)   end if
(17)   Remove the destination edge switches that the selected core switch can reach from D
(18)   Update the set of remaining core switches in T
(19) end for
(20) Step 3: establish the optimal paths
(21) Connect the links between the source edge switch and the destination edge switches through the appropriate core switches in the set T′
(22) Send configuration signals to the corresponding devices in the multicast subnetwork

Algorithm 1: Multicast scheduling algorithm with Markov chains (MSaMC).

4. Multicast Scheduling Algorithm with Markov Chains

In this section, we propose a multicast scheduling algorithm with Markov chains (MSaMC) in fat-tree DCNs, which aims to minimize the blocking probability of the available links and improve the traffic efficiency of data flows in the multicast network. We then give a simple example to explain the implementation process of MSaMC.

4.1. Description of the MSaMC. The core of MSaMC is to select the downlinks with the minimum blocking probability at time-slot t+1. Accordingly, the first step of the algorithm is to find the available core switches, denoted as the set T, |T| ≤ f. We take the remaining bandwidth of the ith uplink as u_i^μ. Based on our theoretical analysis in Section 5, the multicast subnetwork may be blocked if this bandwidth is less than 3ω; hence we require u_i^μ ≥ 3ω.

The second step is to choose, in each iteration, the appropriate core switch that is connected to the downlink with the minimum blocking probability at time-slot t+1. At the end of each iteration, we transfer the selected core switch from the set T to the set T′. The iteration terminates when the set of destination edge switches D is empty. Obviously, the core switches in the set T′ are connected to the downlinks with the minimum blocking probability, and the set T′ can satisfy an arbitrary multicast flow request in fat-tree DCNs [5].

Based on the above steps, we obtain a set of appropriate core switches T′. Moreover, each destination edge switch in D can be connected through one downlink from the set T′ with the minimal blocking probability at time-slot t+1. The third step is to establish the optimal paths from the source edge switch to the destination edge switches through the appropriate core switches. The state of the multicast subnetwork is updated after the source server sends the configuration signals to the corresponding forwarding devices. The main process of the MSaMC is described in Algorithm 1.

Table 1: Link remaining bandwidth (M).

      C1    C2    C3    C4
E1    90    300   600   800
E2    600   700   800   200
E3    750   400   350   700
E4    500   200   150   500
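A minimal sketch of Steps 1 and 2 of Algorithm 1 follows; the dictionary inputs are assumed data structures of ours, not interfaces defined in the paper.

```python
def msamc_select(uplink_bw, down_block, D, omega):
    """Greedy core-switch selection sketched from Algorithm 1 (Steps 1-2).

    uplink_bw[c]     -- remaining bandwidth of the uplink to core switch c
    down_block[c][d] -- predicted blocking probability at t+1 of the downlink
                        from core switch c to destination edge switch d
    D                -- set of destination edge switches
    omega            -- multicast bandwidth request
    Returns a mapping {core switch: destinations it serves}.
    """
    # Step 1: a core switch is available only if its uplink keeps >= 3*omega spare.
    T = {c for c, bw in uplink_bw.items() if bw >= 3 * omega}
    T_prime = {}
    for d in sorted(D):
        # Step 2: among reachable available cores, pick the downlink with the
        # minimum predicted blocking probability at the next time-slot.
        candidates = [(down_block[c][d], c) for c in T
                      if down_block[c].get(d) is not None]
        if not candidates:
            raise RuntimeError(f"no available core switch reaches edge {d}")
        _, best = min(candidates)
        T_prime.setdefault(best, []).append(d)
    return T_prime
```

Fed with the values of Tables 1 and 2 for source edge switch 1 and ω = 50 M, this sketch selects core switch 3 for destination edge switch 2 and core switch 4 for destination edge switches 3 and 4, matching the example of Section 4.2.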

4.2. An Example of the MSaMC. For the purpose of illustration, we give a scheduling example in a simple fat-tree DCN, as shown in Figure 5. Assume that we have obtained the network state at time-slot t and made a multicast flow request (1, (2, 3, 4), 50 M). The link remaining bandwidth μ and the link blocking probability P at the next time-slot are shown in Tables 1 and 2, respectively. The symbol √ denotes an available uplink, and × indicates an unavailable link. For clarity, we select only two layers of the network and give the relevant links in each step.

As described in Section 4.1, MSaMC is implemented in three steps. Firstly, we take the remaining bandwidth of the uplink as u^μ (u_i^μ ≥ 3 × 50 M) and find the set of available core switches, that is, T = {2, 3, 4}. Secondly, we evaluate the blocking probability of the relevant downlinks at time-slot t+1. In effect, the blocking probability at time-slot t+1 of the downlink from core switch 2 to destination switch 2 is higher than that from core switch 3 to destination switch 2; therefore, we select the latter downlink as the optimal path. Subsequently, core switch 3 is put into the set T′. Similarly, we get core switch 4 for the set T′. Finally, the optimal paths are constructed, and the routing information is sent to source edge switch 1 and core switches (3, 4).

Table 2: The link blocking probability at the next time-slot (%).

      C1    C2    C3    C4
E1    ×     9     5     4
E2    ×     4     3     7
E3    ×     6     7     4
E4    ×     9     10    5

[Figure 5: An example of the MSaMC: (a) the links satisfying the multicast flow request (1, (2, 3, 4), ω); (b) the optimal paths selected by the MSaMC.]

In Figure 5(a), the remaining bandwidth of the uplinks from edge switch 1 to the available core switches is no less than 150M. In this way, we find the optimal path for each pair of source and destination edge switches: source edge switch 1 → core switch 3 → destination edge switch 2, source edge switch 1 → core switch 4 → destination edge switch 3, and source edge switch 1 → core switch 4 → destination edge switch 4, as shown in Figure 5(b).
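The selection logic of this example can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the uplink bandwidth values are assumptions (Table 1 is not reproduced here), while the downlink probabilities are taken from Table 2.

```python
# Hypothetical sketch of the MSaMC selection steps on the Figure 5 example.
# Uplink bandwidths are assumed values; downlink probabilities follow Table 2.

request = 50  # Mbps, from the multicast flow request (1, (2, 3, 4), 50M)

# Assumed remaining uplink bandwidth from source edge switch 1 to each core.
uplink_remaining = {1: 100, 2: 160, 3: 200, 4: 180}

# Step 1: a core switch is available only if its uplink keeps >= 3 * request.
T = [c for c, bw in sorted(uplink_remaining.items()) if bw >= 3 * request]
assert T == [2, 3, 4]  # core 1 is unavailable, matching the example

# Table 2: predicted blocking probability (%) of downlink core c -> edge e.
p_down = {
    (2, 2): 4, (3, 2): 3, (4, 2): 7,   # row E2
    (2, 3): 6, (3, 3): 7, (4, 3): 4,   # row E3
    (2, 4): 9, (3, 4): 10, (4, 4): 5,  # row E4
}

# Steps 2-3: for each destination edge switch, pick the available core whose
# downlink has the minimum predicted blocking probability at time-slot t + 1.
paths = {e: min(T, key=lambda c: p_down[(c, e)]) for e in (2, 3, 4)}
print(paths)  # {2: 3, 3: 4, 4: 4}: E2 via core 3, E3 and E4 via core 4
```

The result matches the optimal paths of Figure 5(b): destination 2 via core 3, destinations 3 and 4 via core 4.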

5. Theoretical Analysis

In this section, we analyze the performance of the MSaMC. From (9), we derive the blocking probability bound of the multicast subnetwork, as shown in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than 1/3.

Proof. By the first step of Algorithm 1, we take the remaining bandwidth of the uplink to be no less than 3ω, and thus the maximum value of the link blocking probability p is 1/3; in other words, the worst case is when the available link remaining bandwidth just satisfies this condition, that is, μ_u = 3ω.

From (9) and De Morgan's laws [16], we can obtain the probability of the event ε′:

P_min(ε′) = 1 − ∏_{i=1}^{f} P(\overline{d_{i1} ∩ d_{i2} ∩ ⋯ ∩ d_{ix}})
          = 1 − ∏_{i=1}^{f} (1 − P(d_{i1} ∩ d_{i2} ∩ ⋯ ∩ d_{ix}))
          = 1 − ∏_{i=1}^{f} (1 − ∏_{k=1}^{x} p_{d_{ik}}) = 1 − (1 − p^x)^f.   (14)

Therefore, based on (10), the subnetwork blocking probability is maximal when the number of uplinks is 1. Thus we can obtain

max P_min(f) = p · (1 − (1 − p^{x_min})^f) = (1/3) · (1 − (1 − 1/3)^f).   (15)

Then max P_min(f) → 1/3 as f → ∞, so the subnetwork blocking probability is always less than 1/3. This completes the proof.
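A quick numerical check of the worst-case expression from (15), with p fixed at its worst-case value 1/3, confirms the bound (a sketch; the helper name is ours):

```python
# Numerical check of Lemma 1: with p = 1/3 (worst case under mu_u = 3 * omega),
# max P_min(f) = p * (1 - (1 - p)**f) stays below 1/3 and tends to 1/3 as f grows.

p = 1.0 / 3.0

def max_p_min(f: int) -> float:
    """Worst-case subnetwork blocking probability (15) for fanout f."""
    return p * (1.0 - (1.0 - p) ** f)

previous = 0.0
for f in (1, 2, 4, 8, 16, 32):
    value = max_p_min(f)
    assert previous < value < 1.0 / 3.0  # strictly increasing, bounded by 1/3
    previous = value
    print(f"f = {f:2d}: {value:.6f}")
```

For f = 1 the value is 1/9 ≈ 0.111, and it climbs monotonically toward 1/3 without reaching it, as the lemma states.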

The result of Lemma 1 is not related to the number of ports of the switches. This is because the deduction of Lemma 1 is based on the link blocking probability p = ω/μ, and neither the multicast bandwidth ω nor the link remaining bandwidth μ is affected by the number of switch ports. Therefore, Lemma 1 still holds when the edge switches have more ports; the switch radix has no effect on the performance of the MSaMC.

At time-slot t + 1, the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee efficient transmission of multicast flows. Based on (7) and Lemma 1, we get max P_i = 1/3 when the numbers of uplinks and downlinks are both equal to 2. Clearly, this is the simplest multicast transmission model; in a real multicast network, P_i ≪ 1/3 holds in general.

In addition, P_i is proportional to P(y_i(t+1) = b_i + π_i | y_i(t) = b_i); namely, the link blocking probability increases as the multicast flow gets larger. Therefore, P(y_i(t+1) = b_i + π_i | y_i(t) = b_i) is monotonically increasing in p_i.

Theorem 2. If the remaining bandwidth μ of an available link is no less than 3ω, the multicast flow can be transferred to f destination edge switches.

Proof. For each incoming flow, adopting the preferred selection mechanism in selecting the i-th link, when π_i ≥ 1,


we compute the first-order derivative of (13) with respect to p_i, where i = 1, 2, …, x:

∂/∂p_i P(y_i(t+1) = b_i + π_i | y_i(t) = b_i)
  = − P_i^{π_i} / (1 − P_i^{M_i+1})
    + π_i (1 − P_i) P_i^{π_i} / (p_i (1 − P_i^{M_i+1}))
    + (M_i + 1)(1 − P_i) P_i^{π_i} P_i^{M_i+1} / (p_i (1 − P_i^{M_i+1})²).   (16)

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when π_i ≥ 3; hence ∂P(y_i(t+1) = b_i + π_i | y_i(t) = b_i)/∂p_i > 0. Therefore, P(y_i(t+1) = b_i + π_i | y_i(t) = b_i) is a monotonically increasing function of p_i when π_i ≥ 3. The multicast flow request ω is defined as one data unit, so π_i ≥ 3 corresponds to an increment of 3ω. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request ω at time-slot t + 1 if μ ≥ 3ω. This completes the proof.

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition μ ≥ 3ω not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, a downlink may carry data flows coming from other uplinks at any time-slot, which results in uncertainty of the downlink state at time-slot t + 1. Therefore, we take the minimum blocking probability at time-slot t + 1 as the selection criterion for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot t + 1. We therefore derive the expectation of the data flow on the i-th downlink connecting to the j-th destination edge switch at time-slot t + 1, denoted by e_i(t, b_i), j = 1, 2, …, f. Given that the data flow in the i-th downlink is b_i, we can obtain

e_i(t, b_i) = Σ_{π_i=0}^{M_i} (b_i + π_i) · P(y_i(t+1) = b_i + π_i | y_i(t) = b_i)
            = b_i + (1/P_{b_i}) Σ_{π_i=1}^{M_i} π_i · P_i^{π_i},   (17)

where P_{b_i} = (1 − P_i^{M_i+1})/(1 − P_i), i = 1, 2, …, x.

From (17), we conclude the following theorem, which explains the average increase rate of the data flow at each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink is no more than two units on average at time-slot t + 1.

Proof. We consider Σ_{π_i=0}^{M_i} P(y_i(t+1) = b_i + π_i | y_i(t) = b_i) = 1, which means the flow increment of each link must be one element of the set {0, 1, …, M_i}.

Setting A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i} = P_i + Σ_{π_i=2}^{M_i} π_i · P_i^{π_i}, we get P_i · A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i+1} = Σ_{π_i=2}^{M_i} (π_i − 1) · P_i^{π_i} + M_i · P_i^{M_i+1}. Subtracting the two equations gives (1 − P_i) · A = P_i + Σ_{π_i=2}^{M_i} P_i^{π_i} − M_i · P_i^{M_i+1}. Then we have A = (P_i − M_i · P_i^{M_i+1})/(1 − P_i) + (P_i² − P_i^{M_i+1})/(1 − P_i)². Substituting this into (17), we obtain

e_i(t, b_i) = b_i + (1/P_{b_i}) Σ_{π_i=1}^{M_i} π_i · P_i^{π_i} = b_i + A/P_{b_i}
            = b_i + (P_i − M_i · P_i^{M_i+1})/(1 − P_i^{M_i+1})
              + (P_i² − P_i^{M_i+1})/((1 − P_i)(1 − P_i^{M_i+1})),   (18)

where P_i < 1/3. By relaxing the latter two terms of (18), e_i(t, b_i) can be bounded as

e_i(t, b_i) = b_i + (P_i − M_i · P_i^{M_i+1})/(1 − P_i^{M_i+1})
              + (P_i² − P_i^{M_i+1})/((1 − P_i)(1 − P_i^{M_i+1})) < b_i + 2,   (19)

where i = 1, 2, …, x.

By merging (17) and (19), we have b_i < e_i(t, b_i) < b_i + 2, and thus 1 < e_i(t, b_i) − b_i + 1 < 3. Hence the downlink bandwidth will increase by at least one unit of data flow when the downlink is blocked.
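The bound can be checked numerically. The sketch below (with illustrative values of b_i, P_i, and M_i, and helper names of our own) computes e_i(t, b_i) both directly from the truncated geometric distribution of (17) and from the closed form (18), and verifies that b_i < e_i(t, b_i) < b_i + 2:

```python
# Cross-check of (17)-(19) with illustrative values of b_i, P_i, and M_i.

def e_direct(b: float, P: float, M: int) -> float:
    """Equation (17): expectation from the truncated geometric distribution."""
    P_b = (1.0 - P ** (M + 1)) / (1.0 - P)  # normalizing constant P_{b_i}
    return b + sum(k * P ** k for k in range(1, M + 1)) / P_b

def e_closed(b: float, P: float, M: int) -> float:
    """Equation (18): the closed form of the same expectation."""
    return (b
            + (P - M * P ** (M + 1)) / (1.0 - P ** (M + 1))
            + (P ** 2 - P ** (M + 1)) / ((1.0 - P) * (1.0 - P ** (M + 1))))

b = 10.0
for P in (0.05, 0.1, 0.2, 0.33):
    for M in (2, 4, 8, 16):
        direct, closed = e_direct(b, P, M), e_closed(b, P, M)
        assert abs(direct - closed) < 1e-9  # the two forms agree
        assert b < closed < b + 2           # Theorem 3 bound
print("bounds hold")
```

For P_i < 1/3 the expected increment is in fact well under one unit, so the bound of two units is comfortably loose.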

When M_i < e_i(t, b_i) − b_i + 1, the number of added data flows would be larger than M_i, which is not allowed by the definition of M_i; thus we obtain

P(y_i(t+1) > e_i(t, b_i) | y_i(t) = b_i) = 0.   (20)

When M_i ≥ e_i(t, b_i) − b_i + 1, we get

P(y_i(t+1) > e_i(t, b_i) | y_i(t) = b_i)
  = Σ_{π_i = e_i(t,b_i)−b_i+1}^{M_i} P(y_i(t+1) = b_i + π_i | y_i(t) = b_i)
  = Σ_{π_i = e_i(t,b_i)−b_i+1}^{M_i} (1/P_{b_i}) · P_i^{π_i}
  = (P_i^{e_i(t,b_i)−b_i+1} − P_i^{M_i+1})/(1 − P_i^{M_i+1}).   (21)

Equation (21) represents the downlink traffic capability at time-slot t + 1: the larger the value of (21), the higher the blocking probability of the downlink, and vice versa. To show that the downlink has a low blocking probability at the next time-slot, we have the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot t + 1 is less than 0.125.

Figure 6: Downlink blocking probability comparison for different M_i (M_i = 2, 4, 8, 16), as a function of P_i (%).

Proof. Based on (21), we take the minimum value of M_i as 2. Thus we get

P(y_i(t+1) > e_i(t, b_i) | y_i(t) = b_i)
  = (P_i^{e_i(t,b_i)−b_i+1} − P_i^{M_i+1})/(1 − P_i^{M_i+1})
  < ((1/3)³ − (1/3)^{3+1})/(1 − (1/3)^{3+1}) = 0.025 < 0.125.   (22)

This completes the proof.

To show that the MSaMC yields a low downlink blocking probability at time-slot t + 1 under different values of M_i, we provide the comparison shown in Figure 6.

In Figure 6, P(y_i(t+1) > e_i(t, b_i) | y_i(t) = b_i) indicates the downlink blocking probability; its values stay below 0.125 for the different M_i and P_i. Near the zero point, the blocking probability is close to zero unless P_i > 0.1, and in a real network the condition P_i > 0.1 rarely holds. Therefore, the MSaMC has a very low blocking probability.
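The curves of Figure 6 can be approximated with a short script. This is a sketch under stated assumptions: the expected increment is taken from the closed form (18), the function names are ours, and P_i is swept only up to 0.15, consistent with the observation above that P_i > 0.1 is rare in practice.

```python
# Sketch of the Figure 6 comparison: tail probability (21) that a downlink
# exceeds its expected flow at t + 1, for M_i in {2, 4, 8, 16}. P_i is swept
# over the small values (<= 0.15) that the text treats as realistic.

def increment(P: float, M: int) -> float:
    """e_i(t, b_i) - b_i from the closed form (18)."""
    return ((P - M * P ** (M + 1)) / (1 - P ** (M + 1))
            + (P ** 2 - P ** (M + 1)) / ((1 - P) * (1 - P ** (M + 1))))

def tail_prob(P: float, M: int) -> float:
    """Equations (20)-(21): P(y_i(t+1) > e_i(t, b_i) | y_i(t) = b_i)."""
    k = increment(P, M) + 1          # the exponent e_i(t, b_i) - b_i + 1
    if M < k:
        return 0.0                   # equation (20)
    return (P ** k - P ** (M + 1)) / (1 - P ** (M + 1))

for M in (2, 4, 8, 16):
    worst = max(tail_prob(p / 100.0, M) for p in range(1, 16))
    assert worst < 0.125             # consistent with the Theorem 4 bound
    print(f"M_i = {M:2d}: max tail probability = {worst:.4f}")
```

Over this realistic range the probability stays around 0.1 or below for every M_i, matching the behavior described for Figure 6.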

In the following, we analyze the time complexity of the MSaMC. The first step of the MSaMC takes O(m) time to identify the available core switches. In the second step, the MSaMC needs to find the appropriate core switches: we need O(f · f) time to calculate the blocking probability of the available downlinks at time-slot t + 1 and to select the appropriate core switches into the set T′, where f ≤ r − 1. In the end, we take O(f + f) time to construct the optimal paths from the source edge switch to the destination edge switches. Thus the computational complexity of the MSaMC is given by

O(m + f · f + f + f) ≤ O(m + (r − 1)² + 2(r − 1)) = O(r² + m − 1).   (23)

Note that the complexity of the algorithm is polynomial in the number of core switches m and the number of edge switches r, which means that the computational complexity is rather low when the fanout f is small. Therefore, the algorithm is time-efficient in multicast scheduling.

Table 3: Parameter settings.

Parameter                  Value
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s

6. Simulation Results

In this section, we use the network simulator NS2 to evaluate the effectiveness of the MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links over different time-slots. We then compare the performance of the MSaMC against the SLMR algorithm under unicast traffic [4] and against the BCMS algorithm under multicast traffic [5].

6.1. Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are listed in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed among all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the sizes of the multicast flows are equal [17, 18].

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) V is defined as the average, over the available links of a multicast subnetwork, of the sum of the transmission delay differences between adjacent packets, that is,

V = (Σ_{i∈x} Σ_{j∈l} (T(t)_{ij} − T(t−1)_{ij})) / x,   (24)

where x is the number of available links, l is the number of packets on an available link, and T(t) indicates the transmission delay of a packet at time-slot t.

We take the ADV as a metric for the network state of the multicast subnetwork: the smaller the ADV, the more stable the network state, and vice versa.
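As a concrete illustration of Definition 5 (the delay samples below are made up, and the helper name is ours), the ADV of (24) can be computed as:

```python
# Sketch of Definition 5: average delay variance (ADV) over available links,
# from per-packet transmission delays. The delay samples are made up.

def adv(delays):
    """V = (sum_i sum_j (T(t)_ij - T(t-1)_ij)) / x over x available links."""
    x = len(delays)
    total = sum(delays[i][j] - delays[i][j - 1]
                for i in range(x)
                for j in range(1, len(delays[i])))
    return total / x

stable = [1.00, 1.00, 1.01, 1.00]   # per-packet delays (ms) on a stable link
jittery = [1.00, 1.40, 0.90, 1.50]  # a link with unstable delay

print(adv([stable, jittery]))  # ~0.25: dominated by the jittery link
print(adv([stable, stable]))   # ~0.0: a stable subnetwork
```

A subnetwork whose links hold a near-constant delay yields an ADV near zero, which is exactly the "stable network state" reading used in the comparisons below.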

Figure 7 shows the average delay variance (ADV) of links as the number of packets grows. When the link remaining bandwidth μ is taken as ω or 2ω, the average delay variance exhibits large jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request ω at time-slot t + 1. The average delay variance is close to a straight line when the link remaining bandwidth is 3ω, which implies that the network state is very stable. Therefore, the simulation results confirm that the optimal value of the link remaining bandwidth μ is 3ω.

Figure 7: Average delay variance (ADV) comparison among links with different remaining bandwidths (μ = ω, 2ω, 3ω).

Figure 8: Average delay variance (ADV) comparison between the uplink and the downlink.

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a full-bisection network, that is, the total bandwidth of the uplinks and downlinks is equal; however, the downlink load is higher than the uplink load under multicast traffic, and therefore the uplink state is more stable.

6.3. Total Network Throughput. In this subsection, we set the length of the time-slot t to ω/S and 2(ω/S). We observe from Figure 9(a) that the MSaMC achieves better performance than the SLMR algorithm when the length of the time-slot is 2(ω/S). This is because the MSaMC can quickly recover from network blocking and thus achieves higher network throughput. In addition, the MSaMC cannot calculate the optimal path in real time when the length of the time-slot is ω/S; in that case, the SLMR algorithm provides higher throughput.

Figure 9(b) shows the throughput comparison of the MSaMC and the BCMS algorithm under the mixed scheduling pattern. The throughput of the BCMS algorithm falls as the simulation time increases, because its multicast transmission needs a longer time to resolve network blocking; the throughput decreases sharply when network blocking cannot be predicted. In contrast, the MSaMC can predict the probability of network blocking at the next time-slot and thereby addresses the delay problem of dynamic bandwidth allocation. Therefore, the MSaMC obtains higher total network throughput.

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm with unicast traffic, and the BCMS algorithm with mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We observe from Figure 10 that, as the simulation time increases, the MSaMC with t = 2(ω/S) has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms rely on more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. We also find that the MSaMC achieves its minimum average delay when the length of the time-slot is 2(ω/S): a time-slot of length 2(ω/S) can just ensure that data are transmitted accurately to the destination switches. A shorter time-slot (less than 2(ω/S)) leads to incomplete data transmission, while a longer time-slot (more than 2(ω/S)) causes incorrect prediction of the traffic blocking status.

Figure 9: Network throughput comparison: (a) MSaMC (t = 2(ω/S) and t = ω/S) versus SLMR; (b) MSaMC versus BCMS.

Figure 10: Average delay comparison: (a) MSaMC (t = 2(ω/S) and t = ω/S) versus SLMR; (b) MSaMC versus BCMS.

7. Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, for fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling that efficiently improves network performance. We show that the MSaMC guarantees low link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of the MSaMC is determined by the number of core switches m and the number of destination edge switches f. Finally, we compare the performance of the MSaMC with an existing unicast scheduling algorithm, the SLMR algorithm, and a well-known adaptive multicast scheduling algorithm, the BCMS algorithm. Experimental results show that the MSaMC achieves higher network throughput and lower average delay.

Notations

ω: Multicast bandwidth request of a data flow
b_i: The occupied bandwidth of the i-th link
μ: The remaining bandwidth of a link
a: The sum of the occupied bandwidth
y: The value of the link weight
S: Link bandwidth
M: The maximum number of added data flows
π: The number of added data flows
T: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013-3028, 2016.

[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.

[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29-43, 2003.

[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536-541, USA, April 2014.

[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses - Gradworks, 2014.

[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761-765, 2011.

[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1-7, Shenzhen, China, August 2017.

[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711-1714, 2012.

[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681-699, 2006.

[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications, New York: John Wiley & Sons, 1979, 446 pp.," Behavioural Science, vol. 26, no. 4, pp. 397-398, 1981.

[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173-178, Da Nang, Vietnam, January 2017.

[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56-60, China, April 2011.

[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class Bloom filter," in Proceedings of the 2011 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266-275, Canada, October 2011.

[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898-904, 2012.

[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Review of Modern Physics, vol. 26, no. 1, 2002.

[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442-454, 1995.

[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with SIGCOMM 2009, pp. 65-72, Spain, August 2009.

[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6-16, 2003.


Figure 4: The multicast subnetwork, with core switches (1, 2), edge switches (1, …, 4), servers (1, …, 12), and a multicast flow.

at the next time-slot. Thus our goal is to obtain the link blocking probability for any type of multicast bandwidth request at the next time-slot.

It is known that the fat-tree DCN is a typical large-scale network in which many available links can satisfy a multicast connection request. When a link is available for a multicast bandwidth request ω, the blocking probability of the link at the current time-slot is given by p = ω/μ, where μ is the remaining bandwidth of the link.

A multicast connection can be represented by its destination edge switches. Given a multicast bandwidth request ω with fanout f (1 ≤ f < r), let P(f) denote the blocking probability for this multicast connection. We denote the blocking of the available uplinks as the events u_1, u_2, …, u_x, and the blocking of the available downlinks between the available core switches and the k-th (1 ≤ k ≤ f) destination edge switch as the events d_{k1}, d_{k2}, …, d_{kx}. All available links form a multicast tree, rooted at the core switches, that can satisfy the multicast connection in the multicast network. The other notations used in this paper are summarized in Notations.

3.2. Multicast Blocking Model. In the multicast subnetwork, we use ε to denote the event that a multicast connection request with fanout f cannot be satisfied in the network shown in Figure 4. We do not consider links whose remaining bandwidth is less than the multicast bandwidth request ω, since such a link is not available for a multicast data flow ω. We let P(ε | φ) be the conditional blocking probability in state φ and P(φ) be the probability of state φ. Then the blocking probability of the subnetwork for a multicast connection is given by

P(f) = P(ε) = Σ_φ P(φ) P(ε | φ).   (1)

For the event φ, the data traffic of the uplinks does not interfere with each other; that is, the uplinks are independent. Therefore, we have P(φ) = p^k q^{m−k}.

From the multicast blocking subnetwork in Figure 4, we can obtain the blocking property of fat-tree DCNs: the multicast bandwidth request ω from a source edge switch to distinct destination edge switches cannot be satisfied if and only if there is no available downlink connecting all destination edge switches.

Accordingly, we take ε′ to denote the event that the multicast bandwidth request ω with fanout f cannot be satisfied through the available uplinks. Thus we can get

P(ε′) = P(ε | u_1, u_2, …, u_x).   (2)

An available downlink d_{ij}, where 1 ≤ i ≤ f and 1 ≤ j ≤ x, represents a link from a core switch to the i-th destination edge switch. The event ε′ can be expressed by the events d_{ij} as follows:

ε′ = (d_{11} ∩ d_{12} ∩ ⋯ ∩ d_{1x}) ∪ ⋯ ∪ (d_{f1} ∩ d_{f2} ∩ ⋯ ∩ d_{fx}).   (3)

Afterwards, we define the blocking of the downlinks connecting to each destination edge switch as the events A = {A_1, A_2, …, A_f}, where A_1 = (d_{11} ∩ d_{12} ∩ ⋯ ∩ d_{1x}). Thus we get

ε′ = ⋃_{i=1}^{f} A_i.   (4)

Based on the theory of combinatorics, the inclusion-exclusion principle (also known as the sieve principle) relates the size of a union of sets to the sizes of their intersections. For the general case of the principle in [14], let A_1, A_2, …, A_f be finite sets. Then we have

|⋃_{i=1}^{f} A_i| = Σ_{i=1}^{f} |A_i| − Σ_{1≤i<j≤f} |A_i ∩ A_j| + Σ_{1≤i<j<h≤f} |A_i ∩ A_j ∩ A_h| − ⋯ + (−1)^{f−1} |A_1 ∩ A_2 ∩ ⋯ ∩ A_f|.   (5)

For the events A_1, A_2, …, A_f in a probability space (Ω, F, P), we can obtain the probability of the event ε′:

P(ε′) = Σ_{i=1}^{f} P(A_i) − Σ_{1≤i<j≤f} P(A_i ∩ A_j) + Σ_{1≤i<j<h≤f} P(A_i ∩ A_j ∩ A_h) − ⋯ + (−1)^{f−1} P(A_1 ∩ A_2 ∩ ⋯ ∩ A_f),   (6)

where P(A_i) denotes the probability of the event A_i.

Combining (1) and (2) with (6), the multicast blocking model for a multicast connection with fanout f is given by

P(f) = Σ_{k=1}^{m} C(m, k) p^k q^{m−k} (Σ_{i=1}^{f} P(A_i) − Σ_{1≤i<j≤f} P(A_i ∩ A_j) + Σ_{1≤i<j<h≤f} P(A_i ∩ A_j ∩ A_h) − ⋯ + (−1)^{f−1} P(A_1 ∩ A_2 ∩ ⋯ ∩ A_f)).   (7)
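The inclusion-exclusion expansion used in (6) and (7) is easy to sanity-check numerically: for independent events it must agree with the complement-product form of the union probability. The probabilities below are illustrative, not taken from the paper.

```python
# Sanity check of the inclusion-exclusion expansion in (6): for independent
# blocking events A_1..A_f it reproduces P(union) = 1 - prod(1 - P(A_i)).
from itertools import combinations
from math import prod

q = [0.2, 0.1, 0.05]  # illustrative P(A_1), P(A_2), P(A_3)
f = len(q)

union = 0.0
for size in range(1, f + 1):
    sign = (-1) ** (size - 1)
    for idx in combinations(range(f), size):
        # Independence: P(intersection of the chosen A_i) is a product.
        union += sign * prod(q[i] for i in idx)

closed = 1.0 - prod(1.0 - qi for qi in q)
assert abs(union - closed) < 1e-12
print(round(union, 6))  # 0.316
```

The same subset enumeration, with each intersection probability being a product of the per-link probabilities p_{ij}, is exactly what the inner bracket of (7) computes.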

From (6), since Σ_{1≤i<j≤f} P(A_i ∩ A_j) ≥ Σ_{1≤i<j<h≤f} P(A_i ∩ A_j ∩ A_h) ≥ ⋯, the following inequality can be derived:

Σ_{1≤i<j≤f} P(A_i ∩ A_j) − Σ_{1≤i<j<h≤f} P(A_i ∩ A_j ∩ A_h) + ⋯ + (−1)^{f−1} P(A_1 ∩ A_2 ∩ ⋯ ∩ A_f) ≥ 0.   (8)

Therefore, the minimum blocking probability of the event ε′ is

P_min(ε′) = Σ_{i=1}^{f} P(A_i),   (9)

where P(A_1) = ∏_{k=1}^{x} p_{1k}.

Afterwards, we define P_min(f) as the minimum blocking probability of the multicast subnetwork, where the number of available core switches is x. Thus we get

119875min (119891) = 119909sum119896=1

(119909119896) 119901119896119902119909minus119896( 119891sum

119894=1

119875 (119860 119894))

= 119909sum119896=1

(119909119896) 119901119896119902119909minus119896( 119891sum

119894=1

119909prod119895=1

119901119894119895) (10)

where $x \le m$.

It is not difficult to see from (10) that the minimum blocking probability $P_{\min}(f)$ increases with the fanout $f$. In other words, a multicast bandwidth request with a larger fanout is more difficult to realize, since fewer core switches are available. Therefore, the minimum blocking probability with fanout $f$ reflects the state of the available links at the next time-slot.

3.3. Link Blocking Probability at the Next Time-Slot. In this subsection, we calculate the blocking probability of an available link at the next time-slot based on Markov chain theory. We randomly select a link, denoted the $i$th link, for analysis.

In the multicast blocking model, we denote the current time-slot as $t$ and the next time-slot as $t+1$. $b_i$ is the occupied bandwidth of the $i$th link at time-slot $t$, that is, $y_i(t) = b_i$; $a(t)$ is the sum of the occupied bandwidth of all available downlinks at time-slot $t$, namely $a(t) = \sum_{j=1}^{x} y_j(t)$; and $y_i(t+1)$ refers to the predicted occupied bandwidth of the $i$th link at time-slot $t+1$. In [15], a preference or uniform selection mechanism based on Markov chains is adopted for calculating the link blocking probability at the next time-slot. Based on this mechanism, the probability $P_i$ that the $i$th link receives a new flow at time-slot $t+1$ is given by

$P_i = P_{\min}(f) \cdot \dfrac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \dfrac{1}{f}$ (11)

where $1 \le f \le r$.

In addition, we do not consider the case where the bandwidth of an available link decreases; namely, the bandwidth of an available link is sufficient for the multicast bandwidth request. If a multicast bandwidth request selects the $i$th link at time-slot $t+1$, then $y_i(t+1)$ will increase by $\pi_i$, where $1 \le \pi_i \le M_i$ and $M_i$ is defined as the maximum number of additional data flows. Let $P_{b_i}$ denote the probability that the flow on the $i$th link remains unchanged or increases at time-slot $t+1$; then

$P_{b_i} = \sum_{\pi_i=0}^{M_i} \left[ P_{\min}(f) \cdot \dfrac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \dfrac{1}{f} \right]^{\pi_i} = \sum_{\pi_i=0}^{M_i} P_i^{\pi_i} = \dfrac{1 - P_i^{M_i+1}}{1 - P_i}$ (12)

where $i = 1, 2, \ldots, x$ and $\pi_i = 0, 1, \ldots, M_i$.

According to (12), we calculate the one-step transition probability of a multicast flow, denoted as $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$, which is a Markov process:

$P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \dfrac{1}{P_{b_i}} \cdot \left( P_{\min}(f) \cdot \dfrac{b_i}{a(t)} + \left(1 - P_{\min}(f)\right) \cdot \dfrac{1}{f} \right)^{\pi_i} = \dfrac{(1 - P_i) \cdot P_i^{\pi_i}}{1 - P_i^{M_i+1}}$ (13)

where $i = 1, 2, \ldots, x$.

In fact, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ indicates the link blocking probability at time-slot $t+1$, which is determined by $P_i$ and $\pi_i$. The link blocking probability will be small when $\pi_i$ is small at time-slot $t+1$; otherwise, the link may be blocked at time-slot $t+1$. Therefore, the range of $\pi_i$ is very important to our proposed multicast scheduling algorithm. In this paper, we assume that the multicast bandwidth request $\omega$ is one data-flow unit and that $\pi_i$ is an integer multiple of the multicast bandwidth request $\omega$.
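The quantities in (11)-(13) are straightforward to evaluate numerically. The following sketch (the helper names and sample values are ours, not from the paper) computes $P_i$, the normalizer $P_{b_i}$, and the one-step transition distribution, and checks that the distribution sums to 1 over the allowed increments $\pi_i = 0, \ldots, M_i$.

```python
def p_new_flow(p_min_f, b_i, a_t, f):
    # Equation (11): preference (share of occupied bandwidth) mixed with
    # uniform selection over the f destinations.
    return p_min_f * (b_i / a_t) + (1.0 - p_min_f) * (1.0 / f)

def p_bi(p_i, m_i):
    # Equation (12): geometric series sum_{pi=0}^{M_i} P_i^pi.
    return (1.0 - p_i ** (m_i + 1)) / (1.0 - p_i)

def transition_prob(p_i, m_i, pi_i):
    # Equation (13): P(y_i(t+1) = b_i + pi_i | y_i(t) = b_i).
    return (1.0 - p_i) * p_i ** pi_i / (1.0 - p_i ** (m_i + 1))

# Purely illustrative values.
P_i = p_new_flow(p_min_f=0.2, b_i=30.0, a_t=120.0, f=3)
M_i = 4
dist = [transition_prob(P_i, M_i, k) for k in range(M_i + 1)]
assert abs(sum(dist) - 1.0) < 1e-12        # a valid probability distribution
assert dist == sorted(dist, reverse=True)  # larger increments are less likely
```

The decreasing tail of `dist` reflects the observation above: the link is unlikely to be blocked when $\pi_i$ is small.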


Input: incoming flow $(i, D, \omega)$; link remaining bandwidth $\mu$; the number of destination edge switches $|D|$; $\pi_i = 3\omega$
Output: multicast links with the minimum blocking probability
(1) Step 1: identify available core switches
(2) for $i = 1$ to $m$ do
(3)   Select an uplink $u_i$
(4)   if $u_i^\mu \ge 3\omega$ and $|T| \le |D|$ then
(5)     Select core switch $i$ and add it into the set $T$
(6)   end if
(7) end for
(8) Step 2: select appropriate core switches
(9) Calculate the blocking probability of the available downlinks at time-slot $t+1$, $P_i(t+1)$, by equation (13)
(10) for $j = 1$ to $|D|$ do
(11)   Find the core switch(es) in $T$ that are connected to a destination edge switch in $D$
(12)   if multiple core switches are found then
(13)     Select the core switch with the minimum blocking probability and deliver it to the set of appropriate core switches $T'$
(14)   else
(15)     Deliver the core switch to the set $T'$
(16)   end if
(17)   Remove from $D$ the destination edge switches that the selected core switch can reach
(18)   Update the set of remaining core switches in $T$
(19) end for
(20) Step 3: establish the optimal paths
(21) Connect the links between the source edge switch and the destination edge switches through the appropriate core switches in the set $T'$
(22) Send configuration signals to the corresponding devices in the multicast subnetwork

Algorithm 1: Multicast scheduling algorithm with Markov chains (MSaMC).

4. Multicast Scheduling Algorithm with Markov Chains

In this section, we propose a multicast scheduling algorithm with Markov chains (MSaMC) for fat-tree DCNs, which aims to minimize the blocking probability of available links and improve the traffic efficiency of data flows in the multicast network. We then give a simple example to explain the implementation process of MSaMC.

4.1. Description of the MSaMC. The core of MSaMC is to select the downlinks with the minimum blocking probability at time-slot $t+1$. Accordingly, the first step of the algorithm is to find the available core switches, denoted as the set $T$, $|T| \le f$. We take the remaining bandwidth of the $i$th uplink as $u_i^\mu$. Based on our theoretical analysis in Section 5, the multicast subnetwork may be blocked if this bandwidth is less than $3\omega$; hence we require $u_i^\mu \ge 3\omega$.

The second step is to choose, in each iteration, the appropriate core switch that is connected to the downlink with the minimum blocking probability at time-slot $t+1$. At the end of each iteration, we transfer the selected core switch from the set $T$ to the set $T'$. The iteration terminates when the set of destination edge switches $D$ is empty. Obviously, the core switches in the set $T'$ are connected to the downlinks with the minimum blocking probability, and the set $T'$ can satisfy an arbitrary multicast flow request in fat-tree DCNs [5].

Based on the above steps, we obtain a set of appropriate core switches $T'$. Moreover, each destination edge switch in $D$ can find one downlink from the set $T'$ to be connected with the minimum blocking probability at time-slot $t+1$. The third step is to establish the optimal paths from the source edge switch to the destination edge switches through the appropriate core switches. The state of the multicast subnetwork is updated after the source server sends the configuration signals to the corresponding forwarding devices. The main process of MSaMC is described in Algorithm 1.

Table 1: Link remaining bandwidth (M).

     C1   C2   C3   C4
E1   90   300  600  800
E2   600  700  800  200
E3   750  400  350  700
E4   500  200  150  500

4.2. An Example of the MSaMC. For the purpose of illustration, in the following we give a scheduling example in a simple fat-tree DCN, as shown in Figure 5. Assume that we have obtained the network state at time-slot $t$ and made a multicast flow request $(1, (2, 3, 4), 50\mathrm{M})$. The link remaining bandwidth $\mu$ and the link blocking probability $P$ at the next time-slot are shown in Tables 1 and 2, respectively. The symbol √ denotes an available uplink and × indicates an unavailable link. For clarity, we select only two layers of the network and give the relevant links in each step.

As described in Section 4.1, MSaMC is implemented in three steps. Firstly, we take the remaining bandwidth of the uplink as $u^\mu$ ($u_i^\mu \ge 3 \times 50\mathrm{M}$) and find the set of available core switches, that is, $T = \{2, 3, 4\}$. Secondly, we evaluate the blocking probability of the relevant downlinks at time-slot $t+1$. In


Table 2: The link blocking probability at the next time-slot (%).

     C1   C2   C3   C4
E1   ×    9    5    4
E2   ×    4    3    7
E3   ×    6    7    4
E4   ×    9    10   5

Figure 5: An example of the MSaMC. (a) The links satisfying the multicast flow request $(1, (2, 3, 4), \omega)$; (b) the optimal paths selected by the MSaMC.

effect, the blocking probability at time-slot $t+1$ of the downlink from core switch 2 to destination edge switch 2 is higher than that of the downlink from core switch 3 to destination edge switch 2; therefore we select the latter downlink as the optimal path. Subsequently, core switch 3 is put into the set $T'$. Similarly, we obtain core switch 4 for the set $T'$. Finally, the optimal paths are constructed, and the routing information is sent to source edge switch 1 and core switches 3 and 4.

In Figure 5(a), the remaining bandwidth of the link from edge switch 1 to core switch 1 is less than $150\mathrm{M}$, so core switch 1 is excluded. In the above way, we find that the optimal paths for each pair of source and destination edge switches are: source edge switch 1 → core switch 3 → destination edge switch 2; source edge switch 1 → core switch 4 → destination edge switch 3; and source edge switch 1 → core switch 4 → destination edge switch 4, as shown in Figure 5(b).
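The worked example above can be reproduced with a few lines of code. The sketch below (our own simplification: it applies the uplink filter of Step 1 and a per-destination minimum of Step 2, omitting Algorithm 1's bookkeeping of reached destinations) uses the data of Tables 1 and 2 for source edge switch 1.

```python
OMEGA = 50  # multicast bandwidth request, in M

# Table 1, row E1: remaining uplink bandwidth from source edge switch 1.
uplink_bw = {1: 90, 2: 300, 3: 600, 4: 800}

# Table 2 read column-wise: blocking[core][edge] is the predicted downlink
# blocking probability (%) at t+1; core switch 1 is unavailable (x).
blocking = {
    2: {2: 4, 3: 6, 4: 9},
    3: {2: 3, 3: 7, 4: 10},
    4: {2: 7, 3: 4, 4: 5},
}

# Step 1: an uplink is available if its remaining bandwidth is >= 3*omega.
T = [c for c, bw in uplink_bw.items() if bw >= 3 * OMEGA]
assert T == [2, 3, 4]            # core switch 1 (90M < 150M) is excluded

# Step 2: for each destination, pick the core switch in T whose downlink
# has the minimum predicted blocking probability at t+1.
destinations = [2, 3, 4]
chosen = {d: min(T, key=lambda c: blocking[c][d]) for d in destinations}
assert chosen == {2: 3, 3: 4, 4: 4}   # paths via core switch 3 and core switch 4
```

The result matches the text: destination 2 is reached through core switch 3, and destinations 3 and 4 through core switch 4.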

5. Theoretical Analysis

In this section, we analyze the performance of MSaMC. From (9), we derive the blocking probability bound of the multicast subnetwork, as shown in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than 1/3.

Proof. By the first step of Algorithm 1, we take the remaining bandwidth of the uplink to be no less than $3\omega$; thus the maximum value of the link blocking probability $p$ is 1/3. In other words, the worst case occurs when the available link remaining bandwidth just satisfies the above condition, that is, $u^\mu = 3\omega$.

From (9) and De Morgan's laws [16], we can obtain the probability of the event $\epsilon'$:

$P_{\min}(\epsilon') = 1 - \prod_{i=1}^{f} P\left(\overline{d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix}}\right) = 1 - \prod_{i=1}^{f} \left(1 - P(d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix})\right) = 1 - \prod_{i=1}^{f} \left(1 - \prod_{k=1}^{x} p_{d_{ik}}\right) = 1 - (1 - p^x)^f$ (14)

Therefore, based on (10), the subnetwork blocking probability is maximum when the number of uplinks is 1. Thus we can obtain

$\max P_{\min}(f) = p \cdot \left(1 - (1 - p^{x_{\min}})^f\right) = \dfrac{1}{3}\left(1 - \left(1 - \dfrac{1}{3}\right)^f\right)$ (15)

Then we have $\max P_{\min}(f) \to 1/3$ as $f \to \infty$. This completes the proof.

The result of Lemma 1 is not related to the number of switch ports. This is because the deduction of Lemma 1 is based on the link blocking probability $p$, $p = \omega/\mu$, and neither the multicast bandwidth $\omega$ nor the link remaining bandwidth $\mu$ is affected by the number of switch ports. Therefore, Lemma 1 still holds when the edge switches have more ports. Moreover, the size of the switch radix has no effect on the performance of MSaMC.

At time-slot $t+1$, the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee the efficient transmission of multicast flows. Based on (7) and Lemma 1, we can get $\max P_i = 1/3$ when the numbers of uplinks and downlinks are both equal to 2. Clearly, this corresponds to the simplest multicast transmission model; in a real multicast network, $P_i \ll 1/3$ holds in general.

In addition, $P_i$ is proportional to $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$; namely, the link blocking probability increases as the multicast flow becomes larger. Therefore, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ is monotonically increasing in $p_i$.

Theorem 2. If the remaining bandwidth of an available link $\mu$ is no less than $3\omega$, the multicast flow can be transferred to $f$ destination edge switches.

Proof. For each incoming flow, by adopting the preferred selection mechanism in selecting the $i$th link, when $\pi_i \ge 1$


we compute the first-order derivative of (13) with respect to $p_i$, where $i = 1, 2, \ldots, x$:

$\dfrac{\partial}{\partial p_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = -\dfrac{P_i^{\pi_i}}{1 - P_i^{M_i+1}} + \dfrac{\pi_i \cdot (1 - P_i) \cdot P_i^{\pi_i}}{p_i \cdot \left(1 - P_i^{M_i+1}\right)} + \dfrac{(M_i + 1) \cdot (1 - P_i) \cdot P_i^{\pi_i} \cdot P_i^{M_i+1}}{p_i \cdot \left(1 - P_i^{M_i+1}\right)^2}$ (16)

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when $\pi_i \ge 3$; hence we obtain $\frac{\partial}{\partial p_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) > 0$. Therefore, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ is a monotonically increasing function of $p_i$ when $\pi_i \ge 3$. The multicast flow request $\omega$ is defined as one data unit; evidently $\pi_i \ge 3\omega$. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request $\omega$ at time-slot $t+1$ if $\mu \ge 3\omega$. This completes the proof.
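The monotonicity claim can also be illustrated numerically. The check below (our own illustration, sweeping $P_i$ directly rather than differentiating) confirms that for increments $\pi_i \ge 3$, the transition probability (13) increases over the admissible range $P_i \in (0, 1/3]$.

```python
def trans_prob(p, m, k):
    # Equation (13): P(y_i(t+1) = b_i + k | y_i(t) = b_i).
    return (1.0 - p) * p ** k / (1.0 - p ** (m + 1))

ps = [i / 100.0 for i in range(1, 34)]   # P_i from 0.01 to 0.33
for m_i in (4, 8, 16):
    for k in (3, 4, 5):
        vals = [trans_prob(p, m_i, k) for p in ps]
        assert all(a < b for a, b in zip(vals, vals[1:]))  # increasing in P_i
```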

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition $\mu \ge 3\omega$ not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, a downlink may receive data flows from other uplinks at any time-slot, which results in uncertainty of the downlink state at time-slot $t+1$. Therefore, we take the minimum blocking probability at time-slot $t+1$ as the selection criterion for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot $t+1$. We therefore derive the expected occupied bandwidth of the $i$th downlink connecting to the $j$th destination edge switch at time-slot $t+1$, denoted by $e_i(t, b_i)$, $j = 1, 2, \ldots, f$. Given that the data flow on the $i$th downlink is $b_i$, we can obtain

$e_i(t, b_i) = \sum_{\pi_i=0}^{M_i} (b_i + \pi_i) \cdot P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = b_i + \dfrac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i}$ (17)

where $P_{b_i} = (1 - P_i^{M_i+1})/(1 - P_i)$, $i = 1, 2, \ldots, x$.

From (17), we conclude the following theorem, which explains the average increase rate of the data flow on each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink at time-slot $t+1$ is no more than two units on average.

Proof. We note that $\sum_{\pi_i=0}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = 1$, which means the flow increment of each link must be one element of the set $\{0, 1, \ldots, M_i\}$.

Setting $A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = P_i + \sum_{\pi_i=2}^{M_i} \pi_i \cdot P_i^{\pi_i}$, we can get $P_i \cdot A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i+1} = \sum_{\pi_i=2}^{M_i} (\pi_i - 1) \cdot P_i^{\pi_i} + M_i \cdot P_i^{M_i+1}$.

Through the subtraction of the above two equations, we can obtain $(1 - P_i) \cdot A = P_i + \sum_{\pi_i=2}^{M_i} P_i^{\pi_i} - M_i \cdot P_i^{M_i+1}$. Then we have $A = (P_i - M_i \cdot P_i^{M_i+1})/(1 - P_i) + (P_i^2 - P_i^{M_i+1})/(1 - P_i)^2$. Substituting it into (17), we can obtain

$e_i(t, b_i) = b_i + \dfrac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = b_i + \dfrac{A}{P_{b_i}} = b_i + \dfrac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \dfrac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)\left(1 - P_i^{M_i+1}\right)}$ (18)

where $P_i < 1/3$. By relaxing the latter two terms of (18), $e_i(t, b_i)$ can be bounded as

$e_i(t, b_i) = b_i + \dfrac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \dfrac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)\left(1 - P_i^{M_i+1}\right)} < b_i + 2$ (19)

where $i = 1, 2, \ldots, x$.

By merging (17) and (19), we have $b_i < e_i(t, b_i) < b_i + 2$ and then $1 < e_i(t, b_i) - b_i + 1 < 3$. Hence the downlink bandwidth will increase by at least one unit of data flow when the downlink is blocked.
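The algebra behind (17)-(19) can be sanity-checked numerically. The sketch below (our own check, with illustrative values of $b_i$, $P_i$, and $M_i$) verifies that the direct expectation under the transition distribution (13) matches the closed form (18), and that the average increase stays strictly between 0 and 2, as Theorem 3 states.

```python
def trans_prob(p, m, k):
    # Equation (13): P(y_i(t+1) = b_i + k | y_i(t) = b_i).
    return (1.0 - p) * p ** k / (1.0 - p ** (m + 1))

def expectation_direct(b, p, m):
    # Equation (17): E[y_i(t+1) | y_i(t) = b] summed term by term.
    return sum((b + k) * trans_prob(p, m, k) for k in range(m + 1))

def expectation_closed(b, p, m):
    # Equation (18): closed form after summing the series.
    return (b
            + (p - m * p ** (m + 1)) / (1.0 - p ** (m + 1))
            + (p * p - p ** (m + 1)) / ((1.0 - p) * (1.0 - p ** (m + 1))))

b_i = 10.0
for p_i in (0.05, 0.1, 1.0 / 3.0):
    for m_i in (2, 4, 8, 16):
        e = expectation_direct(b_i, p_i, m_i)
        assert abs(e - expectation_closed(b_i, p_i, m_i)) < 1e-9  # (17) == (18)
        assert b_i < e < b_i + 2.0                                # Theorem 3
```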

When $M_i < e_i(t, b_i) - b_i + 1$, the number of increased data flows would be larger than $M_i$; however, this is not allowed by the definition of $M_i$. Thus we obtain

$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = 0$ (20)

When $M_i \ge e_i(t, b_i) - b_i + 1$, we can get

$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} \dfrac{1}{P_{b_i}} \cdot P_i^{\pi_i} = \dfrac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}}$ (21)

Equation (21) represents the downlink traffic capability at time-slot $t+1$. When the value of (21) is large, the blocking probability of the downlink is high, and vice versa. To clarify that the downlink has a low blocking probability at the next time-slot, we give the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot $t+1$ is less than 0.125.


Figure 6: Downlink blocking probability comparison for different $M_i$ ($M_i = 2, 4, 8, 16$).

Proof. Based on (21), we take the minimum value of $M_i$ compatible with the starting increment $e_i(t, b_i) - b_i + 1 = 3$, that is, $M_i = 3$. Thus we get

$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \dfrac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}} < \dfrac{(1/3)^3 - (1/3)^{3+1}}{1 - (1/3)^{3+1}} = 0.025 < 0.125$ (22)

This completes the proof
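Numerically, the tail probability (21) stays far below the 0.125 bound of Theorem 4 at the worst-case $P_i = 1/3$. The check below is our own sketch; `s` denotes the smallest excess increment $e_i(t, b_i) - b_i + 1$.

```python
def tail_prob(p, m, s):
    # Equation (21): P(y_i(t+1) > e_i(t, b_i) | y_i(t) = b_i), with
    # s = e_i(t, b_i) - b_i + 1 the smallest increment that exceeds e_i.
    return (p ** s - p ** (m + 1)) / (1.0 - p ** (m + 1))

# At P_i = 1/3, s = 3, and the minimal M_i = 3, the value is 2/80 = 0.025.
assert abs(tail_prob(1.0 / 3.0, 3, 3) - 0.025) < 1e-12

# It remains below Theorem 4's bound for larger M_i as well.
for m_i in (3, 4, 8, 16, 64):
    assert tail_prob(1.0 / 3.0, m_i, 3) < 0.125
```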

To show that MSaMC maintains a low downlink blocking probability at time-slot $t+1$ under different values of $M_i$, we provide the comparison shown in Figure 6.

In Figure 6, $P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i)$ indicates the downlink blocking probability, and its values are no more than 0.125 for the different $M_i$ and $P_i$. Near the zero point, the blocking probability is close to zero unless $P_i > 0.1$. In a real network, the condition $P_i > 0.1$ rarely occurs. Therefore, MSaMC has a very low blocking probability.

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes $O(m)$ time to identify the available core switches. In the second step, MSaMC needs to find the appropriate core switches: we need $O(f \cdot f)$ time to calculate the blocking probability of the available downlinks at time-slot $t+1$ and select the appropriate core switches into the set $T'$, where $f \le r - 1$. In the end, we take $O(f + f)$ time to construct the optimal paths from the source edge switch to the destination edge switches. Thus, the computational complexity of MSaMC is given by

$O(m + f \cdot f + f + f) \le O\left(m + (r-1)^2 + 2(r-1)\right) = O(r^2 + m - 1)$ (23)

Note that the complexity of the algorithm is polynomial in the number of core switches $m$ and the number of edge switches $r$, which means that the computational complexity is rather low if the fanout $f$ is small. Therefore, the algorithm is time-efficient in multicast scheduling.

Table 3: Parameter setting.

Parameter                  Value
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s

6. Simulation Results

In this section, we utilize the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links under different time-slots. Afterwards, we compare the performance of MSaMC and the SLMR algorithm under unicast traffic [4] and present the comparison between MSaMC and the BCMS algorithm under multicast traffic [5].

6.1. Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are set in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed over all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the sizes of the multicast flows are equal [17, 18].

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and the downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) $V$ is defined as the average, over the available links of a multicast subnetwork, of the sum of the transmission delay differences of two adjacent packets, that is,

$V = \dfrac{\sum_{i \in x} \sum_{j \in l} \left( T(t)_{ij} - T(t-1)_{ij} \right)}{x}$ (24)

where $x$ is the number of available links, $l$ is the number of packets on an available link, and $T(t)$ indicates the transmission delay of a packet at time-slot $t$.

We take ADV as a metric for the network state of the multicast subnetwork. The smaller the ADV, the more stable the network state, and vice versa.
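Definition 5 amounts to a simple aggregation over per-packet delay differences. The sketch below (hypothetical delay samples of our own; two links, two packets each) computes $V$ per equation (24).

```python
def adv(delay_now, delay_prev):
    # delay_now[i][j] / delay_prev[i][j]: transmission delay of packet j on
    # available link i at time-slots t and t - 1.
    x = len(delay_now)  # number of available links
    total = sum(t1 - t0
                for link_now, link_prev in zip(delay_now, delay_prev)
                for t1, t0 in zip(link_now, link_prev))
    return total / x

prev = [[1.0, 1.1], [0.9, 1.0]]   # time-slot t - 1
now = [[1.2, 1.1], [1.0, 1.1]]    # time-slot t
assert abs(adv(now, prev) - 0.2) < 1e-12   # (0.2 + 0.0 + 0.1 + 0.1) / 2
```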

Figure 7 shows the average delay variance (ADV) of links as the number of packets grows. When the link remaining bandwidth $\mu$ is taken as $\omega$ or $2\omega$, the average delay variance


Figure 7: Average delay variance (ADV) comparison among links with different remaining bandwidth ($\mu = \omega$, $\mu = 2\omega$, $\mu = 3\omega$).

Figure 8: Average delay variance (ADV) comparison between uplink and downlink.

has larger jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request $\omega$ at time-slot $t+1$. The average delay variance is close to a straight line when the link remaining bandwidth is $3\omega$, which implies that the network state is very stable. Therefore, the simulation result shows that the optimal value of the link remaining bandwidth $\mu$ is $3\omega$.

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN provides full bisection bandwidth; that is, the bandwidths of the uplink and the downlink are equal. However, the downlink load is higher than the uplink load under multicast traffic; therefore, the uplink state is more stable.

6.3. Total Network Throughput. In this subsection, we set the length of time-slot $t$ to $\omega/S$ and $2(\omega/S)$. We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of time-slot $t$ is $2(\omega/S)$. This is because MSaMC can quickly recover from network blocking and thus achieves higher network throughput. In contrast, MSaMC cannot calculate the optimal path in real time when the length of time-slot $t$ is $\omega/S$; in that case, the SLMR algorithm provides the higher throughput.

Figure 9(b) shows the throughput comparison of MSaMC and the BCMS algorithm under the mixed scheduling pattern. The throughput of the BCMS algorithm becomes lower as the simulation time increases. The multicast transmission of the BCMS algorithm needs a longer time to resolve network blocking; therefore, the throughput decreases sharply if network blocking cannot be predicted. In contrast, MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, MSaMC obtains higher total network throughput.

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm under unicast traffic, and the BCMS algorithm under mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases, MSaMC with $t = 2(\omega/S)$ has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms use more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. In addition, we find that MSaMC has the minimum average delay when the length of the time-slot is $2(\omega/S)$. This is because a time-slot of length $2(\omega/S)$ can just ensure that data are transmitted accurately to the destination switches: a shorter time-slot (less than $2(\omega/S)$) leads to incomplete data transmission, while a longer time-slot (more than $2(\omega/S)$) causes incorrect prediction of the traffic blocking status.

7. Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, for fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to improve network performance efficiently. We show that MSaMC can guarantee low link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of MSaMC is determined by the number of core switches $m$ and the number of destination edge switches $f$. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm called


Figure 9: Network throughput comparison: (a) MSaMC ($t = 2(\omega/S)$ and $t = \omega/S$) versus SLMR; (b) MSaMC versus BCMS.

Figure 10: Average delay comparison: (a) MSaMC ($t = 2(\omega/S)$ and $t = \omega/S$) versus SLMR; (b) MSaMC versus BCMS.

the SLMR algorithm and a well-known adaptive multicast scheduling algorithm called the BCMS algorithm. Experimental results show that MSaMC can achieve higher network throughput and lower average delay.

Notations

$\omega$: Multicast bandwidth request of a data flow
$b_i$: The occupied bandwidth of the $i$th link
$\mu$: The remaining bandwidth of a link
$a$: The sum of the occupied bandwidth
$y$: The value of the link weight
$S$: Link bandwidth
$M$: The maximum number of additional data flows
$\pi$: The number of additional data flows
$T$: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013-3028, 2016.
[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29-43, 2003.
[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536-541, USA, April 2014.
[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses - Gradworks, 2014.
[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761-765, 2011.
[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1-7, Shenzhen, China, August 2017.
[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711-1714, 2012.
[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681-699, 2006.
[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications, New York: John Wiley & Sons, 1979, 446 pp.," Behavioural Science, vol. 26, no. 4, pp. 397-398, 1981.
[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173-178, Da Nang, Vietnam, January 2017.
[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56-60, China, April 2011.
[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class Bloom filter," in Proceedings of the 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266-275, Canada, October 2011.
[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898-904, 2012.
[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Review of Modern Physics, vol. 26, no. 1, 2002.
[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442-454, 1995.
[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with SIGCOMM 2009, pp. 65-72, Spain, August 2009.
[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6-16, 2003.


Complexity 5

For the events \(A_1, A_2, \ldots, A_f\) in a probability space \((\Omega, F, P)\), we can obtain the probability of the event \(\epsilon'\):

\[P(\epsilon') = \sum_{i=1}^{f} P(A_i) - \sum_{1 \le i < j \le f} P(A_i \cap A_j) + \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) - \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f), \tag{6}\]

where \(P(A_i)\) denotes the probability of the event \(A_i\). Combining (1) and (2) with (6), the multicast blocking model for a multicast connection with fanout \(f\) is given by

\[P(f) = \sum_{k=1}^{m} \binom{m}{k} p^k q^{m-k} \Biggl(\sum_{i=1}^{f} P(A_i) - \sum_{1 \le i < j \le f} P(A_i \cap A_j) + \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) - \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f)\Biggr). \tag{7}\]

From (6) and \(\sum_{1 \le i < j \le f} P(A_i \cap A_j) \ge \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h)\), the following inequality can be derived:

\[\sum_{1 \le i < j \le f} P(A_i \cap A_j) - \sum_{1 \le i < j < h \le f} P(A_i \cap A_j \cap A_h) + \cdots + (-1)^{f-1} P(A_1 \cap A_2 \cap \cdots \cap A_f) \ge 0. \tag{8}\]

Therefore, the minimum blocking probability of the event \(\epsilon'\) is

\[P_{\min}(\epsilon') = \sum_{i=1}^{f} P(A_i), \tag{9}\]

where \(P(A_1) = \prod_{k=1}^{x} p_{1k}\).

Afterwards, we define \(P_{\min}(f)\) as the minimum blocking probability of the multicast subnetwork, where the number of available core switches is \(x\). Thus we get

\[P_{\min}(f) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \Biggl(\sum_{i=1}^{f} P(A_i)\Biggr) = \sum_{k=1}^{x} \binom{x}{k} p^k q^{x-k} \Biggl(\sum_{i=1}^{f} \prod_{j=1}^{x} p_{ij}\Biggr), \tag{10}\]

where \(x \le m\).

It is not difficult to see from (10) that the minimum blocking probability \(P_{\min}(f)\) is increasing in the fanout \(f\). In other words, it is more difficult to realize a multicast bandwidth request with a larger fanout, since fewer core switches remain available. Therefore, the minimum blocking probability with fanout \(f\) reflects the state of the available links at the next time-slot.
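The behavior of (10) can be checked numerically. The sketch below is a simplified illustration that assumes every downlink has the same blocking probability \(p\) (so \(P(A_i) = p^x\) for every destination \(i\)); the chosen values of \(x\) and \(p\) are assumptions for demonstration only.

```python
from math import comb

def p_min(f, x, p):
    """Minimum subnetwork blocking probability, eq. (10), under the
    simplifying assumption P(A_i) = p**x for every destination i."""
    q = 1 - p
    inner = sum(p ** x for _ in range(f))  # sum_i P(A_i) = f * p**x
    return sum(comb(x, k) * p ** k * q ** (x - k) * inner
               for k in range(1, x + 1))

# P_min(f) grows with the fanout f: more destinations, more blocking.
values = [p_min(f, x=4, p=1 / 3) for f in range(1, 6)]
assert all(values[i] < values[i + 1] for i in range(len(values) - 1))
```

Under this uniform assumption the inner sum is exactly \(f \cdot p^x\), which makes the monotonic growth in \(f\) immediate.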

3.3. Link Blocking Probability at the Next Time-Slot. In this subsection, we calculate the blocking probability of an available link at the next time-slot based on Markov chain theory. We randomly select a link, denoted as the \(i\)th link, to analyze.

In the multicast blocking model, we denote the current time-slot as \(t\) and the next time-slot as \(t+1\). \(b_i\) is the occupied bandwidth of the \(i\)th link at time-slot \(t\), that is, \(y_i(t) = b_i\); \(a(t)\) is the sum of the occupied bandwidth of all available downlinks at time-slot \(t\), namely, \(a(t) = \sum_{j=1}^{x} y_j(t)\); and \(y_i(t+1)\) refers to the predicted occupied bandwidth of the \(i\)th link at time-slot \(t+1\). In [15], a preference or uniform selection mechanism based on Markov chains is adopted for calculating the link blocking probability at the next time-slot. Based on this mechanism, the probability \(P_i\) that the link receives a new flow at time-slot \(t+1\) can be given by

\[P_i = P_{\min}(f) \cdot \frac{b_i}{a(t)} + \bigl(1 - P_{\min}(f)\bigr) \cdot \frac{1}{f}, \tag{11}\]

where \(1 \le f \le r\).

In addition, we do not consider the case that the bandwidth of an available link decreases; namely, the bandwidth of an available link is sufficient for the multicast bandwidth request. If a multicast bandwidth request selects the \(i\)th link at time-slot \(t+1\), then \(y_i(t+1)\) will increase by \(\pi_i\), where \(1 \le \pi_i \le M_i\) and \(M_i\) is defined as the maximum number of added data flows. Let \(P_{b_i}\) denote the probability that the flow of the \(i\)th link remains unchanged or increases at time-slot \(t+1\); then

\[P_{b_i} = \sum_{\pi_i=0}^{M_i} \Bigl[P_{\min}(f) \cdot \frac{b_i}{a(t)} + \bigl(1 - P_{\min}(f)\bigr) \cdot \frac{1}{f}\Bigr]^{\pi_i} = \sum_{\pi_i=0}^{M_i} P_i^{\pi_i} = \frac{1 - P_i^{M_i+1}}{1 - P_i}, \tag{12}\]

where \(i = 1, 2, \ldots, x\) and \(\pi_i = 0, 1, \ldots, M_i\).

According to (12), we calculate the one-step transition probability of a multicast flow, denoted as \(P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)\), which is a Markov process:

\[P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \frac{1}{P_{b_i}} \cdot \Bigl(P_{\min}(f) \cdot \frac{b_i}{a(t)} + \bigl(1 - P_{\min}(f)\bigr) \cdot \frac{1}{f}\Bigr)^{\pi_i} = \frac{(1 - P_i) \cdot P_i^{\pi_i}}{1 - P_i^{M_i+1}}, \tag{13}\]

where \(i = 1, 2, \ldots, x\).

In fact, \(P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)\) indicates the link blocking probability at time-slot \(t+1\), which is determined by \(P_i\) and \(\pi_i\). The link blocking probability will be small when \(\pi_i\) is small at time-slot \(t+1\); otherwise, the link may be blocked at time-slot \(t+1\). Therefore, the range of \(\pi_i\) is very important to our proposed multicast scheduling algorithm. In this paper, we assume that the multicast bandwidth request \(\omega\) is one data flow unit and that \(\pi_i\) is an integral multiple of \(\omega\).
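Equations (11)–(13) translate directly into code. The sketch below (the numeric values of \(b_i\), \(a(t)\), \(f\), and \(M_i\) are illustrative assumptions) computes \(P_i\), the normalizing term \(P_{b_i}\), and the one-step transition distribution, and checks that the probabilities over \(\pi_i = 0, \ldots, M_i\) sum to one.

```python
def new_flow_prob(p_min_f, b_i, a_t, f):
    """P_i from eq. (11): preference term (b_i / a(t)) plus uniform term (1/f)."""
    return p_min_f * (b_i / a_t) + (1 - p_min_f) * (1 / f)

def transition_probs(p_i, m_i):
    """One-step transition distribution of eq. (13):
    P(y_i(t+1) = b_i + pi | y_i(t) = b_i) for pi = 0..M_i."""
    p_b = (1 - p_i ** (m_i + 1)) / (1 - p_i)  # eq. (12), a finite geometric sum
    return [p_i ** pi / p_b for pi in range(m_i + 1)]

p_i = new_flow_prob(p_min_f=0.2, b_i=40, a_t=200, f=4)  # illustrative values
dist = transition_probs(p_i, m_i=3)
assert abs(sum(dist) - 1.0) < 1e-12  # a proper probability distribution
assert all(dist[k] > dist[k + 1] for k in range(len(dist) - 1))  # larger jumps are rarer
```

Because \(P_i < 1\), the distribution decays geometrically in \(\pi_i\), which is why small increments dominate.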


Input: incoming flow \((i, D, \omega)\); link remaining bandwidth \(\mu\); the number of destination edge switches \(|D|\); \(\pi_i = 3\omega\)
Output: multicast links with the minimum blocking probability
(1) Step 1: identify available core switches
(2) for \(i = 1\) to \(m\) do
(3)   Select an uplink \(u_i\)
(4)   if \(u_i^{\mu} \ge 3\omega\) and \(|T| \le |D|\) then
(5)     Select core switch \(i\) and add it into the set \(T\)
(6)   end if
(7) end for
(8) Step 2: select appropriate core switches
(9) Calculate the blocking probability of the available downlinks at time-slot \(t+1\), \(P_i(t+1)\), by equation (13)
(10) for \(j = 1\) to \(|D|\) do
(11)   Find the core switch(es) in \(T\) that are connected to a destination edge switch in \(D\)
(12)   if multiple core switches are found then
(13)     Select the core switch with the minimum blocking probability and deliver it to the set of appropriate core switches \(T'\)
(14)   else
(15)     Deliver the core switch to the set \(T'\)
(16)   end if
(17)   Remove from \(D\) the destination edge switches that the selected core switch can reach
(18)   Update the set of remaining core switches in \(T\)
(19) end for
(20) Step 3: establish the optimal paths
(21) Connect the links between the source edge switch and the destination edge switches through the appropriate core switches in the set \(T'\)
(22) Send configuration signals to the corresponding devices in the multicast subnetwork

Algorithm 1: Multicast scheduling algorithm with Markov chains (MSaMC).

4. Multicast Scheduling Algorithm with Markov Chains

In this section, we propose a multicast scheduling algorithm with Markov chains (MSaMC) in fat-tree DCNs, which aims to minimize the blocking probability of the available links and improve the traffic efficiency of data flows in the multicast network. We then give a simple example to explain the implementation process of MSaMC.

4.1. Description of the MSaMC. The core of MSaMC is to select the downlinks with the minimum blocking probability at time-slot \(t+1\). Accordingly, the first step of the algorithm is to find the available core switches, denoted as the set \(T\), \(|T| \le f\). We denote the remaining bandwidth of the \(i\)th uplink by \(u_i^{\mu}\). Based on our theoretical analysis in Section 5, the multicast subnetwork may be blocked if this bandwidth is less than \(3\omega\); hence we require \(u_i^{\mu} \ge 3\omega\).

The second step is to choose, in each iteration, the appropriate core switch that is connected to the downlink with the minimum blocking probability at time-slot \(t+1\). At the end of each iteration, we transfer the chosen core switch from the set \(T\) to the set \(T'\). The iteration terminates when the set of destination edge switches \(D\) is empty. Obviously, the core switches in the set \(T'\) are connected to the downlinks with the minimum blocking probability, and the set \(T'\) can satisfy an arbitrary multicast flow request in fat-tree DCNs [5].

Based on the above steps, we obtain a set of appropriate core switches \(T'\). Moreover, each destination edge switch in \(D\) can find one downlink from the set \(T'\) to be connected with the minimum blocking probability at time-slot \(t+1\). The third step is to establish the optimal paths from the source edge switch to the destination edge switches through the appropriate core switches. The state of the multicast subnetwork is updated after the source server sends the configuration signals to the corresponding forwarding devices. The main process of the MSaMC is described in Algorithm 1.

Table 1: Link remaining bandwidth (M).

      C1    C2    C3    C4
E1    90    300   600   800
E2    600   700   800   200
E3    750   400   350   700
E4    500   200   150   500

4.2. An Example of the MSaMC. For the purpose of illustration, we give a scheduling example in a simple fat-tree DCN, as shown in Figure 5. Assume that we have obtained the network state at time-slot \(t\) and made a multicast flow request \((1, (2, 3, 4), 50\,\mathrm{M})\). The link remaining bandwidth \(\mu\) and the link blocking probability \(P\) at the next time-slot are shown in Tables 1 and 2, respectively. The symbol √ denotes an available uplink and × indicates an unavailable link. For clarity, we show only two layers of the network and give the relevant links in each step.

Table 2: The link blocking probability at the next time-slot (%).

      C1    C2    C3    C4
E1    ×     9     5     4
E2    ×     4     3     7
E3    ×     6     7     4
E4    ×     9     10    5

Figure 5: An example of the MSaMC. (a) The links satisfying the multicast flow request \((1, (2, 3, 4), \omega)\). (b) The optimal paths selected by the MSaMC.

As described in Section 4.1, the MSaMC is implemented in three steps. First, we take the remaining bandwidth of each uplink \(u_i^{\mu}\) (requiring \(u_i^{\mu} \ge 3 \times 50\,\mathrm{M}\)) and find the set of available core switches, that is, \(T = \{2, 3, 4\}\). Second, we evaluate the blocking probability of the relevant downlinks at time-slot \(t+1\). In effect, the blocking probability at time-slot \(t+1\) of the downlink from core switch 2 to destination switch 2 is higher than that from core switch 3 to destination switch 2; therefore, we select the latter downlink for the optimal path. Subsequently, core switch 3 is put into the set \(T'\). Similarly, we add core switch 4 to the set \(T'\). Finally, the optimal paths are constructed, and the routing information is sent to source edge switch 1 and core switches 3 and 4.

In Figure 5(a), the links shown from edge switch 1 to the core switches are those whose remaining bandwidth is no less than \(150\,\mathrm{M}\). In this way, we find that the optimal paths for the source and destination edge switch pairs are: source edge switch 1 → core switch 3 → destination edge switch 2; source edge switch 1 → core switch 4 → destination edge switch 3; and source edge switch 1 → core switch 4 → destination edge switch 4, as shown in Figure 5(b).
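The selection above can be reproduced mechanically from Tables 1 and 2. The sketch below is a simplified rendering of Steps 1–2 of Algorithm 1 (a greedy per-destination choice rather than the full iterative bookkeeping); the table data are taken from the example.

```python
# Data from Tables 1 and 2 of the example (edge switch 1 is the source).
uplink_bw = {1: 90, 2: 300, 3: 600, 4: 800}  # remaining uplink bandwidth (M)
blocking = {                                  # downlink blocking prob. (%) at t+1
    2: {2: 4, 3: 3, 4: 7},                    # row E2: destination edge switch 2
    3: {2: 6, 3: 7, 4: 4},                    # row E3: destination edge switch 3
    4: {2: 9, 3: 10, 4: 5},                   # row E4: destination edge switch 4
}
omega = 50                                    # multicast bandwidth request (M)

# Step 1: a core switch is available if its uplink keeps at least 3*omega.
T = sorted(c for c, bw in uplink_bw.items() if bw >= 3 * omega)

# Step 2 (greedy simplification): per destination, pick the available core
# whose downlink has the minimum predicted blocking probability.
choice = {d: min(T, key=lambda c: blocking[d][c]) for d in blocking}
T_prime = sorted(set(choice.values()))

assert T == [2, 3, 4]                          # core switch 1 fails 90 < 150
assert choice == {2: 3, 3: 4, 4: 4}            # paths 1->C3->E2, 1->C4->E3, 1->C4->E4
assert T_prime == [3, 4]
```

The result matches the walkthrough: core switch 1 is excluded by the \(3\omega\) uplink test, and \(T' = \{3, 4\}\) carries the three downlink paths.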

5. Theoretical Analysis

In this section, we analyze the performance of MSaMC. From (9), we derive the blocking probability bound of the multicast subnetwork, as stated in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than 1/3.

Proof. By the first step of Algorithm 1, the remaining bandwidth of an uplink is no less than \(3\omega\); thus the maximum value of the link blocking probability \(p\) is 1/3, attained when the available link remaining bandwidth just satisfies the condition, that is, \(u^{\mu} = 3\omega\).

From (9) and De Morgan's laws [16], we can obtain the probability of the event \(\epsilon'\):

\[P_{\min}(\epsilon') = 1 - \prod_{i=1}^{f} P\bigl(\overline{d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix}}\bigr) = 1 - \prod_{i=1}^{f} \bigl(1 - P(d_{i1} \cap d_{i2} \cap \cdots \cap d_{ix})\bigr) = 1 - \prod_{i=1}^{f} \Bigl(1 - \prod_{k=1}^{x} p_{d_{ik}}\Bigr) = 1 - (1 - p^x)^f. \tag{14}\]

Therefore, based on (10), the subnetwork blocking probability is maximum when the number of uplinks is 1. Thus we obtain

\[\max P_{\min}(f) = p \cdot \bigl(1 - (1 - p^{x_{\min}})^f\bigr) = \frac{1}{3}\Bigl(1 - \Bigl(1 - \frac{1}{3}\Bigr)^f\Bigr). \tag{15}\]

Then \(\max P_{\min}(f) \to 1/3\) as \(f \to \infty\). This completes the proof.

The result of Lemma 1 is not related to the number of ports of the switches. This is because the deduction of Lemma 1 is based on the link blocking probability \(p = \omega/\mu\), and neither the multicast bandwidth \(\omega\) nor the link remaining bandwidth \(\mu\) is affected by the number of switch ports. Therefore, Lemma 1 still holds when the edge switches have more ports. Moreover, the size of the switch radix has no effect on the performance of MSaMC.

At time-slot \(t+1\), the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee efficient transmission of the multicast flow. Based on (7) and Lemma 1, we get \(\max P_i = 1/3\) when the numbers of uplinks and downlinks are both equal to 2. Clearly, this is the simplest multicast transmission model; in a real multicast network, \(P_i \ll 1/3\) holds in general.

In addition, \(P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)\) grows with \(P_i\); namely, the link blocking probability increases as the multicast flow gets larger. Therefore, \(P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)\) is monotonically increasing in \(P_i\).

Theorem 2. If the remaining bandwidth \(\mu\) of an available link is no less than \(3\omega\), the multicast flow can be transferred to \(f\) destination edge switches.

Proof. For each incoming flow, adopting the preferred selection mechanism in selecting the \(i\)th link, when \(\pi_i \ge 1\) we compute the first-order derivative of (13) with respect to \(P_i\), where \(i = 1, 2, \ldots, x\):

\[\frac{\partial}{\partial P_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = -\frac{P_i^{\pi_i}}{1 - P_i^{M_i+1}} + \frac{\pi_i \cdot (1 - P_i) \cdot P_i^{\pi_i}}{P_i \cdot (1 - P_i^{M_i+1})} + \frac{(M_i + 1) \cdot (1 - P_i) \cdot P_i^{\pi_i} \cdot P_i^{M_i+1}}{P_i \cdot (1 - P_i^{M_i+1})^2}. \tag{16}\]

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when \(\pi_i \ge 3\); hence \(\frac{\partial}{\partial P_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) > 0\). Therefore, \(P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)\) is a monotonically increasing function of \(P_i\) when \(\pi_i \ge 3\). The multicast flow request \(\omega\) is defined as one data unit; evidently, \(\pi_i \ge 3\omega\). In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request \(\omega\) at time-slot \(t+1\) if \(\mu \ge 3\omega\). This completes the proof.

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition \(\mu \ge 3\omega\) not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, a downlink may receive data flows from other uplinks at any time-slot, which results in uncertainty of the downlink state at time-slot \(t+1\). Therefore, we take the minimum blocking probability at time-slot \(t+1\) as the selection criterion for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot \(t+1\). We therefore derive the expected occupied bandwidth \(e_i(t, b_i)\) of the \(i\)th downlink connecting to the \(j\)th destination edge switch at time-slot \(t+1\), \(j = 1, 2, \ldots, f\). Given that the data flow in the \(i\)th downlink is \(b_i\), we obtain

\[e_i(t, b_i) = \sum_{\pi_i=0}^{M_i} (b_i + \pi_i) \cdot P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i}, \tag{17}\]

where \(P_{b_i} = (1 - P_i^{M_i+1})/(1 - P_i)\) and \(i = 1, 2, \ldots, x\).

From (17) we conclude the following theorem, which bounds the average increase of data flow on each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink at time-slot \(t+1\) is no more than two units on average.

Proof. We note that \(\sum_{\pi_i=0}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = 1\), which means the flow increment of each link must be one element of the set \(\{0, 1, \ldots, M_i\}\).

Setting \(A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = P_i + \sum_{\pi_i=2}^{M_i} \pi_i \cdot P_i^{\pi_i}\), we get \(P_i \cdot A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i+1} = \sum_{\pi_i=2}^{M_i} (\pi_i - 1) \cdot P_i^{\pi_i} + M_i \cdot P_i^{M_i+1}\). Subtracting the two equations, we obtain \((1 - P_i) \cdot A = P_i + \sum_{\pi_i=2}^{M_i} P_i^{\pi_i} - M_i \cdot P_i^{M_i+1}\). Then we have

\[A = \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)^2}.\]

Substituting this into (17), we obtain

\[e_i(t, b_i) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = b_i + \frac{A}{P_{b_i}} = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)(1 - P_i^{M_i+1})}, \tag{18}\]

where \(P_i < 1/3\). Bounding the latter two terms of (18), \(e_i(t, b_i)\) satisfies

\[e_i(t, b_i) = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)(1 - P_i^{M_i+1})} < b_i + 2, \tag{19}\]

where \(i = 1, 2, \ldots, x\).

Combining (17) and (19), we have \(b_i < e_i(t, b_i) < b_i + 2\); then \(1 < e_i(t, b_i) - b_i + 1 < 3\). Hence the downlink bandwidth will increase by at least one unit of data flow when the downlink is blocked.
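The closed form in (18) and the bound of Theorem 3 can be verified against a direct computation of the expectation in (17). The sketch below compares the two for illustrative values of \(P_i\) and \(M_i\) (both assumptions) and checks that \(b_i < e_i(t, b_i) < b_i + 2\).

```python
def expected_bw(b_i, p_i, m_i):
    """e_i(t, b_i) by direct summation of eq. (17)."""
    p_b = (1 - p_i ** (m_i + 1)) / (1 - p_i)  # eq. (12)
    return b_i + sum(pi * p_i ** pi for pi in range(1, m_i + 1)) / p_b

def expected_bw_closed(b_i, p_i, m_i):
    """e_i(t, b_i) by the closed form of eq. (18)."""
    denom = 1 - p_i ** (m_i + 1)
    return (b_i
            + (p_i - m_i * p_i ** (m_i + 1)) / denom
            + (p_i ** 2 - p_i ** (m_i + 1)) / ((1 - p_i) * denom))

for p_i in (0.05, 0.15, 0.3):          # P_i < 1/3, per Lemma 1
    for m_i in (2, 5, 10):
        e = expected_bw(b_i=40, p_i=p_i, m_i=m_i)
        assert abs(e - expected_bw_closed(40, p_i, m_i)) < 1e-12
        assert 40 < e < 42              # Theorem 3: at most two extra units
```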

When \(M_i < e_i(t, b_i) - b_i + 1\), the number of added data flows would have to be larger than \(M_i\); however, this is not allowed by the definition of \(M_i\). Thus we obtain

\[P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = 0. \tag{20}\]

When \(M_i \ge e_i(t, b_i) - b_i + 1\), we get

\[P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} \frac{1}{P_{b_i}} \cdot P_i^{\pi_i} = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}}. \tag{21}\]

Equation (21) represents the downlink traffic capability at time-slot \(t+1\): when the value of (21) is large, the blocking probability of the downlink is high, and vice versa. To show that the downlink has a low blocking probability at the next time-slot, we have the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot \(t+1\) is less than 0.125.

Figure 6: Downlink blocking probability comparison for different values of \(M_i\) (\(M_i = 2, 4, 8, 16\), plotted against \(P_i\)).

Proof. Based on (21), we take the minimum value of \(M_i\) as 2. Thus we get

\[P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}} < \frac{(1/3)^3 - (1/3)^{3+1}}{1 - (1/3)^{3+1}} < 0.125. \tag{22}\]

This completes the proof.

To show that the MSaMC exhibits a low downlink blocking probability at time-slot \(t+1\) under different values of \(M_i\), we provide the comparison shown in Figure 6.

In Figure 6, \(P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i)\) indicates the downlink blocking probability, and its values are no more than 0.125 for the different \(M_i\) and \(P_i\). Near the zero point, the blocking probability is close to zero unless \(P_i > 0.1\). In a real network, the condition \(P_i > 0.1\) rarely holds. Therefore, the MSaMC has a very low blocking probability.

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes \(O(m)\) time to identify the available core switches. In the second step, MSaMC needs to find the appropriate core switches: it takes \(O(f \cdot f)\) time to calculate the blocking probability of the available downlinks at time-slot \(t+1\) and select the appropriate core switches into the set \(T'\), where \(f \le r - 1\). In the end, it takes \(O(f + f)\) time to construct the optimal paths from the source edge switch to the destination edge switches. Thus the computational complexity of MSaMC is given by

\[O(m + f \cdot f + f + f) \le O\bigl(m + (r-1)^2 + 2(r-1)\bigr) = O(r^2 + m - 1). \tag{23}\]

Note that the complexity of the algorithm is polynomial in the number of core switches \(m\) and the number of edge switches \(r\), and the computational complexity is rather low if the fanout \(f\) is small. Therefore, the algorithm is time-efficient for multicast scheduling.

Table 3: Parameter setting.

Parameter                    Description
Platform                     NS2
Link bandwidth               1 Gbps
RTT delay                    0.1 ms
Switch buffer size           64 KB
TCP receiver buffer size     100 segments
Simulation time              10 s

6. Simulation Results

In this section, we use the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links over different time-slots. We then compare the performance of MSaMC with the SLMR algorithm under unicast traffic [4] and with the BCMS algorithm under multicast traffic [5].

6.1. Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are set in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed over all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the sizes of the multicast flows are equal [17, 18].
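The topology sizes above are consistent with a standard \(k\)-ary fat-tree with \(k = 16\); the sketch below derives them from the usual fat-tree formulas (this derivation is our own cross-check, not part of the simulation setup).

```python
def fat_tree_sizes(k):
    """Standard k-ary fat-tree: k pods, k/2 edge and k/2 aggregation
    switches per pod, (k/2)^2 core switches, and k^3/4 servers."""
    half = k // 2
    return {
        "servers": k ** 3 // 4,
        "edge": k * half,
        "aggregation": k * half,
        "core": half ** 2,
    }

sizes = fat_tree_sizes(16)
assert sizes == {"servers": 1024, "edge": 128, "aggregation": 128, "core": 64}
```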

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplinks and downlinks for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) \(V\) is defined as the average, over the available links, of the sum of the transmission delay differences of adjacent packets in a multicast subnetwork, that is,

\[V = \frac{\sum_{i \in x} \sum_{j \in l} \bigl(T(t)_{ij} - T(t-1)_{ij}\bigr)}{x}, \tag{24}\]

where \(x\) is the number of available links, \(l\) is the number of packets in an available link, and \(T(t)_{ij}\) indicates the transmission delay of a packet at time-slot \(t\).

We take the ADV as a metric for the network state of the multicast subnetwork: the smaller the ADV is, the more stable the network state is, and vice versa.
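Definition 5 reduces to a few lines of code. The sketch below (synthetic delay samples; the nested-list layout per link is an assumption) computes \(V\) from per-packet transmission delays at two consecutive time-slots, as in (24).

```python
def adv(delays_prev, delays_curr):
    """Average delay variance, eq. (24): sum over links and packets of the
    per-packet delay difference between consecutive time-slots, divided by
    the number of available links x."""
    x = len(delays_curr)
    total = sum(curr - prev
                for link_curr, link_prev in zip(delays_curr, delays_prev)
                for curr, prev in zip(link_curr, link_prev))
    return total / x

# Two links, three packets each (delays in ms, synthetic values).
t_prev = [[1.0, 1.1, 1.2], [2.0, 2.0, 2.1]]
t_curr = [[1.0, 1.2, 1.3], [2.0, 2.1, 2.1]]
assert abs(adv(t_prev, t_curr) - 0.15) < 1e-12  # small ADV: a stable network state
```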

Figure 7 shows the average delay variance (ADV) of the links as the number of packets grows. When the link remaining bandwidth \(\mu\) is taken as \(\omega\) or \(2\omega\), the average delay variance has larger jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request \(\omega\) at time-slot \(t+1\). The average delay variance is close to a straight line when the link remaining bandwidth is \(3\omega\), which implies that the network state is very stable. Therefore, the simulation results indicate that the optimal value of the link remaining bandwidth \(\mu\) is \(3\omega\).

Figure 7: Average delay variance (ADV) comparison among links of different remaining bandwidth (\(\mu = \omega\), \(\mu = 2\omega\), and \(\mu = 3\omega\)).

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN provides equal uplink and downlink bandwidth; however, the downlink load is higher than the uplink load under multicast traffic, and therefore the uplink state is more stable.

Figure 8: Average delay variance (ADV) comparison between uplink and downlink.

6.3. Total Network Throughput. In this subsection, we set the length of the time-slot \(t\) to \(\omega/S\) and \(2(\omega/S)\). We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of the time-slot \(t\) is \(2(\omega/S)\). This is because MSaMC can quickly recover from network blocking and thus achieve higher network throughput. In contrast, MSaMC cannot calculate the optimal path in real time when the length of the time-slot \(t\) is \(\omega/S\); in that case the SLMR algorithm provides higher throughput.

Figure 9(b) shows the throughput comparison of MSaMC and the BCMS algorithm under the mixed scheduling pattern. The throughput of the BCMS algorithm declines as the simulation time increases. The multicast transmission of the BCMS algorithm needs a longer time to resolve network blocking; therefore, the throughput decreases sharply if the network blocking cannot be predicted. In contrast, MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, MSaMC obtains higher total network throughput.

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm under unicast traffic, and the BCMS algorithm under mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases, the MSaMC with \(t = 2(\omega/S)\) has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms use more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. In addition, we find that the time-slot of length \(2(\omega/S)\) yields the minimum average delay for MSaMC. This is because a time-slot of length \(2(\omega/S)\) can just ensure that data are transmitted accurately to the destination switches: a shorter time-slot (less than \(2(\omega/S)\)) leads to incomplete data transmission, while a longer time-slot (more than \(2(\omega/S)\)) causes incorrect prediction of the traffic blocking status.

7. Conclusions

In this paper we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, in fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to efficiently improve network performance. We show that MSaMC can guarantee lower link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of MSaMC is determined by the number of core switches m and the number of destination edge switches f. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm called the SLMR algorithm and a well-known adaptive multicast scheduling algorithm called

Complexity 11

[Figure 9 appears here. Both panels plot network throughput (Gbps) against simulation time (s): (a) SLMR versus MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S); (b) BCMS versus MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S).]

Figure 9: Network throughput comparison.

[Figure 10 appears here. Both panels plot average delay (s) against simulation time (s): (a) SLMR versus MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S); (b) BCMS versus MSaMC (t = 2(ω/S)) and MSaMC (t = ω/S).]

Figure 10: Average delay comparison.

the BCMS algorithm. Experimental results show that MSaMC can achieve higher network throughput and lower average delay.

Notations

ω: Multicast bandwidth request of a data flow
b_i: The occupied bandwidth of the i-th link
μ: The remaining bandwidth of a link
a: The sum of occupied bandwidth
y: The value of the link weight
S: Link bandwidth
M: The maximum number of increased data flows
π: The number of increased data flows
T: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013–3028, 2016.

[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29–43, 2003.

[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536–541, USA, April 2014.

[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses - Gradworks, 2014.

[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761–765, 2011.

[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1–7, Shenzhen, China, August 2017.

[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711–1714, 2012.

[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681–699, 2006.

[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications, New York: John Wiley & Sons, 1979, 446 pp.," Behavioural Science, vol. 26, no. 4, pp. 397–398, 1981.

[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173–178, Da Nang, Vietnam, January 2017.

[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56–60, China, April 2011.

[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class Bloom filter," in Proceedings of the 2011 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266–275, Canada, October 2011.

[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898–904, 2012.

[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Review of Modern Physics, vol. 26, no. 1, 2002.

[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442–454, 1995.

[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with the 2009 SIGCOMM Conference, pp. 65–72, Spain, August 2009.

[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6–16, 2003.



Input: incoming flow (i, D, ω); link remaining bandwidth μ; the number of destination edge switches |D|; π_i = 3ω
Output: multicast links with the minimum blocking probability
(1) Step 1: identify available core switches
(2) for i = 1 to m do
(3)   Select an uplink u_i
(4)   if u_i^μ ≥ 3ω and |T| ≤ |D| then
(5)     Select the core switch i and add it into the set T
(6)   end if
(7) end for
(8) Step 2: select appropriate core switches
(9) Calculate the blocking probability of the available downlinks at time-slot t + 1, P_i(t + 1), by equation (13)
(10) for j = 1 to |D| do
(11)   Find the core switch(es) in T that are connected to a destination edge switch in D
(12)   if there are multiple core switches found then
(13)     Select the core switch with the minimum blocking probability and deliver it to the set of appropriate core switches T′
(14)   else
(15)     Deliver the core switch to the set T′
(16)   end if
(17)   Remove from D the destination edge switches that the selected core switch can reach
(18)   Update the set of remaining core switches in T
(19) end for
(20) Step 3: establish the optimal paths
(21) Connect the links between the source edge switch and the destination edge switches through the appropriate core switches in the set T′
(22) Send configuration signals to the corresponding devices in the multicast subnetwork

Algorithm 1: Multicast scheduling algorithm with Markov chains (MSaMC).

4. Multicast Scheduling Algorithm with Markov Chains

In this section we propose a multicast scheduling algorithm with Markov chains (MSaMC) in fat-tree DCNs, which aims to minimize the blocking probability of available links and improve the traffic efficiency of data flows in the multicast network. We then give a simple example to explain the implementation process of MSaMC.

4.1. Description of the MSaMC. The core of MSaMC is to select the downlinks with the minimum blocking probability at time-slot t + 1. Accordingly, the first step of the algorithm is to find the available core switches, denoted as the set T, |T| ≤ f. We take the remaining bandwidth of the i-th uplink as u_i^μ. Based on our theoretical analysis in Section 5, the multicast subnetwork may be blocked if u_i^μ is less than 3ω; we therefore require u_i^μ ≥ 3ω.

The second step is to choose, in each iteration, the appropriate core switch that is connected to the downlink with the minimum blocking probability at time-slot t + 1. At the end of each iteration, we transfer the chosen core switch from the set T to the set T′. The iteration terminates when the set of destination edge switches D is empty. Obviously, the core switches in the set T′ are connected to the downlinks with the minimum blocking probability, and the set T′ can satisfy an arbitrary multicast flow request in fat-tree DCNs [5].

Based on the above steps, we obtain a set of appropriate core switches T′. Moreover, each destination edge switch in D can find one downlink from the set T′ to be connected with the minimal blocking probability at

Table 1: Link remaining bandwidth (M).

     C1   C2   C3   C4
E1   90   300  600  800
E2   600  700  800  200
E3   750  400  350  700
E4   500  200  150  500

time-slot t + 1. The third step is to establish the optimal paths from the source edge switch to the destination edge switches through the appropriate core switches. The state of the multicast subnetwork is updated after the source server sends the configuration signals to the corresponding forwarding devices. The main process of the MSaMC is described in Algorithm 1.

4.2. An Example of the MSaMC. For the purpose of illustration, in the following we give a scheduling example in a simple fat-tree DCN, as shown in Figure 5. Assume that we have obtained the network state at time-slot t and made a multicast flow request (1, (2, 3, 4), 50M). The link remaining bandwidth μ and the link blocking probability P at the next time-slot are shown in Tables 1 and 2, respectively. The symbol √ denotes an available uplink, and × indicates an unavailable link. For clarity, we select only two layers of the network and give the relevant links in each step.

As described in Section 4.1, the MSaMC is implemented in three steps. Firstly, we take the remaining bandwidth of the uplink as u^μ (u_i^μ ≥ 3 × 50M) and find the set of available core switches, that is, T = {2, 3, 4}. Secondly, we evaluate the blocking probability of the relevant downlinks at time-slot t + 1. In


Table 2: The link blocking probability at the next time-slot (%).

     C1  C2  C3  C4
E1   ×   9   5   4
E2   ×   4   3   7
E3   ×   6   7   4
E4   ×   9   10  5

[Figure 5 appears here. Each panel shows core switches 1–4 above edge switches 1–4: (a) the links satisfying the multicast flow request (1, (2, 3, 4), ω); (b) the optimal paths selected by the MSaMC.]

Figure 5: An example of the MSaMC.

effect, the blocking probability at time-slot t + 1 of the downlink from core switch 2 to destination switch 2 is higher than that of the downlink from core switch 3 to destination switch 2; therefore we select the latter downlink as the optimal path. Subsequently, core switch 3 is put into the set T′. Similarly, we get core switch 4 for the set T′. Finally, the optimal paths are constructed and the routing information is sent to the source edge switch 1 and the core switches (3, 4).

In Figure 5(a), the links shown from edge switch 1 to the core switches are those whose remaining bandwidth is no less than 150M. In the above way, we find the optimal path for each pair of source edge switch and destination edge switch: source edge switch 1 → core switch 3 → destination edge switch 2; source edge switch 1 → core switch 4 → destination edge switch 3; and source edge switch 1 → core switch 4 → destination edge switch 4, as shown in Figure 5(b).
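To make the example concrete, the selection logic of steps 1 and 2 can be sketched in a few lines of Python. This is our illustrative reconstruction, not the authors' implementation: the |T| ≤ |D| cap and step 3 are omitted, the data come from Tables 1 and 2, and all variable names are ours.

```python
# Illustrative sketch of MSaMC steps 1-2 on the example of Tables 1 and 2.
def msamc_select(w, uplink_remaining, block_prob, dests):
    # Step 1: available core switches, i.e. uplinks with remaining
    # bandwidth u_i >= 3 * omega.
    T = [c for c, mu in uplink_remaining.items() if mu >= 3 * w]
    # Step 2: for every destination edge switch, take the available core
    # switch whose downlink has the minimum predicted blocking probability.
    return {e: min(T, key=lambda c: block_prob[c][e]) for e in dests}

# Remaining bandwidth (M) of edge switch 1's uplinks (Table 1, row E1).
uplinks = {1: 90, 2: 300, 3: 600, 4: 800}
# Predicted downlink blocking probability in %, keyed core -> edge
# (Table 2, columns C2-C4; core 1 is unavailable and therefore omitted).
prob = {2: {2: 4, 3: 6, 4: 9}, 3: {2: 3, 3: 7, 4: 10}, 4: {2: 7, 3: 4, 4: 5}}
paths = msamc_select(50, uplinks, prob, dests=[2, 3, 4])
# Edge switch 2 is reached via core 3; edge switches 3 and 4 via core 4.
```

Running the sketch reproduces the paths of Figure 5(b): core switch 1 is excluded in step 1 (90M < 150M), and the per-destination minima pick cores 3, 4, and 4.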

5. Theoretical Analysis

In this section we analyze the performance of MSaMC. From (9), we derive the blocking probability bound of the multicast subnetwork, as shown in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than 1/3.

Proof. By the first step of Algorithm 1, we take the remaining bandwidth of an uplink to be no less than 3ω, and thus the maximum value of the link blocking probability p is 1/3; in other words, the worst case is that the available link remaining bandwidth just satisfies this condition, that is, u^μ = 3ω.

From (9) and De Morgan's laws [16], we can obtain the probability of the event ε′:

P_min(ε′) = 1 − ∏_{i=1}^{f} P(d̄_{i1} ∪ d̄_{i2} ∪ ⋯ ∪ d̄_{ix})
          = 1 − ∏_{i=1}^{f} (1 − P(d_{i1} ∩ d_{i2} ∩ ⋯ ∩ d_{ix}))
          = 1 − ∏_{i=1}^{f} (1 − ∏_{k=1}^{x} p_{d_{ik}})
          = 1 − (1 − p^x)^f.   (14)

Therefore, based on (10), the subnetwork blocking probability is maximum when the number of uplinks is 1. Thus we can obtain

max P_min(f) = p · (1 − (1 − p^{x_min})^f) = (1/3) · (1 − (1 − 1/3)^f).   (15)

Then we have max P_min(f) → 1/3 as f → ∞. This completes the proof.

The result of Lemma 1 is not related to the number of ports of the switches. This is because the deduction of Lemma 1 is based on the link blocking probability p, p = ω/μ. However, the multicast bandwidth ω and the link remaining bandwidth μ are not affected by the number of ports of the switches. Therefore, Lemma 1 still holds when the edge switches have more ports. Moreover, the size of the switch radix has no effect on the performance of MSaMC.

At time-slot t + 1, the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee the efficient transmission of the multicast flow. Based on (7) and Lemma 1, we can get max P_i = 1/3 when the numbers of uplinks and downlinks are both equal to 2. Clearly, this condition is the simplest multicast transmission model; in a real multicast network, P_i ≪ 1/3 is the general condition.

In addition, P_i is proportional to P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i); namely, the link blocking probability will increase as the multicast flow gets larger. Therefore, P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) is monotonically increasing in p_i.

Theorem 2. If the remaining bandwidth of an available link μ is no less than 3ω, the multicast flow can be transferred to f destination edge switches.

Proof. For each incoming flow, adopting the preferred selection mechanism in selecting the i-th link, when π_i ≥ 1,


we compute the first-order derivative of (13) with respect to p_i, where i = 1, 2, …, x:

∂P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)/∂p_i
  = − P_i^{π_i}/(1 − P_i^{M_i+1})
  + π_i · (1 − P_i) · P_i^{π_i}/(p_i · (1 − P_i^{M_i+1}))
  + (M_i + 1) · (1 − P_i) · P_i^{π_i} · P_i^{M_i+1}/(p_i · (1 − P_i^{M_i+1})²).   (16)

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when π_i ≥ 3; hence we obtain ∂P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)/∂p_i > 0. Therefore, P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) is a monotonically increasing function of p_i when π_i ≥ 3. The multicast flow request ω is defined as one data unit; evidently, π_i ≥ 3ω. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request ω at time-slot t + 1 if μ ≥ 3ω. This completes the proof.
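The monotonicity used in this proof can be checked numerically. The sketch below (ours) assumes the truncated-geometric transition law implied by (17), P(π_i | P_i) = (1 − P_i) · P_i^{π_i}/(1 − P_i^{M_i+1}), and verifies that the probability of a fixed increment π_i ≥ 3 grows with P_i:

```python
# Numeric monotonicity check: for a fixed increment pi >= 3, the
# truncated-geometric transition probability grows with P_i.
def transition_prob(P, pi, M):
    return (1 - P) * P**pi / (1 - P**(M + 1))

grid = [0.01 * k for k in range(1, 34)]            # P_i in (0, 1/3]
probs = [transition_prob(P, pi=3, M=8) for P in grid]
# The sequence is strictly increasing over the whole range.
```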

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition μ ≥ 3ω not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, the downlink may receive data flows coming from other uplinks at any time-slot, which results in the uncertainty of the downlink state at time-slot t + 1. Therefore, we take the minimum blocking probability at time-slot t + 1 as the selection target for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot t + 1. We therefore deduce the expectation of the data flow on the i-th downlink connecting to the j-th destination edge switch at time-slot t + 1, denoted by e_i(t, b_i), j = 1, 2, …, f. Given that the data flow on the i-th downlink is b_i, we can obtain

e_i(t, b_i) = Σ_{π_i=0}^{M_i} (b_i + π_i) · P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)
            = b_i + (1/P_{b_i}) · Σ_{π_i=1}^{M_i} π_i · P_i^{π_i},   (17)

where P_{b_i} = (1 − P_i^{M_i+1})/(1 − P_i), i = 1, 2, …, x.

From (17), we conclude the following theorem, which explains the average increase rate of the data flow on each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink is no more than two units on average at time-slot t + 1.

Proof. We consider Σ_{π_i=0}^{M_i} P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) = 1, which means that the flow increment of each link must be one element of the set {0, 1, …, M_i}.

Setting A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i} = P_i + Σ_{π_i=2}^{M_i} π_i · P_i^{π_i}, we can get P_i · A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i+1} = Σ_{π_i=2}^{M_i} (π_i − 1) · P_i^{π_i} + M_i · P_i^{M_i+1}. Through the subtraction of the above two equations, we can obtain (1 − P_i) · A = P_i + Σ_{π_i=2}^{M_i} P_i^{π_i} − M_i · P_i^{M_i+1}. Then we have A = (P_i − M_i · P_i^{M_i+1})/(1 − P_i) + (P_i² − P_i^{M_i+1})/(1 − P_i)². Substituting this into (17), we can obtain

e_i(t, b_i) = b_i + (1/P_{b_i}) · Σ_{π_i=1}^{M_i} π_i · P_i^{π_i} = b_i + A/P_{b_i}
            = b_i + (P_i − M_i · P_i^{M_i+1})/(1 − P_i^{M_i+1})
            + (P_i² − P_i^{M_i+1})/((1 − P_i) · (1 − P_i^{M_i+1})),   (18)

where P_i < 1/3. By relaxing the latter two terms of (18), e_i(t, b_i) can be bounded as

e_i(t, b_i) = b_i + (P_i − M_i · P_i^{M_i+1})/(1 − P_i^{M_i+1})
            + (P_i² − P_i^{M_i+1})/((1 − P_i) · (1 − P_i^{M_i+1})) < b_i + 2,   (19)

where i = 1, 2, …, x.

By merging (17) and (19), we have b_i < e_i(t, b_i) < b_i + 2, and then 1 < e_i(t, b_i) − b_i + 1 < 3. Hence, the downlink bandwidth will increase by at least one unit of data flow when the downlink is blocked.
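The expectation (17), its closed form (18), and the bound b_i < e_i(t, b_i) < b_i + 2 can be cross-checked numerically; the following sketch (ours, under the same truncated-geometric law) does both:

```python
# Cross-check of (17)/(18): expected downlink flow under the truncated
# geometric law, computed by direct summation and by the closed form.
def e_direct(b, P, M):
    Pb = (1 - P**(M + 1)) / (1 - P)          # normalisation P_{b_i}
    return sum((b + k) * P**k / Pb for k in range(M + 1))

def e_closed(b, P, M):
    return (b
            + (P - M * P**(M + 1)) / (1 - P**(M + 1))
            + (P**2 - P**(M + 1)) / ((1 - P) * (1 - P**(M + 1))))

b, P, M = 5, 1/3, 8
# Both forms agree, and the expected increment lies strictly in (0, 2).
```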

When M_i < e_i(t, b_i) − b_i + 1, the number of increased data flows would be larger than M_i; however, this is not allowed by the definition of M_i. Thus we can obtain

P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i) = 0.   (20)

When M_i ≥ e_i(t, b_i) − b_i + 1, we can get

P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i)
  = Σ_{π_i = e_i(t,b_i) − b_i + 1}^{M_i} P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)
  = Σ_{π_i = e_i(t,b_i) − b_i + 1}^{M_i} (1/P_{b_i}) · P_i^{π_i}
  = (P_i^{e_i(t,b_i) − b_i + 1} − P_i^{M_i+1})/(1 − P_i^{M_i+1}).   (21)

Equation (21) represents the downlink traffic capability at time-slot t + 1. When the value of (21) is very large, the blocking probability of the downlink is higher, and vice versa. To clarify the fact that the downlink has a lower blocking probability at the next time-slot, we give the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot t + 1 is less than 0.125.

[Figure 6 appears here. It plots the downlink blocking probability (%) against P_i (%) for M_i = 2, 4, 8, and 16, with the zero point marked.]

Figure 6: Downlink blocking probability comparison for different M_i.

Proof. Based on (21), we take the minimum value of M_i as 2. Thus we get

P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i)
  = (P_i^{e_i(t,b_i) − b_i + 1} − P_i^{M_i+1})/(1 − P_i^{M_i+1})
  < ((1/3)³ − (1/3)^{3+1})/(1 − (1/3)^{3+1}) = 0.125.   (22)

This completes the proof
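Theorem 4 can also be probed numerically. The sketch below (ours) evaluates the tail probability (21) under the truncated-geometric law, taking the expected increment from (17) and rounding the summation threshold e_i(t, b_i) − b_i + 1 up to the next integer increment; over a grid of P_i ≤ 1/3 and the M_i values of Figure 6, the tail stays below 0.125:

```python
import math

# Tail probability (21): chance that the downlink flow at t+1 exceeds its
# expectation e_i(t, b_i), under the truncated geometric increment law.
def tail_above_expectation(P, M):
    Pb = (1 - P**(M + 1)) / (1 - P)                  # normalisation
    mean_inc = sum(k * P**k for k in range(M + 1)) / Pb
    start = math.ceil(mean_inc + 1)                  # first pi beyond e-b+1
    return sum(P**k for k in range(start, M + 1)) / Pb

tails = [tail_above_expectation(P / 100, M)
         for P in range(1, 34) for M in (2, 4, 8, 16)]
# Every tail probability stays below the 0.125 bound of Theorem 4.
```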

In order to show that the MSaMC yields a lower downlink blocking probability at time-slot t + 1 under different values of M_i, we provide the comparison shown in Figure 6.

In Figure 6, P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i) indicates the downlink blocking probability; its values are no more than 0.125 for the different M_i and P_i. At the zero point, the blocking probability is close to zero unless P_i > 0.1. In a real network, the condition P_i > 0.1 rarely occurs. Therefore, the MSaMC has a very low blocking probability.

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes O(m) time to identify the available core switches. In the second step, the MSaMC needs to find the appropriate core switches: we need O(f · f) time to calculate the blocking probability of the available downlinks at time-slot t + 1 and to select the appropriate core switches into the set T′, where f ≤ r − 1. In the end, we take O(f + f) time to construct the optimal paths from the source edge switch to the destination edge switches. Thus, the computational complexity of MSaMC is given by

O(m + f · f + f + f) ≤ O(m + (r − 1)² + 2(r − 1)) = O(r² + m − 1).   (23)

Note that the complexity of the algorithm is polynomial in the number of core switches m and the number of edge switches r, which means that the computational complexity is rather low if the fanout f is very small. Therefore, the algorithm is time-efficient in multicast scheduling.

Table 3: Parameter setting.

Parameter                  Description
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s

6. Simulation Results

In this section, we utilize the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links with different time-slots. Afterwards, we compare the performance of MSaMC against the SLMR algorithm with unicast traffic [4] and against the BCMS algorithm with multicast traffic [5].

6.1. Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are set in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed over all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the size of each multicast flow is equal [17, 18].

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) V is defined as the average, over the available links of a multicast subnetwork, of the sum of the transmission delay differences of the packets in two adjacent time-slots; that is,

V = (Σ_{i∈x} Σ_{j∈l} (T(t)_{ij} − T(t − 1)_{ij}))/x,   (24)

where x is the number of available links, l is the number of packets on an available link, and T(t) indicates the transmission delay of a packet at time-slot t.

We take the ADV as a metric for the network state of the multicast subnetwork: the smaller the ADV is, the more stable the network state is, and vice versa.
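Definition 5 translates directly into code. The sketch below (ours, with made-up delay samples in place of simulator traces) computes V exactly as in (24):

```python
# Average delay variance (ADV), per (24): sum the per-packet delay
# differences between two adjacent time-slots over every available link,
# then average over the number of links.
def adv(delays_now, delays_prev):
    """delays_now[i][j] / delays_prev[i][j]: delay of packet j on link i
    at time-slots t and t-1."""
    x = len(delays_now)
    total = sum(now_j - prev_j
                for now_link, prev_link in zip(delays_now, delays_prev)
                for now_j, prev_j in zip(now_link, prev_link))
    return total / x

# Two links, three packets each; delays are illustrative values.
prev = [[1.0, 1.1, 1.2], [2.0, 2.0, 2.1]]
now = [[1.0, 1.2, 1.1], [2.2, 2.0, 2.3]]
# adv(now, prev) averages the per-link delay drift, here 0.2.
```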

Figure 7 shows the average delay variance (ADV) of the links as the number of packets grows. When the link remaining bandwidth μ is taken as ω or 2ω, the average delay variance

[Figure 7 appears here. It plots the ADV (%) against the number of packets (0–3000) for links with remaining bandwidth μ = ω, μ = 2ω, and μ = 3ω.]

Figure 7: Average delay variance (ADV) comparison among links with different remaining bandwidth.

[Figure 8 appears here. It plots the ADV (%) against the number of packets (0–3000) for the uplink and the downlink.]

Figure 8: Average delay variance (ADV) comparison between uplink and downlink.

has bigger jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request ω at time-slot t + 1. The average delay variance is close to a straight line when the link remaining bandwidth is 3ω, which implies that the network state is very stable. Therefore, the simulation result manifests that the optimal value of the link remaining bandwidth μ is 3ω.

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a bipartition network; that is, the bandwidth of the uplink and the downlink is equal. However, the downlink load is higher than the uplink load under the multicast traffic; therefore the uplink state is more stable.


[4] O Fatmi and D Pan ldquoDistributed multipath routing for datacenter networks based on stochastic traffic modelingrdquo in Pro-ceedings of the 11th IEEE International Conference on Network-ing Sensing and Control ICNSC 2014 pp 536ndash541 USA April2014

[5] Z Guo On The Design of High Performance Data CenterNetworks Dissertations andTheses - Gradworks 2014

[6] H Yu S Ruepp and M S Berger ldquoOut-of-sequence preven-tion for multicast input-queuing space-memory-memory clos-networkrdquo IEEE Communications Letters vol 15 no 7 pp 761ndash765 2011

[7] G Li S Guo G Liu and Y Yang ldquoMulticast Scheduling withMarkov Chains in Fat-Tree Data Center Networksrdquo in Pro-ceedings of the 2017 International Conference on NetworkingArchitecture and Storage (NAS) pp 1ndash7 Shenzhen ChinaAugust 2017

[8] X Geng A Luo Z Sun and Y Cheng ldquoMarkov chainsbased dynamic bandwidth allocation in diffserv networkrdquo IEEECommunications Letters vol 16 no 10 pp 1711ndash1714 2012

[9] J Sun S Boyd L Xiao and P Diaconis ldquoThe fastest mixingMarkov process on a graph and a connection to a maximumvariance unfolding problemrdquo SIAM Review vol 48 no 4 pp681ndash699 2006

[10] T G Hallam ldquoDavid G Luenberger Introduction to DynamicSystems Theory Models and Applications New York JohnWiley amp Sons 1979 446 pprdquo Behavioural Science vol 26 no4 pp 397-398 1981

[11] K Hirata and M Yamamoto ldquoData center traffic engineeringusing Markov approximationrdquo in Proceedings of the 2017 Inter-national Conference on Information Networking (ICOIN) pp173ndash178 Da Nang Vietnam January 2017

[12] D Li M Xu M-C Zhao C Guo Y Zhang and M-Y WuldquoRDCM Reliable data center multicastrdquo in Proceedings of theIEEE INFOCOM 2011 pp 56ndash60 China April 2011

[13] D Li H Cui Y Hu Y Xia and X Wang ldquoScalable data centermulticast using multi-class bloom filterrdquo in Proceedings of the

2011 19th IEEE International Conference on Network ProtocolsICNP 2011 pp 266ndash275 Canada October 2011

[14] P J Cameron ldquoNotes on counting An introduction to enumer-ative combinatoricsrdquo Urology vol 65 no 5 pp 898ndash904 2012

[15] R Pastor-Satorras M Rubi and A Diaz-Guilera ldquoStatisticalmechanics of complex networksrdquoReview ofModern Physics vol26 no 1 2002

[16] A P Pynko ldquoCharacterizing Belnaprsquos logic via De MorganrsquoslawsrdquoMathematical Logic Quarterly vol 41 no 4 pp 442ndash4541995

[17] T Benson A Anand A Akella andM Zhang ldquoUnderstandingdata center traffic characteristicsrdquo in Proceedings of the 1stWorkshop Research on Enterprise NetworkingWREN 2009 Co-located with the 2009 SIGCOMM Conference SIGCOMMrsquo09pp 65ndash72 Spain August 2009

[18] C Fraleigh S Moon B Lyles et al ldquoPacket-level trafficmeasurements from the Sprint IP backbonerdquo IEEENetwork vol17 no 6 pp 6ndash16 2003

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Applied MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of

Hindawiwwwhindawicom Volume 2018

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawiwwwhindawicom Volume 2018

OptimizationJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

Hindawiwwwhindawicom Volume 2018

Operations ResearchAdvances in

Journal of

Hindawiwwwhindawicom Volume 2018

Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018

Hindawiwwwhindawicom Volume 2018

Decision SciencesAdvances in

Hindawiwwwhindawicom Volume 2018

AnalysisInternational Journal of

Hindawiwwwhindawicom Volume 2018

Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

Complexity 7

Table 2: The link blocking probability at the next time-slot (%).

      C1   C2   C3   C4
E1    ×    9    5    4
E2    ×    4    3    7
E3    ×    6    7    4
E4    ×    9    10   5

[Figure 5: An example of the MSaMC. (a) The links satisfying the multicast flow request (1, (2, 3, 4), ω). (b) The optimal paths selected by the MSaMC.]

effect, the blocking probability of the downlink at time-slot t + 1 from core switch 2 to destination edge switch 2 is higher than that from core switch 3 to destination edge switch 2; therefore, we select the latter downlink as the optimal path. Subsequently, core switch 3 is put into the set T′. Similarly, we get core switch 4 for the set T′. Finally, the optimal path is constructed, and the routing information is sent to the source edge switch 1 and the core switches (3, 4).

In Figure 5(a), the remaining bandwidth of the link from edge switch 1 to core switch 1 is no less than 150M. In the above way, we find that the optimal paths for the pairs of source and destination edge switches are source edge switch 1 → core switch 3 → destination edge switch 2, source edge switch 1 → core switch 4 → destination edge switch 3, and source edge switch 1 → core switch 4 → destination edge switch 4, as shown in Figure 5(b).
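For concreteness, the downlink selection step of this example can be sketched in a few lines. The code below is an illustrative reconstruction, not the authors' implementation: for every destination edge switch, it picks the available core switch whose predicted downlink blocking probability at time-slot t + 1 (from Table 2) is minimal; `None` marks the unavailable core switch 1.

```python
# Illustrative sketch of the downlink selection in the Figure 5 example
# (not the authors' implementation). blocking[edge][core] is the predicted
# blocking probability in percent at t + 1; None means the core switch is
# unavailable because its uplink failed the remaining-bandwidth test.
blocking = {
    1: {1: None, 2: 9, 3: 5, 4: 4},
    2: {1: None, 2: 4, 3: 3, 4: 7},
    3: {1: None, 2: 6, 3: 7, 4: 4},
    4: {1: None, 2: 9, 3: 10, 4: 5},
}

def select_cores(destinations):
    """Return {destination edge switch: chosen core switch}, minimizing
    the predicted downlink blocking probability per destination."""
    choice = {}
    for dst in destinations:
        avail = {c: p for c, p in blocking[dst].items() if p is not None}
        choice[dst] = min(avail, key=avail.get)
    return choice

print(select_cores([2, 3, 4]))  # {2: 3, 3: 4, 4: 4}
```

This reproduces the selection described in the text: core switch 3 for destination edge switch 2, and core switch 4 for destination edge switches 3 and 4.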

5 Theoretical Analysis

In this section, we analyze the performance of MSaMC. By (9), we derive the blocking probability bound of a multicast subnetwork, as shown in Lemma 1.

Lemma 1. In a multicast subnetwork, the maximum subnetwork blocking probability is less than 1/3.

Proof. We take the remaining bandwidth of the uplink to be no less than 3ω by the first step of Algorithm 1; thus the maximum value of the link blocking probability p is 1/3. In other words, the available link remaining bandwidth just satisfies the above condition, that is, μ = 3ω.

From (9) and De Morgan's laws [16], we can obtain the probability of the event ε′:

P_min(ε′) = 1 − ∏_{i=1}^{f} P(¬(d_{i1} ∩ d_{i2} ∩ ⋯ ∩ d_{ix}))
          = 1 − ∏_{i=1}^{f} (1 − P(d_{i1} ∩ d_{i2} ∩ ⋯ ∩ d_{ix}))
          = 1 − ∏_{i=1}^{f} (1 − ∏_{k=1}^{x} p_{d_{ik}})
          = 1 − (1 − p^x)^f.   (14)

Therefore, based on (10), the subnetwork blocking probability is maximum when the number of uplinks is 1. Thus we can obtain

max P_min(f) = p · (1 − (1 − p^{x_min})^f) = (1/3) · (1 − (1 − 1/3)^f).   (15)

Then we have max P_min(f) → 1/3 as f → ∞. This completes the proof.

The result of Lemma 1 is not related to the number of ports of the switches. This is because the deduction of Lemma 1 is based on the link blocking probability p, p = ω/μ; however, the multicast bandwidth ω and the link remaining bandwidth μ are not affected by the number of switch ports. Therefore, Lemma 1 still holds when the edge switches have more ports. Moreover, the size of the switch radix has no effect on the performance of MSaMC.

At time-slot t + 1, the data flow of an available link will increase under the preference or uniform selection mechanism. In addition, the blocking probability of an available link should have an upper bound (maximum value) to guarantee the efficient transmission of the multicast flow. Based on (7) and Lemma 1, we can get max P_i = 1/3 when the numbers of uplinks and downlinks are both equal to 2. Clearly, this condition is the simplest multicast transmission model; in a real multicast network, P_i ≪ 1/3 is the general condition.

In addition, P_i is proportional to P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i); namely, the link blocking probability will increase as the multicast flow gets larger. Therefore, P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) is monotonically increasing in p_i.

Theorem 2. If the remaining bandwidth μ of an available link is no less than 3ω, the multicast flow can be transferred to f destination edge switches.

Proof. For each incoming flow, adopting the preferred selection mechanism in selecting the i-th link, when π_i ≥ 1, we compute the first-order derivative of (13) with respect to p_i, where i = 1, 2, …, x:

∂/∂p_i P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)
= − P_i^{π_i}/(1 − P_i^{M_i+1})
+ π_i · (1 − P_i) · P_i^{π_i}/(p_i · (1 − P_i^{M_i+1}))
+ (M_i + 1) · (1 − P_i) · P_i^{π_i} · P_i^{M_i+1}/(p_i · (1 − P_i^{M_i+1})^2).   (16)

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when π_i ≥ 3; hence we can obtain ∂/∂p_i P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) > 0. Therefore, P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) is a monotonically increasing function of p_i when π_i ≥ 3. The multicast flow request ω is defined as one data unit; evidently, π_i ≥ 3ω. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request ω at time-slot t + 1 if μ ≥ 3ω. This completes the proof.
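The monotonicity claim can also be checked numerically. Using the truncated-geometric form of the transition probability implied by (17), P(π_i) = (1 − P_i) · P_i^{π_i}/(1 − P_i^{M_i+1}), the sketch below (our own check, with illustrative values M_i = 8 and π_i = 3) verifies that the probability increases with P_i on (0, 1/3]:

```python
# Numerical check of Theorem 2's monotonicity: for pi >= 3, the transition
# probability (1 - P) * P**pi / (1 - P**(M + 1)) -- the truncated-geometric
# form implied by (17) -- strictly increases with P on (0, 1/3].

def trans_prob(P, pi, M):
    return (1 - P) * P**pi / (1 - P**(M + 1))

M, pi = 8, 3                                   # illustrative values
grid = [0.01 * k for k in range(1, 34)]        # P in (0, 0.33]
vals = [trans_prob(P, pi, M) for P in grid]
assert all(a < b for a, b in zip(vals, vals[1:]))  # strictly increasing
print(round(vals[0], 8), round(vals[-1], 8))
```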

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition μ ≥ 3ω not only ensures sufficient remaining bandwidth to satisfy the multicast flow request but also avoids the complex calculation of the uplink blocking probability. However, the downlink may carry data flows coming from other uplinks at any time-slot, which results in the uncertainty of the downlink state at time-slot t + 1. Therefore, we take the minimum blocking probability at time-slot t + 1 as the selection target for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot t + 1. We therefore derive the expected occupied bandwidth of the i-th downlink connecting to the j-th destination edge switch at time-slot t + 1, denoted by e_i(t, b_i), j = 1, 2, …, f. Given that the data flow in the i-th downlink is b_i, we can obtain

e_i(t, b_i) = Σ_{π_i=0}^{M_i} (b_i + π_i) · P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)
           = b_i + (1/P_{b_i}) · Σ_{π_i=1}^{M_i} π_i · P_i^{π_i},   (17)

where P_{b_i} = (1 − P_i^{M_i+1})/(1 − P_i), i = 1, 2, …, x.

By (17), we conclude the following theorem, which explains the average increase rate of the data flow at each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink is no more than two units on average at time-slot t + 1.

Proof. We consider Σ_{π_i=0}^{M_i} P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i) = 1, which means the flow increment of each link must be one element of the set {0, 1, …, M_i}.

Setting A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i} = P_i + Σ_{π_i=2}^{M_i} π_i · P_i^{π_i}, we can get P_i · A = Σ_{π_i=1}^{M_i} π_i · P_i^{π_i+1} = Σ_{π_i=2}^{M_i} (π_i − 1) · P_i^{π_i} + M_i · P_i^{M_i+1}.

Through the subtraction of the above two equations, we can obtain (1 − P_i) · A = P_i + Σ_{π_i=2}^{M_i} P_i^{π_i} − M_i · P_i^{M_i+1}. Then we have A = (P_i − M_i · P_i^{M_i+1})/(1 − P_i) + (P_i^2 − P_i^{M_i+1})/(1 − P_i)^2. Substituting it into (17), we can obtain

e_i(t, b_i) = b_i + (1/P_{b_i}) · Σ_{π_i=1}^{M_i} π_i · P_i^{π_i}
           = b_i + A/P_{b_i}
           = b_i + (P_i − M_i · P_i^{M_i+1})/(1 − P_i^{M_i+1}) + (P_i^2 − P_i^{M_i+1})/((1 − P_i)(1 − P_i^{M_i+1})),   (18)

where P_i < 1/3. By bounding the latter two terms of (18), e_i(t, b_i) can be rewritten as

e_i(t, b_i) = b_i + (P_i − M_i · P_i^{M_i+1})/(1 − P_i^{M_i+1}) + (P_i^2 − P_i^{M_i+1})/((1 − P_i)(1 − P_i^{M_i+1})) < b_i + 2,   (19)

where i = 1, 2, …, x.

By merging (17) and (19), we have b_i < e_i(t, b_i) < b_i + 2, and then 1 < e_i(t, b_i) − b_i + 1 < 3. Hence the downlink bandwidth will increase by at least one unit of data flow when the downlink is blocked.
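Both the closed form (18) and the bound of Theorem 3 can be verified numerically. The sketch below (our own illustration) compares (18) with the expectation computed directly from the truncated-geometric distribution underlying (17):

```python
# Check of (18) and Theorem 3: the closed form of e_i(t, b_i) matches the
# direct expectation over the truncated-geometric law of (17), and
# b_i < e_i(t, b_i) < b_i + 2 whenever P_i < 1/3.

def e_direct(b, P, M):
    Pb = (1 - P**(M + 1)) / (1 - P)           # normalizer P_{b_i}
    return sum((b + k) * P**k / Pb for k in range(M + 1))

def e_closed(b, P, M):                         # equation (18)
    return (b + (P - M * P**(M + 1)) / (1 - P**(M + 1))
              + (P**2 - P**(M + 1)) / ((1 - P) * (1 - P**(M + 1))))

for P in (0.05, 0.2, 0.33):                    # P_i < 1/3
    for M in (2, 5, 20):
        b = 4                                  # illustrative occupied bandwidth
        assert abs(e_direct(b, P, M) - e_closed(b, P, M)) < 1e-12
        assert b < e_closed(b, P, M) < b + 2   # Theorem 3's bound
print("closed form (18) and Theorem 3 bound verified")
```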

When M_i < e_i(t, b_i) − b_i + 1, the number of increased data flows would be larger than M_i; however, this is not allowed by the definition of M_i. Thus we can obtain

P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i) = 0.   (20)

When M_i ≥ e_i(t, b_i) − b_i + 1, we can get

P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i)
= Σ_{π_i = e_i(t,b_i)−b_i+1}^{M_i} P(y_i(t + 1) = b_i + π_i | y_i(t) = b_i)
= Σ_{π_i = e_i(t,b_i)−b_i+1}^{M_i} (1/P_{b_i}) · P_i^{π_i}
= (P_i^{e_i(t,b_i)−b_i+1} − P_i^{M_i+1})/(1 − P_i^{M_i+1}).   (21)

Equation (21) represents the downlink traffic capability at time-slot t + 1. When the value of (21) is large, the blocking probability of the downlink is high, and vice versa. To clarify the fact that the downlink has a low blocking probability at the next time-slot, we have the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot t + 1 is less than 0.125.

[Figure 6: Downlink blocking probability comparison for different M_i (M_i = 2, 4, 8, 16), plotted against P_i (%); the zero point is marked on the curves.]

Proof. Based on (21), we take the minimum value of M_i as 3 and substitute P_i < 1/3; thus we get

P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i)
= (P_i^{e_i(t,b_i)−b_i+1} − P_i^{M_i+1})/(1 − P_i^{M_i+1})
< ((1/3)^3 − (1/3)^{3+1})/(1 − (1/3)^{3+1})
= 0.025 < 0.125.   (22)

This completes the proof.
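The overflow probability in (21)–(22) is a truncated-geometric tail and is easy to evaluate directly; the sketch below (our own check) uses the exponent values appearing in the proof:

```python
# Check of (22): the tail probability (P**s - P**(M + 1)) / (1 - P**(M + 1))
# with s = e_i(t, b_i) - b_i + 1 = 3, M_i = 3, and P_i = 1/3 equals
# (1/27 - 1/81) / (80/81) = 2/80 = 0.025, which satisfies Theorem 4's bound.

def tail_prob(P, s, M):
    return (P**s - P**(M + 1)) / (1 - P**(M + 1))

v = tail_prob(1/3, 3, 3)
assert abs(v - 0.025) < 1e-12   # exact value is 2/80 = 0.025
assert v < 0.125                # Theorem 4's bound
print(v)
```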

To show that the MSaMC maintains a low downlink blocking probability at time-slot t + 1 under different values of M_i, we provide the comparison shown in Figure 6.

In Figure 6, P(y_i(t + 1) > e_i(t, b_i) | y_i(t) = b_i) indicates the downlink blocking probability, and its values are not more than 0.125 for the different M_i and P_i. At the zero point, the blocking probability is close to zero unless P_i > 0.1. In a real network, the condition P_i > 0.1 is rare. Therefore, the MSaMC has a very low blocking probability.

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes O(m) time to identify the available core switches. In the second step, the MSaMC needs to find the appropriate core switches: we need O(f · f) time to calculate the blocking probabilities of the available downlinks at time-slot t + 1 and select the appropriate core switches into the set T′, where f ≤ r − 1. In the end, we take O(f + f) time to construct the optimal paths from the source edge switch to the destination edge switches. Thus the computational complexity of MSaMC is given by

O(m + f · f + f + f) ≤ O(m + (r − 1)^2 + 2(r − 1)) = O(r^2 + m − 1).   (23)

Note that the complexity of the algorithm is polynomial in the number of core switches m and the number of edge switches r, which means that the computational complexity is rather low if the fanout f is very small. Therefore, the algorithm is time-efficient in multicast scheduling.

Table 3: Parameter setting.

Parameter                  Value
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s

6 Simulation Results

In this section, we use the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links with different time-slots. We then compare the performance of MSaMC against the SLMR algorithm with unicast traffic [4] and against the BCMS algorithm with multicast traffic [5].

6.1 Simulation Settings. The simulated network topology adopts 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are set in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed over all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the sizes of the multicast flows are equal [17, 18].

6.2 Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) V is defined as the average, over the available links of a multicast subnetwork, of the sum of the transmission delay differences of adjacent packets; that is,

V = (Σ_{i∈x} Σ_{j∈l} (T(t)_{ij} − T(t − 1)_{ij})) / x,   (24)

where x is the number of available links, l is the number of packets on an available link, and T(t) indicates the transmission delay of a packet at time-slot t.

We take ADV as a metric for the network state of a multicast subnetwork. The smaller the ADV is, the more stable the network state is, and vice versa.
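Definition 5 translates into a few lines of code. The sketch below is our own illustration (the delay samples are made up, not from the paper's simulation): it computes V from per-link, per-packet transmission delays at two consecutive time-slots.

```python
# Sketch of Definition 5: ADV = sum over links i and packets j of
# (T(t)[i][j] - T(t-1)[i][j]), averaged over the x available links.
# The delay samples below are invented for illustration only.

def adv(T_now, T_prev):
    x = len(T_now)                       # number of available links
    total = sum(T_now[i][j] - T_prev[i][j]
                for i in range(x)
                for j in range(len(T_now[i])))
    return total / x

T_prev = [[1.0, 1.1, 1.2], [0.9, 1.0, 1.0]]   # delays at time-slot t-1 (s)
T_now  = [[1.1, 1.2, 1.2], [0.9, 1.1, 1.0]]   # delays at time-slot t (s)
print(round(adv(T_now, T_prev), 6))  # 0.15: mild jitter between the slots
```

A value near zero indicates a stable network state; large positive or negative values indicate jitter, matching how ADV is read in Figures 7 and 8.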

[Figure 7: Average delay variance (ADV) comparison among links with different remaining bandwidth (μ = ω, 2ω, 3ω), plotted against the number of packets.]

[Figure 8: Average delay variance (ADV) comparison between uplink and downlink, plotted against the number of packets.]

Figure 7 shows the average delay variance (ADV) of the links as the number of packets grows. When the link remaining bandwidth μ is taken as ω or 2ω, the average delay variance has large jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request ω at time-slot t + 1. The average delay variance is close to a straight line when the link remaining bandwidth is 3ω, which implies that the network state is very stable. Therefore, the simulation result indicates that the optimal value of the link remaining bandwidth μ is 3ω.

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a bisection network, that is, the bandwidth of the uplink and downlink is equal; however, the downlink load is higher than the uplink load under multicast traffic, so the uplink state is more stable.

6.3 Total Network Throughput. In this subsection, we set the length of the time-slot t to ω/S and 2(ω/S). We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of the time-slot t is 2(ω/S). This is because MSaMC can quickly recover from network blocking and thus achieve higher network throughput. In addition, MSaMC cannot calculate the optimal path in real time when the length of the time-slot t is ω/S; in that case, the SLMR algorithm provides higher throughput.

Figure 9(b) shows the throughput comparison of MSaMC and the BCMS algorithm under the mixed scheduling pattern. The throughput of the BCMS algorithm becomes lower as the simulation time increases. The multicast transmission of the BCMS algorithm needs a longer time to address network blocking; the throughput therefore decreases sharply if the network blocking cannot be predicted. In contrast, MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, MSaMC obtains higher total network throughput.

6.4 Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm with unicast traffic, and the BCMS algorithm with mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases, the MSaMC with t = 2(ω/S) has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms use more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. In addition, we can also find that our MSaMC has the minimum average delay when the length of the time-slot is 2(ω/S). This is because a time-slot of length 2(ω/S) can just ensure that data is transmitted accurately to the destination switches: a shorter time-slot (less than 2(ω/S)) leads to incomplete data transmission, while a longer time-slot (more than 2(ω/S)) causes incorrect prediction of the traffic blocking status.

[Figure 9: Network throughput comparison. (a) SLMR versus MSaMC with t = 2(ω/S) and t = ω/S. (b) BCMS versus MSaMC with t = 2(ω/S) and t = ω/S.]

[Figure 10: Average delay comparison. (a) SLMR versus MSaMC with t = 2(ω/S) and t = ω/S. (b) BCMS versus MSaMC with t = 2(ω/S) and t = ω/S.]

7 Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, for fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to efficiently improve network performance. We show that MSaMC can guarantee low link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of MSaMC is determined by the number of core switches m and the number of destination edge switches f. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm called the SLMR algorithm and a well-known adaptive multicast scheduling algorithm called the BCMS algorithm. Experimental results show that MSaMC can achieve higher network throughput and lower average delay.

Notations

ω: Multicast bandwidth request of a data flow
b_i: The occupied bandwidth of the i-th link
μ: The remaining bandwidth of a link
a: The sum of the occupied bandwidth
y: The value of the link weight
S: Link bandwidth
M: The maximum number of increased data flows
π: The number of increased data flows
T: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013–3028, 2016.

[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29–43, 2003.

[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536–541, USA, April 2014.

[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses - Gradworks, 2014.

[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761–765, 2011.

[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1–7, Shenzhen, China, August 2017.

[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711–1714, 2012.

[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681–699, 2006.

[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications, New York: John Wiley & Sons, 1979, 446 pp.," Behavioural Science, vol. 26, no. 4, pp. 397–398, 1981.

[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173–178, Da Nang, Vietnam, January 2017.

[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56–60, China, April 2011.

[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class bloom filter," in Proceedings of the 2011 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266–275, Canada, October 2011.

[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898–904, 2012.

[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Review of Modern Physics, vol. 26, no. 1, 2002.

[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442–454, 1995.

[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with the 2009 SIGCOMM Conference (SIGCOMM '09), pp. 65–72, Spain, August 2009.

[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6–16, 2003.


8 Complexity

we compute the first-order derivative of (13) with respect to $p_i$, where $i = 1, 2, \ldots, x$:

$$\frac{\partial}{\partial p_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = -\frac{P_i^{\pi_i}}{1 - P_i^{M_i+1}} + \frac{\pi_i \cdot (1 - P_i) \cdot P_i^{\pi_i}}{p_i \cdot (1 - P_i^{M_i+1})} + \frac{(M_i + 1) \cdot (1 - P_i) \cdot P_i^{\pi_i} \cdot P_i^{M_i+1}}{p_i \cdot (1 - P_i^{M_i+1})^2}. \quad (16)$$

In (16), the third term is greater than zero, and the second term is greater than the absolute value of the first term when $\pi_i \ge 3$; hence we obtain $\frac{\partial}{\partial p_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) > 0$. Therefore, $P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i)$ is a monotonically increasing function of $p_i$ when $\pi_i \ge 3$. The multicast flow request $\omega$ is defined as one data unit, so evidently $\pi_i \ge 3\omega$. In other words, the remaining bandwidth of an available link can satisfy the multicast bandwidth request $\omega$ at time-slot $t+1$ if $\mu \ge 3\omega$. This completes the proof.

On the basis of Theorem 2, the first step of Algorithm 1 is reasonable and efficient. The condition $\mu \ge 3\omega$ not only ensures sufficient remaining bandwidth to satisfy the multicast flow request, but also avoids the complex calculation of the uplink blocking probability. However, the downlink carries data flows coming from other uplinks at any time-slot, which results in uncertainty about the downlink state at time-slot $t+1$. Therefore, we take the minimum blocking probability at time-slot $t+1$ as the selection criterion for the optimal downlinks.

Due to the randomness and uncertainty of the downlink state, it is difficult to estimate the network blocking state at time-slot $t+1$. We therefore deduce the expected data flow on the $i$th downlink connecting to the $j$th destination edge switch at time-slot $t+1$, denoted by $e_i(t, b_i)$, $j = 1, 2, \ldots, f$. Given that the data flow on the $i$th downlink is $b_i$, we can obtain

$$e_i(t, b_i) = \sum_{\pi_i=0}^{M_i} (b_i + \pi_i) \cdot P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i}, \quad (17)$$

where $P_{b_i} = (1 - P_i^{M_i+1})/(1 - P_i)$, $i = 1, 2, \ldots, x$.

From (17) we conclude the following theorem, which bounds the average increase of data flow on each downlink.

Theorem 3. In a fat-tree DCN, the increased bandwidth of a downlink is no more than two units on average at time-slot $t+1$.

Proof. We note that $\sum_{\pi_i=0}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = 1$, which means the flow increment of each link must be one element of the set $\{0, 1, \ldots, M_i\}$.

Setting $A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = P_i + \sum_{\pi_i=2}^{M_i} \pi_i \cdot P_i^{\pi_i}$, we get $P_i \cdot A = \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i+1} = \sum_{\pi_i=2}^{M_i} (\pi_i - 1) \cdot P_i^{\pi_i} + M_i \cdot P_i^{M_i+1}$. Subtracting the two equations, we obtain $(1 - P_i) \cdot A = P_i + \sum_{\pi_i=2}^{M_i} P_i^{\pi_i} - M_i \cdot P_i^{M_i+1}$. Evaluating the geometric sum, we have $A = (P_i - M_i \cdot P_i^{M_i+1})/(1 - P_i) + (P_i^2 - P_i^{M_i+1})/(1 - P_i)^2$. Substituting this into (17), we obtain

$$e_i(t, b_i) = b_i + \frac{1}{P_{b_i}} \sum_{\pi_i=1}^{M_i} \pi_i \cdot P_i^{\pi_i} = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)(1 - P_i^{M_i+1})}, \quad (18)$$

where $P_i < 1/3$. By bounding the latter two terms of (18), $e_i(t, b_i)$ can be estimated as

$$e_i(t, b_i) = b_i + \frac{P_i - M_i \cdot P_i^{M_i+1}}{1 - P_i^{M_i+1}} + \frac{P_i^2 - P_i^{M_i+1}}{(1 - P_i)(1 - P_i^{M_i+1})} < b_i + 2, \quad (19)$$

where $i = 1, 2, \ldots, x$.

By merging (17) and (19), we have $b_i < e_i(t, b_i) < b_i + 2$ and hence $1 < e_i(t, b_i) - b_i + 1 < 3$. Thus the downlink bandwidth increases by at least one unit of data flow when the downlink is blocked.
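The transition law behind (17) is a geometric distribution truncated at $M_i$: $\Pr(\pi_i) = P_i^{\pi_i}/P_{b_i}$. As a sanity check, the following Python sketch (function names are our own, not the authors' code) evaluates the expectation both by direct summation and via the closed form (18), and confirms the Theorem 3 bound $b_i < e_i(t, b_i) < b_i + 2$ for $P_i < 1/3$:

```python
def increment_pmf(P, M):
    """Truncated-geometric law of the per-slot flow increment pi in {0,...,M}:
    Pr(pi) = P**pi / P_b, with P_b = (1 - P**(M+1)) / (1 - P) as below (17)."""
    Pb = (1.0 - P ** (M + 1)) / (1.0 - P)
    return [P ** k / Pb for k in range(M + 1)]

def expected_load_direct(b, P, M):
    """e_i(t, b_i) by direct summation -- the first line of (17)."""
    return sum((b + k) * p for k, p in enumerate(increment_pmf(P, M)))

def expected_load_closed(b, P, M):
    """Closed form (18) for e_i(t, b_i)."""
    q = P ** (M + 1)
    return b + (P - M * q) / (1 - q) + (P * P - q) / ((1 - P) * (1 - q))

# (17) and (18) agree, and b_i < e_i < b_i + 2 holds for P_i < 1/3 (Theorem 3).
for P in (0.05, 0.1, 0.2, 0.33):
    for M in (2, 4, 8, 16):
        b = 5.0
        assert abs(expected_load_direct(b, P, M) - expected_load_closed(b, P, M)) < 1e-12
        assert b < expected_load_closed(b, P, M) < b + 2
```

The loop sweeps the parameter ranges used later in Figure 6; the two evaluations coincide to machine precision, as the algebra above predicts.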

When $M_i < e_i(t, b_i) - b_i + 1$, the number of added data flows would be larger than $M_i$, which is not allowed by the definition of $M_i$; thus we obtain

$$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = 0. \quad (20)$$

When $M_i \ge e_i(t, b_i) - b_i + 1$, we get

$$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} P(y_i(t+1) = b_i + \pi_i \mid y_i(t) = b_i) = \sum_{\pi_i = e_i(t,b_i) - b_i + 1}^{M_i} \frac{1}{P_{b_i}} \cdot P_i^{\pi_i} = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}}. \quad (21)$$

Equation (21) characterizes the downlink traffic capability at time-slot $t+1$: the larger the value of (21), the higher the blocking probability of the downlink, and vice versa. To show that the downlink has a low blocking probability at the next time-slot, we have the following theorem.

Theorem 4. In the multicast blocking model of fat-tree DCNs, the downlink blocking probability at time-slot $t+1$ is less than 0.125.

Figure 6: Downlink blocking probability comparison for different $M_i$ ($M_i = 2, 4, 8, 16$). [Plot: blocking probability (%) versus $P_i$ (%); the zero point is marked.]

Proof. Based on (21), we take the minimum value of $M_i$ as 2. Thus we get

$$P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i) = \frac{P_i^{e_i(t,b_i) - b_i + 1} - P_i^{M_i+1}}{1 - P_i^{M_i+1}} < \frac{(1/3)^3 - (1/3)^{3+1}}{1 - (1/3)^{3+1}} = 0.125. \quad (22)$$

This completes the proof

In order to show that MSaMC maintains a low downlink blocking probability at time-slot $t+1$ under different values of $M_i$, we provide the comparison shown in Figure 6.

In Figure 6, $P(y_i(t+1) > e_i(t, b_i) \mid y_i(t) = b_i)$ denotes the downlink blocking probability; its values stay below 0.125 for the different $M_i$ and $P_i$. Near the zero point, the blocking probability is close to zero unless $P_i > 0.1$. In a real network, the condition $P_i > 0.1$ rarely holds. Therefore, MSaMC has a very low blocking probability.
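The curves of Figure 6 can be reproduced numerically from (21), with $e_i(t, b_i) - b_i$ taken from the closed form (18). A minimal sketch (ours, not the authors' code) sweeps $P_i$ as in the figure:

```python
def blocking_prob(P, M):
    """Right-hand side of (21): Pr(y_i(t+1) > e_i(t,b_i) | y_i(t) = b_i),
    with the exponent e_i - b_i + 1 computed from the closed form (18)."""
    q = P ** (M + 1)
    e_minus_b = (P - M * q) / (1 - q) + (P * P - q) / ((1 - P) * (1 - q))
    return (P ** (e_minus_b + 1) - q) / (1 - q)

# Sweep P_i from 1% to 50% as in Figure 6, for M_i = 2, 4, 8, 16.
for M in (2, 4, 8, 16):
    vals = [blocking_prob(P / 100.0, M) for P in range(1, 51)]
    assert all(0.0 <= v < 1.0 for v in vals)   # a valid probability
    assert vals[0] < vals[-1]                  # heavier load, higher blocking
```

The probability is negligible for small $P_i$ and grows with the link load, matching the shape of the curves in Figure 6.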

In the following, we analyze the time complexity of MSaMC. The first step of MSaMC takes $O(m)$ time to identify the available core switches. In the second step, MSaMC needs to find the appropriate core switches: we need $O(f \cdot f)$ time to calculate the blocking probability of the available downlinks at time-slot $t+1$ and to select the appropriate core switches into the set $T'$, where $f \le r - 1$. Finally, we take $O(f + f)$ time to construct the optimal paths from the source edge switch to the destination edge switches. Thus the computational complexity of MSaMC is given by

$$O(m + f \cdot f + f + f) \le O(m + (r - 1)^2 + 2(r - 1)) = O(r^2 + m - 1). \quad (23)$$

Note that the complexity of the algorithm is polynomial in the number of core switches $m$ and the number of edge switches $r$, which means that the computational complexity is rather low if the fanout $f$ is small. Therefore, the algorithm is time-efficient for multicast scheduling.

Table 3: Parameter settings.

Parameter                  Value
Platform                   NS2
Link bandwidth             1 Gbps
RTT delay                  0.1 ms
Switch buffer size         64 KB
TCP receiver buffer size   100 segments
Simulation time            10 s
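Putting the pieces together, the two MSaMC selection steps described above admit a compact sketch. The data layout and names below are our own assumptions for illustration, not the paper's implementation:

```python
def blocking_prob(P, M):
    """Predicted downlink blocking probability at time-slot t+1 -- the
    right-hand side of (21), with e_i - b_i taken from the closed form (18)."""
    q = P ** (M + 1)
    e_minus_b = (P - M * q) / (1 - q) + (P * P - q) / ((1 - P) * (1 - q))
    return (P ** (e_minus_b + 1) - q) / (1 - q)

def select_core(cores, omega):
    """Sketch of the two MSaMC selection steps. Each core switch is a dict
    (field names are ours): 'mu' is the uplink remaining bandwidth, 'down'
    lists (P_i, M_i) pairs for its downlinks to the destination edge switches."""
    # Step 1, O(m): keep only cores whose uplink satisfies mu >= 3*omega.
    ok = [c for c in cores if c["mu"] >= 3 * omega]
    if not ok:
        return None  # no available core switch for this request
    # Step 2: choose the core with the smallest summed predicted downlink
    # blocking at t+1; break ties by the tightest uplink fit above 3*omega.
    return min(ok, key=lambda c: (sum(blocking_prob(P, M) for P, M in c["down"]),
                                  c["mu"] - 3 * omega))

cores = [{"mu": 4.0, "down": [(0.10, 4), (0.05, 4)]},
         {"mu": 3.2, "down": [(0.30, 4), (0.25, 4)]}]
assert select_core(cores, omega=1.0)["mu"] == 4.0   # lightly loaded core wins
assert select_core(cores, omega=2.0) is None        # no core has mu >= 6
```

The tie-break mirrors the paper's preference for an uplink whose remaining bandwidth is close to, but not below, $3\omega$, so that large bandwidth residues stay free for later requests.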

6. Simulation Results

In this section, we use the network simulator NS2 to evaluate the effectiveness of MSaMC in fat-tree DCNs in terms of the average delay variance (ADV) of links under different time-slots. We then compare the performance of MSaMC against the SLMR algorithm with unicast traffic [4], and against the BCMS algorithm with multicast traffic [5].

6.1. Simulation Settings. The simulated network topology comprises 1024 servers, 128 edge switches, 128 aggregation switches, and 64 core switches. The related network parameters are listed in Table 3. Each flow has a bandwidth demand of 10 Mbps [4]. For the fat-tree topology, we consider a mixed traffic distribution of both unicast and multicast traffic. For unicast traffic, the flow destinations of a source server are uniformly distributed over all other servers. The packet length is uniformly distributed between 800 and 1400 bytes, and the sizes of the multicast flows are equal [17, 18].

6.2. Comparison of Average Delay Variance. In this subsection, we first define the average delay variance (ADV) and then compare the ADV of the uplink and downlink for different numbers of packets.

Definition 5 (average delay variance). The average delay variance (ADV) $V$ is defined as the average of the sum of the transmission delay differences of two adjacent packets in a multicast subnetwork; that is,

$$V = \frac{\sum_{i=1}^{x} \sum_{j=1}^{l} (T(t)_{ij} - T(t-1)_{ij})}{x}, \quad (24)$$

where $x$ is the number of available links, $l$ is the number of packets on an available link, and $T(t)$ denotes the transmission delay of a packet at time-slot $t$.

We take ADV as a metric for the network state of the multicast subnetwork: the smaller the ADV, the more stable the network state, and vice versa.
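Definition 5 translates directly into code. The following sketch (ours, with an assumed per-link table of packet transmission delays) computes the ADV:

```python
def adv(delays):
    """ADV per Definition 5: delays[i][j] is the transmission delay of the
    j-th packet on the i-th available link; adjacent-packet delay differences
    are summed over all links and packets, then averaged over the x links."""
    x = len(delays)
    return sum(d[j] - d[j - 1] for d in delays for j in range(1, len(d))) / x

stable  = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]   # constant delays -> ADV = 0
growing = [[1.0, 1.2, 1.5], [2.0, 2.6, 3.4]]   # rising delays -> ADV > 0
assert adv(stable) == 0.0
assert adv(growing) > 0.0
```

A stable network keeps the ADV near zero; a sustained positive ADV signals growing queueing delay in the multicast subnetwork.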

Figure 7 shows the average delay variance (ADV) of links as the number of packets grows. When the link remaining bandwidth $\mu$ is taken as $\omega$ or $2\omega$, the average delay variance

Figure 7: Average delay variance (ADV) comparison among links with different remaining bandwidths ($\mu = \omega$, $2\omega$, $3\omega$). [Plot: ADV versus number of packets.]

Figure 8: Average delay variance (ADV) comparison between uplink and downlink. [Plot: ADV versus number of packets.]

has larger jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request $\omega$ at time-slot $t+1$. The average delay variance is close to a straight line when the link remaining bandwidth is $3\omega$, which implies that the network state is very stable. Therefore, the simulation results confirm that the optimal value of the link remaining bandwidth $\mu$ is $3\omega$.

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a bipartition network, that is, the bandwidths of the uplink and downlink are equal; however, the downlink load is higher than the uplink load under multicast traffic, so the uplink state is more stable.

6.3. Total Network Throughput. In this subsection, we set the length of time-slot $t$ to $\omega/S$ and $2(\omega/S)$. We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of time-slot $t$ is $2(\omega/S)$. This is because MSaMC can quickly recover from network blocking and thus achieve higher network throughput. However, MSaMC cannot calculate the optimal path in real time when the length of time-slot $t$ is $\omega/S$; in that case the SLMR algorithm provides the higher throughput.

Figure 9(b) shows the throughput comparison of MSaMC and the BCMS algorithm under the mixed scheduling pattern. The throughput of the BCMS algorithm falls as the simulation time increases. The multicast transmission of the BCMS algorithm needs a longer time to resolve network blocking, so the throughput decreases sharply if network blocking cannot be predicted. In contrast, MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, MSaMC obtains a higher total network throughput.

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm with unicast traffic, and the BCMS algorithm with mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases, MSaMC with $t = 2(\omega/S)$ has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms use more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. We also find that MSaMC attains its minimum average delay when the length of the time-slot is $2(\omega/S)$. This is because a time-slot of length $2(\omega/S)$ is just long enough to ensure that data can be transmitted accurately to the destination switches: a shorter time-slot (less than $2(\omega/S)$) leads to incomplete data transmission, while a longer one (more than $2(\omega/S)$) causes incorrect prediction of the traffic blocking status.

7. Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, for fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to improve network performance efficiently. We show that MSaMC guarantees low link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of MSaMC is determined by the number of core switches $m$ and the number of destination edge switches $f$. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm, SLMR, and a well-known adaptive multicast scheduling algorithm called

Figure 9: Network throughput comparison: (a) SLMR versus MSaMC; (b) BCMS versus MSaMC, each with $t = \omega/S$ and $t = 2(\omega/S)$. [Plots: network throughput (Gbps) versus simulation time (s).]

Figure 10: Average delay comparison: (a) SLMR versus MSaMC; (b) BCMS versus MSaMC, each with $t = \omega/S$ and $t = 2(\omega/S)$. [Plots: average delay versus simulation time (s).]

BCMS. Experimental results show that MSaMC achieves higher network throughput and lower average delay.

Notations

$\omega$: Multicast bandwidth request of a data flow
$b_i$: The occupied bandwidth of the $i$th link
$\mu$: The remaining bandwidth of a link
$a$: The sum of occupied bandwidth
$y$: The value of the link weight
$S$: Link bandwidth
$M$: The maximum number of added data flows
$\pi$: The number of added data flows
$T$: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

References

[1] J. Duan and Y. Yang, "Placement and performance analysis of virtual multicast networks in fat-tree data center networks," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 3013–3028, 2016.
[2] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[3] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29–43, 2003.
[4] O. Fatmi and D. Pan, "Distributed multipath routing for data center networks based on stochastic traffic modeling," in Proceedings of the 11th IEEE International Conference on Networking, Sensing and Control (ICNSC 2014), pp. 536–541, USA, April 2014.
[5] Z. Guo, On the Design of High Performance Data Center Networks, Dissertations and Theses - Gradworks, 2014.
[6] H. Yu, S. Ruepp, and M. S. Berger, "Out-of-sequence prevention for multicast input-queuing space-memory-memory Clos-network," IEEE Communications Letters, vol. 15, no. 7, pp. 761–765, 2011.
[7] G. Li, S. Guo, G. Liu, and Y. Yang, "Multicast scheduling with Markov chains in fat-tree data center networks," in Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1–7, Shenzhen, China, August 2017.
[8] X. Geng, A. Luo, Z. Sun, and Y. Cheng, "Markov chains based dynamic bandwidth allocation in DiffServ network," IEEE Communications Letters, vol. 16, no. 10, pp. 1711–1714, 2012.
[9] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem," SIAM Review, vol. 48, no. 4, pp. 681–699, 2006.
[10] T. G. Hallam, "David G. Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications, New York: John Wiley & Sons, 1979, 446 pp.," Behavioural Science, vol. 26, no. 4, pp. 397–398, 1981.
[11] K. Hirata and M. Yamamoto, "Data center traffic engineering using Markov approximation," in Proceedings of the 2017 International Conference on Information Networking (ICOIN), pp. 173–178, Da Nang, Vietnam, January 2017.
[12] D. Li, M. Xu, M.-C. Zhao, C. Guo, Y. Zhang, and M.-Y. Wu, "RDCM: reliable data center multicast," in Proceedings of IEEE INFOCOM 2011, pp. 56–60, China, April 2011.
[13] D. Li, H. Cui, Y. Hu, Y. Xia, and X. Wang, "Scalable data center multicast using multi-class Bloom filter," in Proceedings of the 2011 19th IEEE International Conference on Network Protocols (ICNP 2011), pp. 266–275, Canada, October 2011.
[14] P. J. Cameron, "Notes on counting: an introduction to enumerative combinatorics," Urology, vol. 65, no. 5, pp. 898–904, 2012.
[15] R. Pastor-Satorras, M. Rubi, and A. Diaz-Guilera, "Statistical mechanics of complex networks," Reviews of Modern Physics, vol. 26, no. 1, 2002.
[16] A. P. Pynko, "Characterizing Belnap's logic via De Morgan's laws," Mathematical Logic Quarterly, vol. 41, no. 4, pp. 442–454, 1995.
[17] T. Benson, A. Anand, A. Akella, and M. Zhang, "Understanding data center traffic characteristics," in Proceedings of the 1st Workshop on Research on Enterprise Networking (WREN 2009), co-located with the 2009 SIGCOMM Conference, pp. 65–72, Spain, August 2009.
[18] C. Fraleigh, S. Moon, B. Lyles et al., "Packet-level traffic measurements from the Sprint IP backbone," IEEE Network, vol. 17, no. 6, pp. 6–16, 2003.



[5] Z Guo On The Design of High Performance Data CenterNetworks Dissertations andTheses - Gradworks 2014

[6] H Yu S Ruepp and M S Berger ldquoOut-of-sequence preven-tion for multicast input-queuing space-memory-memory clos-networkrdquo IEEE Communications Letters vol 15 no 7 pp 761ndash765 2011

[7] G Li S Guo G Liu and Y Yang ldquoMulticast Scheduling withMarkov Chains in Fat-Tree Data Center Networksrdquo in Pro-ceedings of the 2017 International Conference on NetworkingArchitecture and Storage (NAS) pp 1ndash7 Shenzhen ChinaAugust 2017

[8] X Geng A Luo Z Sun and Y Cheng ldquoMarkov chainsbased dynamic bandwidth allocation in diffserv networkrdquo IEEECommunications Letters vol 16 no 10 pp 1711ndash1714 2012

[9] J Sun S Boyd L Xiao and P Diaconis ldquoThe fastest mixingMarkov process on a graph and a connection to a maximumvariance unfolding problemrdquo SIAM Review vol 48 no 4 pp681ndash699 2006

[10] T G Hallam ldquoDavid G Luenberger Introduction to DynamicSystems Theory Models and Applications New York JohnWiley amp Sons 1979 446 pprdquo Behavioural Science vol 26 no4 pp 397-398 1981

[11] K Hirata and M Yamamoto ldquoData center traffic engineeringusing Markov approximationrdquo in Proceedings of the 2017 Inter-national Conference on Information Networking (ICOIN) pp173ndash178 Da Nang Vietnam January 2017

[12] D Li M Xu M-C Zhao C Guo Y Zhang and M-Y WuldquoRDCM Reliable data center multicastrdquo in Proceedings of theIEEE INFOCOM 2011 pp 56ndash60 China April 2011

[13] D Li H Cui Y Hu Y Xia and X Wang ldquoScalable data centermulticast using multi-class bloom filterrdquo in Proceedings of the

2011 19th IEEE International Conference on Network ProtocolsICNP 2011 pp 266ndash275 Canada October 2011

[14] P J Cameron ldquoNotes on counting An introduction to enumer-ative combinatoricsrdquo Urology vol 65 no 5 pp 898ndash904 2012

[15] R Pastor-Satorras M Rubi and A Diaz-Guilera ldquoStatisticalmechanics of complex networksrdquoReview ofModern Physics vol26 no 1 2002

[16] A P Pynko ldquoCharacterizing Belnaprsquos logic via De MorganrsquoslawsrdquoMathematical Logic Quarterly vol 41 no 4 pp 442ndash4541995

[17] T Benson A Anand A Akella andM Zhang ldquoUnderstandingdata center traffic characteristicsrdquo in Proceedings of the 1stWorkshop Research on Enterprise NetworkingWREN 2009 Co-located with the 2009 SIGCOMM Conference SIGCOMMrsquo09pp 65ndash72 Spain August 2009

[18] C Fraleigh S Moon B Lyles et al ldquoPacket-level trafficmeasurements from the Sprint IP backbonerdquo IEEENetwork vol17 no 6 pp 6ndash16 2003

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Applied MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of

Hindawiwwwhindawicom Volume 2018

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawiwwwhindawicom Volume 2018

OptimizationJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

Hindawiwwwhindawicom Volume 2018

Operations ResearchAdvances in

Journal of

Hindawiwwwhindawicom Volume 2018

Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018

Hindawiwwwhindawicom Volume 2018

Decision SciencesAdvances in

Hindawiwwwhindawicom Volume 2018

AnalysisInternational Journal of

Hindawiwwwhindawicom Volume 2018

Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

10 Complexity

[Figure 7: Average delay variance (ADV) versus number of packets for links with remaining bandwidth μ = ω, 2ω, and 3ω.]

[Figure 8: Average delay variance (ADV) versus number of packets for the uplink and the downlink.]

has bigger jitter. This is because the link remaining bandwidth cannot satisfy the multicast flow request ω at time-slot t + 1. The average delay variance is close to a straight line when the link remaining bandwidth is 3ω, which implies that the network state is very stable. Therefore, the simulation result shows that the optimal value of the link remaining bandwidth μ is 3ω.
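For concreteness, curves like those in Figures 7 and 8 can be approximated with a short sketch. The windowed-deviation definition of ADV and the synthetic delay traces below are illustrative assumptions, not the paper's exact formula or measurement data:

```python
from statistics import mean

def average_delay_variance(delays, window=100):
    """Deviation of each window's mean delay from the overall mean.

    `delays` is a sequence of per-packet end-to-end delays; this
    windowed-deviation reading of ADV is an assumption for
    illustration, not necessarily the paper's exact metric.
    """
    overall = mean(delays)
    return [mean(delays[i:i + window]) - overall
            for i in range(0, len(delays), window)]

# Synthetic traces: a stable link (e.g., mu = 3*omega) vs. a jittery one.
stable = [10.0 + 0.01 * (i % 5) for i in range(1000)]
jittery = [10.0 + (5.0 if (i // 100) % 2 else -5.0) for i in range(1000)]

adv_stable = average_delay_variance(stable)
adv_jittery = average_delay_variance(jittery)
# The stable trace yields ADV values near zero; the jittery one oscillates.
```

A near-flat ADV series, as for the stable trace here, corresponds to the "close to a straight line" behavior observed when the remaining bandwidth is 3ω.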

From Figure 8, we observe that the jitter of the uplink ADV is smaller than that of the downlink ADV. This is because the fat-tree DCN is a bipartition network; that is, the bandwidth of the uplink and downlink is equal. However, the downlink load is higher than the uplink load under multicast traffic; therefore, the uplink state is more stable.

6.3. Total Network Throughput. In this subsection, we set the length of time-slot t as ωS and 2(ωS). We can observe from Figure 9(a) that MSaMC achieves better performance than the SLMR algorithm when the length of time-slot t is 2(ωS). This is because MSaMC can quickly recover from network blocking and thus achieve higher network throughput. In contrast, MSaMC cannot calculate the optimal path in real time when the length of time-slot t is ωS; in that case, the SLMR algorithm provides the higher throughput.

Figure 9(b) shows the throughput comparison of MSaMC and the BCMS algorithm under the mixed scheduling pattern. The throughput of the BCMS algorithm becomes lower as the simulation time increases gradually. The multicast transmission of the BCMS algorithm needs a longer time to resolve network blocking; therefore, the throughput decreases sharply when the blocking cannot be predicted. In contrast, MSaMC can predict the probability of network blocking at the next time-slot and address the delay problem of dynamic bandwidth allocation. Therefore, MSaMC obtains higher total network throughput.
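The quick recovery above relies on the uplink-selection rule summarized earlier: choose the available core switch whose uplink remaining bandwidth is the smallest value still at least 3ω. A minimal best-fit sketch of that rule; the switch names and bandwidth values are hypothetical:

```python
def select_core_switch(remaining_bw, omega):
    """Pick the core switch whose uplink remaining bandwidth is the
    smallest value >= 3*omega (best fit), per the selection rule
    described in the paper. Returns None when no uplink qualifies.
    """
    candidates = {s: bw for s, bw in remaining_bw.items() if bw >= 3 * omega}
    if not candidates:
        return None
    # Best fit: the qualifying uplink closest to the 3*omega threshold.
    return min(candidates, key=candidates.get)

# Hypothetical remaining bandwidths (Gb/s) on uplinks to core switches.
uplinks = {"core0": 2.0, "core1": 3.5, "core2": 9.0}
print(select_core_switch(uplinks, omega=1.0))  # best fit: core1
```

Choosing the tightest qualifying uplink, rather than the widest, leaves the larger residual capacities free for later requests, which is consistent with the paper's aim of reducing scheduling time and blocking.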

6.4. Average Delay. In this subsection, we compare the average end-to-end delay of our MSaMC, the SLMR algorithm with unicast traffic, and the BCMS algorithm with mixed traffic over different traffic loads. Figure 10 shows the average end-to-end delay for the unicast and mixed traffic patterns, respectively.

We can observe from Figure 10 that, as the simulation time increases gradually, MSaMC with t = 2(ωS) has a lower average delay than the SLMR and BCMS algorithms for both kinds of traffic. This is because the SLMR and BCMS algorithms use more backtracking to eliminate multicast blocking and therefore take more time to forward data flows to the destination edge switches. In addition, we can also find that when the length of the time-slot is 2(ωS), our MSaMC has the minimum average delay. This is because a time-slot of length 2(ωS) can just ensure that data are transmitted accurately to the destination switches. A shorter time-slot, less than 2(ωS), leads to incomplete data transmission, while a longer time-slot, more than 2(ωS), causes incorrect prediction of the traffic blocking status.
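The next-slot prediction that drives these delay gains can be illustrated with a minimal two-state (free/blocked) Markov chain; the transition probabilities below are made-up values for illustration, not parameters from the paper:

```python
def next_slot_blocking(p_blocked_now, p_fb, p_bb):
    """One-step Markov prediction of the probability that a link is
    blocked at time-slot t + 1.

    p_fb: P(blocked at t+1 | free at t)
    p_bb: P(blocked at t+1 | blocked at t)
    """
    return (1.0 - p_blocked_now) * p_fb + p_blocked_now * p_bb

p = 0.10                 # assumed current blocking probability of a downlink
for _ in range(50):      # iterate the chain toward its stationary point
    p = next_slot_blocking(p, p_fb=0.05, p_bb=0.60)
# Stationary probability: p_fb / (p_fb + 1 - p_bb) = 0.05 / 0.45 ≈ 0.111
```

A scheduler in this spirit would evaluate such a one-step prediction per downlink and route the multicast flow over the links with the lowest predicted blocking probability.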

[Figure 9: Network throughput (Gb/s) versus simulation time (s). (a) SLMR versus MSaMC with t = ωS and t = 2(ωS); (b) BCMS versus MSaMC with t = ωS and t = 2(ωS).]

[Figure 10: Average delay (s) versus simulation time (s). (a) SLMR versus MSaMC with t = ωS and t = 2(ωS); (b) BCMS versus MSaMC with t = ωS and t = 2(ωS).]

7. Conclusions

In this paper, we propose a novel multicast scheduling algorithm with Markov chains, called MSaMC, for fat-tree data center networks (DCNs), which can accurately predict the link traffic state at the next time-slot and achieve effective flow scheduling to improve network performance efficiently. We show that MSaMC can guarantee lower link blocking at the next time-slot in a fat-tree DCN while satisfying an arbitrary sequence of multicast flow requests under our traffic model. In addition, the time complexity analysis shows that the performance of MSaMC is determined by the number of core switches m and of destination edge switches f. Finally, we compare the performance of MSaMC with an existing unicast scheduling algorithm, the SLMR algorithm, and a well-known adaptive multicast scheduling algorithm, the BCMS algorithm. Experimental results show that MSaMC achieves higher network throughput and lower average delay.

Notations

ω: Multicast bandwidth request of a data flow
b_i: The occupied bandwidth of the ith link
μ: The remaining bandwidth of a link
a: The sum of occupied bandwidth
y: The value of link weight
S: Link bandwidth
M: The maximum increase in the number of data flows
π: The increase in the number of data flows
T: The set of available core switches

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (XDJK2016A011, XDJK2015C010, XDJK2015D023, and XDJK2016D047), the National Natural Science Foundation of China (nos. 61402381, 61503309, 61772432, and 61772433), the Natural Science Key Foundation of Chongqing (cstc2015jcyjBX0094), the Natural Science Foundation of Chongqing (CSTC2016JCYJA0449), the China Postdoctoral Science Foundation (2016M592619), and the Chongqing Postdoctoral Science Foundation (XM2016002).

