Date post: 18-Mar-2020
Overview of Networking Challenges for the Placement of Cloud Services Frédéric Giroire *Combinatorics, Optimisation et Algorithms For Telecommunications Journées Cloud 2019 Université Côte d’Azur/CNRS/Inria COATI*, France
Transcript
Page 1: Overview of Networking Challenges for the Placement of ... · computing solutions (e.g., MapReduce, Dryad, CIEL, and Spark) • Traditional scheduling consider properties of ... can

Overview of Networking Challenges for the Placement of Cloud Services

Frédéric Giroire

*Combinatorics, Optimization and Algorithms for Telecommunications

Journées Cloud 2019

Université Côte d’Azur/CNRS/Inria COATI*, France


One slide on my research

[Figure: network of an Internet Service Provider (ISP), with a peering link to other ISPs]

Optimization of network infrastructures.

Wireless and wired access networks

Generic and recurring question: find the best tradeoff between

• where to store data,

• where to carry out computations or execute services,

• how much traffic to send in the network and by which route,

with diverse objectives: minimize costs, energy consumption, or failure probability, or maximize user satisfaction.

Using tools from algorithmics, optimization, combinatorics (graph theory), simulation and experimentation.


In the cloud

• Applications or services run in Virtual Machines (VMs), containers, or Kata Containers

• An orchestrator assigns VMs to servers

[Figure: an orchestrator assigning VMs to servers]

Classical optimization problem: VM placement satisfying CPU, memory, storage constraints while minimizing some cost
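
A minimal sketch of this placement problem, under stated assumptions: each VM and each server is described by a hypothetical (CPU, memory, storage) triple, all servers are identical, and the cost to minimize is the number of servers used. The function name `first_fit_decreasing` and the resource units are illustrative. It applies the first-fit decreasing heuristic from bin packing, which gives a feasible placement but not necessarily an optimal one, and it is not claimed to be what any real orchestrator does.

```python
from typing import Dict, List, Optional, Tuple

Resources = Tuple[int, int, int]  # (CPU cores, memory GB, storage GB): illustrative units


def fits(demand: Resources, free: Resources) -> bool:
    """True if the demand fits in the free resources on every dimension."""
    return all(d <= f for d, f in zip(demand, free))


def first_fit_decreasing(vms: Dict[str, Resources],
                         capacity: Resources,
                         max_servers: int) -> Optional[List[List[str]]]:
    """Place each VM on the first server with enough room, largest VMs first.

    Returns one list of VM names per server used, or None if the heuristic
    cannot place every VM on at most max_servers servers.
    """
    # "Size" of a VM = its largest relative demand over the three dimensions.
    order = sorted(vms,
                   key=lambda name: max(d / c for d, c in zip(vms[name], capacity)),
                   reverse=True)
    free: List[Resources] = []        # remaining resources of each open server
    placement: List[List[str]] = []   # VMs hosted by each open server
    for name in order:
        demand = vms[name]
        for i, room in enumerate(free):
            if fits(demand, room):
                free[i] = tuple(r - d for r, d in zip(room, demand))
                placement[i].append(name)
                break
        else:
            # No open server has room: open a new one if allowed.
            if len(free) == max_servers or not fits(demand, capacity):
                return None
            free.append(tuple(c - d for c, d in zip(capacity, demand)))
            placement.append([name])
    return placement


vms = {"a": (4, 16, 50), "b": (4, 16, 50), "c": (2, 8, 20), "d": (6, 8, 10)}
print(first_fit_decreasing(vms, capacity=(8, 32, 100), max_servers=4))
```

With these illustrative numbers the four VMs end up on two servers. Note that the sketch only looks at server resources; nothing in it accounts for the network, which is the gap the rest of the talk addresses.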


Big Data*

• The volume of data businesses want to make sense of is increasing

• Increasing variety of sources: Web, mobile, wearables, vehicles, scientific, ...

• Cheaper disks, SSDs, and memory

• Stalling processor speeds

*Thanks: Some slides were borrowed from M. Chowdhury (University of Michigan)


Solution: Big Data Centers for Massive Parallelism


Introduction


• More and more data-oriented parallel computing solutions (e.g., MapReduce, Dryad, CIEL, and Spark)

• Traditional scheduling considers properties of

• server (e.g., CPU and memory usage)

• job (e.g., execution time, deadline)

Network resources are usually not taken into consideration


Communication is Crucial

• Performance

• Facebook jobs spend ~25% of runtime on average in intermediate communications*

[Chowdhury. Presentation in Dimacs. 2017]

• For some workloads, communications may account for up to 50% of job completion time [Chowdhury, et al. Orchestra SIGCOMM 2011]

As fast storage (e.g. SSD-based) systems proliferate, the network is likely to become an increasingly important bottleneck

*Based on a month-long trace with 320,000 jobs and 150 Million tasks, collected from a 3000-machine Facebook production MapReduce cluster.


Legacy Networks

• However, network resources are usually not optimized.

• Why? ‣ Network control is *very* difficult.


Legacy networks

[Figure: a legacy router, which embeds both its control plane and its data plane]

• Routers are closed systems: any change has to be made manually.

• Networks are managed by complex configurations.

-> Major difficulties in deploying new protocols

-> Dynamic routing decisions have not yet been successfully implemented in networks.


Question: What can be done to improve network usage?


Outline

1. Motivation
2. A new situation: SDN and NFV
3. Placement of virtual network functions
   ‣ Use case: Service Function Chaining
4. Coflows for datacenters
5. Scheduling with network tasks
6. Tools to evaluate solutions
7. What next?


Some modeling tricks or algorithmic facts useful to know

• Trick 1: Layered graph

• Trick 2: Placement = set cover

• Fact 1: Efficient algorithms exist for SFC

• Trick 3: Modeling concurrent flows with co-flows

• Fact 2: Efficient algorithms exist for co-flows

• Trick 4: The big switch abstraction (and more generally finding the bottleneck)


A new context

However, two new network paradigms have arrived:

1. Software Defined Networking (SDN)

2. Network Function Virtualization (NFV)


Software Defined Networks

• Routers are closed systems: any change has to be made manually.

• Networks are managed by complex configurations.

-> Major difficulties in deploying new protocols

• Intelligence implemented by a centralized controller managing elementary switches

• SDN conceives the network as a program.

-> Allows the deployment of advanced (dynamic) protocols

[Figure: legacy network (control plane and data plane inside each router) versus SDN (network applications on top of a centralized control plane, which manages the data plane)]

Example: Energy Efficiency

• Core of solutions for energy efficiency: dynamic adaptation of resource usage to traffic changes.

[Figure: the same network under high traffic and under low traffic]

Other applications: energy efficient data centers (virtual machine assignment), wireless networks (base-station assignment)…


Software Defined Networks

• Pushed by open source communities and large software and telecommunication companies.

• Large ecosystem: OpenFlow / OpenDaylight / OpenStack / Open vSwitch

• Software companies: Google B4, a large-scale experiment on its inter-datacenter network [Jain 2013].

• Telcos: e.g. AT&T targets running 75% of its network functions as software by 2020.

[Figure: B4 worldwide deployment (2011)]

SDN Challenges

• Defining the architecture
  • e.g. northbound APIs to enable real network programmability

• Security
  • e.g. single point of failure

• Scalability of the SDN environment
  • e.g. avoiding control plane to data plane communication overhead

[Figure: SDN architecture with network applications, control plane, and data plane]

SDN in summary

Decoupling of network control and forwarding functions

Advantages:
• centralized management
• programmatically configured
• dynamic routing
• ...


Network Function Virtualization

• Network flows have to be processed by a large number of network functions offering different services: security, traffic engineering, ...

• Legacy networks implement network functions using expensive, dedicated hardware called middleboxes.

Network Function Virtualization

• The NFV initiative decouples network functions from the underlying hardware by allowing them to run on general-purpose hardware in Virtual Machines.

• Advantages:
  - flexibility,
  - cost,
  - scalability,
  - ...

[Figure: from dedicated network appliances to general-purpose servers]

SDN+NFV = full Network Programmability

• NFV and SDN are independent of each other but complementary

GOAL: exploit the benefits and potential of both approaches

• A symbiosis between them can improve resource management and service orchestration:
  - Increased efficiency and lower costs
  - Faster innovation and time to market
  - Agility
  - Automation and faster change
  - No vendor lock-in

Research Challenges

‣ Algorithmic Aspects of Resource Allocation
‣ Evaluation of SDN/NFV solutions
‣ New Protocols & Standardization
‣ Performance
‣ Resiliency
‣ Scalability
‣ Security
‣ ...


Service Function Chaining

• Network flows are often required to be processed by an ordered sequence of network functions defining a service

• Different customers can have different requirements in terms of the sequence of network functions

[Figure: two example chains, SFC A and SFC B, built from network functions such as video optimization, deep packet inspection, and firewall]

Service Function Chaining


• Legacy Networks: new service —> new hardware

- impractical to change the locations of physical middleboxes

• SDN/NFV-enabled Networks: easier and cheaper SFC deployment and provisioning:

- simplified middlebox traffic steering (SDN)

- flexible and dynamic deployment of network functions (NFV)

Flows can be managed dynamically end to end, and network functions can be installed only along the paths where, and at the times when, they are needed.

NFV Placement

• NFV: more efficient and flexible network management.

• Hence, placing network functions in a cost-effective manner is an essential step toward the full adoption of the NFV paradigm.

• Problem: place VNFs to satisfy the ordering constraints of the flows with the goal of minimizing the total setup cost (such as license fees, network efficiency, or energy consumption)

Example of Service Function Chains

3 flows: A to F, A to E, F to C

[Figure: example network with nodes A to F; each flow is labeled with the chain it requires, SFC A or SFC B]


SFC Placement

• Challenges:
  - Optimizing routing AND VNF provisioning
  - Modeling order between functions

• Outline:
  1. Trick 1: The layered graph [Dwaraki and Wolf, in HotMiddlebox, 2016]
  2. Approximation algorithms for SFC [Tomassilli, Giroire, Huin, Perennes, in INFOCOM 2018]
     ‣ Trick 2: VNF placement = Set Cover [Sang et al. in INFOCOM 2017]


SFC Placement

• The classic way to model the problem of routing & provisioning SFCs is Integer Linear Programming (ILP), with
  • a large number of binary variables to model the function placement,
  • a large number of binary variables to model the order ("function f2 cannot appear on the path before function f1").

• This leads to inefficient optimization models and algorithms

Modeling Trick 1


Layered Graph[1]

• Proposes an alternative way to find a Service Path (routing path & placement of functions)

• Transforms a problem of routing and placement into a problem of routing only,

• while taking into account the order between functions.

[Figure: example network with six nodes, numbered 1 to 6]

Example: Request between 1 and 4 for SFC

Modeling Trick 1

[1] Dwaraki and Wolf. "Adaptive service-chain routing for virtual network functions in software-defined networks," in HotMiddlebox workshop, 2016.

[Figure: the layered graph for this request, made of one copy of the network per layer]

• # layers = # functions + 1

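• Links between layers give the placement: moving from a node in layer i to the same node in layer i+1 means that function i+1 of the chain is executed at that node

• Links inside a layer give the routing

• A service path is a path from the source in the first layer to the destination in the last layer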
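
The construction described by these bullets can be sketched as follows. This is an illustration under assumptions, not the code of [1]: the six-node ring topology, the unit link costs, the function names `f1` and `f2`, and the per-node function costs in `hosts` are all hypothetical, since the slides do not give the topology of the figure.

```python
from typing import Dict, List, Tuple

LayeredNode = Tuple[int, int]  # (network node, layer index)


def build_layered_graph(
    links: Dict[Tuple[int, int], float],
    hosts: Dict[str, Dict[int, float]],
    chain: List[str],
) -> Dict[LayeredNode, List[Tuple[LayeredNode, float]]]:
    """Return the adjacency lists of the layered graph.

    links: cost of each directed link (u, v) of the network
    hosts: for each function, the nodes able to run it and the cost of doing so
    chain: the ordered list of functions required by the request
    """
    nodes = {u for link in links for u in link}
    n_layers = len(chain) + 1  # number of layers = number of functions + 1
    adj = {(u, k): [] for u in nodes for k in range(n_layers)}
    # Links inside a layer: a copy of every network link (routing).
    for k in range(n_layers):
        for (u, v), cost in links.items():
            adj[(u, k)].append(((v, k), cost))
    # Links between layers k and k+1: run the k-th function of the chain at u (placement).
    for k, f in enumerate(chain):
        for u, cost in hosts[f].items():
            adj[(u, k)].append(((u, k + 1), cost))
    return adj


# Hypothetical ring 1-2-3-4-5-6-1 with unit link costs in both directions.
ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
links = {e: 1.0 for u, v in ring for e in ((u, v), (v, u))}
hosts = {"f1": {2: 1.0, 6: 5.0}, "f2": {3: 1.0, 5: 1.0}}
graph = build_layered_graph(links, hosts, ["f1", "f2"])
print(len(graph))  # 18 layered nodes: 6 network nodes x 3 layers
```

Because inter-layer links only go from layer k to layer k+1, any path from the first to the last layer applies the functions in chain order, which is how the ordering constraint is encoded without extra variables.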


• Finding a Service Path now boils down to finding

• a constrained shortest path (because of shared capacity) in the layered graph, using fast pseudo-polynomial algorithms, e.g. [1],

• or even a simple shortest path (often sufficient in practice), using a very fast algorithm such as Dijkstra's.
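
The simple shortest path variant can be sketched as below. This is a hedged illustration that ignores capacities (so it is the unconstrained case only): Dijkstra's algorithm runs on the layered graph, generated implicitly, and the resulting path is decoded into a route and a placement. The function name `service_path`, the ring topology, and all costs are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def service_path(links: Dict[Tuple[int, int], int],
                 hosts: Dict[str, Dict[int, int]],
                 chain: List[str],
                 src: int,
                 dst: int) -> Optional[Tuple[int, List[int], List[Tuple[str, int]]]]:
    """Dijkstra on the implicit layered graph.

    Returns (total cost, route, placement) or None if no service path exists.
    A state (u, k) means: at network node u, with the first k functions applied.
    """
    neighbors: Dict[int, List[Tuple[int, int]]] = {}
    for (u, v), c in links.items():
        neighbors.setdefault(u, []).append((v, c))

    start, goal = (src, 0), (dst, len(chain))
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, (u, k) = heapq.heappop(heap)
        if (u, k) == goal:
            break
        if d > dist[(u, k)]:
            continue  # stale heap entry
        # Moves inside layer k (routing) ...
        moves = [((v, k), c) for v, c in neighbors.get(u, [])]
        # ... and the move to layer k+1 (run the next function at u), if u can host it.
        if k < len(chain) and u in hosts[chain[k]]:
            moves.append(((u, k + 1), hosts[chain[k]][u]))
        for nxt, c in moves:
            if d + c < dist.get(nxt, float("inf")):
                dist[nxt] = d + c
                prev[nxt] = (u, k)
                heapq.heappush(heap, (d + c, nxt))
    if goal not in dist:
        return None

    # Decode the layered path: layer changes give the placement, the rest the route.
    walk = [goal]
    while walk[-1] != start:
        walk.append(prev[walk[-1]])
    walk.reverse()
    route, placement = [src], []
    for (u, k), (v, k2) in zip(walk, walk[1:]):
        if k2 == k + 1:
            placement.append((chain[k], u))
        else:
            route.append(v)
    return dist[goal], route, placement


# Hypothetical ring 1-2-3-4-5-6-1, unit link costs, request from node 1 to node 4.
ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
links = {e: 1 for u, v in ring for e in ((u, v), (v, u))}
hosts = {"f1": {2: 1, 6: 5}, "f2": {3: 1, 5: 1}}
print(service_path(links, hosts, ["f1", "f2"], src=1, dst=4))
# (5, [1, 2, 3, 4], [('f1', 2), ('f2', 3)])
```

Handling shared link or node capacities would require the resource-constrained shortest path algorithms mentioned in the first bullet; this sketch does not attempt that.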

Algorithmic Fact 1

[1] Irnich and Desaulniers. Shortest path problems with resource constraints. In Column Generation, 2005.


Problem

• Input: A digraph G = (V,E), a set of functions F, and a collection D of demands.

  • A demand d ∈ D is modeled by a couple:
    • a path path(d) of length l(d) and
    • a service function chain sfc(d) of length s(d).

  • A setup cost c(v,f) of function f in node v ∈ V.

• Output: A function placement Π ⊂ V × F

• Objective: minimize the total setup cost $\sum_{(v,f) \in \Pi} c(v,f)$

• As in [Sang et al. Infocom 2017], we consider the case of an operator that has already routed its demands and now wants to optimize the placement of network functions.
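
To make the problem statement concrete, here is a small sketch of what a valid placement means, assuming (as the statement suggests) that a demand is satisfied when the functions of its chain can be met in order along its fixed path, with several functions possibly served at the same node. The helper names `demand_is_satisfied` and `total_setup_cost` and the sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Placement = Set[Tuple[str, str]]  # set of (node, function) pairs, i.e. a subset of V x F


def demand_is_satisfied(path: List[str], chain: List[str], placement: Placement) -> bool:
    """Scan the path once; serve each function of the chain at the first
    node where it is installed, at or after the node serving the previous one."""
    i = 0  # index of the next function of the chain still to be served
    for v in path:
        while i < len(chain) and (v, chain[i]) in placement:
            i += 1
    return i == len(chain)


def total_setup_cost(placement: Placement, cost: Dict[Tuple[str, str], float]) -> float:
    """Objective value: sum of c(v, f) over the installed (v, f) pairs."""
    return sum(cost[pair] for pair in placement)


path = ["A", "B", "D", "F"]
chain = ["f1", "f2"]
print(demand_is_satisfied(path, chain, {("B", "f1"), ("D", "f2")}))  # True
print(demand_is_satisfied(path, chain, {("D", "f1"), ("B", "f2")}))  # False: wrong order
```

The greedy left-to-right scan is enough for this check because serving a function as early as possible never prevents later functions from being served.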


Related Work

• Roughly two categories: heuristic-based and ILP-based

  • [Kuo et al. Infocom 2016] Maximizing the total number of admitted demands

  • [Mehraghdam et al. Cloudnet 2014] Minimizing the number of used nodes or the latency of the paths.

• Works closest to ours: approximation algorithms

  • [Cohen et al. Infocom 2015] Minimizing setup cost, with near-optimal approximation algorithms with theoretically proven performance. However, the execution order of the network functions is not considered.

  • [Sang et al. Infocom 2017] Minimizing the total number of network functions. However, they consider a single network function and leave the placement of virtual functions with chaining constraints as an open problem for future research.

Contributions

“First approximation algorithms taking into account ordering constraints.”

+ optimal on trees + validation

[Tomassilli, Giroire, Huin, Perennes INFOCOM 2018]


Preliminaries: Chains of Length 1

• Direct equivalence with the Minimum Weight Hitting Set Problem


Modeling Trick 2

[Sang et al. Infocom 2017]

• Input: Collection C of subsets of a finite set S.

• Output: A hitting set for C, i.e., a subset S′ ⊆ S such that S′ contains at least one element from each subset in C.

• Objective: Minimize the cost of the hitting set, i.e., $\sum_{s \in S'} c(s)$, where c(s) is the cost of element s.


• Elements of S: possible function locations, i.e., the vertices in V. Each element v has cost c(v).

• Sets in C: paths of the demands in D. The set of a demand d contains all the nodes of its path, {u_1, ..., u_{l(d)}}.

-> A placement of minimum cost covering all demands corresponds to a minimum cost hitting set.

47

Modeling Trick 2

Page 62: Overview of Networking Challenges for the Placement of ... · computing solutions (e.g., MapReduce, Dryad, CIEL, and Spark) • Traditional scheduling consider properties of ... can

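Preliminaries: Chains of Length 1

[Figure: network with nodes A, B, C, D, E, F and 3 flows with their SFC: A to F, A to E, F to C. The flows follow the paths ABDF, ACE and FEC. Each node u has a setup cost c(u,f1) for the function f1: c(A,f1), c(B,f1), c(C,f1), c(D,f1), c(E,f1), c(F,f1). Placing f1 on A and E gives Cost = c(A,f1) + c(E,f1).]

Modeling Trick 2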

Preliminaries: Chains of Length 1

• The equivalence with Set Cover directly gives:

• On the positive side, an H(|D|)-approximation using the greedy algorithm for Set Cover [Chvatal 1979], where H(n) is the n-th harmonic number.

• On the negative side, the SFC Placement Problem is hard to approximate within ln(|D|) [Alon et al. 2006].

Modeling Trick 2
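
The Set Cover view of chains of length 1 can be illustrated with a minimal Python sketch of the classical greedy rule: each node plays the role of a set containing the flows whose path crosses it, and the node with the lowest cost per newly covered flow is picked repeatedly. Only the three paths ABDF, ACE and FEC come from the slides; the function name `greedy_placement`, the unit costs and the tie-breaking by node name are assumptions made for the illustration.

```python
def greedy_placement(paths, cost):
    """Greedy weighted Set Cover: nodes are the sets, flows are the elements."""
    uncovered = set(paths)  # flows that do not yet cross a chosen node
    chosen = []
    while uncovered:
        # Flows that each candidate node would newly cover.
        gain = {u: {d for d in uncovered if u in paths[d]} for u in cost}
        # Lowest cost per newly covered flow; ties broken by node name (assumption).
        best = min((u for u in gain if gain[u]),
                   key=lambda u: (cost[u] / len(gain[u]), u))
        chosen.append(best)
        uncovered -= gain[best]
    return chosen


# Paths taken from the slides; the unit costs are hypothetical.
paths = {"A->F": tuple("ABDF"), "A->E": tuple("ACE"), "F->C": tuple("FEC")}
cost = {u: 1.0 for u in "ABCDEF"}
placement = greedy_placement(paths, cost)
print(placement, sum(cost[u] for u in placement))
```

With unit costs the sketch selects two nodes, which is the same total cost as the placement on A and E shown in the slides.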

General Case

• When the length of the chain is >= 2, the extension is not direct, even for a single chain.

How to deal with the general case?

Associated Network

• A key concept: an associated network for each demand

• Definition: Associated Network H(d) for a demand d with path(d) = u_1, u_2, ..., u_{l(d)} and chain sfc(d) = r_1, r_2, ..., r_{s(d)}

• Definition: Capacitated Associated Network H(d,Π) of a demand d and a function placement Π:
- All arcs have infinite capacity.
- The capacity of node u of layer i is 1 if (u, r_i) ∈ Π and 0 otherwise.

• Key property: A demand d ∈ D is satisfied by Π if and only if there exists a feasible st-path in the capacitated associated network H(d,Π).
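
The key property can be checked without building the layered graph explicitly. The sketch below assumes (this is not spelled out in the transcript) that H(d) has one layer per function of the chain and that arcs only go forward along path(d), so that a feasible st-path exists exactly when the functions r_1, ..., r_s(d) can be met in order along the path. The function name `is_satisfied` and the example placement are hypothetical.

```python
def is_satisfied(path, chain, placement):
    """True iff the functions of `chain` can be met in order along `path`.

    `placement` is a set of (node, function) pairs, playing the role of Π.
    """
    i = 0  # index of the next function of the chain still to be applied
    for u in path:
        # Several consecutive functions may be hosted on the same node.
        while i < len(chain) and (u, chain[i]) in placement:
            i += 1
    return i == len(chain)


path = ["A", "B", "D", "F"]   # flow from A to F, as in the slides
chain = ["f1", "f2"]          # hypothetical chain of length 2
print(is_satisfied(path, chain, {("B", "f1"), ("D", "f2")}))  # order respected
print(is_satisfied(path, chain, {("D", "f1"), ("B", "f2")}))  # order not respected
```

Scanning the path once and always taking the earliest available function is enough here, because using a function earlier never prevents a later one from being used.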

Associated Network: An Example

[Figure: example network with nodes A, B, C, D, E, F and 3 flows with their SFC (A to F, A to E, F to C), together with the associated network of one demand: a source s, a sink t and two copies of each of the nodes A, B, C and D.]

Associated Network: An Example

[Figure: the same example network and 3 flows (A to F, A to E, F to C), with the associated network of one demand: a source s, a sink t and two copies of each of the nodes A, B, D and F.]

Order not respected = No st-paths

New Formulation of the Problem

• Goal: Link our problem with the Hitting Set Problem.

• Tool: Menger’s theorem for digraphs (max flow-min cut): “the maximum number of vertex-disjoint st-paths in a digraph is equal to the size of a minimum st-vertex cut”
-> An st-path exists <=> the minimum st-vertex cut has cost >= 1
-> All cuts of the associated networks have to be hit.

Approximation Algorithms

-> leads to two approximation algorithms with a logarithmic factor:
• a greedy one (naive and fast versions)
• one using LP-rounding (naive and fast versions)

Contributions

• We investigated the problem of placing VNFs to satisfy the ordering constraints of the flows, with the goal of minimizing the total setup cost.

• We proposed two algorithms that achieve a logarithmic approximation factor.

• For the special case of tree network topologies with only upstream and downstream flows, we devised an optimal algorithm.

Algorithmic Fact 1

An optimal algorithm for tree topologies

• Finding efficient algorithms for some classes of graphs (such as trees) -> often important in practice, e.g. for Mobile Edge Computing or Fog computing (specific topology of access networks)

• Tree topology:
- Physical network of any shape,
- But clients communicating through a logical tree (e.g. CDNs, sensor networks, …)

• Theorem: The SFC Placement Problem is NP-hard even on a tree and with a single network function. (Proof: reduction from Vertex Cover)

• Polynomial exact algorithm for upstream or downstream flows, based on dynamic programming.

Modeling Trick 3

Algorithmic Fact 1

SFC - Conclusions

• Efficient algorithms proposed for SFC provisioning

• Theoretical framework for studying the placement problem with ordering constraints.

• Unaddressed issues:
- Accounting for practical constraints such as soft capacities on network functions or hard capacities on network nodes
- Affinity/anti-affinity rules
- Partial order
- Latency

Future research direction: is it possible to efficiently approximate these problems?

• SDN and NFV bring several benefits:
- simplify management
- enhance flexibility of the network
- reduce the network cost

• But also several challenges that need to be addressed to fully attain their benefits

An SDN-NFV enabled network has the potential to boost NFV deployment and support new efficient and cost-effective services

Future Directions

• Several major revolutions:
- 5G
- IoT
- Mobile Edge Computing
- …

• Assign slices to capacity slots of physical links -> slicing
• Dynamic SFC Placement
• Network Reconfiguration

New challenges: new algorithmic problems to be solved

Network Slicing

• Assign slices to capacity slots of physical links
- each slice is independent of the others
- each slice may have different QoS requirements

• 2 different network slicing strategies:
- SOFT: traffic is multiplexed in queuing systems: high load may affect other slices
- HARD: each slice has dedicated resources at the physical and MAC layers

(Parallel with the isolation problems of VMs vs containers)

Outline: Summary

1. Motivation
2. A new situation: SDN and NFV
3. Placement of virtual network functions
‣ Use case: Service Function Chaining
4. Coflows for datacenters
5. Scheduling with network tasks
6. Tools to evaluate solutions
7. What next?

Convergence Data Centers/Networks

• Convergence
- of infrastructures,
- of their control with the next generation SDN/NFV networks

• Allows a joint optimization of applications and network traffic.

• Revisit the fundamental problems of scheduling in data centers.

Topic of a joint lab between Orange and Inria, “Big OS”

Reminder

• More and more data-oriented parallel computing solutions (e.g., MapReduce, Dryad, CIEL, and Spark)

• Traditional scheduling considers properties of
- servers (e.g., CPU and memory usage)
- jobs (e.g., execution time, deadline)

• Communications account for up to 50% of job completion time [Chowdhury et al., Orchestra, SIGCOMM 2011]

Network resources are usually not taken into consideration

Related Work

• Optimizing data center communications.
- [Chowdhury et al. SIGCOMM 2011] Orchestra. Load balancing mechanisms to improve the shuffle phase.
- [Jalaparti et al. SIGCOMM Rev. 2015] Corral. Using job recurrence to place data and large computations with locality.

Few theoretical frameworks and provably efficient algorithms

Related Work

• Theoretical frameworks for scheduling of complex workflows
- [Graham, Bell System Tech. Journal 1966] Scheduling with precedence constraints, or list scheduling. Main result: a (2 - 1/m)-approximation on m machines.
- In the 90s, scheduling with communication delays. Minimizing the makespan is still an open problem. However, there is a 2-approximation with uniform delays and task replication [Papadimitriou, Yannakakis, SIAM J. on Computing 1990] or with unitary costs [Rayward-Smith, DAM 1987]

No network capacity is assumed: all communications can be done at the same time without changing the delay!
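
As an illustration of the list scheduling rule cited above, here is a minimal Python sketch on m identical machines: whenever a machine is idle, it starts any task whose predecessors have all completed. The function name `list_schedule`, the task names and the durations are assumptions made for the example. The sketch deliberately ignores communication delays and network capacity, which is exactly the limitation pointed out on the slide.

```python
import heapq


def list_schedule(durations, preds, m):
    """List scheduling on m identical machines; returns the makespan."""
    n_preds = {t: len(preds.get(t, ())) for t in durations}
    succs = {t: [] for t in durations}
    for t, ps in preds.items():
        for p in ps:
            succs[p].append(t)
    ready = sorted(t for t in durations if n_preds[t] == 0)
    running = []                 # heap of (finish_time, task)
    free, now, makespan = m, 0, 0
    while ready or running:
        # Start as many ready tasks as there are idle machines.
        while ready and free > 0:
            t = ready.pop(0)
            heapq.heappush(running, (now + durations[t], t))
            free -= 1
        # Advance time to the next task completion.
        now, t = heapq.heappop(running)
        free += 1
        makespan = max(makespan, now)
        for s in succs[t]:
            n_preds[s] -= 1
            if n_preds[s] == 0:
                ready.append(s)
    return makespan


# Hypothetical job: c needs a, d needs a and b.
durations = {"a": 2, "b": 3, "c": 2, "d": 1}
preds = {"c": ["a"], "d": ["a", "b"]}
print(list_schedule(durations, preds, m=2))  # 4
print(list_schedule(durations, preds, m=1))  # 8
```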

Two Theoretical Frameworks

1. Coflows, or scheduling groups of dependent flows [Chowdhury, Stoica, HotNets 2012]

2. Network tasks, or scheduling while optimizing network resources [Giroire, Huin, Tomassilli, Pérennes, INFOCOM 2019]

Distributed Data-Parallel Applications

• Multi-stage dataflow
- Computation interleaved with communication

• Computation Stage (e.g., Map, Reduce)
- Distributed across many machines
- Tasks run in parallel

• Communication Stage (e.g., Shuffle)
- Between successive computation stages

A communication stage cannot complete until all the data has been transferred

Question

How to design the network for data-parallel applications?

‣ What are good communication abstractions?

Traditional solution: The flow abstraction

Flow: Transfer of data from a source to a destination

E.g., lots of work to ensure per-flow fairness and/or minimize flow completion time

Is Flow Still the Right Abstraction?

Independent flows cannot capture the collective communication behavior common in data-parallel applications

The Coflow abstraction

• Coflow = Collection of semantically related flows [1]

• Communication abstraction for data-parallel applications to express their performance goals

[Figure: examples of coflows: Single Flow, Parallel Flows, Broadcast, Aggregation, Shuffle, All-to-All.]

[1] Chowdhury, Stoica, HotNets 2012

The Coflow abstraction

How to schedule coflows online …

#1 … for faster completion of coflows?

#2 … to meet more deadlines?

[Figure: a datacenter drawn between two columns of ports numbered 1, 2, ..., N.]

Network and Coflow Model

• “Big switch” conceptual model = abstract out the datacenter network fabric as one big switch interconnecting servers.

• Assumption: the fabric core can sustain 100% throughput and only the ingress (NICs) and egress (ToR switches) queues are potential congestion points.

• Indeed: most data center network architectures (e.g. Fat Tree) have full bisection bandwidth and are permutation networks.

Modeling Trick 4

• Clairvoyant scheduler = Coflow details known at arrival time:
- Source-destination for each flow
- Size of each flow
- Coflow weight

• Considered metric: Coflow Completion Time (CCT) = time when all flows of a coflow have completed

Goal: Minimize the average weighted CCT

Benefit of inter-coflow scheduling

[Figure: two links. Coflow 1 has one flow of 3 units on Link 1; Coflow 2 has one flow of 2 units on Link 1 and one flow of 6 units on Link 2. Three schedules of L1 and L2 are shown on a time axis (2, 4, 6).]

• Fair Sharing: Coflow1 comp. time = 5, Coflow2 comp. time = 6
• Smallest-Flow First [1],[2]: Coflow1 comp. time = 5, Coflow2 comp. time = 6
• The Optimal: Coflow1 comp. time = 3, Coflow2 comp. time = 6

[1] Finishing Flows Quickly with Preemptive Scheduling, SIGCOMM’2012.
[2] pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM’2013.
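
The completion times of this example can be reproduced with a small Python sketch. It assumes unit-capacity links, flows that each use a single link, and a scheduler that serves coflows on every link one after another in a given priority order; the flow sizes are those read from the slide figure (Coflow 1: 3 units on Link 1; Coflow 2: 2 units on Link 1 and 6 units on Link 2). The function name `coflow_completion_times` is hypothetical.

```python
def coflow_completion_times(coflows, order):
    """CCT of each coflow when every link serves coflows in the given order.

    `coflows` maps a coflow id to {link: units}; links have unit capacity.
    """
    link_clock = {}  # time at which each link becomes free
    cct = {}
    for c in order:
        finish = 0
        for link, units in coflows[c].items():
            link_clock[link] = link_clock.get(link, 0) + units
            finish = max(finish, link_clock[link])
        cct[c] = finish  # a coflow completes with its last flow
    return cct


coflows = {"C1": {"L1": 3}, "C2": {"L1": 2, "L2": 6}}
smallest_first = coflow_completion_times(coflows, ["C2", "C1"])
coflow_aware = coflow_completion_times(coflows, ["C1", "C2"])
print(smallest_first)  # C2 first on Link 1, as smallest-flow first would do
print(coflow_aware)    # C1 first: it no longer waits behind the 2-unit flow
```

Serving Coflow 2 first on Link 1 gives completion times 5 and 6, while serving Coflow 1 first gives 3 and 6: Coflow 2 is bottlenecked by its 6-unit flow on Link 2 anyway, so delaying its small flow costs nothing.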

Coflows: Main Known Results

Problem of min. avg CCT - Negative Algorithmic Results:

• NP-hardness (reduction from concurrent open-shop scheduling). [Chowdhury, Zhong, and Stoica. Varys. In ACM SIGCOMM 2014] Thus, the best we can hope for = approximation algorithms.

• Lower bound: inapproximability within a factor of 2 − ε. [Bansal and Khot. Inapproximability of hypergraph vertex cover and applications to scheduling problems. In EATCS ICALP 2010.]

• Necessity of coordination: without coordination, a scheduler can be a factor Ω(√n) away from the optimal. [Chowdhury and Stoica. Efficient coflow scheduling without prior knowledge. In ACM SIGCOMM 2015]

Problem of min. avg CCT - Positive Algorithmic Results:

Lots of coflow schedulers proposed:

• Baraat [Dogar et al. In ACM SIGCOMM 2014]

• Varys [Chowdhury, Zhong, and Stoica. Efficient coflow scheduling with Varys. In ACM SIGCOMM 2014]

• Sincronia [Agarwal et al. Sincronia: near-optimal network design for coflows. In ACM SIGCOMM 2018]

• Best known approximation algorithm: 4-approximation [Agarwal, Rajakrishnan, Narayan, Agarwal, Shmoys, Vahdat. Sincronia: near-optimal network design for coflows. In ACM SIGCOMM 2018]

Algorithmic Fact 2

Key open challenges

• Better theoretical understanding

• Efficient solutions to deal with
- decentralization,
- more complex topologies,
- estimations over DAG,
- etc.

• Extensions to
- non-clairvoyant schedulers,
- other performance metrics, e.g. tail completion time, fairness.

• Co-designing routing along with scheduling of coflows.

[Coflow Recent Advances and What’s Next? M. Chowdhury, DIMACS 2017]

Key open challenges

• Better theoretical understanding
- Gap between lower and upper bounds: 2 − ε vs 4-approx.
- Improved competitive ratio for online coflow scheduling (best known: 12).

Key open challenges

• Coordination is necessary to determine in real time
- Coflow size (sum);
- Coflow rates (max);
- Partial order of coflows (ordering);

• Can be a large source of overhead
- Does not have much impact for large coflows in slow networks, but ...

• How to perform decentralized coflow scheduling?
- Some centralization is necessary, with a strong lower bound of Ω(√n)
- But which “amount of coordination” is needed is unclear
- e.g. Sincronia does not need per-flow rate adaptation

Key open challenges

• Schedule a DAG of coflows

• Consider both network and server resources (cores)

-> Introduction of a new theoretical framework

A new scheduling framework

• Goal: schedule workflows while taking into account the limited communication bandwidth

• 2 kinds of tasks:
- CPU tasks: to be executed by servers
- Network tasks: to be executed by network machines

• Network tasks may or may not be executed depending on the placement of the CPU tasks
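
The dependence of network tasks on the placement can be sketched in a few lines of Python: a dependency between two CPU tasks gives rise to a network task only when the two tasks are placed on different servers. The function name `network_tasks`, the task names and the placement are assumptions made for the illustration, not the formal model of the cited paper.

```python
def network_tasks(edges, placement):
    """Return the dependencies (u, v) that require a network task.

    `edges` lists the CPU-task dependencies; `placement` maps task -> server.
    """
    return [(u, v) for (u, v) in edges if placement[u] != placement[v]]


# Hypothetical diamond-shaped job placed on two servers P1 and P2.
edges = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]
placement = {"t1": "P1", "t2": "P1", "t3": "P2", "t4": "P1"}
print(network_tasks(edges, placement))

# Placing every task on the same server removes all network tasks.
print(network_tasks(edges, {t: "P1" for t in placement}))
```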

A simple example with 2 Servers and 1 Network Machine

[Figure: dependency digraph of a job with 9 CPU tasks; 2 servers (P1 and P2) and the network; the workflow with network tasks; a possible schedule.]

Page 138: Overview of Networking Challenges for the Placement of ... · computing solutions (e.g., MapReduce, Dryad, CIEL, and Spark) • Traditional scheduling consider properties of ... can

Modeling Data Center Networks

• Simple Networks (machines connected via a bus or via an antenna)

- one network machine per channel

• Data Center Networks

- key property: large bisection bandwidth (full for VL2 and Fat Trees) [Chen et al. JPDC 2016]

- only border links (i.e., links between the servers and the ToR switches) have to be taken into account

Modeling Data Center Networks

• Only inter-rack bandwidth to be modeled

• 2 network machines per link:

- one for upload

- one for download

• Network transfer between M1 and M13

- job in download machine of M1

- job in upload machine of M13

[Figure: upload and download network machines $N^u_1$, $N^d_1$ and $N^u_{13}$, $N^d_{13}$.]
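The upload/download model can be sketched in a few lines of Python. The function name `transfer_jobs` and the tuple encoding are hypothetical. The sketch assumes that a transfer creates one job on the upload machine of the sender and one job on the download machine of the receiver, as in the M13 to M1 example.

```python
from typing import List, Tuple

def transfer_jobs(sender: int, receiver: int) -> List[Tuple[str, int]]:
    """Network jobs created by one transfer from `sender` to `receiver`.

    Assumption: each machine has one upload and one download network
    machine; a transfer occupies the upload machine of the sender and
    the download machine of the receiver.
    """
    if sender == receiver:
        return []  # local data: no network job
    return [("upload", sender), ("download", receiver)]

# Data sent by M13 and received by M1.
jobs = transfer_jobs(sender=13, receiver=1)
```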

Modeling Data Center Networks

• Simple Networks (machines connected via a bus or via an antenna)

• Data Center Networks

- modeling border links

• More general networks (without full bisection bandwidth) lead to an $O(m \log m)$-approximation, with $C$ the minimum network multicut [Garg et al. STOC 1993]

Contributions

1. Introduction of a new scheduling framework to model communication delays when tasks are competing for a limited network bandwidth.

2. Show how to schedule data center jobs while routing their communications

3. Hardness results for the SCHEDULING WITH NETWORK TASKS problem

4. Two efficient scheduling algorithms, G-LIST and PARTITION

5. Extensive evaluation using workflows based on a Google trace [Reiss et al. White paper]

Two Efficient Algorithms

• G-LIST: greedy algorithm

- Generalization of the List Scheduling algorithm

- Idea: place a task where most of the data it needs is located, and only if all the needed network tasks can be done

- Theorem: G-LIST is optimal on simple MapReduce workflows.

• PARTITION: a 2-phase algorithm

1. assign the tasks to machines while minimizing the CPU and the networking work

2. compute a schedule for the tasks
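The placement idea behind G-LIST can be illustrated with a small Python sketch. It is not the authors' algorithm: it only shows the rule "go where most of the needed data is, if the network tasks can be done". The function `pick_machine` and the predicate `feasible` are hypothetical names, and each predecessor is assumed to produce one unit of data.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Optional

def pick_machine(task: str,
                 preds: Dict[str, List[str]],
                 placement: Dict[str, str],
                 machines: Iterable[str],
                 feasible: Callable[[str, str], bool]) -> Optional[str]:
    """Choose the feasible machine holding the most input data of `task`.

    Assumptions: each predecessor contributes one unit of data, and
    `feasible(task, machine)` tells whether all needed network tasks
    can be done if `task` runs on `machine`.
    """
    local = Counter(placement[p] for p in preds.get(task, []))
    candidates = [m for m in machines if feasible(task, m)]
    if not candidates:
        return None  # the task has to wait
    # max() keeps the first candidate in case of ties.
    return max(candidates, key=lambda m: local[m])

preds = {"reduce": ["map1", "map2", "map3"]}
placement = {"map1": "P1", "map2": "P2", "map3": "P2"}

# Two of the three inputs are on P2, so P2 is preferred.
choice = pick_machine("reduce", preds, placement, ["P1", "P2"],
                      feasible=lambda t, m: True)

# If the network tasks towards P2 cannot be done, fall back to P1.
blocked = pick_machine("reduce", preds, placement, ["P1", "P2"],
                       feasible=lambda t, m: m != "P2")
```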

PARTITION: 2-phase approach

• Phase 1: Distribute the tasks among the machines while minimizing communications

• Phase 2: Once the tasks are placed, schedule them while minimizing the makespan

[Figure: tasks assigned to machines M1 and M2.]

PARTITION: Phase 1

• Based on the k-balanced graph partitioning problem:

Goal: Partition vertices of input graph G into k equally sized components, while minimizing the total weight of the border edges

Known results: $O(\sqrt{\log n \log k})$-approximation algorithm [Krauthgamer et al. SODA 2009]

Beware! The best solution does not necessarily use the largest number of machines

PARTITION: Phase 1

Principle of the PARTITION-ASSIGN algorithm:

1. Choose a number of machines, k.

2. Solve a k-balanced partitioning problem

3. Repeat for every k with 1 ≤ k ≤ m

Beware! The best solution does not necessarily use the largest number of machines
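The loop over k can be sketched as follows. This is only an illustration under stated assumptions: `chunk_partition` is a placeholder for a real k-balanced partitioning algorithm, tasks have unit CPU cost, and the cost used to compare the values of k (the maximum of the largest CPU load and the total cut weight) is a hypothetical proxy, not the objective used in the talk.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, int]  # (u, v, weight = amount of data)

def chunk_partition(tasks: List[str], k: int) -> Dict[str, int]:
    """Placeholder partitioner: k contiguous chunks of near-equal size."""
    n = len(tasks)
    return {t: i * k // n for i, t in enumerate(tasks)}

def cut_weight(edges: List[Edge], part: Dict[str, int]) -> int:
    """Total weight of the edges whose endpoints are in different parts."""
    return sum(w for u, v, w in edges if part[u] != part[v])

def partition_assign(tasks: List[str], edges: List[Edge],
                     m: int) -> Optional[Tuple[int, int, Dict[str, int]]]:
    """Try every k in 1..m and keep the cheapest (cost, k, partition)."""
    best = None
    for k in range(1, min(m, len(tasks)) + 1):
        part = chunk_partition(tasks, k)
        cpu = max(sum(1 for t in tasks if part[t] == p) for p in range(k))
        cost = max(cpu, cut_weight(edges, part))  # hypothetical proxy
        if best is None or cost < best[0]:
            best = (cost, k, part)
    return best

# A chain a -> b -> c -> d with unit data on every edge, 4 machines.
tasks = ["a", "b", "c", "d"]
edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1)]
cost, k, part = partition_assign(tasks, edges, m=4)
```

On this toy chain the proxy is minimized with k = 2 rather than k = 4: using all machines lowers the CPU load but raises the amount of data crossing the network.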

PARTITION: Phase 2

- SCHEDULING WHEN PLACED problem.

- Results:

- Hardness: NP-complete and not approximable within a factor 5/4 (reduction from 3SAT)

- Approximation algorithm, PARTITION-SCHEDULE.

[Figure: tasks placed on machines M1 and M2.]
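To make the second phase concrete, here is a minimal greedy schedule for tasks whose machine is already fixed. It is a plain list-scheduling heuristic written for illustration, not the PARTITION-SCHEDULE algorithm of the talk. The names are hypothetical, and network tasks are simply treated as tasks assigned to a network machine.

```python
from typing import Dict, List, Tuple

def schedule_when_placed(order: List[str],
                         duration: Dict[str, int],
                         machine: Dict[str, str],
                         preds: Dict[str, List[str]]
                         ) -> Tuple[Dict[str, int], int]:
    """Greedy schedule for tasks whose machine is already fixed.

    `order` must be a topological order of the dependency digraph.
    Each machine (server or network machine) runs one task at a time.
    Returns the start times and the makespan.
    """
    free_at: Dict[str, int] = {}  # time at which each machine becomes idle
    finish: Dict[str, int] = {}
    start: Dict[str, int] = {}
    for t in order:
        ready = max((finish[p] for p in preds.get(t, [])), default=0)
        start[t] = max(ready, free_at.get(machine[t], 0))
        finish[t] = start[t] + duration[t]
        free_at[machine[t]] = finish[t]
    return start, max(finish.values(), default=0)

# a and c run on M1, b on M2; n is the network task carrying b's output to c.
start, makespan = schedule_when_placed(
    order=["a", "b", "n", "c"],
    duration={"a": 2, "b": 1, "n": 1, "c": 1},
    machine={"a": "M1", "b": "M2", "n": "N", "c": "M1"},
    preds={"n": ["b"], "c": ["a", "n"]},
)
```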

Network tasks: Conclusion

• Proposal of a new framework to model the orchestration of tasks in a datacenter for scenarios in which the network bandwidth is a limiting resource.

• Two algorithms to solve the problem, for which we derive some theoretical guarantees.

• Demonstration of the effectiveness of our algorithms using datasets built using statistics from Google data center traces.

Network tasks: Conclusion

A lot of open questions:

- Main one: inapproximability of the general problem?

Reminder: Without network, scheduling with a dependency digraph is not approximable within a factor better than 4/3 [Lenstra, Rinnooy Kan 78], and a 2-approximation is known.

Goal: With network, find an approximation algorithm or an inapproximability result (with a constant > 4/3 or a log factor)

-> Study of variants of k-balanced partition.
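The reminder can be written in standard notation. The symbols are introduced here for illustration and do not appear on the slides: $C_{\max}$ is the makespan of a schedule, $C_{\max}^{*}$ the optimal makespan and $m$ the number of machines. The upper bound assumes that the 2-approximation mentioned above is Graham's list scheduling.

```latex
% Upper bound: list scheduling (LS) on m identical machines
% with a dependency digraph.
\[
  C_{\max}^{\mathrm{LS}} \;\le\; \Bigl(2 - \frac{1}{m}\Bigr)\, C_{\max}^{*}
\]
% Lower bound [Lenstra, Rinnooy Kan 78]: unless P = NP,
% no polynomial-time algorithm A guarantees
\[
  C_{\max}^{A} \;\le\; \rho \, C_{\max}^{*} \quad \text{with } \rho < \tfrac{4}{3}.
\]
```

The open question is where the problem with network tasks sits relative to these two bounds.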

Conclusion

A lot of open questions:

- On the practical side: study the behavior of the algorithms on a testbed, and compare them with practical solutions proposed for data centers.

Outline

1. Motivation

2. A new situation: SDN and NFV

3. Placement of virtual network functions

‣ Use case: Service Function Chaining

4. Coflows for datacenters

5. Scheduling with network tasks

6. Tools to evaluate solutions

7. What next?

Validating solutions

• Theoretical results (explain the main parameter dependencies, but hypotheses are often too simplistic)

• Simulations (represent more complex phenomena, but low fidelity to real networks and an implementation different from the actual application)

• Emulations (fast and scalable, can run the actual application, can interact with a live environment)

• Experimentations (wide-area implementation not always possible, too few nodes may be available, not reproducible)

• Most used tool for SDN/NFV networks: Mininet.

Mininet

Mininet Limitations

• Mininet provides a flexible and cost-efficient experimental platform to evaluate SDN applications.

• But it has several limitations:

- resource limits (CPU, bandwidth) if experiments are run on a single host

- no strong notion of virtual time (timing measurements are based on the system clock)

• When the physical host is overloaded, Mininet

- may return wrong results, or

- may not be able to run the experiments

Need to overcome Mininet's limitations and increase the performance fidelity of network experiments

Distributed Network Emulation

Solution: distribute the load for resource-intensive experiments.

Existing tools:

• Mininet Cluster Edition [1]

• Maxinet [Wette et al. IFIP Networking 2014]

[1] https://github.com/mininet/mininet/wiki/Cluster-Edition-Prototype

A new tool: Distrinet

• Limitations of existing tools:

- No performance guarantees

- New API

• Distrinet (work in progress) [2]

+ Fully compatible with the Mininet API.

+ Automatic deployment in private infrastructures (Linux machines and Grid5000) or public cloud (AWS).

+ Some guarantees that resource requirements (e.g. cores, memory, network) are satisfied.

+ Minimization of resource utilization for private infrastructures and of costs for public cloud.

[2] Di Lena, Tomassilli, Saucez, Giroire, Turletti and Lac. Mininet on steroids: exploiting the cloud for Mininet performance. IEEE CloudNet 2019
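The idea of satisfying resource requirements while using few hosts can be illustrated with a classical bin-packing heuristic. The sketch below is not Distrinet's placement algorithm: it is first-fit decreasing on a single resource (cores), with hypothetical names, and it assumes identical physical hosts.

```python
from typing import Dict, List, Optional

def first_fit_decreasing(demands: Dict[str, int],
                         capacity: int) -> Optional[List[List[str]]]:
    """Pack virtual nodes onto identical hosts, opening hosts only when needed.

    `demands` maps each virtual node to the number of cores it requires.
    No host ever receives more than `capacity` cores of demand.
    Returns None if a single node does not fit on any host.
    """
    hosts: List[List[str]] = []
    free: List[int] = []
    for node in sorted(demands, key=demands.get, reverse=True):
        need = demands[node]
        if need > capacity:
            return None
        for i, room in enumerate(free):
            if need <= room:
                hosts[i].append(node)
                free[i] -= need
                break
        else:  # no open host has enough room: start a new one
            hosts.append([node])
            free.append(capacity - need)
    return hosts

# One virtual switch and four virtual hosts, physical hosts with 6 cores.
placement = first_fit_decreasing(
    {"s1": 4, "h1": 2, "h2": 2, "h3": 3, "h4": 1}, capacity=6)
```

Here the 12 requested cores fit exactly on two physical hosts, and no host is oversubscribed.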

Challenge 1

• Lots of open algorithmic problems

- For SFCs

- For coflows

- For variants of scheduling

Challenge 2

• Scheduling beyond the cloud

- Fog Computing and Mobile Edge Computing

Fog/Edge Computing

• PROBLEM: Interactive applications require ultra-low network latencies (< 20 ms), but latencies to the closest data center are 20-30 ms using wired networks, and up to 50-150 ms on 4G mobile networks

• SOLUTION: Exploit distributed and near-edge computation:

- reduce latency and network traffic

- improve power consumption

- increase scalability and availability

FOG COMPUTING: analyze most IoT data near the devices that produce and act on that data

Fog/Edge Computing - Challenges

• Computing and networking resources are:

- heterogeneous

- not always available

• Services cannot be processed everywhere

• Demands and resources are dynamic

How to assign the IoT applications to computing nodes (fog nodes) which are distributed in a Fog environment?
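A toy version of this assignment question can be sketched in Python. It is a simple greedy heuristic written for illustration, not a method from the talk. All names and numbers are hypothetical, and only two constraints are modeled: the CPU capacity of each node and the maximum latency tolerated by each application.

```python
from typing import Dict, Optional, Tuple

def assign_apps(apps: Dict[str, Tuple[int, float]],
                nodes: Dict[str, Tuple[int, float]]) -> Dict[str, Optional[str]]:
    """Greedy assignment of applications to fog nodes.

    apps:  name -> (CPU demand, maximum tolerated latency in ms)
    nodes: name -> (CPU capacity, latency to the devices in ms)
    An application goes to the lowest-latency node that still has enough
    CPU and meets its latency bound; None means that no node qualifies.
    """
    free = {n: cap for n, (cap, _) in nodes.items()}
    by_latency = sorted(nodes, key=lambda n: nodes[n][1])
    result: Dict[str, Optional[str]] = {}
    # Serve the most latency-sensitive applications first.
    for app in sorted(apps, key=lambda a: apps[a][1]):
        cpu, max_lat = apps[app]
        result[app] = None
        for n in by_latency:
            if nodes[n][1] <= max_lat and free[n] >= cpu:
                result[app] = n
                free[n] -= cpu
                break
    return result

nodes = {"edge": (2, 5.0), "fog": (4, 15.0), "cloud": (100, 30.0)}
apps = {"ar": (2, 20.0), "video": (2, 20.0),
        "batch": (8, 200.0), "haptic": (1, 2.0)}
assignment = assign_apps(apps, nodes)
```

A static greedy pass like this ignores the dynamic demands and intermittent availability listed above, which is part of what makes the real problem hard.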

Mobile Edge Computing

• IDEA: Offloading to improve latency and alleviate congestion in the core -> Push the content (application servers) close to the users using MEC servers (small datacenters collocated with the base stations) in the infrastructure close to the edge of the network

• PROBLEM: assign users, applications, and shares of traffic to the MEC servers

• Constraints:

- mobile traffic depends on time and locality

- geographical constraints

- mobility of the users

- budget

Challenge 3

• Getting realistic scenarios with

- data (application and networks)

- architecture

THANKS FOR YOUR ATTENTION!

