
DIRECT: Distributed Cross-Domain Resource Orchestration in Cellular Edge Computing

Presenter: Tao Han, Assistant Professor

Electrical and Computing Engineering Department, The University of North Carolina at Charlotte

[email protected]


Qiang Liu and Tao Han

Outline

❖ Introduction

❖Motivation & Challenges

❖System Model & Algorithm

❖System Design & Implementation

❖Performance Evaluation

* https://www.bloorresearch.com/technology/5g-iot-and-edge-computing/

* http://www.nxtview.com/?p=1230

Currently in the U.S. there are 8 networked devices per person, expected to reach 13.6 per person by 2022*

Rapid Increase in Connected Devices

* https://www.ericsson.com/en/mobility-report/reports/november-2018/mobile-data-traffic-growth-outlook

6x Mobile Traffic Increase in North America Until 2024

Substantial growth of Mobile Traffic

* https://www.independent.ie/business/technology/news/the-need-for-speed-is-ireland-ready-for-5g-the-next-big-thing-in-cellular-technology-36629260.html

5G aims to connect “everything”, e.g., Phone, Vehicles, IoT, Machine, Health*

The 5G Era

* https://networks.nokia.com/5g/get-ready

Accommodating heterogeneous services is challenging since they have extremely diverse performance requirements, e.g., low-latency, high-reliability, low-energy*

Heterogeneous Services

* Guan, W., Wen, X., Wang, L., Lu, Z. and Shen, Y., 2018. A service-oriented deployment policy of end-to-end network slicing based on complex network theory. IEEE Access, 6, pp.19691-19701.

Network slicing enables operators to create logical networks (slices) over the physical infrastructure. Slices are tailored to support various services, which improves revenue and reduces operation costs.

Network Slicing


The main objective of network slicing is to efficiently utilize the physical infrastructures

to serve the slices which are managed individually by slice tenants (functional isolation).

Motivation

FlexRAN: A flexible and programmable platform for software-defined radio access networks
➔ Efficient Radio Access Network (RAN) Sharing Platform

Orion: RAN slicing for a flexible and cost-effective multi-service mobile network architecture
➔ Efficient Network Slicing in Radio Access Network (MAC layer)

How Should I Slice My Network?: A Multi-Service Empirical Evaluation of Resource Sharing Efficiency
➔ Introduce “Slicing Efficiency” to Evaluate the Resource Sharing Efficiency

Overbooking network slices through yield-driven end-to-end orchestration
➔ End-to-End Slicing Platform, Improve Revenue by Slices Overbooking

The resource demands of slices are assumed to be known, which is not always TRUE!

Challenges

Unknown performance model of slices due to functional isolation

❖ Tenants can customize their own operation strategies on their slices (user scheduling, prioritization, etc.)

❖ Slice performance depends on resources from multiple domains (radio, transport, computing, etc.)

❖ Slice traffic in the network is dynamic (temporal, spatial, etc.)

Fast response on network traffic dynamics

Determine multiple domain resource correlations

Adapt to different network scenarios


Our Target: Cross-Domain Resource Orchestration for Network Slicing

Our Solutions

❖ Utilize a Gaussian Process (GP) to learn the resource demands of each individual slice

❖ Derive a “predictive gradient” from the GP model and use the gradient descent method to orchestrate resources to slices

❖ Adopt the ADMM method to coordinate the resource usage of slices across the network to meet SLAs

Resource Orchestration

GP is data-efficient and fast

“gradient” descent is effective

Derivation-based coordination in network


The set of resources orchestrated to the ith slice on jth edge node*

System Model

*Edge node: a logical network unit composed of a cellular base station and a certain amount of computing resources

The utility (unknown) of the ith slice on jth edge node

Assumption: Each slice periodically reports its utility to the orchestrator, which indicates the performance of the last resource orchestration

Radio Access Network Edge Servers

The objective function is:


Problem Statement

Maximize the system utility under the total payment of slices:

The constraints of resources in edge nodes

The constraints of slice total payments
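
The formulas on this slide did not survive extraction. As a reading aid only, a problem of this shape could be written as follows. Every symbol here is an assumption introduced for illustration, not the authors' notation: $\mathcal{I}$ is the set of slices, $\mathcal{J}$ the set of edge nodes, $\mathcal{K}$ the set of resource types, $x_{ijk}$ the amount of resource $k$ orchestrated to slice $i$ on edge node $j$, $U_{ij}$ the unknown utility, $R_{jk}$ the capacity of resource $k$ at edge node $j$, $p_k$ a unit price, and $P_i$ the total payment of slice $i$.

```latex
\begin{aligned}
\max_{x \ge 0} \quad & \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}} U_{ij}(x_{ij}) \\
\text{s.t.} \quad & \sum_{i \in \mathcal{I}} x_{ijk} \le R_{jk}, \qquad \forall j \in \mathcal{J},\ k \in \mathcal{K} \\
& \sum_{j \in \mathcal{J}} \sum_{k \in \mathcal{K}} p_k \, x_{ijk} \le P_i, \qquad \forall i \in \mathcal{I}
\end{aligned}
```

Under this reading, the resource constraints are local to each edge node, while the payment constraints sum over edge nodes and therefore couple them.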

The problem falls into the realm of black-box optimization, e.g., pattern search, Bayesian optimization, etc. ➔ Time consuming, lack of scalability

Algorithm Design

Let’s think about this situation: what would we do if we knew the utility function of slices?

Step 1: decompose the problem w.r.t. edge nodes with the ADMM method (constraint 1 is the only coupling constraint)

Step 2: use gradient-based descent methods to optimize the resource orchestration of slices at each edge node individually (utility function is known)

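[Figure: the problem over $\sum_{j \in J} f_j(x_j)$ is rewritten in the equivalent form $\sum_{j \in J} f_j(x_j) - g(x, z)$ and split into per-edge-node subproblems such as $f_1(x_1) - g(x, z)$. Each subproblem is updated with a step size along the gradient, which involves $\nabla g(x, z)$.]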
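
The two steps above can be illustrated with a toy problem in which the utilities are known. This is a minimal sketch, not the authors' algorithm: the utility `a[j] * log(1 + x[j])`, the single shared budget, and all function names are assumptions chosen for illustration. The coupling constraint is moved to an auxiliary variable `z` (controller side), and each edge node improves its own `x[j]` by plain gradient steps.

```python
import numpy as np

def project_budget(v, budget):
    """Euclidean projection of v onto {z >= 0, sum(z) <= budget}."""
    w = np.maximum(v, 0.0)
    if w.sum() <= budget:
        return w
    # Budget is exceeded: project onto the simplex {z >= 0, sum(z) = budget}.
    s = np.sort(v)[::-1]
    css = np.cumsum(s) - budget
    k = np.arange(1, len(v) + 1)
    valid = s - css / k > 0
    theta = css[valid][-1] / k[valid][-1]
    return np.maximum(v - theta, 0.0)

def admm_orchestrate(a, budget, rho=1.0, outer=300, inner=50, step=0.2):
    """Maximize sum_j a[j]*log(1 + x[j]) s.t. sum(x) <= budget, x >= 0.

    The utility is a hypothetical stand-in for a known slice utility.
    The step size is tuned for this toy instance only.
    """
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)  # per-edge-node allocations
    z = np.zeros_like(a)  # auxiliary copy that carries the coupling constraint
    u = np.zeros_like(a)  # scaled dual variables
    for _ in range(outer):
        # Edge-node side: every entry of x is updated independently by
        # projected gradient descent on -utility + (rho/2)*(x - z + u)^2.
        for _ in range(inner):
            grad = -a / (1.0 + x) + rho * (x - z + u)
            x = np.maximum(x - step * grad, 0.0)
        # Controller side: enforce the budget on z, then update the duals.
        z = project_budget(x + u, budget)
        u = u + x - z
    return x, z

x, z = admm_orchestrate([1.0, 2.0, 3.0], budget=6.0)
print(np.round(x, 3), round(float(z.sum()), 3))
```

For this instance the KKT conditions give the optimum x = (0.5, 2, 3.5), which the iterates should approach. The point of the split is that only `z` and `u` need network-wide information; the inner loop touches one edge node at a time.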


Algorithm Design

Then it becomes: how can we obtain the exact gradients of utility function?

Answer: WE CANNOT! But we can estimate them with a Gaussian Process (fast, data-efficient).

Gaussian Process (GP): constructs a probabilistic model, regresses the target unknown function.

GP produces a distribution of predicted output for any individual input.

Input (X1, X2, …)    Output (Y)
<3,2>                <13>
<1,3>                <10>
<1,1>                <2>
<3,0>                <9>
…                    …
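
To make the regression step concrete, the sketch below fits a GP to the four input-output pairs in the table and returns a predictive mean and variance for new inputs. It is an illustration under assumptions: the RBF kernel, its hyperparameters, the noise level, and the function names are not taken from the slides.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.5, signal_var=25.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X, y, X_new, noise_var=1e-6):
    """Posterior mean and variance of a GP with a constant mean of y.mean()."""
    y_mean = y.mean()
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    K_s = rbf_kernel(X_new, X)
    alpha = np.linalg.solve(K, y - y_mean)
    mean = y_mean + K_s @ alpha
    v = np.linalg.solve(K, K_s.T)
    var = rbf_kernel(X_new, X_new).diagonal() - np.sum(K_s * v.T, axis=1)
    return mean, np.maximum(var, 0.0)

# Observations from the table: inputs (X1, X2) and outputs Y.
X = np.array([[3.0, 2.0], [1.0, 3.0], [1.0, 1.0], [3.0, 0.0]])
y = np.array([13.0, 10.0, 2.0, 9.0])

# One query at an observed input, one at an unobserved input.
mean, var = gp_predict(X, y, np.array([[3.0, 2.0], [2.0, 2.0]]))
print(mean, var)
```

At the observed input the prediction reproduces the stored output with near-zero variance, while the unobserved input gets a larger variance. That uncertainty is what makes the output a distribution rather than a single value.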


Algorithm Design

The accuracy of the GP prediction increases with more input-output data pairs in the database.

1-dim function example: Gaussian Process Regression after 3 steps


Algorithm Design

Predictive Gradient: [equation not recoverable; one component per resource: resource 1, resource 2, ..., resource K]

An example of predictive gradient w.r.t. single resource:

[Figure: two panels, Point 1 and Point 2; legend: predicted curve, real curve, observed point, predicted point, predictive gradient]
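
The slide's formula is lost, so the sketch below shows one common way to obtain such a gradient: differentiate the GP posterior mean with respect to the query input. It assumes an RBF kernel with fixed hyperparameters; the kernel choice, the constants, and the function names are assumptions rather than the authors' definitions. The sample data reuse the example input-output pairs from the earlier slide.

```python
import numpy as np

LENGTH_SCALE = 1.5   # assumed kernel length scale
SIGNAL_VAR = 25.0    # assumed kernel signal variance

def rbf(A, B):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return SIGNAL_VAR * np.exp(-0.5 * d2 / LENGTH_SCALE**2)

def fit(X, y, noise_var=1e-6):
    """Weights alpha = K^{-1} (y - mean(y)) used by the posterior mean."""
    K = rbf(X, X) + noise_var * np.eye(len(X))
    return np.linalg.solve(K, y - y.mean())

def predict_mean(x, X, y, alpha):
    """GP posterior mean at a single input x."""
    return y.mean() + (rbf(x[None, :], X) @ alpha)[0]

def predictive_gradient(x, X, alpha):
    """Gradient of the posterior mean at x, one component per resource."""
    k = rbf(x[None, :], X)[0]
    # d k(x, x_i) / dx = k(x, x_i) * (x_i - x) / length_scale^2
    return ((X - x) * (k * alpha)[:, None]).sum(axis=0) / LENGTH_SCALE**2

X = np.array([[3.0, 2.0], [1.0, 3.0], [1.0, 1.0], [3.0, 0.0]])
y = np.array([13.0, 10.0, 2.0, 9.0])
alpha = fit(X, y)

x0 = np.array([2.0, 2.0])
grad = predictive_gradient(x0, X, alpha)
x1 = x0 + 0.1 * grad  # one ascent step along the predicted gradient
print(grad, x1)
```

Because the gradient comes from the model rather than from the slice itself, each step costs no extra interaction with the slice. The model is then refreshed whenever a new utility report arrives.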


Algorithm Overview

Controller side: updates the dual variables and optimizes the auxiliary variable Z (a convex problem)

Edge node side: optimizes the resource allocation X with a Gaussian Process based proximal gradient descent method (predictive gradient).


DIRECT: System Overview

Slice Orchestrator

effectively and dynamically orchestrates virtual network resources to serve slices across the whole network

❖ DIRECT controller: coordinates the resource usage of slices across edge nodes (control-side algorithm)

❖ DIRECT agents in edge nodes: orchestrate resources to slices with predictive gradients (edge-side algorithm)

❖ The controller coordinates with agents by exchanging certain variables, e.g., X, Z, U, etc.


Resource Hypervisor

efficiently and dynamically virtualizes physical infrastructures and maps the virtual resource allocations of slices to physical resources

Radio Resource Hypervisor

Computing Resource Hypervisor


Network Slices

use virtual resources to serve their users individually

Resource Hypervisor: Radio

Methodology: virtualize the radio resource by managing the MAC layer user scheduling and resource allocation (physical resource blocks (PRBs) in LTE network).

Input: slices and their users, resource orchestration of slices, channel conditions

Output: mapping of all users to PRBs for both uplink and downlink

Difficulties:

1) tradeoff between isolation and efficiency;

2) unknown association between slices and users in the MAC layer (only RNTI available);

Resource Hypervisor: Radio

Minimum number of PRBs

Solution:

1) define virtual radio resource (RR) as wireless bandwidth, e.g., 540 kHz;

2) convert the RR of slice users into a number of PRBs;

3) find the association between slices and users by capturing IMSI information in S1AP messages;

4) formulate a users-to-PRBs mapping problem;

5) solve the problem with a heuristic algorithm.

Idea of the heuristic algorithm: select the user with the best channel condition for each PRB, while complying with uplink frequency-contiguous PRB allocation.
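
Steps 2 and 5 can be sketched as follows. This is a simplified illustration, not the hypervisor's actual code. It relies on the LTE fact that one PRB spans 180 kHz (12 subcarriers of 15 kHz), so 540 kHz corresponds to 3 PRBs. The function names, the per-PRB channel quality values, and the greedy tie-breaking are assumptions, and the uplink contiguity requirement is deliberately left out.

```python
import math

PRB_BANDWIDTH_KHZ = 180  # one LTE PRB: 12 subcarriers x 15 kHz

def prbs_needed(rr_khz):
    """Minimum number of PRBs that covers a virtual radio resource in kHz."""
    return math.ceil(rr_khz / PRB_BANDWIDTH_KHZ)

def map_users_to_prbs(quota, channel):
    """Greedy mapping: give each PRB to the best-channel user with quota left.

    quota:   dict user -> number of PRBs the user is entitled to
    channel: dict user -> list of channel quality values, one per PRB
    Returns a list with one entry per PRB (a user id, or None if unused).
    """
    n_prbs = len(next(iter(channel.values())))
    remaining = dict(quota)
    mapping = [None] * n_prbs
    for prb in range(n_prbs):
        candidates = [u for u, left in remaining.items() if left > 0]
        if not candidates:
            break  # every user already holds its quota
        best = max(candidates, key=lambda u: channel[u][prb])
        mapping[prb] = best
        remaining[best] -= 1
    return mapping

# Hypothetical users: "a" is granted 540 kHz, "b" is granted 360 kHz.
quota = {"a": prbs_needed(540), "b": prbs_needed(360)}
channel = {
    "a": [0.9, 0.2, 0.8, 0.7, 0.1, 0.5],
    "b": [0.3, 0.6, 0.4, 0.9, 0.7, 0.2],
}
mapping = map_users_to_prbs(quota, channel)
print(quota, mapping)
```

The quota enforces isolation between users, while the per-PRB choice recovers some efficiency from channel diversity. Handling the uplink would additionally require each user's PRBs to be adjacent in frequency.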


Resource Hypervisor: Computing

Difficulties: no open-source CUDA kernel management platform; no operations are allowed once kernels are dispatched to the GPU side, except interruption

Methodology: virtualize the GPU computing resource by managing the dispatch of kernel functions (token-based mechanism)

[Figure: a kernel function in CUDA programming, annotated with its name, required threads, and parameters. The CPU side calls kernels asynchronously (Kernel 1 (10k threads), Kernel 2 (1k threads), ..., Kernel N); the GPU side executes the kernels serially.]

*Multi-Process Service (MPS) enables multiple applications/processes to share the GPU resources. Here, we consider no concurrent kernel execution/streams.

Resource Hypervisor: Computing

The kernel of a user is dispatched only if its token is met. The token is updated as:


[Figure: tokens are generated for each user and consumed by running kernels (tracked by cudaEvent). Kernels wait in a command queue and are dispatched against the token; the execution status of kernels is tracked using cudaEventsync.]
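
The token update formula itself is missing from the slide, so the following CPU-only simulation shows one plausible token rule rather than the authors' mechanism. Assumptions, all hypothetical: tokens are measured in milliseconds of GPU time, each user earns tokens at a rate equal to its allocated share of elapsed time, a kernel is dispatched only while its user holds a positive token balance, and the kernel's execution time is deducted afterwards. In a real system that time would be measured on the GPU, for example with CUDA events.

```python
from collections import deque

class TokenDispatcher:
    """Serial kernel dispatcher that enforces per-user GPU time shares."""

    def __init__(self, shares):
        self.shares = dict(shares)                  # user -> fraction of GPU time
        self.tokens = {u: 0.0 for u in shares}      # user -> credit in ms
        self.queues = {u: deque() for u in shares}  # user -> pending kernel times

    def submit(self, user, kernel_ms):
        """Queue a kernel; kernel_ms stands in for its measured run time."""
        self.queues[user].append(kernel_ms)

    def run(self, total_ms, idle_tick_ms=1.0):
        """Simulate total_ms of wall time and return GPU time used per user."""
        busy = {u: 0.0 for u in self.shares}
        now = 0.0
        while now < total_ms:
            ready = [u for u, q in self.queues.items() if q and self.tokens[u] > 0]
            if ready:
                # Dispatch for the user holding the most credit.
                user = max(ready, key=lambda u: self.tokens[u])
                elapsed = self.queues[user].popleft()
                self.tokens[user] -= elapsed  # token consumed by the kernel
                busy[user] += elapsed
            else:
                elapsed = idle_tick_ms        # nobody is eligible, GPU idles
            for u in self.tokens:
                self.tokens[u] += self.shares[u] * elapsed  # token generated
            now += elapsed
        return busy

dispatcher = TokenDispatcher({"a": 0.75, "b": 0.25})
for _ in range(400):
    dispatcher.submit("a", 1.0)
    dispatcher.submit("b", 1.0)
busy = dispatcher.run(total_ms=400.0)
print(busy)
```

With both queues always full, the GPU time each user receives tracks its share (about 3:1 here). This works even though the dispatcher never interrupts a kernel once it has been launched.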


The DIRECT Protocol

➢ The DIRECT controller exchanges the control variables with all edge nodes;

➢ Each DIRECT agent in an edge node orchestrates the virtual resources to network slices;

➢ Each network slice orchestrates its virtual resources to users based on its own policy;

➢ The virtual resource allocations of all the admitted users are reported to the hypervisor;

➢ The hypervisor maps the virtual resources of users to physical resources;

➢ Slices report their utilities to the DIRECT agents, which report resource usage to the DIRECT controller;

➢ The loop continues until convergence.

The DIRECT protocol enables effective cross-domain resource orchestration in network slicing

System Implementation

Hardware Details:

❖ OpenAirInterface LTE Platform: 2x USRP B210 SDR boards, 2x eNodeB computers, 1x Core network

❖ CUDA GPU computing platform: 2x NVIDIA GTX 1080Ti, CUDA 8.0

❖ Mobile users: 4x Huawei dongle E2273, 4x Linux computers


Evaluation Setting

Utility: Utility of slice (unknown) is defined as the slice latency (summation of user latency).

Applications: based on YOLO object detection framework to emulate various resource demands of slices

❖ Mobile Augmented Reality (MAR):

❖ Video Analytics and Streaming (VAS):

[Figure: workflows of the two applications; labels: “request”, 1280x720, 416x416, medium model, 1280x720, “dog, bike”, 608x608, large model]

Algorithms Comparison:

Static: evenly shares all the resources among slices across all edge nodes;

Pswarm: a global optimization solver that replaces the GP-based Alg. 1 in each edge node;

TOMLAB-glcSolve: a global optimization solver that replaces the GP-based Alg. 1 in each edge node;

application MAR MAR VAS


Experimental Results

❖ DIRECT converges in several iterations.

❖ DIRECT reduces about 21% system latency as compared to Static.

❖ DIRECT agents learn to orchestrate resources to slices.

[*] Pswarm and TOMLAB need too many iterations, which makes them impractical in experiments

21%

61%


Experimental Results

❖ DIRECT coordinates the resource utilization among edge nodes to meet the total payment.

❖ DIRECT agents learn the resource demands of slices and orchestrate resources accordingly.

Learn the slice traffic on edges

Learn the resource demand of application

application MAR MAR VAS


Simulation Results

❖ DIRECT converges smoothly in simulation, consistent with the experiments.

❖ DIRECT agents learn the resource orchestration in fewer than 20 interactions with slices.

Simulation Results

❖ DIRECT scales better than the other algorithms.

❖ Optimization solvers could obtain similar performance in terms of system utility, but they are impractical in a real system because they need too many interactions.

Better scalability

Conclusion

❖ We presented DIRECT, a cross-domain resource virtualization and orchestration system, which orchestrates resources to slices under an unknown performance model of slices.

❖ DIRECT integrates an optimization method (ADMM) and a learning-assisted algorithm (GP-based).

❖ We designed cross-domain resource virtualization, i.e., the Radio and Computing Resource Hypervisors, to virtualize physical infrastructures.

❖ We implemented a DIRECT prototype, which is composed of an OpenAirInterface LTE system and a CUDA GPU computing platform.

❖ We evaluated the performance of DIRECT with both experiments and simulations. Results validate that DIRECT significantly outperforms the other baseline algorithms.

THANKS! Any questions?

Tao [email protected]