A Game-Theoretic Resource Manager for RT Applications

transcript

Martina Maggio, Enrico Bini, Georgios Chasparis, Karl-Erik Årzén

Lund University, Department of Automatic Control

Motivation

• Multiple applications sharing the same computing platform, especially in embedded systems

• Applications support multiple service levels with different execution requirements and quality of service

SL1: 640x480, CPU: 30%

SL2: 800x600, CPU: 60%

SL3: 1024x768, CPU: 90%

Outline

• Problem formulation

• The architecture

• Formal guarantees

• Implementation

• Experimental evaluation

• Conclusion and future work

Problem formulation

• Select:

the resource allocation

the service levels of the applications

to maximize the quality of the overall computation

The resource allocation naturally belongs to the resource management domain

The service level naturally belongs to the application domain

Existing approaches

• Centralized solution: the resource manager chooses both the resource allocation and the application service levels

[Figure: centralized solution; labels: sensor, virtual platform, OS; applications app1, app2; symbols v1, v2, ..., vn and f1, f2, ..., fn]

The resource manager solves an ILP problem, which has high time complexity

The resource manager compares service levels from different applications

The resource manager requires a lot of information from the applications

Our contribution

• Decoupling service level assignment and resource allocation

• Formal guarantees that the operating point is a good match between the resource assigned and the service level

• Implemented in Linux with SCHED_DEADLINE

The application selects the service level

The resource manager assigns resources

Low information exchange between the two entities and no unwanted comparisons

Assumptions

• Every application is made of jobs

every job has a deadline (its expected execution time) and a real execution time

The architecture

[Figure: the architecture; labels: service setting, service-aware, service-unaware; applications app1, app2; virtual platforms v1, v2, ..., vn; matching functions f1, f2, ..., fn]

The application has a service level si - known only by the application itself

The resource manager selects the virtual platform vi assigned within a CBS scheduler

A weight λi determines how the adaptation should be done and who should adjust more

λi = 0: the application does all the adaptation by changing its service level

λi = 1: the resource manager adjusts by setting the virtual platform

Both the application and the resource manager sense a matching function fi

The matching function

[Figure: plot of the matching function, with axis ticks from -5 to 2.5]

The matching is abundant: increase si or decrease vi

The matching is scarce: decrease si or increase vi

The matching is perfect: do nothing

• The purpose of the resource manager is to find the allocation where the matching functions of all the applications are as close as possible to zero

Both the application and the resource manager should be able to measure the matching function independently

• Defines how good the match is between the resource assigned to an application and its current service level:

$f_i = \beta_i \frac{v_i}{s_i} - 1$

increases with vi

decreases with si

$\beta_i$ is an application-dependent constant

Not measurable!

• In order to get a function that both the resource manager and the application can measure, we chose to use:

$f_i = \frac{\text{deadline}}{\text{response time}} - 1 = -\frac{\text{lateness}}{\text{response time}}$
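As an illustration, the measurable matching function could be computed from a job's deadline and its measured response time as in the minimal C sketch below. It assumes the form f = deadline / response time - 1, with lateness defined as response time minus deadline. The function names and the choice of seconds as the unit are assumptions made for this example and are not taken from the gtrm code.

```c
#include <assert.h>

/* Matching function of one job: f = deadline / response_time - 1.
 * Both arguments are durations expressed in the same unit (here: seconds).
 * f > 0: abundant matching, f < 0: scarce matching, f == 0: perfect matching. */
double matching_function(double deadline, double response_time)
{
    assert(deadline > 0.0 && response_time > 0.0);
    return deadline / response_time - 1.0;
}

/* Lateness of one job: positive when the job finishes after its deadline. */
double lateness(double deadline, double response_time)
{
    return response_time - deadline;
}
```

For a job with a 40 ms deadline that completes in 50 ms, the sketch returns about -0.2, which is also -lateness / response time; a job that completes exactly at its deadline yields 0.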

Virtual platform

• The resource manager is designed with a game-theoretic approach and changes the virtual platform assignments according to

$v_i = v_i - \varepsilon \left[ \lambda_i f_i - v_i \sum_j \lambda_j f_j \right]$
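One step of this update rule can be sketched in C as follows. This is a hedged illustration rather than the gtrm implementation: the function name, the array-based interface, and the constant step size `eps` are assumptions. The weighted sum is computed once and reused, so a step costs time linear in the number of applications.

```c
#include <assert.h>
#include <stddef.h>

/* One step of v_i = v_i - eps * (lambda_i * f_i - v_i * sum_j lambda_j * f_j).
 * v:      virtual platforms (updated in place)
 * lambda: per-application weights
 * f:      measured matching functions
 * n:      number of applications
 * eps:    step size (assumed constant here) */
void update_virtual_platforms(double *v, const double *lambda,
                              const double *f, size_t n, double eps)
{
    assert(eps > 0.0);

    /* First pass: weighted sum of all matching functions. */
    double weighted_sum = 0.0;
    for (size_t j = 0; j < n; j++)
        weighted_sum += lambda[j] * f[j];

    /* Second pass: apply the update to every virtual platform. */
    for (size_t i = 0; i < n; i++)
        v[i] -= eps * (lambda[i] * f[i] - v[i] * weighted_sum);
}
```

With two applications, v = {0.5, 0.5}, equal weights of 0.5, f = {-0.2, 0.2} and eps = 0.1, the application with scarce matching gains bandwidth (about 0.51) and the other one loses it (about 0.49). When the virtual platforms sum to one, a step leaves their sum unchanged.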

Formal guarantees

• The allocation converges to a stationary point where either the applications are performing reasonably well or they cannot reduce their service level any further

If a stationary point where all the matching functions are zero exists, it is reached

If several such points exist, the assignment will depend on the weights λi
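The following short derivation, based only on the update rule from the virtual platform slide, shows why a point where all matching functions are zero is stationary. It is a sketch and not the convergence proof; the summation index j is introduced here for readability.

```latex
% Update rule for the virtual platform of application i:
v_i \leftarrow v_i - \varepsilon \Big[ \lambda_i f_i - v_i \sum_j \lambda_j f_j \Big]

% A point is stationary when the bracket vanishes for every application i:
\lambda_i f_i = v_i \sum_j \lambda_j f_j \qquad \forall i

% If f_i = 0 for all i, both sides are zero, so the point is stationary.
% Summing the stationarity condition over i gives:
\Big( 1 - \sum_i v_i \Big) \sum_j \lambda_j f_j = 0
```

So at any stationary point either the virtual platforms sum to one or the weighted sum of the matching functions is zero.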

Implementation

• In TrueTime (RT kernel simulator)

• In Linux with SCHED_DEADLINE*, a Linux scheduling class implementing global EDF and CBS

* Juri Lelli, Giuseppe Lipari, Dario Faggioli, Tommaso Cucinotta. An efficient and scalable implementation of global EDF in Linux. International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT), Porto (Portugal), July 2011.

https://github.com/martinamaggio/gtrm

Communication

• The application communicates the start and stop time of each job via shared memory, which contains the values needed to compute the matching function

Shared memory communication is provided transparently with a library

https://github.com/martinamaggio/jobsignal
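To make the idea concrete, here is a minimal C sketch of a per-job record that an application could publish and from which either side could derive the matching function. The struct layout, field names, and nanosecond units are hypothetical; they are not the data layout of the jobsignal library. The sketch also assumes that the response time is the stop time minus the start time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-job record placed in shared memory by the application. */
struct job_record {
    uint64_t start_ns;    /* time at which the job started */
    uint64_t stop_ns;     /* time at which the job ended */
    uint64_t deadline_ns; /* relative deadline of the job */
};

/* Matching function derived from one record:
 * f = deadline / response_time - 1, with response_time = stop - start. */
double matching_from_record(const struct job_record *r)
{
    assert(r->stop_ns > r->start_ns);
    double response_time = (double)(r->stop_ns - r->start_ns);
    return (double)r->deadline_ns / response_time - 1.0;
}
```

Because both the application and the resource manager read the same record, they can measure the matching function independently without exchanging the service level itself.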

Application

one job consumes a_cpu si + b_cpu CPU and a_mem si + b_mem memory: wi = [a_cpu si + b_cpu, a_mem si + b_mem]

while (!finished) { /* jobs */
    signal_job_start();          /* write on shared memory */
    if (needed)                  /* e.g.: every 10 jobs */
        change_service_level();  /* si: based on performance function */
    do_work();                   /* consumes wi */
    signal_job_end();            /* write on shared memory */
}

Resource manager

for i in applications {
    read_matching_functions();    /* read from shared memory */
    compute_virtual_platforms();  /* vi = vi - ε [λi fi - vi Σj (λj fj)] */
    set_virtual_platforms();      /* set SCHED_DEADLINE budget */
    send_app_indications();
}

Convergence test

• The first experiment is a single core experiment to test the convergence of virtual platforms to their predicted values

λ1 = 0.1, λ2 = 0.3, λ3 = 0.2, λ4 = 0.5

[Figure: convergence test plot; horizontal axis: time (sec), from 0 to 5; vertical axis from 0 to 1; legend: app1]

Multicore

• The second experiment is multicore: the applications demand both CPU and memory and have different weights

[Figure: multicore experiment plots for app1...8; horizontal axis: time (sec), from 0 to 10]

Overhead

• The third experiment is still multicore and tests the overhead of the resource manager and its linear time complexity

[Figure: run-time (µsec) versus number of applications (n), from 0 to 25; legend: noshared]

Conclusion

• Decoupling the responsibility of choosing the applications' service levels from that of choosing the CPU allocation

• Resource allocation has linear time complexity in the number of applications

• Guarantees in terms of zero-matching functions whenever a feasible allocation exists

Future work

• Managing asynchronous application updates [✔]

• Apply to web server jobs for the cloud [✔]

• Address multithreaded applications (SCHED_DEADLINE → cgroups)

• Deal with cheating applications

• Apply to different resources, like memory and network bandwidth

Thanks for the attention

Questions?

email: martina@control.lth.se

code: https://github.com/martinamaggio/jobsignal https://github.com/martinamaggio/gtrm
