A Game-Theoretic Resource Manager for RT Applications

Post on 18-Nov-2014

description

Presentation given at ECRTS 2013. Code: https://github.com/martinamaggio/gtrm Paper abstract: The management of resources among competing QoS-aware applications is often solved by a resource manager (RM) that assigns both the resources and the application service levels. However, this approach requires all applications to inform the RM of the available service levels. Then, the RM has to maximize the "overall quality" by comparing service levels of different applications, which are not necessarily comparable. In this paper we describe a Linux implementation of a game-theoretic framework that decouples the two distinct problems of resource assignment and quality setting, solving each in the domain where it naturally belongs. With this approach the RM has linear time complexity in the number of applications. Our RM is built over the SCHED_DEADLINE Linux scheduling class.

transcript

A Game-Theoretic Resource Manager for RT Applications

Martina Maggio, Enrico Bini, Georgios Chasparis, Karl-Erik Årzén

Lund University, Department of Automatic Control

Motivation

• Multiple applications sharing the same computing platform, especially in embedded systems

• Applications support multiple service levels with different execution requirements and quality of service

• Example: SL1: 640x480 (CPU: 30%), SL2: 800x600 (CPU: 60%), SL3: 1024x768 (CPU: 90%)

Outline

• Problem formulation

• The architecture

• Formal guarantees

• Implementation

• Experimental evaluation

• Conclusion and future work

Problem formulation

• Select:

the resource allocation

the service levels of the applications

to maximize the quality of the overall computation

The resource allocation naturally belongs to the resource management domain

The service level naturally belongs to the application domain

Existing approaches

• Centralized solution: the resource manager chooses both the resource allocation and the application service levels

[Figure: centralized resource manager. Labels: RM on top of the OS; applications app1, app2, …, appn with service levels s1, s2, …, sn; virtual platforms v1, v2, …, vn; sensors f1, f2, …, fn]

Drawbacks

• The resource manager solves an ILP problem, which has high time complexity

• The resource manager compares service levels from different applications

• The resource manager requires a lot of information from the applications

Our contribution

• Decoupling service level assignment and resource allocation

• Formal guarantees that the operating point is a good match between the resource assigned and the service level

• Implemented in Linux with SCHED_DEADLINE

The application selects the service level, while the resource manager assigns resources

Low information exchange between the two entities and no unwanted comparisons

Assumptions

• Every application is made of jobs

every job has a deadline (its expected execution time) and an actual execution time

The architecture

[Figure: the proposed architecture. Labels: applications app1, app2, …, appn setting service levels s1, …, sn; RM on top of the OS; virtual platforms v1, v2, …, vn; sensors f1, f2, …, fn; weights λ1, λ2, …, λn on a 0 to 1 scale; service aware / service unaware]

The application has a service level si - known only by the application itself

The resource manager selects the virtual platform vi, assigned within a CBS (Constant Bandwidth Server) scheduler

A weight λi determines how the adaptation should be done and who should adjust more

λi = 0: the application does all the adaptation by changing its service level

λi = 1: the resource manager adjusts by setting the virtual platform

Both the application and the resource manager sense a matching function fi

The matching function

[Figure: plot of the matching function over time]

The matching is abundant (fi > 0): increase si or decrease vi

The matching is scarce (fi < 0): decrease si or increase vi

The matching is perfect (fi = 0): do nothing

• The purpose of the resource manager is to find the allocation where the matching functions of all the applications are as close as possible to zero

Both the application and the resource manager should be able to measure the matching function independently

The matching function

• Defines how good the match is between the resource assigned to an application and its current service level:

increases with vi

decreases with si

$f_i = \beta_i \frac{v_i}{s_i} - 1$

where $\beta_i$ is an application-dependent constant

Not measurable!

The matching function

• In order to get a function that both the resource manager and the application can measure, we chose to use:

$f_i = \frac{\text{deadline}}{\text{response time}} - 1 = \frac{-\text{lateness}}{\text{response time}}$
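A minimal C sketch of this measurement, assuming the deadline and the response time of a job are available in the same unit. The tolerance band and the names `matching_function`, `classify_match` and `match_state` are hypothetical and do not come from the gtrm or jobsignal code.

```c
#include <assert.h>

/* Matching function of one job: f = deadline / response_time - 1.
 * Both arguments must use the same time unit (for example nanoseconds). */
double matching_function(double deadline, double response_time)
{
    assert(deadline > 0.0 && response_time > 0.0);
    return deadline / response_time - 1.0;
}

/* Hypothetical classification with a tolerance band around zero. */
enum match_state { MATCH_SCARCE = -1, MATCH_PERFECT = 0, MATCH_ABUNDANT = 1 };

enum match_state classify_match(double f, double tolerance)
{
    assert(tolerance >= 0.0);
    if (f > tolerance)
        return MATCH_ABUNDANT;  /* job finished early: raise s_i or lower v_i */
    if (f < -tolerance)
        return MATCH_SCARCE;    /* job finished late: lower s_i or raise v_i */
    return MATCH_PERFECT;       /* within the band: nothing to adjust */
}
```

A job that meets its deadline exactly gives f = 0, one that takes half of its deadline gives f = 1, and one that takes twice its deadline gives f = -0.5.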

Virtual platform

• The resource manager is designed with a game-theoretic approach and changes the virtual platform assignments according to

$v_i = v_i - \varepsilon \left[ \lambda_i f_i - v_i \sum_j \lambda_j f_j \right]$
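One step of this rule could be sketched in C as follows. The function name `update_virtual_platforms` and the array layout are assumptions made for illustration; the actual gtrm code may be organized differently.

```c
#include <assert.h>
#include <stddef.h>

/* One resource-manager step over n applications:
 *   v[i] <- v[i] - eps * (lambda[i] * f[i] - v[i] * sum_j lambda[j] * f[j])
 * Two passes over the arrays, so the cost is linear in n. */
void update_virtual_platforms(double *v, const double *lambda,
                              const double *f, size_t n, double eps)
{
    assert(v != NULL && lambda != NULL && f != NULL);

    double weighted_sum = 0.0;  /* sum_j lambda[j] * f[j] */
    for (size_t j = 0; j < n; j++)
        weighted_sum += lambda[j] * f[j];

    /* Each v[i] depends only on its own old value and on the shared sum,
     * so the update can be done in place. */
    for (size_t i = 0; i < n; i++)
        v[i] -= eps * (lambda[i] * f[i] - v[i] * weighted_sum);
}
```

With λi = 0 the first term vanishes, so the matching function of application i no longer drives its own virtual platform directly, which fits the idea that such an application adapts through its service level instead.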

Formal guarantees

• The allocation converges to a stationary point where either the applications are performing reasonably well or they cannot reduce their service level any further

If a stationary point where all the matching functions are zero exists, it is reached

If several such points exist, the assignment depends on the weights λi
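A short worked derivation may help to see why an allocation with all matching functions at zero is stationary. It assumes the virtual platform update rule as stated in the talk, with ε > 0, and writes $v_i^{+}$ for the value after one step (a notation introduced here only for illustration).

```latex
\begin{align*}
v_i^{+} &= v_i - \varepsilon \Big[ \lambda_i f_i - v_i \sum_j \lambda_j f_j \Big] \\
v_i^{+} = v_i \;&\Longleftrightarrow\; \lambda_i f_i = v_i \sum_j \lambda_j f_j \\
\sum_i v_i^{+} &= \sum_i v_i - \varepsilon \Big( \sum_j \lambda_j f_j \Big) \Big( 1 - \sum_i v_i \Big)
\end{align*}
```

Setting every $f_i = 0$ satisfies the second line for all $i$. The third line shows that, if the virtual platforms are normalized so that $\sum_i v_i = 1$ (an assumption here), the step keeps that sum unchanged.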

Implementation

• In TrueTime (RT kernel simulator)

• In Linux with SCHED_DEADLINE*, a Linux scheduling class implementing global EDF and CBS

* Juri Lelli, Giuseppe Lipari, Dario Faggioli, Tommaso Cucinotta. An efficient and scalable implementation of global EDF in Linux. International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT), Porto (Portugal), July 2011.

https://github.com/martinamaggio/gtrm

Communication

• The application communicates the start and stop time of each job via shared memory, which therefore contains the values needed to compute the matching function

Shared memory communication is provided transparently with a library

https://github.com/martinamaggio/jobsignal

Application

One job consumes a_cpu·si + b_cpu CPU and a_mem·si + b_mem memory.

while (!finished) { /* jobs */
    signal_job_start();          /* write on shared memory */
    if (needed)                  /* e.g.: every 10 jobs */
        change_service_level();  /* si: based on performance function; wi = [a_cpu·si + b_cpu, a_mem·si + b_mem] */
    do_work();
    signal_job_end();            /* write on shared memory */
}
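The per-job resource demand wi from this slide can be sketched as a small C helper. The structure names, the coefficient container and the `adaptation_due` helper (for the "every 10 jobs" condition) are hypothetical; only the linear model itself comes from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Per-job resource demand w_i = [cpu, mem] at a given service level. */
struct demand {
    double cpu;
    double mem;
};

/* Hypothetical container for the four coefficients of one application. */
struct demand_model {
    double a_cpu, b_cpu;
    double a_mem, b_mem;
};

/* Linear model: cpu = a_cpu * s + b_cpu, mem = a_mem * s + b_mem. */
struct demand job_demand(const struct demand_model *m, double s)
{
    assert(m != NULL);
    struct demand w;
    w.cpu = m->a_cpu * s + m->b_cpu;
    w.mem = m->a_mem * s + m->b_mem;
    return w;
}

/* Returns 1 on every `period`-th job (jobs counted from 1), 0 otherwise. */
int adaptation_due(unsigned long job_index, unsigned long period)
{
    assert(period > 0);
    return job_index % period == 0;
}
```

In the loop above, `adaptation_due(job_index, 10)` would play the role of `needed`, and `job_demand` gives the demand that results from a new service level.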

Resource manager

for i in applications {
    read_matching_functions();    /* read from shared memory */
    compute_virtual_platforms();  /* vi = vi - ε [λi fi - vi Σj (λj fj)] */
    set_virtual_platforms();      /* set SCHED_DEADLINE budget */
    send_app_indications();       /* write on shared memory */
}

The resource manager loop runs in O(n) time, where n is the number of applications.
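On Linux, a SCHED_DEADLINE reservation is described by a runtime, a deadline and a period in nanoseconds (the `sched_runtime`, `sched_deadline` and `sched_period` fields of `struct sched_attr`, passed to `sched_setattr(2)`). One plausible way to turn a virtual platform vi into a budget is sketched here. The helper name `bandwidth_to_runtime_ns` and the idea of keeping the period fixed are assumptions, not taken from the gtrm code, and the system call itself is left out because it typically needs elevated privileges.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a bandwidth v in (0, 1] into a runtime budget in nanoseconds
 * for a reservation with the given period: runtime = v * period.
 * The result is truncated toward zero. */
uint64_t bandwidth_to_runtime_ns(double v, uint64_t period_ns)
{
    assert(v > 0.0 && v <= 1.0);
    assert(period_ns > 0);
    return (uint64_t)(v * (double)period_ns);
}
```

For a 100 ms period, a virtual platform of 0.25 corresponds to a 25 ms runtime. The value returned would go into `sched_runtime`, with `sched_period` (and usually `sched_deadline`) set to the period.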

Convergence test

• The first experiment is a single core experiment to test the convergence of virtual platforms to their predicted values

λ1 = 0.1, λ2 = 0.3, λ3 = 0.2, λ4 = 0.5

[Figure: virtual platforms (vi) of app1, app2, app3 and app4 over time (sec)]

Multicore

• The second experiment is a multicore one, in which the applications demand CPU and memory and have different weights

[Figure: virtual platforms (vi), matching functions (fi) and service levels (si) over time (sec) for app1...8, app9, app10, app11 and app12]

Overhead

• The third experiment is still multicore and tests the overhead of the resource manager and its linear time complexity

[Figure: run-time (µsec) against the number of applications (n); legend as extracted: noshared, mem]

Conclusion

• Decoupling the responsibilities of choosing the applications' service levels and the CPU allocation

• Resource allocation has linear time complexity in the number of applications

• Guarantees in terms of zero-matching functions whenever a feasible allocation exists

Future work

• Managing asynchronous application updates [✔]

• Apply to web server jobs for the cloud [✔]

• Address multithreaded applications (SCHED_DEADLINE → cgroups)

• Deal with cheating applications

• Apply to different resources, like memory and network bandwidth

Thanks for the attention. Questions?

email: martina@control.lth.se

code: https://github.com/martinamaggio/jobsignal https://github.com/martinamaggio/gtrm