Date post: 25-Jun-2020
Trustable VM Scheduling in a Cloud Fabien Hermenier, Ludovic Henrio
Page 1: Trustable VM Scheduling in a Cloud - Fabien …Filter-out viable decisions Reduce the hosting capabilities over-filtering under-filtering Let non-viable decisions Break SLA & user

Trustable VM Scheduling in a Cloud
Fabien Hermenier, Ludovic Henrio

Page 2:

[Diagram: the VM scheduler, with the labels "monitoring data", "VM queue", "cloud config.", "decisions" and "actuators"]

Page 3:

constraints

anti-affinity(VM[2..3]);
allocate({VM1},’ucpu’, 3);
offline(@N4);

current configuration

[Figure: current configuration, not preserved in the transcript]

reconfiguration plan

0’00 to 0’02: relocate(VM2,N2)
0’00 to 0’04: relocate(VM6,N2)
0’02 to 0’05: relocate(VM4,N1)
0’04 to 0’08: shutdown(N4)
0’05 to 0’06: allocate(VM1,‘cpu’,3)

Page 4:

Computing solutions means filtering out non-viable decisions

running(VM1)
running(VM2)
isolate(VM2)
offline(N1)

Explicit in OpenStack, CloudStack
Implicit in BtrPlace

[Figure: placements of VM1 and VM2 on nodes N1, N2 and N3]

Page 5:

VM scheduler brings

high consolidation, performance, trustworthy placements, valid schedules

Page 6:

Behind the scene

over-filtering
Filters out viable decisions
Reduces the hosting capabilities

under-filtering
Lets non-viable decisions through
Breaks SLA & user confidence

crash …

Page 7:

/* CAmongTest.java */
@Test public void testContinuousWithNotAlreadySatisfied() {…}
@Test public void testWithOnGroup() {…}
@Test public void testWithGroupChange() {…}
@Test public void testWithNoSolution() {…}
@Test public void testContinuousWithAlreadySatisfied() {…}

A limited vision of the significant use cases (specific state/transitions)

Page 8:

A limited expertise in the theoretical foundations

discrete maxOnline(N[1..10], 7) ::=

$$\sum_{i=1}^{10} n_i^{q} \le 7$$

continuous maxOnline(N[1..10], 7) ::=

$$\forall i \in [1, 10],\quad
n_i^{on} = \begin{cases} 0 & \text{if } n_i^{q} = 1 \\ a_i^{start} & \text{otherwise} \end{cases}
\qquad
n_i^{off} = \begin{cases} \max(T) & \text{if } n_i^{q} = 0 \\ a_i^{end} & \text{otherwise} \end{cases}$$

$$\forall t \in T,\quad \mathrm{card}(\{\, i \mid n_i^{on} \le t \wedge n_i^{off} \ge t \,\}) \le 7$$

ZER0

to HER continuous constraints [hotdep’13]

Page 9:

unit tests, smoke testing, peer review cannot address reasoning issues

Page 10:

fuzz testing + simulator
to exhibit reasoning issues

a specification language
to state the expected VM scheduler behaviour

applied to BtrPlace

Page 11:

The specification language

toRunning ::= !(v : vms) vmState(v) = running --> ^vmState(v) : {ready, running, sleeping}

noVMsOnOfflineNodes ::= !(n : nodes) nodeState(n) /= online —> card(hosted(n)) = 0

First order logic

temporal call (^): refers to the initial state

Core constraints reflect element lifecycle
Must always be satisfied

MaxOnline(ns <: nodes, nb : int)::= card({i. i:ns , nodeState(i)=online}) <= nb // set builder notation

Side constraints are enabled on demand

Business functions in native code (Java)
Added dynamically through reflection
Provide extensibility
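
To make the MaxOnline invariant on this slide concrete, here is a minimal Java sketch of how such a side constraint could be evaluated against a node-state map. The class `MaxOnlineSketch`, the `NodeState` enum and the map-based model are assumptions made for illustration; they are not BtrPlace or specification-language APIs.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-alone evaluator; not part of BtrPlace. */
class MaxOnlineSketch {

    enum NodeState { ONLINE, OFFLINE }

    /**
     * Mirrors MaxOnline(ns <: nodes, nb : int) ::=
     *   card({i. i:ns , nodeState(i)=online}) <= nb
     */
    static boolean maxOnline(Map<String, NodeState> nodeState, Set<String> ns, int nb) {
        // Set builder: keep the nodes of ns that are online, then take the cardinality.
        long online = ns.stream()
                .filter(n -> nodeState.get(n) == NodeState.ONLINE)
                .count();
        return online <= nb;
    }

    public static void main(String[] args) {
        Map<String, NodeState> state = Map.of(
                "N1", NodeState.ONLINE,
                "N2", NodeState.ONLINE,
                "N3", NodeState.OFFLINE);
        Set<String> ns = Set.of("N1", "N2", "N3");
        System.out.println(maxOnline(state, ns, 2)); // two online nodes, limit 2
        System.out.println(maxOnline(state, ns, 1)); // two online nodes, limit 1
    }
}
```

The invariant stays a one-line predicate over the state, which is the point of keeping the specification outside the business code.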

Page 12:

The testing framework

@CstrTest(groups = {"lonely", "affinity"})
public void testLonely(TestCampaign c) {
    c.fuzz().constraint("lonely")
        .vms(2).nodes(2).srcVMs(1, 9, 0);
    c.limits().tests(100).failures(1);
}

Test campaign

test case fuzzer

TestCase
valid plan + sample constraint

[Figure: generated test cases, each a plan over nodes N1, N2 and VMs VM1, VM2 paired with one constraint: continuous lonely(VM2), continuous lonely({VM1, VM2}), discrete lonely(VM1), continuous lonely(VM1)]

Page 13:

The testing phase exhibits the inconsistencies

oracle: simulator + specification evaluator
lonely implementation

[Figure: the test cases (continuous lonely({VM1, VM2}), discrete lonely(VM1), continuous lonely(VM1), continuous lonely(VM2)) shown with the oracle and the lonely implementation]

Page 14:

Testing with an oracle

lonely(vs <: vms) ::= !(i : vs) vmState(i) = running --> (colocated(i) - {i}) <: vs

[Figure: a plan over nodes N1, N2 and VMs VM1, VM2 under continuous lonely(VM2); verdict labels shown: OK, KO, OK, OK, OK]

a simulator executes the plan

the invariant is checked at every timestamp of interest
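
The oracle described on this slide can be sketched as follows: replay the plan, and evaluate the lonely invariant after each timestamp at which the state changes. This is only an illustration under simplifying assumptions: the `Event` type, the instantaneous state changes and the class `LonelyOracleSketch` are hypothetical and do not reflect BtrPlace's simulator or action model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical oracle sketch; events are assumed to apply instantly. */
class LonelyOracleSketch {

    /** A state change at a timestamp; host == null means the VM stops running. */
    static final class Event {
        final int at; final String vm; final String host;
        Event(int at, String vm, String host) { this.at = at; this.vm = vm; this.host = host; }
    }

    /** lonely(vs): !(i : vs) vmState(i) = running --> (colocated(i) - {i}) <: vs */
    static boolean lonely(Map<String, String> hostOf, Set<String> vs) {
        for (String i : vs) {
            String node = hostOf.get(i);
            if (node == null) continue; // not running: the implication holds
            for (Map.Entry<String, String> e : hostOf.entrySet()) {
                if (node.equals(e.getValue()) && !vs.contains(e.getKey())) return false;
            }
        }
        return true;
    }

    /** Returns the first timestamp violating the invariant (0 = initial state), or -1. */
    static int firstViolation(Map<String, String> initial, List<Event> plan, Set<String> vs) {
        Map<String, String> hostOf = new HashMap<>(initial);
        if (!lonely(hostOf, vs)) return 0; // continuous mode also checks the initial state
        TreeMap<Integer, List<Event>> byTime = new TreeMap<>();
        for (Event e : plan) byTime.computeIfAbsent(e.at, k -> new ArrayList<>()).add(e);
        for (Map.Entry<Integer, List<Event>> step : byTime.entrySet()) {
            for (Event e : step.getValue()) {
                if (e.host == null) hostOf.remove(e.vm); else hostOf.put(e.vm, e.host);
            }
            if (!lonely(hostOf, vs)) return step.getKey(); // timestamp of interest
        }
        return -1;
    }

    public static void main(String[] args) {
        Map<String, String> init = Map.of("VM1", "N1", "VM2", "N2");
        List<Event> plan = List.of(new Event(2, "VM1", "N2"), new Event(5, "VM1", "N1"));
        // VM1 temporarily joins VM2 on N2: the end state is fine, the plan is not.
        System.out.println(firstViolation(init, plan, Set.of("VM2")));
    }
}
```

In the usage example the final placement satisfies lonely(VM2), so a discrete check would accept it, while the continuous check reports the transient violation at timestamp 2.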

Page 15:

Testing BtrPlace

The test case is turned into a heavily constrained instance to solve.

online(N[1..2])
running(VM1)
shutdown(VM2)
fence(VM1, N2)
schedule(VM1, 0, 11)
schedule(VM2, 20, 26)
continuous lonely(VM2)

[Figure: the test case over nodes N1, N2 and VMs VM1, VM2 under continuous lonely(VM2), with results labelled "btrplace < 1.9" and "btrplace >= 1.9"]

Page 16:

comparing the results exhibits the defects

oracle: simulator + spec. evaluator
lonely implementation

                 implementation
oracle    OK                KO                CRASH
OK        -                 over-filtering    crash
KO        under-filtering   -                 crash

[Figure: a test case over nodes N1, N2 and VMs VM1, VM2 under continuous lonely(VM2)]
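
The oracle/implementation comparison on this slide amounts to a small decision function. The sketch below is an illustration only; the names `VerdictSketch`, `Outcome` and `Defect` are hypothetical and not taken from the testing framework.

```java
/** Hypothetical classifier for one test case; not part of the testing framework. */
class VerdictSketch {

    /** What the implementation under test did with the test case. */
    enum Outcome { OK, KO, CRASH }

    /** The kind of defect revealed by comparing with the oracle. */
    enum Defect { NONE, OVER_FILTERING, UNDER_FILTERING, CRASH }

    static Defect compare(boolean oracleAccepts, Outcome impl) {
        if (impl == Outcome.CRASH) return Defect.CRASH;
        boolean implAccepts = impl == Outcome.OK;
        // The oracle accepts but the implementation rejects: a viable decision is filtered out.
        if (oracleAccepts && !implAccepts) return Defect.OVER_FILTERING;
        // The oracle rejects but the implementation accepts: a non-viable decision is let through.
        if (!oracleAccepts && implAccepts) return Defect.UNDER_FILTERING;
        return Defect.NONE;
    }
}
```

For instance, `VerdictSketch.compare(false, VerdictSketch.Outcome.OK)` yields `UNDER_FILTERING`, the case that breaks SLA and user confidence.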

Page 17:

Evaluation

useful to find reasoning defects?
usable for developers?

Page 18:

Specification capabilities

All the constraints (27)

Formal documentation
Outside the business code

state transition, action schedule, resource sharing, affinities, counting

Theoretical suitability

All the constraints

Page 19:

short invariants
short functions
First order logic is effective
Easy to read
Reduce risks of bugs

small test campaigns
Inputs provided by the fuzzer
Expectation provided by the specification

[Figure: size distributions with 95th percentile marks; values shown: 89 chars, 14 sloc, 44 sloc, 8 sloc]

Page 20:

Fast enough for live testing

Fuzzer tuning to speed up the validation phase

200 tests/sec

fuzzing: test case generation
validation: checking test case consistency wrt. core constraints
testing: checking the constraint under test
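
The three phases named on this slide can be pictured as a loop. The following Java sketch is a simplified assumption of how a campaign might be driven: test cases are reduced to random VM-to-node placements, and `CampaignSketch`, `Report`, `run` and all its parameters are hypothetical names, not the framework's API.

```java
import java.util.Random;
import java.util.function.Predicate;

/** Hypothetical campaign loop; a test case is just an array mapping VM index to node index. */
class CampaignSketch {

    /** Counters collected during a campaign. */
    static final class Report { int generated; int valid; int failures; }

    static Report run(long seed, int vms, int nodes,
                      int maxTests, int maxFailures, int maxDraws,
                      Predicate<int[]> core, Predicate<int[]> oracle, Predicate<int[]> impl) {
        Random rnd = new Random(seed);
        Report r = new Report();
        // Stop on the test limit, the failure limit, or the (assumed) cap on draws.
        while (r.valid < maxTests && r.failures < maxFailures && r.generated < maxDraws) {
            // fuzzing: test case generation
            int[] placement = new int[vms];
            for (int v = 0; v < vms; v++) placement[v] = rnd.nextInt(nodes);
            r.generated++;
            // validation: discard cases inconsistent with the core constraints
            if (!core.test(placement)) continue;
            r.valid++;
            // testing: the implementation must agree with the oracle
            if (oracle.test(placement) != impl.test(placement)) r.failures++;
        }
        return r;
    }
}
```

The ratio of `valid` to `generated` shows why tuning the fuzzer matters: every draw rejected during validation is time not spent testing the constraint.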

Page 21:

Testing BtrPlace
22 constraints
1,000 non-unique tests per campaign

Exhibit known and unknown bugs
Lead to under-filtering (57%), over-filtering (28%), crashes (15%)

Cause                                    Constraints   Tests
Initial violation in continuous mode     7             704
Unexpected arguments                     4             642
Discrete filtering in continuous mode    3             45
Unsupported action synchronisation       4             20
Bad action semantic comprehension        1             16
Unconsidered initial element state       1             4

Page 22:

Programmatic approach is error prone

Specification vs. btrplace assertions

Developers forgot about action interleaving

Assertion system
Written by the developer
Event based
Verbose

Page 23:

A concise DSL to specify the constraint invariants
Fuzz testing to detect inconsistencies
Non disruptive
Exhibit representative reasoning issues
Read the paper for more details and evaluation results

Reasoning bugs cannot be exhibited through regular testing methods

http://www.btrplace.org

