Property Assurance in Middleware for Distributed Real-Time Systems* Christopher Gill...

Property Assurance in Middleware for Distributed Real-Time Systems*

Christopher Gill

Department of Computer Science and Engineering
Washington University, St. Louis, MO

Seminar at Stanford University
Thursday, March 15, 2007

*Research supported in part by NSF CAREER award CCF-0448562
Joint work with Venkita Subramonian, César Sánchez, Henny Sipma, and Zohar Manna

A Motivating Example: Real-Time Image Transmission

[Figure: real-time image transmission over a low (often variable) bandwidth radio link. Labels: server side, client side, image source, camera, console, virtual folder of images, adaptation middleware, transmission middleware, displays]

Chains of end-to-end tasks
» E.g., compress, transmit, decompress, analyze, and then display images

Property assurance is crucial
» Soft real-time constraints
» Deadlock freedom

Many applications have similar needs
» Is correct reuse of middleware possible?

[Gill et al., “Integrated Adaptive QoS Management in Middleware: An Empirical Case Study” (RTAS ‘04)]
[Wang et al., “CAMRIT: Control-based Adaptive Middleware for Real-time Image Transmission” (RTAS ‘04)]

Middleware for Distributed Real-Time Systems

Layered stacks of mechanisms
» thread, port, socket, timer
» reactor, monitor
» client, server, gateway, ORB

Task chains span multiple hosts
» may be initiated asynchronously

Limited host resources
» used by multiple task chains

[Figure: A “Distributed System Software Stack” showing TAO+CIAO and ACE]

One Widely Used Mechanism: Reactor

[Figure: Reactor dispatching. Labels: data arrival, socket, select(), read handle set, Reactor, handle_input(), application event handlers, read()]

Reactor abstraction has many variations:
» select() vs. WaitForMultipleObjects()
» single thread vs. thread pool
» unsynchronized vs. mutex vs. readers-writer …
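As a rough illustration of the single-threaded, select()-style variation, the following Python sketch registers an event handler for a socket and makes a handle_input() upcall when data arrives. It is not ACE code: the Reactor and EchoHandler classes and the demo() function are hypothetical names chosen for this example, and Python's standard selectors module stands in for select().

```python
import selectors
import socket


class Reactor:
    """Single-threaded reactor: waits on registered handles, then dispatches."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def register(self, sock, handler):
        # The handler's handle_input() is the upcall made on data arrival.
        self._selector.register(sock, selectors.EVENT_READ, handler)

    def remove(self, sock):
        self._selector.unregister(sock)

    def handle_events(self, timeout=1.0):
        """Run one wait-and-dispatch round; return the number of upcalls."""
        ready = self._selector.select(timeout)
        for key, _mask in ready:
            key.data.handle_input(key.fileobj)
        return len(ready)

    def close(self):
        self._selector.close()


class EchoHandler:
    """Hypothetical application event handler that records what it reads."""

    def __init__(self):
        self.received = []

    def handle_input(self, sock):
        self.received.append(sock.recv(1024))


def demo():
    # A connected socket pair stands in for a remote peer.
    left, right = socket.socketpair()
    reactor, handler = Reactor(), EchoHandler()
    reactor.register(right, handler)
    left.sendall(b"image-chunk")
    dispatched = reactor.handle_events()
    reactor.remove(right)
    reactor.close()
    left.close()
    right.close()
    return dispatched, handler.received
```

The other variations listed above (thread pools, different locking strategies) would change who calls handle_events() and how the handle set is protected, not the registration and upcall structure shown here.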

An Illustration of Inherent Complexity

[Figure: client and server reactors, servant, and callback, with numbered steps 1 to 5 and a wait at step 3. Annotation: “Deadlock here”]

Wait-on-Connection
Handler waits on socket connection for the reply
» Blocking call to recv()
» One less thread listening on the Reactor for new requests
Exclusive handling of the reply
However, may cause deadlocks if reactor upcalls are nested

Wait-on-Reactor
Handler waits in reactor for reply
» E.g., set read_mask, call select() again
Other requests can be processed while replies are still pending
» For efficiency, call stack remembers handler continuation
» Intervening requests may delay reply processing (LIFO semantics)

[Figure: the same client, server, reactors, servant, and callback, with numbered steps 1 to 6. Annotation: “Deadlock avoided by waiting on reactor”]
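The LIFO effect of the wait-on-reactor strategy can be seen in a small single-threaded simulation. This is only a sketch under simplifying assumptions: events come from a fixed list rather than from sockets, and the NestedReactor class and its method names are hypothetical. The Python call stack plays the role of the remembered handler continuation.

```python
from collections import deque


class NestedReactor:
    """Toy single-threaded reactor whose handlers may wait inside the reactor."""

    def __init__(self, events):
        self.pending = deque(events)   # (kind, name) pairs in arrival order
        self.replies = set()           # names whose reply has arrived
        self.log = []

    def dispatch_one(self):
        kind, name = self.pending.popleft()
        if kind == "reply":
            self.replies.add(name)
        else:
            self.handle_request(name)

    def handle_request(self, name):
        self.log.append("start " + name)
        self.wait_on_reactor(name)     # nested upcalls may happen in here
        self.log.append("finish " + name)

    def wait_on_reactor(self, name):
        # Re-enter the event loop until our own reply has arrived.
        while name not in self.replies:
            if not self.pending:
                raise RuntimeError("no reply for " + name)
            self.dispatch_one()

    def run(self):
        while self.pending:
            self.dispatch_one()
        return self.log


events = [("request", "A"), ("request", "B"), ("reply", "A"), ("reply", "B")]
trace = NestedReactor(events).run()
```

Here the reply for A arrives before the reply for B, yet the trace is start A, start B, finish B, finish A: the handler for A cannot resume until the intervening request B, which was dispatched on top of it, has completed.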

Two essential research questions:

How can we represent and analyze such diverse behavior?

How can we enforce properties that span hosts, efficiently?

Two Essential Technical Objectives

A principled basis for middleware verification
» Model each mechanism’s inherent complexity accurately
» Remove unnecessary complexity through abstraction
» Compose models tractably and with high fidelity to system itself

New protocols and mechanisms for property enforcement
» Exploit call graph structure and other domain-specific information
» Develop efficient local mechanisms for end-to-end enforcement
» Design frameworks to support entire families of related protocols

Model Architecture in IF for ACE

Network/OS layer: inter-process communication abstractions
Middleware layer: ACE pattern-oriented abstractions
Application layer: application-specific semantics within ACE event handlers

Modeling Threads

Challenge
» No native constructs for threads in model checkers that currently support timed automata

Option 1: model all thread actions as a single automaton
» Suitable for high level modeling of application semantics

Option 2: model a thread as multiple interacting automata
» Interactions model the flow of control
» This option better abstracts the nuances of ACE-level mechanisms

[Figure: automata Foo and Bar interacting through signals: output method_request, input method_request, output method_result, input method_result]

Modeling Thread Scheduling Semantics (1/4)

Easy to achieve with one automaton per thread
» Specify to model checker directly
  E.g., using IF priority rules

More difficult with more than one automaton per thread
» Thread of control spans interactions among automata
» Need to express thread scheduling in terms of execution control primitives provided by the model checker

prio_rule: pid1 < pid2 if pid1 instanceof Activity1 and pid2 instanceof Activity2

[Figure: 1 automaton per thread (Activity1, Activity2; labels Update Display, Control Flow Rate) versus 2 automata per thread (Foo and Bar, exchanging output m_req, input m_req, output m_result, input m_result)]

Modeling Thread Scheduling Semantics (2/4)

Solution
» Introduce a thread id that is propagated along automata interactions
» Thread id acts as an index to a storage area which holds each thread’s scheduling parameters

Hint to the model checker:

thread_schedule: pid1 < pid2 if
  pid1 instanceof Foo1 and pid2 instanceof Bar1 and
  ({Foo1}pid1).threadid <> ({Bar1}pid2).threadid and
  ({Thread}(({Foo1}pid1).threadid)).prio < ({Thread}(({Bar1}pid2).threadid)).prio

Resulting behavior: give higher preference to the automaton whose “thread” (pointed to by thread id) has higher priority

[Figure: automata Foo1, Bar1, Foo2, Bar2 tagged with thread ids 1 and 2; thread priorities Prio=5 and Prio=8]
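The effect of such a rule can be sketched outside the model checker as a filter over the enabled automata. This is an illustration rather than IF semantics: the Automaton record, the preferred() function, and the convention that a larger number means a more urgent thread are all assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Automaton:
    name: str        # hypothetical automaton name
    thread_id: int   # id propagated along automata interactions


def preferred(enabled, thread_prio):
    """Keep only enabled automata whose thread has the best priority.

    thread_prio maps thread id -> priority. A larger number is assumed
    to mean a more urgent thread (an assumption of this sketch).
    """
    if not enabled:
        return []
    best = max(thread_prio[a.thread_id] for a in enabled)
    return [a for a in enabled if thread_prio[a.thread_id] == best]


# The storage area indexed by thread id, using the priorities in the figure.
thread_prio = {1: 5, 2: 8}
enabled = [Automaton("Foo", 1), Automaton("Bar", 2)]
winners = preferred(enabled, thread_prio)
```

Because the priority is looked up through the thread id rather than attached to an automaton type, every automaton that takes part in a thread's flow of control is scheduled with that thread's parameters.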

Modeling Thread Scheduling Semantics (3/4)

What if two threads have the same priority?
In an actual implementation, run-to-completion (SCHED_FIFO) may control the possible interleavings
How can we model run-to-completion?

[Figure: tree of possible interleavings of automata Foo1, Foo2, Foo3 and Bar1, Bar2, Bar3. Annotation: “How do we prune out this space?”]

Modeling Thread Scheduling Semantics (4/4)

Solution
» Record id of currently executing thread
» Update when executing actions in each automaton

Give higher preference to the automaton whose thread is the currently running thread. Non-deterministic choice if Current is nil

[Figure: interleaving tree of Foo1, Foo2, Foo3 and Bar1, Bar2, Bar3, with states annotated Current=nil, Current=1, and Current=2]


Problem: Over-constraining Concurrency

Give higher preference to the automaton whose thread is the currently running thread. Non-deterministic choice if Current is nil

[Figure: interleaving tree with states annotated Current=nil, Current=1, and Current=2, extended past the point where time progresses. With Current=2 and both Foo3 and Bar3 available, Bar3 is always chosen to run]

Solution: Idle Catcher Automaton

Key idea: lowest priority “catcher” runs when all others are blocked
» E.g., catcher thread in middleware group scheduling (RTAS ‘05)

Here, idle catcher automaton
» runs when all other automata are idle (not enabled), but before time progresses
» Resets value of current id to nil

[Figure: with Current=2, Foo3 and Bar3 are blocked; the idle catcher runs and sets Current=nil before time progresses. Afterwards Foo3 or Bar3 could be chosen to run. Over-constraining eliminated]
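The run-to-completion rule and the idle catcher can be summarized in a few lines. This is a sketch of the idea, not of the IF model: the RunToCompletion class and its method names are hypothetical, enabled automata are represented as (name, thread id) pairs, and None stands for nil. Instead of picking one successor, candidates() returns every choice the checker would still have to explore.

```python
class RunToCompletion:
    """Tracks the currently running thread id to prune interleavings."""

    def __init__(self):
        self.current = None  # None plays the role of nil

    def candidates(self, enabled):
        """Return the enabled (name, thread_id) pairs that may run next."""
        same_thread = [e for e in enabled if e[1] == self.current]
        # Prefer the running thread; if it has nothing enabled, or if
        # current is nil, every enabled automaton remains a candidate.
        return same_thread or list(enabled)

    def fire(self, choice):
        """Record the thread id of the automaton that just executed."""
        self.current = choice[1]

    def idle_catcher(self, enabled):
        """Runs when nothing else is enabled: reset current to nil."""
        if not enabled:
            self.current = None


sched = RunToCompletion()
both = [("Foo3", 1), ("Bar3", 2)]
sched.fire(("Bar2", 2))                 # thread 2 was running
over_constrained = sched.candidates(both)
sched.idle_catcher([])                  # all automata blocked
after_reset = sched.candidates(both)
```

Without the reset, the stale value of current keeps favoring Bar3 after time progresses; with it, both Foo3 and Bar3 are candidates again.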

Problem: Tractability

Model checking can suffer from state space explosion
» State space reduction, live variable analysis can help
» But even good model checkers don’t fully solve this

Need to think of modeling as a design issue, too
» Does the model represent what it needs to represent?
» Can the model be re-factored to help the checker?
» Can domain specific information help avoid unnecessary checking?

“right away”, “in a minute”, “get coffee”, “go for an espresso”, “maybe tomorrow?”

Optimization 1: Leader Election

Leader/Followers concurrency
» Threads in a reactor thread pool take turns waiting on the reactor
» One thread gets the “token” to access the reactor: the leader
» All other threads wait for the token: the followers

It does not matter which thread gets selected as leader in a thread pool
Model checker not aware of this domain specific semantics
For BASIC-P protocol example, saved factor of ~50 in state space, and factor of ~20 in time

[Figure: when the token to access the reactor is available, any of T1, T2, T3 could become leader; the branches for the alternative choices are marked “Prune this out”]
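The pruning idea can be illustrated by counting the leader sequences a checker would explore with and without a canonical choice. This is a toy sketch, not the IF optimization itself: next_leaders() and count_paths() are hypothetical helpers, and picking the lowest thread name as the canonical leader is an assumption. The toy counts are not meant to reproduce the factors reported above.

```python
def next_leaders(waiting, prune):
    """Threads the checker must consider as the next leader."""
    ordered = sorted(waiting)
    # Pool threads are interchangeable, so with pruning a single
    # canonical representative (the lowest name) is enough.
    return ordered[:1] if prune else ordered


def count_paths(waiting, prune):
    """Count explored sequences in which each waiting thread leads once."""
    if not waiting:
        return 1
    return sum(
        count_paths(waiting - {thread}, prune)
        for thread in next_leaders(waiting, prune)
    )


pool = frozenset({"T1", "T2", "T3"})
unpruned = count_paths(pool, prune=False)
pruned = count_paths(pool, prune=True)
```

For three threads the unpruned exploration visits every ordering, while the pruned one follows a single path; the gap grows factorially with the pool size.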

Optimization 2: System Initialization

Similar idea, but different technique
Iff ok to establish initial object relations in any order, can optimize away
» E.g., 2 server automata, each of which creates a reactor automaton

Useful when modeling object systems in model checkers with dynamic automaton creation capability (e.g., IF)

State space reduction depends on application
» Factor of ~250 for a deadlock scenario with 2 reactors and 3 threads in each reactor

[Figure: two creation orders, “S1 creates R, S2 creates R” and “S2 creates R, S1 creates R”, leading to the pairings S1 R1, S2 R2 and S2 R1, S1 R2; one branch is marked “Prune this out”]

Verification of a Real-Time Gateway

An exemplar of many realistic ACE-based applications
We modified the Gateway example to add new capabilities
» E.g., Real time, Reliability, Control-Push-Data-Pull
» Value added service in Gateway before forwarding to a consumer
  E.g., using consumer specific information to customize data stream

Different design, configuration choices become important
» E.g., number of threads, dispatch lanes, reply wait strategies

[Figure: Gateway between Supplier1, Supplier2 and Consumer1, Consumer2, Consumer3, Consumer4]

Model Checking/Experiment Configuration

Gateway is theoretically schedulable under RMA
» Utilization = 80%
» Schedulable utilization = 100% for harmonic periods
» Assumption: messages from the 50ms supplier are given higher preference than messages from the 100ms supplier

ACE models let us verify scheduling enforcement
» IN THE ACTUAL SYSTEM IMPLEMENTATION

[Figure: Gateway with suppliers S1, S2 and consumers C1, C2, C3, C4, annotated with 100ms and 50ms periods and with the value-added execution (and its cost): 20, 20, 10, 10]

Consumer   Period / Relative Deadline   Exec time
C1         100ms                        20ms
C2         100ms                        20ms
C3         50ms                         10ms
C4         50ms                         10ms
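The 80% figure follows directly from the table: each consumer contributes exec time divided by period. The short check below recomputes it and applies the utilization test; it is a sketch of the textbook rate-monotonic analysis, with TASKS, utilization(), harmonic(), and rm_schedulable() as hypothetical names, and it ignores blocking and middleware overhead.

```python
from fractions import Fraction

# (period_ms, exec_ms) per consumer, copied from the configuration table.
TASKS = {"C1": (100, 20), "C2": (100, 20), "C3": (50, 10), "C4": (50, 10)}


def utilization(tasks):
    """Total processor utilization as an exact fraction."""
    return sum(Fraction(exec_ms, period) for period, exec_ms in tasks.values())


def harmonic(tasks):
    """True if every period divides every larger period."""
    periods = sorted({period for period, _ in tasks.values()})
    return all(b % a == 0 for a, b in zip(periods, periods[1:]))


def rm_schedulable(tasks):
    # With harmonic periods the rate-monotonic bound is 100%; otherwise
    # fall back to the Liu and Layland bound n * (2^(1/n) - 1).
    n = len(tasks)
    bound = 1 if harmonic(tasks) else n * (2 ** (1 / n) - 1)
    return utilization(tasks) <= bound


total = utilization(TASKS)
```

The harmonic-period condition matters here: for four tasks the general Liu and Layland bound is about 75.7%, which 80% would exceed, whereas the 50ms and 100ms periods are harmonic and so allow the 100% bound quoted above.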

Real-time Gateway – Single Thread

Single reactor thread dispatches incoming events
I/O (reactor) thread same as dispatch thread
I/O thread responsible for value added service

[Figure: Gateway containing two SupplierHandlers, a Reactor, and four ConsumerHandlers, between the Supplier and Consumer sides]

Real-time Gateway – Dispatch Lanes

Single reactor thread again dispatches events to gateway handlers
I/O (reactor) thread puts message into dispatch lanes
Lane threads perform value added service, dispatch to consumers
DO QUEUES HELP OR HURT TIMING PREDICTABILITY?

[Figure: Gateway containing two SupplierHandlers, a Reactor, and four ConsumerHandlers, between the Supplier and Consumer sides]

Model/Actual Traces for Real-time Gateway

Consumer   Period   Exec time   Deadline
C1         100ms    20ms        100ms
C2         100ms    20ms        100ms
C3         50ms     10ms        50ms
C4         50ms     10ms        50ms

[Figure: expected execution timeline with RMS, followed by model and actual traces for each configuration, on a time axis from 10 to 100; supplier arrivals S1, S2 and consumer executions C1, C2, C3, C4 are marked]

Single threaded Gateway
» Execution in the context of reactor thread
» Deadline miss for Consumer4 because of blocking delay at reactor

Gateway with dispatch lanes
» Execution in the context of lane threads

Two Essential Technical Objectives

A principled basis for middleware verification
» Model each mechanism’s inherent complexity accurately
» Remove unnecessary complexity through abstraction
» Compose models tractably and with high fidelity to system itself

New protocols and mechanisms for property enforcement
» Exploit call graph structure and other domain-specific information
» Develop efficient local mechanisms for end-to-end enforcement
» Design frameworks to support entire families of related protocols

Properties, Protocols, and Call Graphs

Many real-time systems have static call graphs
» even distributed ones
» helps feasibility analysis
» intuitive to program

Exploit this to design efficient protocols
» pre-parse graph and assign static attributes to its nodes
  Resource dependence, prioritization
» maintain local state about use
» enforce properties according to (static) attributes and local state

Guard: α(fi) < tRj
Decrement, increment tRj

[Figure: call graph with nodes f1, f2, f3, f4 across Reactor 1 and Reactor 2, annotated α(f1) = 1, α(f2) = 0, α(f3) = 0, α(f4) = 0, tR1 = 2, tR2 = 1]

[Subramonian et al., HICSS04]
[Sanchez et al., FORTE05, IPDPS06, EMSOFT06, OPODIS06]
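The guard and the counter updates can be sketched for a single reactor as follows. This is a minimal sketch, not the ACE implementation: the BasicP class, its method names, and dispatch() are hypothetical, it assumes tRj counts the threads currently free in reactor Rj, and it leaves out locking and the queueing of delayed upcalls.

```python
class BasicP:
    """Sketch of a BASIC-P style guard for one reactor."""

    def __init__(self, threads):
        self.t = threads  # assumed: threads currently available (tRj)

    def pre_invoke(self, alpha):
        """Admit the upcall only if alpha(f) < t, then take a thread."""
        if alpha < self.t:
            self.t -= 1
            return True
        return False      # the caller must delay this upcall

    def post_invoke(self):
        self.t += 1       # give the thread back after the upcall


def dispatch(protocol, alpha, upcall):
    """Run upcall under the protocol; return None if it was delayed."""
    if not protocol.pre_invoke(alpha):
        return None
    try:
        return upcall()
    finally:
        protocol.post_invoke()


reactor1 = BasicP(threads=2)           # tR1 = 2 in the figure
first = reactor1.pre_invoke(alpha=1)   # 1 < 2: admitted, t becomes 1
second = reactor1.pre_invoke(alpha=1)  # 1 < 1 is false: delayed
third = reactor1.pre_invoke(alpha=0)   # 0 < 1: admitted, t becomes 0
```

With an annotation of 1, a node can never take the last free thread, which stays in reserve for nodes annotated 0; this is how a purely local check and a single counter regulate the use of the thread pool.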

Property Enforcement Mechanisms

Protocol enforcement has a common structure
» pre-invocation method
» invocation up-call
» post-invocation method

Specialized strategies implement each protocol
» BASIC-P: annotation + variable
» k-EFFICIENT-P: annotation + array
» LIVE-P: annotation + balanced binary tree

All of these protocols work by delaying upcalls
» Constitutes a side effect that model checker should evaluate

[Figure: class diagram with TP_Reactor, Protocol_TP_Reactor, PTPR_Strategy, Reactor_Strategy_Base, and the strategies Basic_P, Efficient_P (parameter k), and Live_P]

Timing Traces for BASIC-P Protocol

Model checking & actual timing traces show BASIC-P protocol’s regulation of threads’ use of resources (no deadlock)

[Figure: three flows across reactors R1 and R2. Flow1: EH11, EH21, EH31. Flow2: EH12, EH22, EH32. Flow3: EH13, EH23, EH33]

BASIC-P Blocking Delay Comparison

[Figure: actual execution and model execution traces, marking the blocking delay for Client2 and the blocking delay for Client3]

Overhead of ACE TP/DA reactor with BASIC-P

Negligible overhead with no DA protocol

Overhead increases linearly with # of event handlers due to suspend/resume actions on handlers at BASIC-P entry/exit

A Brief Survey of Closely Related Work

Vanderbilt University and UC Irvine
» GME, CoSMIC, PICML, Semantic Mapping

UC Irvine
» DREAM

UC Santa Cruz
» Code Aware Resource Management

UC Berkeley
» Ptolemy, E-machine, Giotto

Kansas State University and University of Nebraska
» Bogor, Cadena, Kiasan

Concluding Remarks

Timed automata models of middleware building blocks
» Are useful to verify middleware concurrency and timing semantics

Domain specific model checking refinements
» Help improve fidelity of the models (run to completion, priorities)
» Can achieve significant reductions in state space

Property protocols
» Reduce what must be checked by provable enforcement
» Also benefit from model checking (due to side effects)

Current and Future work
» Complete implementation of reactor protocol framework in ACE
  Integrate priority inheritance mechanisms with RTCORBA
  Evaluate alternatives for mechanism level thread synchronization
  Dynamic call graph adaptation, priority protocol enforcement
» Extend modeling approach beyond real-time concerns
  Model mechanism faults and failure modes
  Hybrid automata + domain-aware techniques for constraining complexity
