Distributed Systems (part 1)
Chris Gill ([email protected])
Department of Computer Science and Engineering
Washington University, St. Louis, MO, USA
CSE 591 Area 5 Talk
Monday, November 10, 2008
2 - Gill: Distributed Systems – 04/20/23
What is a Distributed System?
“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.”
- Leslie Lamport
(BTW, this is entirely “ha, ha, only serious” ;-)
Key Characteristics of a Distributed System
Programs on different computers must interact
» A distributed system spans multiple computers
» Programs must send information to each other
» Programs must receive information from each other
» Programs also need to do some work ;-)
Programs play different roles in those interactions
» Send a request (client), process the request (server), send a reply (server), receive and process reply (client)
» Remember where to find things (directory, etc. services)
» Mediate interactions among distributed programs (coordination, orchestration, etc. services)
Programs can interact in many other ways as well
» Coordination “tuple spaces” (JavaSpaces, Linda, LIME)
» Publish-subscribe and message passing middleware
» Externally driven (e.g., a workflow management system)
Distribution Semantics Matters a Lot
How are the different computers inter-connected?
» Does all traffic move on a common data bus?
» Or, does traffic move across (hierarchical) networks?
» Or, does traffic move point-to-point between hosts?
Are there spatial and/or temporal factors?
» Does a host’s physical location/movement matter?
» Is delay noticeable, are bandwidth limits relevant?
» Are connections “always on” or can they be intermittent?
» Does the inter-connection topology change?
» Is the inter-connection topology entirely dynamic?
Distribution Semantics Examples (1/3)
Wired (hierarchical) internet
» Can reach any host from any other host
» Hosts are “always” on and available (% failure, downtime)
» Much of the WWW depends on this notion (example?)
[Figure: wired hierarchical network spanning hosts A–J]
Distribution Semantics Examples (2/3)
Nomadic (hierarchical) internet
» Some hosts are mobile, connect to nearest access point
» Hosts may be unavailable, but reconnect eventually
» Host-to-host path topology may change due to this
» Cell phones, wireless laptops exhibit this behavior
[Figure: hierarchical network of hosts A–J, with a mobile host reconnecting at a different access point]
Distribution Semantics Examples (3/3)
Mobile ad hoc networks (MANETs)
» Mobile hosts connect to each other (w/out access point)
» Hosts may detect dynamic connection, disconnection
» Hosts must exploit communication windows of opportunity
» Enables ad-hoc routing, message “mule” behaviors
[Figure: ad hoc topology among hosts A–J, changing as hosts move]
Distributed System Example (Wired)
Real-time avionics middleware
» Layer(s) between the application and the operating system
» Ensures non-critical activities don’t interfere with timing of critical ones
» Based on other open-source middleware projects
» ACE C++ library and TAO object request broker
» Standards-based (CORBA), written in C++/Ada
Flight demonstrations: BBN, WUSTL, Boeing, Honeywell
[Figure: dispatching infrastructure with laxity, static, and timer queues]
Distributed System Example (Nomadic/MANET)
Sliver
» A compact (small footprint) workflow engine for personal computing devices (e.g., cell phones, PDAs)
» Allows mobile collaboration to assemble and complete automated work-flows (task graphs)
» Standards-based (BPEL, SOAP), written in Java
Developed by Greg Hackmann at WUSTL
How do Distributed Systems Interact?
Remote method invocations are one popular style
» Allows method calls to be made between programs
» Middleware uses threads, sockets, etc. to make it so
» CORBA, Java RMI, SOAP, etc. standardize the details
Other styles (better for nomadic/mobile settings)
» Coordination “tuple spaces” (JavaSpaces, Linda, LIME)
» Publish-subscribe and message passing middleware
» Externally driven (e.g., a workflow management system)
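The remote-invocation style can be sketched with Python's stdlib XML-RPC standing in for CORBA/Java RMI/SOAP-style middleware (Python is used here for brevity; the `add` function and port handling are illustrative, not taken from any system named above):

```python
# A minimal remote-method-invocation sketch: the middleware hides threads
# and sockets behind an ordinary call syntax.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):                      # the "servant" method the server exposes
    return a + b

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]     # OS-assigned port

# The server processes requests on its own thread.
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)           # looks local; actually a request/reply
server.shutdown()
```

The proxy call reads like a local method call; underneath, the middleware marshals the arguments, sends them over a socket, and unmarshals the reply.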
Challenges for (Wired) Distributed Systems
Distributed systems are inherently complex
» Remote concurrent programs must inter-operate
» Interactions must be assured of liveness and safety
Also must avoid accidental complexity
» Design for ease of configuration, avoidance of mistakes
» System architectures and design patterns can help map low level abstractions into appropriate higher level ones
[Figure: nORB layering — clients and servants, stubs, skeletons, dispatchers]
How to Abstract Concurrent Event Handling?
Goal: process multiple service requests concurrently using OS level threads
[Figure: a server listens on port 30000 and accepts concurrent connections from Client1 and Client2 on separate ports]
Basis: Synchronous vs. Reactive Read
[Figure: synchronous read — the server blocks in read() per client until data arrives; reactive read — the server calls select() over a handle set, then read()s only the handles that have data]
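The contrast can be sketched in Python (illustrative, not code from any system above): a reactive server asks select() which handles are readable, instead of issuing one blocking read() per client.

```python
# One thread multiplexes reads over several connections with select().
import select
import socket

# Socketpairs stand in for accepted client connections.
pairs = [socket.socketpair() for _ in range(3)]
server_ends = [s for s, _ in pairs]
client_ends = [c for _, c in pairs]

client_ends[1].send(b"hello")       # only client 1 has data ready

# select() returns just the readable handles; a synchronous read() on
# any of the others would have blocked the whole thread.
readable, _, _ = select.select(server_ends, [], [], 1.0)
messages = {server_ends.index(s): s.recv(1024) for s in readable}

for s, c in pairs:
    s.close()
    c.close()
```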
Approach: Reactive Serial Event Dispatching
[Figure: clients’ requests arrive at select() over a handle set; the Reactor serially dispatches handle_*() upcalls to application event handlers, which read() their own data]
Interactions among Participants
[Sequence: the main program calls register_handler(handler, event_types) on the Reactor, which obtains the handler’s handle via get_handle(); the main program then calls handle_events(); the Reactor waits in the synchronous event demultiplexer’s select(), and for each event dispatches the handler’s handle_event()]
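The participants can be sketched as a minimal Reactor in Python; the method names mirror the sequence above, but the implementation is an illustrative sketch, not ACE's.

```python
# Minimal Reactor: register_handler() stores handlers by handle, and
# handle_events() uses select() as the synchronous event demultiplexer,
# then dispatches handle_event() upcalls serially.
import select
import socket

class EchoHandler:
    def __init__(self, sock):
        self.sock = sock
    def get_handle(self):
        return self.sock
    def handle_event(self):
        data = self.sock.recv(1024)
        self.sock.send(data.upper())          # the application "work"

class Reactor:
    def __init__(self):
        self.handlers = {}                    # handle -> event handler
    def register_handler(self, handler):
        self.handlers[handler.get_handle()] = handler
    def handle_events(self, timeout=1.0):
        ready, _, _ = select.select(list(self.handlers), [], [], timeout)
        for handle in ready:
            self.handlers[handle].handle_event()   # upcall

server_end, client_end = socket.socketpair()
reactor = Reactor()
reactor.register_handler(EchoHandler(server_end))

client_end.send(b"ping")
reactor.handle_events()
reply = client_end.recv(1024)
server_end.close()
client_end.close()
```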
Distributed Interactions with Reactive Hosts
Application components implemented as handlers
» Use reactor threads to run input and output methods
» Send requests to other handlers via sockets, upcalls
Example of a multi-host request/result chain
» h1 to h2, h2 to h3, h3 to h4
[Figure: handlers h1–h4 spread across reactors r1–r3, connected by sockets]
WaitOnConnection Strategy
[Figure: the client-side handler blocks awaiting its reply (steps 1–5); nested upcalls can deadlock here]
• Handler waits on socket connection for the reply
– Makes a blocking call to socket’s recv() method
• Benefits
– No interference from other requests that arrive while the reply is pending
• Drawbacks
– One less thread in the Reactor for new requests
– Could allow deadlocks when upcalls are nested
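A toy back-of-envelope model (purely illustrative, not from the slides) of why nested upcalls deadlock under WaitOnConnection: each nesting level holds one reactor thread blocked in recv(), yet dispatching the innermost reply still needs one free thread.

```python
def can_complete(nested_depth, reactor_threads):
    """Toy model: can a chain of nested two-way calls finish?

    nested_depth upcalls each hold one reactor thread blocked on its
    socket connection; the innermost reply still needs one free thread
    to be dispatched by the reactor.
    """
    free = reactor_threads - nested_depth
    return free >= 1

single = can_complete(nested_depth=1, reactor_threads=1)  # deadlocks
pooled = can_complete(nested_depth=1, reactor_threads=2)  # survives
```

With one reactor thread, the very first nested two-way call consumes the only thread that could ever deliver its reply.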
WaitOnReactor Strategy
• Handler returns control to reactor until reply comes back
– Reactor can keep processing other requests while replies are pending
• Benefits
– Thread available, no deadlock
– Thread stays fully occupied
• Drawbacks
– Interleaving of request/reply processing
– Interference from other requests issued while reply is pending
[Figure: steps 1–6 of a call; deadlock avoided by waiting on the reactor rather than the connection]
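WaitOnReactor's interleaving can be sketched in Python with a queue standing in for the reactor's demultiplexer (all names illustrative): the waiting handler keeps dispatching other events until its own reply arrives.

```python
# Instead of blocking in recv(), the waiting handler re-enters the
# reactor's event loop, so the thread stays available -- at the cost
# of interleaving other upcalls before its reply is processed.
import collections

class Reactor:
    def __init__(self):
        self.queue = collections.deque()   # pending events (stand-in for select())
        self.log = []
    def post(self, name, is_reply=False):
        self.queue.append((name, is_reply))
    def handle_one_event(self):
        name, is_reply = self.queue.popleft()
        self.log.append(name)              # dispatch the upcall
        return is_reply
    def wait_on_reactor(self):
        # Process intervening events until the awaited reply comes back.
        while not self.handle_one_event():
            pass

r = Reactor()
r.post("other-request-1")                  # arrives while our reply is pending
r.post("other-request-2")
r.post("our-reply", is_reply=True)
r.wait_on_reactor()
```

The log shows the drawback directly: both intervening requests run to completion before the awaited reply is processed.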
Blocking with WaitOnReactor
• Wait-on-Reactor strategy could cause interleaved request/reply processing
• Blocking factor could be large or even unbounded
– Based on the upcall duration
– And sequence of other intervening upcalls
• Blocking factors may affect real-time properties of other end-systems
– Call-chains can have a cascading blocking effect
[Timeline: f5’s reply is queued behind f3; only after f3 completes is f5’s reply processed and f2 completes — the gap is f2’s blocking factor]
Why not a “Stackless” WaitOnReactor Variant?
• What if we didn’t “stack” processing of results?
– But instead allowed them to be handled asynchronously as they are ready
– “Stackless Python” takes this approach
– Thanks to Caleb Hines who pointed this out in CSE 532
• Benefits
– No interference from other requests that arrive when reply is pending
– No risk of deadlock as thread still returns to reactor
• Drawbacks
– Significant increase in implementation complexity
– Time and space overhead to match requests to results (other patterns we cover in CSE 532 could help, though)
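The request-to-result matching that the last drawback mentions can be sketched in Python (illustrative names; Stackless Python itself works differently, via microthreads): continuations are stored under correlation ids, so replies can be handled in any order without any thread waiting.

```python
# "Stackless" dispatch sketch: record a continuation keyed by a request
# id and return immediately; match each reply to its request on arrival.
import itertools

class StacklessDispatcher:
    def __init__(self):
        self.pending = {}                 # request id -> continuation
        self.ids = itertools.count(1)
        self.results = []
    def send_request(self, payload, continuation):
        req_id = next(self.ids)
        self.pending[req_id] = continuation
        return req_id                     # would accompany the wire request
    def on_reply(self, req_id, value):
        continuation = self.pending.pop(req_id)   # the matching overhead
        continuation(value)

d = StacklessDispatcher()
a = d.send_request("f(2)", lambda v: d.results.append(("a", v)))
b = d.send_request("g(3)", lambda v: d.results.append(("b", v)))
d.on_reply(b, 9)                          # replies arrive out of order
d.on_reply(a, 4)
```

The pending table is exactly the time/space overhead the slide notes: every in-flight request costs an entry until its reply arrives.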
Could WaitOnConnection Be Used?
Its main limitation is the potential for deadlock
» But it offers low overhead and ease of implementation/use
Could we make a system deadlock-free …
» if we knew its call-graph … and were careful about how threads were allowed to proceed?
Notice that a lot of distributed systems research has this kind of flavor…
» Given one approach (of probably several alternatives)
» Can we solve problem X that limits its applicability and/or utility?
» Can we apply that solution efficiently in practice?
» Does the solution raise other problems that need to be solved?
Deadlock Problem in Terms of a Call Graph
Call graph often can be obtained
Each reactor is assigned a color
Deadlock can exist
» If there exist more than Kc segments of color C
» Where Kc is the number of threads in the node with color C
» E.g., f3-f2-f4-f5-f2 needs at least 2 & 1
[Figure: call graph over f1–f5, segments colored by reactor]
From V. Subramonian and C. Gill, “A Generative Programming Framework for Adaptive Middleware”, 2004
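The slide's condition can be written as a small check (a sketch in Python; the path/color encoding here is illustrative, not the paper's formulation): deadlock is possible if some call chain occupies more segments of a color C than that reactor's Kc threads.

```python
# Deadlock check from the call-graph coloring: count how many segments
# of each color one call chain needs, and compare against Kc.
from collections import Counter

def deadlock_possible(path_colors, threads_per_color):
    """path_colors: reactor color of each nested segment along one chain.
    threads_per_color: Kc, the thread count of the reactor colored C."""
    need = Counter(path_colors)
    return any(need[c] > threads_per_color.get(c, 0) for c in need)

# A chain needing 2 segments of color "A" can deadlock a 1-thread reactor:
risky = deadlock_possible(["A", "B", "A"], {"A": 1, "B": 1})
safe = deadlock_possible(["A", "B", "A"], {"A": 2, "B": 1})
```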
Simulation Showing Thread Exhaustion
Formally, increasing the number of reactor threads may not prevent deadlock
[Figure: Clients 1–3 drive Flow1–Flow3 through event handlers EH11–EH33, spanning Reactor1 (Server1) and Reactor2 (Server2)]
Clients send requests
 3: Client3 : TRACE_SAP_Buffer_Write(13,10)
 4: Unidir_IPC_13_14 : TRACE_SAP_Buffer_Transfer(13,14,10)
 5: Client2 : TRACE_SAP_Buffer_Write(7,10)
 6: Unidir_IPC_7_8 : TRACE_SAP_Buffer_Transfer(7,8,10)
 7: Client1 : TRACE_SAP_Buffer_Write(1,10)
 8: Unidir_IPC_1_2 : TRACE_SAP_Buffer_Transfer(1,2,10)
Reactor1 makes upcalls to event handlers
 10: Reactor1_TPRHE1 ---handle_input(2,1)---> Flow1_EH1
 12: Reactor1_TPRHE2 ---handle_input(8,2)---> Flow2_EH1
 14: Reactor1_TPRHE3 ---handle_input(14,3)---> Flow3_EH1
Flow1 proceeds
 15: Time advanced by 25 units. Global time is 28
 16: Flow1_EH1 : TRACE_SAP_Buffer_Write(3,10)
 17: Unidir_IPC_3_4 : TRACE_SAP_Buffer_Transfer(3,4,10)
 19: Reactor2_TPRHE4 ---handle_input(4,4)---> Flow1_EH2
 20: Time advanced by 25 units. Global time is 53
 21: Flow1_EH2 : TRACE_SAP_Buffer_Write(5,10)
 22: Unidir_IPC_5_6 : TRACE_SAP_Buffer_Transfer(5,6,10)
Flow2 proceeds
 23: Time advanced by 25 units. Global time is 78
 24: Flow2_EH1 : TRACE_SAP_Buffer_Write(9,10)
 25: Unidir_IPC_9_10 : TRACE_SAP_Buffer_Transfer(9,10,10)
 27: Reactor2_TPRHE5 ---handle_input(10,5)---> Flow2_EH2
 28: Time advanced by 25 units. Global time is 103
 29: Flow2_EH2 : TRACE_SAP_Buffer_Write(11,10)
 30: Unidir_IPC_11_12 : TRACE_SAP_Buffer_Transfer(11,12,10)
Flow3 proceeds
 31: Time advanced by 25 units. Global time is 128
 32: Flow3_EH1 : TRACE_SAP_Buffer_Write(15,10)
 33: Unidir_IPC_15_16 : TRACE_SAP_Buffer_Transfer(15,16,10)
 35: Reactor2_TPRHE6 ---handle_input(16,6)---> Flow3_EH2
 36: Time advanced by 25 units. Global time is 153
 37: Flow3_EH2 : TRACE_SAP_Buffer_Write(17,10)
 38: Unidir_IPC_17_18 : TRACE_SAP_Buffer_Transfer(17,18,10)
 39: Time advanced by 851 units. Global time is 1004
Solution: New Deadlock Avoidance Protocols
Papers at FORTE 2005 through EMSOFT 2006
» http://www.cse.wustl.edu/~cdgill/PDF/forte05.pdf
» http://www.cse.wustl.edu/~cdgill/PDF/emsoft06_liveness.pdf
César Sánchez PhD dissertation at Stanford
» Collaboration with Henny Sipma and Zohar Manna
Paul Oberlin: MS project here at WUSTL
Avoid interactions leading to deadlock
» a liveness property
Like synchronization, achieved via scheduling
» Upcalls are delayed until enough threads are ready
But, introduces small blocking delays
» a timing property
» In real-time systems, also a safety property
Deadlock Avoidance Protocol Overview
• Regulates upcalls based on # of available reactor threads and call graph’s “thread height”
– Does not allow exhaustion
• BASIC-P protocol implemented in the ACE Thread Pool Reactor
– Using handle suspension and resumption
– Backward compatible, minimal overhead
[Figure: Clients 1–3, Flow1–Flow3 through event handlers EH11–EH33 on Reactor1 (Server1) and Reactor2 (Server2)]
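The admission idea can be sketched in Python (a sketch of the published idea, not the actual ACE code; class and parameter names are illustrative): an upcall is admitted only when the free threads exceed its remaining nesting depth, so the last thread is never consumed by a call that may nest further.

```python
# Deadlock-avoidance admission sketch: gate upcalls on the call graph's
# "thread height" (how many further nested upcalls this call may need).
class DAReactor:
    def __init__(self, threads):
        self.free = threads
    def try_start_upcall(self, height):
        # Admit only if enough threads remain for the calls this one may nest.
        if self.free > height:
            self.free -= 1
            return True
        return False          # caller is delayed: the small blocking cost
    def finish_upcall(self):
        self.free += 1

r = DAReactor(threads=2)
leaf_ok = r.try_start_upcall(height=0)    # leaf call: admitted (2 > 0)
nested_ok = r.try_start_upcall(height=1)  # may nest: delayed (1 > 1 fails)
```

Denying the second upcall is exactly the protocol's trade: a small, bounded blocking delay in exchange for never exhausting the thread pool.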
Timing Traces: DA Protocol at Work
Timing traces from model/execution show the DA protocol regulating the flows to use available resources without deadlock
[Figure: timing traces for Flow1–Flow3 through their event handlers (EH11–EH33) across reactors R1 and R2]
DA Blocking Delay (Simulated vs. Actual)
[Figure: model execution vs. actual execution traces, showing the blocking delay for Client2 and for Client3]
Overhead of ACE TP reactor with DA
Negligible overhead with no DA protocol
Overhead increases with number of event handlers because of their suspension and resumption on protocol entry and exit
Where Can We Go From Here?
Distributed computing is ubiquitous
» …in planes, trains, and automobiles…
» …in medical devices and equipment…
» …in more and more places each day
Distributed systems offer many research opportunities
» Discover them from specific problems
» May allow advances even in well worked areas (e.g., deadlock avoidance)
What new systems can we build by spanning different platforms?
» I’ll leave that as an open question for you to consider (and ultimately, to answer)
A fire extinguisher that runs UNIX?