Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Date post: 20-Dec-2015
Transcript
Page 1: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Design Patterns from Biology for Distributed Computing

Andres J. Ramirez

Page 2: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Paper Information

- Authors: Babaoglu, Canright, Di Caro, et al. (about 11 different dudes)
- Published: ACM Transactions on Autonomous and Adaptive Systems, Vol. 1, No. 1, September 2006.

Page 3: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Presentation Outline

- What is a design pattern?
- Current challenges in designing software systems.
- Parallels with biological systems.
- Presentation of design patterns extracted from biological systems.
- Experimentation and validation.
- Conclusion.

Page 4: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

What is a design pattern?

Various definitions proposed:
- “Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice.” [Christopher Alexander]
- “Each design pattern systematically names, explains, and evaluates an important and recurring design in object-oriented systems.” [Gamma et al.]
- “A recurring solution to a standard problem.” [Schmidt]

Overall, most are rather similar.

Page 5: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Design Pattern Presentation

Bare minimum format:
- Pattern name.
- Problem description.
- Solution to the problem.
- Consequences of applying the design pattern.

Page 6: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Current Challenges

- Distributed environments are commonplace now:
  - Extremely dynamic.
  - Unreliable.
  - Large scale.
- Traditional approaches for designing distributed systems are not applicable.

Page 7: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Biological Systems

- Effectively organize large numbers of unreliable and dynamically changing components (cells, molecules, individuals) into structures that implement a wide range of functions.
- These structures exhibit:
  - Robustness to failure.
  - Adaptability to changing conditions.
  - Lack of reliance on an explicit central coordinator.

Page 8: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Why patterns from biology?

- Biological entities evolve to solve a particular problem, usually related to survival.
  - Such a solution, having come through evolution, must be well tested and reliable to exist today.
- Similarities exist between distributed computing systems and biological systems.
  - Solutions from one domain can transfer to the other.

Page 9: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Key Idea

- Abstract design patterns from biological systems and apply them in distributed systems.
  - The patterns serve as a bridge between biological systems and computer systems.
- How do they accomplish this?
  - Formulate the patterns as local communication strategies over arbitrary communication topologies.

Page 10: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Design Pattern Presentation in Paper

- Name: Handle for the pattern. Key.
- Context: Defined by the system model (more in a bit).
- Problem: Possible functionality we are trying to achieve.

Page 11: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Design Pattern Presentation in Paper

- Solution: An algorithm which produces the desired output based on the problem.
- Example: Sort of a case study.
- Design Rationale: The inspiration from biology.

Page 12: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

System Model

- Basic abstraction: a network where nodes communicate via message passing.
- Additional assumptions:
  - Basic components are nodes: computing devices which maintain state and perform computations.
  - Neighbors: only “visible” neighbors can send messages to each other.
  - Asynchronous message passing: no bound on message delivery time.

Page 13: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

System Model

- Nodes are unreliable.
  - Nodes may fail.
  - Can leave and join at any time.
- Communication media are unreliable.
  - Messages can be lost.
- Side note: No mention of corrupted message passing?
  - The Byzantine Generals Problem does not seem to be addressed.
  - Maybe animals are more trustworthy than humans?
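
To make the system model concrete, here is a minimal Python sketch of it. The names `Node`, `Network`, `send` and `loss_probability` are made up for illustration and do not come from the paper. Asynchrony is only approximated: messages sit in an inbox until the receiver decides to read them, and nothing bounds how long that takes.

```python
import random
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """A computing device that keeps state and talks only to visible neighbors."""
    name: str
    state: float = 0.0
    alive: bool = True
    neighbors: list = field(default_factory=list)  # names of visible neighbors
    inbox: deque = field(default_factory=deque)    # messages not yet processed


class Network:
    """Hypothetical lossy message-passing layer (illustrative assumption)."""

    def __init__(self, loss_probability, seed=0):
        self.nodes = {}
        self.loss_probability = loss_probability
        self.rng = random.Random(seed)

    def add_node(self, node):
        self.nodes[node.name] = node

    def send(self, sender, receiver, payload):
        """Try to deliver a message; returns True only if it was queued.

        In the model itself the sender gets no such feedback; the return
        value exists only so the sketch can be inspected.
        """
        src = self.nodes.get(sender)
        dst = self.nodes.get(receiver)
        if src is None or dst is None or not src.alive or not dst.alive:
            return False                  # failed or departed node
        if receiver not in src.neighbors:
            return False                  # only visible neighbors can be reached
        if self.rng.random() < self.loss_probability:
            return False                  # message lost by the medium
        dst.inbox.append((sender, payload))
        return True


net = Network(loss_probability=0.0)
net.add_node(Node("p", neighbors=["q"]))
net.add_node(Node("q", neighbors=["p"]))
net.add_node(Node("r"))
delivered = net.send("p", "q", "hello")
```

Corrupted or forged messages are deliberately not modeled, matching the side note above.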

Page 14: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Topology Issues

- The topology here is given by the graph defined by the neighbor relation.
  - … typical topology definition from graph theory.
- Two particular networks seen in this work:
  - Overlay Networks
  - Mobile Ad Hoc Networks (MANETs)

Page 15: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Overlay Networks

- Promising paradigm for building applications over large-scale wide-area networks.
  - Service Clouds is an example.
- Logical structures built on top of a physical network with a routing service.
  - Any node can send to any other node, provided it knows the target node's network address.

Page 16: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Mobile Ad Hoc Networks

- Set of wireless mobile devices which self-organize into a network without relying on a fixed infrastructure.
- All nodes are treated as equals.
- Neighbor relations depend on the wireless connections between nodes.
  - Defined by transmission power and physical proximity.

Page 17: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

The Actual Design Patterns

- Plain Diffusion
- Replication
- Stigmergy
- Chemotaxis (composite)
- Reaction-Diffusion (composite)

Page 18: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Plain Diffusion

- Problem:
  - Bring the system to a state where each node contains the average value of all the values in the system.
  - Assign a gradient to each link that is proportional to the change in values when following the link.
- Solution: Rely on message passing.
  - For each link, each node periodically subtracts a fixed proportion from its current value and sends it along the given link. On the receiving side, add the received amount to the current value.
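
A minimal sketch of the diffusion step described above. Several things here are assumptions for illustration, not from the paper: rounds are synchronous, `alpha` is the fixed proportion sent per link, and the network is a four-node chain. `alpha` is kept below 1 / (maximum node degree) so the values settle rather than oscillate.

```python
def diffusion_round(values, links, alpha):
    """One round: every node sends alpha * (its current value) along each link."""
    new_values = dict(values)
    for node, neighbors in links.items():
        share = alpha * values[node]      # fixed proportion of the current value
        for nb in neighbors:
            new_values[node] -= share     # sender subtracts what it sends
            new_values[nb] += share       # receiver adds what it gets
    return new_values


# Hypothetical chain a - b - c - d; all the "substance" starts at node a.
links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
values = {"a": 8.0, "b": 0.0, "c": 0.0, "d": 0.0}

for _ in range(200):
    values = diffusion_round(values, links, alpha=0.25)

print(values)  # every node ends up close to the average, 2.0
```

Every amount subtracted at one node is added at another, so the total stays constant while the individual values even out.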

Page 19: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Plain Diffusion

- The solution presented keeps the sum of all the values in the system constant.
- All the node values will quickly approach the average value.
  - Gradients are generated in this process.

Page 20: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Plain Diffusion

- Design Rationale: A form of diffusion.
  - Equalizing the concentration of some substance or some abstract quantity like heat.
  - Present in many biological and physical systems.
  - Known to converge efficiently. This will be important when testing in a distributed environment.

Page 21: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Replication

- Problem:
  1. Propagate novel information to all other nodes.
  2. Assign the maximal value present in the network to all the nodes.
  3. Find a node which contains a document matching a given query.

Page 22: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Replication

- Solution:
  - Nodes receive messages from neighbors and forward them according to application-specific rules.
  - Flooding is an easy but expensive example.
  - Messages can stand for the maximum value (thus solving problem 2).
  - Messages can stand for the query until a match is found (thus solving problem 3).
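
As a rough illustration of the maximum-value case, here is a flooding sketch in Python. The forwarding rule is an assumed application-specific rule, not taken from the paper: a node forwards a value only when it beats what the node already holds. The function name `flood_max` and the ring topology are also made up for the example.

```python
from collections import deque


def flood_max(values, links):
    """Spread the maximum value by flooding; returns (final values, messages sent)."""
    current = dict(values)
    # Every node starts by sending its own value to all of its neighbors.
    queue = deque((node, nb, current[node]) for node in links for nb in links[node])
    sent = len(queue)
    while queue:
        sender, receiver, value = queue.popleft()
        if value > current[receiver]:
            # Novel information: adopt it and replicate it to the other neighbors.
            current[receiver] = value
            for nb in links[receiver]:
                if nb != sender:
                    queue.append((receiver, nb, value))
                    sent += 1
    return current, sent


# Hypothetical ring a - b - c - d - a.
ring_links = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
start = {"a": 3, "b": 9, "c": 1, "d": 4}
final, sent = flood_max(start, ring_links)
```

The message count `sent` gives a feel for why flooding is called expensive: even with the "only forward improvements" rule, each node still talks to all its neighbors at least once.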

Page 23: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Replication

- Design Rationale: Replication is commonplace in nature.
  - Growth processes, signal propagation in certain neural networks, epidemic spreading.
  - Messages can be seen as infective agents which propagate through the system, invading hosts (nodes).

Page 24: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Stigmergy

- Problem:
  1. Assuming that the links between nodes have weights attached, find the shortest path between two given nodes.
     - Nodes need not be directly connected.
  2. Redistribute items found in one node over a small number of nodes, so that similar items are held at the same node.
     - Does not really address when all the items are the same? Does it even matter?

Page 25: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Stigmergy

- Solution:
  - Let every node contain a set of variables called stigmergic variables.
  - Nodes generate messages, and send and receive them based on application-dependent policies.
  - Reception of a message will trigger an action.
    - Defined by the message itself and the stigmergic variables of the node.
    - Stigmergic variables are updated and then the message (also updated) is forwarded.
  - Essentially, distributed reinforcement learning.

Page 26: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Stigmergy

- In the first problem, the estimated cost for a particular path is represented by the stigmergic variables.
  - As the algorithm progresses, the variables are updated with more exact costs.
- In the second problem, clusters form by assigning items to the messages and determining whether the message is forwarded or not based on the stigmergic variables.
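
A loose sketch of the first problem, in the spirit of ant-based routing. This is not the paper's algorithm; the names `run_ants` and `choose_next`, the `explore` probability and the min-cost update rule are all assumptions made for illustration. Each node keeps a table `estimate[node][neighbor]` (its stigmergic variables) holding the cheapest cost to the destination seen so far through that neighbor. Messages ("ants") read the table to pick a next hop and refine it on the way back.

```python
import random


def choose_next(rng, node, weights, estimate, explore):
    """Mostly follow the best known estimate; sometimes explore at random."""
    known = estimate[node]
    if known and rng.random() >= explore:
        return min(known, key=known.get)
    return rng.choice(sorted(weights[node]))


def run_ants(weights, source, dest, n_ants, explore=0.3, seed=0):
    """weights[u][v] is the (positive) weight of link u - v, stored symmetrically."""
    rng = random.Random(seed)
    estimate = {node: {} for node in weights}   # stigmergic variables per node
    for _ in range(n_ants):
        # Forward phase: the message walks from source until it reaches dest.
        path = [source]
        while path[-1] != dest:
            path.append(choose_next(rng, path[-1], weights, estimate, explore))
        # Backward phase: retrace the walk, updating each node's variables
        # with the cost actually experienced from that node onward.
        cost = 0.0
        for node, nxt in zip(reversed(path[:-1]), reversed(path[1:])):
            cost += weights[node][nxt]
            old = estimate[node].get(nxt, float("inf"))
            estimate[node][nxt] = min(old, cost)
    return estimate


# Hypothetical triangle: the direct link a - c is expensive, the detour via b is cheap.
triangle = {
    "a": {"b": 1.0, "c": 10.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 10.0, "b": 1.0},
}
estimates = run_ants(triangle, "a", "c", n_ants=50)
best_from_a = min(estimates["a"].values())
```

Each stored estimate is the cost of a walk that really happened, so it can never undercut the true shortest path. With enough messages the estimates tend to tighten toward it, but this sketch offers no guarantee of that.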

Page 27: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Stigmergy

- Design Rationale: Typically seen in distributed self-organizing behaviors in diverse social systems.
  - Nest building, labor division, path finding.
  - Classic example: ants.

Page 28: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Chemotaxis

- Note: Composite pattern based on plain diffusion.
- Problem: Finding a short path from a given node to regions of the network where the concentration of a diffusive substance is maximal.
  - Does not seem to incorporate finding the shortest path?

Page 29: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Chemotaxis

- Solution: Just follow the maximal gradient.
  1. Start at any given node.
  2. Select the link with the highest gradient.
  3. Repeat until a local maximum of the concentration is found.
- Greedy algorithm! Not necessarily the shortest path, and not necessarily where the highest concentration of the diffusive substance is found.
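
The greedy walk is short enough to sketch directly. The function name `follow_gradient`, the toy graph and the concentration values below are made up for illustration; the gradient of a link is taken to be the difference in concentration between its two ends.

```python
def follow_gradient(concentration, links, start):
    """Climb along the steepest link until no neighbor has a higher concentration."""
    path = [start]
    while True:
        here = path[-1]
        best = max(
            links[here],
            key=lambda nb: concentration[nb] - concentration[here],
            default=None,
        )
        if best is None or concentration[best] <= concentration[here]:
            return path               # local maximum reached
        path.append(best)


# Hypothetical network: b is a local maximum next to a, d is the global maximum.
links = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
concentration = {"a": 1.0, "b": 3.0, "c": 2.0, "d": 10.0}

path = follow_gradient(concentration, links, "a")
print(path)  # ['a', 'b']
```

Starting from `a`, the walk takes the steeper link to `b` and stops there, never reaching `d`. That is exactly the greedy caveat from the slide.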

Page 30: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Chemotaxis

- Design Rationale: Cells or organisms might direct their movements according to the concentration gradients of one or more chemicals in the environment.
  - Responsible for the development of certain multicellular organisms and pattern formations.

Page 31: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Reaction-Diffusion

- Not a pattern, but a framework covering a large set of patterns.
- A strong generalization of the plain diffusion pattern.
  - Simultaneous diffusion of one or more materials. Also removal.
- Nothing else is said about this framework, pattern, etc.

Page 32: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Design Patterns

- Insensitivity:
  - Self-repairing
  - Self-organizing
  - Adaptive
  - Intelligent
- Quantifying the notion of good and bad through a figure of merit.
  - Dependent on too many things, domain specific, not perfectly defined.
- Insensitive systems show little variation in the figure of merit as the environment varies.

Page 33: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Plain Diffusion

- Distributed Aggregation Problem
  - Calculating global functions over the set of locally known quantities.
  - We saw these problems earlier.
  - Simplifies the task of controlling, monitoring and optimizing distributed applications, among other things.
  - Building block for other patterns.
- In the paper, the average is found.

Page 34: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Plain Diffusion

- Algorithm: Each node p has two threads, active and passive.
  - Active thread: periodically initiates an information exchange with a peer node q selected at random. The message contains the state of p.
  - Passive thread: waits for a message and replies with the local state.
- Symmetric information exchange: each side updates its state from the values sent and received.
  - The update is defined by the problem being solved. In this example, take the average of the two values.
  - Could also do a maximum, etc.
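
A single-threaded sketch of the exchange, with the update rule passed in as a function. The real protocol runs an active and a passive thread on each node. Here one loop stands in for both, and every node can pick any other node as its peer, which is a simplification. The names `gossip_cycle` and `average` are made up for the example.

```python
import random


def gossip_cycle(states, update, rng):
    """One cycle: every node p initiates one exchange with a random peer q."""
    nodes = list(states)
    for p in nodes:
        q = rng.choice([n for n in nodes if n != p])
        # p sends its state, q replies with its own; both apply the same update.
        new_state = update(states[p], states[q])
        states[p] = new_state
        states[q] = new_state


def average(a, b):
    return (a + b) / 2


rng = random.Random(42)

# Averaging: 16 nodes holding 0.0 .. 15.0, so the true average is 7.5.
avg_states = {i: float(i) for i in range(16)}
for _ in range(60):
    gossip_cycle(avg_states, average, rng)

# Same loop, different update: the maximum spreads instead.
max_states = {i: float(i) for i in range(16)}
for _ in range(60):
    gossip_cycle(max_states, max, rng)
```

Swapping `average` for `max` is the only change needed to move from the averaging problem to the maximum problem.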

Page 35: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Plain Diffusion

- How good is this solution?
  - The value at each node will converge to the true global average.
  - IF the underlying overlay network remains connected.
- Just how fast does it converge?
  - Exponentially. Very high precision estimates are achieved in a few cycles regardless of network size.
  - It is scalable!

Page 36: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Plain Diffusion

- Simulation done on PeerSim.
- Count protocol -> number of nodes in the network.
  - Average calculation over a starting set of numbers.
  - One node has value 1, the rest 0. Obtain? 1/N.
- Why do this? Very sensitive to failures.
  - Tests scalability and robustness.
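
Why 1/N? A short worked derivation, using assumed notation: x_i(t) is the value at node i after t cycles and N-hat is a node's local estimate of the network size. Pairwise averaging leaves the total unchanged, so the total stays at 1 and each node's value tends to the average:

```latex
\sum_{i=1}^{N} x_i(0) = 1
\quad\Longrightarrow\quad
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i(0) = \frac{1}{N},
\qquad
x_i(t) \to \frac{1}{N}
\quad\Longrightarrow\quad
\hat{N}_i(t) = \frac{1}{x_i(t)} \to N .
```

This also hints at why the setup is so sensitive: early on, almost all of the total sits at very few nodes, so losing one of them distorts every later estimate.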

Page 37: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Plain Diffusion

- Converged to a specific value exponentially, as predicted.
- What about failures?
  - If a crashed node has a smaller value than the actual global average, the estimated average will increase. The estimate of N will decrease.
  - Opposite case? Opposite results.
  - Crashes have the most impact in the first few iterations.
- Churn? Adding and removing nodes (N remains constant though).
  - Estimates still reliable.

Page 38: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Replication

- Distributed Search.
  - Idea is to spread queries throughout nodes.
- Typical, simple, stupid solution? Flood the network.
  - Clone the queries received at a node and propagate to all neighbors.
  - Huge overhead.
- Opposing objectives: higher efficiency vs lower overhead.
- Can we do better?

Page 39: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Replication

- Design the algorithm for an unstructured overlay network.
  - No relation between the information stored at a node and its position in the overlay network.
- Learn from proliferation.
  - Replication strategy inspired by the immune system.
  - Basically acts as a rate limit on propagated messages.
  - B cells, after being stimulated by an antigen, proliferate, generating antibodies.
  - After this, basically a gang of antibodies do several drive-bys on the antigens and you are no longer sick!

Page 40: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Replication

- Treat the query as the antibody and the searched items as the antigens.
- Search can be started at any node.
  - Send query messages to k neighbors.
- Receive a message?
  - Calculate the similarity between the query and local contents.
  - The higher the similarity, the more messages sent out.
  - Only to new neighbors.
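
A sketch of the forwarding decision only. The exact proliferation formula is not given on the slide, so the linear scaling below is an assumption. So are the `forward_targets` name, the `min_copies` parameter, the [0, 1] similarity scale and the choice to take the first candidates instead of random ones.

```python
import math


def forward_targets(similarity, neighbors, visited, min_copies=1):
    """Pick which neighbors receive a copy of the query.

    similarity: assumed to lie in [0, 1]; higher means local contents match better.
    visited:    nodes that already saw this query (never sent to again).
    """
    candidates = [nb for nb in neighbors if nb not in visited]
    if not candidates:
        return []
    # Assumed rule: scale linearly from min_copies up to all new neighbors.
    span = max(len(candidates) - min_copies, 0)
    n_copies = min_copies + math.floor(similarity * span)
    return candidates[:min(n_copies, len(candidates))]


neighbors = ["n1", "n2", "n3", "n4", "n5"]
visited = {"n1"}

weak = forward_targets(0.0, neighbors, visited)     # ['n2']
medium = forward_targets(0.5, neighbors, visited)   # ['n2', 'n3']
strong = forward_targets(1.0, neighbors, visited)   # ['n2', 'n3', 'n4', 'n5']
```

A poor match still sends one copy onward, so the search keeps moving. A strong match fans out to every new neighbor, concentrating messages where results look likely.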

Page 41: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Evaluating Replication

- Restricted proliferation shown to be more effective than random walks.
  - Even though some fluctuations were present in the results, restricted proliferation performed roughly 50% better than restricted random walk.
- Key notion? Guiding message replication to areas of more promise yields better results.

Page 42: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

No more!

- I am sure I have bored you by now.
- General experiment results of the remaining patterns exhibit better performance and insensitivity than traditional approaches seen in distributed computing.
- Want some more specifics? Look at the paper.
  - You did do that already, right? Good.

Page 43: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Conclusions

- Biological systems have evolved over millions of years to reach their current point.
  - Evolution happens for a reason: it is a search for a solution to survival.
- We can extract some of this behavior and apply it with success to distributed computing systems.
  - Great amounts of parallelism between the two.

Page 44: Design Patterns from Biology for Distributed Computing Andres J. Ramirez.

Conclusions

- Solutions are not perfect, but they are good.
- Few patterns extracted; certainly more are possible.
- Translate ideas from large, varied and seemingly unrelated systems into one language.
  - Applicable to our domain.

