Improving Error Containment and Reliability of Controller Area Network (CAN) by means of Adequate Star Topologies

Manuel Barranco

Julián Proenza

Luis Almeida

Introduction: CAN (Controller Area Network) protocol

• Field-bus communication protocol mainly used in distributed control systems.
• Extensively used in practice for over 15 years in:
  - In-vehicle and intra-building communication.
  - Factory automation.
  - Some space applications.
• Main characteristics:
  - Low cost.
  - Interesting real-time features.
  - Good dependability.

Introduction: CAN protocol - Basic properties

• Simplex bus topology.

[Figure: Node 1 to Node 4 attached to a single shared bus.]

• Dominant / recessive transmission: the medium implements a wired-AND function, so dominant bits ("0") overwrite recessive bits ("1").

[Figure: one node sends a dominant bit "0" while the other three nodes send a recessive bit "1".]
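The wired-AND behaviour can be illustrated with a few lines of Python. This is only a sketch of the logical effect on one bit time; the names `DOMINANT`, `RECESSIVE`, and `bus_level` are made up for the example and do not come from the slides.

```python
from functools import reduce

DOMINANT = 0   # logical "0" wins on the bus
RECESSIVE = 1  # logical "1" is overwritten by any dominant bit


def bus_level(contributions):
    """Wired-AND of the bits driven by all nodes during one bit time."""
    return reduce(lambda a, b: a & b, contributions, RECESSIVE)


# One node sends a dominant bit while three nodes send recessive bits.
print(bus_level([DOMINANT, RECESSIVE, RECESSIVE, RECESSIVE]))
# The bus stays recessive only when every node sends a recessive bit.
print(bus_level([RECESSIVE] * 4))
```

The same property explains why a single component stuck at the dominant value can silence the whole bus.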


• In-bit response: nodes have a quasi-simultaneous view of every bit in the channel.

[Figure: the bit stream "0 1 0 1 1 0 0 0 1" on the bus; Node 1 to Node 4 all see the same bit ("1") at the same time.]


• Fault-treatment mechanisms: a node shuts down when it diagnoses itself as being permanently faulty.

[Figure: a faulty node disturbs the bit stream "010101" on the bus; Node 3 then shuts itself down while Node 1, Node 2, and Node 4 remain connected.]
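The slides do not detail how a node diagnoses itself. In standard CAN this is done with a transmit and a receive error counter, and the sketch below shows a simplified version of that rule set. The class and method names are hypothetical, and several special cases of the real protocol are left out.

```python
class FaultConfinement:
    """Simplified CAN-style error counters (not the full protocol rule set)."""

    def __init__(self):
        self.tec = 0  # transmit error counter
        self.rec = 0  # receive error counter

    def on_tx_error(self):
        self.tec += 8

    def on_rx_error(self):
        self.rec += 1

    def on_tx_success(self):
        self.tec = max(0, self.tec - 1)

    def on_rx_success(self):
        self.rec = max(0, self.rec - 1)

    @property
    def state(self):
        if self.tec > 255:
            return "bus-off"        # node stops taking part in communication
        if self.tec > 127 or self.rec > 127:
            return "error-passive"
        return "error-active"


node = FaultConfinement()
for _ in range(32):          # 32 consecutive failed transmissions
    node.on_tx_error()
print(node.tec, node.state)
```

The mechanism relies on the faulty node counting its own errors correctly, which is why it cannot help when the node itself does not shut down.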

Introduction: CAN protocol – Scarce error containment

[Figure: a faulty node that stays connected corrupts the bit stream "010101" seen by Node 1 to Node 4.]

• If the node does not shut down when faulty, it cannot prevent the propagation of errors.
• A bus has scarce error-containment mechanisms.

Introduction: Formalization of the problem

• K-severe failure of communication.
  - Fewer than N-K nodes of an ensemble of N nodes can communicate with each other.
• Point of K-severe failure of communication.
  - Point whose failure provokes a K-severe failure of communication.
  - It includes the concept of single point of failure.
• A bus has multiple points of K-severe failure.
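The definition above can be checked mechanically on a small topology graph. The following sketch assumes that a failed component simply disappears from the graph, so it covers partition-like failures only; the function names and the way the bus and the star are modelled (segments, links, a hub) are illustrative choices, not part of the original formalization.

```python
from collections import deque


def largest_group(nodes, edges, failed):
    """Size of the largest set of nodes that can still reach each other.

    `edges` joins any two components (nodes, cable segments, links, hubs);
    `failed` components are removed from the graph before searching.
    """
    adj = {}
    for a, b in edges:
        if a in failed or b in failed:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    best, seen = 0, set()
    for start in nodes:
        if start in failed or start in seen:
            continue
        queue, count = deque([start]), 0
        seen.add(start)
        while queue:
            cur = queue.popleft()
            count += cur in nodes          # count nodes only, not cables
            for nxt in adj.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, count)
    return best


def is_k_severe(nodes, edges, failed, k):
    """True if fewer than N-K nodes can communicate with each other."""
    return largest_group(nodes, edges, failed) < len(nodes) - k


nodes = {"n1", "n2", "n3", "n4"}
bus = [("n1", "seg1"), ("n2", "seg2"), ("n3", "seg3"), ("n4", "seg4"),
       ("seg1", "seg2"), ("seg2", "seg3"), ("seg3", "seg4")]
star = [(n, "link_" + n) for n in sorted(nodes)]
star += [("link_" + n, "hub") for n in sorted(nodes)]

print(is_k_severe(nodes, bus, {"seg2"}, k=1))      # bus segment fails
print(is_k_severe(nodes, star, {"link_n2"}, k=1))  # one star link fails
print(is_k_severe(nodes, star, {"hub"}, k=1))      # the hub fails
```

In this toy model a broken bus segment is a point of 1-severe failure, a broken star link is not, and the hub is.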

Introduction: Formalization of the problem – fault model

• Stuck-at-dominant fault (node or medium).

[Figure: a fault holds the bus at "0"; a node sends "1" but all nodes read "0".]

• Medium partition fault.

[Figure: the bus is cut between two of the four nodes.]

• Bit-flipping fault (node or medium).

[Figure: a bit stream sent as "010" is corrupted on the bus (the diagram also shows "111").]

• Stuck-at-recessive fault (medium).

[Figure: a fault holds the medium at "1"; all nodes read "1".]
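A rough way to see the effect of three of these faults on a bus is to inject them into a wired-AND model. This is a sketch under simplifying assumptions: the fault affects the whole medium, and a bit-flipping fault is modelled as inverting every bit, which is only one of many possible erroneous patterns. The function names and fault labels are hypothetical.

```python
def wired_and(bits):
    """Bus level produced by the bits that the nodes drive in one bit time."""
    level = 1
    for b in bits:
        level &= b
    return level


def observe(tx_bits, fault=None):
    """Bit seen by every node when healthy nodes drive `tx_bits`.

    fault: None, "stuck-dominant", "stuck-recessive" or "bit-flip".
    """
    level = wired_and(tx_bits)
    if fault == "stuck-dominant":
        return 0            # the fault keeps the medium at "0"
    if fault == "stuck-recessive":
        return 1            # the medium can no longer carry a "0"
    if fault == "bit-flip":
        return level ^ 1    # assumption: every bit is inverted
    return level


frame = [0, 1, 0]           # bits sent by one node; the others stay recessive
for fault in (None, "stuck-dominant", "stuck-recessive", "bit-flip"):
    seen = [observe([bit, 1, 1, 1], fault) for bit in frame]
    print(fault, seen)
```

In every faulty case all four nodes see the corrupted stream, which is the lack of error containment that the talk addresses.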

Introduction: The objective

• To provide communication infrastructures that improve error containment and reliability of CAN.
• To keep compatibility with CAN: to inherit its good properties and to use CAN COTS hardware and software.

Introduction: The solution: adequate star topologies

[Figure: Node 1 to Node 4, each connected to its own port of a central hub. Each link with its port forms an error-containment region; the diagram shows "010" on one link and "1" on the others. Labels: no common-mode failures, no medium partitions, no spatial proximity failures.]

• An adequate star topology must provide:
  - Error containment of stuck-at and bit-flipping faults.
  - Tolerance of stuck-at and bit-flipping faults.
  - Full compatibility with CAN.


This is what we have done

Outline

• CANcentrate: error containment.
• ReCANcentrate: error containment and reliability.
• Conclusions.
• Future work.

CANcentrate: Main objective: error containment

• To prevent a single fault in a network component from causing a severe failure of communication in a CAN network.
  - One fault prevents at most one node from communicating.

CANcentrate: Architecture overview

[Figure: Node 1 to Node 4 connected to a hub; each link consists of an uplink and a downlink.]

• Each link has a separate uplink and downlink, which allows the contribution of each hub port to be separated.

CANcentrate: Hub basic architecture

[Figure: hub block diagram with an input/output module (uplink from a node, downlink to a node), a coupler module, and a fault-treatment module. Signals shown: B0, B1, B2, ..., Bn and ED0, ED1, ..., EDn.]

CANcentrate: Coupling schema

[Figure: coupler module with signals B0, B1, B2, ..., Bn and EDs.]


CANcentrate: Fault treatment basics

[Figure: sequence of diagrams of the coupler module (signals B0, B1, B2, ..., Bn, EDs); the last two diagrams show a "1" value.]
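One plausible reading of these diagrams is that the hub couples the uplink signals B1..Bn with a logical AND into B0, broadcasts B0 on every downlink, and uses per-port enable/disable signals to isolate a faulty port. The sketch below follows that reading. The class `HubSketch`, the threshold `max_dominant_run`, and the rule "disable a port that stays dominant for too long" are assumptions for illustration; the slides do not specify the actual fault-treatment logic.

```python
class HubSketch:
    """Hypothetical star coupler: AND of the enabled ports, sent to all downlinks."""

    def __init__(self, ports, max_dominant_run=20):
        self.enabled = [True] * ports        # one enable flag per port
        self.dominant_run = [0] * ports      # consecutive "0" bits per uplink
        self.max_dominant_run = max_dominant_run  # illustrative threshold

    def step(self, uplink_bits):
        """Process one bit time; return the bit broadcast on every downlink."""
        coupled = 1
        for port, bit in enumerate(uplink_bits):
            if self.enabled[port]:
                coupled &= bit
        for port, bit in enumerate(uplink_bits):
            run = self.dominant_run[port] + 1 if bit == 0 else 0
            self.dominant_run[port] = run
            if run > self.max_dominant_run:
                self.enabled[port] = False   # isolate the stuck port
        return coupled


hub = HubSketch(ports=4)
# The port with index 2 is stuck at dominant; the other ports stay recessive.
downlink = [hub.step([1, 1, 0, 1]) for _ in range(25)]
print(downlink)
print(hub.enabled)
```

Once the stuck port is disabled, the coupled signal returns to recessive and the remaining three ports can communicate again, so the fault costs at most one node.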


CANcentrate: Prototype implementation

[Figure: prototype, with labels: hub core, CANivete board, StarLink board, input/output module.]

CANcentrate: Prototype implementation - Tests

• Functional tests.
  - Short fault isolation delays: [25, 300] µs at 690 kbit/s.
• Performance tests.
  - Inverse relationship in CAN between the bit rate and the network length: at 690 kbit/s the achieved star diameter was 41 meters (68 meters in CAN).
  - Extra delay introduced by the hub transceivers. It does not visibly depend on the number of ports.
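To put the isolation delays in perspective, they can be converted into bit times at the tested bit rate. This is plain arithmetic on the figures quoted above; the helper name `bit_times` is made up for the example.

```python
def bit_times(delay_us, bit_rate_bps):
    """Number of bit times that fit in `delay_us` microseconds."""
    bit_time_us = 1e6 / bit_rate_bps
    return delay_us / bit_time_us


for delay_us in (25, 300):                 # reported isolation-delay bounds
    print(delay_us, "us =", round(bit_times(delay_us, 690e3), 2), "bit times")
```

At 690 kbit/s one bit lasts about 1.45 µs, so the reported range corresponds to roughly 17 to 207 bit times. The star diameter of 41 meters is about 60% of the 68 meters quoted for a CAN bus at the same bit rate.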

CANcentrate: Dependability evaluation

• A star includes more hardware than a bus: the probability of suffering from a fault is higher in a star.
  - CANcentrate reduces reliability.
  - But CANcentrate can improve error containment.
  - Suitable for systems that can tolerate up to K of N nodes being unable to communicate.
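The first point can be illustrated with the textbook formula for a series system with independent components and constant failure rates. This is not the model used in the evaluation, and the failure rates below are hypothetical values chosen only to show the trend.

```python
import math


def series_reliability(failure_rates_per_hour, hours):
    """Probability that no component has failed by `hours`, assuming
    independent components with constant failure rates."""
    return math.exp(-sum(failure_rates_per_hour) * hours)


# Hypothetical rates, for illustration only (not taken from the slides).
NODE, CABLE, HUB, LINK = 1e-6, 1e-7, 5e-7, 1e-7
n = 4
bus = series_reliability([NODE] * n + [CABLE], hours=10_000)
star = series_reliability([NODE] * n + [LINK] * n + [HUB], hours=10_000)
print(bus, star, bus > star)
```

With any positive rates the star has the lower no-fault probability, simply because it adds a hub and one link per node. The benefit of the star appears only when the metric accepts that up to K nodes may be lost.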

CANcentrate: Dependability evaluation – Modelling framework

• Dependability comparison in the presence of permanent hardware faults.
• CAN and CANcentrate are modelled by means of Stochastic Activity Networks (SANs), a generalization of Stochastic Petri Nets.
• Realistic values for dependability parameters such as failure rates and error-detection coverages.

CANcentrate: Dependability evaluation – Assumptions

• Results are lower bounds to the dependability of CANcentrate.
  - Modeling assumptions that favor CAN, e.g. we did not consider spatial proximity failures.

CANcentrate: Reliability comparison vs number of nodes

CANcentrate: PNS comparison vs number of nodes

CANcentrate: Main disadvantages

[Figure: Node 1 to Node 4 connected by links to a single hub.]

• CANcentrate slightly reduces the reliability.
• It still has one point of severe failure: the hub.

Outline

• CANcentrate.
• ReCANcentrate: error containment and reliability.
• Conclusions.
• Future work.

ReCANcentrate: Main objectives: error containment and reliability

• To definitively eliminate all points of severe failure in a CAN network: tolerate one hub failure.
• To tolerate link failures.

ReCANcentrate: The solution: a replicated star

[Figure: Node 1 to Node 4 connected by links to replicated hubs (Hub, Hub 2, ..., Hub N).]

ReCANcentrate: A replication of CANcentrate

• In particular: we replicated CANcentrate.
  - We take advantage of the error-containment properties already achieved by CANcentrate.
  - We still keep full compatibility with CAN.

ReCANcentrate: Architecture overview

• Two coupled hubs.

[Figure: Node 1 to Node 3 connected by links (uplink and downlink) to Hub 1 and Hub 2; the two hubs are connected by an interlink made of sublinks.]

ReCANcentrate: Basic functionality

[Figure: Node 1 to Node 3 connected by links to Hub 1 and Hub 2.]

• Hubs behave like one: they send the same bit stream bit by bit to the nodes.

• A node can easily manage replicated media.

[Figure: node with a microcontroller and two CAN controllers; the Tx and Rx lines pass through transceivers to the uplinks and downlinks of the links to hub 1 and the links to hub 2. Labels: transmission; reception of the same traffic bit by bit simultaneously.]



• Flexible configuration to reduce cabling costs.

[Figure: Node 1 to Node 3, Hub 1, and Hub 2 joined by links and an interlink; Node 1 can communicate with Node 3.]

• Error containment of link and node faults.
• Tolerance to link faults.
• Tolerance to interlink faults.
• Tolerance to hub faults.
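The tolerance claims can be sanity-checked on a small graph model of the replicated star. The sketch assumes that every node has a link to both hubs and that a fault, once contained, simply removes the faulty component; the component names and the function `communicating` are illustrative only.

```python
def communicating(nodes, edges, failed):
    """Largest group of nodes still mutually reachable after `failed` is removed."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for a, b in edges:
        if a not in failed and b not in failed:
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        if n not in failed:
            root = find(n)
            groups[root] = groups.get(root, 0) + 1
    return max(groups.values(), default=0)


nodes = ["n1", "n2", "n3"]
edges = [("hub1", "interlink"), ("interlink", "hub2")]
for n in nodes:
    for hub in ("hub1", "hub2"):
        link = f"{n}-{hub}"
        edges += [(n, link), (link, hub)]

components = {c for edge in edges for c in edge}
worst = min(communicating(nodes, edges, {c}) for c in components)
print(worst)
```

In this model the worst single failure leaves two of the three nodes communicating, and that case is the failure of a node itself. No single link, interlink, or hub failure disconnects any node.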

ReCANcentrate: Prototype implementation

[Figure: prototype, with labels: hub core, StarLink board, input/output module, PIC board, interlink.]

ReCANcentrate: Prototype implementation - Tests

• Functional tests.
  - Similar results to those of CANcentrate.
• Performance tests.
  - At 625 kbit/s, the maximum achievable star diameter was 25 meters (79 meters in CAN).

ReCANcentrate: Dependability evaluation

• ReCANcentrate was modeled using the same formalisms and tools as CANcentrate.
• Results are lower bounds to the dependability of ReCANcentrate.

ReCANcentrate: Reliability comparison vs number of nodes

ReCANcentrate: PNS comparison vs number of nodes

Conclusions

• CANcentrate demonstrates that it is possible to improve error containment of CAN by means of a CAN-compliant simplex star topology.
• ReCANcentrate demonstrates that it is possible to improve both reliability and error containment of CAN by means of a replicated star topology.

Future work

• Design and implementation of further fault-treatment mechanisms at hubs: babbling idiot, masquerading faults, etc.
• Design and implementation of stars that use only one CAN cable per link.
• Performability evaluation of (Re)CANcentrate in the presence of transient faults.
• Implementation and formal verification of a driver for managing the replicated media in ReCANcentrate.

