2017 © Excelfore
Using Redundant Data Paths and Clock Domains in Ethernet TSN
for Mission-Critical Network Reliability
Presented by: Shrikant Acharya, Chief Technology Officer, Excelfore Corp.
Contributing Authors: Anoop Balakrishnan, Excelfore Corp.; Shiro Ninomiya, Excelfore Corp.

Mission-Critical Automotive Networking
[Figure: Sensors, ADAS Controller, Actuators, and Driver Reaction. Labels: "Everything Working Together", "Enhanced Safety", "Network Failure", "Problems!"]

Representative Approach to Next-Gen Vehicle Network (Physical Domains / No Redundancy)
[Figure: elements shown: Cloud Server, eSync, Vehicle Gateway, Ethernet or OBD Diagnostic Port, and High-Speed Ethernet links to four physical domains: ADAS (ADAS Controller/Gateway), Infotainment (IVI Head Unit/Gateway), Body/Chassis (Body Controller/Gateway), and Powertrain (Powertrain Controller/Gateway). Buses shown: CAN, LIN, Ethernet, Ethernet TSN, Ethernet AVB, LVDS. Safety labels: ASIL D, ASIL B, ASIL D, ASIL B.]

Ethernet-Centric Next-Gen Vehicle Network (Logical Domains / No Redundancy)
[Figure: elements shown: Cloud Server, eSync, Vehicle Gateway, Ethernet or OBD Diagnostic Port, four Gateway/Switches joined by High-Speed Ethernet TSN, and the Powertrain Controller, IVI Head Unit, Body Controller, and ADAS Controller. Buses shown: CAN, LIN, Ethernet, Ethernet TSN. Two failure points are marked X.]
• VLANs Create the Domains
Redundancy to Address:
• Failure of a Device on the Network
• Failure of a Network Link

Mission-Critical Network Redundancy
Three Levels of Hardware Redundancy:
1. Redundant Links Between Network Gateway/Switches
2. Daisy Chaining End Devices to a Network Gateway/Switch
3. Daisy Chaining End Devices to Redundant Network Gateway/Switches
Key Software Concepts for Redundancy in TSN Networking:
A. Redundant Data Paths: IEEE 802.1CB
B. Timing and Synchronization: IEEE 802.1AS / 802.1BA
C. Redundant Clock Domains: IEEE 802.1AS-Rev

Redundant Links Between Switches
[Figure: two Switch/Gateways joined by two links numbered 1 and 2, with one failure marked X. End Devices A, B, C, D attach to one Switch/Gateway; End Devices W, X, Y, Z attach to the other.]
Positive Attributes:
• Protection from Failure of Network Link on High-Speed Backbone
• Maximum of 2 Switch Hops Retains TSN Guaranteed Latency (< 2 ms on 100 Mbps Ethernet)
Shortcomings:
• No Protection from Failure of Network Link to End Devices
• No Protection from Gateway/Switch Device Failure

Dual Ethernet Nodes: Key Hardware Feature
• Limitation of Single-Node End Points
  - Redundant Paths Only at Switch Nodes, Not at End Points
  - Frame Replication at the Switch
  - No Frame Replication at End Point
• Enhanced Redundancy with Dual-Node End Points
  - End Points Can Replicate Frames from a Talker
  - Daisy Chaining of End Points Improves Redundancy
  - Daisy Chaining of End Points Improves Utilization of Switch Ports
• Automotive Processors Support Dual Ethernet Nodes:
  - NXP i.MX6 Family
  - TI Jacinto J6 Family

Daisy Chaining Dual-Node End Devices
[Figure: Switch (Relay) A with End Devices A, B, C, D daisy-chained from it; two failures are marked X.]
• One Link Failure Does Not Disconnect Devices
• One Device Failure Does Not Disconnect Other Devices
• Careful Analysis of Switch Hops Required to Ensure Guaranteed Latency
End Device with 2 Ports May Have a 3-Port Switch:
• 2 External Ports
• 1 Internal Port

Redundant Links Between Switches / Dual-Node End Points
[Figure: two Switch/Gateways joined by redundant links. End Devices A, B, C, D are daisy-chained on one Switch/Gateway and End Devices W, X, Y, Z on the other. Two failures are marked X, and a path is numbered hop 1 through 6: 3 Hops from End Point to Backbone, 2 Hops in the Backbone, 1 Hop from Backbone to End Point.]
Positive Attributes:
• Protection from Failure of Any One Network Link
• Network Is Still Protected from Edge Device Failure
• Better Node Utilization at the Switch
• Maximum of 6 Switch Hops (3+2+1 or 1+2+3) Retains TSN Guaranteed Latency with Any One Failure
Shortcomings:
• No Protection from Gateway/Switch Device Failure

Redundancy Impact
• Hardware Costs
  - End Points Need Two External Ethernet Nodes
• Software Performance (higher impact with higher payloads; utilization doubles)
  - Overhead of Replication on the End Point
    - All packets: processing is doubled
    - If the overhead for packet transmission is 10%, it becomes 20% with replication
  - Overhead of Replication on the Switch (Utilization Is Doubled)
    - Depends on how many packets need to be replicated to the various ports
    - Also affected by how many deletions are happening
• Network Bandwidth
  - Aggregates Bandwidth Load of Daisy-Chained End Points
  - Overall Network Traffic on Some Links May Increase by a Multiple (discussed later)
• Daisy Chaining Mitigates the Port Utilization at the Switch
  - End Points with Single Nodes Cannot Daisy Chain
  - May Be Appropriate for Non-Mission-Critical Tasks

Full Redundancy in End-to-End Network Connections
[Figure: Switches (Relays) E, F, G, and H with Link E-F, Link E-G, Link F-H, and Link G-H between them. End Devices A, B, C, D attach through Link A-E and Link D-G; End Devices W, X, Y, Z attach through Link F-W and Link H-Z. Two failures are marked X.]
• Loss of Any Single Network Link, or Any Network Switch, Is Recoverable
• Loss of Any End Point Does Not Affect Connectivity of Other End Points

Control Latency: Analyze the Number of Hops
[Figure: the same topology of Switches (Relays) E, F, G, H and End Devices A, B, C, D and W, X, Y, Z, with Link A-E, Link E-F, Link F-W, Link E-G, Link F-H, Link D-G, Link G-H, and Link H-Z. Two failures are marked X, and a path is numbered hop 1 through 7: 3 Hops from End Point to Backbone, 3 Hops in the Backbone, 1 Hop from Backbone to End Point.]
• 2 ms End-to-End Latency Guaranteed on 100 Mbit Network
• For Any One Failure: No More than 7 Switch Hops
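
The hop analysis can be sketched as a shortest-path search over the topology with one link or switch removed. This is a minimal illustration, not the presenters' tool: the link list is read off the link labels in the figure, the daisy-chain links A-B, B-C, C-D and W-X, X-Y, Y-Z are inferred, the name `hop_count` is hypothetical, and counting each traversed link as one hop is an assumption about how the slide counts.

```python
from collections import deque

# Assumed topology: named links come from the figure labels; the
# daisy-chain links between neighbouring end devices are inferred.
LINKS = [
    ("A", "E"), ("A", "B"), ("B", "C"), ("C", "D"), ("D", "G"),
    ("E", "F"), ("E", "G"), ("F", "H"), ("G", "H"),
    ("F", "W"), ("W", "X"), ("X", "Y"), ("Y", "Z"), ("H", "Z"),
]

def hop_count(src, dst, failed_link=None, failed_node=None):
    """Fewest links traversed from src to dst, or None if unreachable."""
    adjacency = {}
    for a, b in LINKS:
        if failed_link and {a, b} == set(failed_link):
            continue  # skip the broken link
        if failed_node in (a, b):
            continue  # skip every link of the broken device
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    queue = deque([(src, 0)])
    seen = {src}
    while queue:  # breadth-first search finds the shortest path
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None
```

Under these assumptions, `hop_count("A", "W")` is 3 with no failure, and `hop_count("A", "W", failed_link=("A", "E"))` is 7, which is consistent with the seven-hop figure on the slide.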

Reminder: Ethernet-Centric Next-Gen Vehicle Network (Logical Domains / No Redundancy)
[Figure: elements shown: Cloud Server, eSync, Vehicle Gateway, Ethernet or OBD Diagnostic Port, four Gateway/Switches joined by High-Speed Ethernet TSN, and the Powertrain Controller, IVI Head Unit, Body Controller, and ADAS Controller. Buses shown: CAN, LIN, Ethernet, Ethernet TSN.]
• VLANs Create the Domains
Redundancy to Address:
• Failure of a Device on the Network
• Failure of a Network Link

Full Redundancy in End-to-End Network Connections
[Figure: four Gateway/Switches; four domains are labeled ASIL D.]
• VLANs Create the Domains
• Loss of Any Single Network Link, or Any Network Switch, Is Recoverable
• Loss of Any Single Network Link or Switch Preserves Guaranteed Latency
• Loss of Any End Point Does Not Affect Connectivity or Latency of Other End Points

Software Implications of Redundant Network Paths
Frame Replication and Elimination for Reliability: IEEE 802.1CB

Simple End-to-End Network Connections (No Redundancy)
[Figure: End Point A, Link A-E, Switch E, Link E-F, Switch F, Link F-Z, End Point Z; two failures are marked X.]
Failure in Any One of the Following Makes the Connection Fail:
• Link A-E
• Link E-F
• Link F-Z
• Switch E
• Switch F

FRER (Frame Replication and Elimination for Reliability)
Replication at a Switch (Relay):
• 1x Incoming "Packet A"
• "Packet A" Is Replicated
• 2x "Packet A" Sent Out (Packet A and Packet A')
Elimination at a Switch (Relay):
• 2x Incoming "Packet A" (Packet A and Packet A')
• 1x "Packet A" Is Eliminated
• 1x "Packet A" Sent Out

Identifying "Packet A"
Ethernet Header: Destination Address | Source Address | R-TAG | VLAN-TAG
R-TAG Fields: F1C1 | Reserved | SeqNum (optionally 'HSR-TAG' or 'PRP-TAG')
• Destination Address + Source Address + VLAN ID + Sequence Number Can Identify the Packet
• This Packet Identification Is Sufficient for Replication and Elimination by the Relay System (Switch)
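
A minimal sketch of how the R-TAG and the packet identity might be handled in software follows. The field widths (a 16-bit EtherType of 0xF1C1, 16 reserved bits, and a 16-bit sequence number) are an assumption based on the three fields shown on the slide, and the function names are hypothetical.

```python
import struct

RTAG_ETHERTYPE = 0xF1C1  # the "F1C1" field shown on the slide

def pack_rtag(seq_num):
    """Build a 6-byte R-TAG: EtherType, reserved bits, sequence number."""
    return struct.pack("!HHH", RTAG_ETHERTYPE, 0, seq_num & 0xFFFF)

def unpack_rtag(tag):
    """Return the sequence number, or None if the bytes are not an R-TAG."""
    if len(tag) < 6:
        return None
    ethertype, _reserved, seq_num = struct.unpack("!HHH", tag[:6])
    return seq_num if ethertype == RTAG_ETHERTYPE else None

def packet_id(dst_mac, src_mac, vlan_id, seq_num):
    """Identity tuple used to match replicated copies of one packet."""
    return (dst_mac, src_mac, vlan_id, seq_num)
```

Two copies of a packet that travel different paths carry the same four values, so comparing the tuples is enough to recognize a duplicate.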

Frame Elimination and Replication Explained
[Figure: End Point A sends Packet A toward End Point Z through Switches (Relays) E, F, and H, with Link E-F and Link F-H; copies Packet A'' and Packet A''' are shown on the redundant paths. Bandwidth (BW) labels in the figure:
• BW in = 1, BW out = 2: Only Replication
• BW in = 2, BW out = 2: One Replication, One Elimination (shown twice)
• BW in = 2, BW out = 1: One Elimination]
1. Many Redundant Paths
2. Bandwidth (BW) Utilization Is Doubled
3. Switch E Is Simple Replication (~5% overhead)
4. Computation Complexity Is Increased on Switches F and H (~30% overhead)

Software Implementation (Replication)
[Figure: a switch with five ports, PHY1/MAC1 through PHY5/MAC5.]
• Check for an R-TAG in the Incoming Packets from MAC1; If None Exists, Insert an R-TAG
• Keep Track of the PACKET ID in the Internal Table
• Replicate and Send to Requested Ports (MAC4, MAC5)
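
The three replication steps above can be sketched as follows. This is an illustrative model, not the presenters' implementation: `Frame` and `Replicator` are hypothetical names, a missing R-TAG is modeled as `seq_num=None`, and a single sequence counter stands in for what would normally be a per-stream counter.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Optional

@dataclass
class Frame:
    dst: str
    src: str
    vlan_id: int
    payload: bytes
    seq_num: Optional[int] = None  # None models a frame with no R-TAG

class Replicator:
    def __init__(self, out_ports):
        self.out_ports = list(out_ports)  # e.g. ["MAC4", "MAC5"]
        self._next_seq = count()          # simplified: one counter for all streams
        self.table = set()                # internal table of packet IDs

    def handle(self, frame):
        """Tag the frame if needed, record its ID, return one copy per port."""
        seq = frame.seq_num
        if seq is None:                   # no R-TAG yet: insert one
            seq = next(self._next_seq) & 0xFFFF
        tagged = replace(frame, seq_num=seq)
        self.table.add((tagged.dst, tagged.src, tagged.vlan_id, tagged.seq_num))
        return [(port, replace(tagged)) for port in self.out_ports]
```

A real table would need aging so it does not grow without bound; that is omitted here to keep the sketch short.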

Software Implementation (Elimination)
[Figure: a switch with five ports, PHY1/MAC1 through PHY5/MAC5.]
• Check the R-TAG in the Incoming Packets from MAC1 and MAC2
• Keep Track of the PACKET ID in the Internal Table
• Eliminate Replicated Packets and Send to Requested Ports (MAC4)
• If MAC4 Does Not Request an R-TAG, Remove It
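
The elimination steps above can be sketched as a duplicate filter keyed on the packet ID. The names `Eliminator` and `forward`, the bounded history of 64 entries, and the dictionary used to represent the outgoing frame are all assumptions made for illustration.

```python
from collections import deque

class Eliminator:
    def __init__(self, history=64):
        self._order = deque()   # packet IDs in arrival order, oldest first
        self._seen = set()      # internal table of recently seen packet IDs
        self._history = history

    def accept(self, packet_id):
        """Return True for the first copy of a packet, False for duplicates."""
        if packet_id in self._seen:
            return False
        self._seen.add(packet_id)
        self._order.append(packet_id)
        if len(self._order) > self._history:  # forget the oldest entry
            self._seen.discard(self._order.popleft())
        return True

def forward(eliminator, packet_id, payload, egress_wants_rtag):
    """Drop duplicates; strip the sequence number if the port does not want it."""
    if not eliminator.accept(packet_id):
        return None
    dst, src, vlan_id, seq_num = packet_id
    return {
        "dst": dst,
        "src": src,
        "vlan_id": vlan_id,
        "seq_num": seq_num if egress_wants_rtag else None,
        "payload": payload,
    }
```

The same `Eliminator` instance is shared by both ingress ports (MAC1 and MAC2 in the figure), so whichever copy arrives first is forwarded and the later one is dropped.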

Design Implication for Replication/Elimination
• Software Solution
  - Layer 2 Software Can Implement This Logic; Requires ID Check on Each Packet
  - This Impacts Latency from Additional Processing
  - Processor Utilization May Exceed Capacity Under Heavy Traffic (~40 Mbits/second of Video Data)
• Suggested Hardware Acceleration
  - R-TAG Insertion or Elimination
  - PACKET ID Look-Up Table (e.g. MAC Addr, VLAN ID, Sequence No.)

Redundancy of Grand Master Clock
No Disruption of Network Devices by GM Failure: IEEE 802.1AS-Rev

Current Diagram for Clock Sync
[Figure: End Points A, B, C, Y, and Z with Clocks A, B, C, Y, and Z, connected through Switch (Relay) F and Switch (Relay) G. One clock is labeled Primary GM and one Secondary GM; a failure is marked X.]

Current Procedure for Clock Sync Implementation
• End Point A Fails
  - GM Clock (Clock A) Is Lost on the Network
• Network Starts BMCA (IEEE 1588 Best Master Clock Algorithm)
  - Chooses One from Among Clock B to Clock Z as New GM Clock
• Clock Y Becomes New GM Clock
• Switching GM from Clock A to Clock Y
  - Procedure Requires Multiple Seconds
  - All Devices Lose Synchronization During Procedure
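
The selection step of the BMCA can be sketched as picking the candidate with the lowest attribute tuple, compared in the order priority1, clockClass, clockAccuracy, offsetScaledLogVariance, priority2, clockIdentity. This shows only the attribute comparison; the real algorithm also exchanges Announce messages and considers topology information. All attribute values below are hypothetical and chosen so that Clock Y wins once Clock A is gone, as in the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    priority1: int
    clock_class: int
    clock_accuracy: int
    variance: int        # offsetScaledLogVariance
    priority2: int
    clock_identity: int

    def rank(self):
        # Lower values win; fields are compared in this order.
        return (self.priority1, self.clock_class, self.clock_accuracy,
                self.variance, self.priority2, self.clock_identity)

def elect_grandmaster(candidates):
    """Return the candidate with the best (lowest) rank tuple."""
    return min(candidates, key=Candidate.rank)

# Hypothetical attribute values for the clocks named on the slide.
clocks = [
    Candidate("Clock A", 100, 248, 0xFE, 0xFFFF, 248, 0x0A),
    Candidate("Clock B", 246, 248, 0xFE, 0xFFFF, 248, 0x0B),
    Candidate("Clock C", 246, 248, 0xFE, 0xFFFF, 248, 0x0C),
    Candidate("Clock Y", 110, 248, 0xFE, 0xFFFF, 248, 0x19),
    Candidate("Clock Z", 246, 248, 0xFE, 0xFFFF, 248, 0x1A),
]
survivors = [c for c in clocks if c.name != "Clock A"]  # End Point A has failed
new_gm = elect_grandmaster(survivors)
```

The comparison itself is cheap; the multi-second outage described above comes from detecting the loss and re-running the election across the network.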

Diagram for Redundant GM Clock Sync Implementation
[Figure: Endpoints A, B, C, Y, and Z with Clocks A, B, C, Y, and Z, connected through Bridge F and Bridge G. One clock is labeled Primary GM and one Hot-Standby GM; a failure is marked X.]

Procedure for Redundant GM Clock Sync Implementation
• Primary GM Is Clock A; Secondary GM Is Clock Y
• Two Domains of 802.1AS Clock Are Running Separately
• Normal Circumstance: GM in the Secondary Domain Is Not Operational
• Upon Failure of Primary GM: Network Seamlessly Switches to Secondary GM
• No Devices Lose Their System Synchronization
Note: Management of Multiple Domains of PTP Messages Is Currently Being Defined in 802.1AS-Rev

Implementation of Redundant GM (Updating the gPTP Kernel)
Following Functions Must Be in Updated gPTP:
• Handling of Multiple Domains of SYNC Messages
  - Our Example Is Two Domains; Could Be More
• Manage Clocks of Multiple Domains
  - Keep Track of Primary GM and Secondary Stand-by GM
  - Secondary GM Must Be Synchronized to the Primary GM (Required for Seamless Switching)
• If Primary GM Fails
  - Each gPTP End Device Switches to Secondary GM
  - No Impact from Clock Discontinuity on Any gPTP End Device
  - Switching from Primary GM to Secondary GM Is Seamless
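
The end-device side of these functions can be sketched as a small state tracker that records Sync messages per domain and switches to the standby domain when the primary goes silent. The class names, the use of domain numbers 0 and 1, and the 0.375 s timeout (three missed Sync intervals of 125 ms) are assumptions for illustration, not values taken from the slides.

```python
class DomainState:
    def __init__(self):
        self.last_sync_rx = None  # local time of the latest Sync, in seconds
        self.offset = 0.0         # latest measured offset to this domain's GM

class DualDomainSlave:
    PRIMARY, STANDBY = 0, 1       # assumed domain numbers

    def __init__(self, sync_timeout=0.375):
        self.domains = {self.PRIMARY: DomainState(), self.STANDBY: DomainState()}
        self.active = self.PRIMARY
        self.sync_timeout = sync_timeout

    def on_sync(self, domain, now, offset):
        """Record a Sync message; both domains are tracked at all times."""
        state = self.domains[domain]
        state.last_sync_rx = now
        state.offset = offset

    def _alive(self, domain, now):
        last = self.domains[domain].last_sync_rx
        return last is not None and now - last <= self.sync_timeout

    def poll(self, now):
        """Switch to the standby domain if the primary has gone silent."""
        if (self.active == self.PRIMARY
                and not self._alive(self.PRIMARY, now)
                and self._alive(self.STANDBY, now)):
            self.active = self.STANDBY
        return self.active

    def current_offset(self):
        return self.domains[self.active].offset
```

Because the standby domain's offset is already being measured, the switch is a change of index rather than a new election, which is what makes it seamless so long as the secondary GM is synchronized to the primary.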

Replacement of Malfunctioning GM: a Proposal (Updating the gPTP Kernel)
Case of a Malfunctioning GM (Clock Is Degraded, but Not Lost)
• Two GMs Inadequate for Redundant Clock Domains with Hot Standby
  - Which GM Is Correct in a Dispute?
  - Requires Third GM to Audit Clock Behavior
• Implementation of the Auditor GM
  - One GM Contests That Other GM Is Malfunctioning
  - Auditor Checks Status of Both GMs
  - Auditor Renders Decision and Notifies All GMs
  - Auditor Sends Malfunction Notification to the Malfunctioning GM; It Surrenders and Ceases to Be GM
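
The auditor's decision can be sketched as a comparison of each GM against the auditor's own clock. The slides do not define the decision rule, so the rule below, the function name `audit`, and the 1000 ns tolerance are all assumptions.

```python
def audit(primary_offset_ns, standby_offset_ns, tolerance_ns=1000):
    """Return which GM the auditor judges to be malfunctioning, if any.

    Each offset is the difference between that GM's time and the
    auditor's own clock, as measured by the auditor.
    """
    primary_bad = abs(primary_offset_ns) > tolerance_ns
    standby_bad = abs(standby_offset_ns) > tolerance_ns
    if primary_bad and not standby_bad:
        return "primary"
    if standby_bad and not primary_bad:
        return "standby"
    # Either both GMs agree with the auditor, or both disagree with it,
    # in which case the auditor itself may be at fault: no verdict.
    return None
```

With three clocks, the one that disagrees with the other two is the likely culprit, which is why two GMs alone cannot settle a dispute.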

Diagram for Redundant GM Clock Sync Implementation
[Figure: Endpoints A, B, C, Y, and Z with Clocks A, B, C, Y, and Z, connected through Bridge F and Bridge G, with a Primary GM, a Hot-Standby GM, and an Auditor.]

Performance Impact of GM Clock Redundancy
• Network Traffic
  - Additional ~1% Overhead in Redundant Sync Messages at 40 Mbits/second
• Software Solution on Each gPTP Node
  - GMAC Software Complexity Will Increase
  - Each PHY/GMAC Receives 2x the Number of Sync Messages
  - Validate and Process the Secondary Sync Messages
  - Input Processing Requires More Performance in PHY/GMAC
• Suggested Hardware Acceleration
  - Detection of Clock Domain ID
  - Keeping Track of Separate Sync Messages and Time Stamps

802.1AS-Rev Spec vs. Implementation
• Standard Only Describes How the Hot-Plug GM Setup Is Envisaged
  - How to manage multiple different domains of PTP messages is still under definition
• Detection of Malfunctioning GM Is Not Part of the Standard
  - Left to Individual Implementation
  - Minimum: Third GM for Monitoring
    - Monitoring and Regular Review
    - Implications for Startup Time
    - Added Cost to Implement
    - Input Processing Requires More Performance in PHY/GMAC
• Cost Implication for Third GM
  - Complexity Left to System/Network Implementer

Summary of Opportunities for Hardware Acceleration
For Frame Replication and Elimination for Reliability:
• R-TAG Insertion or Elimination
• PACKET ID Look-Up Table (e.g. MAC Addr, VLAN ID, Sequence No.)
For Redundancy of Grand Master Clock:
• Detection of Clock Domain ID
• Keeping Track of Separate Sync Messages and Time Stamps

Summary and Conclusion
• Automotive Networking Must Address Mission-Critical Requirements
• Ethernet TSN Has Structures for Redundant Links to Mission-Critical End Devices
• Redundant Data Paths Ensure Mission-Critical Network Links
• Careful Analysis of Network Hops Ensures Guaranteed Latency
• Redundant Clock Domains Could Ensure Seamless Continuity of Mission-Critical Network Operation