TM
Freescale™ and the Freescale logo are trademarks of Freescale Semiconductor, Inc. All other product or service names are the property of their respective owners. © Freescale Semiconductor, Inc. 2009.
Comparison of High-Speed Interconnects: Ethernet, PCI Express® and RapidIO® Technology
Greg Shippen
System Architect, Network Systems Division
Networking & Multimedia Group
July 2009
Agenda
►Interconnect Trends
►Technical Overview
►Comparison
►Summary and Conclusion
QorIQ™ P4 Series P4080 Block Diagram
[Figure: QorIQ™ P4080 multicore processor. Power Architecture™ e500-mc cores (32 KB D-cache, 32 KB I-cache, 128 KB backside L2 cache) on a CoreNet™ coherency fabric; two 1024 KB frontside L3 caches, each with a 64-bit DDR-2/3 memory controller; two frame managers (parse, classify, distribute; 4x 1GE and 1x 10GE each) with queue and buffer managers; Security 4.0 and Pattern Match Engine 2.0; RapidIO® Message Unit (RMU), 2x DMA, PCIe and sRIO ports on an 18-lane 5 GHz SerDes; PAMUs (Peripheral Access Management Units); real-time debug (watchpoint cross trigger, performance monitor, CoreNet trace, Aurora); peripherals including eLBC, eOpenPIC, 2x USB 2.0/ULPI, SD/MMC, 2x DUART, 4x I2C, SPI, GPIO, security monitor, pre-boot loader, internal boot ROM, power management, clocks/reset, test port/SAP and CCSR.]
Our Customer Feedback on Interconnects
► I want more CPU cycles
  • Quit spending them moving data
► I don’t want to change next time
  • Support living standards with a living ecosystem
► I want to meet my technical requirements
  • Implement QoS, scalable BW, multicore ready, high availability…
► I don’t want to rewrite my software
  • Use common usage models and software APIs
► I want it cheap
  • Use multi-vendor standards
Market Trends
[Figure: market trends driving interconnect choices: bandwidth (GB/s); modularity and reuse; connected devices; protocols (ATM, TDM, TCP/IP, SPI4.2, CSIX, PCI Express®); multicore devices (CPU, I/O, accelerators); cost (NRE, CAPEX, OPEX).]
Interconnect Trends
(listed in order of increasing performance)
► Shared Bus
  • Single segment
  • Broadcast
  • PHY: Single-ended
  • Highest pin count
  • Example: VME (≤ 66 MHz)
► Hierarchical Bus
  • Bridged Hierarchy
  • Broadcast
  • PHY: Single-ended
  • Example: PCI/PCI-X/SCSI (≤ 133 MHz)
► 1st Generation Point-to-Point
  • Packet switched
  • PHY: Source-sync differential
  • Lower pin count
  • Example: HT/P-RapidIO® (≤ 3 GHz)
► 2nd Generation Point-to-Point
  • Packet switched
  • PHY: SERDES differential
  • Lowest pin count
  • Examples: PCIe, S-RapidIO, SATA, SAS (≥ 10 GHz)
Interconnect Roles
►Chip-to-chip
►Board-to-Device
►Board-to-board
►Chassis-to-chassis
Technical Overview
Ethernet Overview
► WAN scale interconnect
  • Box-to-box, board-to-board, backplane
  • Connect thousands to millions of endpoints
  • Physical layer defined for LAN-scale interconnection
      Closet to computer
      Backplane
  • Optical, twisted pair and backplane copper media
► Target market
  • GigE WAN to workstations, PCs and laptops
  • 10GE now used in aggregation settings
      High performance switches, routers and LAN backbones
► Specification history
  • First spec (10Mbps) ~1975 by Xerox
  • 100Mbps spec in 1995
  • 1Gbps spec in 1998
  • 10Gbps spec in 2002
      10G Copper (10GBase-T) in 2006
  • Recent relevant additions
      Backplane Ethernet (802.3ap-2007)
      Data Center Bridging (DCB)
► Gigabit Ethernet ubiquitous now
  • 10G Copper PHYs shipping
► Extensible layered specification
► Point-to-point packetized architecture
  • High header overhead
  • Variable packet size
  • 46-1500 byte packet L2 PDU
  • Up to 9000 byte jumbo frames
[Figure: Ethernet endpoints (CPU, DRAM, MAC/PHY) attached to Ports 0-3 of a switch/router.]
Ethernet Layer 2
Layer 2 Packet Type: 1500 Byte Max Packet PDU
[Figure: frame layout, 32 bits per row]
  Preamble, SFD                                      8 Bytes
  Destination Address, Source Address, Type/Length   14 Bytes (L2 header)
  Packet PDU                                         256 Bytes (payload)
  FCS                                                4 Bytes (L2 trailer)
  Inter-Frame Gap                                    12 Bytes (interframe overhead)
Total = 294 Bytes (256 Byte PDU)
Ethernet + TCP/IP
TCP/IP Packet Type: 1460 Byte Max User PDU
  Preamble/SFD   8 Bytes
  L2 Header      14 Bytes
  IP Header      20 Bytes
  TCP Header     20 Bytes
  User PDU       256 Bytes
  FCS            4 Bytes
  IFG            12 Bytes
Total = 334 Bytes (256 Byte User PDU)
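The byte totals on the two Ethernet slides can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it ignores padding of PDUs shorter than the 46-byte minimum.

```python
# Per-frame overhead in bytes, taken from the Ethernet slides.
PREAMBLE_SFD = 8
L2_HEADER = 14   # destination address, source address, type/length
FCS = 4
IFG = 12         # inter-frame gap
IP_HEADER = 20
TCP_HEADER = 20


def l2_wire_bytes(pdu):
    """Bytes occupied on the wire by one Layer 2 frame carrying `pdu` bytes."""
    return PREAMBLE_SFD + L2_HEADER + pdu + FCS + IFG


def tcp_wire_bytes(user_pdu):
    """Same, with the IP and TCP headers carried inside the L2 PDU."""
    return l2_wire_bytes(IP_HEADER + TCP_HEADER + user_pdu)


if __name__ == "__main__":
    for name, fn in (("L2", l2_wire_bytes), ("TCP/IP", tcp_wire_bytes)):
        total = fn(256)
        print(f"{name}: {total} bytes on the wire, {256 / total:.1%} payload")
```

The same functions also show why the maximum TCP user PDU is 1460 bytes: the two 20-byte headers consume 40 bytes of the 1500-byte L2 PDU.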
PCI Express® Overview
► Chassis-scale interconnect
  • Chip-to-chip, Board-to-board
  • Required legacy PCI compatibility
  • Physical layer defined for board + connector
  • Copper-on-board and cable media
► Successor to PCI 2.3/PCI-X 2.0
  • Fully SW/firmware backward compatible to PCI
► Target market
  • PC and Servers space
  • Embedded where suitable
► Specification history
  • Rev 1.0 (Gen1) completed in 2002
      External cable spec released Feb 2007
  • Rev 2.0 (Gen2) completed in 2006
  • Rev 3.0 (Gen3) expected “late 2009”
      8 GTransfers/s
  • Recent relevant additions
      Multiroot/single-root IO Virtualization
      Cable Spec
► PCIe Gen2 now widely deployed
  • First Gen2 Intel Silicon (X38 chipset) Sep 2007
► Extensible layered specification
► Point-to-point packetized architecture
  • Relatively low overhead
  • Variable size packets
  • 128-4096 byte PDU
[Figure: PCI Express tree topology. A CPU and host/root complex connect to endpoints directly and through a switch whose upstream port and downstream ports are joined by P2P bridges.]
PCI Express® Protocol
Memory Write: 4096 Byte Max Packet PDU
[Figure: TLP layout, 32 bits per row]
  Link Layer:        Rsvd, TLP Sequence Number, LCRC
  Transaction Layer: FMT, Type, TC, TD, EP, Attr, Length, Requester ID, Tag, Last DWBE, First DWBE, Address[31:2], optional TLP Digest (ECRC)
  Payload:           Packet PDU
Total = 278 Bytes (256 Byte PDU); the next packet or DLLP follows
RapidIO® Overview
► Chassis scale interconnect
  • Chip-to-chip, Board-to-board, backplane
  • Initially a processor interconnect as Motorola/Mercury collaboration
  • Physical layer defined for board + connectors
  • Copper-on-board and cable media
► Target market
  • Embedded systems
      Wireless infrastructure, media, networking, compute & defense
  • CPU I/O, Line-card aggregation, backplane
  • Extensive dataplane features
      QoS, VCs, datagrams, encapsulation
► Specification History
  • Rev 1.0 completed in 1999
  • Rev 1.2 completed in 2002
  • Rev 1.3 completed in 2005
  • Rev 2.0 completed in 2007
      5-6G PHY, 2, 8 and 16x lanes + Virtual Channels
  • Recent relevant additions
      Data streaming, encapsulation, traffic management
► Extensible layered specification
► Point-to-point packetized architecture
  • Low overhead
  • Variable packet size
  • Maximum 256 byte PDU
  • SAR support for 4 Kbyte messages
[Figure: a host and endpoints attached to Ports 0-3 of a RapidIO switch.]
RapidIO® Packet Format: SWRITE
SWRITE Packet Type: 256 Byte Max Packet PDU
[Figure: packet layout, 32 bits per row]
  Physical Layer:  AckID, CRF, Prio, Early CRC (at byte offset 80), CRC
  Transport Layer: tt, Target ID, Source ID
  Logical Layer:   FTYPE (0 1 1 0), Address, XAdd
  Payload:         Packet PDU
Total = 268 Bytes (256 Byte PDU)
RapidIO® Packet Format: Message
Type 11 Packet Type: 256 Byte Max Packet PDU, 4KB w/SAR
[Figure: packet layout, 32 bits per row]
  Physical Layer:  AckID, CRF, Prio, Early CRC (at byte offset 80), CRC
  Transport Layer: tt, Target ID, Source ID
  Logical Layer:   FTYPE (1 0 1 1), Msglength, Sourcesize, Letter, Mbox, Msg seg
  Payload:         Packet PDU
  Padding:         2 Bytes
Total = 268 Bytes (256 Byte PDU)
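The segmentation and reassembly (SAR) that lets a Type 11 message of up to 4 KB travel in packets of at most 256 bytes can be sketched as follows. This is a minimal illustration under stated assumptions, not the RapidIO wire encoding: the function names are hypothetical, and the (index, data) tuples merely stand in for the message-segment field of the real header.

```python
MAX_SEGMENT = 256    # maximum packet PDU, per the slide
MAX_MESSAGE = 4096   # maximum Type 11 message with SAR, per the slide


def segment(message, seg_size=MAX_SEGMENT):
    """Split a message into (segment index, data) pairs of at most seg_size bytes."""
    if len(message) > MAX_MESSAGE:
        raise ValueError("message exceeds the 4 KB Type 11 limit")
    return [
        (offset // seg_size, message[offset:offset + seg_size])
        for offset in range(0, len(message), seg_size)
    ]


def reassemble(segments):
    """Rebuild the message; segments may arrive in any order."""
    return b"".join(data for _, data in sorted(segments))


if __name__ == "__main__":
    msg = bytes(range(256)) * 16          # a full 4 KB message
    parts = segment(msg)
    print(len(parts), "segments")
    print(reassemble(reversed(parts)) == msg)
```

A full 4 KB message therefore occupies 16 maximum-size packets, each paying the per-packet header overhead shown above.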
RapidIO® Packet Format: Data Streaming
Type 9 Packet Type: 256 Byte Max Packet PDU, 64KB w/SAR
[Figure: packet layout, 32 bits per row]
  Physical Layer:  AckID, CRF, Prio, Early CRC (at byte offset 80), CRC
  Transport Layer: tt, Target ID, Source ID
  Logical Layer:   FTYPE (1 0 0 1), Class-of-Service, S, E, Rsvd, xh, O, P, StreamID
  Payload:         Packet PDU
  Padding:         2 Bytes
Total = 268 Bytes (256 Byte PDU)
Comparison
Logical Layer Comparison
Attribute | Ethernet | PCI Express® | RapidIO®
Memory-mapped R/W | No | Read/Write, Configuration | Read/Write, Atomics, Configuration
Write w/Response? | N/A | None | NWRITE_R
Messaging/Datagram | 1500-9000B Payloads | Msg: Cntl/Int Messages; MsgD: User Defined | Type 9: 64KB Payloads; Type 10: Doorbells; Type 11: 4KB Payloads
Channelization | 10-100s (L2 Type, VLAN Tags, UDP/TCP Ports) | 8 (Traffic Class, Virtual Channels) | 4-16M (Type 9: StreamID, CoS; Type 11: mbox/xmbox)
Virtualization | Not Defined | SR-IOV, MR-IOV Specifications | Not Defined
Supported address sizes | N/A | 32, 64-bits | 34, 50, 66-bits
Global Shared Memory | Not Defined | Not Defined | Yes
Transport Layer Comparison
Attribute | Ethernet | PCI Express® | RapidIO®
Topologies | Any | Tree | Any
Peer-to-peer? | Yes | Data only | Yes
Max number of endpoints | 2^48 (L2), 2^32 (IPv4), 2^128 (IPv6) | Large (Address-dependent) | 2^8 (Small), 2^16 (Large)
Multicast | Yes | Msg only (Data defined in new ECN) | Yes
What fields must switches modify? | L2: None; IP: TTL, MAC, FCS | TLP, Seq Num, LCRC | AckID
Delivery | L2: Best Effort; TCP/IP: Guaranteed | Guaranteed | Rev 1.x: Guaranteed; Rev 2.0: +Best Effort
Physical Layer Comparison: Ethernet
(10GBase-KX4, 10GBase-KR and 40GBase-KR4 are Backplane Ethernet)
Attribute | 1000Base-CX | XAUI | 10GBase-KX4 | 10GBase-KR | Future 40G (40GBase-KR4) | Future 100G (802.3ba)
Per port data rate | 1G | 10G | 10G | 10G | 40G | 100G
Per lane baud rate | 1.25G | 3.125G | 3.125G | 10.3125G | 10.3125G | TBD
Signal pairs | 1x | 4x | 4x | 1x | 4x | 10x @ 10G or 4x @ 25G
Channel | 25 m coax | 50 cm board | 100 cm backplane + 2 connectors | 100 cm backplane + 2 connectors | 100 cm backplane + 2 connectors | Short range copper (50cm board?)
Encoding | 8b10b | 8b10b | 8b10b | 64b66b | 64b66b | TBD
Signaling | NRZ PECL, AC Coupled | NRZ, AC Coupled | NRZ, AC Coupled | NRZ, AC Coupled | TBD | TBD
Status | Shipping | Shipping | Emerging | Emerging | Spec in 2009? | Spec in 2009?
Notes | Also SGMII, 1000Base-T, Proprietary | Intended for MAC-PHY | Optional FEC | Pre-emphasis, DFE, optional FEC | |
Physical Layer Comparison: PCI Express®
Attribute | Gen1 | Gen2 | Gen3
Per lane data rate | 2G | 4G | 8G
Per lane baud rate | 2.5G | 5.0G | ???
Signal Pairs | 1x, 2x, 4x, 8x, 12x, 16x, 32x (all generations)
Channel | ~40-50 cm + 2 connectors (all generations)
Encoding | 8b10b | 8b10b | ???
Signaling | Custom, AC Coupled | Custom, AC Coupled | Custom, AC Coupled
Status | Shipping | Emerging | Final spec late 2009?
Notes | | | Products 2010?
Physical Layer Comparison: RapidIO®
Attribute | Rev 1.3 | Rev 2.0 | Future
Data rate | 1.0, 2.0, 2.5 | 4.0, 5.0 | 10G
Baud Rate | 1.25, 2.5, 3.125 | 5.0, 6.25 | TBD
Signal Pairs | 1x, 4x | 1x, 2x, 4x, 8x, 16x | TBD
Channel | ~80-100 cm + 2 connectors | ~80-100 cm + 2 connectors | 100 cm + 2 connectors
Encoding | 8b10b | 8b10b | TBD
Signaling | XAUI, AC Coupled | OIF, AC Coupled | TBD
Status | Shipping | 2010 | 2011?
Notes | | |
Protocol Efficiency
[Chart: efficiency (0% to 100%) versus PDU size (1 to 10000 bytes, log scale) for RapidIO NWRITE, PCI Express MWr, Ethernet L2 and Ethernet UDP.]
NOTE: Includes header & ACK overhead
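A rough feel for how efficiency varies with PDU size can be had from the per-packet overheads implied by the 256-byte totals on the packet-format slides (268, 278, 294 and 324 bytes). The sketch below is a simplification and will not reproduce the chart exactly: it ignores ACK overhead and minimum-size padding, uses the SWRITE figure for RapidIO rather than NWRITE, and the table and function names are hypothetical.

```python
import math

# name: (overhead bytes per packet, maximum PDU per packet), from the slides
PROTOCOLS = {
    "RapidIO SWRITE":  (268 - 256, 256),
    "PCI Express MWr": (278 - 256, 4096),
    "Ethernet L2":     (294 - 256, 1500),
    "Ethernet UDP":    (324 - 256, 1472),   # includes the 2-byte VLAN field
}


def efficiency(pdu, overhead, max_pdu):
    """Payload bytes divided by total bytes, splitting large PDUs into packets."""
    packets = math.ceil(pdu / max_pdu)
    return pdu / (pdu + packets * overhead)


if __name__ == "__main__":
    for size in (16, 64, 256, 1024, 4096):
        row = ", ".join(
            f"{name} {efficiency(size, ovh, cap):.0%}"
            for name, (ovh, cap) in PROTOCOLS.items()
        )
        print(f"{size:5d} B: {row}")
```

Because the per-packet overhead is fixed, small PDUs are dominated by headers, and the protocol with the smallest header wins most clearly at the left edge of the chart.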
Effective Bandwidth
[Chart: bandwidth (0 to 12 Gbps) versus PDU size (1 to 10000 bytes, log scale) for SRIO 4x 3.125G, PCI Express x4, 10G Ethernet: UDP and 1G Ethernet: UDP.]
Includes 8B/10B, header and ACK overhead when present
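The relationship between baud rate, 8B/10B coding and effective bandwidth can be sketched as below. This is an illustration under stated assumptions: the PCI Express x4 link is assumed to run at the Gen1 2.5G rate, ACK overhead is ignored, the per-packet overheads are the ones implied by the 256-byte packet totals on the earlier slides, and the function names are hypothetical.

```python
def data_rate_gbps(lanes, baud_g, data_bits=8, coded_bits=10):
    """Link data rate after line coding (8B/10B by default)."""
    return lanes * baud_g * data_bits / coded_bits


def effective_gbps(rate_gbps, pdu, overhead):
    """Payload bandwidth once per-packet overhead is paid."""
    return rate_gbps * pdu / (pdu + overhead)


if __name__ == "__main__":
    srio = data_rate_gbps(lanes=4, baud_g=3.125)   # SRIO 4x 3.125G
    pcie = data_rate_gbps(lanes=4, baud_g=2.5)     # assumed Gen1 x4
    print(f"SRIO 4x data rate: {srio} Gbps, PCIe x4 data rate: {pcie} Gbps")
    print(f"SRIO 4x with 256 B PDUs: {effective_gbps(srio, 256, 12):.2f} Gbps")
    print(f"PCIe x4 with 256 B PDUs: {effective_gbps(pcie, 256, 22):.2f} Gbps")
```

For Ethernet the quoted 1G and 10G figures are already data rates, so only the header term applies.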
Quality-of-Service (QoS) Dependencies
►QoS depends on proper hooks across the interconnect fabric
  • Hierarchical Flow Control
      Addresses short, medium and long-term congestion events
      Link and end-to-end
  • Ability to define many streams of traffic
      Often defined as a logical sequence of transactions between two endpoints
  • Ability to differentiate classes of traffic among streams
  • Ability to reserve and allocate bandwidth to streams and classes
[Figure: overall interconnect traffic divided into streams, and streams into classes.]
QoS Comparison: Ethernet
►No universal QoS standard
►Many Layer 2+ switches support VLAN Priority Tagging (802.1d/q)
  • Eight classes
►Increasing number of routers support MPLS at L3
UDP Packet Type: 1472 byte User PDU
  Preamble/SFD   8 Bytes
  L2 Header      14 Bytes
  VLAN TCI       2 Bytes (PRIO 3 bits, CFI 1 bit, VID 12 bits)
  IP Header      20 Bytes
  UDP Header     8 Bytes
  User PDU       256 Bytes
  FCS            4 Bytes
  IFG            12 Bytes
Total = 324 Bytes (256 Byte User PDU)
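The 3/1/12-bit split of the 2-byte VLAN field corresponds to the 802.1Q tag control information: a 3-bit priority, a 1-bit CFI and a 12-bit VLAN ID. A minimal sketch of packing and unpacking it follows; the function names are hypothetical.

```python
def pack_tci(priority, cfi, vid):
    """Build the 16-bit tag control field: 3-bit priority, 1-bit CFI, 12-bit VLAN ID."""
    if not (0 <= priority < 8 and cfi in (0, 1) and 0 <= vid < 4096):
        raise ValueError("field out of range")
    return (priority << 13) | (cfi << 12) | vid


def unpack_tci(tci):
    """Return (priority, cfi, vid) from a 16-bit tag control field."""
    return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0xFFF


if __name__ == "__main__":
    tci = pack_tci(priority=5, cfi=0, vid=100)
    print(hex(tci), unpack_tci(tci))
```

The 3-bit priority is what gives the eight classes mentioned above.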
QoS Comparison: PCI Express®
►8 Traffic Classes (TC)
  • No ordering between TCs
►8 Virtual Channels (VC)
  • Separate buffer resources per VC
  • TCs are mapped onto VCs
      TC to VC mapping per port
        – No VC field in TLP
►Flexible arbitration
  • Arbitrary, RR, WRR
►Most implementations support a single TC/VC
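How a per-port TC-to-VC mapping and weighted round-robin (WRR) arbitration fit together can be sketched as below. This is a simplified illustration, not the arbitration mechanism defined by the PCI Express specification: the Port class, its method names and the weights are all hypothetical.

```python
from collections import deque


class Port:
    """One port with its own TC-to-VC map and a queue per virtual channel."""

    def __init__(self, tc_to_vc, vc_weights):
        self.tc_to_vc = tc_to_vc          # traffic class -> virtual channel
        self.vc_weights = vc_weights      # virtual channel -> WRR weight
        self.queues = {vc: deque() for vc in vc_weights}

    def enqueue(self, tc, packet):
        # The packet carries only a TC; the port decides which VC buffers it.
        self.queues[self.tc_to_vc[tc]].append(packet)

    def drain(self):
        """Yield (vc, packet) in WRR order until every queue is empty."""
        while any(self.queues.values()):
            for vc, weight in self.vc_weights.items():
                for _ in range(weight):
                    if not self.queues[vc]:
                        break
                    yield vc, self.queues[vc].popleft()


if __name__ == "__main__":
    port = Port(tc_to_vc={0: 0, 1: 0, 7: 1}, vc_weights={0: 1, 1: 2})
    for tc, pkt in ((0, "a"), (1, "b"), (7, "x"), (7, "y"), (7, "z")):
        port.enqueue(tc, pkt)
    print(list(port.drain()))
```

Since separate buffers exist only per VC, traffic classes mapped to the same VC share a queue, which is why a single-VC implementation offers little differentiation.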
QoS Comparison: RapidIO®
►All implementations must support 3 prioritized flows
  • No ordering between flows
  • Allows shared buffer pool across flows
►Switches required to provide some improved service
  • Extent of improvement is implementation dependent
►Dataplane Extensions add carrier-grade QoS
  • Support for thousands of flows, hundreds of traffic classes
  • End-to-end traffic management
[Figure: endpoints W, X, Y and Z connected through a switch, with Flow 0, Flow 1 and Flow 2 between them.]
Flow Control Comparison
Ethernet
► Link-to-link flow control
  • PAUSE frames
  • 802.1Qbb priority-based flow control (new for DCB)
► L2 Bridge-to-endpoint
  • Leverages VLAN tags
  • Rate limit
  • 802.1Qau congestion notification (new for DCB)
► L3+ end-to-end flow control
  • ECN, TCP windowing, others
PCI Express®
► Link-to-link flow control
RapidIO®
► Link-to-link flow control
► Switch/Endpoint-to-endpoint
  • XON, XOFF
► Fine-grained end-to-end flow control
  • Data Streaming Logical Layer
[Figure: endpoints on two line cards connected through switches across a backplane, annotated with link-level flow control, XOn-XOff congestion control and end-to-end traffic management.]
Software Use Models
► Several data use models supported by high speed interconnects
  • Address-based memory-mapped Read/Write
  • Address-less messaging and datagrams
► Memory-mapped read/write
  • Very efficient but scales poorly beyond a few devices
  • Software moves data using low-level memory-mapped read/writes
      Address range of target device is located
        – Often using a previously constructed structure produced by an initialization and system discovery routine
      Target buffer is allocated within the producer’s space
        – e.g. mmap in Linux
      Data is moved using a bcopy or DMA operation
      When data transfer is complete, producer notifies the consumer
        – Interrupt, memory semaphore etc
        – How SW knows last data committed at consumer can be an issue
► Write w/Response very helpful
[Figure: two endpoints, each with DDR memory, connected through a switch; data writes are followed by a transfer-complete notification.]
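The write-then-notify sequence of the memory-mapped model can be sketched in a single process using an anonymous mapping as a stand-in for the consumer's buffer. Everything here is hypothetical illustration: the header layout (length plus ready flag), the function names and the 4 KB window size are assumptions, and a real fabric would need a mechanism such as a write with response to guarantee that the data has landed before the flag is seen.

```python
import mmap
import struct

HEADER = struct.Struct("<II")   # assumed layout: payload length, ready flag


def produce(window, payload):
    """Copy the data first, then publish the length and set the ready flag."""
    if len(payload) > len(window) - HEADER.size:
        raise ValueError("payload does not fit in the mapped window")
    window[HEADER.size:HEADER.size + len(payload)] = payload   # stand-in for bcopy/DMA
    HEADER.pack_into(window, 0, len(payload), 1)                # notification


def consume(window):
    """Return the payload if the flag is set, else None; clear the flag after reading."""
    length, ready = HEADER.unpack_from(window, 0)
    if not ready:
        return None
    data = bytes(window[HEADER.size:HEADER.size + length])
    HEADER.pack_into(window, 0, 0, 0)
    return data


if __name__ == "__main__":
    window = mmap.mmap(-1, 4096)   # anonymous mapping standing in for the target buffer
    print(consume(window))         # nothing published yet
    produce(window, b"hello")
    print(consume(window))
    window.close()
```

The ordering inside produce is the point of the sketch: the flag is a memory semaphore, and it is only safe if the consumer cannot observe it before the data.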
Software Use Models
► Several data use models are supported by high speed interconnects
  • Address-based memory-mapped Read/Write
  • Address-less messaging and datagrams
► Messaging and datagrams
  • Less efficient but scales well
  • Some abstract service types
      Unreliable connectionless messages
        – Comparable to Ethernet UDP
      Reliable connectionless messages
      Reliable connection-oriented messages
      Reliable connection-oriented byte streams
        – Comparable to Ethernet TCP
  • Software typically calls various underlying APIs supplied by drivers to move data
      Calls abstract underlying interconnect protocols and controllers
        – Allocate(buffer 0)
        – Open(AZ) connection to consumer Z on device A
        – Send(0)
        – Close(AZ) connection to consumer Z on device A
      Notification of arrival at device A handled locally by consumer
[Figure: two endpoints connected through a switch; data writes and notification travel together.]
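The Allocate / Open / Send / Close sequence can be sketched with a local socket pair standing in for the interconnect, which behaves like the reliable connection-oriented byte stream service. The wrapper names are hypothetical and only mirror the abstract calls listed above; they are not any real driver's API.

```python
import socket


def allocate(size):
    """Allocate(buffer): reserve a send buffer."""
    return bytearray(size)


def open_connection():
    """Open(AZ): return connected (producer, consumer) ends of a byte stream."""
    return socket.socketpair()


def send(conn, buffer):
    """Send(buffer): hand the whole buffer to the transport."""
    conn.sendall(buffer)


def close(conn):
    """Close(AZ): tear down one end of the connection."""
    conn.close()


def receive_all(conn, chunk=4096):
    """Consumer side: read until the producer closes the connection."""
    parts = []
    while True:
        data = conn.recv(chunk)
        if not data:
            return b"".join(parts)
        parts.append(data)


if __name__ == "__main__":
    producer, consumer = open_connection()
    buf = allocate(5)
    buf[:] = b"hello"
    send(producer, buf)
    close(producer)
    print(receive_all(consumer))
    close(consumer)
```

Note that the producer never names an address in the consumer's memory; arrival is handled entirely on the consumer side, which is what lets this model scale.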
Software APIs
►Many interconnect services
►APIs attempt to abstract underlying interconnect protocols
►Many such APIs have been defined
[Figure: software stack. Applications sit on Sockets (TCP and UDP over IP), RDMA, TIPC, shared memory, discovery & initialization and proprietary services, layered on a low-level hardware device driver above the interconnect hardware.]
Software API Comparison
► Ethernet
  • Many services and APIs are supported for many OS environments
      Sockets
      RDMA
      TIPC
      Others…
► PCI Express®
  • Memory-mapped read/write only
      Linux
        – /dev to locate device
        – mmap() to open buffer at consumer
      Proprietary services
► RapidIO®
  • Many proprietary services on Read/Write and messaging
  • Power Architecture™ processors
      rionet
        – Ethernet network stack using RapidIO messaging as packet transport
        – Work on optimizations using R/W transport
  • DSP
      FSL SmartDSP
        – RapidIO R/W with DMA API
        – Ethernet over Messaging
      TI DSP/BIOS
        – RapidIO Message Queue Transport (MQT)
Software/Hardware Interface
►High-bandwidth interconnects require low CPU overhead usage model
  • Hardware support for logical, transport and link layer
  • Low overhead DMA with QoS support
[Figure: software/hardware split. Ethernet: the application uses the Sockets API over a software driver stack (TCP/IP, UDP/IP with SAR/Err, or Layer 2 with SAR/Err) above MAC and DMA hardware. PCI Express® (Rd, Wr) and RapidIO® (SWRITE, MSG, Streaming): application and client software sit above hardware logical, transport and link/PHY layers with DMA.]
Ethernet Performance
►Microsecond+ fall through latencies (~100us?)
  • Not just the hardware, data has to traverse the SW stack
►High CPU overhead
  • Rule of thumb appears to be borne out in data for TCP/IP termination SW overhead
      1 Hz of CPU per bit per second of throughput (per direction)
  • Wire speed achievable with GHz class processors
      Some CPU will be left but how much depends on
        – Protocol being terminated
        – Offload features of GigE interfaces
  • Too often advanced off-load features cannot be leveraged
      OS & SW stack support issues
►UDP or MAC/Layer 2 solutions sometimes use proprietary higher layer protocols
  • Can defeat the value of off-the-shelf “standards-based” solution
►Error correction at endpoint stacks introduces latency jitter and determinism issues
►Works well when application requires < 30% fabric utilization
  • Lack of flow control problematic for systems that can’t significantly overprovision
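The 1 Hz per bit-per-second rule of thumb for TCP/IP termination translates directly into a CPU budget. The sketch below only restates that rule as arithmetic; the function names and the example clock speed are hypothetical, and real overhead depends on the protocol and offload features noted above.

```python
def cpu_hz_for_tcp(throughput_bps, directions=1, hz_per_bps=1.0):
    """CPU cycles per second consumed by TCP/IP termination under the rule of thumb."""
    return throughput_bps * directions * hz_per_bps


def cpu_fraction(throughput_bps, cpu_hz, directions=1):
    """Share of one processor consumed; above 1.0 means the CPU cannot keep up."""
    return cpu_hz_for_tcp(throughput_bps, directions) / cpu_hz


if __name__ == "__main__":
    gige = 1e9
    print(f"1.5 GHz CPU, GigE one direction:   {cpu_fraction(gige, 1.5e9):.0%}")
    print(f"1.5 GHz CPU, GigE both directions: {cpu_fraction(gige, 1.5e9, directions=2):.0%}")
```

Under this rule a GHz-class processor can reach GigE wire speed in one direction, but little is left for the application.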
PCI Express® & RapidIO® Performance
►Latency
  • Sub-microsecond switch latencies
  • PCI Express switches must manage address mapping
►End-to-end latency
  • Lower latency than Ethernet since latency does not include a SW stack
►Architecture
  • PCI Express switches allow limited peer-to-peer communication
      Multiple hosting for redundancy problematic
      Maintenance responses as well as wake-up beacons must move upstream
      Some switches support non-transparent bridging
        – Create two separate spaces for each host
        – Non-standard and implementation specific
      Must collect INTx messages and some power management transactions
►RapidIO switches straightforward and orthogonal in architecture
  • Strict peer-to-peer
  • Packet headers architected to reduce logic
  • No need to recalculate CRC
Some Economics
►RapidIO®, PCI Express® and Ethernet with modest TCP/IP offload have similar underlying silicon costs
  • PCI Express controller is slightly larger than RapidIO
  • Aggressive TCP/IP Offload engine larger than PCI Express and RapidIO endpoints
►Interesting fact about switches
  • Available established vendors for all three interconnects similar: 2-3
►Leveraging Ethernet volume economics not always a reality
  • L2+ Ethernet switches suitable for aggregation and backplanes are not high volume
      16-24 ports, QoS features and SERDES PHYs for backplane
      12-16 ports, QoS features for aggregation
  • Terminating TCP/IP imposes significant processor overhead
      Dedicate processor or reduce performance and/or application features
Summary and Conclusion
Summary by Attribute
Attribute Ethernet PCI Express® RapidIO®
Low latency SW Stack
QoS: Channelization w/Flow Control VLAN, No FC 1-2 VC Avail.
Multicast (w/o New ECN)
Virtualization Support
High availability (hot plug, multiple hosting)
Large number of endpoints Tree/Bridge
Low CPU overhead Limited TOE
QoS: Flow Control Link Only Link Only
QoS: Jitter SW Stack
Peer-to-peer for data
Peer-to-peer for management (w/o MR-IOV)
High port bandwidth (>10G)
Commodity off-the-shelf endpoints (graphics, NICs, HBAs, etc)
Good Fit Marginal Poor Fit
Summary Conclusion
► Ethernet ubiquitous as LAN-scale interconnect
  • GigE ubiquitous, 10G Ethernet will segment market for first time
  • Broad endpoint silicon and software support
  • Useful in low bandwidth embedded applications
► PCI Express® widely deployed in PC/Server space
  • Significant role in the embedded space
      Where there is an intersection with the PC & Server space
      Where PCI has been used
  • Backplane interconnect role in the embedded space will be limited
      Unwieldy when connecting large numbers of endpoints
      Similar switch ecosystem to RapidIO® and Ethernet
  • Broad switch, IP and endpoint ecosystem
► RapidIO deployed with growing ecosystem
  • Expanding from initial Military/Aero, DSP and line card aggregation role
  • Best positioned for multicore applications
  • Will gradually expand role onto the backplane
      Efficient protocol supporting both control and data plane
      Variety of PHY speeds
  • Cost competitive against 1G and 10G Ethernet
  • Established and diverse ecosystem