Walking a Tight Rope with Resilient Packet Rings
— A Startup Story
Nirmal Saxena
7 August 2006
Stanford CRC
Outline
• Beginning
  – Metropolitan Area Network Gap
  – Chip Engines
• Betting on Resilient Packet Rings
• Adaptive Computing Approach
• Accomplishments & Contributions
• Summary
Compelling Case for MAN – Early 2000
[Figure: Metro Bandwidth Gap. Enterprise LANs at the edge run at 10/100/1000 Mbps and wide area core networks run at 10/40 Gbps, while the metro connection between them is a T1 at 1.54 Mbps.]
Chip Engines – The Company
Company Focus
• Semiconductor Solutions
  – Standards Based
  – Metropolitan Area Networks
Founded November 2000
• Babu Chilukuri, Founder
Venture Funding March 2001
• Alliance Ventures
  – $8.0 Million
• Affiliate & Private Investments
  – $2.5 Million
One Small Chip for MAN
One Giant Pipe for MANkind
- Chip Engines, Inc. 2001
Chip Engines, Inc., Sunnyvale, USA
• Sales & Marketing
• R&D Specification
• Customer Support
• Employees: 20
Chip Engines, Inc., Hyderabad, India
• Project Implementation
• Customer Support
• Employees: 60
Initial Product Focus – June 2001
Multi-Transport Support
• 10G Ethernet
• Utopia/POS Level 2/3
• SONET (up to OC-192)
Wire-Speed (40 Gbps Rate)
• 10G Ethernet
• OC-192
Standards
• MPLS / DiffServ
• RPR (IEEE 802.17)
• IEEE QoS and VLAN
[Figure: Multi-Service Packet Management Engine block diagram. Labelled blocks: Port A, Port B, CPU/PCI interface, 256 MB DDR SDRAM packet memory (two instances), 8/16 MB ZBT-RAM (two instances), L3 lookup/classification interface, and CSIX switch fabric interface (two instances).]
Positioning for RPR – August 2001
Multi-Service Packet Management Market Diligence
– Muted Response from Customers (MPLS & TM)
– Highest Execution Risk & Competition from NPUs
Position Chip Engines Solution as “The RPR Solution”
– Active Role in 802.17
– First To Market 10G RPR MAC (Silicon & Software)
IEEE 802.17 WG Presentations
[Chart: number of working group presentations (vertical axis 0 to 70) over time (horizontal axis in quarterly steps from Mar-00 to Jun-04).]
RPR Market Predictions – July 2001
$13B RPR Equipment Market Growth by 2003
– Source: RHK
RPR Semiconductor Market > $175M by 2005
– Source: Gartner Group
IEEE 802.17 Working Group Timeline
RPR – A Short Tutorial
Ring Technologies
• IEEE 802.5 Token Ring
  – Source Based Ring Ownership Through Tokens
  – No Spatial Reuse
• FDDI Rings
  – ANSI Standard
  – 100 Mbps Token Ring
• SONET Rings
  – Redundant Rings for Protection
    • Standby Redundancy
    • Redundancy with Mirrored Traffic
  – TDM Based Data Transport
  – Spatial Reuse through Add-Drop-Multiplexing (ADM)
SONET Rings
[Figure: six add-drop multiplexers (ADMs) connected by a transport ring and a protection ring. The rings carry TDM slots, some of which are empty.]
SONET Ring Features
• Guaranteed Bandwidth
  – Bounded Delay & Jitter
• TDM Efficiency
• Resiliency & Failure Protection
  – Service Restoration in Less Than 50 ms
• Resource Utilization
  – Unused Slots Wasted
• TDM Inefficiency
  – Redundancy
    • Underutilized Bandwidth
• Longer Time to Provision
• High Entry Cost to Customer
Packet Transport in RPR
[Figure: four stations (Nodes A, B, C, D), each with one MAC per ringlet, joined by ringlet 0 and ringlet 1. Data and control packets are labelled per ringlet (Data 0, Control 0, Data 1, Control 1). Node A generates a unicast packet (DA=B, SA=A, Payload), which Node B strips. Node D generates a multicast packet (MA, SA=D, Payload), which Node D itself strips.]
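The stripping rules in the packet-transport figure can be sketched in a few lines of Python. This is an illustrative model only, not the 802.17 datapath: the `Frame` class, the `circulate` function, and the station order in `RING` are hypothetical names and assumptions. It shows a unicast frame being removed by its destination and a multicast frame being removed by its source once it has gone around.

```python
from dataclasses import dataclass
from typing import Optional

RING = ["A", "B", "C", "D"]  # assumed station order along one ringlet


@dataclass
class Frame:
    sa: str                 # source station address
    da: Optional[str]       # destination station, or None for multicast
    payload: str


def circulate(frame, ring=RING):
    """Return (stations that copied the frame, station that stripped it)."""
    start = ring.index(frame.sa)
    copied = []
    for hop in range(1, len(ring) + 1):
        station = ring[(start + hop) % len(ring)]
        if station == frame.sa:
            # The frame came all the way around: the source strips it.
            return copied, station
        if frame.da is None:
            copied.append(station)       # multicast: copy and keep forwarding
        elif station == frame.da:
            return [station], station    # unicast: the destination strips it
    raise AssertionError("unreachable: the loop always ends at the source")


if __name__ == "__main__":
    print(circulate(Frame(sa="A", da="B", payload="unicast")))
    print(circulate(Frame(sa="D", da=None, payload="multicast")))
```

Because the unicast frame leaves the ring at Node B, the spans from B onward stay free for other traffic, which is the spatial reuse that token rings lack.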
RPR MAC Structure
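[Figure: RPR MAC datapath between the ringlet ports and the RPR client: a high-priority (HP) transit buffer, a low-priority (LP) transit buffer, a receive buffer, a transmit buffer, an ingress scheduler, and an egress block. Port labels include Ringlet 0 In, Ringlet 0 Out, and Ringlet 1 Out.]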
Bandwidth & Fairness Management
• Transit Traffic
  – Source Node to Destination Node
• Transmit Traffic
  – Generated By Local Station
  – Managed by RPR Client
• Monitoring Bandwidth Usage
  – Control Packet Processing & Generation
• Unused Bandwidth
  – Fair Allocation of Unused Bandwidth
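One way to picture "fair allocation of unused bandwidth" is a max-min (water-filling) split. The sketch below is an assumption made for illustration, not the IEEE 802.17 fairness algorithm. The function `fair_share` and its parameters `link_rate`, `reserved`, and `demands` are hypothetical names.

```python
def fair_share(link_rate, reserved, demands):
    """Split the unused bandwidth (link_rate - reserved) max-min fairly.

    demands maps each station name to the extra bandwidth it would like.
    All rates are in the same unit, for example Gbps.
    """
    unused = link_rate - reserved
    alloc = {station: 0.0 for station in demands}
    active = {s for s, d in demands.items() if d > 0}
    while active and unused > 1e-9:
        share = unused / len(active)
        # Stations whose remaining demand fits inside an equal share.
        satisfied = {s for s in active if demands[s] - alloc[s] <= share}
        if not satisfied:
            # Everyone wants more than an equal share: split evenly and stop.
            for s in active:
                alloc[s] += share
            unused = 0.0
            break
        for s in satisfied:
            unused -= demands[s] - alloc[s]
            alloc[s] = demands[s]
        active -= satisfied
    return alloc


if __name__ == "__main__":
    # 10 units of link rate, 4 reserved, so 6 units are unused.
    print(fair_share(10, 4, {"A": 1, "B": 4, "C": 4}))
```

In the example, station A asks for less than an equal share and receives all of it; the remainder is then split evenly between B and C.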
Topology Discovery
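[Figure: an rpr_discovery control packet circulating through Nodes A, B, C, and D. It leaves Node A carrying Stn_A_SA, and each station adds its own address, so the list grows to Stn_A_SA, Stn_B_SA, then Stn_A_SA, Stn_B_SA, Stn_C_SA, and finally Stn_A_SA, Stn_B_SA, Stn_C_SA, Stn_D_SA.]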
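The accumulate-as-you-go behaviour in the topology discovery figure can be sketched as follows. This models only the picture on the slide, under the assumption that each station appends its source address before forwarding; it is not a rendering of the ratified 802.17 topology protocol. The function name `discover` and the dictionary layout are hypothetical.

```python
def discover(ring, origin):
    """Circulate a discovery packet from origin around the ring.

    Returns the address list carried by the packet after each station
    has handled it, starting with the originating station.
    """
    start = ring.index(origin)
    packet = {"type": "rpr_discovery", "stations": [f"Stn_{origin}_SA"]}
    snapshots = [list(packet["stations"])]
    for hop in range(1, len(ring)):
        station = ring[(start + hop) % len(ring)]
        packet["stations"].append(f"Stn_{station}_SA")  # station adds itself
        snapshots.append(list(packet["stations"]))
    return snapshots


if __name__ == "__main__":
    for view in discover(["A", "B", "C", "D"], origin="A"):
        print(view)
```

When the packet returns to the originator, the final list gives the ring order as seen from that station, which is what a station needs in order to pick the shorter ringlet for each destination.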
Topology Discovery Events
• Periodic Discovery Transmission
  – Multicast or Unicast Control Packets
  – Convergence Time Related to
    • Ring Span
    • Number of Nodes and Link Speed
• New Station Insertion
  – Provisioning New Nodes on the Ring
• Failure Events
  – Node Failures
  – Link Failures
  – Relearn Topology for Efficient Routing
Failure Protection – Wrapping
Failure Protection – Steering
[Figures: a four-node dual ring (Nodes A, B, C, D, with one MAC per ringlet at each node) with a failed link, drawn once for wrapping and once for steering. Frames are labelled DA / SA / Payload, for example DA=D SA=A, DA=A SA=D, DA=D SA=B, DA=D SA=C, DA=B SA=C, and DA=C SA=C.]
Wrapping versus Steering
• Wrapping
  – Minimizes Short-Term Packet Loss
  – Excessive Link Bandwidth is Consumed
    • Could Impact Delay and Jitter Requirements for HP Traffic
  – Expected to Be Transient State
  – Could Cause Out-Of-Order Packets
• Steering
  – Packets Over Failed Link Get Dropped
    • Until Source Stations are Notified of Failure
  – Minimizes Long-Term Packet Loss
  – Helps Meet Delay & Jitter Requirements for HP Traffic
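The bandwidth cost of wrapping compared with steering can be made concrete with a small path calculation. The sketch below makes several assumptions for illustration: a four-station ring in the order A, B, C, D, a single failed link, and a wrapped frame that is delivered only after it returns to its original ringlet. The helper names `path`, `failed_hop`, `steer`, and `wrap` are hypothetical.

```python
RING = ["A", "B", "C", "D"]  # assumed station order along ringlet 0


def path(src, dst, direction, ring=RING):
    """Stations visited from src to dst; +1 is ringlet 0, -1 is ringlet 1."""
    if dst not in ring:
        raise ValueError("destination is not on the ring")
    i = ring.index(src)
    hops = [src]
    while ring[i] != dst:
        i = (i + direction) % len(ring)
        hops.append(ring[i])
    return hops


def failed_hop(hops, failed):
    """Index of the first hop that crosses the failed link, or None."""
    for k, (a, b) in enumerate(zip(hops, hops[1:])):
        if {a, b} == set(failed):
            return k
    return None


def steer(src, dst, failed, ring=RING):
    """Source picks the ringlet whose path avoids the failed link."""
    for direction in (+1, -1):
        hops = path(src, dst, direction, ring)
        if failed_hop(hops, failed) is None:
            return hops
    return None


def wrap(src, dst, failed, direction=+1, ring=RING):
    """Frame stays on its ringlet and is turned around at the failure."""
    hops = path(src, dst, direction, ring)
    k = failed_hop(hops, failed)
    if k is None:
        return hops
    near, far = hops[k], hops[k + 1]
    detour = path(near, far, -direction, ring)  # the long way around
    return hops[:k] + detour + hops[k + 2:]


if __name__ == "__main__":
    print("wrap :", wrap("A", "C", failed=("B", "C")))
    print("steer:", steer("A", "C", failed=("B", "C")))
```

For a frame from A to C with the B to C link down, the wrapped path is A, B, A, D, C (four hops), while the steered path is A, D, C (two hops). The extra hops are the excess link bandwidth that wrapping consumes, and steering only helps once the source knows about the failure.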
Resilient Packet Rings
Features                                     SONET   Ethernet   RPR
Fair Access to Ring Bandwidth                                    √
Bandwidth Efficiency of Dual Ring Topology                       √
Ethernet Like Economics                                 √        √
Controlled Latency and Jitter                  √                 √
50 ms Ring Protection                          √                 √
Efficiency for Data Traffic                             √        √
RPR Status – August 2001
• Legacy Protocols
  – Cisco SRP Protocol
  – Nortel oPE Protocol
  – Luminous Networks RPT Protocol
  – Lantern Networks RPR (Acquired by C-COR)
• IEEE 802.17 Working Group Open Issues
  – RPR Packet Frame Formats
  – Fairness & Bandwidth Management Algorithms
  – Topology Discovery & Failure Protection Mechanisms
Adaptive Computing
Chip Engines’ Approach
Adaptive Computing Systems (ACS)
[Figure: an ACS built from processors, embedded memory, a reconfigurable engine, custom logic, and input/output devices.]
DARPA Sponsored Effort – 1997 thru 2001
Systems Comprising
– Stored Program Processors
– Reconfigurable Logic
– Embedded Memory
– Custom Logic
– Input and Output Devices
Application Adaptation
• Partitioning Influenced By
  – Performance
  – Cost
  – Dependability
Xilinx Virtex-II Pro FPGA – An ACS Example
[Figure: Xilinx Virtex-II Pro floorplan with two PPC405 cores, a BRAM column, a CLB column, Rapid I/O SerDes, and a digital clock module.]
Chip Engines RPR Silicon Challenges
• Flexibility in RPR?
  – IEEE 802.17 – An Emerging Standard
  – Legacy Protocol Support
  – Field Upgrade Support
  – New Applications
• Challenge Problem
  – What is the Right Recipe?
  – Meets Cost, Performance, and Scalability
Network Processors – Wire Rate Processing?
[Chart: 32-bit frame check sequence bandwidth (Gbps, axis 0 to 6) against clock rate (500 MHz, 1 GHz, 1.5 GHz, 2 GHz), for a 256-entry lookup table and a 65,536-entry lookup table.]
[Chart: Lempel-Ziv compress bandwidth (Mbps): 160 for a 20 MHz FPGA (Stanford CRC) and 12 for a 1000 MHz processor.]
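The lookup-table approach behind the frame check sequence chart can be sketched in software. The example below assumes the IEEE 802.3 CRC-32 polynomial in its reflected form and a 256-entry table that consumes one byte per lookup; the function names `make_table` and `crc32_table` are hypothetical, and the code says nothing about how any particular network processor implemented it.

```python
import zlib

POLY = 0xEDB88320  # IEEE 802.3 CRC-32 polynomial, bit-reflected form


def make_table(poly=POLY):
    """Precompute the CRC of every possible byte value (256 entries)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32_table(data, table=TABLE):
    """Byte-at-a-time CRC-32: one table lookup per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ table[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    frame = b"123456789"
    print(hex(crc32_table(frame)))
    print(crc32_table(frame) == zlib.crc32(frame))  # cross-check with zlib
```

A 65,536-entry table follows the same idea but consumes 16 bits per lookup, halving the number of lookups per frame at the cost of a table 256 times larger. In either case a processor still spends at least one memory access per table step, which is why the chart questions wire-rate processing.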
FPGA Reconfigurability
Benefits
• Provides Maximum Flexibility
• Addresses Large Application Space
  – Through Semantically Complete (But Not Rich) Primitives
• Faster Time To Market
• Opportunities
  – Functionality Upgrade & Bug Fixes
Issues
• Non-Determinism
  – Area & Speed
• Cost
Adaptive Computing Heuristics
Study & Bound Application Flexibility
• Identify Building Blocks (Richer Primitives)
• Yields Most Salutary Results
  – Lowest Silicon Cost for Given Flexibility
  – Highest Performance
Linear Increments in Silicon Area
• Exponential Increments in Flexibility
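Semantically Rich Primitives – Benefits
[Figure: three FCS-32 implementations compared by silicon area: FPGA reconfigurable logic built from an array of CLBs (any FCS-32 polynomial), adaptive logic built from stages F0, F1, F2, ..., Fi, ..., F31 fed by Data In (any FCS-32 polynomial), and fixed logic (one FCS-32 polynomial). Dimension labels include 454 microns and 577 microns; the remaining labels read 53, 6739, and 50.]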
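The idea of logic that supports any FCS-32 polynomial, rather than one fixed polynomial, has a simple software analogue: a bit-serial CRC whose feedback taps are a parameter. The sketch below is an illustration under that assumption; treating the 32 bits of `poly` as the counterparts of the stages F0 to F31 in the figure is a guess, and `crc32_bitwise` is a hypothetical name.

```python
def crc32_bitwise(data, poly, init=0xFFFFFFFF, xor_out=0xFFFFFFFF):
    """Bit-serial reflected CRC-32 with a caller-supplied polynomial.

    Each set bit of poly enables one feedback tap of the 32-bit shift
    register, so changing poly changes the polynomial without touching
    the structure of the loop.
    """
    crc = init
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ xor_out


if __name__ == "__main__":
    msg = b"123456789"
    print(hex(crc32_bitwise(msg, 0xEDB88320)))  # IEEE 802.3 CRC-32
    print(hex(crc32_bitwise(msg, 0x82F63B78)))  # CRC-32C (Castagnoli)
```

The loop is identical for both polynomials; only the 32-bit tap word differs. That is the sense in which a richer primitive can cover every FCS-32 polynomial without the generality, and the area, of a CLB array.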
Fairness & Bandwidth Control Engine
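[Figure: area comparison for the fairness and bandwidth control engine. Labels: FPGA Reconfigurable Logic, Adaptive Control Engine, and Virtex-II Pro with two PPC405 cores. Dimension labels include 6.9 mm, 8.8 mm, 1.9, 2.5, 3.3, and 4.0.]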
Adaptive Computing – Chip Engines’ Approach
[Figure: ASICs, FPGAs, Virtex II Pro, AS95L210x, processors, and network processors positioned among fixed, reconfigurable, and programmable silicon, with these attributes:]
• Fixed Silicon (ASICs): Low Flexibility, High Parallelism, High Clock Rate, Low Cost
• Reconfigurable Silicon (FPGAs): Hardware Flexibility, High Parallelism, Medium Clock Rate, High Cost
• AS95L210x: Application Specific Flexibility, High Parallelism, High Clock Rate, Low Cost
• Programmable Silicon (Processors, Network Processors): Software Flexibility, Medium Parallelism, High Clock Rate, Low Cost
Technology Validation & Demonstration
Jan 2002 - Aug 2002
Pre-Silicon Validation
[Figure: FPGA prototype with four FPGAs (FPGA1 to FPGA4), ZBT-RAM, two 1 Gbps MACs, and an OC-48c SONET interface.]
3-Node RPR Demo System – May 2002
FPGA Prototype Contributions
• Design Verification Acceleration
  – Software Stack & RTL
• Single Node RPR Functionality Demonstration
  – Gigabit Ethernet (Jan 2002)
  – OC-48 SONET Ring (March 2002)
• Three Node RPR Functionality Demonstration
  – IEEE 802.17 Meeting, Ottawa, Canada, May 2002
  – Topology, Fairness, Protection Protocols
• Inter-Operability Testing with Cisco Boxes (Aug 2002)
• Active Feedback to the IEEE 802.17 Working Group
  – Draft 2.3 through Draft 2.7 (July - Nov 2003)
Stacking Against Odds – Aug 2002
RPR Market Did Not Take Off As Anticipated
Standard Ratification Delay
• Planned Schedule March 2003
• Actual Ratification June 2004
Slowdown in Venture Funding
UMC (Fabrication House) Abandons 0.13u Low-K Process
– Initial Technology Choice of Chip Engines’ Silicon
– Invalidates External Vendor Circuit-IP
Events that Followed – Sep 2002
Alliance Semiconductor Acquires Chip Engines
– Product Portfolio Expansion
– New Metro Market Penetration
– IP Leverage for Future Roadmap Products
First RPR Silicon (Sep 2003)
– Tapeout June 2003
– CAD Tool License Renegotiation
– Technology Transfer from UMC 0.13u to TSMC 0.13u
– Redesign of Circuit-IP
Post-Silicon Validation
RPR Silicon Summary
High Level of Integration
– One to Four RPR MAC Processors
– OC-12 to OC-192
– 10G Ethernet
Flexibility & Support
– Packet Formats, Fairness, Topology Algorithms
– Steer or Wrap Protection
– Ring or Framer Mode
– Single or Dual Line Card
Performance & Cost
– ASIC Like Determinism