40G Signal Tap (sniffer) – Yearly Project
Intel: LAN Access Division
Technion: High Speed Digital Systems Lab
By: Leonid Yuhananov & Asaad Malshy
Supervised by: Dr. David Bar-On
We want to tap onto 40G traffic and present it in a useful way.
Tap: Listen to the link.
◦ Sniff the data transmitted on the line.
Present: View the data on a logic analyzer.
◦ Parse the data into Ethernet II frames.
Useful: Easy to read and good for debug.
◦ Only the frames we are interested in are presented.
Versatile: Highly configurable for debug purposes.
◦ We can configure our Rx path to suit our needs at the bit level.
Goal: “Tracing 40Gbit Ethernet on a logic analyzer”
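To make the "Present" step concrete, here is a minimal Python sketch of splitting captured bytes into Ethernet II header fields. It is only an illustration, not the project's HDL: the function name and the sample frame are hypothetical, and it assumes the preamble and SFD are already stripped and that the FCS, if present, is simply left in the payload.

```python
import struct

def parse_ethernet_ii(frame: bytes) -> dict:
    """Split a raw Ethernet II frame into header fields and payload."""
    if len(frame) < 14:
        raise ValueError("frame shorter than the 14-byte Ethernet II header")
    # 6-byte destination MAC, 6-byte source MAC, 2-byte big-endian EtherType.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "payload": frame[14:],
    }

# Hypothetical frame: broadcast destination, made-up source, EtherType 0x0800 (IPv4).
example = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"\x45\x00"
fields = parse_ethernet_ii(example)
print(fields["dst"], fields["src"], hex(fields["ethertype"]))
```

On the logic analyzer the same split would appear as labeled bus groups rather than dictionary keys.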
In the ingress direction:
◦ 4 x 10G optical lines in differential operation mode.
◦ Representing an IEEE 802.3 40GbE link.
In the egress direction:
◦ 34x4 channels to the logic analyzer.
Display:
◦ Output will be displayed on the logic analyzer in Ethernet II frame structure.
◦ Trigger indication.
Project Definition – Preliminary
Due to HW limitations, some of the requirements were revised while maintaining the project’s quality.
In the ingress direction:
◦ 10GBase-R optical line in differential operation mode.
◦ Representing an IEEE 802.3 10GbE link.
◦ Highly configurable PHY.
◦ Dual pipelined data path: one for the frames and one for the trigger.
◦ Low latency.
In the egress direction:
◦ Top clock speed the Altera device can generate: 625 MHz.
◦ 18-bit-wide bus.
◦ External LA sync.
Display:
◦ Output will be displayed on the logic analyzer in Ethernet II frame structure.
◦ Trigger indication.
Project Definition – Revised
High Level Block Diagram – Initial
[Block diagram. Blocks shown: SFP+ optical modules (x2, two sets); transceiver channels AEL2005 (10.3125G x2) and AEL2006 (10.3125G x2); ALTERA AltGx (4x XAUI x 3.125G); ALTERA 10GBase-R PHY; 10G word aligner x4 (72 lines x 156.25M x4); 40G words aligner; ALTERA to DDR frequency multipliers; Logic Analyzer; FPGA.]
Due to HW limitations and optimization requirements, several Design Change Requests (DCRs) were addressed.
◦ We used only one transceiver.
◦ We used only the faster, more configurable AEL2006.
◦ The implementation used the more up-to-date 10GBase-R protocol.
◦ There was no need for a 40G aligner, but our 10G aligner was implemented to support future expansion.
◦ We expanded our trigger mechanism.
◦ Our logic analyzer output is more elegant and supplies more information.
DCRs
High Level Block Diagram – Revised
[Block diagram. Blocks shown: SFP+ optical modules (x2); transceiver channels AEL2006 (10.3125G x2); ALTERA 10GBase-R PHY; 10G word aligner (72 lines x 156.25M); ALTERA to DDR frequency multipliers; Logic Analyzer; MDIO writer (PHY configuration block); trigger detection; SYNC signal; FPGA.]
SFP+ (optical module): converts the optical signal to an electrical one.
Transceiver channel:
◦ AEL2006: converts data to 10GBase-R 10.3125G traffic (detailed information is internal).
MDIO writer: writes configurations to the AEL, making our PHY highly configurable.
ALTERA 10GBASE-R PHY: converts the 10.3125G traffic to 72 lines (64 data and 8 control).
10G word aligner: our logic to align data and generate triggers (as defined in the midterm presentation).
Trigger detection block: detects the word we want to trigger on, so that the data following it is captured.
Altera to DDR frequency block: multipliers that reduce the number of lines to the logic analyzer by increasing their speed.
Logic analyzer:
◦ Our device for viewing the captured data.
◦ A sync signal supplied by the Altera is used to sync the device.
Description of main blocks.
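As an illustration of what an MDIO writer has to produce, the sketch below builds the 32-bit frame bodies for one register write. It assumes the PHY is addressed with IEEE 802.3 Clause 45 framing, which the slides do not state; the function names, port address, device address, register, and value are all hypothetical and are not taken from the AEL2006 register map.

```python
def mdio_c45_frame(op: int, prtad: int, devad: int, value: int) -> int:
    """Build the 32-bit body of a Clause 45 MDIO frame (preamble not included).

    op: 0b00 = address, 0b01 = write, 0b11 = read, 0b10 = post-read-increment.
    """
    if not (0 <= op < 4 and 0 <= prtad < 32 and 0 <= devad < 32
            and 0 <= value < (1 << 16)):
        raise ValueError("field out of range")
    st = 0b00  # Clause 45 start-of-frame
    ta = 0b10  # turnaround as driven by the station on address and write frames
    return ((st << 30) | (op << 28) | (prtad << 23) | (devad << 18)
            | (ta << 16) | value)

def mdio_write(prtad: int, devad: int, reg: int, value: int) -> list:
    """A register write is an address frame followed by a write frame."""
    return [mdio_c45_frame(0b00, prtad, devad, reg),
            mdio_c45_frame(0b01, prtad, devad, value)]

# Hypothetical values for illustration only.
frames = mdio_write(prtad=0, devad=1, reg=0xC001, value=0x0428)
print([f"{f:08x}" for f in frames])
```

In hardware each frame would be shifted out MSB first on MDIO, after a preamble of 32 ones, one bit per MDC clock.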
Transceiver channels
Puma AEL2006: 10GbE Dual CDR w/EDC
◦ Passes 10G HSRXDATA from the SFP+ on as 10G RXDATA to the 10GBase-R PHY.
NetLogic Microsystems' Puma AEL2006 device is a dual physical layer retimer compliant with the IEEE 802.3aq specification. It consolidates the receiver and transmitter SerDes functions on a single chip, along with on-chip clock drivers, multiple loopback features, and PRBS generation and verification for both the line side and the system side.
ALTERA 10GBase-R PHY
10GBase-R PHY: an Altera megafunction block, used to convert 10G RXDATA to 8 bytes of data and 8 control bits.
SDR XGMII = single data rate XGMII, 72 bits @ 156.25 MHz
10GBASE-R PCS, 10.3125-Gbps physical medium attachment (PMA), PHY management functions
10GBASE-R PHY functions:
◦ 64b/66b encoding/decoding
◦ scrambling/descrambling
◦ 66b/16b gear-boxing
◦ data serialization/deserialization
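The rates quoted on this slide fit together through simple arithmetic, worked out below as a short Python check. The variable names are ours; the only inputs are the 64 data bits, 8 control bits, and 156.25 MHz clock of the SDR XGMII bus and the 66/64 overhead of 64b/66b encoding.

```python
from fractions import Fraction

XGMII_DATA_BITS = 64                      # data bits per SDR XGMII word
XGMII_CTRL_BITS = 8                       # one control bit per data byte
XGMII_CLOCK_MHZ = Fraction(15625, 100)    # 156.25 MHz, kept exact

# Data rate on the XGMII side, in Mb/s.
payload_mbps = XGMII_DATA_BITS * XGMII_CLOCK_MHZ
# Serial line rate after 64b/66b encoding adds 2 bits per 64.
line_mbps = payload_mbps * Fraction(66, 64)
# Total width of the parallel bus leaving the PHY.
bus_width = XGMII_DATA_BITS + XGMII_CTRL_BITS

print(float(payload_mbps), float(line_mbps), bus_width)
```

The results are 10,000 Mb/s of data, a 10,312.5 Mb/s line rate (the 10.3125G figure), and a 72-bit bus.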
Alignment FPGA blocks
10G alignment logic (detailed description in part 1)
◦ Rearranges data coming from the 10GBASE-R PHY.
◦ Aligns data to the beginning of the packet.
◦ Triggers on a matched packet (hard coded).
◦ Contains an FSM, rewiring blocks, and trigger-capturing FSMs.
[Diagram: 72 bits @ 156.25M, not aligned → 10GBASE-R PHY alignment logic (x2) → 72 bits @ 156.25M, aligned]
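The idea of aligning data to the beginning of the packet can be sketched in Python as follows. This is a batch model under stated assumptions, not the project's FSM: each XGMII word is modeled as 8 data bytes plus an 8-bit control mask (our own representation), the frame start is recognized by the XGMII Start control character 0xFB, and the tail is padded with Idle (0x07) only to keep the example self-contained.

```python
START, IDLE = 0xFB, 0x07  # XGMII Start and Idle control characters

def align_to_start(words):
    """Re-slice XGMII words so the first Start character sits in lane 0.

    Each input word is (data, ctrl): 8 data bytes plus an 8-bit mask whose
    bit i marks lane i as a control character.
    """
    # Flatten all words into a list of (byte, is_control) lanes.
    lanes = [(d[i], bool((c >> i) & 1)) for d, c in words for i in range(8)]
    try:
        first = lanes.index((START, True))
    except ValueError:
        return []                                  # no frame start seen
    lanes = lanes[first:]
    lanes += [(IDLE, True)] * (-len(lanes) % 8)    # pad the tail with Idle
    out = []
    for k in range(0, len(lanes), 8):
        chunk = lanes[k:k + 8]
        data = bytes(b for b, _ in chunk)
        ctrl = sum(1 << i for i, (_, is_c) in enumerate(chunk) if is_c)
        out.append((data, ctrl))
    return out

# Example: Start arrives in lane 4 of the first word, preceded by Idle.
words = [
    (bytes([IDLE] * 4 + [START] + [0x55] * 3), 0b00011111),
    (bytes([0x55, 0x55, 0x55, 0xD5, 0xFF, 0xFF, 0xFF, 0xFF]), 0),
]
aligned = align_to_start(words)
print([(d.hex(), hex(c)) for d, c in aligned])
```

After alignment the Start character, preamble, and SFD occupy one full word, so the first frame byte always appears in lane 0 of the next word.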
40G alignment logic: done, though not tested due to HW limitations.
◦ Contains four 10G alignment blocks.
◦ Logic for determining the alignment pattern.
◦ Alignment output according to 40G protocols (FSMs).
◦ Redirection of the trigger signals.
◦ Arrangement of data for the DDR to logic analyzer block.
Alignment FPGA blocks
[Diagram: 72 bits @ 156.25M, aligned → 10GBASE-R PHY alignment logic (x2) → 72x4 bits @ 156.25M, aligned for the 40G protocol and the DDR multipliers]
The double data rate block is our output to the outside world (the logic analyzer).
Since we want to use fewer LA pins at higher speeds, a double data rate is required.
It should be thought of as a serializer: from two or more lines at a given data rate to a single line at double that rate or more.
The operation is based on a high-speed multiplexer, with a wrap-around counter driving its select bits.
DDR – Double Data Rate
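The serializer idea can be modeled in a few lines of Python. This is a behavioral sketch, not the HDL: the function name is ours, the 72-bit and 18-bit widths are taken from the 72-line PHY output and the 18-bit output bus mentioned earlier, and the slice order (lowest bits first) is an assumption.

```python
def serialize(words, in_width=72, out_width=18):
    """Emit each wide word as in_width // out_width narrow slices, low slice first."""
    if in_width % out_width:
        raise ValueError("in_width must be a multiple of out_width")
    ratio = in_width // out_width
    mask = (1 << out_width) - 1
    select = 0                    # wrap-around counter acting as the select bits
    out = []
    for word in words:
        for _ in range(ratio):
            out.append((word >> (select * out_width)) & mask)
            select = (select + 1) % ratio
    return out

# Bandwidth is preserved: 72 lines at 156.25 MHz carry as much as 18 lines at 625 MHz.
in_rate_mbps = 72 * 156.25
out_rate_mbps = 18 * 625.0

word = 0x123456789ABCDEF012       # arbitrary 72-bit test value
slices = serialize([word])
print(len(slices), in_rate_mbps, out_rate_mbps)
```

With these widths the ratio is 4, which matches the step from the 156.25 MHz PHY clock to the 625 MHz output clock.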
A SODIMM to 4x SoftTouch interposer.
◦ Market price > $40k.
◦ Our price: 3 gray hairs on Leonid’s head.
The interposer was designed by a member of the team.
It was a distinct effort that makes our project unique.
It was implemented from scratch with several purposes in mind.
Supports the maximum bandwidth of current and future LAs.
◦ High connectivity, low latency.
DDR Interposer
A logic analyzer is an electronic instrument which displays signals in a digital circuit. A logic analyzer may convert the captured data into timing diagrams, protocol decodes, and state machine traces.
TLA7000 Series
◦ 6,528 logic analyzer channels
◦ 500 ps (2 GHz): serial data
◦ 312.5 ps (3.2 GHz): signal integrity
◦ 625 ps (1.6 GHz): MIPI
Logic Analyzer
Our 40G IXIA was meant to be the hammer and chisel of our debug process.
IXIA is an industry-leading supplier of networking test equipment.
We leased our 40G IXIA for 2 weeks. It had many features but lacked some that were key to our debug process.
It was a good experience that allowed us to determine the functionality and compatibility of our design.
40G IXIA
Our setup was a real system setup.
We had our system sniffing both a production NIC by Intel and an industry-grade IXIA.
We verified that our system worked under all the conditions we tested.
Our LA was connected via our interposer.
The results were shown on the LA screen in vivid and vibrant colors.
Setup
Setup – visibly smart!
Our Stratix 4 dev board proved to lack the features necessary to enable a 40G link and its processing.
The DDR to SoftTouch connector is an imperative piece of hardware. Unfortunately, it was not designed as requested.
A 40G link partner: even though we had the IXIA for 2 weeks, it lacked many debug features.
One of our dev boards was fried during testing.
We are using MegaCore functions provided by Altera, which are black boxes.
We can run our design only while connected to the computer. This is not an issue, since the device is coupled with a computer, like most test equipment today.
Project Constraints and HW limitation
When we look at the spec, we see the reason for this behavior.
The spec states that the alignment words do not undergo encoding, while everything else does.
This exposed the hardware limitation and stopped our 40G effort.
In our debugging we proved that this was the actual issue.
Why it didn’t work – and what did!
Here is the spec snippet.
Why it didn’t work – and what did!
When we debugged why the 40G link wasn’t working, we started configuring various loopback modes.
The 40G link worked when we tried a loopback that bypassed the decoding blocks.
This proves that the transceivers were working properly but the decoding mechanism wasn’t.
This was due to the HW, not to a limitation of our design.
Why it didn’t work – and what did!
We have a working 10GbE tap.
Alignment mechanisms.
Special IEEE 802.3 words detection.
We have a wide array of configurations.
A versatile MDIO writer.
A triggering mechanism.
A logic analyzer interface.
A working DDR for a generic bit width.
A 625 MHz output on a wanted width.
The ground is ready for a 40GbE tap, assuming a Stratix 5 board.
Project deliverables
Here is a run of our project using the TLA.
Project deliverables
Get a 40Gbit link.
◦ A whole new world: 40G is still young in the industry.
◦ Get a new dev board that supports QSFP.
◦ On our side, the ground is laid for the 40G effort: most of the required blocks are working and debugged under HDL simulation.
◦ We enabled our DDR to provide enough frequency multiplication for 40Gbit traffic to be shown on the LA.
Next – expectations for the next project
Thank you all
Don’t Stay tapped for more
Gantt – need to add