Implementation of DMT transceiver for 400GbE 2km and 10km PMD using 4 wavelengths Ian Dedic Fujitsu Semiconductor Europe GmbH IEEE P802.3bs 400GbE Task Force meeting San Diego, July 2014
Source: grouper.ieee.org/.../bs/public/14_07/dedic_3bs_01a_0714.pdf (2014-07-12)

Implementation of DMT transceiver for 400GbE 2km and 10km PMD using 4 wavelengths

Ian Dedic

Fujitsu Semiconductor Europe GmbH

IEEE P802.3bs 400GbE Task Force meeting – San Diego, July 2014


Supporters

David Lewis, JDSU

Sacha Corbeil, JDSU

Beck Mason, JDSU

Moon S Park, OE Solutions

Tom McDermott, Fujitsu Network Communications

Rolf Steiner, Agilent

Bernd Nebendahl, Agilent


400GbE with 4-Lambda

Motivation?

Reduction of optical component count needed to reduce cost

Allows WDM without making lasers too difficult/expensive

Cost-effective building block needed for next generation 400G and beyond

Straw Polls from Norfolk IEEE 802.3bs interim meeting indicated strong interest in 100Gbps/lambda solutions for 400GbE


Discrete Multi-Tone Quick Overview

DMT uses a series of uniformly spaced subcarriers (e.g. 256) to transmit data, where each subcarrier uses QAM-n modulation

Level of modulation is adjusted based on the available SNR within each subcarrier to maximize the number of bits per symbol within that subcarrier

Transmitter and Receiver communicate to determine optimal settings for transmission

Flexible modulation on each of the subcarriers allows DMT to compensate for link impairments and achieve the best use of the available signal channel bandwidth and SNR
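
As a rough illustration of per-subcarrier bit loading, the sketch below assigns each subcarrier a bit count from its measured SNR using the common SNR-gap approximation, b = floor(log2(1 + SNR/gap)), capped at 8 bits to match the 0 to 8 bits per subcarrier range quoted later in this deck. The gap value, the function name and the example SNR profile are assumptions for illustration; the slides do not state which allocation rule the device uses.

```python
import math

def bits_per_subcarrier(snr_db, gap_db=6.0, max_bits=8):
    """Gap-approximation bit loading: b = floor(log2(1 + SNR / gap)).

    gap_db is an assumed SNR gap (it depends on target BER and coding gain);
    max_bits=8 follows the 0..8 bits per subcarrier range in the slides.
    """
    snr = 10 ** (snr_db / 10)
    gap = 10 ** (gap_db / 10)
    bits = int(math.floor(math.log2(1 + snr / gap)))
    return max(0, min(max_bits, bits))

# Hypothetical channel: SNR rolls off linearly from 30 dB to 6 dB over 8 subcarriers.
snr_profile_db = [30 - 24 * k / 7 for k in range(8)]
allocation = [bits_per_subcarrier(s) for s in snr_profile_db]
print(allocation)
```

Subcarriers with high SNR end up with dense constellations and those near the roll-off with sparse ones, which is the behaviour the overview describes.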

Frequency domain calculations with efficient IFFT & FFT implementations

Transmit data is assembled in the frequency domain before passing through an IFFT and high speed DAC

Receive DMT data is captured using a high-speed ADC, converted back into the frequency domain via FFT where subcarrier amplitude and phase are mapped into received data
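
The transmit and receive steps above can be sketched in a few lines. This is a minimal pure-Python model, assuming an ideal channel, a naive DFT in place of a hardware FFT, and data placed on bins 1..N-1 with the DC and Nyquist bins left empty; the function names and the tiny 16-point size are chosen for illustration only.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; real hardware would use an FFT, this is for clarity."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1 / n if inverse else 1
    return [
        scale * sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

def dmt_modulate(symbols, cp_len):
    """Build one real-valued DMT symbol from complex QAM points.

    symbols[i] goes on bin i+1; Hermitian symmetry makes the IFFT output real.
    The last cp_len samples are copied to the front as the cyclic prefix.
    """
    n = 2 * (len(symbols) + 1)
    spectrum = [0j] * n
    for i, s in enumerate(symbols, start=1):
        spectrum[i] = s
        spectrum[n - i] = s.conjugate()
    time = [v.real for v in dft(spectrum, inverse=True)]
    return time[-cp_len:] + time if cp_len else time

def dmt_demodulate(samples, n_symbols, cp_len):
    """Drop the cyclic prefix, transform back, and read the data bins."""
    spectrum = dft(samples[cp_len:])
    return spectrum[1:n_symbols + 1]

# 7 QPSK points -> 16-point IFFT, cyclic prefix of 2 samples.
tx_symbols = [1+1j, -1+1j, -1-1j, 1-1j, 1+1j, -1-1j, 1-1j]
samples = dmt_modulate(tx_symbols, cp_len=2)
rx_symbols = dmt_demodulate(samples, n_symbols=len(tx_symbols), cp_len=2)
```

Over an ideal channel `rx_symbols` matches `tx_symbols` up to floating-point error; a real receiver would also equalize each bin before slicing.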


[Figure: sketches against frequency showing channel response, power per subcarrier, and modulation levels ranging from high n-QAM to low n-QAM.]


Can support optical PMDs for 500m, 2km, 10km

Can potentially support flexible line rates

4 x 1-ch 100G transceivers or dedicated 4-ch 400G transceiver

Host-based transceiver provides opportunity for delivering 400G in a QSFP+ form factor

Optics only inside QSFP+ module, with 4*100G DMT electrical interface

4*116Gbps electrical interface from chip to module through connector

DMT can cope with imperfections of pluggable module connector

400G DMT Transceivers

[Figure: two host configurations. A 400GbE host (LR4/ER4) with a DMT chip (CDAUI4-TX, CDAUI4-RX, 4*DAC, 4*ADC) connected to a quad TOSA and quad ROSA in a QSFP+, and a 2 x 400GbE host (LR4/ER4) with a DMT chip (2*CDAUI4-TX, 2*CDAUI4-RX, 8*DAC, 8*ADC) connected to two quad TOSA/quad ROSA pairs in 2 x QSFP+. Labels: QSFP+: 3.5W; 2 x QSFP+.]


System Requirements

Block              DMT Requirement (100Gbps/lambda)
ADC                Sample rate 58GSa/s, ENOB > 5, bandwidth 18GHz
DAC                Sample rate 58GSa/s, ENOB > 5, bandwidth 16GHz
SerDes             4-lane 28Gbps CEI-28G-VSR
FEC                Coding gain ~8dB
DMT DSP            Gate count ~5M
Linear TIA         < 25G class
Linear Driver      < 25G class
Laser (DML/EML)    < 25G class
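
To put the converter figures in context, the sketch below applies the textbook ideal-quantizer relation SINAD = 6.02 * ENOB + 1.76 dB and computes the Nyquist frequency for the quoted sample rate. The relation is a general rule of thumb for a full-scale sine input, not a figure taken from the slides, and the function names are illustrative.

```python
def enob_to_sinad_db(enob):
    """Ideal-quantizer rule of thumb for a full-scale sine: SINAD = 6.02*ENOB + 1.76 dB."""
    return 6.02 * enob + 1.76

def nyquist_hz(sample_rate_hz):
    """Highest frequency representable without aliasing at a given sample rate."""
    return sample_rate_hz / 2

sinad_db = enob_to_sinad_db(5)      # ENOB > 5 from the requirements table
f_nyquist = nyquist_hz(58e9)        # 58GSa/s from the requirements table
print(round(sinad_db, 2), f_nyquist / 1e9)
```

An ENOB of 5 corresponds to roughly 32 dB, and the 16GHz to 18GHz analog bandwidths sit well below the 29GHz Nyquist frequency.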


Bandwidth Requirements (DMT)

[Figure: DMT transmitter blocks: SerDes (4x28), FEC encoder, constellation mapping, iFFT, cyclic prefix, DAC, linear driver, EML. DMT receiver blocks: linear PIN/TIA, ADC, cyclic prefix, FFT, channel EQ & demapping, FEC decoder, SerDes (4x28). Cascaded bandwidth requirement around 16GHz.]

25G-class optics (maybe even lower bandwidth) can be used

Existing ADC & DAC (28nm) can already generate 170Gbps electrical back-to-back transmission (@ BER 1e-03) over “bandwidth limited components”

Increases in bandwidth in next process node will add more margin for ROSA and TOSA

Packaging & PCB losses @ 16GHz-20GHz manageable

Bandwidth is never free

Lower bandwidth requirements will reduce overall system difficulty & cost

Higher digital power in DMT engine compensated for by power savings in optics

• In reality this power saving may be bigger than added DMT power cost...


100G DMT Transceiver IC Overview

Single chip integrated CMOS transceiver for 100Gbps/lambda

Low cost, low loss 10mm x 10mm SHDBU package for QSFP28 module integration

3.5W in 28nm (silicon measurement, synthesis and layout results) – OK for CFP4

<1.8W in 14nm (estimated from “100G-type” benchmarks) – OK for QSFP28

[Figure: 100G DMT transceiver IC block diagram. DMT Rx datapath: 1-ch 58GSps ADC, interpolation, CP removal, 512-pt FFT, EQ + phase rotation, QAM demodulator, RX FIFO, FEC decoder, SerDes Tx. Rx clock recovery: phase noise estimator, loop filter, NCO, phase interpolator. DMT Tx datapath: SerDes Rx, FEC encoder, buffer & subcarrier mapping, QAM modulator, 512-pt iFFT, CP insertion, output framing, 1-ch 58GSps DAC. Control: Rx LCC I/F, Tx LCC I/F, APB bus. Host side: 4x Tx/Rx 25Gbps CAUI-4. Line side: DMT 116-126Gbps.]


FFT/IFFT and CP Implementation

FFT & IFFT can be efficiently realized in dedicated hardware

Mixed-radix FFT with hardcoded twiddle factors (directly synthesized, no actual multipliers)

Core clock frequency Fsamp/128 (about 450MHz at 58GSa/s)

• Optimises power efficiency of digital (parallel processing of 128 samples)

2 selectable FFT sizes, 64SC (128 point FFT) and 256SC (512 point FFT)

64SC lower power/latency/capacity (SNR margin), 256SC higher power/latency/capacity

• Pilot tones and cyclic prefix add less overhead with 256SC

• 256SC for longer reach or to allow lower performance (cheaper) optics

Basic FFT is 128 points in 1 clock cycle – used for 64SC

• Mixed-radix (radix-2 and radix-4) to minimise power and simplify twiddle factors

256SC FFT adds a final radix-4 stage every 4 clock cycles (small power increase)
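
The clocking figures above follow from simple arithmetic, sketched here. The constant and function names are illustrative; the only inputs are the Fsamp/128 core clock and the two FFT sizes given on the slide.

```python
SAMPLES_PER_CLOCK = 128  # from the slide: core clock is Fsamp/128

def core_clock_hz(fsamp_hz):
    """Digital core clock when 128 samples are processed in parallel."""
    return fsamp_hz / SAMPLES_PER_CLOCK

def clocks_per_fft(fft_points):
    """Clock cycles needed to take in one FFT block at 128 samples per clock."""
    return fft_points // SAMPLES_PER_CLOCK

print(core_clock_hz(58e9) / 1e6)   # core clock in MHz at 58GSa/s
print(clocks_per_fft(128))         # 64SC mode
print(clocks_per_fft(512))         # 256SC mode
```

At 58GSa/s the core clock works out to 453.125MHz, which the slide rounds to 450MHz. A 512-point block spans 4 clock cycles, consistent with the final radix-4 stage running once every 4 cycles.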

Selectable cyclic prefix (CP) length

Longer CP on poorer channels (longer impulse response, more reflections)

Shorter CP on better (shorter?) channels; more margin allows use of 64SC mode

FFT size and CP length could be fixed or negotiated during startup/probing
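
The claim that the cyclic prefix costs less in 256SC mode can be checked with a short calculation. The sketch assumes the DMT symbol length is 2*SC + CP samples, which matches the BitsPerDMT formula given later in the deck; the function name is illustrative.

```python
def cp_overhead(sc, cp):
    """Fraction of each DMT symbol spent on the cyclic prefix.

    Assumes a symbol of 2*sc FFT samples plus cp prefix samples.
    """
    return cp / (2 * sc + cp)

for sc in (64, 256):
    print(sc, round(100 * cp_overhead(sc, 16), 1), "% overhead with CP=16")
```

With the same 16-sample prefix, 64SC mode spends about 11% of the symbol on the prefix and 256SC mode about 3%.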

Power and area are much smaller than calculated from “number of MACs”

FFT + IFFT power is <300mW in 28nm for 256SC

DMT TX+RX (datapath+clock recovery) power is <900mW (~25% of total device power)


DMT Link Negotiation & Communication

Subcarriers at known locations are used to create a Link Communication Channel (LCC, using the DC FFT bin) and for frame/bit alignment

DMT requires initial negotiation between the TX and RX on startup

TX sends known pilot tones (fixed constellation and bit sequence) for probing channel SNR via EVM across all 256 subcarriers

RX analyzes SNR and sends bit allocation tables back to TX over the LCC

Target link negotiation time around 10ms from cold
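
The probing step can be illustrated with a small sketch of SNR estimation from known pilots on one subcarrier, using the usual relation SNR = 1/EVM^2 (signal power over error power). The function name and sample values are hypothetical, and the slides do not describe the estimator actually used.

```python
import math

def snr_db_from_evm(received, reference):
    """Estimate SNR on one subcarrier from known pilot symbols.

    EVM^2 = sum|rx - ref|^2 / sum|ref|^2, and SNR is taken as 1/EVM^2.
    """
    err = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    sig = sum(abs(s) ** 2 for s in reference)
    if err == 0:
        return math.inf
    return 10 * math.log10(sig / err)

# Hypothetical pilots: two known QPSK points, each received with a small error.
reference = [1 + 1j, -1 - 1j]
received = [1.1 + 1j, -1 - 1.1j]
snr_db = snr_db_from_evm(received, reference)
print(round(snr_db, 2))
```

Repeating this per subcarrier gives the SNR profile from which a bit allocation table can be built and returned to the transmitter.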


Bit allocation scheme

Bits arriving at DMT Tx input are allocated into DMT symbols

To achieve a target bit rate each DMT symbol has to carry a specified number of bits

The number of bits per DMT symbol is divisible by 4 to maintain SERDES lane ordering

SERDES lanes are bit-interleaved at the input of DMT Tx

Bits allocated into a 256-subcarrier DMT symbol are further divided into 4 chunks

Avoids buffering of the whole DMT symbol – reduces latency and power consumption

• Performance optimization, not an inherent requirement of a DMT link

Each chunk is a continuous sequence of input bits

Chunks may have unequal sizes

• Size range 256:320 bits or 256:296 bits

To minimize inequality of chunk sizes due to channel roll-off, chunks are mapped to interleaved subcarriers

In 64-sub-carrier mode there is only 1 chunk per DMT symbol, no interleaving, no overlaps.

Bits allocated into a chunk are distributed among 64 subcarriers

Each subcarrier carries 0 to 8 bits

Each group of 0-8 bits is a continuous sequence of input bits

Hardware implementation does not impose restrictions on subcarrier ordering

• Allocation algorithm allocates chunks to subcarriers in ascending order
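
The allocation scheme above can be sketched as follows. The slides state that chunks map to interleaved subcarriers but do not give the pattern, so the stride rule used here (chunk c owns subcarriers c, c+4, c+8, ...) is an assumption, as are the function names and the scaled-down example with 8 subcarriers and 2 chunks.

```python
def interleave_lanes(lanes):
    """Bit-interleave equal-length SERDES lanes: L0[0], L1[0], L2[0], L3[0], L0[1], ..."""
    return [bit for group in zip(*lanes) for bit in group]

def map_symbol(bits, bits_per_sc, n_chunks=4):
    """Split one DMT symbol's bits into contiguous chunks and per-subcarrier groups.

    bits_per_sc[k] is the negotiated bit count (0..8) of subcarrier k.
    Assumed interleaving: chunk c owns subcarriers c, c+n_chunks, c+2*n_chunks, ...
    Returns {subcarrier index: list of bits}.
    """
    assert len(bits) == sum(bits_per_sc)
    mapping = {}
    pos = 0
    for c in range(n_chunks):
        for k in range(c, len(bits_per_sc), n_chunks):  # ascending order inside a chunk
            mapping[k] = bits[pos:pos + bits_per_sc[k]]
            pos += bits_per_sc[k]
    return mapping

# Scaled-down example: bit loading rolls off with frequency, bits are labelled 0..11.
loading = [3, 3, 2, 2, 1, 1, 0, 0]
mapping = map_symbol(list(range(12)), loading, n_chunks=2)
print(mapping)
print(interleave_lanes([[0, 0], [1, 1], [2, 2], [3, 3]]))
```

In this example each chunk carries 6 bits even though the loading rolls off, which shows why interleaving keeps chunk sizes close to equal.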


Bit mapping diagram, SC=64

[Figure: SERDES lanes L0, L1, L2, L3 are bit-interleaved in groups of 4 bits into a stream divided at DMT symbol boundaries into DMT Symbol N-1, N, N+1, each of BitsPerSymbol bits. Within a symbol, continuous groups of 0:8 bits map to subcarriers 0 to 63, in arbitrary order (HW).]


Bit mapping diagram, SC=256

[Figure: SERDES lanes L0, L1, L2, L3 are bit-interleaved in groups of 4 bits into a stream divided at DMT symbol boundaries into DMT Symbol N-1, N, N+1, each of BitsPerSymbol bits. Each symbol is split into Chunk 0 to Chunk 3 of 256:320 bits (continuous) each. With subcarrier interleaving, continuous groups of 0:8 bits map to subcarriers 0 to 255, in arbitrary order (HW).]


Number of bits per DMT symbol

Columns: SC = number of sub-carriers; CP = cyclic prefix size [samples]; BitsPerDMT = bits per DMT symbol [bits]; BitsPerChunkNom = nominal chunk size [bits]; BitAlign = TxIn-RxOut alignment [bits]

SC     CP    BitsPerDMT    BitsPerChunkNom    BitAlign
64     2     260           260                4
64     4     264           264                8
64     8     272           272                16
64     12    280           280                24
64     16    288           288                32
256    4     1032          258                8
256    8     1040          260                16
256    16    1056          264                32
256    32    1088          272                64
256    48    1120          280                96
256    64    1152          288                128

BitsPerDMT = 2*(2*SC+CP)

BitsPerChunkNom = BitsPerDMT in SC=64 mode, BitsPerDMT/4 in SC=256 mode
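
The two formulas can be checked against every row of the table with a short script. It also derives the line rate, assuming the symbol occupies 2*SC + CP samples at the 58GSa/s rate quoted earlier; since BitsPerDMT is twice the sample count, each sample carries 2 bits. The `BitAlign == 2*CP` check is a pattern observed in the table rows, not a formula given in the slides.

```python
FSAMP = 58e9  # sample rate from the system requirements slide

# (SC, CP, BitsPerDMT, BitsPerChunkNom, BitAlign) rows as listed in the table.
TABLE = [
    (64, 2, 260, 260, 4), (64, 4, 264, 264, 8), (64, 8, 272, 272, 16),
    (64, 12, 280, 280, 24), (64, 16, 288, 288, 32),
    (256, 4, 1032, 258, 8), (256, 8, 1040, 260, 16), (256, 16, 1056, 264, 32),
    (256, 32, 1088, 272, 64), (256, 48, 1120, 280, 96), (256, 64, 1152, 288, 128),
]

def bits_per_dmt(sc, cp):
    return 2 * (2 * sc + cp)

def bits_per_chunk_nom(sc, cp):
    total = bits_per_dmt(sc, cp)
    return total if sc == 64 else total // 4

def line_rate_bps(sc, cp, fsamp=FSAMP):
    """Bits per symbol divided by symbol duration (2*sc + cp samples at fsamp)."""
    return bits_per_dmt(sc, cp) * fsamp / (2 * sc + cp)

for sc, cp, total, chunk, align in TABLE:
    assert bits_per_dmt(sc, cp) == total
    assert bits_per_chunk_nom(sc, cp) == chunk
    assert align == 2 * cp  # observed pattern only
print(line_rate_bps(256, 16) / 1e9, "Gbps")
```

Every configuration gives the same 116Gbps, matching the per-lambda rate quoted elsewhere in the deck.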


DMT Electrical B2B Measurements

Electrical B2B measurements made using existing ADC & DAC silicon

Channel probing shows ~6dB SNR improvement between 40nm and 28nm

Excess bitrate (above 116Gbps at 1e-3 BER) gives indication of relative margin in the system for other component losses

40nm Silicon Measurements

132Gbps throughput

14% bitrate margin

28nm Silicon Measurement

170Gbps throughput

46% bitrate margin

Initial optical experiments using 28nm converters presented, work in progress

Plots show 256 subcarriers, Cyclic Prefix of 16

Bit rate measured at FEC threshold of 1e-3
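
The quoted bitrate margins can be reproduced from the throughput figures, taking margin as the excess over the 116Gbps target. The function name is illustrative.

```python
TARGET_GBPS = 116  # target bit rate at 1e-3 BER, from the slide

def bitrate_margin(throughput_gbps, target_gbps=TARGET_GBPS):
    """Excess throughput over the target, as a fraction of the target."""
    return throughput_gbps / target_gbps - 1

margin_40nm = bitrate_margin(132)
margin_28nm = bitrate_margin(170)
print(round(100 * margin_40nm, 1), round(100 * margin_28nm, 1))
```

This gives about 13.8% for the 40nm measurement and about 46.6% for the 28nm measurement, which the slide quotes as 14% and 46%.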


Optical 400GbE DMT Experiment (40nm)

[Figure: experimental setup comprising DMT modulation, 4ch. DAC, 4ch. driver, four Bias-Ts with DML bias controller, DML1 to DML4, four VOAs, MUX, SMF 0~40km, DeMUX, PD+TIA1 to PD+TIA4, 4ch. ADC, DMT demodulation.]

[Plot: BER (1.E-05 to 1.E-01) versus distance (0 to 40 km) for DML1, DML2, DML3, DML4.]

Transmission capacity 464.0625Gbps (116.0156Gbps x 4)


100G DMT Transceiver – not a simulation...


100G DMT Transceiver – measured results


Summary

400GbE with 4 lambda is readily achievable through use of DMT modulation and ADC/DSP/DAC architectures

CMOS technology can offer performance and cost advantages

Allows host-based solution with only low-power optics in pluggable module

Enables low-cost single-lambda 100G and roadmap to 4-lambda 1TbE (no standards!)

Implementation of DSP solutions is not a concern

Silicon issues already solved for coherent long-haul

Can help decrease complexity/cost/power in the optics

Link negotiation/retraining is really a non-issue (well-known and solved in DSL)

Bandwidth requirements need to be carefully considered

Bandwidth isn’t free, there are always tradeoffs

• Lower bandwidth generally equals lower cost and power

Focus should be not just on the DSP silicon, but on all components
