
[IEEE 2007 IEEE Symposium on VLSI Circuits - Kyoto, Japan (2007.06.14-2007.06.16)] 2007 IEEE...


134 978-4-900784-04-8 2007 Symposium on VLSI Circuits Digest of Technical Papers

13-2 Precursor ISI Reduction in High-Speed I/O

Jihong Ren¹, Haechang Lee¹, Qi Lin¹, Brian Leibowitz¹, E-Hung Chen², Dan Oh¹, Frank Lambrecht¹, Vladimir Stojanović¹,³, Chih-Kong Ken Yang², Jared Zerbe¹

¹Rambus, Inc., Los Altos, CA; ²University of California, Los Angeles, CA; ³MIT, Cambridge, MA
[email protected]

Abstract

To achieve multi-Gb/s data rates over backplane channels, equalization is required to compensate for channel non-idealities. In this paper, we first show that when decision-feedback equalization (DFE) handles postcursor inter-symbol interference (ISI), cancelling precursor ISI with transmitter equalization degrades rather than improves performance for most channels. This is due to the interaction between equalization adaptation and clock-data recovery (CDR), coupled with the transmitter peak-power constraint. To minimize the impact of precursor ISI on the bit error rate (BER), we propose a new method of adapting the CDR phase for maximum voltage margin.

Introduction

At high data rates, signal integrity issues such as dispersion and reflection introduce substantial ISI and limit channel bandwidth. Equalizers such as transmit FIR (Tx-FIR) and receiver DFE are widely used in high-speed links to mitigate the impact of ISI on link performance. Because of the transmitter peak-power constraint, transmitter equalization typically takes the form of de-emphasis, attenuating low-frequency components to obtain a relatively flat frequency response. Tx-FIR therefore reduces ISI at the expense of signal swing. DFE typically has no peak-power constraint and removes ISI energy without reducing the signal swing. However, unlike Tx-FIR, DFE cannot remove precursor ISI.

To exploit the complementary strengths of Tx-FIR and DFE, it seems natural to combine them as in the architecture depicted in Fig. 1, using Tx-FIR to remove precursor ISI and DFE to remove postcursor ISI. To avoid the problems of closing a tight feedback loop at high clock rates, this design unrolls the first DFE tap by using partial response DFE (PrDFE) [1]. A 2x over-sampled clock-data recovery (CDR) circuit recovers timing information, and the filter coefficients are adapted using a sign-sign least-mean-squares (SS-LMS) algorithm [2].

This paper first shows that if the DFE cancels all postcursor ISI, then a Tx-FIR for precursor ISI cancellation does not improve voltage margin, for two reasons: the interaction between CDR and Tx-FIR adaptation, and the transmitter peak-power constraint. Since precursor ISI still needs to be addressed at high data rates, we propose a new method of adapting CDR locking phases that accounts for precursor ISI and maximizes the eye opening at the data sampling location. This technique is demonstrated with both simulation data and lab measurements. For simulation, we use an in-house statistical link analysis tool [3]. Measurement results are taken with the 90nm transceiver of [1].
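For readers unfamiliar with SS-LMS, the following is a minimal sketch of the generic textbook sign-sign update applied to DFE taps on a toy channel. It is not the adaptation engine of the transceiver in [1] or [2]; the function name `ss_lms_dfe`, the step size `mu`, the postcursor values and the unit main cursor are all assumptions made for illustration.

```python
import random

def sign(x):
    """1-bit slicer: +1.0 for x >= 0, otherwise -1.0."""
    return 1.0 if x >= 0 else -1.0

def ss_lms_dfe(isi_taps, n_bits=20000, mu=1e-3, seed=0):
    """Adapt DFE taps with a sign-sign LMS update on a postcursor-only toy channel.

    isi_taps: hypothetical postcursor ISI values, relative to a main cursor of 1.0.
    Returns the adapted DFE tap weights.
    """
    rng = random.Random(seed)
    bits = [rng.choice((-1.0, 1.0)) for _ in range(n_bits)]
    n_taps = len(isi_taps)
    taps = [0.0] * n_taps
    decisions = bits[:n_taps]  # assumption: the first few decisions are correct
    for n in range(n_taps, n_bits):
        sent_past = [bits[n - 1 - k] for k in range(n_taps)]      # what the channel saw
        fed_back = [decisions[n - 1 - k] for k in range(n_taps)]  # what the DFE feeds back
        received = bits[n] + sum(h * b for h, b in zip(isi_taps, sent_past))
        equalized = received - sum(c * d for c, d in zip(taps, fed_back))
        decision = sign(equalized)
        decisions.append(decision)
        error = equalized - decision  # deviation from the assumed +/-1.0 target level
        s = sign(error)
        # Sign-sign update: only the sign of the error and the sign of the data are used.
        taps = [c + mu * s * d for c, d in zip(taps, fed_back)]
    return taps

adapted = ss_lms_dfe([0.3, 0.1])
```

With these toy values the adapted taps settle close to the postcursor values, dithering by a few step sizes around them, which is the usual behaviour of a sign-sign loop.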

Precursor ISI cancellation with Tx-FIR in the presence of DFE

A. Interaction between Tx-FIR adaptation and CDR

The Tx-FIR changes the overall channel response seen by the receiver while the CDR tracks the changes. Shifting the CDR locking position changes the ISI seen by the samplers, which in turn changes the equalization tap weights through the adaptation engine. This interaction causes the system shown in Fig. 1 to converge to a steady state with suboptimal performance.

Fig. 1 (a) Transceiver architecture with transmitter equalization and receiver DFE. The first DFE tap is realized as PrDFE. (b) PrDFE eye diagram. For timing recovery, the edge samplers can be placed at two of the three levels indicated. Data filtering is used to select valid edge samples. By default, the edge-sampler threshold is set to the data-sampler threshold.

For simplicity, we assume the edge-sampler threshold is set to 0, in which case the PrDFE edge samplers reduce to simple NRZ edge samplers. The analysis below can be readily extended to a nonzero edge-sampler threshold. With a zero threshold, the CDR nominally locks to the phase where the mean of the edge samples is zero,

$$e_0 - e_1 = 0 \tag{1}$$

where $e_k$ denotes the single bit response (SBR) of the raw channel sampled at $t_s - T/2 + kT$; $t_s$ is the main cursor location; and $T$ is the bit time. When a precursor-only Tx-FIR is applied, the mean of the edge samples for rising transitions at the original phase (before the CDR updates) is:

$$e'_0 - e'_1 = \sum_{i=-m}^{0} w_i \,(e_{-i} - e_{1-i}) = \sum_{i=-m}^{-1} w_i \,(e_{-i} - e_{1-i}) \tag{2}$$

where $w_i$ is the $i$-th Tx-FIR tap. For dispersive channels, the precursor ISI is generally positive and dominated by the first precursor (Fig. 2). Thus, the summation in (2) is dominated by $w_{-1}(e_1 - e_2)$, which is negative for typical channels. After Tx equalization, the mean of the edge samples at the same phase therefore becomes negative, and in response the CDR delays its phase position. As illustrated in Fig. 2, this delayed CDR position increases the precursor ISI seen by the data sampler. The SS-LMS algorithm tries to zero-force the larger precursor ISI, spending more energy on precursor ISI cancellation and lowering the main cursor even more. After adaptation converges, the precursor ISI is cancelled at the cost of more than half of the main cursor for the example channel shown in Fig. 2.

Fig. 3 shows measured voltage margin contours versus CDR phase and the transmitter first pre-tap value for a 16" backplane channel. The link converges to a suboptimal point with too much pre-tap magnitude and too late a CDR phase. The link operating point with SS-LMS adaptation lies outside the contour plot, at a pre-tap value of approximately -0.2 and a CDR phase of roughly 88.
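The sign argument around (1) and (2) can be checked numerically. The sketch below is only an illustration: the gamma-shaped pulse `sbr`, its parameters and the tap values (-0.1 pre-tap, 0.9 main tap) are assumptions, not data from the paper. It finds the NRZ lock phase of the raw pulse, applies one precursor tap, and compares the directly computed edge mean with the dominant term $w_{-1}(e_1 - e_2)$.

```python
import math

T = 1.0  # bit time; all times are in unit intervals

def sbr(t, a=0.6, n=2):
    """Hypothetical dispersive single bit response (gamma-shaped pulse)."""
    return (t / a) ** n * math.exp(-t / a) if t > 0 else 0.0

def edge(pulse, ts, k):
    """e_k: the pulse sampled at the edge instant ts - T/2 + k*T."""
    return pulse(ts - T / 2 + k * T)

def lock_phase(pulse, lo, hi, tol=1e-12):
    """Bisect for the phase ts where e_0 - e_1 = 0, i.e. condition (1)."""
    def f(ts):
        return edge(pulse, ts, 0) - edge(pulse, ts, 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def with_pre_tap(w_pre, w_main):
    """Equalized pulse: tap index -1 launches a copy one bit time earlier."""
    return lambda t: w_main * sbr(t) + w_pre * sbr(t + T)

ts_raw = lock_phase(sbr, 0.5, 3.0)          # lock phase of the raw channel
eq = with_pre_tap(w_pre=-0.1, w_main=0.9)   # assumed taps with |w_-1| + |w_0| = 1

# Mean edge sample at the ORIGINAL phase after equalization, computed two ways.
direct = edge(eq, ts_raw, 0) - edge(eq, ts_raw, 1)
from_eq2 = -0.1 * (edge(sbr, ts_raw, 1) - edge(sbr, ts_raw, 2))

# Phase to which the CDR moves once it re-locks on the equalized pulse.
ts_new = lock_phase(eq, ts_raw, 3.0)
```

For this toy pulse, `direct` and `from_eq2` agree and are negative, and `ts_new` is later than `ts_raw`, matching the qualitative behaviour described in the text.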


Fig. 2 Simulated single bit responses (SBR) before, during and after adaptation. [Curves: raw SBR; intermediate equalized SBR during adaptation; final equalized SBR after adaptation. Markers: precursor ISI and main tap of the raw channel at the initial CDR locking phase, at the CDR locking phase during adaptation, and at the final CDR locking phase. Postcursor ISI is taken care of by the DFE.]

Fig. 3 Measured voltage margin vs. Tx FIR pre-tap value and CDR phase for a 16’’ backplane channel at 6.25Gb/s.

B. Transmitter Power Constraint

Interestingly, Fig. 3 shows that the optimal setting for the pre-tap is 0 (no Tx-FIR). When postcursor ISI is cancelled, the voltage margin is roughly

$$d_0 - \sum_{i \le -1} |d_i|$$

where $d_k$ denotes the single bit response sampled at $t_s + kT$. We simulated 10 backplane channels with different channel characteristics (-15 dB to -35 dB attenuation at the Nyquist frequency) for the transceiver shown in Fig. 1. In all cases, the optimal weights of a 3 pre-tap Tx-FIR are zero when a linear program is used to maximize eye height under a peak-power constraint [4]. Lab measurements verify this result. If the DFE is unable to cancel all postcursor ISI, a Tx-FIR might still improve voltage margin by mitigating the residual postcursor ISI.
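The trade-off behind this result can be illustrated with a small calculation. The sketch below is a toy under stated assumptions: the sampled pulse values in `raw` are made up, a single pre-tap is used, and the peak-power constraint is modelled as $|w_{-1}| + |w_0| = 1$. It does not reproduce the linear program of [4] or any of the simulated channels.

```python
def equalized_samples(raw, w_pre):
    """Apply a one-pre-tap Tx-FIR to data-phase samples d_k of the pulse.

    raw:   dict mapping cursor index k to d_k (k < 0 are precursors).
    w_pre: pre-tap weight w_-1; the main tap is 1 - |w_pre| (assumed peak-power model).
    """
    w_main = 1.0 - abs(w_pre)
    ks = range(min(raw) - 1, max(raw) + 1)
    # Tap index -1 launches a copy one bit earlier, so d'_k = w_0*d_k + w_-1*d_(k+1).
    return {k: w_main * raw.get(k, 0.0) + w_pre * raw.get(k + 1, 0.0) for k in ks}

def voltage_margin(samples):
    """d_0 minus the sum of |d_i| over precursors; postcursors are left to the DFE."""
    return samples[0] - sum(abs(v) for k, v in samples.items() if k < 0)

# Hypothetical raw pulse samples: two precursors, main cursor, one postcursor.
raw = {-2: 0.02, -1: 0.15, 0: 1.0, 1: 0.5}

margin_no_fir = voltage_margin(equalized_samples(raw, 0.0))
margin_pre_tap = voltage_margin(equalized_samples(raw, -0.1))
```

With these numbers the pre-tap shrinks the precursor sum from 0.17 to 0.04, but the main cursor drops from 1.0 to 0.85, so the margin falls from 0.83 to 0.81. The loss of main cursor outweighs the ISI that is removed, which is the mechanism the text describes.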

Precursor ISI Compensation using CDR

The amount of precursor ISI seen by the receiver depends on the data sampling phase. Therefore, by adjusting the CDR locking phase, we can improve the voltage margin by maximizing

$$d_0 - \sum_{i \le -1} |d_i|.$$

Fig. 2 suggests that an earlier CDR locking position results in less precursor ISI without as much reduction in the main cursor. A similar method was proposed for source-synchronous applications [5].

By data filtering for 110 and 001 transitions and placing the edge samplers at plus and minus an adjustable edge-sampler threshold (Fig. 1b), we can control the CDR locking phase simply by adjusting that threshold. When the edge-sampler threshold is set equal to the data-sampler threshold, the CDR locking phase is already earlier than that of an NRZ receiver, but it still does not provide optimal performance (Fig. 4a). Fig. 4b shows the single bit responses measured with the on-chip scope [2] for a 16" backplane channel at 6.25 Gb/s. After adjusting the edge-sampler threshold, the new locking phase gives roughly a 12 mV improvement in voltage margin over the original 114 mV.

To optimize the CDR locking phase, we adjust the edge sampler threshold using a gradient descent algorithm to maximize the received voltage margin. At each edge sampler threshold (each CDR locking phase), the DFE is adapted with SS-LMS, and the ISI at the corresponding data sampling location is cancelled. After the DFE settles, the voltage margin is measured. Unlike simply sampling a given eye at a better location, shifting the DFE with the CDR produces a different equalized eye at each CDR locking position. The eye is maximized when the impact of precursor ISI is minimal relative to the main cursor.

Fig. 4 (a) CDR phase vs. measured voltage margin, with the points for the edge-sampler threshold equal to the data-sampler threshold and for the optimized edge-sampler threshold marked. (b) Measured single bit responses with on-chip scope [2], with the precursor ISI marked.
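The search loop described above can be sketched as a one-dimensional hill climb. The paper does not give its update rule, so this is a stand-in under assumptions: `measure_margin` is a hypothetical callback that sets the edge-sampler threshold, waits for the CDR to relock and the DFE to settle, and returns the measured voltage margin; the step size and limits are placeholders.

```python
def optimize_edge_threshold(measure_margin, start, step, lo, hi, max_iters=50):
    """Step the edge-sampler threshold toward higher voltage margin.

    measure_margin: hypothetical callback, threshold -> margin after the
                    CDR has relocked and the DFE has re-adapted.
    Returns (best_threshold, best_margin).
    """
    best = start
    best_margin = measure_margin(best)
    direction = 1
    failures = 0  # consecutive steps that did not improve the margin
    for _ in range(max_iters):
        candidate = min(hi, max(lo, best + direction * step))
        margin = measure_margin(candidate)
        if margin > best_margin:
            best, best_margin = candidate, margin
            failures = 0
        else:
            direction = -direction  # try the other side
            failures += 1
            if failures == 2:       # neither neighbour is better: stop
                break
    return best, best_margin

# Toy stand-in for the hardware measurement: a margin curve peaking at 0.3.
def toy_margin(threshold):
    return 100.0 - 400.0 * (threshold - 0.3) ** 2

best_threshold, best_margin = optimize_edge_threshold(
    toy_margin, start=0.5, step=0.05, lo=0.0, hi=1.0)
```

Because the equalized eye changes at every threshold, each call to the callback stands for a full relock-and-adapt cycle, so a search that needs few evaluations is preferable to a dense sweep.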

Fig. 5 plots simulated voltage margin versus the edge-sampler threshold for 3 different channels at 7.5 Gb/s. Table 1 shows the simulated voltage margin for various CDR locking strategies. Among them, NRZ locking gives the latest locking position and the worst voltage margin. Data filtering with the edge-sampler threshold set directly to the data-sampler threshold offers much better performance than NRZ locking. Compared with that setting, optimizing the edge-sampler threshold improves voltage margin by an additional 12-23%.
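The quoted improvement can be recomputed from the Table 1 entries. The snippet below is plain arithmetic on those three rows; the dictionary layout and the function name `improvement_percent` are illustrative choices.

```python
# Table 1 rows: (NRZ locking, edge threshold = data threshold, optimized edge threshold)
table1 = {
    "a": (31, 116, 130),
    "b": (0, 86, 97),
    "c": (0, 63, 78),
}

def improvement_percent(row):
    """Relative gain of the optimized threshold over the default setting, in percent."""
    _nrz, default, optimized = row
    return 100.0 * (optimized - default) / default

gains = {channel: improvement_percent(row) for channel, row in table1.items()}
for channel, gain in gains.items():
    print(f"channel {channel}: {gain:.1f}% additional margin")
```

The gains come out at about 12.1%, 12.8% and 23.8% for channels a, b and c.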

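Table 1 Simulated voltage margin at 10^-15 BER

Channel   NRZ locking   Edge threshold = data threshold   Optimized edge threshold
a         31            116                               130
b         0             86                                97
c         0             63                                78

Fig. 5 Simulated voltage margin at 10^-15 BER vs. edge-sampler threshold for channels a, b and c, with the point where the edge-sampler threshold equals the data-sampler threshold marked on each curve. Channel attenuation at 3.75 GHz: a: -17.6 dB, b: -17.9 dB, c: -19.6 dB.

Conclusion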

In this paper, we first showed that precursor cancellation with Tx-FIR only degrades link performance when all postcursor ISI is handled by the DFE. This is due to an interaction between CDR and Tx-FIR adaptation, as well as the transmitter peak-power constraint. To minimize the impact of precursor ISI on link performance, we proposed a new method of adapting the CDR locking position based on voltage margin. Compared with Tx-FIR precursor cancellation, this method improves link performance and needs no explicit precursor ISI cancellation hardware.

References

[1] B. Leibowitz et al., "A 7.5Gb/s 10-Tap DFE Receiver with First Tap Partial Response, Spectrally Gated Adaptation, and 2nd-Order Data-Filtered CDR", in press, ISSCC 2007

[2] V. Stojanovic et al., "Autonomous Dual-Mode (PAM2/4) Serial Link Transceiver With Adaptive Equalization and Data Recovery", IEEE JSSC, Vol. 40, No. 4, April 2005

[3] D. Oh et al., "Accurate Method for Analyzing High-Speed I/O System Performance", in press, DesignCon 2007

[4] J. Ren and M. Greenstreet, "A Unified Optimization Framework for Equalization Filter Synthesis", DAC 2005

[5] R. Randall et al., "Adaptive Equalizer using Precursor Error Signal for Convergence Control", US Patent 4789994, 1987

