IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 38, NO. 12, DECEMBER 2003 2169

40–43-Gb/s OC-768 16:1 MUX/CMU Chipset With SFI-5 Compliance

Hai Tao, Member, IEEE, Derek K. Shaeffer, Member, IEEE, Min Xu, Member, IEEE, Saied Benyamin, Vincent Condito, Senior Member, IEEE, Steffen Kudszus, Member, IEEE, Qinghung Lee, Member, IEEE, Adrian Ong, Member, IEEE, Arvin Shahani, Xiaomin Si, Member, IEEE, Wayne Wong, and Maurice Tarsia

Abstract—In this paper, we present two copackaged ICs that provide complete OC-768 16:1 multiplexer (MUX) and clock multiplying unit (CMU) functionality. The 17-input 2.5–2.68-Gb/s parallel interface is Serdes Framer Interface Level 5 (SFI-5) compliant while the 40–43-Gb/s output satisfies OC-768 jitter generation specifications with 7 dB of margin. The system architecture and two-chip partitioning are discussed, followed by descriptions of the design challenges including SFI-5 compliance, 40-Gb/s MUX timing, and 20-GHz clock generation. A novel technique for stabilizing timing margins in the final high-speed multiplexer stage using in-phase and quadrature clocks is also presented. This chipset accommodates 11 bits of static skew and 21 bits of dynamic wander at the SFI-5 interface, while generating 125 fs rms of random jitter and 3.1 ps peak-to-peak of deterministic jitter at its 40–43-Gb/s outputs. The measured bit-error ratio is less than $10^{-15}$ for $2^{31}-1$ PRBS data and is measurement time limited. The two chips occupy 15.6 mm² and 8.25 mm² of die area. Both are implemented in a 120-GHz SiGe BiCMOS process.

Index Terms—Clock multiplying unit (CMU), delay-locked loop (DLL), jitter generation, multiplexer (MUX), OC-768, optical networking, optical transmission, phase noise, phase-locked loop (PLL), SiGe, SFI-5, SONET.

I. INTRODUCTION

To satisfy the ever-increasing bandwidth needs of global digital communications traffic, today's optical telecommunication systems are running at higher and higher speeds. Current state-of-the-art systems run at about 10 Gb/s. The next generation of equipment will operate at 40 Gb/s and above. This equipment requires 40-Gb/s transponders that interface between electronic signal processors and the fiber-optic media. Critical transponder components include a multiplexer/clock multiplication unit (MUX/CMU) and a clock and data recovery/demultiplexer (CDR/DEMUX) device to interface between the parallel signal processor and the serial high-speed laser driver and photodiode. These devices need to be highly integrated to keep the power, size, and production costs down. This paper focuses on the MUX/CMU unit in the transmit path.

The MUX/CMU multiplexes 16 channels of 2.5-Gb/s data into a single 40-Gb/s bit stream. The 40-Gb/s output needs to satisfy the SONET OC-768/SDH-256 specification [1]. In addition, the 2.5-Gb/s parallel interface has to comply with the SFI-5 standards [2]–[4]. Early implementations of the 40-Gb/s MUX were realized mainly in III-V heterojunction bipolar transistor (HBT) [5]–[7] or high electron mobility transistor (HEMT) [8]–[10] technology. Recent advances in SiGe processing make possible a silicon-based solution [11]–[15], but none of these prototypes fully meets the SONET specifications or has an SFI-5-compliant parallel interface. This paper presents in detail a fully integrated two-chip one-package SiGe solution [16], [17] that is, to the best of our knowledge, the first 40–43-Gb/s MUX/CMU to fully satisfy both specifications.

In the following sections, we will first discuss the system architecture and rationale for partitioning the design into two parts, followed by design challenges and solutions for the SFI-5 interface, 40-Gb/s MUX, and 20-GHz CMU, respectively. Finally, we will present the measurement results of the individual chips as well as those of the fully packaged transmitter.

Manuscript received April 4, 2003; revised June 25, 2003. The authors are with Big Bear Networks, Sunnyvale, CA 94085 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/JSSC.2003.818575

II. 16:1 MUX/CMU ARCHITECTURE

Fig. 1 shows the conceptual block diagram of a 40-Gb/s transponder. In the transmit path, the MUX/CMU unit multiplexes 16 channels of 2.5-Gb/s data from a framing or forward-error-correction (FEC) device into a 40-Gb/s serial data stream that will be used by the laser modulator driver.

Under SFI-5 specifications [3], the 16 channels of input data can travel across up to eight inches of FR4 trace plus one connector before reaching the MUX/CMU. Due to mechanical and routing mismatches, these 16 lanes of data can have up to 4.8 unit intervals (UIs) of skew among them. Delay variations of these lanes over temperature and power supply changes can cause as much as 2.6 UIs of relative wander and 11.65 UIs of common wander [2]. To overcome lane skews, SFI-5 provides a deskew channel (DSC) that contains a frame header and samples of data from the 16 data lanes. The MUX/CMU first recovers data in all 17 lanes independently, then eliminates skews between the 16 data lanes using information contained in the DSC. As will be shown in later sections, implementation of this SFI-5 compliant 2.5-Gb/s interface requires intensive digital CMOS circuitry that generates a lot of power supply and substrate noise. Each 2.5-Gb/s data receiver must also be able to process data satisfying the SFI-5 eye mask [2] and meet the jitter tolerance specification [4], as shown in Fig. 2.
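The skew and wander figures quoted above can be set against the tolerance the chipset reports in the abstract (11 bits of static skew, 21 bits of dynamic wander). The following is only a rough sketch: it assumes one bit equals one UI at the lane rate, and it treats the sum of relative and common wander as the worst case, which is an assumption made here rather than a statement from the SFI-5 documents.

```python
# Rough budget check of the SFI-5 lane skew/wander figures quoted in the text.
# Assumption: worst-case wander is taken as relative + common wander.
LANE_RATE_GBPS = 2.5
UI_PS = 1000.0 / LANE_RATE_GBPS          # one unit interval at 2.5 Gb/s, in ps

SPEC_STATIC_SKEW_UI = 4.8                # lane-to-lane skew allowed by SFI-5
SPEC_RELATIVE_WANDER_UI = 2.6
SPEC_COMMON_WANDER_UI = 11.65

CHIP_STATIC_SKEW_UI = 11                 # tolerance reported in the abstract
CHIP_DYNAMIC_WANDER_UI = 21

worst_case_wander_ui = SPEC_RELATIVE_WANDER_UI + SPEC_COMMON_WANDER_UI
static_margin_ui = CHIP_STATIC_SKEW_UI - SPEC_STATIC_SKEW_UI
wander_margin_ui = CHIP_DYNAMIC_WANDER_UI - worst_case_wander_ui

print(f"UI at {LANE_RATE_GBPS} Gb/s: {UI_PS:.0f} ps")
print(f"static skew: spec {SPEC_STATIC_SKEW_UI} UI, margin {static_margin_ui:.2f} UI")
print(f"wander: spec {worst_case_wander_ui:.2f} UI, margin {wander_margin_ui:.2f} UI")
```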

On the other hand, the 40-Gb/s output of the MUX/CMU has to meet the stringent OC-768 jitter generation specification [1]. More specifically, the 40-Gb/s output should have less than 1.2 UI peak-to-peak (p-p) of wide-band jitter over the 20-kHz to 320-MHz jitter bandwidth and less than 0.1 UI p-p of "high-band" jitter over the 16–320-MHz jitter bandwidth. This calls for careful low-noise CMU design with superior power supply and substrate noise rejection as well as minimization of supply and substrate noise generation.

0018-9200/03$17.00 © 2003 IEEE

Fig. 1. Conceptual block diagram of a 40-Gb/s transponder.

Fig. 2. Input eye mask and jitter tolerance specification of SFI-5.
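To make the jitter generation limits concrete, the UI figures can be converted to picoseconds. This is a minimal sketch that uses the nominal 40- and 43-Gb/s line rates from the title rather than exact SONET rates; the function name is illustrative.

```python
def ui_to_ps(ui: float, bit_rate_gbps: float) -> float:
    """Convert a jitter amplitude in unit intervals to picoseconds."""
    unit_interval_ps = 1000.0 / bit_rate_gbps   # one bit period in ps
    return ui * unit_interval_ps

# Nominal line rates only; exact OC-768 and FEC rates differ slightly.
for rate_gbps in (40.0, 43.0):
    wide_band_ps = ui_to_ps(1.2, rate_gbps)     # 20 kHz to 320 MHz limit
    high_band_ps = ui_to_ps(0.1, rate_gbps)     # 16 MHz to 320 MHz limit
    print(f"{rate_gbps:.0f} Gb/s: wide-band {wide_band_ps:.2f} ps p-p, "
          f"high-band {high_band_ps:.2f} ps p-p")
```

At the nominal 40-Gb/s rate the high-band limit works out to 2.5 ps p-p, which shows how little timing noise the CMU and final multiplexer stage can add.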

To eliminate the potential of the CMOS switching noise to induce jitter on the 40-Gb/s transmit data, we decided to partition the MUX/CMU into two chips, as shown in Fig. 3. The first chip is a 16:4 multiplexer that realizes an SFI-5 compliant interface and 16:4 multiplexing to generate four lanes of 10-Gb/s data. The second chip implements the final 4:1 multiplexing and the 20-GHz CMU, which generates precision clocks of various speeds to be used by both chips. A 5-GHz delay-locked loop (DLL) runs between the two chips to serve two purposes: 1) to guarantee proper timing when the 4:1 MUX/CMU receives the 10-Gb/s data from the 16:4 MUX and 2) to provide a 5-GHz reference clock for the 16:4 MUX chip. This partition prevents the noisy CMOS logic from affecting the high-speed MUX and CMU at the price of additional power consumption for the DLL. The two chips are copackaged to minimize form factor and keep the 10-Gb/s interface lines short and well matched.

The reference clock of this CMU is generated by an external clean-up phase-locked loop (PLL) with 16-kHz bandwidth. This configuration not only reduces the phase noise of the noisy 622-MHz system reference clock, but also helps meet the OC-768 jitter transfer specification [1].

III. SFI-5 INTERFACE

The block diagram of the 16:4 chip is shown in Fig. 4. It includes 17 2.5-Gb/s input data lanes, a digital deskew logic block, and a 128:4 MUX. Each input data lane consists of an input buffer, a clock and data recovery block, and a 1:8 demultiplexer which generates an 8-bit-wide 312.5-Mb/s data stream for the deskew logic. The deskew logic runs at 312.5 Mb/s for a total of 128 lines of data. Outputs from the deskew logic go into the 128:4 MUX to form four lanes of 10-Gb/s data.

The 16:4 chip uses SiGe bipolar circuits to implement functional blocks running at 1.25 Gb/s and above while taking advantage of the 0.18-µm CMOS devices for lower speed blocks. The bipolar blocks run current-mode logic (CML) under a 3.3-V supply. The CMOS blocks use a 1.8-V supply with full-swing logic levels. Level shifters between the CML and CMOS logic reside in the 2.5-Gb/s input data lanes and the final 128:4 MUX.

The SFI-5 compliant 2.5-Gb/s data that are corrupted by low-frequency wander and high-frequency jitter are recovered by the DLLs. Conventional voltage-controlled delay line (VCDL) DLLs have finite delays and, therefore, suffer from limited lock range. Noise transients or leakage in the charge pump during long sequences of 1's or 0's can exhaust the locking range of the DLL. Therefore, a dual-loop architecture [18], [19] is adopted, as shown in Fig. 5.

The first loop, which is a clock phase adjusting loop, uses eight identical differential VCDLs and adjusts their total delay to be one half of the 2.5-GHz reference clock period across process, temperature, and supply voltage variation. These identical delay cells generate eight clock phases spanning 0° to 157.5° through positive taps, and the other eight phases spanning 180° to 337.5° through the complement taps. This complementary architecture reduces the power of the delay line by a factor of two. A bang-bang rather than a linear phase detector is chosen, whose large detection gain helps reduce the phase offset caused by charge-pump leakage and mismatches of the up/down current. The penalty paid for this improvement is a slightly increased bang-bang jitter of 2 ps p-p. Sharing a calibration loop between every two adjacent DLLs reduces the calibration loop power dissipation by another factor of two.

Fig. 3. Partitioning the transmitter into a 16:4 MUX and a 4:1 MUX/CMU.

Fig. 4. Block diagram of the SFI-5 16:4 MUX.

Fig. 5. Block diagram of the CDR loop.
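The tap arrangement of the first loop follows directly from the numbers in the text: eight cells spanning half of a 2.5-GHz period. A short sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Clock phases produced by eight differential delay cells whose total delay
# is locked to half of the 2.5-GHz reference period.
REF_CLOCK_GHZ = 2.5
NUM_CELLS = 8

period_ps = 1000.0 / REF_CLOCK_GHZ              # 400 ps
cell_delay_ps = (period_ps / 2) / NUM_CELLS     # delay of one cell
step_deg = 180.0 / NUM_CELLS                    # phase step per tap

positive_taps = [i * step_deg for i in range(NUM_CELLS)]
complement_taps = [p + 180.0 for p in positive_taps]   # inverted outputs
all_phases = positive_taps + complement_taps

print(f"cell delay: {cell_delay_ps} ps, phase step: {step_deg} degrees")
print("positive taps:  ", positive_taps)
print("complement taps:", complement_taps)
```

Because the complement of each differential tap supplies the second half of the cycle, eight cells are enough to cover all 16 phases.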

The second loop, which is the main loop, is digital. It selects one of the above 16 clock phases to place the sampling clock at the center of the data eye. A 5-bit up/down counter counts the phase detector outputs, followed by a finite-state machine (FSM) to control the selection of the clock phase. Whenever the up/down counter overflows (underflows), the FSM advances (retards) the clock phase by one step. When the clock selection reaches the earliest (latest) of the 16 possible phases, the FSM automatically wraps around to the latest (earliest) clock phase, therefore providing unlimited (modulo $2\pi$) phase shift capability like a conventional PLL. However, this dual-loop DLL does not have the problems of stability and phase-error accumulation that are commonly associated with conventional voltage-controlled oscillator (VCO)-based PLLs. Another advantage of this digital DLL architecture is that long sequences of continuous 1's or 0's in the data will not cause phase drift as in an analog PLL loop with charge-pump leakage. The bandwidth of this digital loop is designed to cover the 1.5-MHz corner in the SFI-5 jitter tolerance specification for 0.65 UI p-p of total input jitter, as shown in Fig. 2.
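The behavior of the digital loop can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name is hypothetical, the counter is assumed to recenter at mid-scale after each overflow or underflow, and the mapping of "up" to "advance" is chosen for illustration; the paper does not give these details.

```python
class PhaseSelector:
    """Behavioral sketch of a counter-plus-FSM phase selector (hypothetical)."""

    NUM_PHASES = 16      # clock phases available from the first loop
    COUNTER_MAX = 31     # 5-bit up/down counter range is 0..31
    COUNTER_MID = 16     # assumed reset value after a phase step

    def __init__(self) -> None:
        self.counter = self.COUNTER_MID
        self.phase = 0

    def step(self, pd_up: bool) -> int:
        """Apply one bang-bang phase detector decision and return the phase index."""
        self.counter += 1 if pd_up else -1
        if self.counter > self.COUNTER_MAX:          # overflow: advance one step
            self.phase = (self.phase + 1) % self.NUM_PHASES
            self.counter = self.COUNTER_MID
        elif self.counter < 0:                       # underflow: retard one step
            self.phase = (self.phase - 1) % self.NUM_PHASES
            self.counter = self.COUNTER_MID
        return self.phase


selector = PhaseSelector()
for _ in range(17):                                  # persistent "down" decisions
    selector.step(pd_up=False)
print("phase after 17 downs:", selector.phase)       # wraps from index 0 to 15
```

The counter acts as a digital loop filter: isolated or alternating decisions cancel out, and only a persistent bias moves the sampling phase. The modulo arithmetic provides the wrap-around that gives the loop its unlimited phase range.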

Because the following deskew logic is designed with the standard cell library, we decided to run it at a low speed of 312.5 Mb/s. Therefore, a 1:8 demultiplexer is needed in every data lane. Fig. 6 shows the block diagram of the deskew block. It receives 17 lanes (16 data plus DSC) of 2.5-Gb/s data on 8-bit-wide 312.5-Mb/s buses. Each lane's data has arbitrary bit alignment within the 8-bit bus. The DSC data has a frame structure containing a 64-bit header followed by 64 bits of data from every one of the 16 data lanes. The bit-aligner subcircuit uses barrel shifters to align the data at byte boundaries as dictated in the DSC. The data then enters the first-in/first-out (FIFO) buffer, which serves two purposes: 1) to transfer all 17 data lanes which have individual clocks to a common master clock derived from the 5-GHz reference clock generated by the 4:1 MUX/CMU chip and 2) to absorb any wander in the data lanes with respect to each other or to the common transmit clock. The deskew logic then uses byte delays to line up the data and passes them to the 128:4 MUX block. The circuit can compensate for 11 bits of static skew and 21 bits of dynamic wander/jitter. The gate count of this deskew core is approximately 12 500. The complete CMOS digital block has about 25 000 gates that include the above deskew core and various built-in self-test (BIST) and diagnostic control functions.

Fig. 6. Block diagram of the deskew logic.
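The bit-alignment step can be illustrated in software. The sketch below is not the paper's circuit: it models a lane as a string of bits with an unknown offset inside the 8-bit bus, searches for a marker byte, and realigns the words, which is the job the barrel shifters perform in hardware. The marker value and function names are hypothetical.

```python
WORD = 8  # bus width after the 1:8 demultiplexer

def find_bit_offset(bits: str, marker: str):
    """Return the shift (0..7) that puts the marker on a word boundary, or None."""
    for shift in range(WORD):
        for start in range(shift, len(bits) - len(marker) + 1, WORD):
            if bits[start:start + len(marker)] == marker:
                return shift
    return None

def barrel_shift(bits: str, shift: int) -> list:
    """Drop the first `shift` bits and regroup the rest into whole 8-bit words."""
    aligned = bits[shift:]
    return [aligned[i:i + WORD] for i in range(0, len(aligned) - WORD + 1, WORD)]

MARKER = "11110110"                      # hypothetical marker byte
payload = MARKER + "00101000" + "10101010"
stream = "101" + payload                 # three stray bits model the misalignment

shift = find_bit_offset(stream, MARKER)
print("detected shift:", shift)
print("aligned words: ", barrel_shift(stream, shift))
```

In the real device the deskew channel supplies the reference data, and FIFO and byte-delay stages follow the bit aligner; none of that is modeled here.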

The 128:4 multiplexer consists of four parallel 32:1 MUX blocks. Each MUX has a tree-type 2:1 multiplexing structure, and the 2:1 MUX is based on a conventional five-latch configuration. The 128 lines of 312.5-Mb/s single-ended CMOS data from the deskew logic are multiplexed into four streams of CML data at 10 Gb/s. CMOS-to-CML converters are placed at the 32:8 CMOS MUX outputs where the data rate is 1.25 Gb/s. The final 10-Gb/s output has 800 mV p-p of differential swing. These four lanes of 10-Gb/s data are phase-aligned to the ideal sampling points of the next 4:1 MUX/CMU input sampler by an analog DLL, as described in the next section.
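The data rates through one 32:1 tree can be tabulated stage by stage. This short sketch, with a function name chosen here for illustration, confirms that the 32:8 point where the CMOS-to-CML converters sit runs at 1.25 Gb/s.

```python
def mux_tree_rates(input_rate_mbps: float, num_inputs: int) -> list:
    """Return (lines, rate in Mb/s) after each 2:1 stage of a binary MUX tree."""
    stages = []
    lines, rate = num_inputs, input_rate_mbps
    while lines > 1:
        lines //= 2          # each 2:1 stage halves the number of lines
        rate *= 2            # and doubles the per-line data rate
        stages.append((lines, rate))
    return stages

stages = mux_tree_rates(312.5, 32)
for lines, rate in stages:
    print(f"{lines:2d} line(s) at {rate:7.1f} Mb/s")
```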

IV. 40–43-Gb/s MULTIPLEXER

The block diagram of the 4:1 MUX/CMU is shown in Fig. 7. The multiplexer accepts four differential 10-Gb/s CML inputs. Input data timing alignment is achieved with a DLL with an on-chip loop filter providing a 5-GHz half-rate or 10-GHz full-rate clock to the preceding 16:4 MUX. The 16:4 MUX uses this as a master clock. Among other things, this clock directly times the 10-Gb/s differential output data. The DLL adjusts this clock to align the received data to a local 10-GHz clock for sampling with conventional emitter-coupled-logic (ECL) latches. The DLL is a conventional first-order loop with a bang-bang-type phase detector and is unconditionally stable.

For data timing alignment, the DLL adjusts the 5-GHz clock to optimize the timing for a single 10-Gb/s data input. This adjustment also provides approximate alignment for the remaining three data inputs. It is assumed that the skews among the inputs are acceptably small so that individual data input skew compensation is unnecessary. Electromagnetic simulations of the chip-to-chip physical interface indicate that package-related skews should be less than a few picoseconds, and the on-chip clock distribution trees and data routes are laid out as symmetrically as possible to minimize the contribution of data path mismatch to skew. The DLL loop has about 60-MHz bandwidth and produces less than 2 ps of bang-bang jitter in operation.

The incoming data is retimed and multiplexed up to 40 Gb/s using local clocks fed from the CMU. A conventional tree structure MUX is used with each 2:1 MUX containing five latches and one selector. Clock delay-compensation buffers on the 20-GHz clocks correct for propagation delay through the 10-Gb/s latches and subsequent selectors to improve setup-and-hold margins for the 20-Gb/s data latches. The final stage of the multiplexer uses a half-rate architecture with no retiming latch after the selector.

A four-stage traveling-wave amplifier (TWA) with on-chip 50-Ω terminations drives the 40-Gb/s output with a 1-V p-p differential CML signal. A TWA was chosen out of convenience to provide a modest edge rate enhancement while spanning the width of the die to reach the output pads. The gain stage was conventional with two levels of emitter-follower buffers driving a differential pair. Simulated gain was 8 dB, with a rise time of under 10 ps and a bandwidth of greater than 34 GHz. The input and output transmission lines in the TWA are implemented with coupled microstriplines having metal shields in a low-level metal layer. This use of metal ground shields eliminates coupling to the lossy silicon substrate. To characterize the loss, a test line was implemented and measured, showing less than 0.5-dB/mm attenuation at 40 GHz. Details of the test line characterization can be found in [21].

Fig. 7. Block diagram of the 4:1 MUX/CMU.

A significant design challenge is to secure stable timing margins over process, temperature, and supply voltage variations when propagation delays are a significant fraction of the 25-ps bit period. The demands on timing accuracy are particularly acute in the 20-Gb/s latches and final selector stage, where small timing deviations contribute to eye closure and degradation in bit-error ratio (BER) performance.

Fig. 8 shows circuit diagrams of a 20-Gb/s latch and a 40-Gb/s MUX and illustrates the clocking architecture used in the 20-Gb/s section of the multiplexer to ensure timing stability. The latch and multiplexer use a conventional ECL design with emitter follower buffering to improve speed. Diode level shifters reduce the collector voltage of the current source devices to prevent breakdown. In the MUX clocking architecture, an in-phase clock retimes the selector input, while a quadrature clock operates the selector itself. If the propagation delays of the active circuitry were zero, the quadrature clock would have high and low levels precisely centered in the 20-Gb/s data eyes. This condition maximizes the selector timing margin.

In reality, the retimed data is delayed significantly due to the latch clock-to-Q delay and clock propagation delay. To restore and stabilize the desired timing relationship between the selector clock and retimed data, a replica delay-compensation buffer provides a matched delay for the quadrature clock signal. This buffer has multiple stages, each designed to replicate a particular delay in the sequence of stages traversed by the in-phase clock as it propagates through various buffers and data latches. With such replica compensation, the relative in-phase and quadrature clock timing will match properly as operating conditions change. The resulting timing relationships are illustrated in the timing diagram of Fig. 8. Note that the selector clock is positioned in the middle of the 20-Gb/s data eye, maximizing the timing margin and stabilizing performance against process, voltage, and temperature variations. The benefits of the technique are confirmed by experimental measurements indicating reliable operation of the 4:1 MUX/CMU over a very wide range of data rates from 36 Gb/s to over 50 Gb/s [16].
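The benefit of replica compensation can be seen in an idealized timing model. The sketch below assumes zero transition times and a selector that passes each 20-Gb/s input for half of the 20-GHz clock period; the delay values are hypothetical and are not taken from the paper.

```python
def selector_margin_ps(data_delay_ps: float, replica_delay_ps: float,
                       data_rate_gbps: float = 20.0) -> float:
    """Idealized timing margin at the final 2:1 selector.

    Each input bit is valid for one 20-Gb/s bit period, and the selector
    passes it for half of that time. Centering the select window in the
    data eye leaves equal slack on both sides; any mismatch between the
    data-path delay and the replica delay on the quadrature clock eats
    into that slack.
    """
    eye_ps = 1000.0 / data_rate_gbps          # 50 ps at 20 Gb/s
    window_ps = eye_ps / 2                    # 25-ps select window
    ideal_margin_ps = (eye_ps - window_ps) / 2
    mismatch_ps = data_delay_ps - replica_delay_ps
    return ideal_margin_ps - abs(mismatch_ps)

# Hypothetical delays for illustration only.
print("no compensation:      ", selector_margin_ps(15.0, 0.0))
print("replica, nominal:     ", selector_margin_ps(15.0, 14.0))
print("replica, slow corner: ", selector_margin_ps(15.0 * 1.3, 14.0 * 1.3))
```

In this model an uncompensated 15-ps data delay would already exceed the available slack, whereas a replica that tracks the data path keeps most of the margin even when both delays scale together across operating conditions.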

The high operating frequency calls for meticulous attention to detail in the layout, particularly in the 20-GHz clock distribution network, where in-phase and quadrature clock delay matching is critical. The layout makes extensive use of on-chip transmission lines (microstriplines), which were carefully modeled in design simulations using seven-segment RLC models supplied by the foundry. The use of microstriplines is advantageous in that the bottom shield (implemented in a lower level metal layer) eliminates any attenuation due to coupling to the lossy silicon substrate.

Due to the number of transmission lines required, the chip was carefully floor planned to reserve critical signal routes, as is evident in the die photo of Fig. 12. For clock matching, the transmission line routes were approximately delay matched to the same electrical length. For selected critical lines, electromagnetic modeling tools were used, particularly where an accurate assessment of the line attenuation was required (e.g., the microstriplines in the VCO resonator [20], [21]).

Fig. 8. Schematic and timing diagram of the 40-Gb/s 2:1 MUX.

V. CLOCK MULTIPLYING UNIT

A fully integrated 20-GHz CMU provides in-phase and quadrature clocks to the multiplexer. The CMU uses an on-chip VCO with coarse and fine tuning control. An analog PLL with on-chip loop filter sets the fine control voltage, while the coarse control is set digitally. This technique permits a smaller VCO gain slope for the analog loop and reduces the sensitivity to supply noise injection. The PLL sets the VCO frequency to track an external 625-MHz or 2.5-GHz reference clock over about 40 MHz of bandwidth. Upon reset, an on-chip calibration loop automatically sets the coarse frequency control according to the reference frequency supplied.

To satisfy the OC-768 jitter generation specification, a quadrature-coupled LC-type VCO is used, as shown in Fig. 9. Due to the high operating frequency, a significant design challenge in the VCO is to mitigate the effects of active device delay. This delay, represented by T in Fig. 9, contributes a significant negative phase shift to the oscillator loop which causes the VCO to oscillate off-resonance. Problems caused by this behavior include reduced operating amplitude, increased jitter, and reduced startup margin. By coupling two oscillators together in quadrature, as shown in Fig. 9, it is possible to introduce a compensating positive phase shift that largely corrects for the active device delay and allows the oscillator to run at a frequency close to the resonator natural frequency. The degree of compensation is determined by the coupling coefficient.

This coupling dramatically improves the oscillator performance without increasing power consumption, as explained in [20] and [21]. For example, Fig. 10 shows the simulated phase noise of a 40-GHz VCO implemented using this technique compared to those of standalone and in-phase coupled VCOs using the same active circuitry and resonator. The standalone and in-phase coupled VCOs operate at the same frequency. The off-resonance oscillation in either configuration has a reduced amplitude due to the delay through the active devices, which contributes about 65° of negative phase shift. Simple in-phase coupling of two oscillators provides a 3-dB phase noise improvement compared to a single oscillator, commensurate with the associated doubling in power consumption. In comparison, the quadrature-coupled VCO, with a coupling coefficient chosen to provide a positive compensating phase shift of about 65°, exhibits about 11-dB better phase noise than an in-phase coupled pair despite consuming the same power. This coupling coefficient was selected to maximize the operating amplitude of the VCO and minimize its phase noise, and the resulting value is in excellent agreement with the simple theory described above. The 20-GHz oscillator used in this work employs the same architecture as this 40-GHz example [20]. However, due to the lower frequency of oscillation, a coupling coefficient corresponding to a 45° phase shift provided optimal performance.

Fig. 9. Schematic and block diagram of the quadrature coupled 20-GHz VCO.

Fig. 10. Simulated phase noise for three types of oscillators.
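As a sketch of why quadrature coupling yields a positive phase shift, consider a simple phasor model. This is an assumption made here for illustration and not necessarily the exact formulation of [20], [21]: each oscillator core is driven by its own signal plus a copy of the other oscillator's signal, which leads by 90° and is scaled by a coupling coefficient, written $m$ here. The symbols $m$, $\phi$, and $V_{\text{drive}}$ are illustrative.

```latex
V_{\text{drive}} = V\,(1 + j\,m),
\qquad
\phi = \arg(1 + j\,m) = \tan^{-1}(m)
```

Under this model a unity coupling coefficient gives $\phi = 45^\circ$, and a compensating shift of about $65^\circ$ requires $m = \tan 65^\circ \approx 2.1$.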

The active circuitry in the oscillator core is illustrated in Fig. 9. The resonator uses microstriplines with MOS accumulation-mode varactor tuning. The oscillation is refreshed each cycle by a pseudodifferential arrangement of pulse-forming cells that behave similarly to a Colpitts oscillator. Each pulse-forming cell comprises an emitter-follower/common-emitter pair. The emitter follower helps to isolate the pulse device parasitics from the resonator, and the emitter of the common-emitter device is tightly coupled to the shield of the microstrip resonator through a capacitor. Quadrature coupling between oscillators is established with a simple capacitive divider network at the emitter-follower inputs, rendering a fixed coupling coefficient that only depends on the ratio of the two capacitors. The simulated phase noise of this 20-GHz VCO is about −101 dBc/Hz at 1-MHz offset.

Another important factor in minimizing jitter generation is the design of the clock synthesizing PLL. Because the 622-MHz reference clock in a 40-Gb/s transponder is noisy, we decided to use two loops in cascade. An external clean-up PLL with 16-kHz bandwidth rejects the jitter of the reference before it is used by the on-chip CMU. The output of the clean-up PLL is either 622 MHz or 2.5 GHz. The on-chip 20-GHz CMU then uses a relatively wide bandwidth of 40 MHz to take advantage of this clean reference to suppress VCO phase noise.

Further investigation reveals that because the low VCO jitter is comparable to those contributed from the reference and the PLL (reference clock buffer, phase-frequency detector, charge pump, etc.), one can optimize the loop bandwidth by comparing contributions from those two sources. Fig. 11 illustrates that depending on the choice of loop bandwidth, one source can contribute more than the other to the CMU output. For example, the reference noise (assumed white) is attenuated by the low-pass characteristic of the CMU above the CMU loop bandwidth. Thus, its contribution to total jitter power is proportional to $f_{BW}$. Conversely, it is well known that the VCO contribution to jitter is reduced as $f_{BW}$ is increased. In particular, the total jitter power is inversely proportional to $f_{BW}$, provided that the VCO phase noise follows the familiar $1/f^2$ slope [21]. This is due to the fact that the VCO phase noise is attenuated by a high-pass characteristic below the CMU loop bandwidth. Drawing on these observations alone, the total output jitter of the CMU can be expressed as

$$\sigma_{\text{total}}^2 = k\left(\frac{f_{BW}}{f_{opt}} + \frac{f_{opt}}{f_{BW}}\right)$$

where $f_{BW}$ denotes the loop bandwidth, $f_{opt}$ denotes the optimal loop bandwidth, and $k$ is a scaling factor. The first term is the white-noise contribution from the reference and PLL circuits. The second term is the VCO phase noise contribution. Thus, if the CMU bandwidth is too high, the noise power is dominated by the reference and PLL circuits, as shown on the left in Fig. 11. It is worth noting that, in this condition, the optimum bandwidth can be determined by measuring where the free-running VCO phase noise crosses the closed-loop phase noise, which is dominated by the reference. Similarly, if the CMU bandwidth is too low, the noise power is dominated by the VCO. However, in this condition, a measurement of the free-running VCO phase noise provides no information about loop bandwidth optimality. At the optimal bandwidth, both noise sources contribute equally and the total random jitter generation reaches a minimum. This result is a simple consequence of the mathematical forms of the jitter, given in the above equation. Our calculation shows that the 40-MHz bandwidth implemented in the CMU is larger than optimal such that the jitter is dominated by reference noise. Thus, further improvement on random jitter generation is possible and the optimum bandwidth can be determined by a comparison with open-loop VCO phase-noise measurements. Nevertheless, the existing loop provides 7-dB better performance than required by the OC-768 specification, as described in the next section.

Fig. 11. Double PLL CMU structure and loop bandwidth optimization.
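The bandwidth trade-off described above can be checked numerically. The sketch assumes the two-term form implied by the text: a reference term proportional to the loop bandwidth and a VCO term inversely proportional to it, normalized so that both are equal at the optimum. The function name, the symbol names, and the 10-MHz optimum are illustrative assumptions; the paper does not state the optimal bandwidth.

```python
import math

def total_jitter_power(f_bw_hz: float, f_opt_hz: float, k: float = 1.0) -> float:
    """Two-term jitter model: reference/PLL noise grows with loop bandwidth,
    VCO noise shrinks with it; both terms equal k at the optimum."""
    reference_term = k * f_bw_hz / f_opt_hz
    vco_term = k * f_opt_hz / f_bw_hz
    return reference_term + vco_term

F_OPT_HZ = 10e6   # hypothetical optimum, for illustration only

# Sweep two decades of loop bandwidth on a logarithmic grid.
candidates = [F_OPT_HZ * 10 ** (step / 20) for step in range(-20, 21)]
best = min(candidates, key=lambda f: total_jitter_power(f, F_OPT_HZ))

penalty_db = 10 * math.log10(
    total_jitter_power(40e6, F_OPT_HZ) / total_jitter_power(F_OPT_HZ, F_OPT_HZ)
)
print(f"minimum found at {best / 1e6:.1f} MHz")
print(f"penalty at 40 MHz for a 10-MHz optimum: {penalty_db:.2f} dB")
```

The model is symmetric on a logarithmic axis: running at twice or at half the optimal bandwidth costs the same amount of jitter power, which is why the minimum is fairly shallow.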

VI. EXPERIMENTAL RESULTS

Fig. 12. (a) Micrograph of 16:4 MUX. (b) 4:1 MUX/CMU.

Fig. 13. Measurement setup of the copackaged 16:1 MUX/CMU.

Die photos of the 16:4 MUX and the 4:1 MUX/CMU are shown in Fig. 12. The die areas are 15.6 mm² and 8.25 mm², respectively. These two dies are copackaged in a single custom ceramic package. Fig. 13 shows the measurement setup that includes a complete 40–43-Gb/s optical test bed. Unless noted, the measurement data presented here were taken in this configuration with a 2.5-GHz reference clock.

Fig. 14. 2.5-Gb/s interface jitter tolerance measurement.

Fig. 15. 16:4 MUX 10-Gb/s output eye. PRBS data, single-ended output.

To test the 16:4 MUX, we programmed the Agilent 81250 ParBERT to have various combinations of 11 bits of static skew between the 17 data lanes and achieved error-free operation with PRBS input data. Jitter tolerance has been measured using a two-step technique. For jitter frequencies from 1 to 200 kHz, the reference clock for the ParBERT is directly FM modulated to create 17 lanes of jittery data. Since the ParBERT filters out any reference clock jitter beyond 200 kHz, we modulated the 5-GHz reference clock of the 16:4 MUX for jitter frequencies above 200 kHz. This modulation produces jitter at the clock input to the DLLs, and the tolerance of the DLLs to this jitter is equivalent to the data jitter tolerance for the frequencies of interest. The measured jitter tolerance at 2.488 Gb/s is shown in Fig. 14, exceeding the SFI-5 specification [4]. Note that the data measured using the two techniques are in approximate agreement where they meet. This confirms the validity of the two-step measurement technique. Fig. 15 shows one of the single-ended 10-Gb/s output eyes from the 16:4 MUX. The total number of waveforms is approximately 1000, at which point the eye had completely stabilized. (Note that the number of measurement hits for the jitter measurement is less than the number of waveforms displayed.) This measurement is wafer-probed, as the 10-Gb/s data interface is inaccessible in the packaged assembly. There is some amount of eye distortion due to the asymmetric drive of the output when viewed single-ended. When packaged, the data is driven and sensed differentially. Therefore, eye symmetry is guaranteed by virtue of the differential symmetry. Skew among the four 10-Gb/s outputs is very difficult to ascertain at wafer probe due to cable and probe mismatches, but the copackaged chips operate error-free reliably over all environmental and supply conditions.

The input sensitivity of the 2.5-Gb/s interface was measured to be better than 100-mV p-p differential, and the 16:4 MUX achieves error-free operation with the input data rate varying from 36 to 43 Gb/s, which spans the full tuning range of the measurement equipment. This 16:4 MUX dissipates 4.2 W from a 3.3-V supply and 0.59 W from a 1.8-V supply. Table I summarizes the 16:4 MUX measurement results.

TABLE I. PERFORMANCE SUMMARY OF THE 16:4 MUX

Fig. 16. (Top) 40-Gb/s eye at the MUX output. (Bottom) 1010 pattern at the MUX output, indicating random jitter generation.

The 40-Gb/s data eye of the packaged 16:1 MUX/CMU with PRBS input data is shown in Fig. 16. The total number of waveforms displayed is about 1000, at which point the eye had completely stabilized. The eye is measured with a digital communications analyzer (DCA) having less than 200-fs rms intrinsic jitter. The DCA was triggered at 10 GHz for this measurement. The eyes are clearly open, entirely free of sparkle, and exhibit excellent symmetry between adjacent eyes. The measured data jitter is 880 fs rms and 5.1 ps p-p. Packaged TWA buffer risetime is typically less than 10 ps and the output return loss is typically better than 12 dB below 40 GHz.

Fig. 17. Measured open-loop and closed-loop VCO phase noise.

A 1010 input pattern was also used to determine the CMU random jitter generation. This jitter was about 250 fs rms on the DCA, including timebase jitter. The phase noise of the 1010 pattern was also measured on a spectrum analyzer. The results, shown in Fig. 17, can be used to estimate the high-band jitter to check compliance with OC-768 jitter generation limits. The phase noise at offsets below about 1 MHz is determined by the reference oscillator provided by the clean-up PLL. We also measured the free-running VCO phase noise, also shown in Fig. 17, by shorting out the control voltage of the VCO. With reference to Fig. 11 and the accompanying discussion, by examining where the free-running phase-noise data intercept the closed-loop data, we determine that the measured 40-MHz loop bandwidth is wider than the optimal bandwidth of about 15 MHz. Applying the expression in Fig. 11, we conclude that reoptimizing the CMU loop bandwidth would potentially reduce the total CMU jitter generation by about 1.8 dB. Nonetheless, the measured high-band jitter is 82 fs rms (about 1.15 ps p-p) including the reference, which is already better than the OC-768 specification [1] requires by about 7 dB. The total random jitter generation of the CMU (excluding the low-frequency reference oscillator noise) is estimated at 125 fs rms, which is roughly consistent with DCA measurements, given an estimated DCA timebase jitter of less than 200 fs rms.
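Several of the figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on three assumptions that the text does not confirm: (1) the jitter power follows the normalized form K(f_BW/f_opt + f_opt/f_BW); (2) the rms to peak-to-peak conversion assumes Gaussian jitter evaluated at a BER of 1e-12; and (3) the CMU jitter and the DCA timebase jitter are uncorrelated and add in root-sum-square fashion. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def bandwidth_penalty_db(f_bw, f_opt):
    """Jitter power penalty in dB relative to the optimum loop bandwidth."""
    r = f_bw / f_opt
    return 10.0 * math.log10((r + 1.0 / r) / 2.0)


def rms_to_pp(sigma, ber=1e-12):
    """Peak-to-peak jitter for Gaussian rms jitter sigma at the given BER (assumed convention)."""
    q = -NormalDist().inv_cdf(ber)  # one-sided Gaussian quantile
    return 2.0 * q * sigma


def rss(*terms):
    """Root-sum-square combination of uncorrelated jitter terms."""
    return math.sqrt(sum(t * t for t in terms))


# 40-MHz implemented bandwidth versus the ~15-MHz optimum from the text.
penalty = bandwidth_penalty_db(40e6, 15e6)

# 82 fs rms high-band jitter converted to peak-to-peak, in picoseconds.
pp_ps = rms_to_pp(82e-15) * 1e12

# 125 fs rms CMU jitter combined with the 200 fs rms upper limit on timebase jitter.
dca_upper_fs = rss(125.0, 200.0)

if __name__ == "__main__":
    print(f"bandwidth penalty: {penalty:.2f} dB")
    print(f"high-band jitter:  {pp_ps:.2f} ps p-p")
    print(f"DCA reading bound: {dca_upper_fs:.0f} fs rms")
```

Under these assumptions the penalty evaluates to about 1.8 dB and the peak-to-peak value to about 1.15 ps, in line with the quoted numbers, while the combined DCA figure comes out somewhat below the roughly 250 fs rms observed.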

Since the 40-Gb/s output is not retimed, its eye timing symmetry can be compromised if the 20-GHz clock has duty cycle distortion. To assess this, we measured the eye timing symmetry of the 40-Gb/s output over various data rates and temperature settings. Fig. 18 shows the result of ten different units running between 37 and 46 Gb/s at low and high temperatures. The MUX/CMU shows better than 1% eye timing asymmetry over the 40–43-Gb/s operating range with temperature variations. Note that 50% timing symmetry corresponds to the case of perfect 20-GHz clock duty cycle.
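The link between clock duty cycle and eye timing symmetry can be sketched with an idealized model. The snippet below assumes that the final 2:1 stage passes one input during each phase of the half-rate clock, with ideal edges, so that the widths of adjacent eyes track the clock high and low times directly. The function names and the 51% duty-cycle example are hypothetical, and the exact definition of asymmetry used for the measurements is not given in the text.

```python
def eye_widths_ps(bit_rate_gbps, duty_cycle):
    """Widths (in ps) of two adjacent eyes from an ideal half-rate-clocked 2:1 MUX.

    duty_cycle is the fraction of the clock period spent high (0 < duty_cycle < 1).
    """
    if not 0.0 < duty_cycle < 1.0:
        raise ValueError("duty_cycle must be strictly between 0 and 1")
    clock_period_ps = 2.0 * 1000.0 / bit_rate_gbps  # half-rate clock: two bits per period
    return duty_cycle * clock_period_ps, (1.0 - duty_cycle) * clock_period_ps


def timing_symmetry_percent(width_a_ps, width_b_ps):
    """Share of the two-bit period occupied by the first eye; 50% is perfectly symmetric."""
    return 100.0 * width_a_ps / (width_a_ps + width_b_ps)


if __name__ == "__main__":
    for rate in (40.0, 43.0):
        a, b = eye_widths_ps(rate, 0.51)  # illustrative 51% duty cycle
        print(f"{rate:.0f} Gb/s: eyes {a:.2f} ps / {b:.2f} ps, "
              f"symmetry {timing_symmetry_percent(a, b):.1f}%")
```

In this simplified picture a 51% duty cycle at 40 Gb/s widens one eye to about 25.5 ps and narrows its neighbor to about 24.5 ps, which shows why duty cycle distortion translates directly into unequal adjacent eyes when the output is not retimed.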

The tuning range of the CMU over nine different samples is shown in Fig. 19. The three discrete curves for every sample correspond to three digital coarse tuning settings. This indicates enough tuning range to reliably cover 40–43-Gb/s operation, and good die-to-die uniformity. Also note that the maximum operating data rate at which data eyes are still open is limited by the signal path bandwidth rather than by the CMU tuning range.

The 4:1 MUX/CMU consumes 4.8 W from a 5.2-V supply and 0.18 W from a 1.8-V supply. The measurement results are summarized in Table II.

Fig. 18. Eye timing symmetry statistics.

Fig. 19. 20-GHz VCO tuning range over different samples. The three parallel lines of each sample are for three coarse tuning settings.

TABLE II. PERFORMANCE SUMMARY OF THE 4:1 MUX

The packaged 16:1 MUX/CMU achieved error-free operation with PRBS input data for 12 hours, at which point the test was terminated. This result implies a BER of less than with 50% confidence. Finally, based on measurements across numerous units from multiple wafer lots, the performance reported here (including BER performance) is typical of packaged devices and has been found to be repeatable with high yield across process, temperature, and supply variations.
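The relationship between an error-free test duration and a BER bound can be sketched as follows. This is a minimal illustration using the standard zero-error bound, BER < -ln(1 - C) / N for N transmitted bits at confidence level C; the 40-Gb/s line rate is an assumption made here for the example, and the function name is hypothetical.

```python
import math


def ber_upper_bound(bit_rate_bps, duration_s, confidence):
    """Upper bound on BER after an error-free run, at the given confidence level.

    With zero observed errors in N bits, the probability of seeing no errors
    at a true error ratio p is (1 - p)^N ~ exp(-N * p). Setting this equal to
    (1 - confidence) and solving for p gives the bound.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    n_bits = bit_rate_bps * duration_s
    return -math.log(1.0 - confidence) / n_bits


# 12 hours error-free at an assumed 40 Gb/s, 50% confidence.
bound = ber_upper_bound(40e9, 12 * 3600, 0.50)

if __name__ == "__main__":
    print(f"BER upper bound: {bound:.2e}")
```

Under the assumed line rate the bound is on the order of 4e-16; a higher confidence level or a shorter run loosens it proportionally.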

VII. CONCLUSION

We have presented the design and measurement results of a two-chip 40–43-Gb/s 16:1 MUX/CMU in a SiGe process. Design challenges and solutions of critical blocks, such as the SFI-5 interface, 40-Gb/s MUX, VCO, and PLL, are discussed. In addition, a novel technique is presented for stabilizing timing margins in the final high-speed multiplexer stage using in-phase and quadrature clocks. To the best of our knowledge, this is the first 40–43-Gb/s MUX/CMU that satisfies both the OC-768 transmit specification and the SFI-5 interface specification over supply, process, and temperature variations.

ACKNOWLEDGMENT

The authors would like to thank C. Bowen, J. Cancio, L. Y. Lee, R. Patterson, and R. Santoro for their layout effort, K. Yee, D. Pritzkau, and J. P. Mattia for their invaluable assistance gathering measurement data, and D. Wong for CAD support.

REFERENCES

[1] “The control of jitter and wander in the optical transport network,” ITU-T, Recommendation G.8251, Draft, revision 04.0, Oct. 2001.

[2] T. Palkert, “System interface level 5 (SxI-5): Common electrical characteristics for 2.488–3.125 Gbps parallel interfaces,” Optical Internetworking Forum, OIF-SxI-5-01.0, Oct. 2002.

[3] P. Dartnell et al., “Serdes framer interface level 5 (SFI-5): Implementation agreement for 40 Gb/s interface for physical layer devices,” Optical Internetworking Forum, OIF-SFI5-01.0, Jan. 2002.

[4] A. Sanders, “Jitter methodology telecom visualization foils,” Optical Internetworking Forum, OIF2001.642.15, June 2002.

[5] K. Runge et al., “40 Gbit/s AlGaAs/GaAs HBT 4:1 multiplexer IC,” Electron. Lett., vol. 31, no. 11, pp. 876–877, May 25, 1995.

[6] H. Suzuki et al., “InP/InGaAs HBT ICs for 40 Gbit/s optical transmission systems,” in GaAs IC Symp. Tech. Dig., 1997, pp. 215–218.

[7] J. P. Mattia et al., “High-speed multiplexers: A 50 Gb/s 4:1 MUX in InP HBT technology,” in GaAs IC Symp. Tech. Dig., 1999, pp. 189–192.

[8] U. Nowotny et al., “44 Gbit/s 4:1 multiplexer and 50 Gbit/s 2:1 multiplexer in pseudomorphic AlGaAs/GaAs-HEMT technology,” in Proc. IEEE Int. Symp. Circuits and Systems, vol. 2, 1998, pp. 201–203.

[9] K. Sano et al., “50-Gbit/s 4-bit multiplexer/demultiplexer chip-set using InP HEMTs,” in GaAs IC Symp. Tech. Dig., 2002, pp. 207–210.

[10] Y. Nakasha et al., “A 43-Gb/s full-rate-clock 4:1 multiplexer in InP-based HEMT technology,” IEEE J. Solid-State Circuits, vol. 37, pp. 1703–1709, Sept. 2002.

[11] T. Masuda et al., “40 Gb/s 4:1 multiplexer and 1:4 demultiplexer IC module using SiGe HBTs,” in IEEE MTT-S Int. Microwave Symp. Dig., vol. 3, 2001, pp. 1697–1700.

[12] K. Washio et al., “A 50-GHz static frequency divider and 40-Gb/s MUX/DEMUX using self-aligned selective-epitaxial-growth SiGe HBTs with 8-ps ECL,” IEEE Trans. Electron Devices, vol. 48, pp. 1482–1487, July 2001.

[13] M. Meghelli et al., “50 Gb/s SiGe BiCMOS 4:1 multiplexer and 1:4 demultiplexer for serial communication systems,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, vol. 45, 2002, pp. 260–261.

[14] G. Freeman et al., “40 Gb/s circuits built from a 120 GHz fT SiGe technology,” IEEE J. Solid-State Circuits, vol. 37, pp. 1106–1114, Sept. 2002.

[15] M. Meghelli et al., “A 0.18-μm SiGe BiCMOS receiver and transmitter chipset for SONET OC-768 transmission systems,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, vol. 46, 2003, pp. 230–231.

[16] D. Shaeffer et al., “A 40/43 Gb/s SONET OC-768 SiGe 4:1 MUX/CMU,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, vol. 46, 2003, pp. 236–237.

[17] M. Xu et al., “An SFI-5 compliant 16:4 multiplexer for OC-768 systems,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, vol. 46, 2003, pp. 238–239.

[18] J. Sonntag and R. Leonowich, “A monolithic CMOS 10 MHz DPLL for burst-mode data retiming,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, vol. 33, 1990, pp. 194–195.


[19] S. Sidiropoulos, “High performance inter-chip signalling,” Ph.D. dissertation, Stanford Univ., Stanford, CA, 1998.

[20] D. Shaeffer, “Microstrip coupled VCOs for 40 GHz and 43 GHz OC-768 optical transmission,” in ESSCIRC Dig. Tech. Papers, 2002, pp. 535–538.

[21] D. Shaeffer and S. Kudszus, “Performance-optimized microstrip coupled VCOs for 40 GHz and 43 GHz OC-768 optical transmission,” IEEE J. Solid-State Circuits, vol. 38, pp. 1130–1138, July 2003.

[22] A. Joseph et al., “A 0.18 μm BiCMOS technology featuring 120/100 GHz (fT/fmax) and ASIC-compatible CMOS using copper interconnect,” in Proc. IEEE BCTM, 2001, pp. 143–146.

Hai Tao (S’95–M’99) received the B.S. degree in physics from the University of Science and Technology of China, Hefei, China, in 1993, and the M.A. degree in physics, M.Phil., and Ph.D. degrees in electrical engineering from Columbia University, New York, NY, in 1995, 1997, and 1999, respectively. His Ph.D. research focused on theory and design of high-speed bandpass sigma-delta modulators.

In the summer of 1996, he was an intern at Philips Research, Briar Cliff, NY, designing frequency synthesizers for a customized wireless network. In 1997, he interned at Bell Laboratories, Lucent Technologies, Murray Hill, NJ, designing gigabit bandpass sigma-delta modulators. From 1999 to 2000, he was a Member of Technical Staff with the Microelectronics Division, Lucent Technologies, Murray Hill, designing high-speed analog and mixed signal ICs for Gigabit Ethernet, SONET, and storage applications. Since 2000, he has been with Big Bear Networks, Sunnyvale, CA, working on high-speed analog ICs for 10-Gb/s and 40-Gb/s optical communications. His technical interests include circuit and system design for wired, wireless, and optical communications.

Derek K. Shaeffer (S’94–M’98) received the B.S.E.E. degree from the University of Southern California, Los Angeles, in 1993 and the M.S.E.E. and Ph.D. degrees from Stanford University, Stanford, CA, in 1995 and 1999, respectively. His Ph.D. work was in the field of CMOS RF, demonstrating the world’s first CMOS GPS receiver.

From 1992 to 1997, he worked for Tektronix, Inc., where he did design work in spectrum analysis, precision A/D conversion, and RF communications. In 1998, he joined Matrix Semiconductor, where he did work on next-generation memory technologies. In 1999, he cofounded FreeSpace Communications, where he worked on last-mile broad-band wireless access networks. Since 2001, he has been with Big Bear Networks, Sunnyvale, CA, where his current work is in 10-Gb/s and 40-Gb/s optical communications circuits and systems. He is the author or coauthor of six patents, 18 papers, and a book on CMOS RF design. His research has included the fields of test instrumentation, semiconductor memory, and wireless and optical communications.

Min Xu (S’97–M’01) received the B.S. degree in physics from the University of Science and Technology of China, Hefei, China, in 1994 and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1997 and 2001, respectively.

She is currently a Senior Design Engineer with Big Bear Networks, Sunnyvale, CA, working on high-speed electronics for optical communications with data rates up to 40 Gb/s. Her research interests include substrate noise in mixed-signal circuits and system and integrated circuit design.

Saied Benyamin received the Bachelor’s degree (magna cum laude) in electrical engineering from the University of California at San Diego in 1984.

He started his career in image processing and worked on pioneer projects of the MPEG standard at General Instruments. He joined Tekelec, a network test equipment company, in 1991 and has since been with two successful networking startups. He currently manages the Digital Design Team of Big Bear Networks, Sunnyvale, CA, where he works on 10-G and 40-G transponder ICs.

Vincent Condito (M’79–SM’99) received the B.S.E.E. degree from the Massachusetts Institute of Technology, Cambridge, in 1976.

He was a consultant doing analog IC design before joining Big Bear Networks, Sunnyvale, CA.

Steffen Kudszus (S’99–M’02) received the Dipl.-Ing. degree in electrical engineering from the University of Stuttgart, Stuttgart, Germany, in 1996 and the Dr.-Ing. (Ph.D.) degree from the University of Karlsruhe, Karlsruhe, Germany, in 2001.

In 1996, he joined the High Frequency Devices and Circuits Department, Fraunhofer Institute for Applied Solid State Physics (IAF), Freiburg, Germany. His main research areas were the design of monolithic microwave integrated circuits (MMICs) using HEMTs on GaAs and InP, simulation, and linear and nonlinear device modeling up to 140 GHz. He concentrated on microwave and millimeter-wave oscillators with novel stabilization techniques, covering all aspects of simulation, design, and high-frequency 1/f-noise and phase-noise measurements. Since 2001, he has been with Big Bear Networks, Sunnyvale, CA, in the research and development of integrated circuits in SiGe bipolar technology for high-speed communication at 10, 40, and 43 Gb/s, including design, packaging, and characterization.

Qinghung Lee (M’99) was born in Beijing, China. She received the B.S. degree in physics from Sichuan University, Chengdu, China, in 1986, the M.S. degree in physics from Lehigh University, Bethlehem, PA, in 1995, and the Ph.D. degree in electrical engineering from the University of California at Santa Barbara in 1999.

From 1999 to 2000, she was a Member of Technical Staff with Bell Laboratories, Lucent Technologies, Murray Hill, NJ, designing high-speed circuits for fiber-optic communications. She is currently a Senior Design Engineer with Big Bear Networks, Sunnyvale, CA.


Adrian Ong (S’94–M’99) received the B.S. degree in electrical engineering from the University of California at Berkeley in 1992 and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1993 and 1999, respectively. His doctoral research focused on the design of bandpass sigma-delta modulators for narrow-band IF digitization.

During the summers of 1990 through 1993, he was with Lawrence Livermore National Laboratory, Livermore, CA. In 1994, he participated in the National Science Foundation Summer Institute in Japan and interned at Toshiba Corporation, Kawasaki, Japan. From 1998 to 2000, he was a Member of Technical Staff with Bell Laboratories, Lucent Technologies, Murray Hill, NJ, and a Design Engineer with Atheros Communications, Sunnyvale, CA. In August 2000, he joined Big Bear Networks, Sunnyvale, CA. His technical interests include the design of analog and mixed-signal circuits for wired and wireless communications and high-speed clock and data recovery circuits in SiGe and CMOS technologies.

Dr. Ong received the Beatrice Winner Editorial Award at the 1997 IEEE International Solid-State Circuits Conference. He is a member of Phi Beta Kappa and Eta Kappa Nu.

Arvin Shahani was born in Columbus, OH, in 1971. He received the B.S. degree in computer systems engineering and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1993, 1995, and 1999, respectively. His doctoral research was on the implementation of different radio receiver functional blocks using CMOS technology, which culminated in the design and implementation of a CMOS GPS receiver chip.

He was with FreeSpace Communications from 1999 to 2000, where his work concentrated on wireless system design. In 2000, he joined Big Bear Networks, Sunnyvale, CA, where he is currently a Manager of IC design. At Big Bear Networks, he is engaged in the specification and implementation of 10-Gb/s and 40-Gb/s ICs. His current research interest is in high-speed equalization of fiber-optic links. He has published 11 papers in international journals and conferences and holds three patents.

Xiaomin Si (M’95) received the B.S. and M.S. degrees from Zhejiang University, Hangzhou, China, in 1987 and 1990, respectively, and the M.S. degree from the University of Hawaii at Manoa in 1994, all in electrical engineering. His graduate study initially focused on semiconductor devices and then on analog IC design.

From 1994 to 2000, he was with Datapath Systems, Inc., where he designed analog ICs for read channels and DSL transceivers. He continued to work on DSL analog front-end circuitry at Excess Bandwidth, Inc. Since November 2000, he has been with Big Bear Networks, Sunnyvale, CA, designing analog/mixed signal circuits for 10-Gb/s and 40-Gb/s optical communication systems. He has coauthored six papers for the IEEE International Solid-State Circuits Conference and is also an author or coauthor of five papers on semiconductor device process and characterization. He holds two patents in analog IC design.

Wayne Wong received the B.S.E.E. degree from the Cooper Union, New York, NY, and the M.S.E.E. degree from Santa Clara University, Santa Clara, CA, in 1996 and 1999, respectively.

From 1996 to 1999, he was a Digital Designer with National Semiconductor, LAN Division. From 2000 to 2001, he was with Lucent Technologies, Murray Hill, NJ, where he worked on the Gigabit Ethernet transceiver project. Since 2001, he has been with Big Bear Networks, Sunnyvale, CA, where he designs digital circuits for high-speed transceivers.

Maurice Tarsia received the B.S.E.E. degree from the New Jersey Institute of Technology, Newark, in 1993 and the M.S.E.E. degree from Columbia University, New York, NY, in 1996.

From 1989 to 1995, he was with AT&T Bell Laboratories as a Member of Technical Staff in the Advanced Silicon Technologies and Circuits Department, working on broad-band and wireless circuits and systems. From 1996 to 1997, he was with Texas Instruments Incorporated as a Member of Technical Staff in the Digital Signal Processing Research and Development organization working on analog and mixed-signal front-end electronics for read-channel disk drive applications. From 1998 to 2000, he was with Lucent Technologies, Murray Hill, NJ, as a Member of Technical Staff in the Silicon Circuits Research Department. At Lucent, he focused on developing new low-voltage stressed power amplifier circuits in CMOS technologies. He also worked on SERDES products, as well as CMOS Bluetooth transceivers. Since 2000, he has been with Big Bear Networks, Sunnyvale, CA, as Vice-President of Integrated Circuits and Optics developing 40-Gb/s and 10-Gb/s broad-band communication circuits and subsystems.

