On the Design of Optimal Continuous-Time

Filter Banks in Subthreshold CMOS

Paul M. Furth

A dissertation submitted to the Johns Hopkins University

in conformity with the requirements for the degree of

Doctor of Philosophy

Baltimore, Maryland

© 1996 by Paul M. Furth

Abstract

This dissertation attempts to provide a comprehensive method of designing analog hardware whenever power dissipation is one of the engineering constraints. It begins by looking at analog filters and, in particular, multiresolution filter banks that model the spectral characteristics of the cochlea. In this sense, these structures are also referred to as cochlear filter banks. We distinguish cochlear filter banks from silicon cochleas, which incorporate adaptation, or gain control, within the filtering mechanism. While we understand that such adaptation is important for handling input signals with very large dynamic range, in this work we are concerned only with the design of continuous-time linear filter banks. The design employs MOS transistors operating in subthreshold in order to achieve a wide tuning range (two decades or more) and low power consumption.

A mathematical framework for analyzing linear continuous-time circuits is developed, where prior knowledge of the input signal, such as speech, is taken into account. That prior knowledge takes the form of probability density functions of the input amplitude as well as power spectral densities. Our goal is to optimize the filter design with respect to the achievable dynamic range given tight constraints on area and power consumption. Models of MOS transistors operating in the subthreshold region are adapted into the framework, although other technologies or regions of operation are possible. By convention, dynamic range is defined as the ratio of the maximum output signal level to the noise floor, where the maximum is the largest level at which the distortion is acceptably low. Admittedly, the most difficult portion of this framework is the computation of distortion, which in this work is defined as the mean-square error. Other distortion measures, which are based on the magnitude frequency spectrum, may be more appropriate for speech, but are not yet incorporated into the design. The mathematical framework presented here is not limited to the design of filter banks, but extends to the more general area of linear continuous-time filter design whenever power consumption is a major constraint.
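As a rough illustration of the mean-square-error distortion measure described above, the following Python sketch evaluates E[(h(V) - V)^2] for a uniformly distributed input. The transconductor is an idealized subthreshold differential pair with a tanh characteristic. This is an assumption chosen for simplicity, not the CMOS inverter analyzed in the dissertation. The function names and the midpoint-rule integration are likewise illustrative. The thermal voltage of 25.7 mV is the value quoted in the figure captions; the value 0.7 quoted alongside it is interpreted here as the subthreshold slope factor.

```python
import math

VT = 0.0257   # thermal voltage in volts (quoted in the figure captions)
KAPPA = 0.7   # assumed to be the subthreshold slope factor (0.7 in the captions)


def h(v):
    """Input-referred output of an idealized tanh differential pair (assumption)."""
    scale = 2.0 * VT / KAPPA
    return scale * math.tanh(v / scale)


def mse_distortion_uniform(amplitude, n=10000):
    """Approximate E[(h(V) - V)^2] for V uniform on [-amplitude, amplitude].

    Uses a midpoint rule with n equal-width cells; averaging the squared
    error over the cells approximates the expectation.
    """
    step = 2.0 * amplitude / n
    total = 0.0
    for i in range(n):
        v = -amplitude + (i + 0.5) * step
        total += (h(v) - v) ** 2
    return total / n


if __name__ == "__main__":
    for amplitude in (0.005, 0.02, 0.05):
        print(f"amplitude {amplitude:.3f} V: MSE {mse_distortion_uniform(amplitude):.3e} V^2")
```

The distortion grows quickly with input amplitude, which is why the maximum usable signal level, and hence the dynamic range, depends on how much distortion is deemed acceptable.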

By way of example, the mathematical framework is used to derive the dynamic range of a transconductance-C lowpass filter, in which the transconductor is perhaps the simplest of all, the CMOS inverter.

As a starting point, we decided to study the architecture and circuit implementation of the Liu basilar membrane model. At the circuit level, the original cochlear filter bank implementation of Liu (Liu, 1992) was based on a transconductor using a simple differential pair. The dynamic range of this transconductor configured as a lowpass filter is found to be only slightly higher than that of the CMOS inverter. However, the dynamic range of the differential pair can be greatly extended by using one of three linearizing techniques. To our knowledge, only one of these techniques had previously been applied to subthreshold CMOS: source degeneration with diode-connected transistors. The other two methods, which yield greater improvements in dynamic range, are new to subthreshold circuit design. They are source degeneration with single and double diffusors, and multiple asymmetric differential pairs. Analytic expressions for the input/output functions of these transconductors are derived in laborious detail, mainly because the derivation is not a trivial task. The transconductors have been fabricated and tested in order to verify the derivations. These new transconductors offer between 6.8 dB and 13.9 dB improvement in dynamic range over that of the simple differential pair with only modest increases in circuit complexity.

At the architectural level, Liu's basilar membrane model can be described as a lowpass cascade with taps between each stage, each tap feeding two identical series bandpass sections. This architecture is studied in order to derive its noise characteristics, which, for a constant-Q implementation, are approximately flat across all output channels. In addition, assuming that the input signal power spectrum is 1/f, we show that the signal power is distributed evenly across all channels using this architecture. In this case, the maximum signal-to-noise ratio, or dynamic range, is constant across all channels, a desirable characteristic for parallel-distributed processing systems.
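The equal-power claim can be checked numerically under a simplifying assumption: each constant-Q channel is treated as an ideal brick-wall band of width fc/Q centered at fc, rather than the actual bandpass response. For a k/f spectrum the band power is then k ln((2Q + 1)/(2Q - 1)), which does not depend on fc. The Python sketch below uses hypothetical helper names and assumes geometric spacing of the center frequencies. The 16 channels between 8000 Hz and 100 Hz and Q = 2.6 are borrowed from the caption of Figure 4.6 purely as sample values.

```python
import math


def band_power_one_over_f(fc, q, k=1.0):
    """Power of a k/f spectrum inside the ideal band [fc(1 - 1/2Q), fc(1 + 1/2Q)]."""
    lo = fc * (1.0 - 1.0 / (2.0 * q))
    hi = fc * (1.0 + 1.0 / (2.0 * q))
    # Integral of k/f from lo to hi is k * ln(hi / lo).
    return k * math.log(hi / lo)


def geometric_centers(f_high, f_low, n):
    """n center frequencies spaced by a constant ratio from f_high down to f_low."""
    ratio = (f_low / f_high) ** (1.0 / (n - 1))
    return [f_high * ratio ** i for i in range(n)]


# Sample values only: 16 channels, 8000 Hz down to 100 Hz, Q = 2.6.
centers = geometric_centers(8000.0, 100.0, 16)
powers = [band_power_one_over_f(fc, q=2.6) for fc in centers]
spread = max(powers) - min(powers)
print(f"band power per channel: {powers[0]:.4f}, spread across channels: {spread:.1e}")
```

Because the ratio of the band edges is fixed by Q alone, every channel collects the same power from a 1/f input, regardless of where its center frequency lies.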

Again by way of example, the transconductor based on the CMOS inverter is substituted into the original design in order to compute the dynamic range and power dissipation of the entire filter bank. As a measure of goodness, we propose the following information measure: bits/sec/Watt, or bits/Joule. Based on exact equations and a 1.5-Volt implementation, the filter bank is estimated to process 0.19 bits/pJ.
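One way to make the bits/Joule figure of merit concrete is sketched below in Python. It assumes that the information rate of each channel is approximated by the Shannon capacity B log2(1 + SNR), with the dynamic range standing in for the SNR as a power ratio. Whether this matches the exact equations used in the dissertation is not established here. The function names and the sample numbers are hypothetical, so the result is not expected to reproduce the 0.19 bits/pJ estimate.

```python
import math


def capacity_bits_per_s(bandwidth_hz, snr_db):
    """Shannon capacity of one channel; snr_db is a power ratio in decibels."""
    snr = 10.0 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr)


def bits_per_joule(channels, power_w):
    """Total information rate over all channels divided by total power.

    channels is a list of (bandwidth_hz, snr_db) pairs; power_w is in watts.
    """
    total_rate = sum(capacity_bits_per_s(b, s) for b, s in channels)
    return total_rate / power_w


if __name__ == "__main__":
    # Hypothetical filter bank: two channels at 40 dB, 1 microwatt total.
    sample_channels = [(100.0, 40.0), (1000.0, 40.0)]
    figure = bits_per_joule(sample_channels, power_w=1e-6)
    print(f"{figure * 1e-12:.4f} bits/pJ")
```

Dividing a rate in bits per second by a power in watts gives bits per joule directly, which is why the two forms of the measure quoted above are equivalent.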

The last part of this work attempts to address a more fundamental question. Why process signals in analog? Some researchers believe that cheap, fast, reliable, digital processors make analog processing, i.e. analog computation, obsolete. Perhaps a more carefully worded question is as follows: when is it advantageous to process signals in an analog format, and when is it advantageous to process signals digitally? Two constraints that are applicable to portable systems are low power consumption and small size. In this work we consider four types of signal processing: analog (continuous-time, continuous-value), switched-capacitor (discrete-time, continuous-value), time-domain (continuous-time, discrete-value), and synchronous digital (discrete-time, discrete-value). Each of these systems is evaluated in terms of maximum possible information rate in bits per second per watt of power for the case of a simple delay function.

In the summary chapter, future research directions are discussed. The appendices contain a symmetric model of the CMOS transistor operating in subthreshold, as well as details of a computer-controlled experimental setup for testing hardware cochlear filter banks in the context of automatic speech recognition.

Acknowledgments

As a mentor, Dr. Andreas Andreou has a gift for understanding the importance of what we do and identifying future trends in engineering. By contrast, I am more excited by the details of what we do, often spending many days on long mathematical derivations. As a result, we work as a team – more and more so, as the day of graduation approaches. Over the six years that I've known and worked with him, he has shared his ideas, time, and money generously with me. I gratefully acknowledge his support.

My work builds upon the research of Dr. Weimin Liu of Hughes Network Systems, formerly a student in the Sensory Communications Laboratory at Johns Hopkins University. Over the years, Weimin modeled careful analysis, simulation, experimental technique, and document preparation for the more junior members of the lab, such as myself.

Dr. Moise Goldstein held influence on my early years at Hopkins. He introduced me to the lab and has helped me to appreciate the wonders of the human nervous system. I share his desire to use my engineering knowledge to help people.

Mr. Robert Jenkins of the Applied Physics Laboratory is a wonderful (part-time) supervisor. He has encouraged and supported my research, providing the first testing of the Hopkins Electronic EAR in a classification test. He has every confidence in those who work under him, including me.

Dr. Gert Cauwenberghs, a relatively new faculty member at Hopkins, has been generous towards me by being available for discussions and by reviewing all of my papers. I also thank former department chair Dr. Charles R. Westgate for his support and personal interest in me. Through him I obtained as much experience as a teacher as I could while at Hopkins.

I express my gratitude to two other professors, who each instructed me in three, seemingly peripheral, subjects. Drs. Brian Hughes and Wilson Jack Rugh are among the finest educators I know.

Over the years, I have interacted with many colleagues and fellow students who either at one time worked in the Sensory Communications Laboratory (formerly, the Speech Processing Lab) or are still burning the midnight oil in Barton Hall. In particular, I thank Ben Yuhas, Nina Kowalski, Kwabena "Buster" Boahen, Philippe Pouliquen, Marc Cohen, Richard Meitzler, Kewei Yang, Fernando Pineda, Kim Strohbein, Zaven Kalaygian, Nagendra "Goel" Kumar, Mark Martin, Hitoshi Miwa, and Stane Gruden. Together they provided a stimulating intellectual environment for learning and conducting research and made the work much more fun. Special thanks to Mark, Goel, Stane, Tim, Philippe, and Rich, who provided substantial technical support and, equally valuable, criticism of my own research.

I thank good friends, Tom, Dave, Roger, and Jeff, who, although they do not understand the technical intricacies of analog VLSI, nonetheless know how to listen, understand, and encourage a fledgling graduate student.

For the most beautiful wife, Carol, our three wonderful children, Aria, Cadence, and David Canon, our families, and our parents, I express my deep love and appreciation. It was their love, comfort, patience, and support that helped carry me through graduate school. Six years is six years, no matter how you slice it.

I conclude with words from a song that I wrote several years ago: Thank you God for life, for flowers to smell and mountains to climb, for air to sweetly breathe. Thank you God for work, the chance to use the talents I have, in making something new. This thesis I dedicate to Him.

Abbreviations

BiCMOS      bipolar/CMOS – fabrication process in which bipolar junction transistors and CMOS devices can be realized on the same substrate

C           capacitor – typically 1–10 pF in our designs

CMOS        complementary MOS – fabrication process in which both p-channel and n-channel MOSFETs can be realized on the same substrate

DFDR        distortion-free dynamic range

DLDR        distortion-limited dynamic range

DR          dynamic range – ratio of maximal to minimal signal level

FET         field-effect transistor

G           large-signal transconductance

IC          integrated circuit

MOS         metal-oxide-semiconductor – the structure of a type of field-effect transistor. Despite the name, modern devices use polysilicon, instead of metal, as the gate material

MOSFET      MOS field-effect transistor

MOSFET-C    MOSFET-capacitor – a type of continuous-time filter realization

MOSIS       an IC fabrication foundry

NMOS        n-channel MOSFET

PMOS        p-channel MOSFET

RC          resistor-capacitor

RLC         resistor-inductor-capacitor

RMS         root-mean-square value

Trans.-C    transconductance-capacitor – a type of continuous-time filter realization

VLSI        very large scale integration of circuit elements on a single substrate or chip

Contents

Abstract

Acknowledgments

Abbreviations

1 Introduction
    1.1 Motivation
    1.2 Approach
    1.3 Dissertation Outline

2 Dynamic Range of Integrators for Continuous-Time Audio Signal Processing in Analog VLSI
    2.1 CMOS Integrators
    2.2 Acoustic Input Signals
        2.2.1 Random Variables and Processes
        2.2.2 Speech
    2.3 Noise
        2.3.1 Input-Referred Noise Power Spectrum
        2.3.2 Output-Referred Noise Level
    2.4 Distortion
        2.4.1 Input-Referred Distortion
        2.4.2 Output-Referred Distortion
    2.5 Dynamic Range
    2.6 Example: Self-biased Transconductance-C Integrator
        2.6.1 Output Current and Transconductance
        2.6.2 Noise
        2.6.3 Distortion
        2.6.4 Dynamic Range

3 Linearized Transconductors in Subthreshold CMOS
    3.1 The Transconductance-C Integrator
    3.2 The Differential Pair and Definitions
    3.3 Source Degeneration
        3.3.1 Diode-Connected Transistors
        3.3.2 Single Diffusor
        3.3.3 Double Diffusors
    3.4 Multiple Differential Pairs
        3.4.1 Two Differential Pairs
        3.4.2 Three Differential Pairs
        3.4.3 Substrate Biasing Technique
    3.5 Hints on Improved Transconductor Design
        3.5.1 Use of the Gate Capacitance
        3.5.2 Voltage-Splitting
        3.5.3 Class-AB Operation
    3.6 Experimental Results
        3.6.1 Static Measurements
        3.6.2 Dynamic Measurements
        3.6.3 Summary of Results

4 The Multi-Resolution Filter Bank Model
    4.1 Filter Bank Architecture
    4.2 RLC Proto-Type Filters
        4.2.1 RC Proto-Type Lowpass
        4.2.2 RLC Proto-Type Bandpass Filter
    4.3 Complete Filter Bank Model
        4.3.1 Transfer Function
        4.3.2 Filter Bank Tuning
        4.3.3 Filter Bank Noise
    4.4 Information Rate and Power Dissipation
    4.5 Signal Power Distribution
    4.6 Results and Discussion

5 Comparison of Continuous and Discrete Circuits
    5.1 Capacity
    5.2 Four Signal Representations
        5.2.1 Continuous-Value Continuous-Time
        5.2.2 Continuous-Value Discrete-Time
        5.2.3 Discrete-Value Discrete-Time Circuit
        5.2.4 Discrete-Value Continuous-Time
    5.3 Graphical Results
    5.4 Detailed Analysis of the DVCT Circuit
        5.4.1 Jitter in a DVCT Channel
        5.4.2 Differential Entropy for a DVCT Source
        5.4.3 Approximate Capacity of DVCT Channel

6 Summary and Future Research

A MOS Technology
    A.1 MOS Transistor Model
    A.2 Other Monolithic Elements

B Cochlear Experimental Setup
    B.1 Experiments with the Hopkins Electronic EAR
        B.1.1 Abstract
        B.1.2 Preliminary Results with the HEEAR Chip Set
    B.2 Harmonic and Intermodulation Distortion
    B.3 Current-to-Voltage Converter
    B.4 A BiCMOS Voltage Buffer
        B.4.1 Compound PMOS/NPN Transistors
        B.4.2 Circuit Description
        B.4.3 Results

Bibliography

Vita

List of Figures

1.1 Optimization paradigm.

2.1 (a) MOSFET-C and (b) Transconductance-C integrators. Both have transfer function G/(Cs).

2.2 Differential transconductance-C integrator.

2.3 (a) Histogram of instantaneous values taken from the training set of the TIMIT database normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using the double-gamma distribution with � = 0.5. The solid line is for � = 0.267. Note the effect of clipping at the sides of the graph. (b) A close view of the central portion of the graph in (a).

2.4 The probability density functions of (a) the uniform (solid) and cosine value (dotted) and (b) the normal (solid) and double-sided gamma (dotted) random variable.

2.5 Noise models for the transconductance-C integrator: (a) referred to the output current, (b) referred to the input voltage, where Vin,n = Iout,n/G.

2.6 Sample transconductor output current as a function of input voltage. The solid line is the actual nonlinear function; the dashed line is a linear approximation for which the slope is the nominal transconductance, Go.

2.7 Nonlinearly transformed input voltage, h(Vin), as a function of the input voltage for a sample transconductor. (a) The first distortion measure is computed as the mean-square error between h(Vin), the solid line, and Vin, the dashed line. (b) The second distortion measure is computed as the minimum mean-square difference between h(Vin) times a gain factor �, the solid line, and Vin, the dashed line. In this plot, � = 0.84.

2.8 The normalized transconductance plotted as a function of the input voltage for a sample transconductor, solid line. The dashed line is the ideal normalized transconductance, equal to unity. The maximum normalized transconductance distortion is the maximum distance between the two curves.

2.9 Self-biased transconductance-C integrator (a) circuit and (b) symbol.

2.10 For the self-biased transconductor (a) output current in units of Ib as a function of the input voltage in units of Vt/κ, and (b) normalized transconductance G/Go as a function of input voltage. Note that the transconductance function is convex.

2.11 Noise model of self-biased transconductor (a) referenced to the output current and (b) referred to the input voltage.

2.12 Self-biased transconductance-C inverting lowpass filter (a) circuit and (b) noise model.

2.13 The first distortion level as a function of � in units of Vt/κ, where κ = 0.7 and Vt = 25.7 mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.14 The optimal gain factor �� as a function of � in units of Vt/κ, where κ = 0.7 and Vt = 25.7 mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.15 The second distortion level as a function of � in units of Vt/κ, where κ = 0.7 and Vt = 25.7 mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.16 The signal-to-noise-plus-distortion ratio using (a) the first and (b) the second distortion measure as a function of � in units of Vt/κ, where Vt = 25.7 mV, κ = 0.7 and C = 5.0 pF. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.17 The distortion-free dynamic range using the first distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.18 The distortion-free dynamic range using the second distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.19 The distortion-limited dynamic range using the first distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV and 2% amplitude distortion. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

2.20 The distortion-limited dynamic range using the second distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV and 2% amplitude distortion. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

3.1 The transconductance-C integrator using the basic differential pair.

3.2 The basic differential pair (a) circuit, and (b) simplified AC noise model.

3.3 Normalized transconductance for the basic differential pair as a function of VDM with Vt = 25.7 mV and κ = 0.7.

3.4 The differential pair with source degeneration via diode-connected transistors (a) circuit, and (b) small-signal noise model.

3.5 Normalized transconductance for the differential pair with source degeneration via diode-connected transistors as a function of VDM with Vt = 25.7 mV and κ = 0.7.

3.6 The differential pair with source degeneration via a single diffusor (a) circuit, and (b) small-signal noise model.

3.7 For the differential pair with source degeneration via a single diffusor, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.

3.8 The differential pair with source degeneration via double diffusors (a) circuit, and (b) small-signal noise model.

3.9 For the differential pair with source degeneration via double diffusors, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.

3.10 A transconductor with two asymmetric differential pairs.

3.11 For a transconductor with two asymmetric differential pairs, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.

3.12 A transconductor with three asymmetric differential pairs.

3.13 For a transconductor with three asymmetric differential pairs, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.

3.14 A transconductor with two asymmetric differential pairs demonstrating the substrate biasing technique to achieve a maximally flat transconductance function.

3.15 Application of the voltage-splitting technique to the transconductor with source degeneration via double diffusors.

3.16 Fully complementary transconductor design based on source degeneration via double diffusors.

3.17 Experimental data of the differential output current as a function of input voltage for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Each dot represents a sample point.

3.18 Normalized transconductance as a function of VDM computed from experimental data for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Solid lines show the predicted values.

3.19 Experimental data of the normalized transconductance as a function of VDM for (a) the basic differential pair and (b) the differential pair with source degeneration via a single diffusor.

4.1 Block diagram of Liu's N-channel basilar membrane model consisting of a cascade of N lowpass sections with taps to two bandpass filters per output channel (Liu, 1992).

4.2 RC first-order lowpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.

4.3 Self-biased transconductance-C integrator: (a) circuit and symbol, (b) configured as first-order lowpass filter.

4.4 RLC proto-type second-order bandpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.

4.5 RLC proto-type second-order bandpass filter composed of six self-biased transconductors.

4.6 Response of 16-channel cochlear filter bank using exact equations, (a) magnitude and (b) group delay. Filter parameters are as follows: fc(1) = 8000 Hz, fc(16) = 100 Hz, Q3(1) = 2.6, and Q3(16) = 2.6. Two preliminary lowpass sections are added for better uniformity in the peak response.

4.7 Single-section of lowpass cascade showing tuning mechanism via (a) supply lines and (b) substrate lines.

4.8 (a) Power spectral density of noise in 16-channel cochlear filter bank with C = 5.0 pF and other parameters as earlier defined, and (b) RMS noise as a function of center frequency.

4.9 Short-term averaged power spectrum taken from the training set of the TIMIT database normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using a bandpass filter with center frequency 550 Hz and Q � 1.

5.1 (a) CVCT RC lowpass circuit, (b) CVDT sample-and-hold circuit, (c) DVCT RC delay circuit, and (d) DVDT clocked M-bit delay.

5.2 Signal-to-noise ratio as a function of mean power. Results from (Hosticka, 1985). CVCT Solid (fp = 100 MHz), CVDT Dashed (2fp = fs = 100 MHz), and DVDT (� = 100, 2fp = fs = 100 MHz) Number of bits.

5.3 System capacity as a function of mean power dissipation. Results from (Hosticka, 1985). CVCT Solid (fp = 100 MHz), CVDT Dashed (2fp = fs = 100 MHz), and DVDT (� = 100, 2fp = fs = 100 MHz) Number of bits.

5.4 Re-formulated signal-to-noise ratio as a function of mean power (fp = 0.5fs = 100 MHz, � = 1E−12). CVCT Solid, CVDT Dashed, DVDT Number of bits.

5.5 Re-formulated system capacity as a function of mean power (fp = 0.5fs = 100 MHz, � = 1E−12). CVCT Solid, CVDT Dashed, DVDT Number of bits.

A.1 (a) View of an nMOS transistor on the substrate and (b) symbol.

A.2 MOS small-signal subthreshold model including sources of shot noise only, (a) as a diffusor, and (b) in saturation.

A.3 Noise data taken from a PMOS transistor with W/L = 1148/4. Solid lines are noise model, x's are data. Curve (a) corresponds to 1 nA for an equivalent square device, (b) 10 nA, and (c) 100 nA. (κ = 0.7, Cox = 1500 F/m², and M = 4.0E−26 J.)

B.1 Experimental setup for simultaneously stimulating and recording from the HEEAR chip set. One PC outputs a previously recorded speech signal via the D/A converter module. The analog speech signal is attenuated before being presented to the silicon basilar membrane. Thirty-one output channels are fed into independent hair-cell synapse circuits. The outputs from the HEEAR chip set are digitized and stored on a second PC after passing through a custom analog interface. Synchronization is achieved by recording the input signal along with the 31 output channels. Depending on the application, a microphone can be connected directly to the input of the pre-amplifier.

B.2 Silicon cochlea response to 1 kHz tone burst at 1/4 fullscale. The characteristic frequencies of the output channels are shown above half of the traces. Only one channel (668 Hz) appears to be adapting to the stimulus.

B.3 Silicon cochlea response to 1 kHz tone burst at 1/2 fullscale. The characteristic frequencies of the output channels are shown above half of the traces. The response of one channel (668 Hz) is high during the first three cycles of the tone burst, but reduces to roughly one half its initial value by the tenth cycle.

B.4 Silicon cochlea response to 200 Hz tone at 1/4 fullscale RMS and 12 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces.

B.5 Silicon cochlea response to 200 Hz tone at 1/4 fullscale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces.

B.6 Silicon cochlea response to male token of /jh er/ at 1/5 fullscale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/.

B.7 Silicon cochlea response to male token of /jh er/ at 1/5 fullscale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ appears to be buried in the noise.

B.8 Silicon cochlea response to female token of /jh er/ at 1/5 fullscale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/.

B.9 Silicon cochlea response to female token of /jh er/ at 1/5 fullscale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ is barely discernible in the noise.

B.10 Schematic of current-to-voltage converter as used in the computer interface to the Hopkins Electronic EAR.

B.11 Compound PMOS/NPN Transistor.

B.12 Current as a function of voltage for a compound PMOS/NPN transistor (solid line) and an NPN transistor (dashed line). Two transistors of each type were measured, but the resulting curves were so similar for each type that they would be indistinguishable on this graph. The saturation of the NPN transistor output current near 10 mA was caused by limitations in the measurement equipment. In all cases, VCE = 2.0 V.

B.13 BiCMOS Buffer Amplifier.

List of Tables

1.1 Percent Error Rates from (Neti, 1994).

2.1 Probability density functions of four input signal distributions as a function of $\sigma$.

2.2 Performance ratios.

3.1 Summary of Linearization Techniques with Constant ($G_o/C$), $C = 5$ pF ($V_t = 25.7$ mV, $\kappa = 0.7$).

3.2 Summary of Linearization Techniques with Constant ($G_o/C$), $I_b$ ($V_t = 25.7$ mV, $\kappa = 0.7$).

4.1 Characteristics of the first and second-order OTA-C filters.

4.2 Square magnitude transfer functions from all noise sources to $V_{out}$ for the second-order OTA-C filter.

A.1 List of MOS device parameters and quantities.

B.1 Specifications, simulation results, and measurements for the BiCMOS buffer amplifier.

Chapter 1

Introduction

1.1 Motivation

Emerging opportunities in information technologies point towards markets for portable systems where battery operation, light weight and small size will be in demand. A distinct characteristic of these systems is their direct interface to people in real-world environments. Thus, with their widespread deployment, the performance of the sensory communication interfaces, or what is often called the user interface, is rapidly becoming a central issue. What are already computationally difficult problems in speech and vision now become even harder. Mobile operation necessitates that the sensory communication interface be capable of robust operation under highly variable environmental conditions (Andreou, 1995).

An example of a challenging user interface is the automatic recognition of human speech. Typically, the incoming speech signal is digitized, transformed, and compressed. Its linguistic attributes such as the phonemic sequence and word order are then identified using a sophisticated classification scheme, such as a Hidden-Markov Model. The preprocessing in state-of-the-art speech recognizers can be classified as short-term Fourier transform, linear prediction, cepstrum, and their variations. These signal analysis or decomposition schemes are mathematically oriented, and are based, to some extent, on a simplified model of speech production. The notable exceptions are the Bark and Mel-frequency scale filter banks, which have roots in the psychophysics of human hearing. Although great strides have been made in the last two decades, the best speech recognition systems today still do not have performance comparable to the human auditory system. Particularly difficult at the early stage is, for instance, the identification of rapidly changing sounds such as stop consonants. This very fact suggests that perhaps one should explore alternative signal representation schemes in order to model how speech is processed by the human nervous system, the very best speech recognition system. Indeed, engineers can learn a great deal from the cochlea, the periphery of the human auditory system.

Cochlear filter banks abstract the function of the mammalian cochlea, in particular, the movement of the basilar membrane in response to acoustic vibrations. From a signal processing viewpoint, cochlear filter banks can be thought of as multiresolution analyzers. As such, they retain good frequency resolution at low frequencies and good temporal resolution at high frequencies. We distinguish between cochlear filter banks and silicon cochleas, which incorporate some of the more salient non-linearities of cochlear function. Examples of such non-linearities are automatic gain control, rectification, saturation, and the like. In this work, we restrict our study to a linear abstraction of cochlear function, but with the understanding that non-linearities play an essential role in widely diverse acoustic environments.

The tradeoff between time and frequency resolution is viewed as the fundamental difference between the conventional spectrographic analysis based on the short-term Fourier transform and cochlear analysis for broadband, rapidly-changing signals, such as speech. It can be shown that linear filter bank approximations to cochlear models approximate wavelet analysis (Liu, 1992; Yang et al., 1992) in a scale domain that preserves good temporal resolution. As a consequence, the frequency of each spectral component in a broadband signal can be accurately determined from the inter-peak intervals in the filter bank output signals. Such properties of cochlear models have been demonstrated with natural speech and synthetic complex signals (Liu, 1992; Liu et al., 1992a; Liu et al., 1992b; Liu et al., 1993).
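The idea of reading a frequency off the inter-peak intervals can be illustrated with a minimal sketch. It is not the feature extraction used in the cited work: the function name, the strict local-maximum peak picker, and the 200 Hz test tone sampled at 16 kHz are all assumptions chosen for illustration.

```python
import math

def estimate_frequency_from_peaks(samples, sample_rate_hz):
    """Estimate the frequency of a narrowband signal from its inter-peak intervals."""
    # Indices of strict local maxima (hypothetical, very simple peak picker).
    peaks = [
        n for n in range(1, len(samples) - 1)
        if samples[n - 1] < samples[n] > samples[n + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("need at least two peaks")
    # Mean spacing between successive peaks, in samples.
    mean_interval = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    return sample_rate_hz / mean_interval

# Hypothetical channel output: a 200 Hz tone sampled at 16 kHz for 50 ms.
fs = 16000.0
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(800)]
estimated_hz = estimate_frequency_from_peaks(tone, fs)
```

For a clean tone the estimate is limited only by the sampling grid; a noisy channel output would likely need smoothing before peak picking.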

The use of cochlear models as the front-end signal analyzer has been shown experimentally to yield improved performance in small-scale speech recognition systems (Ghitza, 1986; Meng and Zue, 1990). More recently, Chalapathy Neti reported on a study with a commercially available speech recognition system in a large-vocabulary, isolated-word task in the presence of additive babble noise (Neti, 1994). By employing an auditory based acoustic processor as a front end, he was able to demonstrate a much more graceful degradation in system performance, compared to the conventional FFT-based acoustic processing scheme. His main results are summarized in Table 1.1.

Table 1.1: Percent Error Rates from (Neti, 1994).

Pre-processing Method   @ 42.2 dB SNR   @ 36.7 dB SNR   @ 30.7 dB SNR   @ 24.7 dB SNR
Cochlear                4.71 %          4.71 %          6.35 %          22.54 %
FFT-Based               3.69 %          4.51 %          16.80 %         40.37 %

The auditory processing employed in Neti's study begins with a software simulation of the basilar membrane filter bank proposed by Liu and analyzed in detail in Chapter 4 of this work. The filter bank is followed by a temporal feature extraction scheme proposed by (Yang et al., 1992) and interfaced (Neti, 1994) to the Hidden-Markov recognizer.[1]

Two hypotheses can be gleaned from Table 1.1. The first is that, in a clean environment, an auditory model does not improve performance. The second is that FFT-based models are not robust in the presence of additive background noise. Both of these conjectures will be subject to debate over the next decade, as we seek processing methods that give good performance not only in clean environments, but also in the presence of noise.

[1] The source code for Liu's model as well as Neti's additions can be found on CD-ROM

To date, however, cochlear models have not been widely adopted by the speech

recognition research community. Two major obstacles exist. One is the compu-

tational complexity of the cochlear model itself; the other is the large amount of

analysis data a cochlear model generates. Cochlear models require much more

computing resources than conventional speech analysis approaches to the extent

that a reasonably accurate cochlear model can be computationally too expensive

for general-purpose digital computers. As a case in point, the time required to pro-

cess speech using the cochlear model in Neti's experiment was 120 times real-time

on a SPARC 2. Clearly, a real-time cochlear model would make this processing

scheme more attractive.

Much of this work and the work of many others has been devoted to the efficient hardware implementation of the first part of the auditory modeling task, the cochlear filter bank. The first real-time electronic cochlear filter bank was introduced by Lyon and Mead (Lyon and Mead, 1988). Their model consisted of a cascade of 480 second-order sections, i.e., almost 1000 poles on a single chip, operating in subthreshold CMOS. It consumed only milliwatts of power. Their pioneering efforts have been a great source of inspiration. Since that time, many continuous-time analog VLSI implementations of cochlear models have been reported, including those in (Liu et al., 1992b; Watts et al., 1992; Lazzaro et al., 1993; Bhadkamkar, 1993; Andreou and Liu, 1993). Cochlear models have also been reported using switched-capacitor circuits (Lin et al., 1994) and switched-current techniques (Park et al., 1993). Even with real-time cochlear models, their interface to digital computers has been problematic. An experimental setup to interface Liu's cochlear model for real-time stimulation and recording was reported in (Furth et al., 1994) and is described in Appendix B. A second, more efficient solution to the problem and a discussion of system-level issues has been reported recently by Lazzaro, Wawrzynek and Kramer (Lazzaro et al., 1994).

While research so far has aimed at improving various characteristics of the Lyon-Mead design, such as linear range, low bit-rate communication protocols, and more realistic phase characteristics, the issues of noise, distortion, and dynamic range in the filter banks have not been given a thorough treatment. This research is intended to be a major step in that direction.

In addition to speech recognition, a real-time cochlear model can potentially

be applied to other signal processing tasks, such as tactile aids for the hearing-

impaired and cochlear implants. Integrated circuits are small and light in weight,

and in most of these applications it is also desirable that the device be low-power.

1.2 Approach

The goal of our work is to optimize the design of an acoustic processor at all levels, from the architecture or algorithms to the detailed circuit implementation. This

problem is constrained at one end by the attributes of the speech, and at the other

by the available fabrication technology. Further constraints are imposed by the

acoustic environment, such as ambient noise and distortions, as well as market

constraints as discussed earlier. Fig. 1.1 depicts this paradigm.

The technology of choice is CMOS, due to the possibility of very-large-scale integration (VLSI), low cost, and high reliability. But some question may arise as to the most efficient use of this technology, i.e. what signal representation should be utilized. Analog? Digital? Continuous-time? Discrete-time? We address this problem by considering the maximum possible information rate as a function of the power dissipated by four circuits, each processing signals using a different representation.

Figure 1.1: Optimization paradigm. (Diagram blocks: Acoustic Source, Environment, Technology, Market, Architecture, Circuits.)

At the architectural level, we study the hardware implementation of the basilar membrane model proposed by Liu (Liu, 1992). We extend his work, determining the noise, distortion, and signal-handling capabilities of this filter bank. As a measure of goodness, we propose and estimate the information rate per Watt, in bits per Joule, for a particular VLSI implementation of this filter bank.

At the circuit level, the problem of designing high dynamic-range continuous-time filters is addressed. Using known properties of speech signals, the dynamic range of a single filtering element is computed. It is then extended using linearizing techniques in order to greatly reduce undesirable signal distortions.

1.3 Dissertation Outline

The next chapter, Chapter 2, discusses a method for computing the dynamic

range of a CMOS transconductance-C integrator, covering the topics of input signal

statistics, noise, distortion, and dynamic range.

Chapter 3 introduces and analyzes several CMOS transconductor designs operating in the subthreshold region. At least three of them have never been used in subthreshold design: those with source degeneration via single and double diffusors, and one which uses asymmetric differential pairs.

Chapter 4 applies the techniques and circuits of Chapter 2 and Chapter 3 to

the silicon implementation of a proposed cochlear model (Liu, 1992) using analog

VLSI technology.

Chapter 5 discusses limitations in information processing using continuous

and discrete systems. A new measure of goodness is proposed for mobile subsys-

tems, bits/sec/watt.

The results suggest future research directions, as found in Chapter 6.

Appendix A contains a symmetric large-signal and small-signal noise model

for subthreshold CMOS.

Appendix B gives details of the hardware setup for testing the VLSI implementations of filter banks in large-scale speech recognition experiments.

Chapter 2

Dynamic Range of Integrators

for Continuous-Time Audio

Signal Processing in Analog

Given constraints on the available technology, supply voltage, total current consumption, and die area, our goal is to optimize the circuit realization of an analog filter in terms of its achievable dynamic range. The design of optimum fully-integrated continuous-time filters requires the optimization of a single integrator, as well as optimization of the filter structure (Groenewold, 1991). In this chapter we derive the dynamic range of a single integrator implemented as a lowpass filter as a basic building block for realizing continuous-time filters for audio processing in analog VLSI. We compute the dynamic range as a function of capacitance and body effect coefficient $\kappa$.

The first real-time analog VLSI cochlear filter bank was introduced by Lyon and Mead (Lyon and Mead, 1988). Their model consisted of a cascade of 480 second-order sections, i.e., almost 1000 poles, operating in subthreshold CMOS. Since that time, many continuous-time analog VLSI implementations of cochlear models have been reported, including those in (Liu et al., 1992b; Watts et al., 1992; Lazzaro et al., 1993; Bhadkamkar, 1993; Andreou and Liu, 1993). Cochlear models have also been reported using switched-capacitor circuits (Lin et al., 1994) and switched-current techniques (Park et al., 1993).

While research so far has aimed at improving various characteristics of the original design, such as linear range, low bit-rate communication protocols, and more realistic phase characteristics, the issues of noise, distortion, and dynamic range in the filter banks have not been given a thorough treatment. Given constraints on the available technology, supply voltage, total current consumption, and die area, our ultimate goal is to optimize the circuit realization of an analog continuous-time linear filter bank in terms of its achievable dynamic range. Whereas ultimately a non-linear filter bank with automatic gain control appears to be necessary for addressing the wide dynamic range inherent to natural speech sounds, here we restrict our research to the design of linear systems.

In this chapter we present a framework for the design of multiresolution analog filter banks implemented in subthreshold CMOS technology using the well-developed random signals formalism. We begin with a discussion on the choice of integrators, concluding that the transconductance-C is the preferable structure for our problem. Models of the audio input signal are given in section 2.2. In section 2.3, we compute the output-referred noise level of a general transconductance-C filter using a white noise model for the transconductor. Three mean-square distortion measures referenced to the input signal are defined in section 2.4. Various performance ratios are used to characterize the transconductance-C integrator. Then, by way of example, the dynamic range of the self-biased transconductance-C integrator is derived in section 2.6.

The high side of the dynamic range of an integrator is the maximal signal

level it can handle. For applications in which linearity in the signal is of utmost

importance, the maximal signal level is the level at which distortion products

are just equal to the noise level. We refer to this type of dynamic range as the

distortion-free dynamic range. For applications in which a certain level of distortion

is tolerable, the maximal signal level is the level at which distortion products are

just equal to the maximal allowable distortion level. We call this type of dynamic

range the distortion-limited dynamic range. In this chapter, we will deal with both

types of dynamic range. They can be formally described by the equations

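$$\mathrm{DFDR} \equiv \left.\frac{\overline{V_s^2}}{\overline{V_n^2}}\right|_{\overline{V_d^2} = \overline{V_n^2}}$$

$$\mathrm{DLDR} \equiv \left.\frac{\overline{V_s^2}}{\overline{V_n^2}}\right|_{\overline{V_d^2} = c\,\overline{V_s^2}}$$

where $\overline{V_s^2}$, $\overline{V_d^2}$, and $\overline{V_n^2}$ are the mean-square signal, distortion, and noise values, respectively, and $c$ is the percent allowable distortion.[1]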
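As a rough illustration of how the two dynamic-range definitions differ, the sketch below evaluates both for a purely hypothetical distortion law in which the mean-square distortion grows as the cube of the mean-square signal. The coefficient `k3`, the numerical values, and the reading of $c$ as a fraction of the mean-square signal are assumptions made here, not quantities taken from this chapter.

```python
import math

def dfdr_db(noise_ms, k3):
    """Distortion-free dynamic range: signal level where distortion equals noise."""
    # Hypothetical law: distortion_ms = k3 * signal_ms**3, solved for distortion_ms == noise_ms.
    signal_ms = (noise_ms / k3) ** (1.0 / 3.0)
    return 10.0 * math.log10(signal_ms / noise_ms)

def dldr_db(noise_ms, k3, c):
    """Distortion-limited dynamic range: signal level where distortion equals c * signal."""
    # Same hypothetical law, solved for distortion_ms == c * signal_ms.
    signal_ms = math.sqrt(c / k3)
    return 10.0 * math.log10(signal_ms / noise_ms)

# Illustrative numbers only: 1e-9 V^2 noise, k3 = 1e3 V^-4, 1 % allowable distortion.
dfdr = dfdr_db(1e-9, 1e3)
dldr = dldr_db(1e-9, 1e3, 0.01)
```

Under these assumed numbers the distortion-limited figure exceeds the distortion-free one, which matches the intuition that tolerating some distortion raises the usable signal ceiling.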

2.1 CMOS Integrators

Groenewold (Groenewold, 1991) briefly describes four possible integrator structures. The passive conductance/passive admittance integrator, consisting of a linear resistor and a linear capacitor, is dismissed because its pole location is not at the origin of the s-plane, i.e. it is not a pure integrator, but a lowpass filter. As for the active conductance/active admittance structure, consisting of a transconductor and an active capacitance, its noise factor is increased, whereas its signal handling capacity is reduced, compared to either of the two remaining circuit structures. Thus, it cannot be used to achieve an optimal dynamic range.

The two types of integrators Groenewold (Groenewold, 1991) analyzes in depth

are the MOSFET-C and transconductance-C integrators, shown in Fig. 2.1.

The main advantage of the MOSFET-C integrator is that it can approach the dynamic-range maximum, as defined by Groenewold (Groenewold, 1991). Its chief disadvantage is the relatively narrow frequency tuning range of approximately half an octave. Such a tuning range is generally adequate to compensate for parametric variations in the fabrication process; however, filter banks for processing wideband signals, such as speech, require a frequency tuning range of 2 decades or more, encompassing much of the normal hearing range of 20 Hz to 20 kHz. A second disadvantage of the MOSFET-C implementation is the need for a high-gain, low-output-impedance amplifier. Thus, one possible road towards the design of optimum dynamic-range integrators for speech processing is to develop a MOSFET-C implementation with a much broader tuning range and a low power amplifier.

[1] $\overline{V}$ denotes the mean, or average value, of the signal $V$.

The alternate approach employs transconductance-C integrators and is the one

adopted in our work. By exploiting the exponential current-to-voltage relation-

ship in subthreshold MOS transistors, integrators that span several decades in

frequency can be easily made. One has the added bonus of low power and low

voltage operation, as the device physics of subthreshold CMOS enable the lowest

possible saturation voltage (Vittoz, 1994; Andreou and Boahen, 1994). The main

disadvantage of transconductors operating in the subthreshold region is the rela-

tively poor linear range. There exist however several techniques to linearize these

inherently nonlinear transconductors (Watts et al., 1992; Tanimoto et al., 1991;

Vittoz, 1994; Furth and Andreou, 1995). In short, for low voltage, low power

systems, transconductance-C integrators appear to be the best, if not the only,

option (Tanimoto et al., 1991).
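To give a feel for the tuning range, the following sketch computes the bias current needed to place the corner frequency $G/(2\pi C)$ at either end of the audio band. It assumes the textbook small-signal transconductance of a simple subthreshold differential pair, $G = \kappa I_b / (2 V_t)$, which is not derived in this section; the values $V_t = 25.7$ mV, $\kappa = 0.7$ and $C = 5$ pF are the ones quoted in the table titles, and the function name is hypothetical.

```python
import math

V_T = 0.0257   # thermal voltage in volts
KAPPA = 0.7    # body-effect coefficient

def bias_current_for_corner(f_c_hz, cap_farads, kappa=KAPPA, v_t=V_T):
    """Bias current that sets the corner frequency of a transconductance-C stage."""
    g = 2.0 * math.pi * f_c_hz * cap_farads   # required transconductance, G = omega_c * C
    return 2.0 * v_t * g / kappa              # invert the assumed G = kappa * I_b / (2 * V_t)

i_low = bias_current_for_corner(20.0, 5e-12)    # 20 Hz corner, tens of picoamperes
i_high = bias_current_for_corner(20e3, 5e-12)   # 20 kHz corner, tens of nanoamperes
```

Because the assumed transconductance is proportional to bias current, three decades of frequency map onto three decades of current, all of which stay within typical subthreshold levels.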

As explained by Groenewold (Groenewold, 1991), the use of the differential signaling configuration of Fig. 2.2 can increase the dynamic range by 6 dB. This increase is because the capacitor voltage swing is doubled, whereas the effective noise seen at the input is unchanged. In this preliminary study, we will examine the dynamic range of the single-ended integrator. Thus, we expect an increase in the dynamic range of up to 6 dB beyond that reported in this paper when we employ a differential signaling scheme.
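The 6 dB figure follows from a one-line calculation, sketched here under the assumption stated in the text that the voltage swing doubles while the noise is unchanged ($V$ stands for the single-ended swing):

```latex
\Delta\mathrm{DR} = 10\log_{10}\frac{(2V)^2}{V^2} = 20\log_{10} 2 \approx 6.02\ \mathrm{dB}
```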

Figure 2.1: (a) MOSFET-C and (b) Transconductance-C integrators. Both have transfer function $G/(sC)$.

Figure 2.2: Differential transconductance-C integrator. (Node labels: inputs $+V_{in}/2$ and $-V_{in}/2$, outputs $+V_{out}/2$ and $-V_{out}/2$.)

2.2 Acoustic Input Signals

The efficiency and performance of any information processing system, both hardware and software, can be improved by incorporating prior knowledge in the design phase. Indeed, this is the keystone for success in all statistical speech recognition systems (Roe and Wilpon, 1994). In the problem at hand, we know a priori that the system will process speech signals which, in the framework of random signals, can be described by linear statistics such as the mean and variance of the amplitude. Two examples of prior knowledge which will be exploited in the synthesis and characterization of the cochlear filter bank are the input amplitude distribution $f(V_{in})$ and power spectral density $S_{V_{in}}(\omega)$.

Traditional methods of determining system performance assume a sinusoidal input signal. The maximum signal level is then the level of the sinusoid such that the distortion is acceptably low. Two common distortion measures are total harmonic distortion and intermodulation distortion. The use of sinusoidal input signals is perhaps appropriate when evaluating a narrowband filter. On the other hand, a cochlear filter bank is likely to encounter various types of signals, ranging from such broadband signals as speech and music, to narrowband signals, like tones. In general, we find that sinusoidal signals overestimate the performance of a cochlear filter bank, as compared to more speech-like signals. In this section, we consider the statistical properties of speech signals and incorporate them into the evaluation of our filter bank model.

2.2.1 Random Variables and Processes

A random variable can be formally described as follows (Papoulis, 1965). One is given an experiment $E$, whose outcomes $\zeta$ are various objects belonging to a sample space $S$. Now, to every outcome $\zeta$ one assigns a number according to the real-valued function

$$\mathbf{x} = \mathbf{x}(\zeta) \tag{2.2}$$

Because the value of a random variable $\mathbf{x}$ is determined by the outcome of the experiment, we may assign probabilities to the possible values of the random variable.

Given a real number $x$, the set $\{\mathbf{x} \le x\}$, consisting of all outcomes $\zeta$ such that the inequality is satisfied, is an event. Define then the probability distribution function of the random variable $\mathbf{x}$ as

$$F_x(x) = P\{\mathbf{x} \le x\} \tag{2.3}$$

If the function $F_x(x)$ is absolutely continuous, the random variable $\mathbf{x}$ is of continuous type. The derivative

$$f(x) = \frac{dF_x(x)}{dx} \tag{2.4}$$

of the distribution function is called the probability density function.

A real stochastic, i.e. random, process is statistically determined if one knows its $n$th order distribution functions

$$F(x_1, \ldots, x_n; t_1, \ldots, t_n) = P\{x(t_1) \le x_1, \ldots, x(t_n) \le x_n\} \tag{2.5}$$

for any $n$ and $t_1, \ldots, t_n$. The mean of a process $x(t)$ is

$$\eta_x(t) = E[x(t)] \tag{2.6}$$

The autocorrelation function of a process is defined as

$$R_x(t_1, t_2) = E[x(t_1)\,x(t_2)] \tag{2.7}$$

A random process $x(t)$ is wide-sense stationary (or weakly stationary) if its expected value is a constant and its autocorrelation depends only on $\tau = t_1 - t_2$:

$$\eta_x = E[x(t)], \qquad R_x(\tau) = E[x(t)\,x(t+\tau)] \tag{2.8}$$

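The one-sided power spectrum (or spectral density) $S_x(\omega)$ of a wide-sense stationary process is the Fourier transform of its autocorrelation:

$$S_x(\omega) = \int_{-\infty}^{\infty} R_x(\tau)\,e^{-j\tau\omega}\,d\tau \tag{2.9}$$

with inverse

$$R_x(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_x(\omega)\,e^{j\tau\omega}\,d\omega \tag{2.10}$$

Substituting $\tau = 0$, the above equation yields

$$R_x(0) = E[x^2(t)] = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_x(\omega)\,d\omega \tag{2.11}$$

Thus the total area of the power spectrum (normalized by $2\pi$) equals the "average power" of the process $x(t)$.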
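The relation between spectral area and average power can be checked numerically on a standard transform pair. The sketch below assumes an exponential autocorrelation $R(\tau) = e^{-a|\tau|}$, whose Fourier transform is the Lorentzian $2a/(a^2+\omega^2)$; this example process, the truncation frequency, and the function names are illustrative choices, not part of the original text.

```python
import math

def lorentzian_psd(omega, a):
    """Power spectrum of a process with autocorrelation R(tau) = exp(-a*|tau|)."""
    return 2.0 * a / (a * a + omega * omega)

def average_power(psd, a, omega_max, steps):
    """Midpoint-rule estimate of (1/2pi) * integral of psd over [-omega_max, omega_max]."""
    h = 2.0 * omega_max / steps
    total = 0.0
    for i in range(steps):
        omega = -omega_max + (i + 0.5) * h
        total += psd(omega, a)
    return total * h / (2.0 * math.pi)

# R(0) = 1 for this process, so the normalized spectral area should be close to 1.
power = average_power(lorentzian_psd, a=1.0, omega_max=1.0e4, steps=400_000)
```

The small shortfall from unity comes from truncating the slowly decaying tails of the spectrum at a finite frequency.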

A property that we will make use of is that if a wide-sense stationary random process $x(t)$ undergoes a linear transformation, as in

$$y(t) = L[x(t)] \tag{2.12}$$

then $y(t)$ is also stationary in this sense. Let $H(j\omega)$ be the transfer function of the linear transformation $L[\cdot]$. Then the power spectrum of the output process $y(t)$ is given by

$$S_y(\omega) = S_x(\omega)\,|H(j\omega)|^2 \tag{2.13}$$

2.2.2 Speech

Let speech be the primary acoustic source. Assuming it is a wide-sense stationary random process, one can compute the amplitude probability density and power spectral densities from a standard database. Amplitude histograms of the time-domain waveforms of voiced and unvoiced segments of speech have a maximum near zero and sides which decrease exponentially as the amplitude moves away from zero. Two probability density functions which fit this description are the Gaussian, or normal, and the two-sided Gamma. The normal probability density was considered by Max (Max, 1960) in the problem of minimizing the distortion in speech coders. The Gamma distribution was discussed by Paez and Glisson (Paez and Glisson, 1972) for the same problem. By constructing amplitude histograms of raw speech signals they show that the best approximation is obtained using the Gamma distribution with shape parameter $\alpha = 0.5$.

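The probability density function of the two-sided Gamma distribution can be written in the following form (Rice, 1988):

$$f(x) = \frac{\lambda^{\alpha}}{2\,\Gamma(\alpha)}\,\frac{e^{-\lambda|x|}}{|x|^{1-\alpha}} \tag{2.14}$$

It has mean-absolute value $\alpha/\lambda$, and variance $\alpha(\alpha+1)/\lambda^2$. If we define the variance as $\sigma^2$ and eliminate the variable $\lambda$, the distribution can be rewritten in the form

$$f(x) = \frac{(\alpha^2+\alpha)^{\alpha/2}}{2\,\sigma^{\alpha}\,\Gamma(\alpha)}\,\frac{e^{-|x|\sqrt{\alpha^2+\alpha}/\sigma}}{|x|^{1-\alpha}} \tag{2.15}$$

For the special case that $\alpha = 0.5$, the probability density function reduces to

$$f(x) = \sqrt{\frac{\sqrt{3}}{8\pi\sigma|x|}}\;e^{-\sqrt{3/4}\,|x|/\sigma} \tag{2.16}$$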
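The reduction to the special case can be verified numerically. In this sketch `alpha` denotes the shape parameter and `lam` the rate that is eliminated in favour of the standard deviation `sigma`; those names, and the two function names, are labels chosen here for illustration.

```python
import math

def two_sided_gamma_pdf(x, sigma, alpha):
    """Two-sided gamma density with standard deviation sigma and shape alpha (x != 0)."""
    # Rate fixed by requiring the variance alpha*(alpha+1)/lam**2 to equal sigma**2.
    lam = math.sqrt(alpha * alpha + alpha) / sigma
    return (lam ** alpha) / (2.0 * math.gamma(alpha)) * math.exp(-lam * abs(x)) / abs(x) ** (1.0 - alpha)

def half_shape_pdf(x, sigma):
    """Closed form of the same density for shape parameter 0.5."""
    scale = math.sqrt(math.sqrt(3.0) / (8.0 * math.pi * sigma * abs(x)))
    return scale * math.exp(-math.sqrt(3.0) * abs(x) / (2.0 * sigma))

# The general and special-case expressions should agree at any nonzero amplitude.
checks = [(two_sided_gamma_pdf(x, 1.0, 0.5), half_shape_pdf(x, 1.0)) for x in (-3.0, -0.2, 0.5, 4.0)]
```

The agreement relies on $\Gamma(0.5) = \sqrt{\pi}$, which is what turns the general prefactor into the square-root form.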

In order to test this hypothesis, the amplitude histogram of the entire training set from the TIMIT database was computed. Results are given in Fig. 2.3. Indeed, the Gamma distribution appears to be a good fit for the data. Using the method of moments (Rice, 1988), a better estimate of the shape parameter for this database appears to be $\alpha = 0.267$. The difference between the two models is seen chiefly in the tails of the distribution. For amplitudes less than five standard deviations from the mean value, zero, either distribution appears adequate, as evidenced by Fig. 2.3(b).
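One way a method-of-moments estimate of the shape parameter could be formed is sketched below. It matches the mean-absolute value and the variance quoted for the two-sided gamma density; whether these are the moments actually used for the TIMIT fit is not stated in the text, so this is an assumption, and the synthetic data stand in for real speech samples.

```python
import random

def estimate_shape(samples):
    """Moment-matching estimate of the two-sided gamma shape parameter."""
    n = len(samples)
    mean_abs = sum(abs(v) for v in samples) / n   # estimates shape / rate
    mean_sq = sum(v * v for v in samples) / n     # estimates shape * (shape + 1) / rate**2
    ratio = mean_sq / (mean_abs * mean_abs)       # equals (shape + 1) / shape
    return 1.0 / (ratio - 1.0)

# Synthetic check: gamma-distributed magnitudes with a random sign (hypothetical data).
rng = random.Random(0)
true_shape = 0.5
data = [rng.choice((-1.0, 1.0)) * rng.gammavariate(true_shape, 1.0) for _ in range(200_000)]
shape_hat = estimate_shape(data)
```

The ratio of the two moments does not depend on the rate, so the shape can be estimated without first normalizing the amplitude.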

One practical difficulty of the double-gamma distribution is that its amplitude is unbounded. And, while the tails of the distribution fall off quickly, of the order $e^{-|x|}$, it is not fast enough for at least one transconductor design, the CMOS inverter, for which the distortion grows at just the same order. One possible remedy to this situation is to introduce a bounded double-gamma distribution, which is clipped, say, at five standard deviations.

Figure 2.3: (a) Histogram of instantaneous values taken from the training set of the TIMIT database normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using the double-gamma distribution with $\alpha = 0.5$. The solid line is for $\alpha = 0.267$. Note the effect of clipping at the sides of the graph. (b) A close view of the central portion of the graph in (a). (Horizontal axis in both panels: x, std. dev.; vertical axis logarithmic.)

In addition, we consider three other types of input signals: those that have uniform, sinusoidal, and normal statistics. The uniformly distributed signal is attractive because it is mathematically tractable and represents a situation that is not too far from the case of a pure tone. As shall be discussed in chapter 5, a uniformly distributed input signal results in the highest information rate for a circuit that has a peak amplitude constraint. The second type of input signal is classical in filter theory and analysis: pure tones and sums of pure tones. The amplitude distribution of a pure tone looks U-shaped and is derived below. In addition, sums of pure tones can be used to study intermodulation frequency distortion. The third type of input signal, normal, is the most similar to natural sounds, such as speech and music.

To derive an equation for the cosine amplitude distribution, we make use of basic probability theory. Let the input signal $x$ be given by

$$x = g(t) = \sqrt{2}\,\sigma \cos\left(\frac{2\pi t}{\tau}\right) \tag{2.17}$$

where $\sigma$ is the RMS value and $\tau$ is the period.

Now suppose that $t$ is a uniform random variable on the interval $[0, 0.5\tau]$, comprising one-half of a cycle of the cosine. Let $f_\tau(t)$ denote the probability density function of $t$, given by

$$f_\tau(t) = \frac{2}{\tau} \tag{2.18}$$

on this interval, and zero everywhere else.

On the same interval, the function $g(t)$ is monotonically decreasing, and therefore an inverse function exists, namely,

$$t = g^{-1}(x) = \frac{\tau}{2\pi} \arccos\left(\frac{x}{\sqrt{2}\,\sigma}\right) \tag{2.19}$$

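Table 2.1: Probability density functions of four input signal distributions as a function of $\sigma$.

Distribution    f(x)                                                                  Range
Uniform         $\frac{1}{2\sqrt{3}\,\sigma}$                                         $-\sqrt{3}\,\sigma \le x \le \sqrt{3}\,\sigma$
Cosine Value    $\frac{1}{\sqrt{2}\,\pi\sigma\sqrt{1 - x^2/2\sigma^2}}$               $-\sqrt{2}\,\sigma \le x \le \sqrt{2}\,\sigma$
Normal          $\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$                     $-\infty < x < \infty$
Gamma           $\sqrt{\frac{\sqrt{3}}{8\pi\sigma|x|}}\;e^{-\sqrt{3}|x|/2\sigma}$     $-\infty < x < \infty$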

From probability theory, the distribution function of the amplitude of $x$ on that interval is given by (Ross, 1988),

$$f(x) = f_\tau[g^{-1}(x)]\,\left|\frac{d}{dx}\,g^{-1}(x)\right| \tag{2.20}$$

Because the cosine amplitude distribution is the same in the first half-cycle as in the second, it follows that the cosine amplitude density function is equal to

$$f(x) = \frac{1}{\sqrt{2}\,\pi\sigma\sqrt{1 - x^2/(2\sigma^2)}} \tag{2.21}$$

Note that the cosine amplitude distribution is independent of the period, $\tau$, and depends only on the standard deviation, $\sigma$.
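The cosine amplitude density can be sanity-checked by comparing the probability mass it assigns to an amplitude window with the fraction of a half cycle the waveform actually spends inside that window. The window $[-0.5, 0.5]$, the unit RMS value, and the function names below are arbitrary illustrative choices.

```python
import math

def cosine_pdf(x, sigma):
    """Amplitude density of sqrt(2)*sigma*cos(.) for |x| < sqrt(2)*sigma."""
    return 1.0 / (math.sqrt(2.0) * math.pi * sigma * math.sqrt(1.0 - x * x / (2.0 * sigma * sigma)))

def fraction_of_time_between(lo, hi, sigma, period=1.0, steps=200_000):
    """Fraction of one half cycle during which the cosine lies in [lo, hi]."""
    count = 0
    for i in range(steps):
        t = (i + 0.5) * (0.5 * period) / steps
        x = math.sqrt(2.0) * sigma * math.cos(2.0 * math.pi * t / period)
        if lo <= x <= hi:
            count += 1
    return count / steps

def integrate_pdf(lo, hi, sigma, steps=20_000):
    """Midpoint-rule integral of the density over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(cosine_pdf(lo + (i + 0.5) * h, sigma) for i in range(steps)) * h

time_fraction = fraction_of_time_between(-0.5, 0.5, sigma=1.0)
pdf_mass = integrate_pdf(-0.5, 0.5, sigma=1.0)
```

Changing `period` leaves `time_fraction` unchanged, which is the numerical counterpart of the density being independent of the period.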

All input distributions considered in this work have zero mean. They are also symmetric about their mean value. We equalize the input signals in terms of their variance. Probability densities are written as functions of $\sigma$, as in Table 2.1. The shapes of these distributions are shown in Fig. 2.4. For simplicity we adopt the gamma distribution with parameter $\alpha = 0.5$.

2.3 Noise

The noise level represents the smallest signal level that can be adequately processed by a circuit. In this section, we consider the noise sources of a single integrator, which is the basic element of a linear filter. The noise level of a pure integrator cannot be determined directly. Therefore, we compute the noise level in a unity-gain first-order lowpass filter using such integrators. More generally, the noise level in a circuit depends not only on the noise sources of the basic element, but also on the specific filter architecture.[2]

Figure 2.4: The probability density functions of (a) the uniform (solid) and cosine value (dotted) and (b) the normal (solid) and double-sided gamma (dotted) random variable. (Horizontal axis in both panels: input amplitude [V/sigma].)

2.3.1 Input-Referred Noise Power Spectrum

A transconductance-C integrator consists of a noiseless capacitance $C$ and a noisy transconductor. The transconductor can be modeled accurately as a noiseless transconductance $G$ in parallel with a noisy current source $I_{out,n}$ as shown in Fig. 2.5(a).[3]

By convention, the noise in a transconductor is referred to the input voltage by dividing the output current noise by the transconductance, as shown in Fig. 2.5(b),

$$V_{in,n} = \frac{I_{out,n}}{G} \tag{2.22}$$

Referring the output current noise to an equivalent input noise voltage is convenient for voltage-mode circuits, i.e. circuits for which the input and output signals are voltages, rather than currents. As we shall see, the output noise voltage can be computed easily from the equivalent input noise voltage and the filter transfer function.

Let $S_{I_{out,n}}(\omega)$ be the (one-sided) power spectral density of the output current noise. Then, from (2.22) and (2.13) the input-referred power spectral density is given by

$$S_{V_{in,n}}(\omega) = \frac{S_{I_{out,n}}(\omega)}{G^2} \tag{2.23}$$

2.3.2 Output-Referred Noise Level

From (2.13) the output-referred noise spectrum is the product of the input-referred noise spectrum and the square magnitude of the filter transfer function between the input of the transconductor and the output node. Let $H(\omega)$ be the transfer function. Then

$$S_{V_{out,n}}(\omega) = S_{V_{in,n}}(\omega)\,|H(\omega)|^2 \tag{2.24}$$

Figure 2.5: Noise models for the transconductance-C integrator: (a) referred to the output current, (b) referred to the input voltage, where $V_{in,n} = I_{out,n}/G$.

[2] It is found that the noise level of a nonlinear transconductance function depends very weakly on the input signal level and distribution. We ignore these secondary effects in this work.

[3] Throughout this work, the subscript $n$ refers to a noise signal.

In order to compute the output-referred noise level, or mean-square value, according to (2.11), we integrate the output noise power spectral density over all radian frequencies and normalize by $2\pi$, as in

$$ V^2_{out,n} = \frac{1}{2\pi}\int_0^\infty S_{V_{out,n}}(\omega)\,d\omega \tag{2.25} $$

For the special case that $S_{V_{in,n}}(\omega)$ is constant (white) of the form

$$ S_{V_{in,n}}(\omega) = \frac{4kT}{G_o}\,\xi \tag{2.26} $$

the output-referred noise level can be written in the form

$$ V^2_{out,n} = \frac{4kT}{G_o}\,\xi\,\mathrm{ENBW} \tag{2.27} $$

where $\xi$ is the noise factor (unity for a pure resistor), $G_o$ is the nominal conductance or transconductance, and

$$ \mathrm{ENBW} \equiv \frac{1}{2\pi}\int_0^\infty |H(\omega)|^2\,d\omega \tag{2.28} $$

is the equivalent noise bandwidth of the circuit.

We want to compute the output-referred noise level for the transconductance-C integrator; however, the noise level in a perfect integrator is theoretically infinite. To show this, we need to compute ENBW for the integrator. The transfer function between the input signal and the output of a pure integrator can be written as

$$ H_I(\omega) = \frac{G}{j\omega C} \tag{2.29} $$

From (2.28) the equivalent noise bandwidth is given by

$$ \mathrm{ENBW} \equiv \frac{1}{2\pi}\int_0^\infty \frac{G^2}{\omega^2 C^2}\,d\omega \tag{2.30} $$

This integral is unbounded, because the integrand grows as $1/\omega^2$ near $\omega = 0$.

Therefore, we need to assume a particular filter structure in order to compute the output-referred noise level. We choose the simplest filter for this purpose, the unity-gain first-order lowpass, whose transfer function is of the form

$$ H_{LP}(\omega) = \frac{1}{1 + j\omega/\omega_c} \tag{2.31} $$

where $\omega_c = G_o/C$.

For the case that $H(\omega) = H_{LP}(\omega)$, we have

$$ \mathrm{ENBW} = \frac{1}{2\pi}\int_0^\infty \frac{1}{1 + \omega^2/\omega_c^2}\,d\omega = \frac{\omega_c}{4} = \frac{G_o}{4C} \tag{2.32} $$

Then the mean-square output voltage noise is

$$ V^2_{out,n} = \frac{kT}{C}\,\xi \tag{2.33} $$
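As a numerical cross-check of (2.28), (2.32) and (2.33), the following minimal sketch integrates the squared magnitude of the first-order lowpass response and compares the resulting noise power with $kT\xi/C$. The element values `G_O` and `CAP`, the temperature of 300 K, and the function names are illustrative assumptions, not values taken from this work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]


def enbw(h_squared, omega_c, steps=20000):
    """Equivalent noise bandwidth (2.28): (1/2pi) * integral_0^inf |H|^2 d(omega).

    The substitution omega = omega_c * tan(theta) maps [0, inf) onto
    [0, pi/2), which is then integrated with the midpoint rule.
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        omega = omega_c * math.tan(theta)
        d_omega = omega_c / math.cos(theta) ** 2 * d_theta
        total += h_squared(omega) * d_omega
    return total / (2.0 * math.pi)


def lowpass_noise_power(g_o, cap, xi=1.0, temperature=300.0):
    """Mean-square output noise (2.27) of the unity-gain first-order lowpass."""
    omega_c = g_o / cap
    h_sq = lambda omega: 1.0 / (1.0 + (omega / omega_c) ** 2)
    return 4.0 * K_B * temperature / g_o * xi * enbw(h_sq, omega_c)


# Hypothetical element values, chosen only for illustration.
G_O = 1e-9    # nominal transconductance [S]
CAP = 5e-12   # integrating capacitance [F]

noise = lowpass_noise_power(G_O, CAP)
closed_form = K_B * 300.0 / CAP   # kT*xi/C from (2.33), with xi = 1
```

Changing `G_O` moves the cutoff frequency but should leave `noise` unchanged, which is the content of (2.33).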

If a filter has more than one transconductor, we assume that the noise produced by each transconductor is statistically independent. In this case, by the principle of superposition, we can sum the contributing effects of each noise source at the output node by considering one noise source at a time. Suppose a filter has N transconductors. Compute the transfer function from the input of each transconductor to the output. Then multiply the input-referred power spectral density of each transconductor by the squared magnitude of its transfer function. At the output node, sum the filtered power spectra from each noise source. Then integrate to find the mean-square output noise voltage. A small example of a first-order lowpass filter with two transconductors appears at the end of this chapter.

2.4 Distortion

Gray et al. (Gray et al., 1980) define a distortion measure as the assignment of a

nonnegative number to a pair of ideal and real quantities, where the real quantity

is intended to be a reasonable approximation of the ideal quantity. The distortion

measure must be zero whenever the two quantities match exactly. To be useful, a

distortion measure must satisfy the following three properties: 1) it is subjectively

meaningful, in the sense that large distortion corresponds to poor quality repro-

ductions and small distortion corresponds to good quality reproductions, 2) it is

mathematically tractable so that it leads to good design techniques, and 3) it can

be computed efficiently.

They state that the most common distortion measure is the squared error

largely because it is tractable and computable. It is also the most common way of

dealing with distortion in the context of random signals. However, for low-bit-rate

speech systems, they assert that such a distortion measure does not appear to be

always subjectively meaningful. As an example, a "shh" sound is essentially a

white process and any typical waveform will sound the same. In order to satisfy

the property of being subjectively meaningful, several distortion measures have

been introduced which are based on the difference between the log spectra (Gray,

Jr. and Markel, 1976) of the ideal and real signals.

Although subjectively meaningful distortion measures are desirable when evaluating the performance of audio processing systems, we do not yet have the mathematical tools needed to evaluate and design continuous-time filters based on magnitude spectra of the ideal versus real filter response. Perhaps a suitable measure of distortion would be to use the absolute difference of power spectra, as

$$ V^2_{out,d} = \int_0^\infty \left| S_{out,r}(\omega) - S_{out,i}(\omega) \right| d\omega \tag{2.34} $$

where $S_{out,r}$ is the actual output power spectrum including the effects of nonlinear filtering, while $S_{out,i}$ is the output power spectrum that would have resulted from an ideally linear system. This distortion measure is possibly more subjectively meaningful than the classical mean-square measure. In addition, it is a power measure, and therefore fits neatly into the framework for computing dynamic range, which calls for the distortion power.

Nevertheless, we do not attempt in this work to compute the actual output power spectrum, with all essential nonlinearities present in the filter. Therefore, in this research, two mean-square error measures are applied to the transconductor alone, but not to the overall filter response.

2.4.1 Input-Referred Distortion

Let the input voltage Vin follow a known amplitude distribution with probability

density function f(Vin), where Vin = 0 is assumed to be the operating point.

This point may be fixed, or dynamically obtained through adaptation, where it

is assumed to introduce no distortion. The input voltage passes through a non-

linear voltage-to-current function Iout(Vin). One can view the nonlinear function

as resulting from a nonlinear voltage transformation h(Vin) multiplied by a linear

transconductance Go, as in

$$ I_{out}(V_{in}) = G_o\,h(V_{in}) \tag{2.35} $$

$$ G_o \equiv \left.\frac{\partial I_{out}(V_{in})}{\partial V_{in}}\right|_{V_{in}=0} \tag{2.36} $$

that is, Go is equal to the slope of Iout(Vin) at the operating point.

In Fig. 2.6 a sample transconductance function is plotted, along with a line

whose slope is the nominal transconductance, Go.

First Distortion Measure

Referred to the input voltage, the first distortion measure is the mean-square error of the actual input voltage minus the transformed input voltage $h(V_{in})$, as in

$$ V^2_{in,d1} \equiv E\left[\left(V_{in} - h(V_{in})\right)^2\right] \tag{2.37} $$

Figure 2.6: Sample transconductor output current as a function of input voltage (horizontal axis: input amplitude [mV]). The solid line is the actual nonlinear function; the dashed line is a linear approximation for which the slope is the nominal transconductance, $G_o$.

In Fig. 2.7(a) the transformed input voltage $h(V_{in})$ is plotted as a function of the input voltage. The first distortion measure computes the mean-square error between this curve and $V_{in}$, which is the straight dashed line with a slope of unity.

Using (2.35) to eliminate $h(V_{in})$, the first distortion measure can be written in the form

$$ \begin{aligned} V^2_{in,d1} &\equiv E\left[\left(V_{in} - \frac{I_{out}(V_{in})}{G_o}\right)^2\right] \\ &= E[V_{in}^2] + \frac{E[I_{out}^2(V_{in})]}{G_o^2} - 2\,\frac{E[V_{in}\,I_{out}(V_{in})]}{G_o} \end{aligned} \tag{2.38} $$

where the expectation operation may be with respect to a deterministic time-domain signal, e.g. a pure tone, or with respect to a stationary input amplitude distribution. For the latter case, we have

$$ V^2_{in,d1} \equiv \int_{-\infty}^{\infty} f(V_{in}) \left(V_{in} - \frac{I_{out}(V_{in})}{G_o}\right)^2 dV_{in} \tag{2.39} $$

Earlier in this chapter, we computed the amplitude distribution of a pure tone, so that even a pure tone may be cast in a random variable framework. As a result, for each type of input signal considered, we may apply (2.39) for the computation of the first distortion level.
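The integral in (2.39) can be evaluated numerically for any transconductance function and bounded amplitude distribution. The sketch below is one way to do so, using a midpoint rule. The cubic transconductor (`G_O`, `A3`) and the uniform input range `AMP` are hypothetical, chosen only because the result has a simple closed form to compare against.

```python
def expectation(g, pdf, lo, hi, steps=20000):
    """E[g(V)] = integral of pdf(V) * g(V) over [lo, hi], midpoint rule."""
    dv = (hi - lo) / steps
    return sum(pdf(lo + (i + 0.5) * dv) * g(lo + (i + 0.5) * dv)
               for i in range(steps)) * dv


def first_distortion(i_out, g_o, pdf, lo, hi):
    """Input-referred mean-square error of (2.39)."""
    return expectation(lambda v: (v - i_out(v) / g_o) ** 2, pdf, lo, hi)


# Hypothetical transconductor with a cubic nonlinearity (illustration only).
G_O = 1e-9    # nominal transconductance [S]
A3 = 50.0     # cubic coefficient [1/V^2]
i_out = lambda v: G_O * (v + A3 * v ** 3)

# Uniform input amplitude distribution on [-AMP, AMP].
AMP = 0.02    # [V]
uniform_pdf = lambda v: 1.0 / (2.0 * AMP)

d1 = first_distortion(i_out, G_O, uniform_pdf, -AMP, AMP)
d1_exact = A3 ** 2 * AMP ** 6 / 7.0   # A3^2 * E[V^6] for the uniform pdf
```

For unbounded distributions such as the normal, the integration limits would have to be truncated at several standard deviations, which is an additional approximation not shown here.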

Second Distortion Measure

The second distortion measure involves the minimization of the mean-square error with respect to a gain $\alpha$ times the nonlinear transformation, $h(V_{in})$. It is defined as

$$ V^2_{in,d2} \equiv \min_{\alpha} E\left[\left(V_{in} - \alpha\,h(V_{in})\right)^2\right] \tag{2.40} $$

where $\alpha$ is a real number. In Fig. 2.7(b) the transformed input voltage $h(V_{in})$ times a gain $\alpha = 0.84$ is plotted as a function of the input voltage. Assuming that this value of $\alpha$ results in the lowest mean-square error, the second distortion measure computes the mean-square error between this curve and $V_{in}$, which is the straight dashed line with a slope of one.

Figure 2.7: Nonlinearly transformed input voltage, $h(V_{in})$, as a function of the input voltage [mV] for a sample transconductor. (a) The first distortion measure is computed as the mean-square error between $h(V_{in})$, the solid line, and $V_{in}$, the dashed line. (b) The second distortion measure is computed as the minimum mean-square difference between $h(V_{in})$ times a gain factor $\alpha$, the solid line, and $V_{in}$, the dashed line. In this plot, $\alpha = 0.84$.

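Using (2.35) to eliminate $h(V_{in})$, the second distortion measure can be written in the form

$$ \begin{aligned} V^2_{in,d2} &= \min_{\alpha} E\left[\left(V_{in} - \alpha\,\frac{I_{out}(V_{in})}{G_o}\right)^2\right] \\ &= \min_{\alpha} \left( E[V_{in}^2] + \alpha^2\,\frac{E[I_{out}^2(V_{in})]}{(G_o)^2} - 2\alpha\,\frac{E[V_{in}\,I_{out}(V_{in})]}{G_o} \right) \end{aligned} \tag{2.41} $$

Taking the expectation with respect to a known amplitude distribution, $f(V_{in})$, we have

$$ V^2_{in,d2} \equiv \min_{\alpha} \int_{-\infty}^{\infty} f(V_{in}) \left(V_{in} - \alpha\,\frac{I_{out}(V_{in})}{G_o}\right)^2 dV_{in} \tag{2.42} $$

The optimal gain factor $\alpha^*$ is that gain factor which results in the minimum mean-square error. Its value depends on both the transconductor and the input distribution. Setting the derivative of (2.41) with respect to $\alpha$ to zero, it can be solved for, as in

$$ \alpha^* = \frac{E[G_o\,V_{in}\,I_{out}(V_{in})]}{E[I_{out}^2(V_{in})]} \tag{2.43} $$

Substituting (2.43) into (2.42), after some simplification, we get

$$ V^2_{in,d2} = E[V_{in}^2] - \frac{E^2[V_{in}\,I_{out}(V_{in})]}{E[I_{out}^2(V_{in})]} \tag{2.44} $$

The second distortion measure can be written as a function of the first, as in

$$ V^2_{in,d2} = V^2_{in,d1} - (1 - \alpha^*)^2\,\frac{E[I_{out}^2(V_{in})]}{(G_o)^2} \tag{2.45} $$

From (2.45) we see that $V^2_{in,d2} \leq V^2_{in,d1}$.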
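The relations (2.43) to (2.45) can be checked numerically from three moments of the input and output. The following is a minimal sketch under stated assumptions: the compressive tanh transconductor (`I_B`, `V0`) and the uniform input range are hypothetical and serve only to exercise the formulas.

```python
import math


def expectation(g, pdf, lo, hi, steps=20000):
    """E[g(V)] over [lo, hi] by the midpoint rule."""
    dv = (hi - lo) / steps
    return sum(pdf(lo + (i + 0.5) * dv) * g(lo + (i + 0.5) * dv)
               for i in range(steps)) * dv


def distortion_measures(i_out, g_o, pdf, lo, hi):
    """Return (alpha_opt, d1, d2, e_ii) following (2.38), (2.43) and (2.44)."""
    e_vv = expectation(lambda v: v * v, pdf, lo, hi)
    e_ii = expectation(lambda v: i_out(v) ** 2, pdf, lo, hi)
    e_vi = expectation(lambda v: v * i_out(v), pdf, lo, hi)
    alpha_opt = g_o * e_vi / e_ii                      # (2.43)
    d1 = e_vv + e_ii / g_o ** 2 - 2.0 * e_vi / g_o     # (2.38)
    d2 = e_vv - e_vi ** 2 / e_ii                       # (2.44)
    return alpha_opt, d1, d2, e_ii


# Hypothetical compressive transconductor: Iout = I_B * tanh(Vin / V0).
I_B = 1e-9
V0 = 0.05
G_O = I_B / V0                     # slope at Vin = 0
i_out = lambda v: I_B * math.tanh(v / V0)

AMP = 0.05                         # uniform input on [-AMP, AMP]
pdf = lambda v: 1.0 / (2.0 * AMP)

alpha_opt, d1, d2, e_ii = distortion_measures(i_out, G_O, pdf, -AMP, AMP)
d2_from_d1 = d1 - (1.0 - alpha_opt) ** 2 * e_ii / G_O ** 2   # (2.45)
```

For a compressive characteristic such as this one the optimal gain comes out above one, whereas an expansive characteristic would give a value below one.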

The second distortion measure is the error which results from a least-square

error estimate of the input voltage given the output current. This error is orthog-

onal to the input signal (Papoulis, 1965). Thus, if the input signal is a pure tone,

the error represents harmonic distortion. Similarly, if the input signal is the sum

of two tones, the error represents a combination of harmonic and intermodulation

distortions. By contrast, for a sinusoidal input signal, the first distortion measure

is the sum of harmonic, intermodulation, and gain distortions.

Absolute Maximum Deviation

An alternate measure of distortion in transconductors is the absolute maximum deviation. It is used in (Tanimoto et al., 1991). This distortion measure is more restrictive than the two mean-square error measures (Gray et al., 1980). Let $G(V_{in})$ be the actual, nonlinear transconductance as a function of $V_{in}$, where

$$ G(V_{in}) \equiv \frac{\partial I_{out}(V_{in})}{\partial V_{in}} \tag{2.46} $$

and $G_o$ be the nominal value. 4 Define the maximum normalized transconductance distortion as

$$ D_G \equiv \max \left| \frac{G(V_{in})}{G_o} - 1 \right| \quad \forall\, V_{in} \tag{2.47} $$

In Fig. 2.8 the normalized transconductance is plotted as a function of the input voltage for the same sample transconductor. Ideally, this function would be flat, equal to unity. The maximum normalized transconductance distortion is the maximum distance between the two curves for a given input amplitude range. For example, if $V_{in}$ is restricted to only ±5 mV, the distortion is approximately 0.01 or 1%. On the other hand, if the range of $V_{in}$ is extended to ±16 mV, the distortion is approximately 0.10 or 10%.
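The measure in (2.47) is straightforward to evaluate on a grid of input voltages. The sketch below assumes that the sample transconductor has the sinh-shaped characteristic derived for the self-biased transconductor in Section 2.6, so that $G/G_o = \cosh(\kappa V_{in}/V_t)$, with $\kappa = 0.7$ and $V_t = 25.7$ mV. That identification is an assumption made here for illustration; under it, the computed values land close to the 1% and 10% figures quoted above.

```python
import math

V_T = 25.7e-3   # thermal voltage [V]
KAPPA = 0.7     # assumed subthreshold slope coefficient


def normalized_transconductance(v_in):
    """G(Vin)/Go for Iout = -2*Ib*sinh(kappa*Vin/Vt), i.e. cosh(kappa*Vin/Vt)."""
    return math.cosh(KAPPA * v_in / V_T)


def max_transconductance_distortion(v_max, steps=1000):
    """D_G of (2.47), evaluated on a uniform grid over [-v_max, +v_max]."""
    grid = (-v_max + 2.0 * v_max * i / steps for i in range(steps + 1))
    return max(abs(normalized_transconductance(v) - 1.0) for v in grid)


d_5mv = max_transconductance_distortion(5e-3)     # input restricted to +/-5 mV
d_16mv = max_transconductance_distortion(16e-3)   # input restricted to +/-16 mV
```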

While this distortion measure is not a power measure, it is possible to compute a power measure using it, by squaring $D_G$ and multiplying by the input signal power, $V^2_{in,s}$. More formally, one can define an input-referred distortion measure, $V^2_{in,dg}$, as

$$ V^2_{in,dg} = V^2_{in,s}\,(D_G)^2 \tag{2.48} $$

4 For the case that the transconductance function has more than one inflexion point, such as an equiripple design, substitute the global maximum or global minimum for the nominal transconductance.

Figure 2.8: The normalized transconductance plotted as a function of the input voltage [mV] for a sample transconductor, solid line. The dashed line is the ideal normalized transconductance, equal to unity. The maximum normalized transconductance distortion is the maximum distance between the two curves.

The chief disadvantage of such a distortion measure is that it applies only to

bounded input signal distributions, such as the cosine and uniform distributions.

Unbounded distributions would have to be clipped, i.e. distorted, prior to apply-

ing such a distortion measure. In Chapter 3 we apply the maximum normalized

transconductance distortion measure to several transconductor designs.

2.4.2 Output-Referred Distortion

The input-referred distortion level can be referred to the output current of a

transconductor by multiplying by the square of the nominal transconductance.

However, it is not clear how the input-referred distortion level of a transconductor relates to the distortion level of a transconductance-C filter, where the output current is now integrated on a capacitor to become a voltage. 5

In particular, let us consider the nontrivial example of a lowpass filter with transfer function $H_{LP}(\omega) = 1/(1 + j\omega/\omega_c)$, and $\omega_c = G/C$. Now, if the input frequency $\omega$ is much less than $\omega_c$, the output voltage will follow the input voltage, no matter how much distortion is present in the transconductor $G$. However, for $\omega \gg \omega_c$, the error in the output amplitude is on the order of $\omega_c/\omega$.

For the present work, we assume that the input-referred distortion is equal to the output-referred distortion, regardless of the number of transconductors present in the filter, except for the case of non-unity gain. In this case, we multiply by the square of the gain.

2.5 Dynamic Range

Various performance ratios exist for quantifying a circuit's behavior. The best known is the signal-to-noise ratio (SNR). Another common figure-of-merit is the signal-to-distortion-plus-noise ratio (SDNR). In order to compute the distortion-free and distortion-limited dynamic range, we also introduce the distortion-to-noise (DNR) and signal-to-distortion ratios (SDR). These four performance ratios are summarized in Table 2.2.

5 One possible exception is the case of a constant-gain amplifier with only one transconductance element. In this case, the output-referred distortion level is equal to the input-referred distortion level times the square of the gain.

Table 2.2: Performance ratios.

Ratio   Definition
DNR     $V_d^2 / V_n^2$
SDR     $V_s^2 / V_d^2$
SNR     $V_s^2 / V_n^2$
SDNR    $V_s^2 / (V_n^2 + V_d^2)$

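For convenience, we revisit the definitions of dynamic range as found in (2.2)

$$ \mathrm{DFDR} \equiv \left.\frac{V_s^2}{V_n^2}\right|_{V_d^2 = V_n^2} $$

$$ \mathrm{DLDR} \equiv \left.\frac{V_s^2}{V_n^2}\right|_{V_d^2 = c^2 V_s^2} $$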

Dynamic range can be viewed as the signal-to-noise ratio evaluated at the maximal

input level. For the distortion-free dynamic range, the maximum signal level is

the level such that the distortion-to-noise ratio is equal to one. By contrast, the

maximal signal level for the distortion-limited dynamic range is the level such that

the signal-to-distortion ratio is equal to a constant c2. Because SDR is a ratio of

powers, c will be a ratio of amplitudes, which permits comparison with the well

known measure of total harmonic distortion.

The signal-to-noise-plus-distortion ratio of a signal is the signal level divided

by the sum of the noise and distortion levels. The characteristic shape of this ratio is

that it rises with a slope of 20 dB/dec for small input signals, reaches a maximum

value when the distortion level becomes roughly equal to the noise level, and then

rapidly declines for large signal levels.

We could have written the performance ratios DNR, SDR, and SNR as functions of the RMS value, $\sigma$. If we do so, an expression for the distortion-free dynamic range is

$$ \mathrm{DFDR} = \left.\mathrm{SNR}(\sigma)\right|_{\mathrm{DNR}(\sigma)=1} \tag{2.49} $$

If the distortion-to-noise ratio has an inverse function (which it typically has), we can write

$$ \mathrm{DFDR} = \mathrm{SNR}(\mathrm{DNR}^{-1}(1)) \tag{2.50} $$

Additionally, an expression for the distortion-limited dynamic range can be written as

$$ \mathrm{DLDR} = \left.\mathrm{SNR}(\sigma)\right|_{\mathrm{SDR}(\sigma)=c^2} \tag{2.51} $$

If the distortion-to-signal ratio has an inverse function (which it typically has), we can write

$$ \mathrm{DLDR} = \mathrm{SNR}(\mathrm{SDR}^{-1}(c^2)) \tag{2.52} $$
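When the ratios are monotonic in $\sigma$, the inversions in (2.50) and (2.52) can be carried out by bisection. The sketch below illustrates this with a toy model in which the noise power is constant and the distortion power grows as $\sigma^6$. The constants `NOISE` and `K6` are hypothetical, and the distortion limit is read here as distortion power equal to $c^2$ times signal power, which is an assumption about how $c$ enters.

```python
import math


def solve_increasing(func, target, lo, hi, iterations=200):
    """Bisection for func(sigma) == target, with func increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)   # geometric midpoint, since sigma spans decades
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def to_db(power_ratio):
    return 10.0 * math.log10(power_ratio)


# Hypothetical power models (illustration only).
NOISE = 1.0e-9   # noise power [V^2], independent of sigma
K6 = 5.0e5       # distortion power = K6 * sigma^6 [V^2]
C_AMP = 0.02     # 2% amplitude distortion limit

snr = lambda s: s ** 2 / NOISE
dnr = lambda s: K6 * s ** 6 / NOISE
distortion_to_signal = lambda s: K6 * s ** 4

sigma_dfdr = solve_increasing(dnr, 1.0, 1e-6, 1.0)
sigma_dldr = solve_increasing(distortion_to_signal, C_AMP ** 2, 1e-6, 1.0)

dfdr_db = to_db(snr(sigma_dfdr))   # SNR where distortion equals noise
dldr_db = to_db(snr(sigma_dldr))   # SNR where distortion reaches the limit
```

With measured or numerically integrated noise and distortion curves in place of the toy models, the same two calls would give the dynamic ranges directly.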

2.6 Example: Self-biased Transconductance-C Integrator

We have established the theoretical framework for computing the dynamic

range of a linear continuous-time transconductance-C integrator. In this section,

we compute the dynamic range for a particular design example: the self-biased

transconductance-C integrator.

In order to achieve a low-noise design, a class-AB transconductor without cur-

rent mirrors is sought. As a starting point, the two-transistor circuit of Fig. 2.9

is selected. This transconductor is tunable over a wide frequency range via the

supply voltages or substrate terminals. Because it operates in a push-pull fashion,

it has the most favorable noise properties among all CMOS transconductor configurations at a given bias current (Groenewold, 1991; Vittoz, 1994). Moreover, all even-order harmonics are canceled by this push-pull effect. This circuit has been used effectively by Nauta (Nauta, 1992) in very high frequency filtering applications because it has no internal nodes. In the following subsections, the dynamic range is derived for the self-biased transconductance-C integrator configured as a lowpass filter.

2.6.1 Output Current and Transconductance

We want to find an expression for the output current $I_{out}$ in Fig. 2.9. From (A.12) and noting that the source and substrate are at the same terminal for each transistor, we can write

$$ I_n = I_0 S\,e^{\kappa (V_{in} + V_{dd}/2)/V_t} \tag{2.53} $$

$$ I_p = I_0 S\,e^{\kappa (V_{dd}/2 - V_{in})/V_t} \tag{2.54} $$

Let $I_b$ denote the current through each of the two transistors when $V_{in}$ is zero, i.e.

$$ I_b = I_0 S\,e^{\kappa V_{dd}/(2V_t)} \tag{2.55} $$

Note that it is imperative that $V_{dd}/2$ be less than the threshold voltage of the transistor, in order to ensure subthreshold operation of the device. Then the output current, $I_{out}$, can be written as

$$ I_{out} = -2I_b \sinh(\kappa V_{in}/V_t) \tag{2.56} $$

The slope of the output current function $I_{out}(V_{in})$ is given by

$$ G(V_{in}) \equiv \frac{\partial I_{out}}{\partial V_{in}} = \frac{-2I_b\,\kappa}{V_t} \cosh(\kappa V_{in}/V_t) \tag{2.57} $$

while its nominal value is

$$ G_o = \frac{-2I_b\,\kappa}{V_t} \tag{2.58} $$

In Fig. 2.10 the output current and transconductance are plotted for the self-biased transconductor.

Figure 2.9: Self-biased transconductance-C integrator (a) circuit and (b) symbol.

Figure 2.10: For the self-biased transconductor (a) output current in units of $I_b$ as a function of the input voltage in units of $V_t/\kappa$, and (b) normalized transconductance $G/G_o$ as a function of input voltage. Note that the transconductance function is convex.

2.6.2 Noise

For the case of the self-biased transconductor, we compute the input-referred noise power spectral density using the model of Fig. 2.11. Fig. 2.11(a) models the noise properties of a single integrator. Assuming the noise produced by each of the two transistors is independent, the output current noise power spectrum (one-sided) is given by

$$ S_{I_{out,n}}(\omega) = 2q(I_b + I_b) = 4qI_b \tag{2.59} $$

Note that the output current noise power spectrum is technically a function of the input voltage, since the total current passing through the two transistors is not constant ($= 2I_b$), but grows in the shape of a cosh function. This dependency is a second-order effect, and perhaps even conflicts with the proposed framework for computing dynamic range.

We relate the output current noise power spectrum to the input-referred noise power spectrum as in

$$ S_{V_{in,n}}(\omega) = \frac{4qI_b}{G^2} = \frac{2qI_b}{|G|}\,\frac{V_t}{\kappa I_b} = \frac{4kT}{|G|}\,\xi \tag{2.60} $$

where $\xi = 1/(2\kappa)$ for this transconductor. To obtain the last equality, recall that $V_t \equiv kT/q$.

Fig. 2.12(a) shows the circuit of a lowpass filter using the self-biased transconductance-C integrator. We assume a linear small-signal model, where $G_1 = G_2 = G_o$ is the nominal transconductance. Its transfer function is $H_{LP}(\omega) = 1/(1 + j\omega/\omega_c)$, where $\omega_c = G_o/C$. As such, the transfer function from the input of $G_1$ is $H_{LP}(\omega)$. The transfer function from the input of $G_2$ is also $H_{LP}(\omega)$. Therefore, the noise power spectrum at the output node is

$$ S_{V_{out,n}}(\omega) = \frac{4kT}{|G_1|}\,\xi\,|H_{LP}(\omega)|^2 + \frac{4kT}{|G_2|}\,\xi\,|H_{LP}(\omega)|^2 \tag{2.61} $$

Figure 2.11: Noise model of self-biased transconductor (a) referenced to the output current and (b) referred to the input voltage.

Figure 2.12: Self-biased transconductance-C inverting lowpass filter (a) circuit and (b) noise model.

Integrating over all radian frequencies, the output-referred noise level is

$$ V^2_{out,n} = \frac{2kT}{C}\,\xi \tag{2.62} $$

where, as before, $\xi = 1/(2\kappa)$.
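The superposition procedure of Section 2.3.2 can be traced numerically for this two-transconductor filter. The sketch below integrates (2.61) for two different bias currents and compares the result with $kT/(\kappa C)$, using $kT = qV_t$. The bias currents, the capacitance of 5 pF and $\kappa = 0.7$ are illustrative inputs, and the function name is hypothetical.

```python
import math

Q = 1.602176634e-19   # electron charge [C]
V_T = 25.7e-3         # thermal voltage kT/q [V]
KAPPA = 0.7
CAP = 5e-12           # integrating capacitance [F]


def output_noise_power(i_b, cap=CAP, kappa=KAPPA, steps=20000):
    """Integrate (2.61) for the two-transconductor lowpass of Fig. 2.12."""
    g = 2.0 * i_b * kappa / V_T        # |Go| from (2.58)
    s_vin = 4.0 * Q * i_b / g ** 2     # input-referred PSD of one transconductor (2.60)
    omega_c = g / cap
    d_theta = (math.pi / 2.0) / steps
    integral = 0.0
    for i in range(steps):             # substitution omega = omega_c * tan(theta)
        theta = (i + 0.5) * d_theta
        omega = omega_c * math.tan(theta)
        h_sq = 1.0 / (1.0 + (omega / omega_c) ** 2)
        d_omega = omega_c / math.cos(theta) ** 2 * d_theta
        integral += 2.0 * s_vin * h_sq * d_omega   # two equal noise sources
    return integral / (2.0 * math.pi)


noise_low_bias = output_noise_power(1e-12)    # Ib = 1 pA
noise_high_bias = output_noise_power(1e-9)    # Ib = 1 nA
closed_form = Q * V_T / (KAPPA * CAP)         # kT/(kappa*C), equal to 2kT*xi/C
```

The two bias currents differ by three decades yet give the same noise power, which illustrates the independence from $I_b$ discussed next.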

This equation is perhaps familiar. The well-known equipartition theorem tells us that in a network of resistors with a single node capacitance, the noise level is equal to $kT/C$, independent of the conductance values (Sarpeshkar et al., 1993). In our network of diode-connected MOS transistors, the noise level is independent of the bias current, $I_b$, and, consequently, the output impedance. However, the noise level appears to be inversely proportional to $2\kappa = 1/\xi$. Thus, for $\kappa < 1$ the noise level is higher as compared to a network of linear resistors. Devices in which $\kappa$ has the value of unity, such as bipolar and junction-FET transistors operating in the subthreshold region, yield the lowest noise possible in a solid-state device. And yet, if there existed devices which had $\kappa > 1$, (2.62) predicts that the noise level would be lower than that of a resistive network.

2.6.3 Distortion

Let the voltage-to-current function be $I_{out}(V_{in}) = -2I_b \sinh(\kappa V_{in}/V_t)$, the output current of the self-biased transconductor, and let $G_o = -2I_b\kappa/V_t$ be the nominal transconductance. From (2.39), the first distortion measure can be written as

$$ V^2_{in,d1} \equiv \int_{-\infty}^{\infty} f(V_{in}) \left(V_{in} - \frac{V_t}{\kappa} \sinh(\kappa V_{in}/V_t)\right)^2 dV_{in} \tag{2.63} $$

where $f(V_{in})$ is the probability density function of the input voltage, $V_{in}$. Using numerical techniques, this integral is computed for each of three input distributions for a range of values of $\sigma$, the input RMS value.

For example, for the normal distribution, where $f(V_{in}) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-V_{in}^2/2\sigma^2}$, the first distortion measure is computed as

$$ V^2_{in,d1} \equiv \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-V_{in}^2/2\sigma^2} \left(V_{in} - \frac{V_t}{\kappa} \sinh(\kappa V_{in}/V_t)\right)^2 dV_{in} \tag{2.64} $$

A plot of the first distortion level as a function of the input voltage RMS value $\sigma$ is shown in Fig. 2.13 for the cosine amplitude, uniform, and normal distributions. The cosine amplitude distribution consistently gives the lowest distortion, because it has the tightest bound, $\pm\sqrt{2}\sigma$, while the normal distribution consistently results in the highest distortion, because it is unbounded.

Using (2.39), the first distortion measure can also be rendered in the following form

$$ V^2_{in,d1} = \left(\frac{V_t}{\kappa}\right)^2 \Big( E[(\kappa V_{in}/V_t)^2] + E[\sinh^2(\kappa V_{in}/V_t)] - 2\,E[(\kappa V_{in}/V_t) \sinh(\kappa V_{in}/V_t)] \Big) \tag{2.65} $$

Using the hyperbolic identity $\sinh^2(x) = -1/2 + 1/2\cosh(2x)$, and expanding $\sinh(x)$ and $\cosh(x)$ into a power series, we obtain the equation

$$ V^2_{in,d1} = \left(\frac{V_t}{\kappa}\right)^2 \left( \frac{E[(\kappa V_{in}/V_t)^6]\left(\frac{2^5}{6} - 2\right)}{5!} + \frac{E[(\kappa V_{in}/V_t)^8]\left(\frac{2^7}{8} - 2\right)}{7!} + \ldots \right) \tag{2.66} $$

From (2.66), we note that, independent of the input distribution, the first distortion level depends only on even-order moments beginning with the sixth. For the uniform, cosine value, and normal distributions, the sixth moment is proportional to $\sigma^6$. Thus, we anticipate a slope of 60 dB/dec for small values of $\sigma$. As $\sigma$ increases, this slope will increase also.
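The series (2.66) and the 60 dB/dec slope can be checked against a direct evaluation of (2.65). The sketch below does this for a pure tone, where the expectation reduces to an average over one period and the moments of the cosine amplitude distribution are known in closed form. All quantities are normalized: `s` is the RMS input in units of $V_t/\kappa$ and the result is in units of $(V_t/\kappa)^2$. The function names are hypothetical.

```python
import math


def d1_tone_numeric(s, steps=4096):
    """E[(x - sinh x)^2] for x = sqrt(2)*s*cos(phase), averaged over one period."""
    amp = math.sqrt(2.0) * s
    total = 0.0
    for i in range(steps):
        x = amp * math.cos(2.0 * math.pi * (i + 0.5) / steps)
        total += (x - math.sinh(x)) ** 2
    return total / steps


def d1_tone_series(s):
    """First two terms of (2.66), using the cosine-distribution moments."""
    amp = math.sqrt(2.0) * s
    m6 = amp ** 6 * 5.0 / 16.0     # E[x^6] for x = amp*cos(phase)
    m8 = amp ** 8 * 35.0 / 128.0   # E[x^8]
    return (m6 * (2 ** 5 / 6 - 2) / math.factorial(5)
            + m8 * (2 ** 7 / 8 - 2) / math.factorial(7))


numeric = d1_tone_numeric(0.1)
series = d1_tone_series(0.1)

# Slope in dB per decade of sigma between s = 0.01 and s = 0.1.
slope_db_per_dec = 10.0 * math.log10(d1_tone_numeric(0.1) / d1_tone_numeric(0.01))
```

For small `s` the two evaluations agree closely and the slope sits just above 60 dB/dec, with the excess coming from the eighth-order term.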

We want to compute the input-referred distortion using the second measure for the same input signal distributions and transconductance function. The optimal gain factor $\alpha^*$ is that gain factor which minimizes the mean-square error between the input voltage and the nonlinearly transformed input voltage. It can be computed from (2.43) as

$$ \alpha^* = \frac{E[(\kappa V_{in}/V_t) \sinh(\kappa V_{in}/V_t)]}{E[\sinh^2(\kappa V_{in}/V_t)]} \tag{2.67} $$

Figure 2.13: The first distortion level as a function of $\sigma$ in units of $V_t/\kappa$, where $\kappa = 0.7$ and $V_t = 25.7$ mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

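Using numerical techniques, the optimal gain factor is computed for each of three input distributions. For example, for the cosine amplitude distribution, the optimal gain factor can be found as the ratio of two integrals, as in

$$ \alpha^* = \frac{\displaystyle\int_{-\sqrt{2}\sigma}^{\sqrt{2}\sigma} \frac{1}{\sqrt{2}\,\pi\sigma\sqrt{1 - V_{in}^2/2\sigma^2}}\;\frac{\kappa V_{in}}{V_t} \sinh\!\left(\frac{\kappa V_{in}}{V_t}\right) dV_{in}}{\displaystyle\int_{-\sqrt{2}\sigma}^{\sqrt{2}\sigma} \frac{1}{\sqrt{2}\,\pi\sigma\sqrt{1 - V_{in}^2/2\sigma^2}}\;\sinh^2\!\left(\frac{\kappa V_{in}}{V_t}\right) dV_{in}} \tag{2.68} $$

The results of these computations are shown in Fig. 2.14.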

Returning to (2.67), one can expand the functions $\sinh(x)$ and $\sinh^2(x)$ using a Taylor series. If we denote

$$ D(V_{in}) = 1 + \frac{2^3\,E[(\kappa V_{in}/V_t)^4]}{4!\,E[(\kappa V_{in}/V_t)^2]} + \ldots \tag{2.69} $$

then a series approximation of the optimal gain factor is as follows

$$ \alpha^* = \frac{1}{D(V_{in})} \left( 1 + \frac{E[(\kappa V_{in}/V_t)^4]}{3!\,E[(\kappa V_{in}/V_t)^2]} + \ldots \right) \tag{2.70} $$

From (2.70) and (2.69), one can show that $\alpha^*$ approaches one as $\sigma$ gets very small. This is precisely the behavior of the curves in Fig. 2.14.

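The second distortion measure is computed using the optimal gain factor. It can be computed from (2.42), as in

$$ V^2_{in,d2} = \int_{-\infty}^{\infty} f(V_{in}) \left(V_{in} - \alpha^*\,\frac{V_t}{\kappa} \sinh(\kappa V_{in}/V_t)\right)^2 dV_{in} \tag{2.71} $$

where $f(V_{in})$ is the probability density function of the input voltage, $V_{in}$. In Fig. 2.15 we plot the second distortion level as a function of $\sigma$ for three input distributions.

The second distortion level is more difficult to estimate using series approximations than the first distortion level. Starting from (2.44), and after much algebra, we find

$$ \begin{aligned} V^2_{in,d2} = \frac{(V_t/\kappa)^2}{D(V_{in})} \Bigg( & \frac{E[(\kappa V_{in}/V_t)^6]\left(\frac{2^5}{6} - 2\right)}{5!} - \frac{E^2[(\kappa V_{in}/V_t)^4]}{3!\,3!\,E[(\kappa V_{in}/V_t)^2]} \\ & + \frac{E[(\kappa V_{in}/V_t)^8]\left(\frac{2^7}{8} - 2\right)}{7!} - \frac{2\,E[(\kappa V_{in}/V_t)^4]\,E[(\kappa V_{in}/V_t)^6]}{3!\,5!\,E[(\kappa V_{in}/V_t)^2]} + \ldots \Bigg) \end{aligned} \tag{2.72} $$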

Figure 2.14: The optimal gain factor $\alpha^*$ as a function of $\sigma$ in units of $V_t/\kappa$, where $\kappa = 0.7$ and $V_t = 25.7$ mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

Figure 2.15: The second distortion level as a function of $\sigma$ in units of $V_t/\kappa$, where $\kappa = 0.7$ and $V_t = 25.7$ mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

If we compare the lowest order terms in the series approximations for the first and second distortion levels, we find the second distortion level to be 4-10 dB less than the first.

2.6.4 Dynamic Range

The signal-to-noise-plus-distortion ratio using the first and second distortion measures is given in Fig. 2.16. From this figure we see qualitatively that there is an optimum input signal level at which to operate, depending on the distortion measure used and the input signal distribution. The optimum signal level occurs in the vicinity of the peak value for the signal-to-noise-plus-distortion ratio. For input levels below this peak, the signal-to-noise ratio is low, whereas for input levels above this peak, the signal-to-distortion ratio is low.

From first-order approximations using the series approximations derived earlier and truncating after the first term, we expect the distortion-free dynamic range using the first distortion measure to have a 6.67 dB/dec slope as a function of the integrating capacitance $C$ and a $-6.67$ dB/dec slope as a function of the body-effect coefficient $\kappa$. For nominal parameter values of $C = 5$ pF and $\kappa = 0.7$, DFDR1 is in the range 41.7-44.2 dB. In Fig. 2.17(a) we plot the first distortion-free dynamic range as a function of capacitance for three input distributions. Fig. 2.17(b) shows the first distortion-free dynamic range as a function of $\kappa$. Similar plots can be generated using the second distortion measure, as in Fig. 2.18. The slope of the second distortion-free dynamic range is identical to that of the first, whereas its level is 1.3-3.4 dB higher.

Figure 2.16: The signal-to-noise-plus-distortion ratio using (a) the first and (b) the second distortion measure as a function of $\sigma$ in units of $V_t/\kappa$, where $V_t = 25.7$ mV, $\kappa = 0.7$ and $C = 5.0$ pF. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

Figure 2.17: The distortion-free dynamic range using the first distortion measure (a) as a function of $C$ [pF], where $\kappa = 0.7$, and (b) as a function of $\kappa$, where $C = 5.0$ pF. $V_t = 25.7$ mV. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

Figure 2.18: The distortion-free dynamic range using the second distortion measure (a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where $C = 5.0$ pF. $V_t = 25.7$ mV. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

Using first-order approximations, one can show that the distortion-limited dynamic range is a slightly stronger function of $C$ and $\kappa$ than the distortion-free dynamic range. Specifically, we anticipate a 10 dB/dec slope as a function of $C$ and a $-10$ dB/dec slope as a function of $\kappa$. The distortion-limited dynamic range using the first distortion measure is in the range of 45.5-49.35 dB for the nominal parameter values of $C = 5.0$ pF and $\kappa = 0.7$ and a maximum of 2% amplitude distortion ($c = 0.02$). In Fig. 2.19(a) we plot the first distortion-limited dynamic range as a function of capacitance for three input distributions. Fig. 2.19(b) shows the first distortion-limited dynamic range as a function of $\kappa$. Similar plots are given using the second distortion measure in Fig. 2.20. The slope of the second distortion-limited dynamic range is identical to that of the first, whereas its level is 2.0-5.2 dB higher.
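The first-order estimates can be reproduced in a few lines by keeping only the sixth-moment term of (2.66) and the noise level of (2.62). The sketch below is an approximation under stated assumptions: it uses $kT = qV_t$, takes the sixth moments of the three distributions in closed form, and reads the distortion limit as distortion power equal to $c^2$ times signal power. Function and variable names are hypothetical.

```python
import math

Q = 1.602176634e-19   # electron charge [C]
V_T = 25.7e-3         # thermal voltage kT/q [V]

# E[x^6] / s^6 for each distribution, where s is the RMS value of x.
SIXTH_MOMENT = {"uniform": 27.0 / 7.0, "cosine": 2.5, "normal": 15.0}


def noise_power(cap, kappa):
    """Output noise (2.62): 2kT*xi/C with xi = 1/(2*kappa), using kT = q*Vt."""
    return Q * V_T / (kappa * cap)


def dfdr1_db(dist, cap=5e-12, kappa=0.7):
    """Distortion-free range: leading term of (2.66) set equal to the noise."""
    scale = (V_T / kappa) ** 2
    s6 = 36.0 * noise_power(cap, kappa) / (scale * SIXTH_MOMENT[dist])
    signal = scale * s6 ** (1.0 / 3.0)
    return 10.0 * math.log10(signal / noise_power(cap, kappa))


def dldr1_db(dist, c_amp=0.02, cap=5e-12, kappa=0.7):
    """Distortion-limited range: leading distortion term = c^2 * signal power."""
    scale = (V_T / kappa) ** 2
    s4 = 36.0 * c_amp ** 2 / SIXTH_MOMENT[dist]
    signal = scale * math.sqrt(s4)
    return 10.0 * math.log10(signal / noise_power(cap, kappa))


dfdr = {name: dfdr1_db(name) for name in SIXTH_MOMENT}
dldr = {name: dldr1_db(name) for name in SIXTH_MOMENT}

# Slopes per decade of capacitance, from a tenfold increase in C.
dfdr_slope_c = dfdr1_db("normal", cap=50e-12) - dfdr1_db("normal", cap=5e-12)
dldr_slope_c = dldr1_db("normal", cap=50e-12) - dldr1_db("normal", cap=5e-12)
```

Under these approximations the normal distribution gives the low end and the cosine distribution the high end of each quoted range, and the capacitance slopes come out at 6.67 and 10 dB/dec.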

The self-biased transconductor has the rare property that the nominal transconductance $G_o$ is the minimum transconductance, i.e. the transconductance function is convex. One might expect that, in a cascade of such transconductors, as in a cochlear filter bank implementation, the output current distortion would eventually push the transistor into the above-threshold region of operation. Simply stated, these transconductors do not have a saturating output current, but a saturating output voltage. As we shall see in Chapter 3, transconductors based on differential pairs have a saturating current characteristic, and as such the nominal transconductance is also the maximum transconductance. The output current distortion for differential transconductors tends to reduce the actual output signal.

In the next chapter we explore differential transconductors. Using analysis techniques introduced in Chapter 2, three new linearized transconductors are designed and optimized for operation in subthreshold CMOS.

Figure 2.19: The distortion-limited dynamic range using the first distortion measure (a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where $C = 5.0$ pF. $V_t = 25.7$ mV and 2% amplitude distortion. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

Figure 2.20: The distortion-limited dynamic range using the second distortion measure (a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where $C = 5.0$ pF. $V_t = 25.7$ mV and 2% amplitude distortion. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).

Chapter 3

Linearized Transconductors in Subthreshold CMOS

Analog circuits implemented in subthreshold CMOS are attractive because of their low power consumption and compatibility with standard digital CMOS processes (Vittoz, 1994). Continuous-time linear filtering of audio signals, for applications such as hearing aids, is one class of analog circuits to which subthreshold CMOS poses a particular challenge. The reason is that subthreshold current in a CMOS device depends exponentially on the gate voltage. As a case in point, we show in section 3.2 that the linear range of the basic two-transistor differential pair operating below threshold is less than $\pm 7.5$ mV. However, by applying several linearizing techniques we are able to increase the linear range by as much as eight times. Moreover, these techniques require only modest increases in silicon area and power consumption.

Section 3.2 describes and analyzes the basic two-transistor differential pair. In this section we define two performance measures for transconductors, linear range and current efficiency. Sections 3.3 and 3.4 analyze in detail the four linearizing techniques that are included in this research. Hints on improved transconductor designs are given in section 3.5. The chapter concludes with experimental results and a table summary.

A model for the current in an nMOS device operating below threshold is given in Appendix A and is repeated here for convenience:

$$ I_{DS} = I_0 S\, e^{\kappa V_{GB}/V_t} \left( e^{-V_{SB}/V_t} - e^{-V_{DB}/V_t} \right) \tag{3.1} $$

For transistors operating in saturation, i.e., $(V_{DB} - V_{SB}) \geq 5V_t$, the drain dependence can be safely ignored.

3.1 The Transconductance-C Integrator

A differential-input, single-ended output transconductance-C integrator with no linearization is shown in Fig. 3.1. It consists of a transconductor, a current mirror, and an integrating capacitance. In this case the transconductor is a simple differential pair, as found in Liu (Liu, 1992). In Fig. 3.1 and for the remainder of this work, a three-terminal MOS transistor is assumed to have the bulk terminal tied to a common local substrate. The mirror is assumed noiseless in order to isolate the behavior of the transconductor. Alternatively, the mirror can be replaced by a complementary active stage, yielding a differential output. The noise is doubled, but so is the nominal transconductance, so that no net noise is introduced (Groenewold, 1991).

Our goal is to maximize the dynamic range of the transconductance-C integrator. Recall that dynamic range is the ratio of the maximum output voltage signal power to the output voltage noise power, expressed in dB. If we assume a lowpass configuration with cutoff frequency $G_o/C$, the equivalent noise bandwidth is $G_o/4C$, as derived in Chapter 2. Then the output voltage noise power can be computed most conveniently as the product of the input-referred noise density with the equivalent noise bandwidth of the circuit.

Figure 3.1: The transconductance-C integrator using the basic differential pair.

3.2 The Differential Pair and Definitions

The basic differential pair is shown in Fig. 3.2(a). It consists of two matched transistors M1 and M2 operating in saturation and a third transistor M3 operating as a current source, $I_b$. Let $V_1$ and $V_2$ be defined by their common-mode and differential-mode voltages with respect to the substrate potential, where

$$ V_{CM} \equiv \frac{V_1 + V_2}{2}, \qquad V_{DM} \equiv V_1 - V_2 \tag{3.2} $$

Solving for the input voltages, we have

$$ V_1 = V_{CM} + \frac{V_{DM}}{2}, \qquad V_2 = V_{CM} - \frac{V_{DM}}{2} \tag{3.3} $$

Let $I_1$ and $I_2$ be the currents passing through the two transistors as shown in Fig. 3.2(a). We have the constraint

$$ I_b = I_1 + I_2 \tag{3.4} $$

Let $V_S$ be the voltage at the source of the differential pair. If we assume that $V_B$, the bulk potential, is at zero volts, we can write

$$
\begin{aligned}
I_1 &= I_0 S\, e^{\kappa (V_{CM} + V_{DM}/2)/V_t}\, e^{-V_S/V_t} \\
I_2 &= I_0 S\, e^{\kappa (V_{CM} - V_{DM}/2)/V_t}\, e^{-V_S/V_t}
\end{aligned}
\tag{3.5}
$$

Define the differential output current as

$$ I_{DM} \equiv I_1 - I_2 \tag{3.6} $$

Substituting (3.5), we obtain

$$ I_{DM} = I_0 S\, e^{\kappa V_{CM}/V_t}\, e^{-V_S/V_t} \left( e^{\kappa V_{DM}/2V_t} - e^{-\kappa V_{DM}/2V_t} \right) \tag{3.7} $$

Figure 3.2: The basic differential pair (a) circuit, and (b) simplified AC noise model.

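From (3.4), $I_b$ can be written as

$$ I_b = I_0 S\, e^{\kappa V_{CM}/V_t}\, e^{-V_S/V_t} \left( e^{\kappa V_{DM}/2V_t} + e^{-\kappa V_{DM}/2V_t} \right) \tag{3.8} $$

Normalizing $I_{DM}$ by the bias current and canceling common terms, we get

$$ \frac{I_{DM}}{I_b} = \frac{e^{\kappa V_{DM}/2V_t} - e^{-\kappa V_{DM}/2V_t}}{e^{\kappa V_{DM}/2V_t} + e^{-\kappa V_{DM}/2V_t}} \tag{3.9} $$

Recognizing the right-hand side as the hyperbolic tangent, we can write

$$ I_{DM} = I_b \tanh\left( \frac{\kappa V_{DM}}{2V_t} \right) \tag{3.10} $$

Define the transconductance of the differential pair as

$$ G(V_{DM}) \equiv \frac{\partial I_{DM}}{\partial V_{DM}} \tag{3.11} $$

where the nominal value $G_o$ is equal to $G(V_{DM} = 0)$. For the case of the basic differential pair,

$$ G(V_{DM}) = \frac{I_b}{\cosh^2\left( \frac{\kappa V_{DM}}{2V_t} \right)} \cdot \frac{\kappa}{2V_t} \tag{3.12} $$

The nominal value occurs for $V_{DM}$ equal to zero. Thus,

$$ G_o = I_b \frac{\kappa}{2V_t} \tag{3.13} $$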
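As a quick numerical check of (3.10) through (3.13), the following minimal sketch evaluates the tanh characteristic and compares a finite-difference slope with the closed-form transconductance. The bias current of 1 nA and all function names are illustrative assumptions; $V_t = 25.7$ mV and $\kappa = 0.7$ are the values used elsewhere in this chapter.

```python
import math

V_T = 25.7e-3   # thermal voltage [V], value used in this chapter
KAPPA = 0.7     # kappa, value used in this chapter
I_B = 1e-9      # bias current [A]; an assumed, illustrative value


def i_dm(v_dm, i_b=I_B, v_t=V_T, kappa=KAPPA):
    """Differential output current of the basic pair, Eq. (3.10)."""
    return i_b * math.tanh(kappa * v_dm / (2.0 * v_t))


def g(v_dm, i_b=I_B, v_t=V_T, kappa=KAPPA):
    """Closed-form transconductance of the basic pair, Eq. (3.12)."""
    x = kappa * v_dm / (2.0 * v_t)
    return i_b * kappa / (2.0 * v_t) / math.cosh(x) ** 2


def g_numeric(v_dm, h=1e-6):
    """Central finite-difference estimate of dI_DM/dV_DM."""
    return (i_dm(v_dm + h) - i_dm(v_dm - h)) / (2.0 * h)


g_o = g(0.0)  # nominal transconductance, Eq. (3.13)
```

Because the pair saturates, `g(v)` is largest at `v = 0` and falls off on either side, which is why the nominal transconductance is also the maximum one.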

In Appendix A a small-signal noise model is given for the MOS transistor. Assuming a fixed (noiseless) bias potential, $V_{bias}$, a simplified small-signal noise model for the differential pair can be derived as in Fig. 3.2(b). Two equations which can be used to solve for the differential noise current $i_{dn}$, where

$$ i_{dn} \equiv i_1 - i_2 \tag{3.14} $$

are as follows:

$$
\begin{aligned}
i_{dn} &= i_{n1} - v_s (g_{m1} + g_{mb1} + g_{d1}) - i_{n2} + v_s (g_{m2} + g_{mb2} + g_{d2}) \\
i_{n1} + i_{n2} - i_{n3} &= v_s (2g_{m1} + 2g_{mb1} + 2g_{d1} + g_{d3})
\end{aligned}
\tag{3.15}
$$

The second equation is found by summing all currents at node $v_s$. Let us assume that transistors M1 and M2 are matched, so that their small-signal parameters are equal. After simplification, we find that

$$ i_{dn} = i_{n1} - i_{n2} \tag{3.16} $$

It is interesting to note that noise in the bias current is effectively a common-mode noise source and is canceled by the differential output current.

To find the power spectrum of the differential noise current, we add the effects of each independent noise source. As such, we sum the power spectra of the individual sources, each multiplied by the squared magnitude of its transfer function. In this case the transfer functions are simply $1$ and $-1$ for $i_{n1}$ and $i_{n2}$, respectively.

$$ S_{i_{dn}}(\omega) = (1)^2 S_{i_{n1}}(\omega) + (-1)^2 S_{i_{n2}}(\omega) = S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \tag{3.17} $$

From Appendix A, the power spectrum of the current noise in a single transistor is $2qI_{DS}$. For the nominal input voltage $V_{DM} = 0$, $I_1 = I_2 = I_b/2$. Therefore, we have

$$ S_{i_{dn}}(\omega) = 2q\frac{I_b}{2} + 2q\frac{I_b}{2} = 2qI_b \tag{3.18} $$

From Chapter 2 the input-referred noise density is the differential output current density divided by the square of the nominal transconductance. Thus,

$$ V^2_{in,n} = \frac{S_{i_{dn}}(\omega)}{G_o^2} \tag{3.19} $$

Multiplying the input noise density by the equivalent noise bandwidth of a lowpass filter, $G_o/4C$, the output noise power is

$$ V^2_{out,n} = \frac{S_{i_{dn}}(\omega)}{4G_oC} \tag{3.20} $$

For the case of the differential pair, we substitute $S_{i_{dn}}$ and $G_o$ into (3.20) to obtain

$$ V^2_{out,n} = \frac{qV_t}{\kappa C} \tag{3.21} $$
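The substitution leading to (3.21) can be written out in one line. The step below is a worked sketch using only (3.13), (3.18) and (3.20); it also makes visible that the bias current cancels, so the output noise power of the lowpass filter does not depend on $I_b$.

```latex
\begin{align*}
V^2_{out,n}
  &= \frac{S_{i_{dn}}(\omega)}{4 G_o C}
   = \frac{2 q I_b}{4 C \, I_b \dfrac{\kappa}{2 V_t}}
   = \frac{2 q I_b \cdot 2 V_t}{4 C \kappa I_b}
   = \frac{q V_t}{\kappa C}
\end{align*}
```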

The maximum normalized transconductance distortion measure, as defined in Chapter 2, is applied to the transconductor. It is repeated here for convenience.

$$ D_G \equiv \max \left| \frac{G(V_{in}) - G_o}{G_o} \right| \quad \forall\, V_{in} \tag{3.22} $$

where $G_o$ is the nominal transconductance.$^1$ One can express $D_G$ as a percentage, where a typical value is 1.0%. Referred to the input voltage, a distortion power measure can be computed as

$$ V^2_{in,dg} = V^2_{in,s} (D_G)^2 \tag{3.23} $$

For the remainder of this chapter, the term distortion will refer to the maximum normalized transconductance distortion measure, unless otherwise stated. As outlined in section 2.4, other distortion measures are possible. This particular distortion measure is the most conservative and also the simplest to compute of the three considered in that section.

Define the maximum input voltage $V_{max}$ of the transconductor as the upper bound of the continuous set of differential input voltages $V_{DM}$ for which the maximum normalized distortion is less than or equal to a constant. By symmetry, the linear range of the transconductor will be $\pm V_{max}$. Other constraints can be added to the definition of linear range. For example, we may require a certain degree of smoothness in the transconductance function at $V_{DM} = 0$.

If the transconductance function is convex and takes its maximum value at $V_{DM} = 0$, then $V_{max}$ can be easily computed by finding that value of $V_{DM}$ which achieves the maximum distortion, as in

$$ D_G = \left| \frac{G(V_{max}) - G_o}{G_o} \right| \tag{3.24} $$

Supposing that the transconductance function has an inverse, $G^{-1}(\cdot)$, one can solve for $V_{max}$, as in

$$ V_{max} = G^{-1}\left( G_o (1 - D_G) \right) \tag{3.25} $$

If the inverse function does not exist, numerical techniques can be used to solve the above equation.

$^1$ For the case that the transconductance function has more than one inflection point, such as an equiripple design, substitute the global maximum for the nominal transconductance.

Given $D_G$, the maximum input voltage $V_{max}$ for the basic differential pair can be determined analytically as

$$ V_{max} = \frac{2V_t}{\kappa} \cosh^{-1}\left( \frac{1}{\sqrt{1 - D_G}} \right) \tag{3.26} $$

Note that $V_{max}$ is proportional to $V_t/\kappa$. In fact, one can write $V_{max}$ in the form

$$ V_{max} = d(D_G)\, \frac{V_t}{\kappa} \tag{3.27} $$

where $d(\cdot)$ is a function of the percent distortion only. If we specify $D_G$ as 1%, that function becomes a constant and we have

$$ V_{max} = 0.201\, \frac{V_t}{\kappa} \tag{3.28} $$

For $V_t = 25.7$ mV and $\kappa = 0.7$ we obtain a linear range of $\pm 7.37$ mV. In Fig. 3.3 we plot normalized $G$ as a function of $V_{DM}$.

Let us define the current efficiency of a transconductor as the maximal linear output current expressed as a fraction of the total bias current. If we write the differential output current $I_{DM}$ as a function of the differential input voltage $V_{DM}$, then the current efficiency $\eta_I$ can be expressed as

$$ \eta_I = \frac{I_{DM}(V_{DM} = V_{max})}{I_b} \times 100\% \tag{3.29} $$

where $V_{max}$ is the maximum differential input voltage. The current efficiency gives us a measure of how much of the available current is usable for performing linear computations. Note that the bias current serves the dual purpose of tuning the nominal transconductance.

Given the maximum input voltage, the current efficiency of the basic differential pair can be computed as

$$ \eta_I = \tanh\left( \frac{\kappa V_{max}}{2V_t} \right) \times 100\% \tag{3.30} $$

Applying hyperbolic identities, this equation simplifies to

$$ \eta_I = \sqrt{D_G} \times 100\% \tag{3.31} $$

Thus, for $D_G$ equal to 0.01, we find that the current efficiency of this design is 10.0%.
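The simplification from (3.30) to (3.31) follows from (3.26) in two short steps. The worked sketch below introduces the shorthand $x$ (not used in the original text) for the normalized maximum input.

```latex
\begin{align*}
x &\equiv \frac{\kappa V_{max}}{2 V_t}
  \quad\Rightarrow\quad
  \cosh x = \frac{1}{\sqrt{1 - D_G}}
  \quad \text{from (3.26)} \\
\tanh^2 x &= 1 - \frac{1}{\cosh^2 x} = 1 - (1 - D_G) = D_G \\
\eta_I &= \tanh x \times 100\% = \sqrt{D_G} \times 100\%
\end{align*}
```

For $D_G = 0.01$ this gives $\tanh x = 0.1$, i.e. the 10.0% quoted above.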

The input amplitude is permitted to vary in the range of $\pm V_{max}$. If we assume that the input is sinusoidal with amplitude $V_{max}$, the maximum input signal power is given by

$$ V^2_{in,s} = \frac{V^2_{max}}{2} \tag{3.32} $$

For a transconductance-C filter configured as a unity-gain lowpass filter, the maximum output signal power is equal to the maximum input signal power, so that

$$ V^2_{out,s} = \frac{V^2_{max}}{2} \tag{3.33} $$

The dynamic range can be expressed as the maximum signal-to-noise ratio at a given distortion $D_G$, as in

$$ DR = \left. \frac{V^2_{out,s}}{V^2_{out,n}} \right|_{D_G} \tag{3.34} $$

Substituting (3.32) into the above equation, we get

$$ DR = \frac{V^2_{max}}{2\, V^2_{out,n}} \tag{3.35} $$

For the case of the differential pair, substituting (3.21) and (3.28) the dynamic range can be written as

$$ DR = \frac{1}{2} \left( \frac{0.201\, V_t}{\kappa} \right)^2 \frac{\kappa C}{qV_t} = 0.0201\, \frac{V_t C}{\kappa q} \tag{3.36} $$

Figure 3.3: Normalized transconductance for the basic differential pair as a function of $V_{DM}$ (in mV) with $V_t = 25.7$ mV and $\kappa = 0.7$.

For a nominal capacitance value of 5.0 pF, $V_t = 25.7$ mV and $\kappa = 0.7$, the dynamic range of the differential pair is 43.6 dB. From (3.36), we can identify certain trends or relationships. Increasing the temperature of operation will increase $V_t$ and as such extend the dynamic range. How does this happen? Increases in temperature increase the noise power, but only linearly with $T$. However, increases in temperature increase the signal power by the factor $T^2$. Taking the ratio of the signal power to the noise power yields a net increase. Similarly, increases in $C$ and/or decreases in $\kappa$ result in a higher dynamic range.
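The trends read off (3.36) can be checked numerically. The sketch below is a minimal illustration built from (3.21), (3.26) and (3.35); the function names and the choice of doubling or halving each parameter are assumptions made for the example.

```python
import math

Q = 1.602176634e-19  # electron charge [C]


def v_max_basic(d_g, v_t, kappa):
    """Maximum input voltage of the basic pair, Eq. (3.26)."""
    return (2.0 * v_t / kappa) * math.acosh(1.0 / math.sqrt(1.0 - d_g))


def noise_power_basic(v_t, kappa, c):
    """Output noise power q*V_t/(kappa*C), Eq. (3.21)."""
    return Q * v_t / (kappa * c)


def dynamic_range_db(v_t=25.7e-3, kappa=0.7, c=5.0e-12, d_g=0.01):
    """Dynamic range of the basic pair in dB, Eq. (3.35)."""
    signal = v_max_basic(d_g, v_t, kappa) ** 2 / 2.0
    return 10.0 * math.log10(signal / noise_power_basic(v_t, kappa, c))


dr_nominal = dynamic_range_db()                  # nominal operating point
dr_double_vt = dynamic_range_db(v_t=2 * 25.7e-3)  # V_t doubled
dr_double_c = dynamic_range_db(c=10.0e-12)        # C doubled
dr_half_kappa = dynamic_range_db(kappa=0.35)      # kappa halved
```

Since the dynamic range scales as $V_t C/\kappa$, each of the three changes above raises it by the same amount, $10\log_{10} 2 \approx 3$ dB.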

While it is not immediately apparent how $\kappa$ can be varied, the next section details one possible means of reducing an "effective" $\kappa$ using diode-connected transistors. Alternatively, one might use the "back-gate", i.e., the substrate, of the MOS transistor as the input terminal (Sarpeshkar et al., 1996), since the effective $\kappa$ of the back-gate is equal to $(1 - \kappa)$.

3.3 Source Degeneration

Source degeneration can be accomplished by placing a conductance at the source of the differential pair. In a standard digital CMOS process, no high-value resistors exist. Therefore, resistors will be generated using transistors only. Three techniques for improving the linear range of the basic differential pair using source degeneration are outlined below.

3.3.1 Diode-Connected Transistors

A differential pair with source degeneration via diode-connected transistors is shown in Fig. 3.4 (Watts, 1992). It consists of two pairs of matched transistors M1–M2 and M3–M4 and a fifth transistor M5 operating as a current source, $I_b$. Let $V_S$ now be the voltage at the source of the two diode-connected transistors, referenced to ground. The voltages at the sources of M1 and M2 are $V_{S1}$ and $V_{S2}$, respectively. We assume that all transistors are operating in saturation and that $V_B$, the bulk potential, is at zero volts. Applying (3.1) to transistors M1 and M3 in Fig. 3.4(a), and eliminating the voltage $V_{S1}$, one can show that

$$ I_1 = I_0 S_{eff}\, e^{\frac{\kappa^2 V_1}{(1+\kappa)V_t}}\, e^{-\frac{V_S}{(1+\kappa)V_t}} \tag{3.37} $$

where $S_{eff}$ is the effective width-to-length ratio given by

$$ S_{eff} = \left( S_1^{\kappa} S_3 \right)^{1/(1+\kappa)} \tag{3.38} $$

We note from (3.37) that the width-to-length ratios of transistors M1 and M3 do not control the current $I_1$ independently. An equation analogous to (3.37) holds for transistors M2 and M4 of Fig. 3.4(a).

Let $V_1$ and $V_2$ be defined as in (3.3). The constraint $I_b = I_1 + I_2$ still holds. We find that

$$ I_{DM} = I_b \tanh\left( \frac{\kappa^2 V_{DM}}{(1+\kappa)\, 2V_t} \right) \tag{3.39} $$

The transconductance as a function of $V_{DM}$ can be written as

$$ G(V_{DM}) = \frac{I_b}{\cosh^2\left( \frac{\kappa^2 V_{DM}}{(1+\kappa)\, 2V_t} \right)} \cdot \frac{\kappa^2}{(1+\kappa)\, 2V_t} \tag{3.40} $$

The nominal value occurs for $V_{DM}$ equal to zero. Thus,

$$ G_o = I_b \frac{\kappa}{2V_t} \left( \frac{\kappa}{1+\kappa} \right) \tag{3.41} $$

To some extent, one can view the presence of the diode-connected transistor as having reduced the effective $\kappa$ by the factor $\kappa/(1+\kappa)$. But it is not, unfortunately, quite that simple for the computation of noise power.

A simplified small-signal noise model for the differential pair with source degeneration via diode-connected transistors is shown in Fig. 3.4(b). Three equations which can be used to solve for the differential noise current $i_{dn} \equiv i_1 - i_2$ are:

$$
\begin{aligned}
i_{dn} &= i_{n1} - v_{s1}(g_{m1} + g_{mb1} + g_{d1}) - i_{n2} + v_{s2}(g_{m2} + g_{mb2} + g_{d2}) \\
i_{dn} &= i_{n3} + (v_{s1} - v_s)(g_{m3} + g_{d3}) - v_s g_{mb3} - i_{n4} - (v_{s2} - v_s)(g_{m4} + g_{d4}) + v_s g_{mb4} \\
i_{n3} + i_{n4} - i_{n5} &= v_s(g_{mb3} + g_{mb4} + g_{d5}) + (v_s - v_{s1})(g_{m3} + g_{d3}) + (v_s - v_{s2})(g_{m4} + g_{d4})
\end{aligned}
\tag{3.42}
$$

Figure 3.4: The differential pair with source degeneration via diode-connected transistors (a) circuit, and (b) small-signal noise model.

The third equation is found by summing all currents at node $v_s$. Let us assume that transistor pairs M1–M2 and M3–M4 are matched, i.e., their small-signal parameters are the same. Then after some simplification, we find

$$ i_{dn} = \frac{(i_{n1} - i_{n2})(g_{m3} + g_{d3})}{g_{m1} + g_{mb1} + g_{d1} + g_{m3} + g_{d3}} + \frac{(i_{n3} - i_{n4})(g_{m1} + g_{mb1} + g_{d1})}{g_{m1} + g_{mb1} + g_{d1} + g_{m3} + g_{d3}} \tag{3.43} $$

To estimate the power spectrum of the differential noise current, we make the following assumptions. Let $g_{d1} = g_{d5} \approx 0$. From Appendix A, $g_m = \kappa I_{DS}/V_t$ and $g_{mb} = (1 - \kappa) I_{DS}/V_t$, so that $g_m + g_{mb} = I_{DS}/V_t$. Since the current $I_{DS}$ is the same through the differential pair as through the diode-connected transistors, the equation for $i_{dn}$ reduces to

$$ i_{dn} = \frac{(i_{n1} - i_{n2})\,\kappa}{1+\kappa} + \frac{(i_{n3} - i_{n4})}{1+\kappa} \tag{3.44} $$

As before, we sum the power spectra of the independent noise sources multiplied by the squared magnitude of their respective transfer functions. We obtain

$$ S_{i_{dn}}(\omega) = \left( \frac{\kappa}{1+\kappa} \right)^2 \left[ S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \right] + \left( \frac{1}{1+\kappa} \right)^2 \left[ S_{i_{n3}}(\omega) + S_{i_{n4}}(\omega) \right] \tag{3.45} $$

From Appendix A, the power spectrum of the current noise in a single transistor is $2qI_{DS}$. For $V_{DM} = 0$, $I_1 = I_2 = I_b/2$. Therefore, each of the four noise current sources has power spectrum $2qI_b/2$. We obtain

$$ S_{i_{dn}}(\omega) = \frac{1+\kappa^2}{(1+\kappa)^2}\, 2qI_b \tag{3.46} $$

For $\kappa = 0.7$, the fraction $(1+\kappa^2)/(1+\kappa)^2$ is approximately 0.5. Therefore, for a given bias current, the differential noise spectral density in this transconductor is almost half that of a simple differential pair. Using the same type of models described in this work, one can show that a current source with source degeneration via a diode-connected transistor has a lower noise current density than a simple or cascoded current source.

To compute the output-referred noise power for the differential pair with diode-connected transistors, we substitute equations for $G_o$ and $S_{i_{dn}}$ into (3.20) to obtain

$$ V^2_{out,n} = \frac{qV_t}{\kappa C} \cdot \frac{1+\kappa^2}{\kappa+\kappa^2} \tag{3.47} $$

Thus, the output-referred noise power is increased by the factor $(1+\kappa^2)/(\kappa+\kappa^2)$ compared to the simple differential pair. For $\kappa = 0.7$, this amounts to a 25% increase.

As done for the simple differential pair, the maximum input voltage $V_{max}$ can be determined analytically as in

$$ V_{max} = \frac{2V_t}{\kappa} \left( \frac{1+\kappa}{\kappa} \right) \cosh^{-1}\left( \frac{1}{\sqrt{1 - D_G}} \right) \tag{3.48} $$

If we set the distortion $D_G$ to 1.0%, as before, we can rewrite the expression for $V_{max}$ leaving visible parameters $V_t$ and $\kappa$, where

$$ V_{max} = 0.201\, \frac{V_t}{\kappa} \left( \frac{1+\kappa}{\kappa} \right) \tag{3.49} $$

For $V_t = 25.7$ mV and $\kappa = 0.7$ this relation predicts a linear range of $\pm 17.9$ mV. In Fig. 3.5 we plot $G$ as a function of $V_{DM}$.

Given the maximum input voltage, one can show that the current efficiency of the differential pair with source degeneration via diode-connected transistors is identical to that of the basic transconductor, i.e., 10.0%.

Figure 3.5: Normalized transconductance for the differential pair with source degeneration via diode-connected transistors as a function of $V_{DM}$ (in mV) with $V_t = 25.7$ mV and $\kappa = 0.7$.

To compute the dynamic range for the case of the differential pair with diode-connected transistors, substitute (3.47) and (3.49) into (3.35) to obtain

$$ DR = 0.0201\, \frac{V_t C}{q} \cdot \frac{(1+\kappa)^3}{\kappa^2 (1+\kappa^2)} \tag{3.50} $$

For a nominal capacitance value of 5.0 pF and other parameters as before, the dynamic range of the differential pair with diode-connected transistors is 50.4 dB, representing a 6.8 dB increase over that of the basic differential pair.
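The figures quoted for the diode-connected design can be reproduced from (3.21), (3.26), (3.46), (3.47) and (3.49). The sketch below is an illustrative calculation only; the variable names are assumptions, and small differences in the last digit relative to the text come from rounding of intermediate values.

```python
import math

Q = 1.602176634e-19                      # electron charge [C]
V_T, KAPPA, C, D_G = 25.7e-3, 0.7, 5.0e-12, 0.01

# d(D_G) of Eq. (3.27); about 0.201 for 1% distortion
d = 2.0 * math.acosh(1.0 / math.sqrt(1.0 - D_G))

# Basic differential pair, Eqs. (3.21), (3.28), (3.35)
v_max_basic = d * V_T / KAPPA
noise_basic = Q * V_T / (KAPPA * C)
dr_basic_db = 10.0 * math.log10(v_max_basic ** 2 / (2.0 * noise_basic))

# Diode-connected source degeneration, Eqs. (3.46), (3.47), (3.49)
noise_density_ratio = (1.0 + KAPPA ** 2) / (1.0 + KAPPA) ** 2      # vs. 2*q*I_b
noise_power_ratio = (1.0 + KAPPA ** 2) / (KAPPA + KAPPA ** 2)      # vs. basic pair
v_max_diode = v_max_basic * (1.0 + KAPPA) / KAPPA
noise_diode = noise_basic * noise_power_ratio
dr_diode_db = 10.0 * math.log10(v_max_diode ** 2 / (2.0 * noise_diode))

improvement_db = dr_diode_db - dr_basic_db
```

The linear range grows by $(1+\kappa)/\kappa \approx 2.4$, which more than compensates the roughly 25% rise in output noise power.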

The addition of diode-connected transistors cannot be viewed solely as a reduction in $\kappa$ as hinted in (Watts, 1992), in which the author defines a new $\kappa' = \kappa^2/(1+\kappa)$, the chief reason being that the noise level is lower than anticipated from this simple substitution.

The authors in (Watts, 1992) also propose source degeneration using more than one diode-connected transistor in order to further enhance the linear range. However, the cost of using one or more diode-connected transistors at the source of a transistor is an increased supply voltage. A technique to adjust the threshold voltage down to several hundred millivolts using floating gates would offset this increase. Nevertheless, we would like to achieve an improved linear range without having to increase the supply voltage and, hence, the total power consumption.

3.3.2 Single Diffusor

The diffusor was proposed in (Boahen and Andreou, 1992) and discussed extensively in (Andreou and Boahen, 1994). Its diffusivity, or conductivity, is controlled by an applied gate potential, $V_{GC}$. A differential pair with source degeneration via a single diffusor M3 is shown in Fig. 3.6. The same circuit topology, as applied to above-threshold CMOS, can be found in (Tsividis et al., 1986).

Let $V_{S1}$ and $V_{S2}$ be the voltages at the sources of the differential pair, M1 and M2, respectively. Let us define

$$ V_{S1} \equiv V_{CMS} + \frac{V_{DMS}}{2}, \qquad V_{S2} \equiv V_{CMS} - \frac{V_{DMS}}{2} \tag{3.51} $$

where $V_{CMS}$ is the common-mode source voltage, and $V_{DMS}$ is the differential-mode source voltage. They are given by the equations

$$ V_{CMS} \equiv \frac{V_{S1} + V_{S2}}{2}, \qquad V_{DMS} \equiv V_{S1} - V_{S2} \tag{3.52} $$

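We assume that all transistors except M3 operate in saturation and that $V_B$, the bulk potential, is at zero volts. Applying (3.1) to transistors M1 and M2 in Fig. 3.6, we find that

$$
\begin{aligned}
I_1 &= I_0 S_1\, e^{\kappa (V_{CM} + V_{DM}/2)/V_t}\, e^{-(V_{CMS} + V_{DMS}/2)/V_t} \\
I_2 &= I_0 S_1\, e^{\kappa (V_{CM} - V_{DM}/2)/V_t}\, e^{-(V_{CMS} - V_{DMS}/2)/V_t}
\end{aligned}
\tag{3.53}
$$

Writing $I_{DM} = I_1 - I_2$ and simplifying, we get an equation for $I_{DM}$ as follows

$$ I_{DM} = 2 I_0 S_1\, e^{(\kappa V_{CM} - V_{CMS})/V_t} \sinh\left( \frac{\kappa V_{DM} - V_{DMS}}{2V_t} \right) \tag{3.54} $$

Let $I_{12}$ be the current passing from node $V_{S1}$ to $V_{S2}$. It can be written as

$$ I_{12} = I_0 S_3\, e^{\kappa V_{GC}/V_t} \left( e^{-(V_{CMS} - V_{DMS}/2)/V_t} - e^{-(V_{CMS} + V_{DMS}/2)/V_t} \right) \tag{3.55} $$

Using the constraint $I_b = I_1 + I_2$, we find

$$ I_{DM} = I_b \tanh\left( \frac{\kappa V_{DM}}{2V_t} - \frac{V_{DMS}}{2V_t} \right) \tag{3.56} $$

The voltage $V_{DMS}$ is best eliminated from (3.56), since it is a function of $V_{DM}$. To that end, apply Kirchhoff's current law at nodes $V_{S1}$ and $V_{S2}$ to obtain

$$ I_{12} = \frac{I_1 - I_2}{2} \tag{3.57} $$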

Figure 3.6: The differential pair with source degeneration via a single diffusor (a) circuit, and (b) small-signal noise model.

Using (3.55) and (3.57) we can write another equation for $I_{DM}$. After simplifying, we get

$$ I_{DM} = 4 I_0 S_3\, e^{(\kappa V_{GC} - V_{CMS})/V_t} \sinh\left( \frac{V_{DMS}}{2V_t} \right) \tag{3.58} $$

Equating (3.54) and (3.58) and applying hyperbolic identities, we solve for

$$ V_{DMS} = 2V_t \tanh^{-1}\left[ \frac{\sinh\left( \frac{\kappa V_{DM}}{2V_t} \right)}{2\frac{S_3}{S_1}\, e^{\kappa (V_{GC} - V_{CM})/V_t} + \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right)} \right] \tag{3.59} $$

Let

$$ m \equiv \frac{S_3}{S_1} \tag{3.60} $$

be the relative width-to-length ratio and

$$ m_{eff} = \frac{S_3}{S_1}\, e^{\kappa (V_{GC} - V_{CM})/V_t} \tag{3.61} $$

be the "effective" value of $m$. From (3.56) and (3.59) we note that the relative width-to-length ratio and the term $e^{\kappa (V_{GC} - V_{CM})/V_t}$ have the same effect on the voltage $V_{DMS}$ and hence the output current, $I_{DM}$. We would like $m_{eff}$ to be constant, i.e., independent of the input signal. Therefore, we shall assume that the voltage applied to the gate of the diffusor is exactly the common-mode voltage of the input signals, i.e., $V_{GC} = V_{CM}$. Alternatively, one could set $V_{GC}$ higher to achieve a higher effective value for $m$, or lower to simulate a lower one.

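Substituting (3.59) into (3.56), we obtain a complete expression for $I_{DM}$

$$ I_{DM} = I_b \tanh\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\left[ \frac{\sinh\left( \frac{\kappa V_{DM}}{2V_t} \right)}{2m + \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right)} \right] \right) \tag{3.62} $$

Differentiating $I_{DM}$, the transconductance function $G$ can be written as

$$
G = \frac{I_b}{\cosh^2\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\left[ \frac{\sinh\left( \frac{\kappa V_{DM}}{2V_t} \right)}{2m + \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right)} \right] \right)}
\left( \frac{4m^2 + 2m \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right)}{4m^2 + 4m \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right) + 1} \right) \frac{\kappa}{2V_t}
\tag{3.63}
$$

The relative width-to-length ratio $m$ is the single parameter that we have to affect the shape of the transconductance function.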

Two possible criteria for optimizing the linear range are equiripple and maximal flatness. The optimality criterion that we follow is that of maximal flatness, since it is easier to derive analytically and it generally provides for a more robust design strategy against device mismatch.$^2$ With one degree of freedom, the first nonzero derivative of $G$ will be set to zero. By design $G$ is an even function of $V_{DM}$. Therefore its first derivative is zero. Setting the second derivative equal to zero, we find the only positive root occurs at $m = 0.25$. This root is independent of $I_b$, $\kappa$, and $V_t$.

The nominal transconductance value occurs at $V_{DM}$ equal to zero. Thus,

$$ G_o = I_b \frac{\kappa}{2V_t} \left( \frac{2m}{1+2m} \right) = I_b \frac{\kappa}{6V_t} \tag{3.64} $$

A simplified small-signal noise model for the differential pair with source degeneration via a single diffusor is shown in Fig. 3.6(b). Three equations which can be used to solve for the differential noise current $i_{dn} \equiv i_1 - i_2$ are:

$$
\begin{aligned}
i_{dn} &= i_{n1} - i_{n2} - v_{s1}(g_{m1} + g_{mb1} + g_{d1}) + v_{s2}(g_{m2} + g_{mb2} + g_{d2}) \\
i_{n1} - i_{n4} - i_{nf3} + i_{nr3} &= v_{s1}(g_{m1} + g_{mb1} + g_{d1} + g_{d4}) + (v_{s1} - v_{s2})(g_{mf3} + g_{mbf3}) \\
i_{n2} - i_{n5} + i_{nf3} - i_{nr3} &= v_{s2}(g_{m2} + g_{mb2} + g_{d2} + g_{d5}) + (v_{s2} - v_{s1})(g_{mf3} + g_{mbf3})
\end{aligned}
\tag{3.65}
$$

The last two equations are found by summing all currents at nodes $v_{s1}$ and $v_{s2}$, respectively. We assume that the differential pair M1–M2 and the two current sources M4–M5 are matched, i.e., their small-signal parameters are equal. Solving for $i_{dn}$, one can show that

$$
\begin{aligned}
i_{dn} ={}& \frac{(i_{n1} - i_{n2})(g_{d4} + 2g_{mf3} + 2g_{mbf3})}{g_{m1} + g_{mb1} + g_{d1} + 2g_{mf3} + 2g_{mbf3} + g_{d4}} \\
&+ \frac{(i_{n4} - i_{n5} + 2i_{nf3} - 2i_{nr3})(g_{m1} + g_{mb1} + g_{d1})}{g_{m1} + g_{mb1} + g_{d1} + 2g_{mf3} + 2g_{mbf3} + g_{d4}}
\end{aligned}
\tag{3.66}
$$

$^2$ It remains to be proven through a detailed sensitivity analysis that the maximal flatness criterion is indeed more robust than an equiripple design.

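To estimate the power spectrum of the differential noise current, we make the following assumptions. Set $g_{d1} = g_{d4} \approx 0$. Recall that $g_m + g_{mb} = I_{DS}/V_t$. From Appendix A one can deduce that $g_{mf3} = m g_{m1}$, where $m$ is the scaling ratio $S_3/S_1$. In this case the equation for $i_{dn}$ reduces to

$$ i_{dn} = \frac{(i_{n1} - i_{n2})\, 2m}{1 + 2m} + \frac{(i_{n4} - i_{n5} + 2i_{nf3} - 2i_{nr3})}{1 + 2m} \tag{3.67} $$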

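As before, we sum the power spectra of the independent noise sources multiplied by the squared magnitude of their respective transfer functions, yielding

$$
\begin{aligned}
S_{i_{dn}}(\omega) ={}& \left( \frac{2m}{1+2m} \right)^2 \left[ S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \right] \\
&+ \left( \frac{1}{1+2m} \right)^2 \left[ S_{i_{n4}}(\omega) + S_{i_{n5}}(\omega) \right] + \left( \frac{2}{1+2m} \right)^2 \left[ S_{i_{nf3}}(\omega) + S_{i_{nr3}}(\omega) \right]
\end{aligned}
\tag{3.68}
$$

From Appendix A, the power spectrum of the current noise in a single transistor is $2qI_{DS}$. For $V_{DM} = 0$, $I_1 = I_2 = I_b/2$, whereas $I_F = I_R = mI_b/2$ for transistor M3. We obtain

$$ S_{i_{dn}}(\omega) = \left( \frac{2m}{1+2m} \right)^2 2qI_b + \left( \frac{1}{1+2m} \right)^2 2qI_b + \left( \frac{2}{1+2m} \right)^2 2qmI_b = 2qI_b \tag{3.69} $$

This remarkable result indicates that, at a given bias current, the addition of a diffusor between the sources of the two input transistors does not add any net power to the differential output current noise. Whereas for $m = 0$ (no connection) and $m \to \infty$ (the basic differential pair), we expected the noise density to be $2qI_b$, it is surprising that the same result holds for any value of $m$.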
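The cancellation in (3.69) is easy to confirm numerically: the three weights add up to $(4m^2 + 1 + 4m)/(1+2m)^2 = 1$ for every $m$. The sketch below evaluates the noise density relative to $2qI_b$; the function name and the sample values of $m$ are illustrative assumptions.

```python
def relative_noise_density(m):
    """S_idn / (2*q*I_b) for the single-diffusor pair, from Eqs. (3.68)-(3.69)."""
    denom = (1.0 + 2.0 * m) ** 2
    w_pair = (2.0 * m) ** 2 / denom   # weight on S_in1 + S_in2 = 2*q*I_b
    w_src = 1.0 / denom               # weight on S_in4 + S_in5 = 2*q*I_b
    w_diff = 4.0 / denom              # weight on S_inf3 + S_inr3 = 2*q*m*I_b
    return w_pair + w_src + w_diff * m


# Sample ratios, from no connection (m = 0) towards the basic pair (large m)
values = {m: relative_noise_density(m) for m in (0.0, 0.25, 1.0, 10.0, 1e3)}
```

Every entry of `values` equals 1 to within rounding, i.e. the differential noise density stays at $2qI_b$ regardless of the diffusor size.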

To compute the output-referred noise power for the differential pair with a single diffusor, substitute equations for $G_o$ and $S_{i_{dn}}$ into (3.20) to obtain

$$ V^2_{out,n} = 3\, \frac{qV_t}{\kappa C} \tag{3.70} $$

We see that the output-referred noise power is increased by a factor of 3, which is due to the reduction in $G_o$.

The maximum input voltage $V_{max}$ can no longer be determined analytically. If we set the distortion $D_G$ to 1.0%, as before, using numerical techniques to determine the constant term, we can write an expression for the linear range as in

$$ V_{max} = 1.59\, \frac{V_t}{\kappa} \tag{3.71} $$

For the same circuit conditions as before, the linear range is $\pm 58.4$ mV, or roughly eight times that of the basic differential pair. The current efficiency of the differential pair with a single diffusor is 26.5%, or more than 2.5 times that of the simple differential pair. In Fig. 3.7 we plot $G$ as a function of $V_{DM}$.
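One way to obtain the constant in (3.71) numerically is a bisection on the normalized transconductance. The sketch below is an assumption about how such a calculation might be set up, not the procedure used in the original work; it works in the normalized variable $a = \kappa V_{DM}/(2V_t)$ (a shorthand introduced here) and relies on $G/G_o$ decreasing monotonically away from zero for $m = 0.25$.

```python
import math

M = 0.25  # maximally flat ratio for the single diffusor


def g_norm(a, m=M):
    """G/G_o from Eq. (3.63), with a = kappa*V_DM/(2*V_t)."""
    u = a - math.atanh(math.sinh(a) / (2.0 * m + math.cosh(a)))
    slope = (4.0 * m * m + 2.0 * m * math.cosh(a)) / (
        4.0 * m * m + 4.0 * m * math.cosh(a) + 1.0)
    g_o = 2.0 * m / (1.0 + 2.0 * m)   # value of the expression at a = 0
    return slope / math.cosh(u) ** 2 / g_o


def i_norm(a, m=M):
    """I_DM/I_b from Eq. (3.62)."""
    return math.tanh(a - math.atanh(math.sinh(a) / (2.0 * m + math.cosh(a))))


def solve_a_max(d_g=0.01, lo=0.0, hi=5.0, iters=60):
    """Bisection for the a at which G/G_o has dropped to 1 - d_g."""
    target = 1.0 - d_g
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g_norm(mid) > target:
            lo = mid      # still inside the linear range
        else:
            hi = mid
    return 0.5 * (lo + hi)


a_max = solve_a_max()
d_const = 2.0 * a_max                     # constant in V_max = d * V_t / kappa
v_max_mv = d_const * 25.7 / 0.7           # linear range in mV
efficiency_pct = 100.0 * i_norm(a_max)    # current efficiency, Eq. (3.29)
```

With these assumptions the bisection returns a constant close to 1.59, a linear range near 58.4 mV and a current efficiency of about 26.5%.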

To compute the dynamic range for the case of the differential pair with a single diffusor, substitute (3.70) and (3.71) into (3.35) to obtain

$$ DR = 0.422\, \frac{V_t C}{\kappa q} \tag{3.72} $$

For a nominal capacitance value of 5.0 pF and other parameters as before, the dynamic range of this transconductor is 56.8 dB, representing a 13.2 dB increase over that of the basic differential pair, and a 6.4 dB increase over that of the differential pair with diode-connected transistors.

A distinct disadvantage to this differential pair configuration is that it requires additional common-mode circuitry to ensure that the input signals operate around $V_{GC}$. If for some reason the common-mode voltage drifts away from this value, the linear range will be drastically reduced.

Figure 3.7: For the differential pair with source degeneration via a single diffusor, $G$ normalized by the maximal transconductance as a function of $V_{DM}$ (in mV) with $V_t = 25.7$ mV and $\kappa = 0.7$.

3.3.3 Double Diffusors

A differential pair with source degeneration via double diffusors M3 and M4 is shown in Fig. 3.8. The basic topology for this circuit is derived from the work described in (Krummenacher and Joehl, 1988). As for the single diffusor transconductor, we will show that the double diffusor transconductor has one free parameter, $m = S_3/S_1$, the relative aspect ratio of the two matched transistor pairs M1–M2 and M3–M4. On the other hand, this differential pair configuration does not require extra common-mode circuitry.

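Let $V_{S1}$ and $V_{S2}$ be as defined in (3.51). We assume that transistors M1 and M2 operate in saturation and that $V_B$, the bulk potential, is at zero volts. Applying (3.1) to transistors M1 and M2 in Fig. 3.8, we would find the same expressions for $I_1$, $I_2$, and $I_{DM} = I_1 - I_2$ as for the single diffusor case. The difference between these two configurations comes in the expression for $I_{12}$, the current passing from node $V_{S1}$ to $V_{S2}$. An equation for $I_{12}$ can be written as

$$ I_{12} = I_0 S_3 \left( e^{\kappa (V_{CM} + V_{DM}/2)/V_t} + e^{\kappa (V_{CM} - V_{DM}/2)/V_t} \right) \left( e^{-(V_{CMS} - V_{DMS}/2)/V_t} - e^{-(V_{CMS} + V_{DMS}/2)/V_t} \right) \tag{3.73} $$

Using (3.73) and (3.57) we can write another equation for $I_{DM}$. After simplifying, we get

$$ I_{DM} = 8 I_0 S_3\, e^{(\kappa V_{CM} - V_{CMS})/V_t} \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right) \sinh\left( \frac{V_{DMS}}{2V_t} \right) \tag{3.74} $$

Equating (3.54) and (3.74) and applying hyperbolic identities, we solve for

$$ V_{DMS} = 2V_t \tanh^{-1}\left[ \frac{1}{4m+1} \tanh\left( \frac{\kappa V_{DM}}{2V_t} \right) \right] \tag{3.75} $$

Substituting (3.75) into (3.56), we obtain a complete expression for $I_{DM}$

$$ I_{DM} = I_b \tanh\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\left[ \frac{1}{4m+1} \tanh\left( \frac{\kappa V_{DM}}{2V_t} \right) \right] \right) \tag{3.76} $$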

Figure 3.8: The differential pair with source degeneration via double diffusors (a) circuit, and (b) small-signal noise model.

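Differentiating $I_{DM}$, the transconductance function $G$ can be written as

$$
G = \frac{I_b}{\cosh^2\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\left[ \frac{1}{4m+1} \tanh\left( \frac{\kappa V_{DM}}{2V_t} \right) \right] \right)}
\left( \frac{(16m^2 + 8m) \cosh^2\left( \frac{\kappa V_{DM}}{2V_t} \right) - 4m}{(16m^2 + 8m) \cosh^2\left( \frac{\kappa V_{DM}}{2V_t} \right) + 1} \right) \frac{\kappa}{2V_t}
\tag{3.77}
$$

The relative width-to-length ratio $m$ is the single parameter with which we can affect the shape of the transconductance function. Setting the second derivative of $G$ equal to zero, we find the only positive root occurs at $m = 0.5$. This root is independent of $I_b$, $\kappa$, and $V_t$.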
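The maximally flat ratios quoted for the two diffusor designs ($m = 0.25$ for the single diffusor, $m = 0.5$ for the double diffusors) can be checked with a second central difference of the normalized transconductance at $V_{DM} = 0$. The sketch below is illustrative: it uses the normalized variable $a = \kappa V_{DM}/(2V_t)$, a step size chosen for the example, and hypothetical function names.

```python
import math


def g_single(a, m):
    """Shape of Eq. (3.63), without the common factor I_b*kappa/(2*V_t)."""
    u = a - math.atanh(math.sinh(a) / (2.0 * m + math.cosh(a)))
    slope = (4.0 * m * m + 2.0 * m * math.cosh(a)) / (
        4.0 * m * m + 4.0 * m * math.cosh(a) + 1.0)
    return slope / math.cosh(u) ** 2


def g_double(a, m):
    """Shape of Eq. (3.77), without the common factor I_b*kappa/(2*V_t)."""
    u = a - math.atanh(math.tanh(a) / (4.0 * m + 1.0))
    k = 16.0 * m * m + 8.0 * m
    c2 = math.cosh(a) ** 2
    return (k * c2 - 4.0 * m) / (k * c2 + 1.0) / math.cosh(u) ** 2


def curvature(g, m, h=1e-2):
    """Second central difference of G/G_o at a = 0."""
    return (g(h, m) - 2.0 * g(0.0, m) + g(-h, m)) / (h * h * g(0.0, m))


# Curvature changes sign at the maximally flat ratio of each design
single = {m: curvature(g_single, m) for m in (0.2, 0.25, 0.3)}
double = {m: curvature(g_double, m) for m in (0.4, 0.5, 0.6)}
```

Below the optimum the transconductance bulges upward away from the origin (positive curvature), above it the curve droops immediately (negative curvature), and at the optimum the curvature vanishes to within the finite-difference error.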

The nominal value of $G$ occurs at $V_{DM}$ equal to zero. Thus,

$$ G_o = I_b \frac{\kappa}{2V_t} \left( \frac{4m}{4m+1} \right) = I_b \frac{\kappa}{3V_t} \tag{3.78} $$

A simplified small-signal noise model for the differential pair with source degeneration via double diffusors is shown in Fig. 3.8(b). The analysis of this circuit is quite similar to the case of a single diffusor, except that there are four independent noise sources between nodes $v_{s1}$ and $v_{s2}$, and the conductance between these two nodes is double, reflecting the fact that there are two diffusors in parallel. Assuming precise matching of the three transistor pairs, M1–M2, M3–M4, and M5–M6, one can write an expression for the differential output noise current as

$$
\begin{aligned}
i_{dn} ={}& \frac{(i_{n1} - i_{n2})(g_{d5} + 4g_{mf3} + 4g_{mbf3})}{g_{m1} + g_{mb1} + g_{d1} + 4g_{mf3} + 4g_{mbf3} + g_{d5}} \\
&+ \frac{(i_{n5} - i_{n6} + 2i_{nf3} - 2i_{nr3} + 2i_{nf4} - 2i_{nr4})(g_{m1} + g_{mb1} + g_{d1})}{g_{m1} + g_{mb1} + g_{d1} + 4g_{mf3} + 4g_{mbf3} + g_{d5}}
\end{aligned}
\tag{3.79}
$$

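To estimate the power spectrum of the differential noise current, we make the following assumptions. Set $g_{d1} = g_{d5} \approx 0$. Recall that $g_m + g_{mb} = I_{DS}/V_t$. From Appendix A we can deduce that $g_{mf3} = m g_{m1}$, where $m$ is the scaling ratio $S_3/S_1$. The equation for $i_{dn}$ simplifies to

$$ i_{dn} = \frac{(i_{n1} - i_{n2})\, 4m}{1 + 4m} + \frac{(i_{n5} - i_{n6} + 2i_{nf3} - 2i_{nr3} + 2i_{nf4} - 2i_{nr4})}{1 + 4m} \tag{3.80} $$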

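As previously done, we sum the power spectra of the independent noise sources multiplied by the squared magnitude of their respective transfer functions.

$$
\begin{aligned}
S_{i_{dn}}(\omega) ={}& \left( \frac{4m}{1+4m} \right)^2 \left[ S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \right] + \left( \frac{1}{1+4m} \right)^2 \left[ S_{i_{n5}}(\omega) + S_{i_{n6}}(\omega) \right] \\
&+ \left( \frac{2}{1+4m} \right)^2 \left[ S_{i_{nf3}}(\omega) + S_{i_{nr3}}(\omega) + S_{i_{nf4}}(\omega) + S_{i_{nr4}}(\omega) \right]
\end{aligned}
\tag{3.81}
$$

From Appendix A, the power spectrum of the current noise in a single transistor is $2qI_{DS}$. For $V_{DM} = 0$, $I_1 = I_2 = I_b/2$, whereas $I_F = I_R = mI_b/2$ for transistors M3 and M4. We obtain

$$ S_{i_{dn}}(\omega) = \left( \frac{4m}{1+4m} \right)^2 2qI_b + \left( \frac{1}{1+4m} \right)^2 2qI_b + \left( \frac{2}{1+4m} \right)^2 (2qmI_b + 2qmI_b) = 2qI_b \tag{3.82} $$

As for the case of the single diffusor, the addition of two diffusors between the sources of the differential pair does not add to the differential output current noise density.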

To compute the output-referred noise power for the differential pair with double diffusors, substitute equations for $G_o$ and $S_{i_{dn}}$ into (3.20) to obtain

$$ V^2_{out,n} = \frac{3}{2} \cdot \frac{qV_t}{\kappa C} \tag{3.83} $$

We see that the output-referred noise power is increased by the factor $3/2$, which is due to a similar reduction in $G_o$.

If we set the distortion $D_G$ to 1.0%, using numerical techniques one can show that the maximum input voltage is

$$ V_{max} = 0.795\, \frac{V_t}{\kappa} \tag{3.84} $$

For the same circuit conditions as before, we obtain a linear range of $\pm 29.2$ mV, or exactly one-half that of the transconductor with a single diffusor. The current efficiency of the differential pair with double diffusors is 26.5%, identical to that of the single diffusor design. In Fig. 3.9 we plot normalized transconductance as a function of $V_{DM}$.
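The "exactly one-half" relation and the identical current efficiency can be understood by comparing the two transfer curves directly. The sketch below evaluates (3.62) with $m = 0.25$ and (3.76) with $m = 0.5$ in the normalized variable $a = \kappa V_{DM}/(2V_t)$ and checks that the double-diffusor curve at $a$ coincides with the single-diffusor curve at $2a$. The function names and the grid are illustrative assumptions.

```python
import math


def i_single(a, m=0.25):
    """I_DM/I_b of the single-diffusor pair, Eq. (3.62)."""
    return math.tanh(a - math.atanh(math.sinh(a) / (2.0 * m + math.cosh(a))))


def i_double(a, m=0.5):
    """I_DM/I_b of the double-diffusor pair, Eq. (3.76)."""
    return math.tanh(a - math.atanh(math.tanh(a) / (4.0 * m + 1.0)))


# Compare i_double(a) with i_single(2a) on a grid of normalized inputs
grid = [0.05 * k for k in range(0, 61)]          # a from 0 to 3
max_mismatch = max(abs(i_double(a) - i_single(2.0 * a)) for a in grid)
```

Because the two curves differ only by a factor of two on the voltage axis, any distortion limit is reached at half the input voltage and at the same fraction of the bias current, which is consistent with (3.71), (3.84) and the common 26.5% efficiency.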

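The dynamic range for the differential pair with double diffusors can be found by substituting (3.83) and (3.84) into (3.35), to obtain

$$ DR = 0.211\, \frac{V_t C}{\kappa q} \tag{3.85} $$

For a nominal capacitance value of 5.0 pF and other parameters as before, the dynamic range of this transconductor is 53.8 dB, representing a 10.2 dB increase over that of the basic differential pair, and only a 3.0 dB decrease compared to that of the single diffusor.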

3.4 Multiple Differential Pairs

Another technique for linearizing the basic transconductor employs a multiplicity of asymmetric differential pairs (Tanimoto et al., 1991). This technique was originally applied to bipolar junction transistors (BJTs). Here it is extended to subthreshold CMOS design, where CMOS device characteristics are similar to those of BJTs. This technique is enhanced with the proposed use of the substrate terminal to effectively modify the width-to-length ratio of the transistor.

3.4.1 Two Differential Pairs

A transconductor with two differential pairs is shown in Fig. 3.10. It consists of two pairs of unequal size transistors and two current sources. The transistor M1a is $m$ times wider than M2a. Conversely, transistor M2b is $m$ times wider than M1b. The effect of sizing the transistors in this way is to create an intentional voltage offset. Note that the same effect could be obtained by the use of floating gate transistors.

Figure 3.9: For the differential pair with source degeneration via double diffusors, $G$ normalized by the maximal transconductance as a function of $V_{DM}$ (in mV) with $V_t = 25.7$ mV and $\kappa = 0.7$.

Let $V_{S1}$ be the voltage at the source of the transistor pair M1a–M2a, and $V_{S2}$ be the voltage at the source of the transistor pair M1b–M2b. We assume that all transistors are operating in saturation and that $V_B$, the bulk potential, is at zero volts. For the current transconductor configuration, let the relative transistor ratio be defined as

$$m \equiv \frac{S_{1a}}{S_{2a}} \equiv \frac{S_{2b}}{S_{1b}} \tag{3.86}$$

It is helpful to write the relative transistor ratio as follows:

$$m = e^{\ln m} \tag{3.87}$$

Applying (3.1) to transistors M1a and M2a in Fig. 3.10, one finds that

$$I_{1a} = I_0 S_{2a}\, e^{\ln m}\, e^{\kappa (V_{CM} + V_{DM}/2)/V_t}\, e^{-V_{S1}/V_t}$$

$$I_{2a} = I_0 S_{2a}\, e^{\kappa (V_{CM} - V_{DM}/2)/V_t}\, e^{-V_{S1}/V_t} \tag{3.88}$$

where $S_{2a}$ is the width-to-length ratio of transistor M2a. From (3.88) and the constraint $I_{1a} + I_{2a} = I_b/2$, we can write an expression for the difference current, $I_{DMa}$, as in

$$I_{DMa} = \frac{I_b}{2}\tanh\!\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right) \tag{3.89}$$

A similar, but complementary, equation can be derived for the transistor pair M1b–M2b.

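The total differential current is therefore

$$I_{DM} = \frac{I_b}{2}\tanh\!\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right) + \frac{I_b}{2}\tanh\!\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right) \tag{3.90}$$

The transconductance as a function of $V_{DM}$ is

$$G = \frac{\kappa I_b}{2V_t}\left[\frac{1}{2\cosh^2\!\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right)} + \frac{1}{2\cosh^2\!\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right)}\right] \tag{3.91}$$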

Figure 3.10: A transconductor with two asymmetric differential pairs.

The relative width-to-length ratio $m$ of the transistor pairs will be used to obtain a maximally flat shape for the transconductance function. Setting the second derivative of $G$ equal to zero, we find that the only positive root that is greater than one occurs at $m = 2 + \sqrt{3}$. Due to symmetry, a second root occurs at $m = 1/(2 + \sqrt{3})$. These roots are independent of $I_b$, $\kappa$, and $V_t$. In particular, for a value of $\kappa = 1$, the current-to-voltage characteristics of a CMOS transistor operating in the subthreshold saturation region look identical to those of a bipolar transistor operating in its active region. As a consequence, this optimum relative width-to-length ratio holds for both subthreshold CMOS and BJT design.
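The maximal-flatness condition can be checked numerically. The sketch below assumes that each pair contributes a sech² term to G, as in (3.91), uses the closed-form second derivative of sech², and bisects for the ratio m > 1 at which the curvature of G at V_DM = 0 vanishes. The function names are illustrative and not part of the original work.

```python
import math

def sech2_second_derivative(x):
    """Second derivative of sech^2(x): sech^2(x) * (6*tanh^2(x) - 2)."""
    t = math.tanh(x)
    return (1.0 - t * t) * (6.0 * t * t - 2.0)

def curvature_at_origin(m):
    """Curvature of G at V_DM = 0, up to a positive scale factor.

    G is proportional to sech^2(x + c) + sech^2(x - c) with c = ln(m)/2,
    so by symmetry its second derivative at x = 0 is 2 * s''(c).
    """
    c = 0.5 * math.log(m)
    return 2.0 * sech2_second_derivative(c)

def find_flat_ratio(lo=1.0 + 1e-9, hi=100.0, iterations=200):
    """Bisect for the m > 1 at which the curvature changes sign."""
    f_lo = curvature_at_origin(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = curvature_at_origin(mid)
        if (f_mid < 0.0) == (f_lo < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m_opt = find_flat_ratio()
print(m_opt, 2.0 + math.sqrt(3.0))
```

Because the curvature depends on m only through tanh(ln(m)/2), the root does not involve the bias current, the slope factor, or the thermal voltage, which is the independence noted in the text.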

The nominal value for $G$ occurs at $V_{DM}$ equal to zero. Thus,

$$G_o = \frac{I_b \kappa}{2V_t}\left(\frac{4}{m + 2 + 1/m}\right) \tag{3.92}$$

It is not obvious, but, in fact, the transconductor with two asymmetric differential pairs with $m = 2 + \sqrt{3}$ has the same voltage-to-current relationship as that of the transconductor with double diffusors with $m = 0.5$. We omit the proof for brevity. However, the key to showing this identity is found by making use of the hyperbolic identity:

$$\tanh(a \pm b) = \frac{\tanh(a) \pm \tanh(b)}{1 \pm \tanh(a)\tanh(b)} \tag{3.93}$$
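As one illustrative step in that direction (a sketch, not the omitted proof itself), the identity (3.93) can be used to collapse the two hyperbolic tangents of (3.90) into a single rational function of tanh. The shorthand x = κV_DM/(2V_t), t = tanh x, c = (ln m)/2 and t_c = tanh c is introduced here only for brevity.

```latex
\begin{align*}
\tanh(x + c) + \tanh(x - c)
  &= \frac{t + t_c}{1 + t\,t_c} + \frac{t - t_c}{1 - t\,t_c}
   = \frac{2t\,(1 - t_c^2)}{1 - t^2 t_c^2}, \\
m = 2 + \sqrt{3} \;\Rightarrow\; t_c^2
  &= \left(\frac{m - 1}{m + 1}\right)^2 = \frac{1}{3}, \\
I_{DM} &= \frac{I_b}{2}\cdot\frac{2t\,(2/3)}{1 - t^2/3}
        = \frac{2 I_b \tanh x}{3 - \tanh^2 x}.
\end{align*}
```

The slope of this expression at V_DM = 0 is κI_b/(3V_t), which agrees with (3.92) evaluated at m = 2 + √3.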

The noise of a transconductor with asymmetric differential pairs is essentially due to the bias current (Tanimoto et al., 1991). A laborious small-signal noise model can be used to verify their result. As such, the equations for input-referred noise, output noise power, linear range, current efficiency, and dynamic range are the same as those for the double diffusors.

In particular, allowing 1.0% distortion, using numerical techniques one can show that the maximum input voltage is

$$V_{\max} = 0.795\,\frac{V_t}{\kappa} \tag{3.94}$$

The linear range is ±29.2 mV, identical to that of the transconductor with double diffusors. The current efficiency is 26.5%, identical to that of the single and double diffusor designs. In Fig. 3.11 we plot normalized transconductance as a function of $V_{DM}$. It is indistinguishable from that shown in Fig. 3.9.

As for the double diffusor design, the dynamic range of the transconductor with two asymmetric differential pairs is

$$DR = 0.211\,\frac{V_t C}{\kappa q} \tag{3.95}$$

With $C = 5$ pF (Table 3.1), the dynamic range of this transconductor is 53.8 dB.

Similar to the double diffusor transconductor, no common-mode circuitry is required for this transconductor. A minor advantage of the double diffusor design over that of the two asymmetric differential pairs is that the optimum relative width-to-length ratio is very simple in the former case, $m = 0.5$, and quite complicated in the latter case, $m = 2 + \sqrt{3} \approx 3.73$. Indeed, a relative width-to-length ratio of 4 is adopted in many practical designs which use two asymmetric differential pairs.

3.4.2 Three Differential Pairs

It is possible to have more than two asymmetric differential pairs in order to increase the linear range and current efficiency (Tanimoto et al., 1991). As in the previous design, we find that the optimal width-to-length ratios and current source ratios are the same for bipolar circuits as for subthreshold CMOS, despite the presence of $\kappa$. In Fig. 3.12 we show the circuit for three asymmetric differential pairs.

Figure 3.11: For a transconductor with two asymmetric differential pairs, $G$ normalized by the maximal transconductance as a function of $V_{DM}$ [mV], with $V_t = 25.7$ mV and $\kappa = 0.7$.

Using the results from the case of two asymmetric differential pairs, we can write an equation for the total differential current as

$$I_{DM} = \alpha I_b \tanh\!\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right) + (1 - 2\alpha) I_b \tanh\!\left(\frac{\kappa V_{DM}}{2V_t}\right) + \alpha I_b \tanh\!\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right) \tag{3.96}$$

where the relative width-to-length ratio is now defined as

$$m \equiv \frac{S_{1a}}{S_{2a}} \equiv \frac{S_{2c}}{S_{1c}} \tag{3.97}$$

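The transconductance as a function of $V_{DM}$ is therefore

$$G = \frac{\kappa}{2V_t}\left[\frac{\alpha I_b}{\cosh^2\!\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right)} + \frac{(1 - 2\alpha) I_b}{\cosh^2\!\left(\frac{\kappa V_{DM}}{2V_t}\right)} + \frac{\alpha I_b}{\cosh^2\!\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right)}\right] \tag{3.98}$$

The relative width-to-length ratio $m$ of two of the transistor pairs and the relative strength of the two sets of bias currents, $\alpha$, are the two parameters available for shaping the transconductance function.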

As before, we follow the maximal flatness criterion for optimizing the linear range. Setting the second and fourth derivatives of $G$ simultaneously equal to zero, we find that the only positive root of $m$ that is greater than one occurs at $m = 4 + \sqrt{15}$. The only positive root of $\alpha$ occurs at $\alpha = 25/66$. These roots are independent of $I_b$, $\kappa$, and $V_t$.
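These two roots can be verified without symbolic algebra. The sketch below assumes the sech² form of (3.98) and differentiates with respect to the normalized input x = κV_DM/(2V_t), which changes the derivatives only by a constant factor. It represents sech²(x) as the polynomial 1 − t² in t = tanh(x) and applies d/dx p(t) = p'(t)(1 − t²) repeatedly. All function names are illustrative.

```python
import math

def d_dx(poly):
    """Differentiate a polynomial in t = tanh(x) with respect to x.

    poly[k] is the coefficient of t**k. Since dt/dx = 1 - t**2,
    d/dx p(t) = p'(t) * (1 - t**2).
    """
    dp = [k * poly[k] for k in range(1, len(poly))]  # p'(t)
    out = [0.0] * (len(dp) + 2)
    for k, coeff in enumerate(dp):
        out[k] += coeff        # p'(t) * 1
        out[k + 2] -= coeff    # p'(t) * (-t**2)
    return out

def evaluate(poly, t):
    """Evaluate the polynomial at t."""
    return sum(coeff * t ** k for k, coeff in enumerate(poly))

def sech2_derivative(order, x):
    """order-th derivative of sech^2(x) = 1 - tanh^2(x)."""
    poly = [1.0, 0.0, -1.0]
    for _ in range(order):
        poly = d_dx(poly)
    return evaluate(poly, math.tanh(x))

def g_derivative_at_origin(order, m, alpha):
    """order-th derivative of the normalized three-pair G at x = 0."""
    c = 0.5 * math.log(m)
    outer = sech2_derivative(order, c) + sech2_derivative(order, -c)
    return alpha * outer + (1.0 - 2.0 * alpha) * sech2_derivative(order, 0.0)

m = 4.0 + math.sqrt(15.0)
alpha = 25.0 / 66.0
d2 = g_derivative_at_origin(2, m, alpha)
d4 = g_derivative_at_origin(4, m, alpha)
print(d2, d4)
```

Both printed values should vanish to within floating-point rounding, while other choices of m or α leave at least one of them nonzero.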

The nominal value of $G$ occurs at $V_{DM}$ equal to zero. Thus,

$$G_o = \frac{I_b \kappa}{2V_t}\left[\frac{8\alpha}{m + 2 + 1/m} + 1 - 2\alpha\right] \tag{3.99}$$

Assuming that the differential output current noise is essentially due to the bias current (Tanimoto et al., 1991), the output-referred noise can be written as

$$V_{out,n}^2 = \frac{11}{6}\,\frac{q V_t}{\kappa C} \tag{3.100}$$

where the factor 11/6 is due to a similar reduction in $G_o$.

If we set the distortion $D_G$ to 1.0%, as before, using numerical techniques to determine the constant, we can write the expression for the linear range as in

$$V_{\max} = 1.34\,\frac{V_t}{\kappa} \tag{3.101}$$

For the same circuit conditions as before, we obtain a linear range of ±49.1 mV, which is 67% higher than the linear range of the transconductor with two asymmetric differential pairs. The current efficiency of the transconductor using three asymmetric differential pairs is 36.4%. In Fig. 3.13 we plot $G$ as a function of $V_{DM}$.
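The numerical constants in (3.94) and (3.101) can be approximated with a short root search. The sketch below rests on an assumption, since the definition of D_G lies outside this section: it takes the distortion to be the fractional drop of G below its nominal value G_o. The normalized input is x = κV_DM/(2V_t), and the function names are illustrative.

```python
import math

def normalized_g(x, m, alpha):
    """G(x)/G(0) for asymmetric pairs, with x = kappa*V_DM/(2*V_t).

    alpha weights each outer (offset) pair and 1 - 2*alpha the centre pair;
    alpha = 0.5 reduces to the two-pair circuit.
    """
    c = 0.5 * math.log(m)

    def sech2(u):
        return 1.0 / math.cosh(u) ** 2

    def g(u):
        return alpha * (sech2(u + c) + sech2(u - c)) + (1.0 - 2.0 * alpha) * sech2(u)

    return g(x) / g(0.0)

def linear_range_coefficient(m, alpha, distortion=0.01):
    """Return V_max * kappa / V_t at which G has dropped by `distortion`."""
    lo, hi = 0.0, 5.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if normalized_g(mid, m, alpha) > 1.0 - distortion:
            lo = mid
        else:
            hi = mid
    x_max = 0.5 * (lo + hi)
    return 2.0 * x_max  # V_DM = 2 * x * V_t / kappa

coef_two = linear_range_coefficient(2.0 + math.sqrt(3.0), 0.5)
coef_three = linear_range_coefficient(4.0 + math.sqrt(15.0), 25.0 / 66.0)
scale_mv = 25.7 / 0.7  # V_t / kappa in mV
print(coef_two, coef_two * scale_mv)
print(coef_three, coef_three * scale_mv)
```

Under this assumed definition the two coefficients should land close to the 0.795 and 1.34 quoted in the text, and the corresponding voltages close to the ±29.2 mV and ±49.1 mV linear ranges.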

To compute the dynamic range for the case of three asymmetric differential pairs, substitute (3.100) and (3.101) into (3.35) to obtain

$$DR = 0.486\,\frac{V_t C}{\kappa q} \tag{3.102}$$

With $C = 5$ pF (Table 3.1), this gives a dynamic range of 57.5 dB, a 3.7 dB increase over two asymmetric differential pairs, and only a 0.7 dB increase compared to that of the single diffusor.
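The dynamic-range expressions can be turned into decibel figures directly. The sketch below assumes that DR is a power ratio (hence 10 log10) and uses the nominal parameter values quoted for Table 3.1; the constant and function names are illustrative.

```python
import math

Q_E = 1.602e-19   # electron charge [C]
V_T = 25.7e-3     # thermal voltage [V]
KAPPA = 0.7       # subthreshold slope factor
C_INT = 5e-12     # integrating capacitance [F], as in Table 3.1

def dynamic_range_db(coefficient):
    """DR = coefficient * V_t * C / (kappa * q), expressed in dB."""
    ratio = coefficient * V_T * C_INT / (KAPPA * Q_E)
    return 10.0 * math.log10(ratio)

dr_two = dynamic_range_db(0.211)    # coefficient of (3.85) and (3.95)
dr_three = dynamic_range_db(0.486)  # coefficient of (3.102)
print(round(dr_two, 1), round(dr_three, 1))
```

The results can be compared with the double-diffusor, two-pair, and three-pair entries of Table 3.1.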

The transconductor built using three asymmetric differential pairs seems to have every possible advantage. The dynamic range is very high and it does not require common-mode circuitry. However, the matching requirements now become increasingly severe. Not only must the differential pairs maintain a precise relative aspect ratio $m = 4 + \sqrt{15} \approx 7.87$, but also the current sources need a relative sizing of $\alpha$-to-$(1 - 2\alpha)$, or 1.56-to-1.

Figure 3.12: A transconductor with three asymmetric differential pairs.

Figure 3.13: For a transconductor with three asymmetric differential pairs, $G$ normalized by the maximal transconductance as a function of $V_{DM}$ [mV], with $V_t = 25.7$ mV and $\kappa = 0.7$.

3.4.3 Substrate Biasing Technique

The CMOS transistor is a symmetric four-terminal device, which has effectively two voltage-control nodes, the gate and the substrate. The use of the substrate as a control node is not common, although it has recently found increasing use (Cohen and Andreou, 1992; Sarpeshkar et al., 1996). It seems that if a twin-tub process became available through low-cost foundry services, the substrate would be more fully exploited as a second input terminal voltage.

The equation for the current through an NMOS device in saturation can be written in the form (see Appendix A),

$$I_{DS} = I_0 S\, e^{(1-\kappa)V_B/V_t}\, e^{(\kappa V_G - V_S)/V_t} \tag{3.103}$$

Note that both the width-to-length ratio $S$ and the bulk voltage $V_B$ can be used to modulate the output current, independent of $V_G$ and $V_S$. If we define an effective width-to-length ratio as

$$S_{eff} = S\, e^{(1-\kappa)V_B/V_t} \tag{3.104}$$

then it is possible to continuously adjust $S_{eff}$ via the bulk potential. Note that the interface between the source and the bulk is a reverse-biased p-n junction. Therefore we do not want to increase $V_B$ relative to $V_S$. Rather, we must ensure that $V_B$ is always less than or equal to the minimum expected source voltage. In general, we are limited then to reducing the effective width-to-length ratio using the substrate biasing technique.

As a sample design, consider Fig. 3.14. We would like to achieve the scaling ratio of $(2 + \sqrt{3})$-to-1, which results in a maximally flat transconductance function for the case of two asymmetric differential pairs. At the mask level, we give all four transistors the same aspect ratio $S$. We then reduce the bulk potential $V_{B2}$ of the two inside transistors (M2a and M1b) relative to the bulk potential $V_{B1}$ of the two outside transistors (M1a and M2b) such that

$$m_{eff} \equiv \frac{S_{eff,1a}}{S_{eff,2a}} = 2 + \sqrt{3} \tag{3.105}$$

Substituting (3.104) for each transistor in the above equation, we have

$$m_{eff} = \frac{S \exp\left[(1-\kappa)V_{B1}/V_t\right]}{S \exp\left[(1-\kappa)V_{B2}/V_t\right]} = \exp\left[(1-\kappa)(V_{B1} - V_{B2})/V_t\right] = 2 + \sqrt{3} \tag{3.106}$$

For nominal parameter values of $V_t = 25.7$ mV and $\kappa = 0.7$, we find $V_{B1} - V_{B2} = 113$ mV, i.e., we need to lower $V_{B2}$ by 113 mV with respect to $V_{B1}$.
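Solving (3.106) for the bulk-voltage difference gives V_B1 − V_B2 = V_t ln(m_eff)/(1 − κ). The short sketch below evaluates this for the nominal parameters; the function name and default arguments are illustrative.

```python
import math

def bulk_voltage_offset(m_eff, v_t=25.7e-3, kappa=0.7):
    """Bulk-voltage difference V_B1 - V_B2 [V] that yields the ratio m_eff.

    Obtained by taking the logarithm of (3.106):
    (1 - kappa) * (V_B1 - V_B2) / V_t = ln(m_eff).
    """
    return v_t * math.log(m_eff) / (1.0 - kappa)

delta_vb = bulk_voltage_offset(2.0 + math.sqrt(3.0))
print(round(delta_vb * 1e3, 1), "mV")
```

The same function shows how the required offset shifts when κ or V_t drifts from its nominal value, which is the sensitivity discussed in the following paragraph.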

One notable drawback to this technique is that the optimal value of $V_{B1} - V_{B2}$ now depends on $\kappa$ and $V_t$. Therefore, these parameters either must be known a priori and be held constant, or a circuit is needed that continually adapts the bulk voltages to their desired values. Clearly, further research into this method is warranted.

3.5 Hints on Improved Transconductor Design

3.5.1 Use of the Gate Capacitance

In the CMOS process available through foundry services, the gate-oxide thickness is approximately half that of the capacitor-oxide thickness. As such, denser designs can be achieved through the use of the input gate as the integrating capacitance. In addition, by using large-area devices, flicker noise, which we have ignored for the most part, becomes less noticeable.

One possible drawback to the use of input gates as the integrating capacitance

is that the gate capacitance is weakly nonlinear. It may be, though, that by using

a constant gm input stage this nonlinearity will be canceled. More research is

necessary in this area.

Figure 3.14: A transconductor with two asymmetric differential pairs demonstrating the substrate biasing technique to achieve a maximally flat transconductance function.

3.5.2 Voltage-Splitting

One technique to double the linear range of a given transconductor design is known as voltage-splitting (Torrance et al., 1985). Its merits are that it is relatively simple to implement and that it automatically computes the common-mode input voltage. As such, this technique can be easily incorporated into a differential-input differential-output design with only the addition of a common-mode feedback circuit from the input of one stage to the output of the preceding stage. The common-mode feedback circuit adds to the overall power dissipation, and so needs to be carefully considered in terms of the overall improvement in dynamic range.

One good transconductor design that uses this technique is found in (Silva-Martinez et al., 1990). In their design the MOS transistors operate above threshold. Similar in topology, but different in operation, is the circuit shown in Fig. 3.15, in which all the transistors operate in subthreshold. In this figure, the voltage-splitting technique is applied to the optimum transconductor design using double diffusors. In simulation and analysis, we achieve a factor of two improvement in linear range.

We fabricated a ten-channel version of the Liu filter bank model described in Chapter 4 with the transconductor design of Fig. 3.15. Rather than applying common-mode feedback, we chose a single-ended output using a current mirror. Preliminary measurements indicate an improved linear range using this technique.

We have not performed a detailed noise analysis. However, from the filter structure, it appears that the noise will be greater than $2qI_b$ since a fraction of the bias current is consumed in computing the common-mode voltage. Another potential difficulty with this design is that the matching requirements now extend to the four current sources shown in Fig. 3.15, the six differential input transistors, as well as the four diffusors.

Figure 3.15: Application of the voltage-splitting technique to the transconductor with source degeneration via double diffusors.

3.5.3 Class-AB Operation

It is noted in (Groenewold, 1991) that class-AB transconductors have the lowest noise factor since there is no extra bias element. In particular, the output-referred noise current density is doubled, whereas the nominal transconductance is also doubled by the effect of the push-pull configuration. Referred to the input voltage, then, the noise factor is unchanged. On the other hand, if the transconductor is class-B, the output-referred noise current density is again doubled, but the nominal transconductance is unchanged. As a result, the noise factor is doubled.
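The bookkeeping behind this argument can be written out explicitly. The sketch below assumes the noise-factor convention of (4.4) in Chapter 4, in which the input-referred voltage noise density of a transconductor equals 4kT times the noise factor divided by G. The symbols S_i (output noise current density), S_v (input-referred voltage noise density) and γ (noise factor) are labels chosen here for illustration.

```latex
\begin{align*}
S_v &= \frac{S_i}{G^2} = \frac{4kT\gamma}{G}
  \quad\Longrightarrow\quad \gamma = \frac{S_i}{4kT\,G}, \\
S_i \to 2S_i,\; G \to 2G &: \quad \gamma \to \frac{2S_i}{4kT\,(2G)} = \gamma, \\
S_i \to 2S_i,\; G \to G  &: \quad \gamma \to \frac{2S_i}{4kT\,G} = 2\gamma.
\end{align*}
```

The noise factor is therefore preserved only when the added noise current is accompanied by a matching increase in transconductance.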

Therefore, a low-noise transconductor design would, at least in theory, be complementary. In order for the complementary design to function well, it is also required that the parameters for the native and complementary devices be well matched. In a simulation with well-matched nMOS and pMOS transistors, we find that a class-AB complementary version of the transconductor with double diffusors operates with the desired result. This circuit is depicted in Fig. 3.16. In a fully-integrated implementation of this circuit, there is the possibility of a systematic difference in $\kappa$ between native and complementary devices, which would limit the achievable dynamic range. This difference would be largely due to differences in the dopant concentration between the bulk of the native device (the substrate), and that of the complementary device (the well).

3.6 Experimental Results

To date, two experiments have been made to verify the analysis and simulation of

the transconductors presented in this chapter.

3.6.1 Static Measurements

Static measurements of the basic differential pair, the differential pair with source degeneration using single diffusors, and the differential pair with source degeneration using double diffusors were made using ADC488 and DAC488 analog-to-digital and digital-to-analog converter modules under control of a PC-compatible computer. All transistors had a width of 2000 µm and a length of 20 µm. Chips were fabricated in a standard 2 µm n-well process. Relative transistor aspect ratios of 2 and 0.5 were achieved by combining transistors in parallel and in series, respectively.

Figure 3.16: Fully complementary transconductor design based on source degeneration via double diffusors.

The output voltages of the DAC488 were attenuated and lowpass filtered at a very low cutoff frequency before being presented to the differential inputs of the transconductors. The output currents were converted to voltages using low-value resistors and a buffer/amplifier. These voltages, in turn, were digitized by the ADC488. Care was taken to ensure that the common-mode voltage remained unchanged in these experiments. In Fig. 3.17, the differential output current data is plotted as a function of differential input voltage.

In software, a finite difference method was used to compute the normalized transconductance function. The output current data was first smoothed, and then differentiated with respect to the (known) differential input voltage. The DC offset and maximum transconductance values were estimated using a least-squares fit of a subset of the data to a parabolic function. As such, in the plot of Fig. 3.18, the transconductance function is normalized by the maximum value, and the DC offset is removed from the data.
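The processing chain described above can be sketched as follows. This is a minimal illustration on synthetic data, not the software used for the measurements: the boxcar smoother, the central-difference scheme, the ±15 mV fitting window, and the 3 mV offset are all assumptions made for the example, and every function name is hypothetical.

```python
import math

def moving_average(values, half_width=2):
    """Centred boxcar smoothing; the window shrinks near the two ends."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def central_difference(x, y):
    """Finite-difference slope dy/dx at the interior sample points."""
    xs = x[1:-1]
    slopes = [(y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1])
              for i in range(1, len(x) - 1)]
    return xs, slopes

def fit_parabola(x, y):
    """Least-squares fit y = a + b*x + c*x**2 (3x3 normal equations)."""
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    rows = [[s[0], s[1], s[2], t[0]],
            [s[1], s[2], s[3], t[1]],
            [s[2], s[3], s[4], t[2]]]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [u - factor * v for u, v in zip(rows[r], rows[col])]
    return [rows[i][3] / rows[i][i] for i in range(3)]

# Synthetic, noise-free stand-in for measured data (all values hypothetical):
# a basic differential pair, I = tanh(k*(V - V_OFFSET_MV)), in units of Ib.
V_T_MV, KAPPA, V_OFFSET_MV = 25.7, 0.7, 3.0
k = KAPPA / (2.0 * V_T_MV)                     # 1/mV
v_dm = [float(v) for v in range(-100, 101)]    # mV
i_dm = [math.tanh(k * (v - V_OFFSET_MV)) for v in v_dm]

v_mid, g = central_difference(v_dm, moving_average(i_dm))

# Fit only a subset of points near the peak, then read off the vertex.
subset = [(v, gi) for v, gi in zip(v_mid, g) if abs(v) <= 15.0]
a, b, c = fit_parabola([p[0] for p in subset], [p[1] for p in subset])
offset_estimate = -b / (2.0 * c)               # mV
g_max_estimate = a - b * b / (4.0 * c)         # Ib per mV
g_normalized = [gi / g_max_estimate for gi in g]
print(offset_estimate, g_max_estimate / k)
```

The vertex of the fitted parabola supplies both quantities at once: its abscissa estimates the DC offset and its ordinate the maximum transconductance used for normalization.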

From Figs. 3.17 and 3.18 it is evident that an improvement in linear range is obtained for both the single and double diffusor transconductors, as compared to the basic differential pair. On the other hand, this improvement is not as high as one would hope for. One possible explanation is poor device matching.

Also note from Fig. 3.18 that the computed normalized transconductance function for the basic differential pair is essentially "noise-free," unlike those for the two transconductors with source degeneration. The latter designs exhibit somewhat strange peaks and valleys in their transconductance function. One possible explanation for these extraneous peaks and valleys is poor device matching.

Figure 3.17: Experimental data of the differential output current as a function of input voltage for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Each dot represents a sample point. [Plot title: Differential Pairs (2000 µm × 20 µm); horizontal axis: Vdm [mV].]

Figure 3.18: Normalized transconductance as a function of $V_{DM}$ computed from experimental data for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Solid lines show the predicted values. [Plot title: Normalized Transconductance (2000 µm × 20 µm).]

3.6.2 Dynamic Measurements

The transconductance functions of the basic differential pair and the differential pair with a single diffusor were measured using an SR850 lock-in amplifier. All transistors had a width of 1377.6 µm and a length of 4.8 µm. Chips were fabricated in a standard 1.2 µm n-well process. We varied $V_{GC}$ at the gate of the diffusor in Fig. 3.6 in order to control the "effective" width-to-length ratio of the diffusor.

Two voltage signals originating from the SR850 were attenuated and summed externally with a summing amplifier. The complement to the input signal was also computed using a unity-gain inverting amplifier. The voltage signal seen across the differential inputs consisted of a 1 mV peak-to-peak sine wave at a frequency of 100 Hz superimposed upon a DC bias in the range of ±200 mV. The output currents were converted to a voltage through low-valued resistors, buffered, and then AC coupled to the input of the lock-in amplifier.

Fig. 3.19 shows the experimental data. The linear range of the basic differential pair is measured as 18.6 mV, while that of the transconductor with a single diffusor is 133.6 mV. The improvement in linear range is approximately seven times, whereas the expected improvement from first-order analysis was eight. This discrepancy may be attributed to such non-idealities as a nonzero drain-to-source conductance and variations in $\kappa$ as a function of bulk-to-source potential.

3.6.3 Summary of Results

We summarize the analytical results of this chapter with several tables which allow

easy comparison of several transconductor designs.

For a fixed bias current $I_b$, the highest nominal transconductance $G_o$ is obtained by the basic differential pair. The linearization techniques discussed in this chapter effectively trade off transconductance for linear range, i.e., the transconductance value is decreased while the linear range is increased. Indeed, the author doubts that it is possible to increase the linear range while not adversely affecting the nominal transconductance value. Moreover, except for the differential pair with source degeneration via diode-connected transistors, the output current noise density for each of the linearization techniques is essentially constant at $2qI_b$.

Figure 3.19: Experimental data of the normalized transconductance as a function of $V_{DM}$ [mV] for (a) the basic differential pair and (b) the differential pair with source degeneration via a single diffusor.

For a constant corner frequency ($G_o/C$), and constant integrating capacitance $C = 5$ pF, we see in Table 3.1 the "cost" in terms of increased power consumption relative to that of the basic differential pair, and also the expected "benefit" in terms of improved dynamic range for each of the transconductors described in this chapter. The increased power consumption comes from the need to set $G_o$ equal to a constant for each of the transconductor designs. Since $G_o \propto I_b$, constant $G_o$ is achieved by increasing the power consumption of the linearized transconductors as compared to the basic differential pair.

Table 3.1: Summary of Linearization Techniques with Constant (Go/C), C = 5 pF (Vt = 25.7 mV, κ = 0.7).

Technique                 Relative Ib    Dynamic Range at DG = 1%
Differential Pair         1              43.6 dB
Diode-Connected           2.43           50.4 dB
Single Diffusor           3              56.8 dB
Double Diffusors          1.5            53.8 dB
Two Asymmetric Pairs      1.5            53.8 dB
Three Asymmetric Pairs    1.83           57.5 dB
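The "Relative Ib" entries for the asymmetric-pair designs follow from the nominal transconductance at V_DM = 0. The sketch below assumes that the basic differential pair has G_o = κI_b/(2V_t) and that each asymmetric pair contributes a sech² term, as in (3.91) and (3.98); the weighting parameter alpha equals 0.5 for the two-pair circuit. The function name is illustrative.

```python
import math

def relative_bias_current(m, alpha):
    """Bias current needed to match the G_o of the basic differential pair.

    At V_DM = 0 the two offset pairs each contribute alpha / cosh^2(ln(m)/2)
    and the centre pair contributes (1 - 2*alpha), all relative to the
    basic pair. The required bias current is the reciprocal of the sum.
    """
    c = 0.5 * math.log(m)
    g0_relative = 2.0 * alpha / math.cosh(c) ** 2 + (1.0 - 2.0 * alpha)
    return 1.0 / g0_relative

rel_two = relative_bias_current(2.0 + math.sqrt(3.0), 0.5)
rel_three = relative_bias_current(4.0 + math.sqrt(15.0), 25.0 / 66.0)
print(round(rel_two, 2), round(rel_three, 2))
```

The second value is the factor 11/6 that also appears in the output noise expression (3.100).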

Another way of viewing the relative merit of each transconductor topology is to again fix the corner frequency ($G_o/C$), but this time to also fix the power consumption, $I_b$. Since fixing $I_b$ effectively fixes $G_o$, it is necessary to decrease the integrating capacitance for each of the linearized transconductors in order to maintain a constant corner frequency. Thus, below we see not only the "benefit" of each linearizing technique in terms of increased dynamic range, we also see the "benefit" of reduced integrating capacitance.

Table 3.2: Summary of Linearization Techniques with Constant (Go/C), Ib (Vt = 25.7 mV, κ = 0.7).

Technique                 Integrating C [pF]    Dynamic Range at DG = 1%
Differential Pair         5                     43.6 dB
Diode-Connected           2.06                  46.5 dB
Single Diffusor           1.67                  52.1 dB
Double Diffusors          3.33                  52.1 dB
Two Asymmetric Pairs      3.33                  52.1 dB
Three Asymmetric Pairs    2.73                  54.8 dB
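The two tables are related by a simple scaling, which the sketch below reproduces. It assumes, as stated in the text, that at fixed bias current the capacitance must shrink by the same factor by which G_o is reduced, and that the dynamic range of a given topology is proportional to C. The data are copied from Table 3.1; the names are illustrative.

```python
import math

# (technique, relative Ib, dynamic range in dB), copied from Table 3.1
TABLE_3_1 = [
    ("Differential Pair", 1.0, 43.6),
    ("Diode-Connected", 2.43, 50.4),
    ("Single Diffusor", 3.0, 56.8),
    ("Double Diffusors", 1.5, 53.8),
    ("Two Asymmetric Pairs", 1.5, 53.8),
    ("Three Asymmetric Pairs", 1.83, 57.5),
]

def fixed_bias_row(relative_ib, dr_db, c_ref_pf=5.0):
    """Convert a constant-C entry into a constant-Ib entry.

    With Ib fixed, G_o falls by the factor relative_ib, so C must fall by
    the same factor to hold G_o/C constant. Since DR is proportional to C,
    the dynamic range drops by 10*log10(relative_ib) dB.
    """
    c_pf = c_ref_pf / relative_ib
    return c_pf, dr_db - 10.0 * math.log10(relative_ib)

table_3_2 = {name: fixed_bias_row(rel, dr) for name, rel, dr in TABLE_3_1}
for name, (c_pf, dr) in table_3_2.items():
    print(f"{name:24s} {c_pf:5.2f} pF {dr:5.1f} dB")
```

The computed entries agree with Table 3.2 to within about 0.1 dB, the residual being attributable to rounding of the tabulated inputs.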

From the results presented above, it is clear that various tradeoffs exist between design criteria. For a fixed filter cutoff frequency, ($G_o/C$), and fixed transconductor design, as the capacitance increases, so does the power consumption, since $I_b \propto G_o$. And as the integrating capacitance increases, so does the dynamic range, in a linear fashion. Thus, for a particular distortion-limited transconductor design, increases in dynamic range come only at a cost of increased power consumption and increased integrating capacitance.

Choosing one linearization technique over another can have the hidden cost of more complex circuit design, increased area, and increased sensitivity to device mismatch. All of these secondary design criteria must be accounted for in the selection of a particular transconductor.

One question that the tables above do not answer is the following: given a particular cutoff frequency ($G_o/C$) and either a constant integrating capacitance or constant bias current, what is the highest achievable dynamic range of any distortion-limited transconductance-C integrator topology?

In this chapter, we have introduced several new transconductor topologies to subthreshold CMOS design. Our intention is to apply these new designs in continuous-time filter architectures, such as the one described in the following chapter. Optimizing the dynamic range of the transconductor is the first step toward optimization of the entire filter structure.

Chapter 4

The Multi-Resolution Filter Bank Model

The efficiency and performance of any information processing system, both hardware and software, can be improved by incorporating prior knowledge in the design phase. Indeed, this is the keystone for success in all statistical speech recognition systems (Roe and Wilpon, 1994).

The ultimate goal of our research is the efficient extraction of information from sensory signals, auditory and visual, by hardware systems. As a tentative measure of goodness, we propose the following optimizing criterion: maximize the number of bits/sec/Watt, or bits/Joule. A high information rate is the desired outcome of the processing, while the power consumption represents the cost. Other cost measures are possible, such as size and weight, but very often they can be related to the overall power consumption.

Biological systems serve as working models of sensory information processing

as they seem to achieve a very high information rate at very low levels of power

consumption. Therefore, we stand to learn by abstracting from known functions

and organizations of information processing in biological systems when we attempt

to solve similar problems using VLSI. To date, several biologically-inspired VLSI

systems for vision and audition have emerged from such an undertaking, including

the silicon retina and the silicon cochlea (Mead, 1989; Andreou and Boahen, 1994;

Liu et al., 1992b).

In this chapter, we present an analysis and design strategy for hardware cochlear filter bank models, addressing issues both at the architectural and circuit levels. Total power dissipation is a prime engineering constraint and, as such, this work finds applications in the areas of portable speech-recognition equipment, hearing aids, and cochlear implants.

4.1 Filter Bank Architecture

Prior knowledge is sometimes referred to as the "model" and represents the structure of the information processing system. A parametric description with a minimal number of parameters is desirable. A static, linear filter bank model of the basilar membrane in the cochlea has been proposed by Liu (Liu, 1992; Liu et al., 1993). A block diagram of the architecture is found in Fig. 4.1. The design is a result of the effective bandwidth concept and reproduces faithfully the results from hydrodynamic simulations of a one-dimensional transmission-line model of cochlear fluid mechanics (Allen, 1985). The filter bank structure has only four tuning parameters yet is flexible enough so that an appropriate set of parameters can be found to fit the neurophysiological data. In particular, the filter bank is tuned so that the model output closely resembles the auditory fibers' instantaneous firing rates (IFR) in response to steady-state sinusoids and tone bursts (Liu, 1992). To do so, it has been found that two second-order sections are necessary (Liu, 1992; Liu et al., 1992a; Liu et al., 1993) instead of one (Liu et al., 1992b).

However, the performance of any fixed-model based system will inevitably degrade if the operating environment does not match the environment under which the system was originally designed. For speech communication the variability in the acoustic environment, variability between individual speakers, and variability in the hardware components (mismatch) necessitate the possibility of adaptation.

Figure 4.1: Block diagram of Liu's N-channel basilar membrane model consisting of a cascade of N lowpass sections with taps to two bandpass filters per output channel (Liu, 1992).

Two software systems which successfully adapt in order to counter variability are found in the area of speech recognition. Neti (Neti, 1994) used a software model of the basilar membrane proposed by Liu (Liu, 1992), followed by temporal feature extraction proposed by Yang (Yang et al., 1992), as a front-end for a large-vocabulary isolated-word speech recognition experiment. By adjusting the four tuning parameters of the basilar membrane in response to changes in the level of additive babble noise, Neti reported more than a 50% decrease in word error rate at moderate signal-to-noise ratios as compared with the more conventional acoustic processing scheme. More recently, Kamm et al. (Kamm et al., 1995) describe a system which attempts to adjust one filter bank parameter from the estimated vocal-tract length of the individual speaker. They reported an overall improvement in the word error rate of 5% for continuous-word speaker-independent telephone speech. Thus, although adaptation is not treated in this work, it is understood by the author that this topic must be addressed in a final design.

The filter bank can be viewed as a single-input, multiple-output filter bank. With $N$ output nodes, the transfer function from the input node to output $n$ is given by:

$$H(n, s) = \left[\prod_{i=1}^{n} \frac{1}{1 + \dfrac{s}{\omega_c(i)}}\right] \left(\frac{\dfrac{s}{Q_3(n)\,\omega_c(n)}}{1 + \dfrac{s}{Q_3(n)\,\omega_c(n)} + \dfrac{s^2}{\omega_c(n)^2}}\right)^{2} \tag{4.1}$$

where $\omega_c(i)$ is the cutoff frequency of the lowpass filter and center frequency of the bandpass filters for the $i$th section, and $Q_3(i)$ is the 3-dB quality factor of the bandpass filters for the $i$th section. The four filter tuning parameters are the center frequency range, $\omega_c(1)$–$\omega_c(N)$, and the quality factor range, $Q_3(1)$–$Q_3(N)$.

4.2 RLC Proto-Type Filters

To achieve robust overall characteristics, the filter building block designs were based on RC and RLC proto-types. One of their desirable properties is the insensitivity of the peak response to component values, i.e., the peak gain is always unity. The second is a consequence of the first: in a cascade of such filters, noise from previous filters is never amplified by successive filters. Indeed, the noise level increases only linearly with the number of stages. Finally, the particular second-order section chosen is the optimum design for low-Q, wide-frequency range filters (Nevarez-Lozano and Sanchez-Sinencio, 1991).

4.2.1 RC Proto-Type Lowpass

The first-order RC proto-type lowpass filter has a single pole on the negative real axis in the s-plane. The RC proto-type is shown in Fig. 4.2(a). Its transfer function is

$$H_{LP}(\omega) = \frac{1}{1 + j\omega/\omega_c} \tag{4.2}$$

where $\omega_c$ is the corner frequency. Its square magnitude and group delay are

$$|H_{LP}(\omega)|^2 = \frac{1}{1 + \omega^2/\omega_c^2}, \qquad \Delta t_{LP}(\omega) = \frac{1}{\omega_c}\cdot\frac{1}{1 + \omega^2/\omega_c^2} \tag{4.3}$$

We choose to implement the resistance using a transconductor of value $G$, as shown in Fig. 4.2(b).¹ Then $\omega_c = G/C$. Let us consider the noise in this filter. The spectral density of the thermal noise in a single transconductor is

$$S_G(\omega) = \frac{4kT\gamma}{G} \tag{4.4}$$

where $\gamma$ is the noise factor, typically greater than unity.

¹ For a discussion on the choice of implementation, see Chapter 2.

Figure 4.2: RC first-order lowpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.

A noise model for the lowpass filter is shown in Fig. 4.2(c), where the input signal is also corrupted by Gaussian noise with power spectrum $S_{in}(\omega)$. We assume the input noise to be independent of the noise in the transconductor. Then, at the output node we will see Gaussian noise with power-spectral density

$$S_{out}(\omega) = \left[S_{in}(\omega) + S_G(\omega)\right] |H_{LP}(\omega)|^2 \tag{4.5}$$

If we integrate $S_{out}(\omega)$ across all radian frequencies and divide by $2\pi$, we find the total noise power in units of square volts. As an example, suppose that $S_{in} = 0$, that is, there is no noise in the input signal. Then the mean-square output noise is

$$V_{out,n}^2 = \frac{1}{2\pi}\int_0^{\infty} \frac{4kT\gamma}{G}\cdot\frac{1}{1 + \omega^2/\omega_c^2}\, d\omega = \frac{4kT\gamma}{G}\cdot\frac{\omega_c}{4} = \frac{kT\gamma}{C} \tag{4.6}$$

Note that the equivalent noise bandwidth of a lowpass filter is $\omega_c/4$.
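The equivalent noise bandwidth and the resulting kT/C-type noise level can be confirmed numerically. The sketch below integrates the squared magnitude of the first-order lowpass response over positive frequencies, using a change of variables to map the infinite range onto a finite one. The corner frequency, the temperature of 300 K, the unity noise factor, and the 5 pF capacitance are assumed example values, and the function name is illustrative.

```python
import math

def noise_bandwidth_hz(omega_c, steps=100000):
    """Equivalent noise bandwidth of 1/(1 + (w/wc)^2), in Hz.

    Integrates over 0 <= w < infinity with w = wc * t / (1 - t), 0 <= t < 1,
    which turns the integrand into wc / ((1 - t)^2 + t^2), and then
    divides the result by 2*pi.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint rule
        total += 1.0 / ((1.0 - t) ** 2 + t ** 2)
    return omega_c * total * h / (2.0 * math.pi)

K_B = 1.380649e-23      # Boltzmann constant [J/K]
TEMP = 300.0            # assumed temperature [K]
GAMMA = 1.0             # assumed noise factor
C_INT = 5e-12           # assumed integrating capacitance [F]
OMEGA_C = 2.0 * math.pi * 1e3   # assumed corner frequency [rad/s]

g = OMEGA_C * C_INT                                  # from wc = G/C
bandwidth = noise_bandwidth_hz(OMEGA_C)
v2_integrated = (4.0 * K_B * TEMP * GAMMA / g) * bandwidth
v2_closed_form = K_B * TEMP * GAMMA / C_INT
print(bandwidth / OMEGA_C, math.sqrt(v2_closed_form))
```

The first printed value should be close to 1/4, and the integrated and closed-form mean-square voltages should agree, showing that the total noise depends on C but not on the corner frequency.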

In Chapter 2 we discussed the self-biased transconductance-C integrator as a building block for implementing filter functions. Fig. 4.3(a) shows this transconductor circuit and symbol. Fig. 4.3(b) is a first-order inverting lowpass filter consisting of two such transconductors.

4.2.2 RLC Proto-Type Bandpass Filter

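The second-order RLC proto-type bandpass filter has a pair of complex-conjugate poles in the left-half s-plane and a zero at the origin. It is shown in Fig. 4.4(a). The transfer function is given by

$$H_{BP}(\omega) = \frac{\dfrac{j\omega}{Q_3\omega_c}}{1 + \dfrac{j\omega}{Q_3\omega_c} - \dfrac{\omega^2}{\omega_c^2}} \tag{4.7}$$

where $\omega_c$ is the center frequency and $Q_3$ is the 3-dB quality factor, defined as the ratio of the center frequency to the 3-dB-down bandwidth. Note that $Q_3 = 0.5$ corresponds to two real poles at $\omega = \omega_c$. The square magnitude and group delay are

$$|H_{BP}(\omega)|^2 = \frac{\left(\dfrac{\omega}{Q_3\omega_c}\right)^2}{\left(1 - \dfrac{\omega^2}{\omega_c^2}\right)^2 + \left(\dfrac{\omega}{Q_3\omega_c}\right)^2}, \qquad \Delta t_{BP}(\omega) = \frac{\dfrac{1}{Q_3\omega_c}\left(1 + \dfrac{\omega^2}{\omega_c^2}\right)}{\left(1 - \dfrac{\omega^2}{\omega_c^2}\right)^2 + \left(\dfrac{\omega}{Q_3\omega_c}\right)^2} \tag{4.8}$$

Figure 4.3: Self-biased transconductance-C integrator: (a) circuit and symbol, (b) configured as first-order lowpass filter.

We adopt a transconductance-C implementation of the RLC prototype filter,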

as shown in Fig. 4.4(b). We replace the R with a transconductor $G_1$, and the L with a gyrator (generalized impedance converter) and a second capacitor of value C. The gyrator is formed with two transconductors $G_2$ and $G_3$ in negative feedback. For ease in the design, we set $G_2 = G_3$. The equivalent inductance $L_{eq}$ seen at the input to $G_2$ is $C/(G_2G_3)$. This particular second-order section has been chosen because it is the optimum design for low-Q, wide-frequency range filters (Nevarez-Lozano and Sanchez-Sinencio, 1991). The transfer characteristics of the first- and second-order sections are summarized in Table 4.1.

Figure 4.4: RLC prototype second-order bandpass filter: (a) prototype, (b) transconductance-C implementation, and (c) noise model.

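Table 4.1: Characteristics of the first- and second-order OTA-C filters

Parameter     First-order lowpass                Second-order bandpass
$H(s)$        $\dfrac{\omega_c}{s + \omega_c}$   $\dfrac{(\omega_c/Q_3)\,s}{s^2 + (\omega_c/Q_3)\,s + \omega_c^2}$
$\omega_c$    $G/C$                              $\sqrt{G_2G_3}/C$
$Q_3$         –                                  $\sqrt{G_2G_3}/G_1$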

In Fig. 4.4(c) we model the various noise sources in this particular VLSI implementation. By superposition we can consider each noise source independently and then add their resulting effects at the output node. Let $S_{G1}$, $S_{G2}$, and $S_{G3}$ be the power spectral densities of the first, second, and third transconductors, respectively. As before, assume that the input signal has noise power spectrum $S_{in}$. Let $H_{LP2}$ be the transfer function of a second-order unity-gain lowpass filter with the same pole locations as the bandpass filter, i.e.,

$$H_{LP2}(\omega) = \frac{1}{1 + \dfrac{j\omega}{Q_3\omega_c} - \dfrac{\omega^2}{\omega_c^2}} \qquad (4.9)$$

Its square magnitude is

$$|H_{LP2}(\omega)|^2 = \frac{1}{\left(1 - \dfrac{\omega^2}{\omega_c^2}\right)^2 + \left(\dfrac{\omega}{Q_3\omega_c}\right)^2} \qquad (4.10)$$

One can show that the square magnitude transfer functions between the equivalent noise sources and the output voltage are given as in Table 4.2.

Table 4.2: Square magnitude transfer functions from all noise sources to $V_{out}$ for the second-order OTA-C filter.

Noise Source    Transfer Function
$S_{in}$        $|H_{BP}(\omega)|^2$
$S_{G1}$        $|H_{BP}(\omega)|^2$
$S_{G2}$        $|H_{LP2}(\omega)|^2$
$S_{G3}$        $|H_{BP}(\omega)\,G_3/G_1|^2$

Setting $G_2 = G_3$, the ratio $G_3/G_1$ is just $Q_3$. Then the total output noise power spectrum for the VLSI implementation is given by

$$S_{out}(\omega) = \left[S_{in}(\omega) + S_{G1}(\omega) + S_{G3}(\omega)\,Q_3^2\right] |H_{BP}(\omega)|^2 + S_{G2}(\omega)\, |H_{LP2}(\omega)|^2 \qquad (4.11)$$

We find the total noise power by integrating $S_{out}(\omega)$ across all frequencies. As an example, suppose that $S_{in} = 0$, that is, there is no noise in the input signal. Using the identities

$$\int_0^\infty |H_{BP}(\omega)|^2\, d\omega = \frac{\pi\omega_c}{2Q_3} \qquad (4.12)$$

$$\int_0^\infty |H_{LP2}(\omega)|^2\, d\omega = \frac{\pi\omega_c Q_3}{2}$$

the noise power in the output voltage is

$$\overline{V^2_{out}} = \left[\frac{4kT\gamma}{G_1} + \frac{4kT\gamma}{G_3}\,Q_3^2\right]\frac{\omega_c}{4Q_3} + \frac{4kT\gamma}{G_2}\,\frac{\omega_c Q_3}{4} = \frac{kT\gamma}{C}\,(1 + 2Q_3) \qquad (4.13)$$

Note that it is assumed that the noise factor $\gamma$ is the same for each transconductor.
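As a sanity check on (4.13), the two integrals in (4.12) can be evaluated numerically and combined with the three transconductor noise densities. This is a sketch under stated assumptions: $G_2 = G_3 = C\omega_c$ and $G_1 = G_3/Q_3$ as in Table 4.1, $S_{in} = 0$, and illustrative values for the noise factor and temperature. The function names are ours.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def integrate_half_line(f, steps=200_000):
    """Midpoint rule for integral_0^inf f(x) dx using x = t / (1 - t)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        total += f(x) * h / (1.0 - t) ** 2
    return total


def bandpass_noise_power(C, q3, gamma, T=300.0):
    """Output noise of the OTA-C bandpass per (4.11), G2 = G3, S_in = 0."""
    def den(x):  # common denominator of (4.8) and (4.10), x = omega / omega_c
        return (1.0 - x * x) ** 2 + (x / q3) ** 2

    int_bp = integrate_half_line(lambda x: (x / q3) ** 2 / den(x))
    int_lp2 = integrate_half_line(lambda x: 1.0 / den(x))
    # S_G1 = 4kT*gamma*Q3/(C*omega_c) and S_G2 = S_G3 = 4kT*gamma/(C*omega_c),
    # so omega_c cancels once the integrals are taken over x = omega/omega_c.
    unit = 4.0 * K_B * T * gamma / C / (2.0 * math.pi)
    return unit * ((q3 + q3 ** 2) * int_bp + int_lp2)


# Illustrative values: C = 5 pF and Q3 = 2.6 appear in the text; gamma = 1 is assumed.
C, q3, gamma, T = 5.0e-12, 2.6, 1.0, 300.0
numeric = bandpass_noise_power(C, q3, gamma, T)
closed_form = K_B * T * gamma / C * (1.0 + 2.0 * q3)
print(f"numeric = {numeric:.4e} V^2, closed form = {closed_form:.4e} V^2")
```

The agreement between the two numbers confirms that the bracketed terms and the lowpass term in (4.13) contribute $1 + Q_3$ and $Q_3$ units of $kT\gamma/C$, respectively.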

Fig. 4.5 shows the second-order bandpass circuit, consisting of six self-biased transconductors. Together with the first-order lowpass circuit of Fig. 4.3(b), they can be used as building blocks for the basilar membrane model of Fig. 4.1.

4.3 Complete Filter Bank Model

Having studied the elements of the filter bank, the first-order lowpass and the second-order bandpass, we proceed to derive equations describing the behavior of the entire system.

4.3.1 Transfer Function

The basilar membrane model can be viewed as a single-input, multiple-output filter bank. Suppose there are N output nodes. From Fig. 4.1, the transfer function from the input to the output n is given by:

$$H(n,\omega) = \left(\frac{\dfrac{j\omega}{Q_3(n)\,\omega_c(n)}}{1 + \dfrac{j\omega}{Q_3(n)\,\omega_c(n)} - \dfrac{\omega^2}{\omega_c(n)^2}}\right)^{2} \prod_{i=1}^{n} \frac{1}{1 + \dfrac{j\omega}{\omega_c(i)}} \qquad (4.14)$$

where $\omega_c(i)$ is the cutoff frequency of the lowpass filter and the center frequency of the bandpass filters for the ith output section and $Q_3(i)$ is the 3-dB quality factor of the ith output section.

Figure 4.5: RLC prototype second-order bandpass filter composed of six self-biased transconductors.

A practical VLSI implementation in subthreshold CMOS imposes the following constraints: the spacing between $\omega_c(i)$ and $\omega_c(i+1)$ is exponential and the spacing between $Q_3(i)$ and $Q_3(i+1)$ is exponential. Note that exponential spacing on a linear scale is the same as linear spacing on a logarithmic scale. As a result, one need only specify the range in these parameters as well as the number of output sections N in order to completely determine the filter bank transfer characteristics. Mathematically, we can write

$$\omega_c(i) = \omega_c(1)^{\frac{N-i}{N-1}}\; \omega_c(N)^{\frac{i-1}{N-1}} \qquad (4.15)$$

$$Q_3(i) = Q_3(1)^{\frac{N-i}{N-1}}\; Q_3(N)^{\frac{i-1}{N-1}} \qquad (4.16)$$

A more fundamental constraint is that $\omega_c(1) > \omega_c(N)$. This constraint specifies the direction in which the acoustic wave travels.

A number of preliminary lowpass sections are needed before the first output channel in order to establish more uniform magnitude and phase characteristics within the filter bank. The exact number depends largely on the frequency spacing. For the moment, let there be M such preliminary sections. In practice one can achieve this goal by ignoring the first M output channels in the filter bank and observing only outputs M + 1 through N, for a total of N − M channels. However, this technique has two undesirable consequences. The first is wasted space and power consumption, since the bandpass filters in the first M sections are unused. The second is that the first several sections could have very high quality factors that may introduce noise into the other circuits. For this reason, we specify M preliminary lowpass sections prior to the N output sections, as in:

$$H(n,\omega) = \left(\frac{\dfrac{j\omega}{Q_3(n)\,\omega_c(n)}}{1 + \dfrac{j\omega}{Q_3(n)\,\omega_c(n)} - \dfrac{\omega^2}{\omega_c(n)^2}}\right)^{2} \prod_{i=1}^{n} \frac{1}{1 + \dfrac{j\omega}{\omega_c(i)}}\; \prod_{i=1}^{M} \frac{1}{1 + \dfrac{j\omega}{\omega_p(i)}} \qquad (4.17)$$

where $\omega_p(i)$ is the cutoff frequency of the ith preliminary lowpass section. Submitting to the constraint of exponential spacing described earlier, an equation for computing $\omega_p(i)$ is

$$\omega_p(i) = \omega_c(1)^{\frac{N+M-i}{N-1}}\; \omega_c(N)^{\frac{i-1-M}{N-1}} \qquad (4.18)$$

In Fig. 4.6 we plot the magnitude response and the group delay as a function of frequency for a 16-channel basilar membrane model with constant $Q_3$ and two preliminary lowpass sections.
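The spacing rules (4.15) and (4.18) are easy to tabulate. The sketch below uses the parameters quoted for the 16-channel example (8000 Hz down to 100 Hz, two preliminary sections); the function names are ours and are not taken from the original work.

```python
def center_frequencies(f_first, f_last, n):
    """Exponentially spaced f_c(1..N) per (4.15)."""
    return [f_first ** ((n - i) / (n - 1)) * f_last ** ((i - 1) / (n - 1))
            for i in range(1, n + 1)]


def preliminary_frequencies(f_first, f_last, n, m):
    """Cutoffs f_p(1..M) of the preliminary lowpass sections per (4.18)."""
    return [f_first ** ((n + m - i) / (n - 1)) * f_last ** ((i - 1 - m) / (n - 1))
            for i in range(1, m + 1)]


fc = center_frequencies(8000.0, 100.0, 16)
f_prelim = preliminary_frequencies(8000.0, 100.0, 16, 2)
step = fc[0] / fc[1]  # constant ratio between neighboring sections

print("ratio per section:", round(step, 4))
print("preliminary cutoffs [Hz]:", [round(f, 1) for f in f_prelim])
print("channel center frequencies [Hz]:", [round(f, 1) for f in fc])
```

The preliminary cutoffs continue the same geometric progression above $f_c(1)$, so the lowpass cascade is uniformly spaced on a logarithmic axis from the first preliminary section to the last output channel.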

4.3.2 Filter Bank Tuning

Exponential spacing of the corner frequencies is easily achieved in subthreshold CMOS design due to the exponential voltage-to-current relationship, as described in Appendix A. The corner frequencies are proportional to G, the transconductance. The transconductance, in turn, is proportional to the bias current. Hence, it is our goal to exponentially space the bias current at each transconductor in order to exponentially space the corner frequencies.

Figure 4.6: Response of 16-channel cochlear filter bank using exact equations, (a) magnitude and (b) group delay, both plotted against frequency in Hz. Filter parameters are as follows: $f_c(1) = 8000$ Hz, $f_c(16) = 100$ Hz, $Q_3(1) = 2.6$, and $Q_3(16) = 2.6$. Two preliminary lowpass sections are added for better uniformity in the peak response.

Exponential spacing is achieved by linearly spacing one of three terminal voltages of the CMOS transistor. A good current source will not have any drain voltage dependence, leaving the other three nodes as possible candidates: the source, the gate, and the substrate.

Linear spacing of the supply voltages is possible using a resistive ladder. An example of such a biasing scheme is depicted in Fig. 4.7(a), where one section of a lowpass cascade is shown. As such, the sources of both the nMOS and pMOS transistors in a complementary design can be linearly graded. Note that if the transconductor is not complementary, only one of the power supplies need be graded. However, because the source of a transistor is a low-impedance node, the impedance of the resistive ladder must of necessity be lower still. A low-impedance resistive ladder will result in significant power dissipation. Therefore, this type of resistive ladder is generally undesirable.

Figure 4.7: Single section of lowpass cascade showing tuning mechanism via (a) supply lines and (b) substrate lines.

A second method for exponentially spacing the bias current of a transconductor is at the gate of a current source. However, the self-biased transconductance-C integrator does not have an independent current bias, hence its name. But transconductor designs based on the differential pair have an independent current bias element. The method of linearly spacing the gate voltage of the bias transistor has been used by Liu and others (Lyon and Mead, 1988; Watts, 1992) in the design of cochlear filter banks.
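To illustrate how a linear voltage ladder yields exponential bias currents, the sketch below assumes the common first-order subthreshold model $I \propto \exp(\kappa V_g / U_T)$. The values of $\kappa$ and $U_T$, the 1 nA starting current, and the function names are all assumptions for illustration; they are not taken from Appendix A.

```python
import math

U_T = 0.0258   # thermal voltage kT/q near 300 K, volts (assumed)
KAPPA = 0.7    # gate coupling coefficient (assumed, process dependent)


def gate_voltage_step(current_ratio, kappa=KAPPA, u_t=U_T):
    """Gate-voltage decrement that divides the subthreshold current by current_ratio."""
    return (u_t / kappa) * math.log(current_ratio)


def bias_currents(i_first, n, dv, kappa=KAPPA, u_t=U_T):
    """Currents at n equally spaced ladder taps, using I ~ exp(kappa * Vg / U_T)."""
    return [i_first * math.exp(-kappa * k * dv / u_t) for k in range(n)]


# Sixteen sections spanning an 80:1 range, as in the 8000 Hz to 100 Hz example.
ratio_per_section = 80.0 ** (1.0 / 15.0)
dv = gate_voltage_step(ratio_per_section)
currents = bias_currents(1.0e-9, 16, dv)  # hypothetical 1 nA at the first tap

print(f"gate step per section: {dv * 1e3:.2f} mV")
print(f"first/last current ratio: {currents[0] / currents[-1]:.1f}")
```

Under this model, a step of roughly ten millivolts per tap is enough to cover the whole frequency range, which is why the ladder can be made high-impedance when it drives gates rather than sources.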

A final method of modulating the bias current in a current source is the oft-overlooked substrate, or back-gate. A linear change in the substrate voltage results in an exponential change in the bias current, albeit less effectively than the gate potential. For a class-A or -B design, only one substrate potential need be modulated. For a complementary transconductor, both substrates need resistive ladders. If it were possible to bias both the n-substrate and the p-substrate, as in a twin-tub process, the last solution would be the best for the case of the self-biased transconductor. The substrate sinks and sources very little current, and hence little power would be consumed in setting the bias currents. The substrate biasing scheme is depicted in Fig. 4.7(b).

4.3.3 Filter Bank Noise

Each filter in a cascade contributes to the overall noise spectrum at the output channel. Let us relate the transconductance noise spectra to the filter parameters, assuming constant capacitance C. For the case of the ith first-order lowpass filter, we have

$$S_G(i,\omega) = \frac{4kT\gamma}{C\,\omega_c(i)} \qquad (4.19)$$

For the ith bandpass filter, with $G_2(i) = G_3(i)$, we have

$$S_{G1}(i,\omega) = \frac{4kT\,Q_3(i)\,\gamma}{C\,\omega_c(i)}$$

$$S_{G2}(i,\omega) = \frac{4kT\gamma}{C\,\omega_c(i)} \qquad (4.20)$$

$$S_{G3}(i,\omega) = \frac{4kT\gamma}{C\,\omega_c(i)}$$

We can now write an iterative procedure for computing the noise power at the output of the ith lowpass section:

$$S_{LP}(i,\omega) = \left[S_{LP}(i-1,\omega) + S_G(i,\omega)\right] |H_{LP}(i)|^2 \qquad (4.21)$$

where $H_{LP}(i)$ is the filter response of the ith lowpass section. The input to the ith double-bandpass section is the ith lowpass filter. Computing the output noise in two steps we have

$$S_{BP1}(i,\omega) = \left[S_{LP}(i,\omega) + S_{G1}(i,\omega) + Q_3^2\, S_{G3}(i,\omega)\right] |H_{BP}(i,\omega)|^2 + S_{G2}\, |H_{LP2}(i,\omega)|^2 \qquad (4.22)$$

$$S_{BP2}(i,\omega) = \left[S_{BP1}(i,\omega) + S_{G1}(i,\omega) + Q_3^2\, S_{G3}(i,\omega)\right] |H_{BP}(i,\omega)|^2 + S_{G2}\, |H_{LP2}(i,\omega)|^2$$

Assuming ideally-matched self-biased transconductors exhibiting only white thermal noise, Fig. 4.8(a) shows the theoretical output noise power spectrum of the entire 16-channel filter cascade with parameters $Q_3$ and frequency range as before. The output channel noise is dominated by the quality factor of the second-order sections. Integrating the power spectrum across all frequencies, the output channels have almost constant RMS noise of approximately 0.105 mV, as shown in Fig. 4.8(b).

Figure 4.8: (a) Power spectral density of noise in 16-channel cochlear filter bank with C = 5.0 pF and other parameters as earlier defined, and (b) RMS noise as a function of center frequency. Both panels are plotted against frequency in Hz.

4.4 Information Rate and Power Dissipation

The maximum number of bits per second, or channel capacity, can be computed from the dynamic range of an analog continuous-time filter using information theory (Shannon, 1948). In Chapter 5, we derive the following result:

$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\, DR\right) \qquad (4.23)$$

where $f_p$ is the bandwidth of the filter. The above equation applies to linear systems under additive Gaussian noise conditions. It assumes a peak amplitude constraint, which is appropriate for circuits which must operate within a certain voltage range to avoid distortion and clipping.

Computing the exact distortion for the entire filter structure is beyond the scope of this work. However, we can estimate the dynamic range of each channel using the distortion measure of just a single integrator. Allowing a maximum of 2% distortion, the maximum RMS input signal is only 6.43 mV for a normally distributed input. The peak gain of each output channel is approximately 0.42, due to overlap between the lowpass and bandpass filters. As such, the distortion-limited dynamic range for each channel is approximately $(6.43 \times 0.42/0.105)^2$ or 28.2 dB.
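The dynamic-range figure follows directly from the three numbers quoted above. A short check, with variable names of our choosing:

```python
import math

v_in_max_rms = 6.43e-3   # maximum RMS input for 2% distortion, volts
peak_gain = 0.42         # approximate peak gain of each output channel
v_noise_rms = 0.105e-3   # RMS output noise per channel, volts

# Dynamic range as a power ratio: (maximum RMS output / RMS noise)^2.
dynamic_range = (v_in_max_rms * peak_gain / v_noise_rms) ** 2
dynamic_range_db = 10.0 * math.log10(dynamic_range)

print(f"DR = {dynamic_range:.1f} ({dynamic_range_db:.1f} dB)")
```

Because the ratio is squared, the decibel value uses $10\log_{10}$ of the power ratio, which is equivalent to $20\log_{10}$ of the voltage ratio.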

We can estimate the maximum information rate from (4.23), noting that the message bandwidth $f_p$ is approximately $\omega_c/(2\pi Q_3)$ for each channel. Assuming non-overlapping channels and independence between channels, the total maximum information rate is

$$C_{tot} = \sum_{i=1}^{N} \frac{\omega_c(i)}{2\pi Q_3(i)} \log_2\left(1 + \frac{6}{\pi e}\, DR(i)\right) \qquad (4.24)$$

For the parameter values chosen earlier, the system capacity is calculated as 68 kbits/sec.

The current consumption in the filter bank, not including that needed for tuning, which, admittedly, can be quite large, is theoretically 237 nA. Using a 1.5-Volt power supply, the total power dissipated is approximately 355 nW. Finally, we estimate the maximum information rate per Watt, or number of bits per Joule, as 0.19 bits/pJ.
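The power and energy-efficiency figures can be reproduced from the stated current, supply voltage, and capacity. The sketch below takes the 68 kbits/sec value as given rather than recomputing it from (4.24).

```python
supply_current = 237e-9   # filter bank current, amperes (tuning excluded)
supply_voltage = 1.5      # volts
capacity = 68e3           # system capacity as stated in the text, bits per second

power = supply_current * supply_voltage   # watts
bits_per_joule = capacity / power         # (bits/s) / (J/s)
bits_per_picojoule = bits_per_joule * 1e-12

print(f"power = {power * 1e9:.1f} nW")
print(f"efficiency = {bits_per_picojoule:.2f} bits/pJ")
```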

4.5 Signal Power Distribution

We restrict ourselves to the design of constant-Q filter banks. In this state the linear filter bank approximates wavelet analysis in a scale domain that preserves good temporal resolution (Liu et al., 1993).

Given a particular input signal frequency distribution, $S_{Vin}(\omega)$, it seems reasonable to divide the signal power among the N output channels uniformly. Let us assume that

$$S_{Vin}(\omega) \approx V_o^2/\omega \qquad (4.25)$$

within the frequency band $\omega(N)$ to $\omega(1)$ of interest, where $V_o^2$ is a constant of proportionality.

A set of filter parameters which spreads the signal power evenly across all channels can be expressed as

$$\omega_c(i) = \omega_c(1)^{\frac{N-i}{N-1}}\; \omega_c(N)^{\frac{i-1}{N-1}}$$

$$Q_3(i) = \left[\left(\frac{\omega_c(1)}{\omega_c(N)}\right)^{\frac{1}{2(N-1)}} - \left(\frac{\omega_c(N)}{\omega_c(1)}\right)^{\frac{1}{2(N-1)}}\right]^{-1} \qquad (4.26)$$

where $\omega_c(i)$ and $Q_3(i)$ are the center frequency and quality factor for channel i.² This set of filter parameters is constant-Q, since the equation for $Q_3(i)$ is not a function of i. It also spans the entire frequency range from $\omega_c(N)$ to $\omega_c(1)$. The effective 3 dB-down bandwidth of channel i can be expressed as follows:

$$BW_3(i) = \sqrt{\omega_c(i-1)\,\omega_c(i)} - \sqrt{\omega_c(i)\,\omega_c(i+1)} \qquad (4.27)$$

For the moment, ignore the two extreme channels, i = 1 and i = N. We assume that the bandlimits of channel i are the geometric means of its center frequency with those of its two neighboring channels, i.e. $\sqrt{\omega_c(i-1)\,\omega_c(i)}$ for the upper limit and $\sqrt{\omega_c(i)\,\omega_c(i+1)}$ for the lower limit.

² We wish to make a fine but important distinction between the parameters of the filter bank, i.e. center frequency and quality factor of the first- and second-order sections, and the filter bank behavior. In particular, the center frequency of the output channel will be lower than that of the individual filters, because the lowpass cascade introduces a skew to the output filter shape. In addition, the output channel quality factor will be greater than that of the individual second-

Then the signal power at channel i is found by integrating the power spectrum of the input signal over the bandwidth of channel i, i.e.,

$$\overline{V^2_{in,s}}(i) = \int_{\sqrt{\omega_c(i+1)\,\omega_c(i)}}^{\sqrt{\omega_c(i-1)\,\omega_c(i)}} \frac{A(i)^2\, V_o^2}{2\pi\,\omega}\, d\omega \qquad (4.28)$$

A(i) is the attenuation of channel i, and is assumed flat across the frequency band. Integrating we find

$$\overline{V^2_{in,s}}(i) = \frac{A(i)^2\, V_o^2}{4\pi} \ln\left(\frac{\omega_c(i-1)}{\omega_c(i+1)}\right) \qquad (4.29)$$

Substituting $\omega_c(i)$ from (4.26), we obtain

$$\overline{V^2_{in,s}}(i) = \frac{A(i)^2\, V_o^2}{2\pi (N-1)} \ln\left(\frac{\omega_c(1)}{\omega_c(N)}\right) \qquad (4.30)$$

Provided that A(i) is the same across each channel, the output power will be the same in every channel.

We see from the above equation that it is important that the shape and peak frequency of each channel be approximately constant. It is for this reason that we introduce preliminary lowpass filters into the design.
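The constant-Q property and the equal division of a 1/f spectrum can both be confirmed numerically. This sketch reuses the 16-channel, 8000 Hz to 100 Hz range from the earlier example as an assumption; the function names are ours, and the attenuation and $V_o^2$ are set to unity for convenience.

```python
import math


def center_frequencies(w_first, w_last, n):
    """Exponentially spaced center frequencies, first line of (4.26)."""
    return [w_first ** ((n - i) / (n - 1)) * w_last ** ((i - 1) / (n - 1))
            for i in range(1, n + 1)]


def behavioral_q3(w_first, w_last, n):
    """Channel quality factor from (4.26); independent of the channel index."""
    a = (w_first / w_last) ** (1.0 / (2 * (n - 1)))
    return 1.0 / (a - 1.0 / a)


def channel_power(w, i, v0_sq=1.0, gain=1.0):
    """Signal power of interior channel i (1-based) per (4.29)."""
    return gain ** 2 * v0_sq / (4.0 * math.pi) * math.log(w[i - 2] / w[i])


n = 16
w = [2.0 * math.pi * f for f in center_frequencies(8000.0, 100.0, n)]
q3 = behavioral_q3(w[0], w[-1], n)

# Q from the definition: center frequency over the bandwidth in (4.27).
q_from_bw = [w[i - 1] / (math.sqrt(w[i - 2] * w[i - 1]) - math.sqrt(w[i - 1] * w[i]))
             for i in range(2, n)]
powers = [channel_power(w, i) for i in range(2, n)]
expected = math.log(w[0] / w[-1]) / (2.0 * math.pi * (n - 1))  # (4.30), unit A and V_o

print(f"Q3 = {q3:.3f}")
print(f"per-channel power (unit A, V_o) = {powers[0]:.5f}, from (4.30) = {expected:.5f}")
```

The quality factor computed here describes the channel behavior implied by the band limits, which per the footnote differs from the $Q_3$ of the individual second-order sections.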

An attempt was made to verify the assumption that speech follows a 1/f spectrum over the frequency range of interest. Using all of the training data from the TIMIT database, we computed the average short-term spectrum. The result is shown in Fig. 4.9. A simple parametric fit for the data (marked with x's) is a broad bandpass filter with center frequency of 550 Hz. We conclude that a 1/f model is not wholly appropriate for speech.

order sections, because there are two such filters in cascade. We are unable to derive an analytic expression relating the parametric description to the behavioral description. In this section, we are describing only the behavioral description.

Figure 4.9: Short-term averaged power spectrum taken from the training set of the TIMIT database normalized by the standard deviation, plotted against frequency in Hz. Experimental data are marked by x's. The dotted line is a fit using a bandpass filter with center frequency 550 Hz and $Q \approx 1$.

4.6 Results and Discussion

Several versions of this filter bank architecture have been fabricated using transconductor designs described in Chapter 3. At the time of this writing, they are functional, but not fully tested.

The problem of optimal extraction of information from sensory signals by "real" computing hardware in terms of maximum information rate per unit of power consumed has not been completely resolved in this work. Rather, having experimented with one transconductor circuit design and one architecture, we leave open the possibility of future improvements by way of: (1) integrators with higher dynamic range and/or lower power consumption, (2) enhanced filter architectures, and (3) non-linear adaptation.

In the following chapter we explore fundamental limitations in low-power information processing systems, such as the basilar membrane, by comparing analog and digital implementations of a simple delay function. Our goal is to begin to answer the question of why and how to process signals in an analog format.

Chapter 5

Comparison of Continuous and

Discrete Circuits

A fundamental question in information processing is: What is the most power efficient representation for processing information? For example, most, if not all, speech recognition systems pass the output of the microphone to an anti-aliasing filter and then to an analog-to-digital converter. Henceforth, all information processing is performed in the digital domain. In particular, the extraction of features and the classification of these features into phonetic segments is performed with digital hardware.

It is our hypothesis that the efficient extraction of features from speech can be performed using low-power, low-precision hardware. Classification, on the other hand, is perhaps a task best suited to digital hardware, where memory and bandwidth requirements are enormous.

In this chapter we begin to explore some of the fundamental differences between analog and digital information processing systems. Following the paradigm set forth by Hosticka in his paper (Hosticka, 1985), we will compare four different information processing systems performing a very basic task. We will relate the maximum signal-to-noise ratio, or dynamic range, to power dissipation. And then we will compute the maximum information rate, or capacity, as a function of power. Comparing these systems, we hope to shed light on the question of when and where it is advantageous to use an analog rather than a digital signal representation.

The types of processing tasks that we ultimately wish to consider are those found in the area of sensory communication. It is generally believed that the human nervous system does not behave exactly like a digital computer. Indeed, at the cellular level, many neurons could be classified as discrete in value, that is, either spiking or not spiking, but not discrete in time. To date, much research has been undertaken to solve problems such as automatic speech recognition or image recognition using only digital computers. It is at least possible, if not probable, that power efficient algorithms and architectures will emerge as we explore signal representations that are more common to biology, the best speech and image recognition system in the world.

5.1 Capacity

The maximum information rate of error-free transmission that can be processed is limited by the system capacity C, which is given by the signal-to-noise ratio S/N and the message bandwidth $f_p$, as defined by the Hartley-Shannon law (Shannon, 1948):

$$C = f_p \log_2(1 + S/N) \qquad (5.1)$$

where S and N are the average signal and noise power, respectively. The dimension of C is bits per second. This law applies to systems having an average signal power constraint S within Gaussian noise of power N. According to (Shannon, 1948), in order to approach this limiting rate of transmission, the transmitted signals must approximate, in statistical properties, white noise.

For low-voltage signal processing circuits, perhaps a more common constraint is peak amplitude, or peak power. In particular, let us assume that the signal is a voltage which operates in the region from 0 to V Volts. Let the reference voltage, or signal ground, have the value V/2. Then the signal is constrained to fall in the range $\pm V/2$, with peak power $S_{peak} = V^2/4$.¹

In the case of a peak power constraint, the equation for the channel capacity C for a frequency band $f_p$ perturbed by white thermal noise of power N is given by (Shannon, 1948)

$$C = f_p \log_2\left(1 + \frac{2 S_{peak}}{\pi e N}\right) \qquad (5.2)$$

where the instantaneous signal power is limited to $S_{peak}$ at every sample point.² The maximum entropy occurs if the samples are independent with a distribution function which is uniform in the range $-\sqrt{S_{peak}}$ to $+\sqrt{S_{peak}}$. In our case, the signal is limited to the range of 0 to V. Therefore, peak signal power is $V^2/4$ since the maximum amplitude range is $\pm V/2$.

Let the input signal be uniformly distributed in the range 0 to V, sampled at time-intervals of $1/(2f_p)$ and bandlimited to $f_p$. Then the average signal power S is approximately $V^2/12$.³ Therefore we establish the relation

$$S_{peak} = 3S \qquad (5.3)$$

If the samples are independent of one another, the power spectrum of the input will appear almost white for long time intervals. Its power spectral density is

$$\frac{V^2}{12 f_p} \qquad (5.4)$$

Substituting $S_{peak}$ into (5.2) we get

$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\, S/N\right) \qquad (5.5)$$

This is the equation for the capacity of a system subject to a peak power constraint, $S_{peak}$.

¹ Interestingly, a current signal does not generally suffer the same constraint and might be more appropriately modeled using an average power constraint.

² Sample points are assumed to be taken at the Nyquist rate, i.e. every $1/(2f_p)$ seconds.

³ We consider the mean-value of the input signal V/2 to carry no information.

The dynamic range is defined as the maximum possible signal-to-noise ratio. Therefore, we can write (5.5) in the form

$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\, DR\right) \qquad (5.6)$$

This equation neglects an important facet about the input signal. While it is uniformly distributed in the range 0 to V at the sample point, it will not be uniformly distributed outside the sample points. More importantly, it is not constrained to lie in the region 0 to V volts.⁴
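The three capacity expressions of this section can be written as small functions, which makes the relation between the average-power and peak-power forms explicit. This is a sketch with names of our choosing; the bandwidth and signal-to-noise values in the usage lines are arbitrary examples, not values from the text.

```python
import math


def capacity_average_power(f_p, snr):
    """Hartley-Shannon law (5.1): average-power constraint."""
    return f_p * math.log2(1.0 + snr)


def capacity_peak_power(f_p, s_peak_over_n):
    """Peak-power form (5.2), with S_peak/N as the argument."""
    return f_p * math.log2(1.0 + 2.0 * s_peak_over_n / (math.pi * math.e))


def capacity_from_dynamic_range(f_p, dr):
    """Form (5.6), obtained from (5.2) with S_peak = 3S."""
    return f_p * math.log2(1.0 + 6.0 * dr / (math.pi * math.e))


f_p, snr = 1000.0, 1000.0   # arbitrary example: 1 kHz bandwidth, 30 dB S/N
c_avg = capacity_average_power(f_p, snr)
c_peak = capacity_peak_power(f_p, 3.0 * snr)
c_dr = capacity_from_dynamic_range(f_p, snr)

print(f"average-power capacity: {c_avg:.0f} bits/s")
print(f"peak-power capacity:    {c_dr:.0f} bits/s")
```

Since $6/(\pi e)$ is less than one, the peak-constrained capacity is always somewhat below the average-power capacity at the same S/N.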

5.2 Four Signal Representations

There are four types of signal representations that we distinguish in this work,

ranging from what is commonly known as analog to what is commonly known

as digital. They are: continuous-value continuous-time (CVCT), continuous-value

discrete-time (CVDT), discrete-value continuous-time (DVCT), and discrete-value

discrete-time (DVDT).

Examples of these four circuits are given in Fig. 5.1. These circuits implement, as near as possible, the same function: a simple delay. They are the subject of study for the remainder of this chapter. In Fig. 5.1(a), a first-order lowpass filter is an example of a CVCT delay. We shall also refer to this circuit-type as analog. Fig. 5.1(b) shows a CVDT delay function. This circuit type is often called switched-capacitor, or, more simply, switched-cap. A delay function implemented in a DVCT circuit is given in Fig. 5.1(c). It consists of an RC circuit with a comparator. We will also refer to this type as time-domain.⁵ In this work we assume binary-valued signaling. The last type of circuit is DVDT, as shown in Fig. 5.1(d), which is a parallel synchronous register.

⁴ Histograms of randomly-generated input signals indicate that the peak amplitude is approximately two times larger for samples taken at rates much higher than the Nyquist rate.

⁵ We wish to distinguish between time-domain circuits and asynchronous logic. In time-domain

Using such a simple computational element, the delay function, we hope to illuminate some of the trends associated with these four circuit techniques. Unfortunately, there is no simple equivalent to a pure delay in CVCT circuits; the lowpass is possibly the fairest approximation. Of the four circuit types, possibly the least well understood is time-domain processing. Interestingly, time-domain processing appears to be an important means of communicating in neural pathways between the cochlea and the brain (Rice et al., 1995).

In the case of the analog and switched-cap circuits in Fig. 5.1, the brick-wall noiseless filter defines the message bandwidth $f_p$. The bandwidth of the DVDT system is set by the sampling rate $f_c$. In the case of the two discrete-time systems, the input signals must be band-limited by an anti-aliasing filter below the Nyquist rate. For the case of the DVCT system, events have an average rate of $f_t$.

5.2.1 Continuous-Value Continuous-Time

The noise spectral density of a nominal resistor of value R (one-sided) is

$$\frac{\overline{V_n^2}}{\Delta f} = 4kTR \qquad (5.7)$$

where k is Boltzmann's constant and T is absolute temperature.

An RC lowpass filter has transfer function

$$H(j2\pi f) \equiv \frac{V_c(j2\pi f)}{V_i(j2\pi f)} = \frac{1}{1 + j2\pi f\tau} \qquad (5.8)$$

where $\tau = RC$. The phase and square magnitude are

$$|H(j2\pi f)|^2 = \frac{1}{1 + (2\pi f\tau)^2} \qquad (5.9)$$

$$\angle H(j2\pi f) = -\arctan(2\pi f\tau) \approx -2\pi f\tau + \frac{(2\pi f\tau)^3}{3} - \frac{(2\pi f\tau)^5}{5} + \ldots$$

circuits the information is contained in the time between events, as opposed to asynchronous logic, where the state of the event is of primary importance.

Figure 5.1: (a) CVCT RC lowpass circuit, (b) CVDT sample-and-hold circuit, (c) DVCT RC delay circuit, and (d) DVDT clocked M-bit delay.

The noise equivalent bandwidth of an RC lowpass filter is found by integrating the square magnitude over the entire frequency range. We obtain

$$ENBW = \int_0^\infty \frac{1}{1 + (2\pi f\tau)^2}\, df = \frac{1}{4\tau} \qquad (5.10)$$

The mean-square noise voltage on the capacitor C is then

$$\overline{V^2_{c,n}} = ENBW \times 4kTR = \frac{kT}{C} \qquad (5.11)$$

The main point to be made here is that the mean-square noise on a capacitor is a function only of the capacitance. We make use of this relation when we consider other circuit types.

For the RC circuit, however, the presence of the capacitor has no effect on the theoretical channel capacity. The reason is that a filtering operation amounts to no more than a coordinate transformation (Shannon, 1948). And because a lowpass filter is not absolutely bandlimiting, a sufficiently complex receiver can detect signals which have frequency components above the cutoff frequency. Also note that the white noise produced by the lumped resistor is filtered in exactly the same manner as the input signal. Therefore, the signal-to-noise ratio at each frequency of the output signal is just the same as the ratio of the input signal power to the thermal noise power.

Therefore, we can simplify our analysis considerably. Temporarily removing the capacitor from Fig. 5.1(a), the output noise power is the resistive thermal noise times the message bandwidth $f_p$

$$\overline{V^2_{out,n}} = 4kTRf_p \qquad (5.12)$$

independent of the cutoff frequency. The output signal power will be equal to the input signal power, if the capacitor is removed. Assuming the signal is uniformly distributed in the region 0 to V, that signal power is $V^2/12$, as stated earlier. Therefore, the signal-to-noise ratio is

$$S/N = \frac{V^2}{48kTRf_p} \qquad (5.13)$$

The above equation for signal-to-noise ratio remains valid if the capacitor is re-introduced in Fig. 5.1(a), because its effect on the signal power and the noise power would cancel.

In order to compute the mean power dissipation, we derive the mean-square voltage across the resistor and then divide by the resistance R. The transfer function of the input voltage $V_{in}$ to the voltage across the resistor $V_R$ is

$$H(j2\pi f) \equiv \frac{V_R(j2\pi f)}{V_{in}(j2\pi f)} = \frac{jf/f_o}{1 + jf/f_o} \qquad (5.14)$$

where $f_o = 1/(2\pi RC)$. The power spectrum of the voltage drop across the resistor is found by multiplying the input power spectrum in (5.4) by the square magnitude of the transfer function. We obtain

$$\frac{V^2}{12 f_p}\,\frac{(f/f_o)^2}{1 + (f/f_o)^2} \qquad (5.15)$$

The power dissipated in the resistor is found by integrating over the message bandwidth and dividing by R, as in

$$P_m = \int_0^{f_p} \frac{V^2 (f/f_o)^2}{12 R f_p \left(1 + (f/f_o)^2\right)}\, df = \frac{V^2}{12R}\left[1 - \frac{f_o}{f_p}\arctan\left(\frac{f_p}{f_o}\right)\right] \qquad (5.16)$$

Writing the mean power dissipation as a function of the maximum signal-to-noise ratio, we have

$$P_m = 4kT\, S/N\, f_p \left[1 - \frac{f_o}{f_p}\arctan\left(\frac{f_p}{f_o}\right)\right] \qquad (5.17)$$

Similarly, one can write S/N as a function of $P_m$, as in

$$S/N = \frac{P_m}{4kTf_p}\; \frac{1}{\left[1 - \dfrac{f_o}{f_p}\arctan\left(\dfrac{f_p}{f_o}\right)\right]} \qquad (5.18)$$

If we compare the equation derived for S/N as found in (Hosticka, 1985) with the above equation for the case of $f_p = f_o$, we find that (5.18) yields an S/N which is 6.7 dB higher. The discrepancy is largely found in the manner in which the mean power dissipation was derived. The other author assumes that all of the input signal power is dissipated in the resistor.

In addition, we wish to write the capacity as a function of the dissipated power. Substituting (5.18) into (5.5), the capacity is

$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\, \frac{P_m}{4kTf_p}\; \frac{1}{\left[1 - \dfrac{f_o}{f_p}\arctan\left(\dfrac{f_p}{f_o}\right)\right]}\right) \qquad (5.19)$$

Let W be the power-delay product. The delay in an RC circuit is theoretically a function of frequency. If we suppose that the message bandwidth $f_p$ is much less than $f_o$, then the equivalent delay is $\tau$. One arrives at that conclusion by taking a Taylor Series expansion of the actual phase, as in (5.10), and then truncating after the first term. Note that the phase response of a pure delay $\Delta t$ is $-2\pi f \Delta t$. Assuming that the delay in a lowpass filter is approximately $\tau = 1/(2\pi f_o)$, we get

$$W = \frac{2kT\, S/N}{\pi}\, \frac{f_p}{f_o} \left[1 - \frac{f_o}{f_p}\arctan\left(\frac{f_p}{f_o}\right)\right] \qquad (5.20)$$
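The closed form in (5.16) and the 6.7 dB figure quoted for $f_p = f_o$ can be checked with a few lines of code. The component values below (1.5 V swing, 1 MΩ, 1 kHz) are hypothetical, and the function names are ours. The decibel figure is the reciprocal of the bracketed factor, under the reading that the comparison formula corresponds to the case where all of the input signal power is dissipated in the resistor.

```python
import math


def dissipation_fraction(fp_over_fo):
    """Bracketed factor in (5.16)-(5.18): 1 - (fo/fp) * arctan(fp/fo)."""
    return 1.0 - math.atan(fp_over_fo) / fp_over_fo


def resistor_power_numeric(v, r, f_p, f_o, steps=100_000):
    """Midpoint-rule evaluation of the integral in (5.16)."""
    h = f_p / steps
    total = 0.0
    for i in range(steps):
        x_sq = ((i + 0.5) * h / f_o) ** 2
        total += x_sq / (1.0 + x_sq) * h
    return v * v / (12.0 * r * f_p) * total


# Hypothetical component values, for illustration only.
v, r, f_p, f_o = 1.5, 1.0e6, 1.0e3, 1.0e3
numeric = resistor_power_numeric(v, r, f_p, f_o)
closed_form = v * v / (12.0 * r) * dissipation_fraction(f_p / f_o)

# S/N improvement at fixed power relative to dissipating all signal power in R.
advantage_db = 10.0 * math.log10(1.0 / dissipation_fraction(1.0))

print(f"P_m numeric = {numeric:.4e} W, closed form = {closed_form:.4e} W")
print(f"S/N advantage at fp = fo: {advantage_db:.1f} dB")
```

For $f_p = f_o$ the bracket equals $1 - \pi/4$, so only about a fifth of the input signal power is dissipated in the resistor.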

5.2.2 Continuous-Value Discrete-Time

The CVDT circuit to be analyzed is the sample-and-hold circuit of Fig. 5.1(b). The input signal is band-limited by an anti-aliasing filter to frequency $f_s/2$, where $f_s$ is the sampling rate. It is assumed that the switch is closed for a period much longer than the correlation time $R_{sw}C/2$. But since this is the fundamental requirement for a complete charge transfer anyway, it can be used quite safely. On the other hand, because $R_{sw}C/2$ is small, the equivalent noise bandwidth necessarily exceeds the Nyquist rate and the noise is aliased.

The thermal noise of the MOSFET switch is aliased by the sampling process into the baseband (0 to $f_s/2$), where $f_s$ is the sampling frequency. In order to show this, we must convolve the noise spectrum at node $V_c$ with a pulse train at frequencies $kf_s$, where $k = 0, \pm 1, \pm 2, \ldots$. At node $V_c$, the shape of the noise spectrum is given by (5.10), where $\tau$ is replaced by $R_{sw}C$. To get the noise shape at node $V_d$, we compute

$$\frac{1}{1 + (2\pi f\tau)^2} * \sum_{k=-\infty}^{\infty} \delta(f - kf_s) = \sum_{k=-\infty}^{\infty} \frac{1}{1 + (2\pi [f - kf_s]\tau)^2} \qquad (5.21)$$

From the summation, we see that the noise which is outside the baseband is aliased into the baseband, and that, in fact, none of the noise escapes this aliasing. Since the total noise on a capacitor is given by (5.11), it follows that the noise in the baseband is also given by this equation and that it is almost flat. Thus, the noise spectrum at node $V_d$ is

$$\frac{\overline{V_n^2}}{\Delta f} = \frac{kT/C}{f_s/2} = \frac{2kT}{Cf_s} \qquad (5.22)$$

Since the output node $V_{out}$ is a filtered version of $V_d$ at frequency $f_p$, the noise power at the output is

$$\overline{V^2_{out,n}} = \frac{2kTf_p}{Cf_s} \qquad (5.23)$$

As before, we shall assume that the input signal is bandlimited to $f_p$ and is almost white, in such a way that the peak power $S_{peak}$ is never exceeded at any sample point, spaced $1/(2f_p)$ apart. Note that if we sample at a rate $f_s$ which is greater than the Nyquist rate, we will find that some of the samples will exceed the peak power limitation.

The average input signal power S is approximately $V^2/12$, as stated earlier. Dividing S by N in (5.23) we get

$$S/N = \frac{CV^2 f_s}{24kTf_p} \qquad (5.24)$$

In order to compute the mean power dissipated in the switch, we need to compute the average voltage difference between samples. We perform this computation assuming that the input frequency is not phase-locked to the clock frequency. However, this is a slight violation of the assumption that the peak input power does not exceed $S_{peak}$. For now, we will live with this discrepancy.

Let $V_c(n)$ be the sample at time $n/f_s$ and $V_c(n+1)$ be the next. We want to compute the expected power of the voltage difference $\overline{(\Delta V_c)^2}$, where

$$\begin{aligned}
\overline{(\Delta V_c)^2} &\equiv \overline{(V_c(n) - V_c(n+1))^2} \\
&= \overline{(V_c(n))^2} + \overline{(V_c(n+1))^2} - 2\,\overline{V_c(n)V_c(n+1)}
\end{aligned} \tag{5.25}$$

The first two terms are equal to the signal power. The last term is the input signal autocorrelation function sampled at $1/f_s$.

Now, the input signal is approximately white, with average power $V^2/12$. Therefore, the autocorrelation function $R(\tau)$ is a sinc function, as in,

$$R(\tau) = \frac{V^2}{12}\,\frac{\sin(2\pi f_p\tau)}{2\pi f_p\tau} \tag{5.26}$$

Substituting $\tau = 1/f_s$, we get the average voltage difference power as

$$\overline{(\Delta V_c)^2} = \frac{V^2}{6}\left(1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s}\right) \tag{5.27}$$

When a capacitor is charged from 0 to $\Delta V_c$ Volts, the energy stored on the capacitor is $C(\Delta V_c)^2/2$ Joules. That same amount of energy gets dissipated in the switch, no matter how small the switch resistance (see the derivation leading to (5.42)). Similarly, when the capacitor is discharged from $\Delta V_c$ to 0, $C(\Delta V_c)^2/2$ Joules are dissipated in the switch. The average power dissipated in switching events occurring at a rate $f_s$ will be

$$P_m = \frac{C\,\overline{(\Delta V_c)^2}\,f_s}{2} = \frac{CV^2f_s}{12}\left(1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s}\right) \tag{5.28}$$

We can write the mean power dissipation in the channel, i.e. in the switch, as a function of $S/N$ as follows:

$$P_m = 2kT\,(S/N)\,f_p\left(1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s}\right) \tag{5.29}$$

The power-delay product is equal to $P_m$ times $1/f_s$, yielding

$$W = \frac{2kTf_p\,(S/N)}{f_s}\left(1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s}\right) \tag{5.30}$$

The signal-to-noise ratio can be written as

$$S/N = \frac{P_m}{2kTf_p}\,\frac{1}{1 - \dfrac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s}} \tag{5.31}$$

If we compare the equation for $S/N$ as a function of power in (5.31) to that obtained in (Hosticka, 1985) for the case that $f_p = 0.5f_s$, the equation derived here is 6 dB higher. Again, the main reason for the discrepancy lies in the manner in which the other author computes the mean power dissipation.
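The relation between mean power and signal-to-noise ratio in (5.29) and (5.31) can be sketched numerically as follows. The function names and the temperature default are assumptions made for illustration only. At $f_p = 0.5f_s$ the sinc term vanishes, so the bracketed factor is one and $P_m = 2kT\,(S/N)\,f_p$.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant, J/K


def sinc_factor(f_p, f_s):
    """The factor 1 - sin(x)/x with x = 2*pi*f_p/f_s used in (5.27)-(5.31)."""
    x = 2.0 * math.pi * f_p / f_s
    return 1.0 - math.sin(x) / x


def cvdt_mean_power(snr, f_p, f_s, temp=300.0):
    """Mean switch power for a given S/N, following (5.29)."""
    return 2.0 * K_BOLTZMANN * temp * snr * f_p * sinc_factor(f_p, f_s)


def cvdt_snr_from_power(p_m, f_p, f_s, temp=300.0):
    """S/N for a given mean power, following (5.31)."""
    return p_m / (2.0 * K_BOLTZMANN * temp * f_p) / sinc_factor(f_p, f_s)


# Assumed example: 60 dB signal-to-noise ratio at f_p = 0.5 f_s = 50 MHz.
p_example = cvdt_mean_power(1e6, f_p=50e6, f_s=100e6)
```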

The capacity can be expressed as

$$C = f_p\log_2\left(1 + \frac{6}{\pi e}\,\frac{P_m}{2kTf_p}\,\frac{1}{1 - \dfrac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s}}\right) \tag{5.32}$$

5.2.3 Discrete-Value Discrete-Time Circuit

For the case of the parallel $M$-bit register of Fig. 5.1(d), the signal-to-noise ratio is a function of the number of bits used. The input samples must be quantized to fit the finite register length $M$; herein lies the major source of noise.

Assume that the quantization noise $Q_n$ is uniformly distributed between the two nearest quantization steps. Let the distance between quantization steps be 1 bit, or one count. Then the average quantization noise power is $1/12$ square bits.

The signal power $S$ is also computed assuming a uniformly distributed input signal with i.i.d. samples. With $2^M$ levels, ranging from 0 to $2^M - 1$, we have

$$\begin{aligned}
S &= \sum_{k=1}^{2^M}\frac{1}{2^M}k^2 - \left(\sum_{k=1}^{2^M}\frac{1}{2^M}k\right)^2 \\
&= \frac{2^M(2^M + 1)(2\cdot 2^M + 1)}{6\cdot 2^M} - \left(\frac{2^M(2^M + 1)}{2\cdot 2^M}\right)^2 \\
&= \frac{2^{2M} - 1}{12}
\end{aligned} \tag{5.33}$$

Combining the last two results, the $S/N$ ratio is

$$S/N = 2^{2M} - 1 \tag{5.34}$$

The above computations assume fixed-point, rather than floating-point arithmetic.
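The result in (5.34) can be checked with exact rational arithmetic. The following sketch (function names are hypothetical) computes the variance of $2^M$ equally likely integer levels and divides it by the quantization noise power of $1/12$.

```python
from fractions import Fraction


def uniform_signal_power(m_bits):
    """Variance of 2**m_bits equally likely levels 0 .. 2**m_bits - 1."""
    levels = 2 ** m_bits
    mean_square = Fraction(sum(k * k for k in range(levels)), levels)
    mean = Fraction(sum(range(levels)), levels)
    return mean_square - mean ** 2


def quantization_snr(m_bits):
    """Signal power divided by the quantization noise power of 1/12."""
    return uniform_signal_power(m_bits) / Fraction(1, 12)


# For an 8-bit register the ratio equals 2**16 - 1.
snr_8_bits = quantization_snr(8)
```

Shifting the levels from $1 \ldots 2^M$ to $0 \ldots 2^M - 1$ changes the mean but not the variance, so both index ranges give the same signal power.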

If $f_s$ now represents the signaling rate at which the entire $M$-bit message is transmitted, the digital system capacity is

$$C = Mf_s \tag{5.35}$$

The above equation is correct if we only consider the error introduced by quantization noise. Should there be any additive noise in the channel, the binary transmission will exhibit a certain bit error rate $P_e$. If we tried to reconstruct an analog waveform from the received digital signal, we would find the resulting $S/N$ ratio to be lower.

Suppose that the probability of a single bit error is $P_e$ and that the probability of more than one error occurring in a single $M$-bit transmission is negligible. The expected square distance between the sent and received digital signal, $\overline{D^2}$, is

$$\begin{aligned}
\overline{D^2} &= P_e 1^2 + P_e 2^2 + \cdots + P_e(2^{M-2})^2 + P_e(2^{M-1})^2 \\
&= \sum_{k=0}^{M-1} P_e(2^k)^2 = P_e\,\frac{2^{2M} - 1}{3}
\end{aligned} \tag{5.36}$$

If we now sum the noise contributions of the quantization step and the distortion introduced by the $M$-bit register, the total "noise" is $1/12 + P_e(2^{2M} - 1)/3$. Assuming the signal power to be approximately unchanged, we have

$$S/N = \frac{2^{2M} - 1}{1 + 4P_e(2^{2M} - 1)} \tag{5.37}$$

In order for the distortion introduced by the $M$-bit register to be negligible, we must satisfy the condition $4P_e(2^{2M} - 1) \ll 1$.
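The geometric sum in (5.36) and the combined ratio in (5.37) can be verified with exact fractions. This is a minimal sketch with hypothetical function names; the bit error probability used in the example is an arbitrary assumed value.

```python
from fractions import Fraction


def bit_error_distortion(m_bits, p_e):
    """Expected squared error from single-bit errors, the sum in (5.36)."""
    return sum(p_e * (2 ** k) ** 2 for k in range(m_bits))


def dvdt_snr(m_bits, p_e):
    """Signal-to-noise ratio with quantization noise and bit errors, (5.37)."""
    full_scale = 2 ** (2 * m_bits) - 1
    return Fraction(full_scale) / (1 + 4 * p_e * full_scale)


# Assumed example: 8 bits and a bit error probability of 1e-6.
p_e = Fraction(1, 10 ** 6)
distortion = bit_error_distortion(8, p_e)
snr = dvdt_snr(8, p_e)
```

With these assumed numbers, $4P_e(2^{2M} - 1) \approx 0.26$, so the bit errors already reduce the ratio noticeably below $2^{2M} - 1$.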

Until now we have followed directly the formulation of Hosticka (Hosticka, 1985) for the DVDT circuit. Presently, we introduce a weighted probability of error $\epsilon$,

$$\epsilon = 4P_e(2^{2M} - 1) \tag{5.38}$$

If $\epsilon$ is held constant, the degradation in $S/N$ ratio due to one processing step, such as an $M$-bit register, is held constant, independent of $M$. In this case the $P_e$ for a single gate must decrease as the number of bits increases.

Now we wish to relate the probability of error $P_e$ to the power consumption in the $M$-bit register. Let us consider a stream of binary symbols with two permissible states with a voltage separation $V$. At the receiver we are interested in knowing whether a pulse of fixed amplitude $V$ is present or not within a certain time interval. Assuming Gaussian noise $\overline{V_n^2}$ which affects both states equally and a detection threshold set to $V/2$, the bit error rate is

$$\begin{aligned}
P_e &= \int_{V/2}^{\infty}\frac{1}{\sqrt{2\pi\overline{V_n^2}}}\exp\left(\frac{-x^2}{2\overline{V_n^2}}\right)dx \\
&= \mathrm{erfc}\left(\frac{V}{2\sqrt{\overline{V_n^2}}}\right)
\end{aligned} \tag{5.39}$$

where $\mathrm{erfc}(x) \equiv \frac{1}{\sqrt{2\pi}}\int_x^{\infty}\exp(-y^2/2)\,dy$.$^6$

Consider a logic gate consuming no quiescent current and calculate the energy in an elementary switching event $W_g$.

$$W_g = \int_0^{\infty} I_{sw}(t)V_{sw}(t)\,dt \tag{5.40}$$

where $I_{sw}$ and $V_{sw}$ are the current through and voltage across the active switch, respectively. Let $R_{sw}$ be the small but finite resistance across the active switch and $C_g$ be the parasitic capacitance of the gate of the next logic gate. Then the equations for charging the capacitor voltage from 0 to $V$ Volts are

$$\begin{aligned}
I_{sw}(t) &= \frac{V}{R_{sw}}\exp(-t/\tau_{sw}) \\
V_{sw}(t) &= V\exp(-t/\tau_{sw})
\end{aligned} \tag{5.41}$$

where $\tau_{sw} = R_{sw}C_g$. Plugging into the equation for $W_g$ and integrating we find

$$W_g = \frac{C_gV^2}{2} \tag{5.42}$$

$^6$ Note that the complementary error function is sometimes defined in a different manner.

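The noise power in a single gate is $\overline{V_c^2} = kT/C_g$. Therefore, we have

$$P_e = \mathrm{erfc}\left(\left(\frac{W_g}{2kT}\right)^{1/2}\right) \tag{5.43}$$

Supposing that the inverse to the complementary error function exists, we have

$$W_g = 2kT\left[\mathrm{erfc}^{-1}\left(\frac{\epsilon}{4(2^{2M} - 1)}\right)\right]^2 \tag{5.44}$$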

If there are $M$ bits in the register, and the clock is operating at the Nyquist rate, then on average half of them will be switching and half will remain unchanged. In this case we have

$$\begin{aligned}
P_m &= \frac{MW_g}{2}f_s \\
&= kTMf_s\left[\mathrm{erfc}^{-1}\left(\frac{\epsilon}{4(2^{2M} - 1)}\right)\right]^2
\end{aligned} \tag{5.45}$$

where $f_s$ is the sampling rate.

Now the number of bits $M$ relates to the $S/N$ ratio of an analog signal by the equation

$$M = \frac{1}{2}\log_2(1 + S/N) \tag{5.46}$$

Substituting (5.46) into (5.45) we obtain

$$P_m = \frac{kTf_s}{2}\log_2(1 + S/N)\left[\mathrm{erfc}^{-1}\left(\frac{\epsilon}{4\,S/N}\right)\right]^2 \tag{5.47}$$

Solving for $S/N$ as a function of $P_m$ appears difficult, if not impossible. Therefore, we will use numerical techniques to solve the above equation for a particular $P_m$.
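One possible numerical approach, shown here only as a sketch and not necessarily the technique used to produce the figures, is bisection on the (real-valued) number of bits $M$ in (5.45), followed by $S/N = 2^{2M} - 1$. The function names, search interval and temperature are assumptions. Because the text defines erfc as the Gaussian tail integral, its inverse is obtained here from the inverse normal distribution function in the Python standard library.

```python
from statistics import NormalDist

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant, J/K


def tail_inverse(p):
    """Inverse of the Gaussian tail integral that the text calls erfc."""
    return -NormalDist().inv_cdf(p)


def dvdt_power(m_bits, f_s, eps, temp=300.0):
    """Mean power of the M-bit register, following (5.45)."""
    p_e = eps / (4.0 * (2.0 ** (2.0 * m_bits) - 1.0))
    return K_BOLTZMANN * temp * m_bits * f_s * tail_inverse(p_e) ** 2


def dvdt_snr_for_power(p_m, f_s, eps, temp=300.0, lo=0.5, hi=32.0):
    """Solve (5.45) for M by bisection and return S/N = 2**(2M) - 1."""
    if not dvdt_power(lo, f_s, eps, temp) <= p_m <= dvdt_power(hi, f_s, eps, temp):
        raise ValueError("power outside the bracketed search interval")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # The power grows monotonically with M, so keep the half
        # of the interval that still brackets the target power.
        if dvdt_power(mid, f_s, eps, temp) < p_m:
            lo = mid
        else:
            hi = mid
    m_bits = 0.5 * (lo + hi)
    return 2.0 ** (2.0 * m_bits) - 1.0


# Assumed example: f_s = 100 MHz and a weighted error probability of 1e-12.
p_8_bits = dvdt_power(8.0, 100e6, 1e-12)
snr_recovered = dvdt_snr_for_power(p_8_bits, 100e6, 1e-12)
```

Bisection is sufficient here because the right-hand side of (5.45) increases monotonically with $M$ when the weighted probability of error is held fixed.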

Similarly, we can write an equation for $P_m$ as a function of the capacity, whereas the inverse looks very difficult. Substituting $M = C/f_s$, we get

$$P_m = kTC\left[\mathrm{erfc}^{-1}\left(\frac{\epsilon}{4(2^{2C/f_s} - 1)}\right)\right]^2 \tag{5.48}$$

The energy dissipated per $M$-bit digital transmission is just $P_m/f_s$, which can be computed easily from the above equations. The lower bound on the amount of energy dissipated in a digital gate was derived by Landauer. He estimates that dissipation of the order $kT$ per logic step is required owing to thermodynamic limits (Landauer, 1961). How close we operate to the thermodynamic limit will directly influence the probability of error.

5.2.4 Discrete-Value Continuous-Time

Since the DVCT circuit of Fig. 5.1(c) was not discussed by Hosticka (Hosticka,

1985), care must be taken in order to adequately formulate the problem. Suppose

there is a binary source. At certain instants in time, the source changes state,

from either 0 to 1, or 1 to 0. The two important pieces of information are the

time between transitions and the direction of the transition. This type of signal

representation is generally referred to as zero-crossings. Logan's theorem states

that if a signal is strictly bandlimited to within one octave, then a signal can be

completely reconstructed from its zero-crossings to within a constant. Therefore,

we have included a noiseless single-octave bandpass filter at the input in Fig. 5.1(c).

In the circuit realization, suppose the source begins in state 0, outputting 0 Volts. When the first event occurs, it switches state, outputting $V$ Volts. We add two constraints to the source. The first is that the minimum time between discrete events is $t_{min}$ seconds. If we like, we can set $t_{min}$ to 0. The second constraint is that the average time between transitions is $t_{av}$.

We must answer two questions. What is the effect of the noise in the resistor on the arrival time of the transition? In other words, what is the jitter? Also, what is the distribution of the source which maximizes the entropy over such a channel, that is, what is the capacity of this channel?

The major theoretical results of the DVCT channel are found in section 5.4.

They can be summarized as follows:

1. The jitter, or noise in the DVCT circuit can be approximated as Gaussian,

under the constraint V=� >> kT=C.

2. The maximum information rate, measured in bits/second, of any DVCT source under an average power constraint is achieved by a source for which the time between transitions follows an exponential distribution.

3. Using the entropy-power inequality, a lower bound on the capacity of the

DVCT circuit is derived which becomes tight as the noise power is reduced.

Not coincidentally, the capacity lower bound is reached for the DVCT source

which follows an exponential distribution.

The energy dissipation per transition in the DVCT circuit is

$$W = \frac{CV^2}{2} \tag{5.49}$$

as derived for the digital circuit, provided that the transition is complete. The power dissipation is $W$ divided by the average time between transitions, or

$$P_m = \frac{CV^2}{2t_{av}} \tag{5.50}$$

where $t_{av}$ is the average length of time between transitions.

The signal power is the variance of the input distribution. In order to approach the lower bound on the capacity of the DVCT circuit, the input must follow an exponential distribution. Its signal power is

$$S = \int_{t_{min}}^{\infty} x^2p_X(x)\,dx - \left(\int_{t_{min}}^{\infty} xp_X(x)\,dx\right)^2 = (t_{av} - t_{min})^2 \tag{5.51}$$

The noise power is just two times $\overline{t_{d,n}^2}$, as derived in (5.62) in section 5.4, or

$$N = \sigma^2 = \frac{8kT\tau^2}{CV^2} \tag{5.52}$$

So the signal-to-noise ratio is

$$S/N = \frac{(t_{av} - t_{min})^2}{\sigma^2} = \frac{(t_{av} - t_{min})^2CV^2}{8kT\tau^2} \tag{5.53}$$

Also from section 5.4, a lower bound on the capacity of the DVCT circuit is given by

$$C'' = \frac{1}{2t_{av}}\log_2\left(1 + \frac{e}{2\pi}\,S/N\right) \tag{5.54}$$

The signal-to-noise ratio can also be written as a function of the power dissipation. In this case, we have

$$S/N = \frac{(t_{av} - t_{min})^2t_{av}P_m}{4kT\tau^2} \tag{5.55}$$

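At this point it is more convenient to convert all units of seconds into frequencies. Let $f_{av} = 1/t_{av}$, $f_o = 1/(2\pi\tau)$, and $f_{max} = 1/t_{min}$. Then we have

$$S/N = \frac{(f_{max} - f_{av})^2f_o^2\pi^2P_m}{kTf_{max}^2f_{av}^3} \tag{5.56}$$

Substituting this expression into that of the approximate capacity, we obtain

$$C'' = \frac{f_{av}}{2}\log_2\left(1 + \frac{e}{2\pi}\,\frac{(f_{max} - f_{av})^2f_o^2\pi^2P_m}{kTf_{max}^2f_{av}^3}\right) \tag{5.57}$$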
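The conversion from time to frequency variables can be cross-checked numerically. The sketch below evaluates the signal-to-noise ratio in both forms, (5.55) and (5.56), and then the capacity bound; the factor $e/(2\pi)$ inside the logarithm is taken from the derivation in section 5.4.3. All names and numerical values are illustrative assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant, J/K


def dvct_snr_time(p_m, t_av, t_min, tau, temp=300.0):
    """S/N in terms of time constants, following (5.55)."""
    return (t_av - t_min) ** 2 * t_av * p_m / (4.0 * K_BOLTZMANN * temp * tau ** 2)


def dvct_snr_freq(p_m, f_av, f_max, f_o, temp=300.0):
    """S/N in terms of frequencies, following (5.56)."""
    numerator = (f_max - f_av) ** 2 * f_o ** 2 * math.pi ** 2 * p_m
    return numerator / (K_BOLTZMANN * temp * f_max ** 2 * f_av ** 3)


def dvct_capacity_bound(p_m, f_av, f_max, f_o, temp=300.0):
    """Approximate capacity in bits/second, assuming the e/(2*pi) factor."""
    snr = dvct_snr_freq(p_m, f_av, f_max, f_o, temp)
    return 0.5 * f_av * math.log2(1.0 + math.e / (2.0 * math.pi) * snr)


# Assumed example: tau = 1 ns, t_min = 4 tau, t_av = 20 ns, P_m = 1 nW.
tau, t_min, t_av, p_m = 1e-9, 4e-9, 20e-9, 1e-9
f_o, f_max, f_av = 1.0 / (2.0 * math.pi * tau), 1.0 / t_min, 1.0 / t_av
snr_t = dvct_snr_time(p_m, t_av, t_min, tau)
snr_f = dvct_snr_freq(p_m, f_av, f_max, f_o)
capacity = dvct_capacity_bound(p_m, f_av, f_max, f_o)
```

The two signal-to-noise expressions agree to rounding error, which confirms that the substitution $f_{av} = 1/t_{av}$, $f_o = 1/(2\pi\tau)$, $f_{max} = 1/t_{min}$ has been applied consistently.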

One restriction to the above analysis is that it does not take into account the

possibility of spurious transitions, i.e. the noise signal never has a magnitude equal

to V=2. If one wishes to operate at extremely low, or even negative signal-to-noise

ratios, the above formulation must be augmented to account for the possibility of

spurious transitions. We believe that they can be treated much the same way as a

digital error, except that the probability of error must be integrated over the time

interval.

5.3 Graphical Results

In order to compare the formulation of Hosticka with the one presented in this Chapter, we have duplicated two of the most significant graphs from the work of Hosticka (Hosticka, 1985). We include the parameters chosen in his analysis, and wherever possible have attempted to choose similar conditions in the present reformulation. A notable exception is the choice of $\alpha$, where $\alpha$ is defined by the relation (Hosticka, 1985)

$$P_e = 2^{-2\alpha M} \tag{5.58}$$

In displaying his results, Hosticka chooses $\alpha = 100$, which, for a modest 4-bit register, results in a probability of error for a single gate equal to 150E−243. That value does not seem realistic for state-of-the-art digital circuit design.

On the other hand, we have selected probabilities of error on the order of 1E−14, which seems more appropriate for digital technology. The parameter $\alpha$ of Hosticka can be directly related to the weighted probability of error $\epsilon$ introduced in our work via the equation

$$\epsilon = 2^{-2\alpha M + 2}(2^{2M} - 1) \tag{5.59}$$

[Plot: signal-to-noise ratio versus mean power $P_m$ (W).]

Figure 5.2: Signal-to-noise ratio as a function of mean power. Results from (Hosticka, 1985). CVCT Solid ($f_p = 100$ MHz), CVDT Dashed ($2f_p = f_s = 100$ MHz), and DVDT ($\alpha = 100$, $2f_p = f_s = 100$ MHz) Number of bits.

[Plot: system capacity versus mean power $P_m$ (W).]

Figure 5.3: System capacity as a function of mean power dissipation. Results from (Hosticka, 1985). CVCT Solid ($f_p = 100$ MHz), CVDT Dashed ($2f_p = f_s = 100$ MHz), and DVDT ($\alpha = 100$, $2f_p = f_s = 100$ MHz) Number of bits.

5.4 Detailed Analysis of the DVCT Circuit

We have reserved until now some of the detailed analysis of the DVCT circuit

under question. The three main results follow.

5.4.1 Jitter in a DVCT Channel

Assume all nodes in the channel are at 0 V. Let the input source transition from 0 to $V$ at time $t = 0$. Assuming a noiseless channel, the voltage at the node of the capacitance is then $V_c(t) = V(1 - \exp(-t/\tau))$. However, the resistor $R$ introduces noise. A lumped resistor can be viewed as a white Gaussian voltage source in series with a noiseless resistor. Let $V_{c,n}$ be the noise at node $V_c$. Its spectral shape is that of a lowpass filter, while its mean-square value is $kT/C$. Using the principle of superposition, we sum the effect of the two sources together to obtain

$$V_c(t) = V\left(1 - \exp\left(-\frac{t}{\tau}\right)\right) + V_{c,n} \tag{5.60}$$

where $\tau = RC$.

[Plot: signal-to-noise ratio versus mean power $P_m$ (W).]

Figure 5.4: Re-formulated signal-to-noise ratio as a function of mean power ($f_p = 0.5f_s = 100$ MHz, $\epsilon =$ 1E−12). CVCT Solid, CVDT Dashed, DVDT Number of bits.

Let the comparator have a threshold $V/2$. To solve for $t_d$, the time at which the comparator makes a transition, equate (5.60) with the threshold. We find

$$t_d = \tau\left[\ln 2 - \ln\left(1 + \frac{2V_{c,n}}{V}\right)\right] \tag{5.61}$$

Using the approximation $\ln(1 + x) \approx x$ for small $x$, we find that the average value of $t_d$ is $\tau\ln 2$, while the mean-square noise of $t_{d,n}$ is approximately

$$\overline{t_{d,n}^2} \approx \frac{4\overline{V_{c,n}^2}\,\tau^2}{V^2} = \frac{4kT\tau^2}{CV^2} \tag{5.62}$$

Note that for this analysis to hold, one must wait at least $4\tau$ for the voltage $V_c$ to settle to either its low or high state before the source makes the next transition. Thus, $t_{min} = 4\tau$. If the symbols we wish to communicate are the time intervals between transitions, the noise power will then be twice $\overline{t_{d,n}^2}$.
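The small-noise approximation behind (5.62) can be examined with a simple Monte Carlo sketch. It draws one Gaussian noise sample of variance $kT/C$ per transition, evaluates the exact crossing time of (5.61), and compares the sample variance with the prediction. The component values, trial count and seed are assumptions chosen for illustration, and treating the noise as a single static sample per transition is a simplification.

```python
import math
import random

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant, J/K


def simulate_crossing_times(cap, v_step, tau, temp=300.0, trials=200_000, seed=1):
    """Return the sample mean and variance of t_d from (5.61)."""
    rng = random.Random(seed)
    sigma_v = math.sqrt(K_BOLTZMANN * temp / cap)  # rms noise voltage on C
    times = [
        tau * (math.log(2.0) - math.log1p(2.0 * rng.gauss(0.0, sigma_v) / v_step))
        for _ in range(trials)
    ]
    mean = sum(times) / trials
    variance = sum((t - mean) ** 2 for t in times) / trials
    return mean, variance


def predicted_jitter(cap, v_step, tau, temp=300.0):
    """Mean-square timing noise predicted by (5.62)."""
    return 4.0 * K_BOLTZMANN * temp * tau ** 2 / (cap * v_step ** 2)


# Assumed example: C = 1 fF, V = 1 V, tau = 1 ns.
mean_td, var_td = simulate_crossing_times(cap=1e-15, v_step=1.0, tau=1e-9)
var_predicted = predicted_jitter(cap=1e-15, v_step=1.0, tau=1e-9)
```

For these assumed values the rms noise voltage is about 2 mV, so $2V_{c,n}/V$ is small and the linearized estimate should track the simulated variance closely.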

[Plot: system capacity versus mean power $P_m$ (W).]

Figure 5.5: Re-formulated system capacity as a function of mean power ($f_p = 0.5f_s = 100$ MHz, $\epsilon =$ 1E−12). CVCT Solid, CVDT Dashed, DVDT Number of bits.

5.4.2 Differential Entropy for a DVCT Source

A standard definition of differential entropy of a random variable $X$ can be rendered in units of bits/symbol, as in

$$h_2(x) = -\int_{-\infty}^{\infty} p_X(x)\log_2 p_X(x)\,dx \tag{5.63}$$

where $p_X(x)$ is the probability distribution of $X$. On the other hand, we prefer the units of differential entropy to be in bits/second. In a discrete-time channel the conversion is simple: divide $h_2(x)$ by the time each symbol occupies the channel, which is the reciprocal of the clock rate. For a discrete-value continuous-time source, the time occupied by each symbol is the symbol.

Suppose there is a discrete-value continuous-time source which can take one of two possible values, 0 or $V$. The symbols that the source produces are the time between successive transitions. Each symbol occupies the channel for the length of that symbol. Thus the average time each symbol occupies the channel is just the average length of each symbol. Therefore, to convert $h_2(x)$ to units of bits/second, we divide it by the average length of each symbol. Let $\tilde{h}_2(x)$ denote the differential entropy of a DVCT source in units of bits/second, where

$$\tilde{h}_2(x) = \frac{-\int_0^{\infty} p_X(x)\log_2 p_X(x)\,dx}{\int_0^{\infty} xp_X(x)\,dx} \tag{5.64}$$

We impose two physical constraints on our DVCT source. These constraints actually arise from considerations of the DVCT channel. The first is that transitions cannot occur less than $t_{min}$ seconds apart. If later we choose to relax this constraint, we can set $t_{min} = 0$. The second constraint is that the average number of transitions per unit time is constant. Each transition consumes a certain amount of energy to charge or discharge the gate capacitance. Thus, fixing the average number of transitions per unit time is equivalent to fixing the average power dissipation in the channel. Because the symbols are the time between transitions, the average number of transitions per unit time is just the reciprocal of the average symbol length. Let $t_{av}$ be the average symbol length. Then

$$t_{av} = \int_{t_{min}}^{\infty} xp_X(x)\,dx \tag{5.65}$$

where $p_X(x)$ is a probability density in the range $t_{min}$ to $\infty$.

Combining the constraints with the definition of differential entropy in units of bits/second, we now pose our problem as

$$\max_{p_X(x)}\tilde{h}_2(x) = \max_{p_X(x)}\frac{-\int_{t_{min}}^{\infty} p_X(x)\log_2 p_X(x)\,dx}{\int_{t_{min}}^{\infty} xp_X(x)\,dx} \tag{5.66}$$

Now we prove that the probability distribution which maximizes the differential entropy under these constraints is exponential of the following form:

$$p_X(x) = \frac{1}{t_{av} - t_{min}}\exp\left(-\frac{x - t_{min}}{t_{av} - t_{min}}\right) \tag{5.67}$$

It satisfies the constraint (5.65).

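Let $X$ be a random variable with probability distribution $p_X(x)$, and $Y$ be a second random variable that follows any probability distribution $q_Y(x)$ in the region $(t_{min}, \infty)$ and which also satisfies the constraint (5.65). We will compute the differential entropy of $X$ and then the differential entropy of $Y$. We then show that $\tilde{h}_2(y) - \tilde{h}_2(x) \le 0$, with equality if and only if $q_Y(x) = p_X(x)$ at every value of $x$.

$$\begin{aligned}
\tilde{h}_2(x) &= \frac{-\int_{t_{min}}^{\infty} p_X(x)\log_2 p_X(x)\,dx}{\int_{t_{min}}^{\infty} xp_X(x)\,dx} \\
&= \frac{1}{t_{av}}\int_{t_{min}}^{\infty} p_X(x)\log_2(1/p_X(x))\,dx \\
&= \frac{1}{t_{av}}\int_{t_{min}}^{\infty} p_X(x)\,\frac{\ln 2\,\log_2(t_{av} - t_{min}) + (x - t_{min})/(t_{av} - t_{min})}{\ln 2}\,dx \\
&= \frac{\log_2(t_{av} - t_{min}) + 1/\ln 2}{t_{av}} \\
&= \frac{\log_2 e(t_{av} - t_{min})}{t_{av}}
\end{aligned} \tag{5.68}$$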

It is easily verified using the same approach as above that

$$\tilde{h}_2(x) = \frac{\log_2 e(t_{av} - t_{min})}{t_{av}} = \frac{-\int_{t_{min}}^{\infty} q_Y(x)\log_2 p_X(x)\,dx}{\int_{t_{min}}^{\infty} xq_Y(x)\,dx} \tag{5.69}$$

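Computing the difference of the differential entropies, we have

$$\begin{aligned}
\tilde{h}_2(y) - \tilde{h}_2(x) &= \frac{-\int_{t_{min}}^{\infty} q_Y(x)\log_2 q_Y(x)\,dx}{\int_{t_{min}}^{\infty} xq_Y(x)\,dx} - \frac{-\int_{t_{min}}^{\infty} q_Y(x)\log_2 p_X(x)\,dx}{\int_{t_{min}}^{\infty} xq_Y(x)\,dx} \\
&= \frac{\int_{t_{min}}^{\infty} q_Y(x)\log_2(1/q_Y(x))\,dx + \int_{t_{min}}^{\infty} q_Y(x)\log_2 p_X(x)\,dx}{t_{av}} \\
&= \frac{\int_{t_{min}}^{\infty} q_Y(x)\log_2(p_X(x)/q_Y(x))\,dx}{t_{av}} \\
&\le \frac{-\int_{t_{min}}^{\infty} q_Y(x)(1 - p_X(x)/q_Y(x))\,dx}{\ln 2\; t_{av}} = 0
\end{aligned} \tag{5.70}$$

with equality if and only if $q_Y(x) = p_X(x)$.

Therefore the differential entropy of a source, under the constraints that the symbols are larger than $t_{min} \ge 0$ and that the average symbol length is $t_{av} > t_{min}$, is maximized by an exponential distribution.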
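As a small sanity check of this result, the entropy rate of the shifted exponential density can be compared with that of one competing density with the same mean. The sketch below uses a uniform density on $[t_{min},\, 2t_{av} - t_{min}]$, which is an arbitrary choice of competitor made for illustration; the function names are hypothetical.

```python
import math


def exponential_entropy_rate(t_av, t_min):
    """Entropy rate of the shifted exponential source, from (5.68)."""
    return math.log2(math.e * (t_av - t_min)) / t_av


def uniform_entropy_rate(t_av, t_min):
    """Entropy rate of a uniform density on [t_min, 2*t_av - t_min].

    This density also has mean t_av, and its differential entropy is
    log2 of its width.
    """
    width = 2.0 * (t_av - t_min)
    return math.log2(width) / t_av


# Assumed example with times in seconds: t_min = 4 ns, t_av = 20 ns.
rate_exponential = exponential_entropy_rate(20e-9, 4e-9)
rate_uniform = uniform_entropy_rate(20e-9, 4e-9)
rate_gap = rate_exponential - rate_uniform
```

The gap equals $\log_2(e/2)/t_{av}$, which is positive for every admissible $t_{av}$ and $t_{min}$, in agreement with (5.70). The absolute entropy values depend on the choice of time unit, but the difference between the two sources does not.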

5.4.3 Approximate Capacity of DVCT Channel

The capacity of the DVCT system with random input symbol $X$ and random output symbol $Y$ is defined as follows:

$$C = \max_{p_X}\frac{I(X \wedge Y)}{\overline{X}} \tag{5.71}$$

where $I(X \wedge Y)$ is the mutual information between $X$ and $Y$ and $\overline{X}$ is the mean value of $X$.

Let $Y = X + N$, where $N$ is Gaussian noise (not necessarily white) with mean zero and variance $\sigma^2$. Then (Shannon, 1948)

$$\begin{aligned}
I(X \wedge Y) &= h_2(y) - h_2(y|x) \\
&= h_2(y) - h_2(n)
\end{aligned} \tag{5.72}$$

where

$$h_2(n) = \frac{1}{2}\log_2\left(2\pi e\sigma^2\right) \tag{5.73}$$

Let the source have a probability distribution $p_X(x)$. We constrain the source so that its average symbol rate is fixed, as in,

$$t_{av} = \int_0^{\infty} xp_X(x)\,dx \tag{5.74}$$

If we include these properties of the channel into the computation for the channel capacity, we have

$$C = \max_{p_X}\frac{h_2(y) - h_2(n)}{t_{av}} \tag{5.75}$$

Thus, to maximize the capacity of the DVCT channel, we need to maximize the differential entropy of the output signal $Y$ with respect to the probability density of $X$. This maximization is quite difficult to carry out. Therefore, we look for a lower bound that becomes tight as the power of the noise goes to zero.

The entropy-power inequality states that (Blahut, 1987)

$$h_2(y) \ge \frac{1}{2}\log_2\left(2^{2h_2(x)} + 2^{2h_2(n)}\right) \tag{5.76}$$

Re-writing the right-hand-side in order to more easily view the entropy of $X$, we have

$$h_2(y) \ge h_2(x) + \frac{1}{2}\log_2\left(1 + 2^{-2(h_2(x) - h_2(n))}\right) \tag{5.77}$$

Now we are in a position to define an approximate capacity $C' \le C$ for the DVCT channel,

$$C' = \max_{p_X}\frac{h_2(x) - h_2(n) + \frac{1}{2}\log_2\left(1 + 2^{-2(h_2(x) - h_2(n))}\right)}{t_{av}} \tag{5.78}$$

The third term in (5.78) will in general be small compared to $h_2(x) - h_2(n)$; as such, we simplify our computation still further. Let $C'' \le C' \le C$ be defined as

$$C'' = \frac{1}{2t_{av}}\log_2\left(1 + 2^{-2(h_2(x) - h_2(n))}\right) + \max_{p_X}\frac{h_2(x) - h_2(n)}{t_{av}} \tag{5.79}$$

Now the maximization is just with respect to the differential entropy of $X$. Substituting $h_2(x) = \log_2 e(t_{av} - t_{min})$ from the previous subsection and (5.73), we can write the approximate capacity for the DVCT channel as

$$C'' = \frac{1}{2t_{av}}\log_2\left(1 + \frac{e(t_{av} - t_{min})^2}{2\pi\sigma^2}\right) \tag{5.80}$$

The units of $C''$ are bits/second.

Chapter 6

Summary and Future Research

In Chapter 2, we presented a framework for computing the dynamic range of

a CMOS transconductance-C integrator, paying close attention to the topics of

input signal statistics, sources of noise, measures of distortion, and methods for

computing dynamic range. Our presently unreached goal is to optimize the circuit realization of an analog filter in terms of its achievable dynamic range, given constraints on power consumption. However, the framework is an important step in the right direction: with it in hand, it is possible to compare two filter designs in terms of their achievable dynamic range.

In our analysis, we have assumed that speech signals can be adequately de-

scribed by a normal distribution. Without constraints on the input signal range,

the double-gamma distribution, which is a better description of speech, does not

converge for the integral calculations of section 2.4.1. In future work, we plan to

introduce a clipped version of the double-gamma distribution to alleviate conver-

gence problems.

One might ask which distortion measure is more appropriate for speech processing. The main advantage of the first mean-square distortion measure is that it requires no a priori information of the input signal. In addition, it provides a worst-case scenario since gain error is ignored. The main disadvantage of the first distortion measure is that it results in a low value for the linear range of nonlinear functions with convex or concave first derivatives, such as sinh(x) or tanh(x).

The second distortion measure, on the other hand, requires a model of the input signal. The model, which in our experiments consists of a single-parameter distribution, is needed to compute the optimal gain factor. The benefit of computing the optimal gain factor is an improved linear range because gain error is removed. Most, if not all, commonly used distortion measures either discount gain error, or at least distinguish it from harmonic and intermodulation distortion error. Thus, in order to obtain a higher linear range and to be able to make direct comparisons with other common distortion measures, we favor the second distortion measure over the first.

Chapter 3 introduces and analyzes several CMOS transconductor designs operating in the subthreshold region. At least three of them have never been used in subthreshold design. These three techniques, which are promising for use in low-power continuous-time filtering applications, are 1) source degeneration using a single diffusor, 2) source degeneration using double diffusors, and 3) multiple asymmetric differential pairs. These linearizing schemes offered significantly higher current efficiency, as compared with the basic transconductor or the transconductor with source degeneration via diode-connected transistors. The single diffusor gave the highest linear range (116.8 mV); however, it requires extra common-mode voltage circuitry. The double diffusors and two asymmetric differential pairs offer half the linear range, but no common-mode circuitry is required. Finally, three asymmetric differential pairs gave the highest current efficiency (36.4%) of all the linearizing schemes discussed in this research, with only a modest increase in the complexity of the circuit. Its linear range (98.1 mV) is comparable to that of the single diffusor, whereas it requires no common-mode voltage circuitry.

The optimal scaling ratios of the diffusor circuits, which were $m = 0.25$ and $m = 0.5$, are easy to implement in VLSI. On the other hand, the optimal scaling ratios for the multiple asymmetric differential pairs were not rational ($m = 2 + \sqrt{3}$ and $m = 4 + \sqrt{15}$). As such, their scaling ratio must be rounded to the nearest convenient scale factor in a practical design.

One particularly useful property of the subthreshold transconductors analyzed in this work is that linear range, current efficiency and optimal transistor scaling are independent of the bias current. In that way, a single layout can be used repeatedly in a large scale system that consists of hundreds of transconductors biased at current levels which vary over several orders of magnitude (Liu et al., 1992b).

Further work needs to be done to characterize the tolerance of these designs to

structural variability, i.e. mismatch in transistor parameters.

Just as Tanimoto et al. (Tanimoto et al., 1991) extended the technique of two asymmetric differential pairs to three or more, so we believe that there is an extension to the technique of source degeneration via double diffusors. The analysis and optimization of these circuits promises to be very challenging.

Chapter 4 applies the techniques and circuits of the two previous chapters to

the silicon implementation of a proposed cochlear model (Liu, 1992) using analog

very-large-scale integrated circuit technology. The model is static, i.e. linear,

whereas we recognize the need for adaptation even at early stages of processing.

Two properties of a real basilar membrane, which are interrelated, are not included in our current architecture. The first is that the real cochlea has an enormous dynamic range of 100 dB or more. In the linear circuits described in this work, using typical component and parameter values, we estimate a dynamic range of roughly 40-60 dB for the self-biased transconductance-C integrator. It is unlikely that further optimizations will increase the dynamic range more than a few dB unless we are willing to expend enormous amounts of power and area. On the other hand, one can argue that speech communication is generally performed over an acoustic level not exceeding 60 dB, in which case the finest linear model may suffice in real applications.

The second property of a real basilar membrane is that it contains nonlinear signal processing. In particular, as the amplitude of the acoustic signal increases, the membrane becomes more stiff, changing its filtering properties dramatically. Future analysis will attempt to quantify these issues, looking at models and simulations to qualitatively capture the two properties just mentioned.

Chapter 5 discusses limitations in information processing using continuous and discrete systems. Our analysis does not consider the power dissipated in the encoder or decoder. As a case in point, the Brick-Wall noiseless filter of Fig. 5.1(a) can be viewed as the first step of an analog decoder. Similarly, the quantizer in Fig. 5.1(d) can be viewed as a very basic digital encoder. Thus, we have ignored power dissipation outside of the channel.

Additionally, our analysis does not entertain the possibility of applying error-correcting codes to the digital message before transmitting. With such codes it would be possible to run the register at a higher clock rate, with much greater noise, and still have the same capacity. On the other hand, the process of encoding and decoding introduces major sources of power dissipation. These issues will be addressed in future research.

A future trend in digital technology is the movement toward multi-level or

multi-valued logic. In other words, digital signals will begin to look surprisingly

more analog, although still maintaining discrete-time. On the other hand, one

could envision the advent of multi-resolution analog processing, whereby an in-

coming signal is broken down and processed according to its scale. In the auditory

periphery, for example, there are nerve fibers which are sensitive at low sound

pressure levels and those which are sensitive at high sound pressure levels. Thus,

with low dynamic range (approximately 30 dB) components, one can construct a

system with a much broader dynamic range.

Several circuit techniques have not been explored in detail in this work: MOSFET-C, current-mode processing, and log-domain processing, to name a few. In a way, each technique deserves the same consideration as the transconductance-C technique for implementing low-power continuous-time linear filter banks.

A measure of goodness is proposed for circuit design, bits/sec/watt. It is not flattering to analog design in general, where dynamic range, not information capacity, is directly proportional to power dissipation. Taking the logarithm of the dynamic range means that, in general, low power designs will result in the most efficient processing schemes.

The problem of optimal extraction of information from sensory signals by "real" computing hardware in terms of maximum information rate per unit of power consumed has not been completely resolved in this work. Rather, having experimented with one transconductor circuit design and one architecture, we leave open the possibility of future improvements by way of: (1) integrators with higher dynamic range and/or lower power consumption, (2) enhanced filter architectures, and (3) non-linear adaptation. In the future, we hope to address these three issues.

Appendix A

MOS Technology

Very-large-scale integrated circuit fabrication facilities are expensive. In recent years, fabrication of VLSI circuits has become readily accessible to universities and organizations that have no fabrication facilities of their own. This availability is accomplished through the MOSIS™ Service established by the Defense Advanced Research Projects Agency (DARPA) and the National Science Foundation (NSF). MOSIS, located at the University of Southern California in Los Angeles, serves as a silicon broker which collects a number of relatively small projects from different organizations and finds a manufacturer that fabricates them on a silicon wafer. Therefore the overhead fabrication cost, shared by many small projects, is greatly reduced.

As of 1995, the most common (inexpensive) fabrication process offered by the MOSIS foundry is the 2-μm CMOS technology,$^1$ although 1.2-μm and 0.8-μm processes are available at a mildly higher fee. Most of the circuits and chips described in this dissertation are fabricated in a low-noise n-well double-poly BiCMOS process, in which the following devices can be made on the same substrate: 1) n-channel (in p-substrate) and p-channel (in n-well) MOSFETs; 2) vertical NPN bipolar transistors; 3) capacitors using the two polysilicon layers; and 4) depletion-mode FETs that can also be used to implement charge-coupled devices.

$^1$ The feature size, 2 μm, implies that the minimum channel length of MOSFETs, L, is 2 μm.

Table A.1: List of MOS device parameters and quantities

Symbol   Description
IDS      drain-to-source current
VGB      gate-to-bulk voltage
VSB      source-to-bulk voltage
VDB      drain-to-bulk voltage
VGS      gate-to-source voltage
VBS      substrate-to-source voltage
Vth      threshold voltage, typically 0.7–1.0 V
Vt       kT/q, thermal voltage, about 26 mV at 300 °K
k        Boltzmann's constant, 1.38 × 10^-23 J/K
T        temperature, in °K
q        electron charge, 1.6 × 10^-19 C
I0       current coefficient for subthreshold operation
W        effective transistor channel width
L        effective transistor channel length
S        effective transistor channel width-to-length ratio
κ        gate effectiveness measure, typically 0.7
V0       Early voltage, approximately proportional to L
μ0       charge mobility
K'       Cox μ0/2, current coefficient for above-threshold
Cox      gate-oxide capacitance per unit area
Cdep     depletion region capacitance per unit area

A.1 MOS Transistor Model

Fig. A.1(a) shows a cross-sectional view of a CMOS transistor. It has four
terminals: the source, drain, gate, and bulk. The structure of a MOSFET is usually
symmetric, i.e., the assignment of source and drain is determined by the applied
voltages in the circuit. Fig. A.1(b) gives the symbol for an nMOS transistor,
with all four terminals. The MOS device drain-to-source current can be written
in terms of a function F of the terminal voltages, with a general functional form
for the current-voltage relationship, valid for all regions of operation, given by:

$$I_{DS} \propto F(V_{GB}, V_{SB}) - F(V_{GB}, V_{DB}) \tag{A.1}$$

This functional form was first introduced by Meyer (Meyer, 1971) and is also
discussed in (Tsividis, 1987). For an n-type device, F is a nonnegative,
monotonically increasing function of V_GB and a monotonically decreasing function
of V_SB (or V_DB).

Figure A.1: (a) View of an nMOS transistor on the substrate and (b) symbol.

Two modes of operation are possible for MOS transistors: subthreshold and
above-threshold. The threshold voltage V_th, typically 0.7 to 1.1 V, is the gate voltage
above which mobile charge is induced in the transistor channel, and below which
the channel current results from charges jumping over the energy barrier formed by
the gate. That is, the subthreshold conduction mechanism is diffusion, as opposed
to drift in the above-threshold region. The region between subthreshold and
above-threshold operation is often referred to as the transition region, where both
drift and diffusion currents are nonnegligible. For a typical MOSFET of square
geometry (W/L = 1), the subthreshold region is defined for channel currents below
10 to 100 nA.

An equation for an MOS device operating above threshold is

$$I_{DS} = \frac{K'}{2}\, S \left[ \left(\max\{0,\; V_{GS} - V_{TH}(V_{BS})\}\right)^2 - \left(\max\{0,\; V_{GD} - V_{TH}(V_{BD})\}\right)^2 \right] \tag{A.2}$$

where K' = C_ox µ_0 / 2 is the above-threshold current coefficient. This equation
shows explicitly the symmetry of the output current, as the difference of two
quadratics. In the region where one of the terms is zero, the device is said to
be in saturation. In the region where both terms are nonzero, the device is said to
be in the ohmic region. These regions are not discontinuous.

In the subthreshold region, a further factorization of F is possible (Boahen and
Andreou, 1992):

$$I_{DS} \propto G(V_{GB}) \left[ H(V_{SB}) - H(V_{DB}) \right] \tag{A.3}$$

where G and H are exponential functions. This equation shows that the
source-driven and drain-driven components are controlled independently by V_SB and V_DB,
while V_GB controls both components in a symmetric and multiplicative manner. In
this mode of operation the MOS transistor has been called a diffusor (Boahen
and Andreou, 1992), analogous to the variable conductance electrical junctions in
biological systems.

An expression for the current in an nMOS transistor operating in subthreshold
can thus be written as:

$$I_{DS} = I_0 S\, e^{\kappa V_{GB}/V_t} \left[ e^{-V_{SB}/V_t} - e^{-V_{DB}/V_t} \right] \tag{A.4}$$

The terminal voltages V_GB, V_SB, V_DB are referenced to the substrate. The constant
I_0 depends on the mobility (µ_0) and other physical properties of silicon. S is a
geometry factor, the width W to length L ratio of the device. Current through a
pMOS device is given by:

$$I_{SD} = I_0 S\, e^{-\kappa V_{GB}/V_t} \left[ e^{V_{SB}/V_t} - e^{V_{DB}/V_t} \right] \tag{A.5}$$

It is not guaranteed that I_0 and κ are the same for both pMOS and nMOS devices.
Variation in I_0 has been extensively studied (Pavasovic et al., 1991), while variation
in κ is not so well documented.

The parameter κ is defined as

$$\kappa = \frac{C_{ox}}{C_{ox} + C_{dep}} \tag{A.6}$$

The physical significance of κ is apparent if the observation is made that the oxide
and depletion capacitances form a capacitive divider between the gate and bulk
terminals that determines the surface potential. The parameter κ takes values
between 0.6 and 0.9.

If we define two currents I_F and I_R, the forward and reverse currents,
respectively, such that

$$\begin{aligned}
I_F &= I_0 S \exp(\kappa V_{GB}/V_t)\, \exp(-V_{SB}/V_t) \\
I_R &= I_0 S \exp(\kappa V_{GB}/V_t)\, \exp(-V_{DB}/V_t)
\end{aligned} \tag{A.7}$$

then we have

$$I_{DS} = I_F - I_R \tag{A.8}$$

I_F and I_R can be written in the form:

$$\begin{aligned}
I_F &= I_0 S \exp[(1-\kappa) V_{BS}/V_t]\, \exp(\kappa V_{GS}/V_t) \\
I_R &= I_0 S \exp[(1-\kappa) V_{BD}/V_t]\, \exp(\kappa V_{GD}/V_t)
\end{aligned} \tag{A.9}$$

These equations show explicitly the dependence on V_BS and V_BD, in which the role
of the bulk is as a back-gate. These equations are useful for devices which operate
as gate-controlled conductors, or diffusors (Boahen and Andreou, 1992).
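The equivalence of the bulk-referenced form (A.4) and the source/drain-referenced form (A.9) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the value of I_0 and the bias point are hypothetical, while κ = 0.7 and V_t = 26 mV are the typical values of Table A.1.

```python
import math

V_T = 0.026      # thermal voltage kT/q in volts (about 26 mV at 300 K)
KAPPA = 0.7      # gate effectiveness, typical value from Table A.1
I_0 = 1e-15      # hypothetical current coefficient in amperes (assumed)
S = 1.0          # W/L ratio of a square device

def ids_bulk_referenced(v_gb, v_sb, v_db):
    """Eq. (A.4): nMOS subthreshold current, voltages referenced to the bulk."""
    gate = math.exp(KAPPA * v_gb / V_T)
    return I_0 * S * gate * (math.exp(-v_sb / V_T) - math.exp(-v_db / V_T))

def forward_reverse(v_gs, v_gd, v_bs, v_bd):
    """Eq. (A.9): forward and reverse currents, source/drain referenced."""
    i_f = I_0 * S * math.exp((1 - KAPPA) * v_bs / V_T) * math.exp(KAPPA * v_gs / V_T)
    i_r = I_0 * S * math.exp((1 - KAPPA) * v_bd / V_T) * math.exp(KAPPA * v_gd / V_T)
    return i_f, i_r

# Hypothetical bias point: V_GB = 0.6 V, V_SB = 0.1 V, V_DS = 5 V_t.
v_gb, v_sb = 0.6, 0.1
v_db = v_sb + 5 * V_T
i_ds = ids_bulk_referenced(v_gb, v_sb, v_db)
i_f, i_r = forward_reverse(v_gb - v_sb, v_gb - v_db, -v_sb, -v_db)
print(f"I_DS = {i_ds:.3e} A, I_F - I_R = {i_f - i_r:.3e} A, I_R/I_F = {i_r / i_f:.4f}")
```

In this sketch the ratio I_R/I_F reduces to exp(-V_DS/V_t), so a drain-to-source voltage of a few V_t is enough to make the reverse component negligible.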

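A small-signal model derived from I_F and I_R can be constructed in such a way
as to preserve the symmetry between the source and drain, as in Fig. A.2(a), where

$$\begin{aligned}
g_{mf} &\equiv \frac{\partial I_F}{\partial V_{GS}} = \frac{\kappa}{V_t} I_F, &
g_{mbf} &\equiv \frac{\partial I_F}{\partial V_{BS}} = \frac{(1-\kappa)}{V_t} I_F, \\
g_{mr} &\equiv \frac{\partial I_R}{\partial V_{GD}} = \frac{\kappa}{V_t} I_R, &
g_{mbr} &\equiv \frac{\partial I_R}{\partial V_{BD}} = \frac{(1-\kappa)}{V_t} I_R.
\end{aligned} \tag{A.10}$$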

For an nMOS device that is biased with V_DS ≥ 5V_t, i.e., in saturation, I_F >> I_R
and the drain current is given approximately by:

$$I_{DS} = I_F = I_0 S \exp[-(1-\kappa) V_{SB}/V_t]\, \exp(\kappa V_{GS}/V_t) \tag{A.11}$$

This equation is most often used for circuit designs where devices operate in
saturation as transconductance amplifiers. However, channel-length modulation (the Early
effect), which we have completely ignored thus far, becomes significant in
saturation. As such, the device equation can be augmented with:

$$I_{DS} = I_{n0} S_n \exp[-(1-\kappa_n) V_{SB}/V_t]\, \exp(\kappa_n V_{GS}/V_t)\, (1 + V_{DS}/V_0) \tag{A.12}$$

where V_0 is the Early voltage. A small-signal model of an nMOS device in
saturation includes only the small-signal forward current parameters, with the addition of

$$g_d \equiv \frac{\partial I_{DS}}{\partial V_{DS}} \approx \frac{I_{DS}}{V_0} \tag{A.13}$$

the output conductance. It can be seen in Fig. A.2(b).
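As a worked illustration (not part of the original derivation), dividing the forward transconductance of a saturated device, g_mf = κ I_DS / V_t, by the output conductance g_d ≈ I_DS / V_0 gives the intrinsic voltage gain of a single subthreshold transistor. The values κ = 0.7 and V_t = 26 mV are the typical values of Table A.1; the Early voltage V_0 = 20 V is a hypothetical number chosen only for the arithmetic.

```latex
\begin{align*}
\frac{g_{mf}}{g_d} &\approx \frac{\kappa I_{DS}/V_t}{I_{DS}/V_0} = \frac{\kappa V_0}{V_t} \\
                   &= \frac{0.7 \times 20\,\mathrm{V}}{0.026\,\mathrm{V}} \approx 538
                      \quad (\approx 55\,\mathrm{dB})
\end{align*}
```

Under these assumptions the gain does not depend on the bias current, and because V_0 is approximately proportional to L, it grows with channel length.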

The noise in a subthreshold MOS transistor can be reasonably well modeled
as a bias-dependent shot noise having a flat spectrum. Recently, the intimate
relationship between thermal noise and shot noise has been illuminated by
Landauer (Landauer, 1993). In particular, he shows that shot noise and thermal noise
are special limits of a more general noise formula. We are therefore not excluding
any fundamental noise effects by treating shot noise alone.

The shot noise in a MOS transistor has two independent components, forward
and reverse. They have one-sided power spectra given by (Sarpeshkar et al.,
1993; Tsividis, 1987)

$$\begin{aligned}
S_{if,sh}(\omega) &= 2 q I_F \\
S_{ir,sh}(\omega) &= 2 q I_R
\end{aligned} \tag{A.14}$$

Figure A.2: MOS small-signal subthreshold model including sources of shot noise only, (a) as a diffusor, and (b) in saturation.

This type of noise has been experimentally confirmed for subthreshold currents up
to 100 pA (Sarpeshkar et al., 1993) in large-area square devices. The small-signal
model of Fig. A.2(a) includes these two noise sources. In saturation, only S_if is
significant, as depicted in Fig. A.2(b).

For subthreshold currents between 1 nA and 100 nA, it appears necessary to
include the effects of flicker noise for mid-to-low frequencies. A model for flicker
noise which is referenced to the gate voltage is given by (Vittoz, 1994):

$$S_{v,f}(\omega) = \frac{4kT}{\rho\,\omega} \tag{A.15}$$

where ρ is a process-dependent parameter. According to Vittoz, the parameter ρ
is often larger for nMOS than for pMOS transistors, and may range between 0.02
and 2 F/m^2. A model for flicker noise referenced to the output current is given
by (Tsividis, 1987):

$$S_{i,f}(\omega) = \frac{M g_m^2}{C'_{ox} W L\, \omega} \tag{A.16}$$

where g_m is the forward transconductance in saturation, C'_ox the gate capacitance
per unit area, and M a process-dependent constant with units of joules. The two
models are related through the relation 4kT/ρ = M/C'_ox. Something which is not
clear from these two models is the effect of flicker noise in the region where I_F ≈ I_R.
It is possible that these two noise currents may be correlated and at least partially
cancel one another.

Assuming shot and flicker noise are independent, a complete noise model for a
transistor operating in subthreshold saturation is

$$S_i(\omega) = S_{i,sh}(\omega) + S_{i,f}(\omega) \tag{A.17}$$

Fig. A.3 shows the power spectral density for a PMOS transistor, where W =
1148 µm and L = 4 µm. The model is given by solid lines; the data are marked by
x's. The three curves correspond to nominal current values of (a) 1 nA, (b) 10 nA,
and (c) 100 nA for an equivalent square device. One free process-dependent
parameter M is used to model the flicker noise. Note that, at low enough current
levels, flicker noise cannot be detected within the audio frequency range. This
property is seen for curve (a) of Fig. A.3, in which there is little evidence of flicker
noise for frequencies above 50 Hz.
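The combined model can be sketched in a few lines of code. This is an illustrative sketch only: it takes the shot term as 2qI_DS (saturation), the flicker term in the output-referred form M g_m^2 / (C'_ox W L ω) with g_m = κ I_DS / V_t, and sums the two as independent sources. The parameter set below is hypothetical (in particular the oxide capacitance value is an assumption) and is not claimed to reproduce the measured curves.

```python
import math

Q = 1.602e-19    # electron charge in coulombs
V_T = 0.026      # thermal voltage in volts

def shot_psd(i_ds):
    """One-sided shot-noise PSD in A^2/Hz for a saturated device (I_F only)."""
    return 2.0 * Q * i_ds

def flicker_psd(omega, i_ds, kappa, m, cox, width, length):
    """Output-referred flicker PSD, using g_m = kappa * I_DS / V_t."""
    g_m = kappa * i_ds / V_T
    return m * g_m**2 / (cox * width * length * omega)

def total_psd(omega, i_ds, **params):
    """Independent shot and flicker contributions simply add."""
    return shot_psd(i_ds) + flicker_psd(omega, i_ds, **params)

def corner_omega(i_ds, kappa, m, cox, width, length):
    """Angular frequency at which the flicker term equals the shot term."""
    g_m = kappa * i_ds / V_T
    return m * g_m**2 / (cox * width * length * shot_psd(i_ds))

# Hypothetical parameter set, for illustration only.
params = dict(kappa=0.7, m=4.0e-26, cox=1.5e-3, width=1148e-6, length=4e-6)
for i_ds in (1e-9, 1e-8, 1e-7):
    w_c = corner_omega(i_ds, **params)
    print(f"I_DS = {i_ds:.0e} A: corner at {w_c / (2 * math.pi):.3g} Hz")
```

Because the flicker term scales with I_DS squared while the shot term scales with I_DS, the corner frequency in this sketch is proportional to the bias current, which is in line with the observation that flicker noise becomes hard to detect at low current levels.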

Figure A.3: Noise data taken from a PMOS transistor with W/L = 1148/4. Solid lines are the noise model, x's are data. Curve (a) corresponds to 1 nA for an equivalent square device, (b) 10 nA, and (c) 100 nA. (κ = 0.7, C_ox = 1500 F/m^2, and M = 4.0E-26 J.) [Plot: horizontal axis freq, Hz; vertical axis ticks 10^-13, 10^-12, 10^-11.]

A.2 Other Monolithic Elements

Passive resistors and capacitors can be realized on the chip using various layers

already available in the fabrication process.

Resistors can be realized on chip using either the polysilicon or the diffusion
layer. For a typical process, the sheet resistance is about 20 to 25 Ω/square for
polysilicon and 30 to 60 Ω/square for diffusion. Even though these resistors can be
relatively well-matched, the designer has no control over the exact resistance value.
Therefore, a reasonable circuit design should not depend on the absolute resistance
of any on-chip resistor.

Capacitors are easily implemented using the thin oxide layer sandwiched
between the two polysilicon layers in a double-poly process. Typical capacitance is
about 0.5 fF/µm^2 and is usually well-matched. It should be mentioned, however,
that the bottom plate (the first polysilicon layer) has about 0.05 fF/µm^2 of capacitance
to the substrate. It is thus desirable to design circuits in which the bottom plates
are connected to a low-impedance node, or to use only grounded capacitors. In
situations where neither of the above is possible, a bootstrapping technique may
be necessary to cancel this parasitic capacitance.
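As a worked example of these figures, consider a 15 pF poly-poly capacitor, the size of the compensation capacitors used in the buffer of Section B.4.2. The arithmetic below simply applies the typical densities quoted above (0.5 fF/µm^2 plate to plate, 0.05 fF/µm^2 bottom plate to substrate); actual values depend on the process run.

```latex
\begin{align*}
A &= \frac{15\,\mathrm{pF}}{0.5\,\mathrm{fF}/\mu\mathrm{m}^2} = 30\,000\,\mu\mathrm{m}^2
     \approx (173\,\mu\mathrm{m})^2 \\
C_{par} &= 30\,000\,\mu\mathrm{m}^2 \times 0.05\,\mathrm{fF}/\mu\mathrm{m}^2 = 1.5\,\mathrm{pF}
\end{align*}
```

The bottom-plate parasitic is therefore about 10% of the intended capacitance, which is why the node to which the bottom plate is connected matters.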

No practical inductors can be made on the chip, but as we have seen in an
earlier chapter, an active inductor can be implemented using a gyrator (generalized
impedance converter) and a capacitor.

Other devices not mentioned in this appendix are available. Most notable are
vertical NPN bipolar transistors, lateral PNP transistors, photodiodes and
phototransistors, junction field-effect transistors, and more. The author regrets only the
lack of an isolated diode (using n-doped and p-doped polysilicon, for example) and
the lack of a true twin-tub process, whereby both nMOS and pMOS transistors
reside in floating wells. If one is willing to go to the extra effort of post-processing
ICs, more elaborate electro-mechanical devices can also be integrated onto the
same chip.

Appendix B

Cochlear Experimental Setup

B.1 Experiments with the Hopkins Electronic EAR

B.1.1 Abstract

We have developed hardware and software for continuous long-term recordings
from the Hopkins Electronic EAR (HEEAR), an analog VLSI model of the auditory
periphery designed in our laboratory (Liu et al., 1992b). Figure B.1 shows the
experimental setup. Previously recorded audio signals are used to stimulate the
cochlear model. These signals can be downloaded to the first PC's hard disk over
an Ethernet link. The PC converts the data file into analog values using a
digital-to-analog converter module. The only limit to the length of the input file is the
size of the first PC's hard disk. The second PC can store up to 30 minutes of
multi-channel analog signals from the model, sampling either 32 signals at 12 kHz
or 16 signals at 24 kHz. The recorded data are sent back to a workstation for
further analysis over a second Ethernet link. All processing is performed by the
HEEAR chip set in real time, consuming less than 25 milliwatts of power, including
external potentiometers used to set the parameters of the hardware model. We
have designed the experimental setup with the goal of processing four half-hour
segments from a standard database using the silicon cochlea. The outputs of the
HEEAR chips will be used by another research group to train and test a large
vocabulary speech recognition system.
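The two recording configurations imply the same aggregate sample rate, so the storage requirement for a 30-minute recording can be estimated with a short calculation. This is a back-of-the-envelope sketch: the 2 bytes per sample is an assumption (each A/D sample stored as a 16-bit word), and the 2.2 GByte capacity is the disk size shown in Figure B.1.

```python
# Recording configurations quoted in the text: (channels, sample rate in Hz).
CONFIGS = [(32, 12_000), (16, 24_000)]
DURATION_S = 30 * 60          # 30 minutes of recording
BYTES_PER_SAMPLE = 2          # assumption: each A/D sample stored as a 16-bit word
DISK_BYTES = 2.2e9            # 2.2 GByte disk shown in Figure B.1

def recording_bytes(channels, rate_hz, duration_s=DURATION_S):
    """Total storage needed for a multi-channel recording."""
    return channels * rate_hz * duration_s * BYTES_PER_SAMPLE

for channels, rate in CONFIGS:
    total = recording_bytes(channels, rate)
    print(f"{channels} ch x {rate} Hz: {total / 1e9:.2f} GB "
          f"({total / DISK_BYTES:.0%} of the disk)")
```

Under the 16-bit assumption both configurations need the same amount of storage, which fits on the second PC's disk with room to spare.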

We present preliminary results from a series of experiments which are being

conducted using the Hopkins Electronic EAR. In our first study, the silicon cochlea

is stimulated using tone bursts with amplitudes which vary over one order of magni-

tude, in order to demonstrate the properties of adaptation and signal compression

characteristic of auditory processors. The output of the cochlea is also studied

in response to pure tones with and without additive white noise. In a third ex-

periment, speech segments of one male and one female speaker are taken from a

standard database. The clean speech is degraded with successively larger amounts

of band-limited white noise, to obtain signal-to-noise ratios of 30 dB, 20 dB, 12 dB,

6 dB, and 0 dB. Using images similar to the neurogram (Secker-Walker and Searle,

1990), our intention is to give a qualitative demonstration of the HEEAR chips'

ability to process speech robustly in the presence of noise.

Figure B.1: Experimental setup for simultaneously stimulating and recording from the HEEAR chip set. One PC outputs a previously recorded speech signal via the D/A converter module. The analog speech signal is attenuated before being presented to the silicon basilar membrane. Thirty-one output channels are fed into independent hair-cell synapse circuits. The outputs from the HEEAR chip set are digitized and stored on a second PC after passing through a custom analog interface. Synchronization is achieved by recording the input signal along with the 31 output channels. Depending on the application, a microphone can be connected directly to the input of the pre-amplifier.
[Block diagram labels: 486 PC with 160 MByte hard disk, Ethernet link, and DT2821 D/A converter; microphone; speaker; pre-amplifier; silicon cochlea (basilar membrane, hair cells and synapses); buffer amplifiers; custom interface (31 current-to-voltage converters, anti-aliasing filter); 486 PC with 2.2 GByte hard disk, Ethernet link, and DT2839 32-channel A/D converter. PC display text: "Processing file ... cello.dd", "Sampling rate ... 48000".]

B.1.2 Preliminary Results with the HEEAR Chip Set

Three series of experiments have been conducted in order to give a qualitative

demonstration of the HEEAR chip set as an auditory processor. In the near future

we plan to process speech from a standard database using the silicon cochlea. The

outputs of the HEEAR chips will be used by another research group to train and

test a large vocabulary speech recognition system.

Experiment I: Tone Bursts of Varying Amplitudes

The silicon cochlea is stimulated using tone bursts of increasing amplitude. From

Figs. B.2 and B.3, we note that as the stimulus input increases, the onset of the

response increases. However, the steady-state response (after 8 msec) is compressed

at the higher input amplitudes. The time scale over which the adaptation process

takes place is roughly 3-5 msec.

Experiment II: Sinusoids in Noise

The output of the silicon cochlea is recorded in response to steady-state sinusoids
with additive white noise (flat, 0-8 kHz). As shown in Figs. B.4 and B.5, the shape
of the silicon cochlea response does not appear to be greatly disturbed, even at a
signal-to-noise ratio of 0 dB.

Experiment III: Male and Female Speech in Noise

Acoustically similar speech segments of one male and one female speaker are taken

from a standard database. The syllable chosen, /jh er/ comes from the word

'adjourned'. The clean speech is degraded with successively larger amounts of

white noise, with a minimum signal-to-noise ratio of 0dB. Figs. B.6 and B.7 show

results for the male speaker, while Figs. B.8 and B.9 are for the female speaker.
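Degrading a clean signal to a prescribed signal-to-noise ratio amounts to scaling the noise so that the ratio of mean-square values matches the target. The sketch below illustrates the idea; it is not the procedure or software actually used in these experiments. The test tone stands in for a speech segment, and the noise here is plain Gaussian white noise with no band-limiting.

```python
import math
import random

def power(x):
    """Mean-square value of a sequence."""
    return sum(v * v for v in x) / len(x)

def add_noise_at_snr(signal, noise, target_db):
    """Scale `noise` so that signal power / noise power equals target_db, then add."""
    gain = math.sqrt(power(signal) / (power(noise) * 10 ** (target_db / 10)))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(signal, scaled)], scaled

def measured_snr_db(signal, noise):
    """Signal-to-noise ratio in dB from the two separate components."""
    return 10 * math.log10(power(signal) / power(noise))

# Stand-in "speech": a 200 Hz tone sampled at 16 kHz (hypothetical test signal).
fs = 16_000
clean = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs // 10)]
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in clean]

for target in (30, 20, 12, 6, 0):
    noisy, scaled = add_noise_at_snr(clean, white, target)
    print(f"target {target:2d} dB -> measured {measured_snr_db(clean, scaled):.2f} dB")
```

The same noise record is reused for every target level, so only its amplitude changes from one condition to the next.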

Figure B.2: Silicon cochlea response to a 1.0 kHz tone burst at 1/4 full scale. The characteristic frequencies of the output channels are shown above half of the traces. Only one channel (668 Hz) appears to be adapting to the stimulus. [Plot: horizontal axis 0 to 0.04; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.3: Silicon cochlea response to a 1.0 kHz tone burst at 1/2 full scale. The characteristic frequencies of the output channels are shown above half of the traces. The response of one channel (668 Hz) is high during the first three cycles of the tone burst, but reduces to roughly one half its initial value by the tenth cycle. [Plot: horizontal axis 0 to 0.04; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.4: Silicon cochlea response to a 200 Hz tone at 1/4 full-scale RMS and 12 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. [Plot: horizontal axis 0 to 0.025; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.5: Silicon cochlea response to a 200 Hz tone at 1/4 full-scale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. [Plot: horizontal axis 0 to 0.025; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.6: Silicon cochlea response to male token of /jh er/ at 1/5 full-scale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/. [Plot: horizontal axis 0 to 0.16; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.7: Silicon cochlea response to male token of /jh er/ at 1/5 full-scale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ appears to be buried in the noise. [Plot: horizontal axis 0 to 0.16; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.8: Silicon cochlea response to female token of /jh er/ at 1/5 full-scale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/. [Plot: horizontal axis 0 to 0.16; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

Figure B.9: Silicon cochlea response to female token of /jh er/ at 1/5 full-scale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ is barely discernible in the noise. [Plot: horizontal axis 0 to 0.16; labeled channels 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz.]

B.2 Harmonic and Intermodulation Distortion

As stated in Chapter 2, the second distortion measure computes harmonic and
intermodulation distortions for a sinusoidal input signal. An alternate method for
computing these distortions is to expand the output signal in a power series and
then use trigonometric identities to "fold" it back up.

We would like to point out a potentially useful property of intermodulation
distortion in speech processing. Suppose that we are given a speech signal for which
the fundamental frequency is absent. A good example is a male voice transmitted
over a telephone line. When human subjects listen to the received speech signal,
they have no trouble inferring the missing fundamental frequency. One manner
in which the detection of the missing frequency might take place is by deliberate
nonlinear processing in which the fundamental frequency arises as one of the main
distortion products. For example, if two tones were used as input to the
self-biased transconductor with harmonic integers n = 2 and m = 3, at a large enough
signal level, the output current would contain a discernible distortion product at
harmonic integer |2n - m| = 1, i.e., the fundamental. Evidence to support this
type of processing in the inner ear of the mammal is the appearance of small
peaks in the synchrony of the auditory nerve fibers at characteristic frequencies
of 2F_1, 3F_1, and (2F_1 - F_2), where F_1 and F_2 are the first and second formant
frequencies (Young and Sachs, 1979).
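The appearance of the missing fundamental as an intermodulation product can be illustrated numerically. The sketch below is an assumption-laden stand-in rather than a model of the self-biased transconductor: it uses a generic odd compressive nonlinearity (tanh), a hypothetical fundamental of 100 Hz, and equal-amplitude tones at the second and third harmonics. A single-bin DFT then measures the component at |2n - m| times the fundamental before and after the nonlinearity.

```python
import math

def tone_amplitude(samples, freq_hz, fs):
    """Amplitude of the component at freq_hz via a single-bin DFT."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / fs) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / fs) for k, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

fs = 8_000                        # sample rate in Hz (assumed)
f0 = 100.0                        # missing fundamental in Hz (assumed)
n_harm, m_harm = 2, 3             # harmonic integers n and m from the text
t = [k / fs for k in range(fs)]   # one second: an integer number of cycles

two_tone = [0.5 * math.cos(2 * math.pi * n_harm * f0 * x)
            + 0.5 * math.cos(2 * math.pi * m_harm * f0 * x) for x in t]
compressed = [math.tanh(v) for v in two_tone]   # stand-in odd nonlinearity

product = abs(2 * n_harm - m_harm) * f0          # |2n - m| f0 = f0
print(f"input  amplitude at {product:.0f} Hz: {tone_amplitude(two_tone, product, fs):.2e}")
print(f"output amplitude at {product:.0f} Hz: {tone_amplitude(compressed, product, fs):.2e}")
```

The input contains no energy at the fundamental, whereas the output of the nonlinearity does; the size of the product depends on the signal level and on the shape of the nonlinearity.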

B.3 Current-to-Voltage Converter

It is our experience that current-to-voltage conversion is not as straightforward as
voltage-to-current conversion. After four revisions, we finally adopted the design
which is shown in Fig. B.10. During the design phase, we were unable to find a
simulation program that could accurately predict the performance of these
circuits. Most notably, one reputable simulation program predicted a bandwidth of
400 kHz for these devices; we measured it as 40 kHz.

The operation of the circuit is as follows. The power supplies are set to V_dd =
2.5 V and V_ss = -2.5 V. The voltage V_bias sets the bias current for the two amplifiers.
Its nominal value is V_dd - 0.95 V. The voltage V_offst establishes an offset current
in the current-to-voltage converter so that bipolar currents can be measured, with
the only nuisance being a DC offset at the output voltage. Its nominal value is
V_dd - 0.85 V. The voltage V_ref establishes the clamping voltage for the input current.
In our setup we set this to GND = 0 V. The input current is labeled I_in. Its
nominal range is 0 to 50 nA. The voltage V_gain provides an additional current gain
of exp[(V_gain - V_ref)/V_t], so it must be set with care. V_gain is the only voltage input,
other than the power supplies, that must be low impedance. In our design, we did not need
the additional gain, and therefore set V_gain = V_ref = 0 V. Finally, keep in
mind that the output voltage V_out must be buffered before it leaves the
chip. The reason for this precaution is that the parasitic capacitance associated
with a pad and wire would adversely affect circuit performance.
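To illustrate why the gain-setting voltage calls for care, the exponential gain expression exp[(V_gain - V_ref)/V_t] can be evaluated for a small voltage difference. The sketch assumes V_t = 26 mV (Table A.1); the 60 mV difference is a hypothetical value chosen for the arithmetic.

```latex
\begin{align*}
\exp\!\left(\frac{V_{gain} - V_{ref}}{V_t}\right)
  &= \exp\!\left(\frac{60\,\mathrm{mV}}{26\,\mathrm{mV}}\right) \approx 10 \\
V_t \ln 10 &\approx 26\,\mathrm{mV} \times 2.303 \approx 60\,\mathrm{mV\ per\ decade\ of\ gain}
\end{align*}
```

Under this assumption, an error of only a few tens of millivolts on the gain-setting node changes the current gain by a large factor.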

B.4 A BiCMOS Voltage Buffer

In designing low-power, mostly CMOS analog VLSI systems, high-drive capability
is not an issue until one wants to store and analyze the outputs digitally. An
example of such a system is an analog VLSI cochlea with 30 frequency channels (Liu,
1992; Liu et al., 1992b). Internal signal processing is done with small voltages
(about 100 mV) and tiny currents (about 20 nA); however, in order to view the outputs
externally, we need to drive capacitances up to 300 pF and resistive loads down to
10 kΩ. Indeed, some type of buffer amplifier is required.

Specific design goals for our buffer amplifier are listed in the first column of
Table B.1. A trade-off exists among these design goals, and we found no single
architecture and layout to match every possible requirement. We are also
constrained by the available technology. Through MOSIS we have available a 2.0-µm
double-poly n-well BiCMOS process, which permits the fabrication of floating
capacitors, as well as vertical NPN transistors exhibiting betas in the range of 50 to

Figure B.10: Schematic of current-to-voltage converter as used in the computer interface to the Hopkins Electronic EAR. [Schematic labels: V_ref, V_offst, I_in, V_gain; transistor sizes 52/6, 45/12, 24/10, 96/10.]

We used a class-AB source follower output stage for our buffer amplifier to
ensure a low quiescent current. The main drawback of using source followers at
the output is a reduced linear range (Gregorian and Temes, 1986). If a strictly
CMOS implementation is used, one stands to lose 1.5 V to 2 V at either supply rail.
Thus, using a ±2.5 V power supply, the linear range might be only ±0.75 V
when driving a large load. Next we considered stacking an NPN transistor with a
PMOS transistor in its own well. The linear range improved to about ±1.0 V in
this case. Finally we examined a compound PMOS/NPN transistor to replace the
PMOS transistor. In this case the linear range of the buffer amplifier increased to
±1.25 V.

Table B.1: Specifications, simulation results, and measurements for the BiCMOS buffer amplifier.

Feature                   Conditions                    Design Goal   Simulation   Measurement
Power Supply                                            ≤ ±2.5 V      ±2.5 V       ±2.5 V
Supply Current                                          ≤ 100 µA      304 µA       300 µA
Total Area                                              ≤ 0.04 mm^2   NA           0.165 mm^2
Offset (mean)                                           ≤ 0.5 mV      -1.3 mV      -1.6 mV
Offset (4 s.d.)                                         ≤ 5.0 mV      NA           5.3 mV
Output Range              R_L = 1.0 kΩ                  ≥ ±1.5 V      ±1.25 V      ±1.25 V
Open-Loop Phase Margin    C_L = 300 pF                  ≥ 70 deg      86 deg       NA
Open-Loop Gain Margin     C_L = 300 pF                  ≥ 10 dB       14 dB        NA
Max. DC Gain Error        Gain = 1, R_L = 1.0 kΩ        ≤ 1.0%        0.9%         2.2%
Bandwidth                 at max. DC gain error         ≥ 20 kHz      20 kHz       20 kHz
Total Harm. Distortion    1 kHz, R_L = 1.0 kΩ           ≤ 1.0%        -            0.4%
Slew Rate                 R_L = 1.0 kΩ, C_L = 300 pF    ≥ 1 V/µs      1.1 V/µs     0.97 V/µs

Figure B.11: Compound PMOS/NPN Transistor.

B.4.1 Compound PMOS/NPN Transistors

In a compound PMOS/NPN transistor (see Figure B.11) the source and well of

the PMOS transistor and the collector of the NPN transistor are all in common,

while the drain of the PMOS transistor drives the base of the NPN transistor. The

control node is the gate of the PMOS transistor. The voltage/current character-

istics of the compound PMOS/NPN are analogous to those of a very wide PMOS

transistor.

Below, the voltage/current characteristics of the compound PMOS/NPN are
derived for the case that the PMOS transistor operates below threshold. The
subthreshold region for square CMOS devices is defined as currents less than
approximately 30 nA (Vittoz, 1994; Andreou and Boahen, 1994).

For a PMOS device that has the source and well at the same potential and
is biased in the saturation region (V_SD ≥ 4V_t), the source-drain current is given
by (Andreou and Boahen, 1994):

$$I_{SD} = I_0 S\, e^{-\kappa V_{GS}/V_t} + g_d V_{SD} \tag{B.1}$$

where:

V_GS = V_G - V_S is the gate to source voltage,
V_SD = V_S - V_D is the source to drain voltage,
S ≡ W/L is the width to length ratio,
I_0 is the zero-bias current,
κ is the body effect coefficient,
g_d is the drain conductance,
V_t ≡ kT/q is the thermal voltage.

We model the NPN bipolar transistor as a current-input/current-output device
with a non-zero output conductance (Andreou and Boahen, 1994):

$$I_C = \beta I_B + g_0 V_{CE} \tag{B.2}$$

where:

V_CE = V_C - V_E is the collector to emitter voltage,
β is the current gain,
g_0 is the output conductance.

A large signal model of the compound PMOS/NPN device can be derived by
realizing that I_SD = I_B and I_C + I_B = I_E. Then substituting equation (B.1) into
equation (B.2), we find the result:

$$I_E = I_0 (1+\beta) S\, e^{-\kappa V_{GS}/V_t} + (1+\beta)\, g_d V_{SD} + g_0 V_{CE} \tag{B.3}$$

The first term in equation (B.3) will dominate the latter two. Hence, we see
that the compound PMOS/NPN behaves similarly to a PMOS transistor with
width-to-length ratio scaled by (1 + β). In effect, we have increased the maximum
subthreshold current of a PMOS device with width-to-length ratio S from 30 nA x S
to 30 nA x (1 + β)S.
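The substitution that leads to the emitter-current expression can be checked numerically by composing the PMOS model (exponential term plus g_d V_SD) with the bipolar model (I_C = β I_B + g_0 V_CE) and using I_B = I_SD, I_E = I_C + I_B. This is a minimal sketch: all device values below (I_0, S, β, the conductances, and the bias voltages) are hypothetical and chosen only to exercise the equations.

```python
import math

V_T = 0.026        # thermal voltage kT/q in volts

def pmos_isd(v_gs, v_sd, i_0, s, kappa, g_d):
    """Saturated subthreshold PMOS with source tied to well (Eq. B.1)."""
    return i_0 * s * math.exp(-kappa * v_gs / V_T) + g_d * v_sd

def compound_ie(v_gs, v_sd, v_ce, i_0, s, kappa, g_d, beta, g_0):
    """Emitter current of the compound device: I_B = I_SD, I_E = I_C + I_B."""
    i_b = pmos_isd(v_gs, v_sd, i_0, s, kappa, g_d)
    i_c = beta * i_b + g_0 * v_ce          # bipolar model (Eq. B.2)
    return i_c + i_b

# Hypothetical device values; conductances set to zero to isolate the first term.
dev = dict(i_0=1e-15, s=10.0, kappa=0.7, g_d=0.0)
i_sd = pmos_isd(v_gs=-0.6, v_sd=1.0, **dev)
i_e = compound_ie(v_gs=-0.6, v_sd=1.0, v_ce=2.0, beta=100.0, g_0=0.0, **dev)
print(f"I_SD = {i_sd:.3e} A, I_E = {i_e:.3e} A, ratio = {i_e / i_sd:.1f}")
```

With the conductance terms set to zero, the ratio of emitter current to PMOS current is 1 + β, which is the scaling of the effective width-to-length ratio described in the text.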

We identify two reasons for operating the PMOS transistor below threshold in
the compound PMOS/NPN transistor. Firstly, a PMOS transistor requires a lower
gate-to-source voltage when operating below threshold than when operating above.
It follows that we will obtain a larger output swing using a subthreshold device
than an above-threshold device. Secondly, we want our design to be as symmetric
as possible in order to achieve good linearity (Gregorian and Temes, 1986). If we
consider the NPN transistor as a voltage-input/current-output device, a simplified
large signal model is as follows (Andreou and Boahen, 1994):

$$I_C = I_{ES}\, \frac{\beta}{1+\beta}\, e^{V_{BE}/V_t} + g_0 V_{CE} \tag{B.4}$$

where:

V_BE = V_B - V_E is the base to emitter voltage,
I_ES is the saturation current,

and other parameters are as defined earlier. We see that the exponential
voltage-to-current relationship of equation (B.3) is similar (and complementary) to that of
an NPN transistor in equation (B.4). We exploit this similarity to achieve a more
symmetric design.

On a test chip we fabricated individual compound PMOS/NPN transistors and
NPN transistors of the same dimensions as those used in the design of the buffer.
Figure B.12 shows the measured I-V characteristics for the case that V_CE = 2.0 V.
For both transistor types the output current is exponentially dependent on the
control voltage over approximately 7 orders of magnitude. In addition, we see that
the NPN I-V curve has a steeper slope than that of the compound PMOS/NPN.
The difference in slope is due to the extra κ term in the exponent of equation (B.3)
which is not found in equation (B.4).

Figure B.12: Current as a function of voltage for a compound PMOS/NPN transistor (solid line) and an NPN transistor (dashed line). Two transistors of each type were measured, but the resulting curves were so similar for each type that they would be indistinguishable on this graph. The saturation of the NPN transistor output current near 10 mA was caused by limitations in the measurement equipment. In all cases, V_CE = 2.0 V. [Plot: horizontal axis V_sg, V_be from 0 to 1.2; vertical axis I_e, I_c; traces I(P_NPN1) and I(NPN1).]
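The slope difference can be put in numbers by asking how much control voltage each device needs per decade of output current. The estimate below assumes V_t = 26 mV and the typical κ = 0.7 from Table A.1; the measured devices may of course differ.

```latex
\begin{align*}
\text{NPN:} \quad & \Delta V_{BE} = V_t \ln 10 \approx 60\,\mathrm{mV/decade} \\
\text{compound PMOS/NPN:} \quad & \Delta V_{SG} = \frac{V_t \ln 10}{\kappa}
   \approx \frac{60\,\mathrm{mV}}{0.7} \approx 86\,\mathrm{mV/decade}
\end{align*}
```

Under these assumptions the compound device needs roughly 1/κ times more control voltage per decade, which corresponds to the shallower curve.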

B.4.2 Circuit Description

Figure B.13 shows the schematic of the buffer including device geometries. The
input stage consists of a large-area PMOS differential pair in a floating well. In
order to reduce systematic offset, transistors M9 and M10 copy the voltage from
the drain of M3 to that of M4. In this way, the mirroring ratio between transistors
M3 and M4 is very close to unity. Transistor pairs M5-M6 and M11-M12 mirror
and amplify the differential current by a factor of four, establishing a push/pull
configuration. The output stage consists of NPN transistor Q2 biased one diode
drop above the output voltage stacked with the compound PMOS/NPN transistor
M14/Q4 biased one diode drop below the output voltage. Frequency compensation
is achieved using pole/zero cancellation (Gregorian and Temes, 1986) with two
15 pF capacitors and transistors M15-M16 operating in the triode region.

Figure B.13: BiCMOS Buffer Amplifier. [Legible device geometries (W/L): M4 50/10, M6 200/10, M8 10/6, M9 10/10, M10 15/6, M11 124/10, M12 500/10, M16 10/10; Q1 to Q4 16 x 64.]

B.4.3 Results

Circuit simulations were performed with SPICE3e using BSIM and BIPOLAR
models provided by MOSIS. Results of simulation experiments can be found in
column 3 of Table B.1. Eighteen buffers were arranged in a 2 mm x 2 mm tiny chip
and placed in a 40-pin DIP package. Electrical measurements are averaged from
four such chips and can be found in the last column of Table B.1.

We note one minor discrepancy between simulated and measured results. The
maximum DC gain error, which we anticipated to be lower than 1%, is actually
greater than 2% for the maximum load. One factor contributing to this discrepancy
is the long metal line connecting the output of the buffer to the pad. We omitted
this 5 Ω parasitic resistance from our simulations. However, even in the condition
of no load the gain error is measured to be 0.7%. It follows that the open-loop gain
must be lower than anticipated, but we have no means of measuring it directly.
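A rough worked estimate puts numbers on these two contributions. It assumes an ideal unity-gain feedback connection with open-loop gain A, and treats the metal line as a 5 Ω series resistance feeding the 1.0 kΩ load of Table B.1; both simplifications are assumptions made here for illustration, not part of the original analysis.

```latex
\begin{align*}
\text{no-load gain error:} \quad & \frac{1}{1+A} = 0.7\%
   \;\Rightarrow\; A \approx 142 \quad (\approx 43\,\mathrm{dB}) \\
\text{divider loss in the line:} \quad & \frac{5\,\Omega}{1000\,\Omega + 5\,\Omega} \approx 0.5\%
\end{align*}
```

Under these assumptions the line accounts for roughly 0.5 percentage points of the 2.2% measured at full load, with the finite open-loop gain contributing a comparable amount.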

Bibliography

Allen, J. (1985). Cochlear modeling. IEEE ASSP Magazine, 2(1):3{29.

Andreou, A. and Boahen, K. (1994). Neural information processing II. In Ismail,

M. and Fiez, T., editors, Analog VLSI Signal and Information Processing.

McGraw-Hill.

Andreou, A. and Liu, W. (1993). BiCMOS circuits for silicon cochleas. In Dedieu,

H., editor, 1993 European Conference on Circuit Theory and Design, pages

503{508, Davos, Switzerland. Elsevier Science B.V.

Andreou, A. G. (1995). Low power analog VLSI systems for sensory information

processing. In Sheu, B. J., Ismail, M., Sanchez-Sinencio, E., and Wu, T. H.,

editors, Microsystems Technology for Multimedia Applications: An Introduction,

ISCAS '95 Tutorial Sessions, pages 501-522. IEEE, Seattle, WA.

Bhadkamkar, N. (1993). A variable resolution, nonlinear silicon cochlea. Technical

Report CSL-TR-93-558, Stanford University, Stanford.

Blahut, R. (1987). Principles and Practice of Information Theory. Addison-Wesley,

Reading, Mass. Pages 280-282.

Boahen, K. and Andreou, A. (1992). A contrast sensitive silicon retina with

reciprocal synapses. In Moody, J., Hansen, S., and Lippmann, R., editors,

Advances in Neural Information Processing 4. Morgan-Kaufmann, San Mateo,

Cohen, M. and Andreou, A. (1992). Current-mode subthreshold MOS implementa-

tion of the Herault-Jutten autoadaptive network. IEEE J. Solid-State Circuits,

27(5):714-727.

Furth, P. and Andreou, A. (1995). Linearised differential transconductors in

subthreshold CMOS. Electronics Letters, 31(7):545-547.

Furth, P., Goel, N., Andreou, A., and Goldstein, Jr., M. (1994). Experiments

with the Hopkins Electronic EAR. In 14th Speech Research Symposium, pages

183-189, Baltimore MD.

Ghitza, O. (1986). Auditory nerve representation as a front-end for speech

recognition in a noisy environment. Computer Speech and Language, 1:109-130.

Gray, R., Buzo, A., Gray, Jr., A., and Matsuyama, Y. (1980). Distortion mea-

sures for speech processing. IEEE Trans. Acoust., Speech, Signal Processing,

28(4):367-376.

Gray, Jr., A. and Markel, J. (1976). Distance measures for speech processing.

IEEE Trans. Acoust., Speech, Signal Processing, 24(5):380-391.

Gregorian, R. and Temes, G. (1986). Analog MOS Integrated Circuits for Signal

Processing. John Wiley & Sons.

Groenewold, G. (1991). Optimal dynamic range integrators. IEEE Trans. Circuits

Syst. I, 39(8):614-627.

Hosticka, B. (1985). Performance comparison of analog and digital circuits. Proc.

IEEE, 73(1):25-29.

Kamm, T., Andreou, A., and Cohen, J. (1995). Vocal tract normalization in speech

recognition: compensating for speaker variability. In 15th Speech Research

Symposium, pages 175-178, Baltimore MD.

Krummenacher, F. and Joehl, N. (1988). A 4-MHz CMOS continuous-time filter

with on-chip automatic tuning. IEEE J. Solid-State Circuits, 23(3):750-758.

Landauer, R. (1961). Irreversibility and heat generation in the computing process.

IBM J. Res. Devel., 5:183-191.

Landauer, R. (1993). Solid-state shot noise. Physical Review B, 47(24):16427-16432.

Lazzaro, J., Wawrzynek, J., and Kramer, A. (1994). Systems technologies for

silicon auditory models. IEEE Micro, pages 7-15.

Lazzaro, J., Wawrzynek, J., Mahowald, M., Sivilotti, M., and Gillespie, D. (1993).

Silicon auditory processors as computer peripherals. IEEE Trans. Neural

Networks, 4(3):523-528.

Lin, J., Ki, W., Edwards, T., and Shamma, S. (1994). Analog VLSI implementation

of auditory wavelet transforms using switched-capacitor circuits. IEEE Trans.

Circuits Syst. I, 41(9):572-583.

Liu, W. (1992). An Analog Cochlear Model: Signal Representation and VLSI

Realization. PhD thesis, Johns Hopkins University, Baltimore.

Liu, W., Andreou, A., and Goldstein, Jr., M. (1992a). Multiresolution speech

analysis with an analog cochlear model. In IEEE-SP International Symposium

on Time-Frequency and Time-Scale Analysis, pages 433-436, Victoria, BC,

Canada.

Liu, W., Andreou, A., and Goldstein, Jr., M. (1992b). Voiced-speech representation

by an analog silicon model of the auditory periphery. IEEE Trans. Neural

Networks, 3(3):477-487.

Liu, W., Andreou, A., and Goldstein, Jr., M. (1993). Analog cochlear model

for multiresolution speech analysis. In Hanson, S., Cowan, J., and Giles, C.,

editors, Advances in Neural Information Processing Systems 5, pages 666-673.

Morgan Kaufmann, San Mateo, CA.

Lyon, R. and Mead, C. (1988). An analog electronic cochlea. IEEE Trans. Acoust.,

Speech, and Signal Proc., 36(7):1119-1134.

Max, J. (1960). Quantizing for minimum distortion. IRE Trans. Inform. Theory,

6:7-12.

Mead, C. (1989). Analog VLSI and Neural Systems. Addison-Wesley, Reading,

Meng, H. and Zue, V. (1990). A comparative study of acoustic representations of

speech for vowel classification using multi-layer perceptrons. In Int'l Conf. on

Spoken Language Processing, pages 1053-1056.

Meyer, J. (1971). MOS models and circuit simulation. RCA Review, 32:42-63.

Nauta, B. (1992). A CMOS transconductance-C filter technique for very high

frequencies. IEEE J. Solid-State Circuits, 27(2):142-153.

Neti, C. (1994). Neuromorphic speech processing for noisy environments. In IEEE

Intl. Conf. on Neural Networks, pages 4425{4430, Orlando, FL.

Nevarez-Lozano, H. and Sanchez-Sinencio, E. (1991). Minimum parasitic effects

biquadratic OTA-C filter architectures. Analog Integrated Circuits and Signal

Processing, 1(4):297-319.

Paez, M. and Glisson, T. (1972). Minimum mean-squared error quantization in

speech PCM and DPCM systems. IEEE Trans. Comm., 20:225-230.

Papoulis, A. (1965). Probability, Random Variables, and Stochastic Processes.

McGraw-Hill, New York. P. 219.

Park, J., Abel, C., and Ismail, M. (1993). Design of silicon cochlea using MOS

switched-current techniques. In Dedieu, H., editor, 1993 European Conference

on Circuit Theory and Design, pages 269-273, Davos, Switzerland. Elsevier

Science B.V.

Pavasovic, A., Andreou, A., and Westgate, C. (1991). Characterization of CMOS

process variations by measuring subthreshold current. In Green, R. and Ruud,

C., editors, Nondestructive Characterization of Materials IV. Plenum Press,

New York.

Rice, J. (1988). Mathematical Statistics and Data Analysis. Wadsworth &

Brooks/Cole, Pacific Grove, CA.

Rice, J., Young, E., and Spirou, G. (1995). Auditory-nerve encoding of pinna-

based spectral cues: Rate representation of high-frequency stimuli. J. Acoust.

Soc. Am., 97:1764-1776.

Roe, D. and Wilpon, J., editors (1994). Voice Communication Between Humans

and Machines. National Academy Press, Washington, D.C.

Ross, S. (1988). A First Course in Probability. Macmillan, New York, third edition.

Sarpeshkar, R., Delbruck, T., and Mead, C. (1993). White noise in MOS transistors

and resistors. IEEE Circuits Devices Mag., 9(6):23-29.

Sarpeshkar, R., Lyon, R., and Mead, C. (1996). An analog VLSI cochlea with

new transconductance amplifiers and nonlinear gain control. In ISCAS-96,

Atlanta, GA.

Secker-Walker, H. and Searle, C. (1990). Time-domain analysis of auditory-nerve-fiber

firing rates. J. Acoust. Soc. Am., 88:1427-1436.

Shannon, C. (1948). A mathematical theory of communication. Bell Syst. Tech.

J., 27:379-423, 623-656.

Silva-Martinez, J., Steyaert, M., and Sansen, W. (1990). A high frequency large

signal very low distortion transconductor. In IEEE ESSCIRC-90, pages 169-

Tanimoto, H., Koyama, M., and Yoshida, Y. (1991). Realization of a 1-V active

filter using a linearization technique employing plurality of emitter-coupled

pairs. IEEE J. Solid-State Circuits, 26(7):937-945.

Torrance, R., Viswanathan, T., and Hanson, J. (1985). CMOS voltage to current

transducers. IEEE Trans. Circuits Syst., 32(11):1097-1104.

Tsividis, Y., Czarnul, A., and Fang, S. (1986). MOS transconductors and integrators

with high linearity. Electron. Lett., 22:245-246. Errata, vol. 22, p. 619,

May, 1986.

Tsividis, Y. P. (1987). Operation and Modeling of the MOS Transistor. McGraw-

Hill, New York. P. 343.

Vittoz, E. (1994). Micropower techniques. In Franca, J. and Tsividis, Y., editors,

Design of MOS VLSI Circuits for Telecommunications and Signal Processing.

Prentice-Hall, 2nd edition.

Watts, L. (1992). Cochlear Mechanics: Analysis and Analog VLSI. PhD thesis,

California Institute of Technology, Pasadena, CA.

Watts, L., Kern, D., Lyon, R., and Mead, C. (1992). Improved implementation of

the silicon cochlea. IEEE J. Solid-State Circuits, 27(5):692-700.

Yang, X., Yang, K., and Shamma, S. (1992). Auditory representations of acoustic

signals. IEEE Trans. Information Theory, 38(2):824-839.

Young, E. and Sachs, M. (1979). Representation of steady-state vowels in the

temporal aspects of the discharge patterns of populations of auditory-nerve

fibers. J. Acoust. Soc. Am., 66:1381-1403.

Paul Matthew Furth was born in Washington, DC on June 27, 1963. He received

the B.A. in French from Grinnell College in 1984 and the B.S. in engineering (elec-

trical) from the California Institute of Technology, as part of a combined 5-year

liberal arts/engineering program. From 1985 to 1989 he worked as an electronics

project engineer for TRW Technar, a company which manufactured airbag crash

sensors for automobiles. His responsibilities were hardware and software design for

computer-controlled shock test equipment, as well as documentation and training.

In 1989, he entered the Ph.D. program in the Electrical and Computer Engineering

Department of Johns Hopkins University, where he received the M.Sc.Eng. degree

in 1992. Since 1989, he has been a teaching assistant and instructor for the depart-

ment and a research assistant in the Sensory Communications Laboratory, under

the direction of Professor Andreas Andreou. His research interests are low power

analog circuit design, speech processing, and circuit and device limitations.
