On the Design of Optimal Continuous-Time
Filter Banks in Subthreshold CMOS
by
Paul M. Furth
A dissertation submitted to the Johns Hopkins University
in conformity with the requirements for the degree of
Doctor of Philosophy
Baltimore, Maryland
1996
© 1996 by Paul M. Furth
All rights reserved
Abstract
This dissertation attempts to provide a comprehensive method of designing
analog hardware whenever power dissipation is one of the engineering constraints.
It begins by looking at analog filters and, in particular, multiresolution filter banks
that model the spectral characteristics of the cochlea. In this sense, these structures
are also referred to as cochlear filter banks. We distinguish cochlear filter banks
from silicon cochleas, which incorporate adaptation, or gain control, within the
filtering mechanism. While we understand that such adaptation is important for
handling input signals with a very large dynamic range, in this work we are concerned
only with the design of continuous-time linear filter banks. The design employs
MOS transistors operating in subthreshold in order to achieve a wide tuning range
(two decades or more) and low power consumption.
A mathematical framework for analyzing linear continuous-time circuits is
developed, where prior knowledge of the input signal, such as speech, is taken into
account. That prior knowledge takes the form of probability density functions of
the input amplitude as well as power spectral densities. Our goal is to optimize the
filter design with respect to the achievable dynamic range given tight constraints
on area and power consumption. Models of MOS transistors operating in the
subthreshold region are adapted into the framework, although other technologies or
regions of operation are possible. By convention, dynamic range is defined as the
ratio of the maximum output signal level to the noise floor, such
that the distortion is acceptably low. Admittedly, the most difficult portion of
this framework is the computation of distortion, which in this work is defined as
the mean-square error. Other distortion measures, which are based on the
magnitude frequency spectrum, may be more appropriate for speech, but are as yet not
incorporated into the design. The mathematical framework presented here is not
limited to the design of filter banks, but extends to the more general area of linear
continuous-time filter design whenever power consumption is a major constraint.
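The two quantities named in the paragraph above, dynamic range and mean-square-error distortion, can be illustrated with a minimal numerical sketch. This is not the dissertation's derivation: the function names, the normally distributed input, and the `soft_limiter` characteristic (a tanh curve with a hypothetical saturation level `v_sat`) are assumptions chosen only to make the definitions concrete.

```python
import math


def dynamic_range_db(max_signal_rms, noise_rms):
    """Dynamic range in dB: largest acceptable RMS output over the RMS noise floor."""
    return 20.0 * math.log10(max_signal_rms / noise_rms)


def mse_distortion(h, sigma, n=20001, span=8.0):
    """Approximate E[(h(V) - V)^2] for V ~ Normal(0, sigma^2).

    Uses the trapezoidal rule on [-span*sigma, +span*sigma].
    """
    lo = -span * sigma
    dv = 2.0 * span * sigma / (n - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(n):
        v = lo + i * dv
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end points
        pdf = norm * math.exp(-0.5 * (v / sigma) ** 2)
        total += weight * (h(v) - v) ** 2 * pdf * dv
    return total


def soft_limiter(v, v_sat=1.0):
    """Hypothetical saturating characteristic, used only for illustration."""
    return v_sat * math.tanh(v / v_sat)


if __name__ == "__main__":
    # Larger input levels push the limiter further into saturation.
    for sigma in (0.25, 0.5, 1.0):
        print(sigma, mse_distortion(soft_limiter, sigma))
    print(dynamic_range_db(1.0, 1e-3), "dB")
```

In this sketch the distortion grows with the input level while the noise floor stays fixed, which is why the largest usable signal, and hence the dynamic range, is set by how much distortion is deemed acceptable.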
By way of example, the mathematical framework is used to derive the
dynamic range of a transconductance-C lowpass filter, in which the transconductor
is perhaps the simplest of all, the CMOS inverter.
As a starting point, we decided to study the architecture and circuit
implementation of the Liu basilar membrane model. At the circuit level, the original
cochlear filter bank implementation of Liu (Liu, 1992) was based on a
transconductor using a simple differential pair. The dynamic range of this transconductor
configured as a lowpass filter is found to be only slightly higher than that of the
CMOS inverter. However, the dynamic range of the differential pair can be greatly
extended by using one of three linearizing techniques. To our knowledge, only one
of these techniques had previously been applied to subthreshold CMOS: source
degeneration with diode-connected transistors. The other two methods, which
yield greater improvements in dynamic range, are new to subthreshold circuit
design. They are source degeneration with single and double diffusors, and multiple
asymmetric differential pairs. Analytic expressions for the input/output
functions of these transconductors are derived in laborious detail, mainly because the
derivation is not a trivial task. The transconductors have been fabricated and tested
in order to verify the derivations. These new transconductors offer between 6.8 dB and 13.9 dB
improvement in dynamic range over that of the simple differential pair with only
modest increases in circuit complexity.
At the architectural level, Liu's basilar membrane model can be described as
a lowpass cascade with taps between each stage, each tap feeding two identical
series bandpass sections. This architecture is studied in order to derive its noise
characteristics, which, for a constant-Q implementation, are approximately flat
across all output channels. In addition, assuming that the input signal power
spectrum is 1/f, we show that the signal power is distributed evenly across all
channels using this architecture. In this case, the maximum signal-to-noise ratio,
or dynamic range, is constant across all channels, a desirable characteristic for
parallel-distributed processing systems.
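The even distribution of signal power can be checked with a short calculation: the integral of 1/f between two frequencies is the logarithm of their ratio, and a constant-Q band has the same edge ratio at every center frequency. The sketch below is a simplification, not the dissertation's analysis. It assumes ideal brick-wall bands of width fc/Q and geometrically spaced centers; the 16 channels, the 8000 Hz to 100 Hz range, and Q = 2.6 are borrowed from the caption of Figure 4.6 purely as illustrative values.

```python
import math


def band_power_one_over_f(fc, q):
    """Power of a 1/f spectrum inside an ideal band of width fc/q centred on fc.

    The integral of 1/f from f_lo to f_hi equals ln(f_hi / f_lo).
    """
    f_lo = fc * (1.0 - 0.5 / q)
    f_hi = fc * (1.0 + 0.5 / q)
    return math.log(f_hi / f_lo)


# Assumed geometric spacing of 16 centre frequencies from 8000 Hz down to 100 Hz.
centers = [8000.0 * (100.0 / 8000.0) ** (k / 15.0) for k in range(16)]
powers = [band_power_one_over_f(fc, 2.6) for fc in centers]

if __name__ == "__main__":
    for fc, p in zip(centers, powers):
        print(f"{fc:8.1f} Hz  {p:.6f}")
```

Every channel reports the same value because the center frequency cancels in the ratio f_hi/f_lo, leaving a quantity that depends on Q alone.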
Again by way of example, the transconductor based on the CMOS inverter is
substituted into the original design in order to compute the dynamic range and
power dissipation of the entire filter bank. As a measure of goodness, we propose
the following information measure: bits/sec/Watt, or bits/Joule. Based on exact
equations and a 1.5-Volt implementation, the filter bank is estimated to process
0.19 bits/pJ.
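The unit bookkeeping behind this figure of merit can be sketched as follows. An information rate in bits per second divided by a power in watts gives bits per joule, and 0.19 bits/pJ corresponds to 1.9 × 10^11 bits/J. The capacity function below uses the textbook Shannon formula for a band-limited Gaussian channel as a stand-in; that choice, along with every function name, is an assumption for illustration and is not the expression derived in the dissertation.

```python
import math


def shannon_capacity_bits_per_s(bandwidth_hz, snr_linear):
    """Textbook Gaussian-channel capacity, B * log2(1 + SNR). Assumed stand-in."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def bits_per_joule(info_rate_bits_per_s, power_w):
    """Information processed per unit energy: (bits/s) / W = bits/J."""
    return info_rate_bits_per_s / power_w


def bits_per_pj_to_bits_per_joule(bits_per_pj):
    """Convert bits per picojoule to bits per joule (1 pJ = 1e-12 J)."""
    return bits_per_pj * 1e12


if __name__ == "__main__":
    # The abstract's estimate, restated in bits per joule.
    print(bits_per_pj_to_bits_per_joule(0.19))
    # Hypothetical channel: 1 kHz bandwidth, SNR of 1, dissipating 1 nW.
    rate = shannon_capacity_bits_per_s(1000.0, 1.0)
    print(bits_per_joule(rate, 1e-9))
```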
The last part of this work attempts to address a more fundamental question.
Why process signals in analog? Some researchers believe that cheap, fast,
reliable, digital processors make analog processing, i.e. analog computation, obsolete.
Perhaps a more carefully worded question is as follows: when is it advantageous
to process signals in an analog format, and when is it advantageous to process
signals digitally? Two constraints that are applicable to portable systems are
low power consumption and small size. In this work we consider four types of
signal processing: analog (continuous-time, continuous-value), switched-capacitor
(discrete-time, continuous-value), time-domain (continuous-time, discrete-value),
and synchronous digital (discrete-time, discrete-value). Each of these systems is
evaluated in terms of maximum possible information rate in bits per second per
watt of power for the case of a simple delay function.
In the summary chapter, future research directions are discussed. The
appendices contain a symmetric model of the CMOS transistor operating in subthreshold,
as well as details of a computer-controlled experimental setup for testing hardware
cochlear filter banks in the context of automatic speech recognition.
Acknowledgments
As a mentor, Dr. Andreas Andreou has a gift for understanding the importance
of what we do and identifying future trends in engineering. By contrast, I am
more excited by the details of what we do, often spending many days on long
mathematical derivations. As a result, we work as a team – more and more so,
as the day of graduation approaches. Over the six years that I've known and
worked with him, he has shared his ideas, time, and money generously with me. I
gratefully acknowledge his support.
My work builds upon the research of Dr. Weimin Liu of Hughes Network
Systems, formerly a student in the Sensory Communications Laboratory at Johns
Hopkins University. Over the years, Weimin modeled careful analysis, simulation,
experimental technique, and document preparation for the more junior members
of the lab, such as myself.
Dr. Moise Goldstein held influence on my early years at Hopkins. He
introduced me to the lab and has helped me to appreciate the wonders of the human
nervous system. I share his desire to use my engineering knowledge to help people.
Mr. Robert Jenkins of the Applied Physics Laboratory is a wonderful
(part-time) supervisor. He has encouraged and supported my research, providing the
first testing of the Hopkins Electronic EAR in a classification test. He has every
confidence in those who work under him, including me.
Dr. Gert Cauwenberghs, a relatively new faculty member at Hopkins, has been
generous towards me by being available for discussions and by reviewing all of my
papers. I also thank former department chair Dr. Charles R. Westgate for his
support and personal interest in me. Through him I obtained as much experience
as a teacher as I could while at Hopkins.
I express my gratitude to two other professors, who each instructed me in
three, seemingly peripheral, subjects. Drs. Brian Hughes and Wilson Jack Rugh
are among the finest educators I know.
Over the years, I have interacted with many colleagues and fellow students who
had either at one time worked in the Sensory Communications Laboratory
(formerly, the Speech Processing Lab) or are still burning the midnight oil in Barton
Hall. In particular, I thank Ben Yuhas, Nina Kowalski, Kwabena "Buster"
Boahen, Philippe Pouliquen, Marc Cohen, Richard Meitzler, Kewei Yang, Fernando
Pineda, Kim Strohbein, Zaven Kalaygian, Nagendra "Goel" Kumar, Mark Martin,
Hitoshi Miwa, and Stane Gruden. Together they provided a stimulating
intellectual environment for learning and conducting research and made the work much
more fun. Special thanks to Mark, Goel, Stane, Tim, Philippe, and Rich, who
provided substantial technical support and, equally valuable, criticism of my own
research.
I thank good friends, Tom, Dave, Roger, and Jeff, who, although they do
not understand the technical intricacies of analog VLSI, nonetheless know how to
listen, understand, and encourage a fledgling graduate student.
For the most beautiful wife, Carol, our three wonderful children, Aria, Cadence,
and David Canon, our families, and our parents, I express my deep love and
appreciation. It was their love, comfort, patience, and support that helped carry me
through graduate school. Six years is six years, no matter how you slice it.
I conclude with words from a song that I wrote several years ago: Thank you
God for life, for flowers to smell and mountains to climb, for air to sweetly breathe.
Thank you God for work, the chance to use the talents I have, in making something
new. This thesis I dedicate to Him.
Abbreviations
BiCMOS     bipolar/CMOS – fabrication process in which bipolar junction transistors and CMOS devices can be realized on the same substrate
C          capacitor – typically 1–10 pF in our designs
CMOS       complementary MOS – fabrication process in which both p-channel and n-channel MOSFETs can be realized on the same substrate
DFDR       distortion-free dynamic range
DLDR       distortion-limited dynamic range
DR         dynamic range – ratio of maximal to minimal signal level
FET        field-effect transistor
G          large-signal transconductance
IC         integrated circuit
MOS        metal-oxide-semiconductor – the structure of a type of field-effect transistor. Despite the name, modern devices use polysilicon, instead of metal, as the gate material
MOSFET     MOS field-effect transistor
MOSFET-C   MOSFET-capacitor – a type of continuous-time filter realization
MOSIS      an IC fabrication foundry
NMOS       n-channel MOSFET
PMOS       p-channel MOSFET
RC         resistor-capacitor
RLC        resistor-inductor-capacitor
RMS        root-mean-square value
Trans.-C   transconductance-capacitor – a type of continuous-time filter realization
VLSI       very large scale integration of circuit elements on a single substrate or chip
Contents
Abstract
Acknowledgments
Abbreviations
1 Introduction
  1.1 Motivation
  1.2 Approach
  1.3 Dissertation Outline
2 Dynamic Range of Integrators for Continuous-Time Audio Signal Processing in Analog VLSI
  2.1 CMOS Integrators
  2.2 Acoustic Input Signals
    2.2.1 Random Variables and Processes
    2.2.2 Speech
  2.3 Noise
    2.3.1 Input-Referred Noise Power Spectrum
    2.3.2 Output-Referred Noise Level
  2.4 Distortion
    2.4.1 Input-Referred Distortion
    2.4.2 Output-Referred Distortion
  2.5 Dynamic Range
  2.6 Example: Self-biased Transconductance-C Integrator
    2.6.1 Output Current and Transconductance
    2.6.2 Noise
    2.6.3 Distortion
    2.6.4 Dynamic Range
3 Linearized Transconductors in Subthreshold CMOS
  3.1 The Transconductance-C Integrator
  3.2 The Differential Pair and Definitions
  3.3 Source Degeneration
    3.3.1 Diode-Connected Transistors
    3.3.2 Single Diffusor
    3.3.3 Double Diffusors
  3.4 Multiple Differential Pairs
    3.4.1 Two Differential Pairs
    3.4.2 Three Differential Pairs
    3.4.3 Substrate Biasing Technique
  3.5 Hints on Improved Transconductor Design
    3.5.1 Use of the Gate Capacitance
    3.5.2 Voltage-Splitting
    3.5.3 Class-AB Operation
  3.6 Experimental Results
    3.6.1 Static Measurements
    3.6.2 Dynamic Measurements
    3.6.3 Summary of Results
4 The Multi-Resolution Filter Bank Model
  4.1 Filter Bank Architecture
  4.2 RLC Proto-Type Filters
    4.2.1 RC Proto-Type Lowpass
    4.2.2 RLC Proto-Type Bandpass Filter
  4.3 Complete Filter Bank Model
    4.3.1 Transfer Function
    4.3.2 Filter Bank Tuning
    4.3.3 Filter Bank Noise
  4.4 Information Rate and Power Dissipation
  4.5 Signal Power Distribution
  4.6 Results and Discussion
5 Comparison of Continuous and Discrete Circuits
  5.1 Capacity
  5.2 Four Signal Representations
    5.2.1 Continuous-Value Continuous-Time
    5.2.2 Continuous-Value Discrete-Time
    5.2.3 Discrete-Value Discrete-Time Circuit
    5.2.4 Discrete-Value Continuous-Time
  5.3 Graphical Results
  5.4 Detailed Analysis of the DVCT Circuit
    5.4.1 Jitter in a DVCT Channel
    5.4.2 Differential Entropy for a DVCT Source
    5.4.3 Approximate Capacity of DVCT Channel
6 Summary and Future Research
A MOS Technology
  A.1 MOS Transistor Model
  A.2 Other Monolithic Elements
B Cochlear Experimental Setup
  B.1 Experiments with the Hopkins Electronic EAR
    B.1.1 Abstract
    B.1.2 Preliminary Results with the HEEAR Chip Set
  B.2 Harmonic and Intermodulation Distortion
  B.3 Current-to-Voltage Converter
  B.4 A BiCMOS Voltage Buffer
    B.4.1 Compound PMOS/NPN Transistors
    B.4.2 Circuit Description
    B.4.3 Results
Bibliography
Vita
List of Figures
1.1 Optimization paradigm.
2.1 (a) MOSFET-C and (b) Transconductance-C integrators. Both have transfer function G/(Cs).
2.2 Differential transconductance-C integrator.
2.3 (a) Histogram of instantaneous values taken from the training set of the TIMIT database normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using the double-gamma distribution with � = 0.5. The solid line is for � = 0.267. Note the effect of clipping at the sides of the graph. (b) A close view of the central portion of the graph in (a).
2.4 The probability density functions of (a) the uniform (solid) and cosine value (dotted) and (b) the normal (solid) and double-sided gamma (dotted) random variable.
2.5 Noise models for the transconductance-C integrator: (a) referred to the output current, (b) referred to the input voltage, where Vin,n = Iout,n/G.
2.6 Sample transconductor output current as a function of input voltage. The solid line is the actual nonlinear function; the dashed line is a linear approximation for which the slope is the nominal transconductance, Go.
2.7 Nonlinearly transformed input voltage, h(Vin), as a function of the input voltage for a sample transconductor. (a) The first distortion measure is computed as the mean-square error between h(Vin), the solid line, and Vin, the dashed line. (b) The second distortion measure is computed as the minimum mean-square difference between h(Vin) times a gain factor �, the solid line, and Vin, the dashed line. In this plot, � = 0.84.
2.8 The normalized transconductance plotted as a function of the input voltage for a sample transconductor, solid line. The dashed line is the ideal normalized transconductance, equal to unity. The maximum normalized transconductance distortion is the maximum distance between the two curves.
2.9 Self-biased transconductance-C integrator (a) circuit and (b) symbol.
2.10 For the self-biased transconductor (a) output current in units of Ib as a function of the input voltage in units of Vt/κ, and (b) normalized transconductance G/Go as a function of input voltage. Note that the transconductance function is convex.
2.11 Noise model of self-biased transconductor (a) referenced to the output current and (b) referred to the input voltage.
2.12 Self-biased transconductance-C inverting lowpass filter (a) circuit and (b) noise model.
2.13 The first distortion level as a function of σ in units of Vt/κ, where κ = 0.7 and Vt = 25.7 mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.14 The optimal gain factor �� as a function of σ in units of Vt/κ, where κ = 0.7 and Vt = 25.7 mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.15 The second distortion level as a function of σ in units of Vt/κ, where κ = 0.7 and Vt = 25.7 mV. Curves are drawn for three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.16 The signal-to-noise-plus-distortion ratio using (a) the first and (b) the second distortion measure as a function of σ in units of Vt/κ, where Vt = 25.7 mV, κ = 0.7 and C = 5.0 pF. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.17 The distortion-free dynamic range using the first distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.18 The distortion-free dynamic range using the second distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.19 The distortion-limited dynamic range using the first distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV and 2% amplitude distortion. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
2.20 The distortion-limited dynamic range using the second distortion measure (a) as a function of C, where κ = 0.7, and (b) as a function of κ, where C = 5.0 pF. Vt = 25.7 mV and 2% amplitude distortion. Curves are drawn for each of three input distributions: uniform (solid), cosine value (dashed), and normal (dotted).
3.1 The transconductance-C integrator using the basic differential pair.
3.2 The basic differential pair (a) circuit, and (b) simplified AC noise model.
3.3 Normalized transconductance for the basic differential pair as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.4 The differential pair with source degeneration via diode-connected transistors (a) circuit, and (b) small-signal noise model.
3.5 Normalized transconductance for the differential pair with source degeneration via diode-connected transistors as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.6 The differential pair with source degeneration via a single diffusor (a) circuit, and (b) small-signal noise model.
3.7 For the differential pair with source degeneration via a single diffusor, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.8 The differential pair with source degeneration via double diffusors (a) circuit, and (b) small-signal noise model.
3.9 For the differential pair with source degeneration via double diffusors, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.10 A transconductor with two asymmetric differential pairs.
3.11 For a transconductor with two asymmetric differential pairs, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.12 A transconductor with three asymmetric differential pairs.
3.13 For a transconductor with three asymmetric differential pairs, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.14 A transconductor with two asymmetric differential pairs demonstrating the substrate biasing technique to achieve a maximally flat transconductance function.
3.15 Application of the voltage-splitting technique to the transconductor with source degeneration via double diffusors.
3.16 Fully complementary transconductor design based on source degeneration via double diffusors.
3.17 Experimental data of the differential output current as a function of input voltage for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Each dot represents a sample point.
3.18 Normalized transconductance as a function of VDM computed from experimental data for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Solid lines show the predicted values.
3.19 Experimental data of the normalized transconductance as a function of VDM for (a) the basic differential pair and (b) the differential pair with source degeneration via a single diffusor.
4.1 Block diagram of Liu's N-channel basilar membrane model consisting of a cascade of N lowpass sections with taps to two bandpass filters per output channel (Liu, 1992).
4.2 RC first-order lowpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.
4.3 Self-biased transconductance-C integrator: (a) circuit and symbol, (b) configured as first-order lowpass filter.
4.4 RLC proto-type second-order bandpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.
4.5 RLC proto-type second-order bandpass filter composed of six self-biased transconductors.
4.6 Response of 16-channel cochlear filter bank using exact equations, (a) magnitude and (b) group delay. Filter parameters are as follows: fc(1) = 8000 Hz, fc(16) = 100 Hz, Q3(1) = 2.6, and Q3(16) = 2.6. Two preliminary lowpass sections are added for better uniformity in the peak response.
4.7 Single-section of lowpass cascade showing tuning mechanism via (a) supply lines and (b) substrate lines.
4.8 (a) Power spectral density of noise in 16-channel cochlear filter bank with C = 5.0 pF and other parameters as earlier defined, and (b) RMS noise as a function of center frequency.
4.9 Short-term averaged power spectrum taken from the training set of the TIMIT database normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using a bandpass filter with center frequency 550 Hz and Q � 1.
5.1 (a) CVCT RC lowpass circuit, (b) CVDT sample-and-hold circuit, (c) DVCT RC delay circuit, and (d) DVDT clocked M-bit delay.
5.2 Signal-to-noise ratio as a function of mean power. Results from (Hosticka, 1985). CVCT Solid (fp = 100 MHz), CVDT Dashed (2fp = fs = 100 MHz), and DVDT (� = 100, 2fp = fs = 100 MHz) Number of bits.
5.3 System capacity as a function of mean power dissipation. Results from (Hosticka, 1985). CVCT Solid (fp = 100 MHz), CVDT Dashed (2fp = fs = 100 MHz), and DVDT (� = 100, 2fp = fs = 100 MHz) Number of bits.
5.4 Re-formulated signal-to-noise ratio as a function of mean power (fp = 0.5fs = 100 MHz, � = 1E-12). CVCT Solid, CVDT Dashed, DVDT Number of bits.
5.5 Re-formulated system capacity as a function of mean power (fp = 0.5fs = 100 MHz, � = 1E-12). CVCT Solid, CVDT Dashed, DVDT Number of bits.
A.1 (a) View of an nMOS transistor on the substrate and (b) symbol.
A.2 MOS small-signal subthreshold model including sources of shot noise only, (a) as a diffusor, and (b) in saturation.
A.3 Noise data taken from a PMOS transistor with W/L = 1148/4. Solid lines are noise model, x's are data. Curve (a) corresponds to 1 nA for an equivalent square device, (b) 10 nA, and (c) 100 nA. (κ = 0.7, Cox = 1500 F/m2, and M = 4.0E-26 J.)
B.1 Experimental setup for simultaneously stimulating and recording from the HEEAR chip set. One PC outputs a previously recorded speech signal via the D/A converter module. The analog speech signal is attenuated before being presented to the silicon basilar membrane. Thirty-one output channels are fed into independent hair-cell synapse circuits. The outputs from the HEEAR chip set are digitized and stored on a second PC after passing through a custom analog interface. Synchronization is achieved by recording the input signal along with the 31 output channels. Depending on the application, a microphone can be connected directly to the input of the pre-amplifier.
B.2 Silicon cochlea response to 1 kHz tone burst at 1/4 fullscale. The characteristic frequencies of the output channels are shown above half of the traces. Only one channel (668 Hz) appears to be adapting to the stimulus.
B.3 Silicon cochlea response to 1 kHz tone burst at 1/2 fullscale. The characteristic frequencies of the output channels are shown above half of the traces. The response of one channel (668 Hz) is high during the first three cycles of the tone burst, but reduces to roughly one half its initial value by the tenth cycle.
B.4 Silicon cochlea response to 200 Hz tone at 1/4 fullscale RMS and 12 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces.
B.5 Silicon cochlea response to 200 Hz tone at 1/4 fullscale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces.
B.6 Silicon cochlea response to male token of /jh er/ at 1/5 fullscale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/.
B.7 Silicon cochlea response to male token of /jh er/ at 1/5 fullscale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ appears to be buried in the noise.
B.8 Silicon cochlea response to female token of /jh er/ at 1/5 fullscale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/.
B.9 Silicon cochlea response to female token of /jh er/ at 1/5 fullscale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ is barely discernible in the noise.
B.10 Schematic of current-to-voltage converter as used in the computer interface to the Hopkins Electronic EAR.
B.11 Compound PMOS/NPN Transistor.
B.12 Current as a function of voltage for a compound PMOS/NPN transistor (solid line) and an NPN transistor (dashed line). Two transistors of each type were measured, but the resulting curves were so similar for each type that they would be indistinguishable on this graph. The saturation of the NPN transistor output current near 10 mA was caused by limitations in the measurement equipment. In all cases, VCE = 2.0 V.
B.13 BiCMOS Buffer Amplifier.
List of Tables
1.1 Percent Error Rates from (Neti, 1994).
2.1 Probability density functions of four input signal distributions as a function of $\sigma$.
2.2 Performance ratios.
3.1 Summary of Linearization Techniques with Constant ($G_o/C$), $C = 5$ pF ($V_t = 25.7$ mV, $\kappa = 0.7$).
3.2 Summary of Linearization Techniques with Constant ($G_o/C$), $I_b$ ($V_t = 25.7$ mV, $\kappa = 0.7$).
4.1 Characteristics of the first and second-order OTA-C filters
4.2 Square magnitude transfer functions from all noise sources to $V_{out}$ for the second-order OTA-C filter.
A.1 List of MOS device parameters and quantities
B.1 Specifications, simulation results, and measurements for the BiCMOS buffer amplifier.
Chapter 1
Introduction
1.1 Motivation
Emerging opportunities in information technologies point towards markets for
portable systems where battery operation, light weight and small size will be in
demand. A distinct characteristic of these systems is their direct interface to people
in real-world environments. Thus, with their widespread deployment, the performance
of the sensory communication interfaces, or what is often called the user
interface, is rapidly becoming a central issue. What are already computationally
difficult problems in speech and vision now become even harder. Mobile operation
necessitates that the sensory communication interface be capable of robust
operation under highly variable environmental conditions (Andreou, 1995).
An example of a challenging user interface is the automatic recognition of human
speech. Typically, the incoming speech signal is digitized, transformed, and
compressed. Its linguistic attributes such as the phonemic sequence and word order
are then identified using a sophisticated classification scheme, such as a Hidden-Markov
Model. The preprocessing in state-of-the-art speech recognizers can be
classified as short-term Fourier transform, linear prediction, cepstrum, and their
variations. These signal analysis or decomposition schemes are mathematically
oriented, and are based, to some extent, on a simplified model of speech production.
The notable exceptions are the Bark and Mel-frequency scale filter banks
which have roots in the psychophysics of human hearing. Although great strides
have been made in the last two decades, the best speech recognition systems today
still do not have performance comparable to the human auditory system. Particularly
difficult at the early stage is, for instance, the identification of rapidly changing
sounds such as stop consonants. This very fact suggests that perhaps one should
explore alternative signal representation schemes in order to model how speech is
processed by the human nervous system, the very best speech recognition system.
Indeed engineers can learn a great deal from the cochlea, the periphery of
the human auditory system.
Cochlear filter banks abstract the function of the mammalian cochlea, in particular,
the movement of the basilar membrane in response to acoustic vibrations.
From a signal processing viewpoint, cochlear filter banks can be thought of as
multiresolution analyzers. As such, they retain good frequency resolution at low
frequencies and good temporal resolution at high frequencies. We distinguish between
cochlear filter banks and silicon cochleas, which incorporate some of the
more salient non-linearities of cochlear function. Examples of such non-linearities
are automatic-gain control, rectification, saturation, and the like. In this work,
we restrict our study to a linear abstraction of cochlear function, but with the
understanding that non-linearities play an essential role in widely diverse acoustic
environments.
The tradeoff between time and frequency resolution is viewed as the fundamental
difference between the conventional spectrographic analysis based on the short-term
Fourier transform and cochlear analysis for broadband, rapidly-changing signals,
such as speech. It can be shown that linear filter bank approximations to
cochlear models approximate wavelet analysis (Liu, 1992; Yang et al., 1992) in
a scale domain that preserves good temporal resolution. As a consequence, the
frequency of each spectral component in a broadband signal can be accurately
determined from the inter-peak intervals in the filter bank output signals. Such
properties of cochlear models have been demonstrated with natural speech and
synthetic complex signals (Liu, 1992; Liu et al., 1992a; Liu et al., 1992b; Liu et al.,
1993).
The use of cochlear models as the front-end signal analyzer has been shown
experimentally to yield improved performance in small-scale speech recognition
systems (Ghitza, 1986; Meng and Zue, 1990). More recently, Chalapathy Neti
reported on a study with a commercially available speech recognition system in a
large-vocabulary, isolated-word task in the presence of additive babble noise (Neti,
1994). By employing an auditory based acoustic processor as a front end, he was
able to demonstrate a much more graceful degradation in system performance,
compared to the conventional FFT-based acoustic processing scheme. His main
results are summarized in Table 1.1.
Table 1.1: Percent Error Rates from (Neti, 1994).
Pre-processing Method    42.2 dB SNR    36.7 dB SNR    30.7 dB SNR    24.7 dB SNR
Cochlear                 4.71 %         4.71 %         6.35 %         22.54 %
FFT-Based                3.69 %         4.51 %         16.80 %        40.37 %
The auditory processing employed in Neti's study begins with a software simulation
of the basilar membrane filter bank proposed by Liu and analyzed in detail
in Chapter 4 of this work. The filter bank is followed by a temporal feature extraction
scheme proposed by (Yang et al., 1992) and interfaced (Neti, 1994) to the
Hidden-Markov recognizer.[1]

[1] The source code for Liu's model as well as Neti's additions can be found on CD-ROM

Two hypotheses can be gleaned from Table 1.1. The first is that, in a clean
environment, an auditory model does not improve performance. The second is that
FFT-based models are not robust in the presence of additive background noise.
Both of these conjectures will be subject to debate over the next decade, as we seek
processing methods that not only give good performance in clean environments,
but also in the presence of noise.
To date, however, cochlear models have not been widely adopted by the speech
recognition research community. Two major obstacles exist. One is the compu-
tational complexity of the cochlear model itself; the other is the large amount of
analysis data a cochlear model generates. Cochlear models require much more
computing resources than conventional speech analysis approaches to the extent
that a reasonably accurate cochlear model can be computationally too expensive
for general-purpose digital computers. As a case in point, the time required to pro-
cess speech using the cochlear model in Neti's experiment was 120 times real-time
on a SPARC 2. Clearly, a real-time cochlear model would make this processing
scheme more attractive.
Much of this work and the work of many others has been devoted to the efficient
hardware implementation of the first part of the auditory modeling task,
the cochlear filter bank. The first real-time electronic cochlear filter bank was introduced
by Lyon and Mead (Lyon and Mead, 1988). Their model consisted of
a cascade of 480 second-order sections, i.e., almost 1000 poles on a single chip,
operating in subthreshold CMOS. It consumed only milliwatts of power. Their
pioneering efforts have been a great source of inspiration. Since that time, many
continuous-time analog VLSI implementations of cochlear models have been reported,
including those in (Liu et al., 1992b; Watts et al., 1992; Lazzaro et al.,
1993; Bhadkamkar, 1993; Andreou and Liu, 1993). Cochlear models have also
been reported using switched-capacitor circuits (Lin et al., 1994), and switched-current
techniques (Park et al., 1993). Even with real-time cochlear models, their
interface to digital computers has been problematic. An experimental setup to interface
Liu's cochlear model for real-time stimulation and recording was reported
in (Furth et al., 1994) and is described in appendix B. A second, more efficient
solution to the problem and a discussion of system-level issues has been reported
recently by Lazzaro, Wawrzynek and Kramer (Lazzaro et al., 1994).
While research so far has aimed at improving various characteristics of the Lyon-Mead
design, such as improved linear range, low bit-rate communication protocols,
and more realistic phase characteristics, the issues of noise, distortion, and dynamic
range in the filter banks have not been given a thorough treatment. This research
is intended to be a major step in that direction.
In addition to speech recognition, a real-time cochlear model can potentially
be applied to other signal processing tasks, such as tactile aids for the hearing-
impaired and cochlear implants. Integrated circuits are small and light in weight,
and in most of these applications it is also desirable that the device be low-power.
1.2 Approach
The goal of our work is to optimize the design of an acoustic processor at all levels,
from the architecture or algorithms to the detailed circuit implementation. This
problem is constrained at one end by the attributes of the speech, and at the other
by the available fabrication technology. Further constraints are imposed by the
acoustic environment, such as ambient noise and distortions, as well as market
constraints as discussed earlier. Fig. 1.1 depicts this paradigm.
The technology of choice is CMOS, due to the possibility of very-large-scale
integration (VLSI), low cost, and high reliability. But some question may arise as
to the most efficient use of this technology, i.e. what signal representation should
be utilized. Analog? Digital? Continuous-time? Discrete-time? We address this
problem by considering the maximum possible information rate as a function of
power dissipated for four circuits, each processing signals using a different
representation.

Figure 1.1: Optimization paradigm. [Diagram labels: Acoustic Source, Environment, Technology, Market, Architecture, Circuits.]
At the architectural level, we study the hardware implementation of the basilar
membrane model proposed by Liu (Liu, 1992). We extend his work, determining
the noise, distortion, and signal-handling capabilities of this filter bank. As a
measure of goodness, we propose and estimate the information rate per Watt, in
bits per Joule, for a particular VLSI implementation of this filter bank.
At the circuit level, the problem of designing high dynamic-range continuous-time
filters is addressed. Using known properties of speech signals, the dynamic
range of a single filtering element is computed. It is then extended using linearizing
techniques in order to greatly reduce undesirable signal distortions.
1.3 Dissertation Outline
The next chapter, Chapter 2, discusses a method for computing the dynamic
range of a CMOS transconductance-C integrator, covering the topics of input signal
statistics, noise, distortion, and dynamic range.
Chapter 3 introduces and analyzes several CMOS transconductor designs operating
in the subthreshold region. At least three of them have never been used
in subthreshold design: those with source degeneration via single and double
diffusors, and one which uses asymmetric differential pairs.
Chapter 4 applies the techniques and circuits of Chapter 2 and Chapter 3 to
the silicon implementation of a proposed cochlear model (Liu, 1992) using analog
VLSI technology.
Chapter 5 discusses limitations in information processing using continuous
and discrete systems. A new measure of goodness is proposed for mobile subsys-
tems, bits/sec/watt.
The results suggest future research directions, as found in Chapter 6.
Appendix A contains a symmetric large-signal and small-signal noise model
for subthreshold CMOS.
Appendix B gives details of the hardware setup for testing the VLSI implementations
of filter banks in large-scale speech recognition experiments.
Chapter 2
Dynamic Range of Integrators
for Continuous-Time Audio
Signal Processing in Analog
VLSI
Given constraints on the available technology, supply voltage, total current consumption,
and die area, our goal is to optimize the circuit realization of an analog
filter in terms of its achievable dynamic range. The design of optimum fully-integrated
continuous-time filters requires the optimization of a single integrator,
as well as optimization of the filter structure (Groenewold, 1991). In this chapter
we derive the dynamic range of a single integrator implemented as a lowpass filter
as a basic building block for realizing continuous-time filters for audio processing
in analog VLSI. We compute the dynamic range as a function of capacitance and
body effect coefficient $\kappa$.
The first real-time analog VLSI cochlear filter bank was introduced by Lyon
and Mead (Lyon and Mead, 1988). Their model consisted of a cascade of 480
second-order sections, i.e., almost 1000 poles, operating in subthreshold CMOS.
Since that time, many continuous-time analog VLSI implementations of cochlear
models have been reported, including those in (Liu et al., 1992b; Watts et al.,
1992; Lazzaro et al., 1993; Bhadkamkar, 1993; Andreou and Liu, 1993). Cochlear
models have also been reported using switched-capacitor circuits (Lin et al., 1994)
and switched-current techniques (Park et al., 1993).
While research so far has aimed at improving various characteristics of the original
design, such as improved linear range, low bit-rate communication protocols,
and more realistic phase characteristics, the issues of noise, distortion, and dynamic
range in the filter banks have not been given a thorough treatment. Given
constraints on the available technology, supply voltage, total current consumption,
and die area, our ultimate goal is to optimize the circuit realization of an analog
continuous-time linear filter bank in terms of its achievable dynamic range.
Although a non-linear filter bank with automatic gain control ultimately appears to
be necessary for addressing the wide dynamic range inherent to natural speech
sounds, here we restrict our research to the design of linear systems.
In this chapter we present a framework for the design of multiresolution analog
filter banks implemented in subthreshold CMOS technology using the well-developed
random signals formalism. We begin with a discussion on the choice of
integrators, concluding that the transconductance-C is the preferable structure for
our problem. Models of the audio input signal are given in section 2.2. In section 2.3,
we compute the output-referred noise-level of a general transconductance-C
filter using a white noise model for the transconductor. Three mean-square
distortion measures referenced to the input signal are defined in section 2.4. Various
performance ratios are used to characterize the transconductance-C integrator.
Then, by way of example, the dynamic range of the self-biased transconductance-C
integrator is derived in section 2.6.
The high side of the dynamic range of an integrator is the maximal signal
level it can handle. For applications in which linearity in the signal is of utmost
importance, the maximal signal level is the level at which distortion products
are just equal to the noise level. We refer to this type of dynamic range as the
distortion-free dynamic range. For applications in which a certain level of distortion
is tolerable, the maximal signal level is the level at which distortion products are
just equal to the maximal allowable distortion level. We call this type of dynamic
range the distortion-limited dynamic range. In this chapter, we will deal with both
types of dynamic range. They can be formally described by the equations

$$\mathrm{DFDR} \equiv \left. \frac{\overline{V_s^2}}{\overline{V_n^2}} \right|_{\overline{V_d^2}/\overline{V_n^2} \,=\, 1}
\qquad
\mathrm{DLDR} \equiv \left. \frac{\overline{V_s^2}}{\overline{V_n^2}} \right|_{\overline{V_d^2}/\overline{V_s^2} \,=\, c^2} \tag{2.1}$$

where $\overline{V_s^2}$, $\overline{V_d^2}$, and $\overline{V_n^2}$ are the mean-square signal, distortion, and noise values,
respectively, and c is the percent allowable distortion.[1]
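The two conditions in (2.1) can be illustrated numerically. The following is a minimal sketch, not the method used in this work: it assumes a hypothetical cubic-law distortion model, $\overline{V_d^2} = a\,(\overline{V_s^2})^3$, and the names `solve_increasing`, `dynamic_ranges`, `a`, and `noise` are introduced here purely for illustration.

```python
import math

def solve_increasing(f, target, lo=1e-30, hi=1e30, iters=200):
    """Find s in [lo, hi] with f(s) == target for an increasing f (bisection on log10 s)."""
    a, b = math.log10(lo), math.log10(hi)
    for _ in range(iters):
        m = 0.5 * (a + b)
        if f(10.0 ** m) < target:
            a = m
        else:
            b = m
    return 10.0 ** (0.5 * (a + b))

def dynamic_ranges(distortion_power, noise_power, c):
    """Return (DFDR, DLDR) as power ratios, following the two conditions in (2.1)."""
    # DFDR: signal power at which distortion power equals noise power.
    s_df = solve_increasing(distortion_power, noise_power)
    # DLDR: signal power at which the distortion-to-signal power ratio equals c^2.
    s_dl = solve_increasing(lambda s: distortion_power(s) / s, c * c)
    return s_df / noise_power, s_dl / noise_power

def to_db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)

# Hypothetical values: cubic-law distortion Vd^2 = a * (Vs^2)^3, fixed noise power.
a = 1.0
noise = 1e-6
dfdr, dldr = dynamic_ranges(lambda s: a * s ** 3, noise, c=0.1)
print(round(to_db(dfdr), 2), round(to_db(dldr), 2))
```

Any monotonically increasing distortion model can be passed in place of the cubic law; the bisection only relies on monotonicity.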
2.1 CMOS Integrators
Groenewold (Groenewold, 1991) briefly describes four possible integrator structures.
The passive conductance/passive admittance integrator, consisting of a linear
resistor and a linear capacitor, is dismissed because its pole location is not at
the origin of the s-plane, i.e. it is not a pure integrator, but a lowpass filter. As for
the active conductance/active admittance structure, consisting of a transconductor
and an active capacitance, its noise factor is increased, whereas its signal handling
capacity is reduced, compared to either of the two remaining circuit structures.
Thus, it cannot be used to achieve an optimal dynamic range.
The two types of integrators Groenewold (Groenewold, 1991) analyzes in depth
are the MOSFET-C and transconductance-C integrators, shown in Fig. 2.1.
[1] $\overline{V}$ denotes the mean, or average value, of the signal $V$.

The main advantage of the MOSFET-C integrator is that it can approach the
dynamic-range maximum, as defined by Groenewold (Groenewold, 1991). Its chief
disadvantage is the relatively narrow frequency tuning range of approximately half
an octave. Such a tuning range is generally adequate to compensate for parametric
variations in the fabrication process; however, filter banks for processing
wideband signals, such as speech, require a frequency tuning range of 2 decades or
more, encompassing much of the normal hearing range of 20 Hz to 20 kHz. A second
disadvantage of the MOSFET-C implementation is the need for a high-gain,
low-output-impedance amplifier. Thus, one possible road towards the design of
optimum dynamic-range integrators for speech processing is to develop a MOSFET-C
implementation with a much broader tuning range and a low power amplifier.
The alternate approach employs transconductance-C integrators and is the one
adopted in our work. By exploiting the exponential current-to-voltage relation-
ship in subthreshold MOS transistors, integrators that span several decades in
frequency can be easily made. One has the added bonus of low power and low
voltage operation, as the device physics of subthreshold CMOS enable the lowest
possible saturation voltage (Vittoz, 1994; Andreou and Boahen, 1994). The main
disadvantage of transconductors operating in the subthreshold region is the rela-
tively poor linear range. There exist however several techniques to linearize these
inherently nonlinear transconductors (Watts et al., 1992; Tanimoto et al., 1991;
Vittoz, 1994; Furth and Andreou, 1995). In short, for low voltage, low power
systems, transconductance-C integrators appear to be the best, if not the only,
option (Tanimoto et al., 1991).
As explained by Groenewold (Groenewold, 1991), the use of the differential
signaling configuration of Fig. 2.2 can increase the dynamic range by 6 dB. This
increase is because the capacitor voltage swing is doubled, whereas the effective
noise seen at the input is unchanged. In this preliminary study, we will examine
the dynamic range of the single-ended integrator. Thus, we expect an increase
in the dynamic range of up to 6 dB beyond that reported in this paper when we
employ a differential signaling scheme.
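As a quick check on the 6 dB figure, assuming only what is stated above (the signal swing doubles while the noise power is unchanged), the change in dynamic range is

```latex
\Delta\mathrm{DR}
  = 10\log_{10}\frac{(2V_s)^2/\overline{V_n^2}}{V_s^2/\overline{V_n^2}}
  = 20\log_{10} 2
  \approx 6.02\ \mathrm{dB}
```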
Figure 2.1: (a) MOSFET-C and (b) transconductance-C integrators. Both have transfer function $G/(sC)$.

Figure 2.2: Differential transconductance-C integrator. [Schematic labels: inputs $+V_{in}/2$ and $-V_{in}/2$, transconductor $G$, capacitor $C$, outputs $+V_{out}/2$ and $-V_{out}/2$.]
2.2 Acoustic Input Signals
The efficiency and performance of any information processing system, both hardware
and software, can be improved by incorporating prior knowledge in the design
phase. Indeed, this is the keystone for success in all statistical speech recognition
systems (Roe and Wilpon, 1994). In the problem at hand, we know a priori that
the system will process speech signals which, in the framework of random signals,
can be described by linear statistics such as the mean and variance of the amplitude.
Two examples of prior knowledge which will be exploited in the synthesis
and characterization of the cochlear filter bank are the input amplitude distribution
$f(V_{in})$ and power spectral density $S_{V_{in}}(\omega)$.
Traditional methods of determining system performance assume a sinusoidal
input signal. The maximum signal level is then the level of the sinusoid such
that the distortion is acceptably low. Two common distortion measures are total
harmonic distortion and intermodulation distortion. The use of sinusoidal input
signals is perhaps appropriate when evaluating a narrowband filter. On the other
hand, a cochlear filter bank is likely to encounter various types of signals, ranging
from such broadband signals as speech and music, to narrowband signals, like
tones. In general, we find that sinusoidal signals overestimate the performance of
a cochlear filter bank, as compared to more speech-like signals. In this section, we
consider the statistical properties of speech signals and incorporate them into the
evaluation of our filter bank model.
2.2.1 Random Variables and Processes
A random variable can be formally described as follows (Papoulis, 1965). One
is given an experiment E, whose outcomes $\zeta$ are various objects belonging to a
sample space S. Now, to every outcome $\zeta$ one assigns a number according to the
real-valued function

$$\mathbf{x} = \mathbf{x}(\zeta) \tag{2.2}$$

Because the value of a random variable $\mathbf{x}$ is determined by the outcome of the
experiment, we may assign probabilities to the possible values of the random variable.
(Here boldface $\mathbf{x}$ denotes the random variable and $x$ a real number.)
Given a real number $x$, the set $\{\mathbf{x} \le x\}$, consisting of all outcomes $\zeta$ such that
the inequality is satisfied, is an event. Define then the probability distribution
function of the random variable $\mathbf{x}$ as

$$F_x(x) = P\{\mathbf{x} \le x\} \tag{2.3}$$

If the function $F_x(x)$ is absolutely continuous, the random variable $\mathbf{x}$ is of
continuous type. The derivative

$$f(x) = \frac{dF_x(x)}{dx} \tag{2.4}$$

of the distribution function is called the probability density function.
A real stochastic, i.e. random, process is statistically determined if one knows
its nth order distribution functions

$$F(x_1, \ldots, x_n; t_1, \ldots, t_n) = P\{\mathbf{x}(t_1) \le x_1, \ldots, \mathbf{x}(t_n) \le x_n\} \tag{2.5}$$

for any $n$ and $t_1, \ldots, t_n$. The mean of a process $x(t)$ is

$$\eta_x(t) = E[x(t)] \tag{2.6}$$

The autocorrelation function of a process is defined as

$$R_x(t_1, t_2) = E[x(t_1)\,x(t_2)] \tag{2.7}$$

A random process $x(t)$ is wide-sense stationary (or weakly stationary) if its expected
value is a constant and its autocorrelation depends only on $\tau = t_1 - t_2$:

$$\eta_x = E[x(t)], \qquad R_x(\tau) = E[x(t)\,x(t+\tau)] \tag{2.8}$$
The one-sided power spectrum (or spectral density) $S_x(\omega)$ of a wide-sense stationary
process is the Fourier transform of its autocorrelation:

$$S_x(\omega) = \int_0^\infty R_x(\tau)\, e^{-j\tau\omega}\, d\tau \tag{2.9}$$

with inverse

$$R_x(\tau) = \frac{1}{2\pi} \int_0^\infty S_x(\omega)\, e^{j\tau\omega}\, d\omega \tag{2.10}$$

Substituting $\tau = 0$, the above equation yields

$$R_x(0) = E[x^2(t)] = \frac{1}{2\pi} \int_0^\infty S_x(\omega)\, d\omega \tag{2.11}$$

Thus the total area of the power spectrum (normalized by $2\pi$) equals the "average
power" of the process $x(t)$.
A property that we will make use of is that if a wide-sense stationary random
process $x(t)$ undergoes a linear transformation, as in

$$y(t) = L[x(t)] \tag{2.12}$$

then $y(t)$ is also stationary in this sense. Let $H(j\omega)$ be the transfer function of the
linear transformation $L[\cdot]$. Then the power spectrum of the output process $y(t)$ is
given by

$$S_y(\omega) = S_x(\omega)\, |H(j\omega)|^2 \tag{2.13}$$
2.2.2 Speech
Let speech be the primary acoustic source. Assuming it is a wide-sense stationary
random process, one can compute the amplitude probability density and power
spectral densities from a standard database. Amplitude histograms of the time-domain
waveforms of voiced and unvoiced segments of speech have a maximum
near zero and sides which decrease exponentially as the amplitude moves away
from zero. Two probability density functions which fit this description are the
Gaussian, or normal, and the two-sided Gamma. The normal probability density
was considered by Max (Max, 1960) in the problem of minimizing the distortion in
speech coders. The Gamma distribution was discussed by Paez and Glisson (Paez
and Glisson, 1972) for the same problem. By constructing amplitude histograms
of raw speech signals they show that the best approximation is obtained using the
Gamma distribution with shape parameter $\alpha = 0.5$.
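The probability density function of the two-sided Gamma distribution can be
written in the following form (Rice, 1988):

$$f(x) = \frac{\lambda^{\alpha}}{2\,\Gamma(\alpha)}\, \frac{e^{-\lambda |x|}}{|x|^{1-\alpha}} \tag{2.14}$$

It has mean-absolute value $\alpha/\lambda$, and variance $\alpha(\alpha+1)/\lambda^2$. If we define the variance
as $\sigma^2$ and eliminate the variable $\lambda$, the distribution can be rewritten in the form

$$f(x) = \frac{(\alpha^2 + \alpha)^{\alpha/2}}{2\,\sigma^{\alpha}\,\Gamma(\alpha)}\, \frac{e^{-|x| \sqrt{\alpha^2 + \alpha}/\sigma}}{|x|^{1-\alpha}} \tag{2.15}$$

For the special case that $\alpha = 0.5$, the probability density function reduces to

$$f(x) = \sqrt{\frac{\sqrt{3}}{8 \pi \sigma |x|}}\; e^{-\sqrt{3/4}\,|x|/\sigma} \tag{2.16}$$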
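The reduction from (2.15) to (2.16), and the claim that the density has unit area and variance $\sigma^2$, can be checked numerically. The sketch below is illustrative only; the function names and the substitution $x = \sigma u^2$ (used to remove the integrable $1/\sqrt{|x|}$ singularity at the origin for $\alpha = 0.5$) are choices made here, not part of the original analysis.

```python
import math

def two_sided_gamma_pdf(x, sigma, alpha):
    """Two-sided gamma density in the variance-normalized form of (2.15)."""
    k = alpha * alpha + alpha            # equals (lambda * sigma)^2
    norm = k ** (alpha / 2) / (2 * sigma ** alpha * math.gamma(alpha))
    return norm * math.exp(-abs(x) * math.sqrt(k) / sigma) / abs(x) ** (1 - alpha)

def half_gamma_pdf(x, sigma):
    """Closed form (2.16) for the special case alpha = 0.5."""
    scale = math.sqrt(math.sqrt(3) / (8 * math.pi * sigma * abs(x)))
    return scale * math.exp(-math.sqrt(3 / 4) * abs(x) / sigma)

def moment(order, sigma, alpha=0.5, u_max=10.0, n=20000):
    """Midpoint-rule estimate of E[|x|^order], intended for alpha = 0.5.

    Substituting x = sigma * u^2 makes the integrand smooth at the origin.
    """
    du = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        x = sigma * u * u
        dx_du = 2 * sigma * u
        total += (x ** order) * two_sided_gamma_pdf(x, sigma, alpha) * dx_du * du
    return 2 * total  # the density is symmetric about zero

sigma = 1.0
max_rel_err = max(
    abs(two_sided_gamma_pdf(x, sigma, 0.5) / half_gamma_pdf(x, sigma) - 1)
    for x in (0.01, 0.1, 1.0, 3.0, -2.0)
)
print(max_rel_err, moment(0, sigma), moment(2, sigma))
```

The zeroth moment estimates the total area and the second moment estimates the variance, so the last two printed values should be close to 1 and $\sigma^2$ respectively.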
In order to test this hypothesis, the amplitude histogram of the entire training
set from the TIMIT database was computed. Results are given in Fig. 2.3. Indeed,
the Gamma distribution appears to be a good fit for the data. Using the method
of moments (Rice, 1988), a better estimate for this database appears to be $\alpha = 0.267$.
The difference between the two models is seen chiefly in the tails of the
distribution. For amplitudes less than five standard deviations from the mean
value, zero, either distribution appears adequate, as evidenced by Fig. 2.3(b).
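One way a method-of-moments estimate of the shape parameter could be formed is sketched below. The text does not say which moments were matched, so this is an assumption: it uses the mean-absolute value $\alpha/\lambda$ and the variance $\alpha(\alpha+1)/\lambda^2$ quoted above, whose ratio $E[x^2]/(E|x|)^2 = (\alpha+1)/\alpha$ is independent of $\lambda$. The synthetic samples stand in for speech data; `estimate_alpha` and `draw_two_sided_gamma` are hypothetical helper names.

```python
import random

def estimate_alpha(samples):
    """Method-of-moments estimate of the two-sided gamma shape parameter."""
    n = len(samples)
    mean_abs = sum(abs(x) for x in samples) / n   # estimates alpha / lambda
    mean_sq = sum(x * x for x in samples) / n     # estimates alpha * (alpha + 1) / lambda^2
    ratio = mean_sq / (mean_abs * mean_abs)       # equals (alpha + 1) / alpha
    return 1.0 / (ratio - 1.0)

def draw_two_sided_gamma(alpha, lam, n, seed=0):
    """Draw n samples: a gamma-distributed magnitude with a random sign."""
    rng = random.Random(seed)
    # random.gammavariate takes a scale parameter, so scale = 1 / lambda.
    return [rng.choice((-1.0, 1.0)) * rng.gammavariate(alpha, 1.0 / lam)
            for _ in range(n)]

samples = draw_two_sided_gamma(alpha=0.5, lam=1.0, n=200_000)
alpha_hat = estimate_alpha(samples)
print(alpha_hat)
```

Because the estimate depends on the fourth-moment behavior of the data, it is sensitive to the tails, which is consistent with the observation that the two fitted models differ chiefly there.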
One practical difficulty of the double-gamma distribution is that its amplitude
is unbounded. And, while the tails of the distribution fall off quickly, of the order
$e^{-|x|}$, it is not fast enough for at least one transconductor design, the CMOS
inverter, for which the distortion grows at just the same order. One possible
remedy to this situation is to introduce a bounded double-gamma distribution,
which is clipped, say, at five standard deviations.

Figure 2.3: (a) Histogram of instantaneous values taken from the training set of the TIMIT database normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using the double-gamma distribution with $\alpha = 0.5$. The solid line is for $\alpha = 0.267$. Note the effect of clipping at the sides of the graph. (b) A close view of the central portion of the graph in (a). [Axes: x in standard deviations (horizontal), p(x) on a logarithmic scale (vertical).]
In addition, we consider three other types of input signals: those that have
uniform, sinusoidal, and normal statistics. The uniformly distributed signal is attractive
because it is mathematically tractable and represents a situation that is
not too far from the case of a pure tone. As shall be discussed in chapter 5, a uniformly
distributed input signal results in the highest information rate for a circuit
that has a peak amplitude constraint. The second type of input signal is classical
in filter theory and analysis: pure tones and sums of pure tones. The amplitude
distribution of a pure tone looks U-shaped and is derived below. In addition, sums
of pure tones can be used to study intermodulation frequency distortion. The third
type of input signal, normal, is the most similar to natural sounds, such as speech
and music.
To derive an equation for the cosine amplitude distribution, we make use of
basic probability theory. Let the input signal $x$ be given by

$$x = g(t) = \sqrt{2}\,\sigma \cos\left(\frac{2\pi t}{\tau}\right) \tag{2.17}$$

where $\sigma$ is the RMS value and $\tau$ is the period.
Now suppose that $t$ is a uniform random variable on the interval $[0, \tau/2]$,
comprising one-half of a cycle of the cosine. Let $f_\tau(t)$ denote the probability
density function of $t$, given by

$$f_\tau(t) = \frac{2}{\tau} \tag{2.18}$$

on this interval, and zero everywhere else.
On the same interval, the function $g(t)$ is monotonically decreasing, and therefore,
an inverse function exists, namely,

$$t = g^{-1}(x) = \frac{\tau}{2\pi} \arccos\left(\frac{x}{\sqrt{2}\,\sigma}\right) \tag{2.19}$$

From probability theory, the density function of the amplitude of $x$ on that
interval is given by (Ross, 1988),

$$f(x) = f_\tau[g^{-1}(x)] \left| \frac{d}{dx} g^{-1}(x) \right| \tag{2.20}$$

Because the cosine amplitude distribution is the same in the first half-cycle as
in the second, it follows that the cosine amplitude density function is equal to

$$f(x) = \frac{1}{\sqrt{2}\,\pi\sigma \sqrt{1 - x^2/(2\sigma^2)}} \tag{2.21}$$

Note that the cosine amplitude distribution is independent of the period, $\tau$, and
depends only on the standard deviation, $\sigma$.

Table 2.1: Probability density functions of four input signal distributions as a function of $\sigma$.
Distribution    f(x)                                                                  Range
Uniform         $\frac{1}{2\sqrt{3}\,\sigma}$                                         $-\sqrt{3}\,\sigma \le x \le \sqrt{3}\,\sigma$
Cosine Value    $\frac{1}{\sqrt{2}\,\pi\sigma\sqrt{1 - x^2/2\sigma^2}}$               $-\sqrt{2}\,\sigma \le x \le \sqrt{2}\,\sigma$
Normal          $\frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$                    $-\infty < x < \infty$
Gamma           $\sqrt{\frac{\sqrt{3}}{8\pi\sigma|x|}}\; e^{-\sqrt{3}\,|x|/2\sigma}$  $-\infty < x < \infty$
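The cosine amplitude density (2.21) and its independence from the period can be checked with a short numerical sketch. Integrating (2.21) gives the cumulative distribution $F(a) = 1/2 + \arcsin(a/(\sqrt{2}\sigma))/\pi$, which is compared here with the fraction of one period that the waveform spends at or below $a$. The function names and the particular values of `sigma` and `tau` are illustrative assumptions.

```python
import math

def cosine_pdf(x, sigma):
    """Amplitude density (2.21) of a cosine with RMS value sigma, for |x| < sqrt(2)*sigma."""
    root = math.sqrt(1 - x * x / (2 * sigma * sigma))
    return 1.0 / (math.sqrt(2) * math.pi * sigma * root)

def cosine_cdf(a, sigma):
    """Integral of cosine_pdf from -sqrt(2)*sigma up to a."""
    return 0.5 + math.asin(a / (math.sqrt(2) * sigma)) / math.pi

def empirical_cdf(a, sigma, tau, n=100_000):
    """Fraction of one period during which sqrt(2)*sigma*cos(2*pi*t/tau) <= a."""
    count = 0
    for i in range(n):
        t = (i + 0.5) * tau / n  # uniformly spaced sample times over one period
        if math.sqrt(2) * sigma * math.cos(2 * math.pi * t / tau) <= a:
            count += 1
    return count / n

sigma = 0.3
for tau in (1e-3, 2.5):          # two very different periods
    for a in (-0.3, 0.0, 0.2):   # amplitudes inside the allowed range
        print(tau, a, empirical_cdf(a, sigma, tau), cosine_cdf(a, sigma))
```

The empirical and analytic columns should agree closely for both periods, which illustrates that only $\sigma$ matters.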
All input distributions considered in this work have zero mean. They are also
symmetric about their mean value. We equalize the input signals in terms of their
variance. Probability densities are written as functions of $\sigma$, as in Table 2.1. The
shapes of these distributions are shown in Fig. 2.4. For simplicity we adopt the
gamma distribution with parameter $\alpha = 0.5$.
2.3 Noise
The noise level represents the smallest signal level that can be adequately processed
by a circuit. In this section, we consider the noise sources of a single integrator,
which is the basic element of a linear filter. The noise level of a pure integrator
cannot be determined directly. Therefore, we compute the noise level in a unity-gain
first-order lowpass filter using such integrators. More generally, the noise
level in a circuit depends not only on the noise sources of the basic element, but
also on the specific filter architecture.[2]

Figure 2.4: The probability density functions of (a) the uniform (solid) and cosine value (dotted) and (b) the normal (solid) and double-sided gamma (dotted) random variable. [Axes: input amplitude in V/sigma (horizontal), probability (vertical).]
2.3.1 Input-Referred Noise Power Spectrum
A transconductance-C integrator consists of a noiseless capacitance $C$ and a noisy
transconductor. The transconductor can be modeled accurately as a noiseless
transconductance $G$ in parallel with a noisy current source $I_{out,n}$ as shown in
Fig. 2.5(a).[3]
By convention, the noise in a transconductor is referred to the input voltage by
dividing the output current noise by the transconductance, as shown in Fig. 2.5(b),
where

$$V_{in,n} = \frac{I_{out,n}}{G} \tag{2.22}$$

Referring the output current noise to an equivalent input noise voltage is convenient
for voltage-mode circuits, i.e. circuits for which the input and output signals are
voltages, rather than currents. As we shall see, the output noise voltage can be
computed easily from the equivalent input noise voltage and the filter transfer
function.
Let $S_{I_{out,n}}(\omega)$ be the (one-sided) power spectral density of the output current
noise. Then, from (2.22) and (2.13) the input-referred power spectral density is
given by

$$S_{V_{in,n}}(\omega) = \frac{S_{I_{out,n}}(\omega)}{G^2} \tag{2.23}$$
Figure 2.5: Noise models for the transconductance-C integrator: (a) referred to the output current, (b) referred to the input voltage, where $V_{in,n} = I_{out,n}/G$.

[2] It is found that the noise level of a nonlinear transconductance function depends very weakly on the input signal level and distribution. We ignore these secondary effects in this work.
[3] Throughout this work, the subscript n refers to a noise signal.

2.3.2 Output-Referred Noise Level
From (2.13) the output-referred noise spectrum is the product of the input-referred
noise spectrum and the square magnitude of the filter transfer function between
the input of the transconductor and the output node. Let $H(\omega)$ be the transfer
function. Then

$$S_{V_{out,n}}(\omega) = S_{V_{in,n}}(\omega)\, |H(\omega)|^2 \tag{2.24}$$
In order to compute the output-referred noise level, or mean-square value, according
to (2.11), we integrate the output noise power spectral density over all
radian frequencies and normalize by $2\pi$, as in
$$ V_{out,n}^2 = \frac{1}{2\pi} \int_0^\infty S_{V_{out,n}}(\omega)\, d\omega \tag{2.25} $$
For the special case that $S_{V_{in,n}}(\omega)$ is constant (white) of the form
$$ S_{V_{in,n}}(\omega) = \frac{4kT}{G_o}\,\xi \tag{2.26} $$
the output-referred noise level can be written in the form
$$ V_{out,n}^2 = \frac{4kT}{G_o}\,\xi\,\mathrm{ENBW} \tag{2.27} $$
where $\xi$ is the noise factor (unity for a pure resistor), $G_o$ is the nominal
conductance or transconductance, and
$$ \mathrm{ENBW} \equiv \frac{1}{2\pi} \int_0^\infty |H(\omega)|^2\, d\omega \tag{2.28} $$
is the equivalent noise bandwidth of the circuit.
We want to compute the output-referred noise level for the transconductance-C
integrator; however, the noise level in a perfect integrator is theoretically infinite.
To show this, we need to compute ENBW for the integrator. The transfer function
between the input signal and the output of a pure integrator can be written as
$$ H_I(\omega) = \frac{G}{j\omega C} \tag{2.29} $$
From (2.28) the equivalent noise bandwidth is given by
$$ \mathrm{ENBW} = \frac{1}{2\pi} \int_0^\infty \frac{G^2}{\omega^2 C^2}\, d\omega \tag{2.30} $$
This integral is unbounded, because the integrand grows as $1/\omega^2$ as $\omega$ approaches zero.
Therefore, we need to assume a particular filter structure in order to compute
the output-referred noise level. We choose the simplest filter for this purpose, the
unity-gain first-order lowpass, whose transfer function is of the form
$$ H_{LP}(\omega) = \frac{1}{1 + j\omega/\omega_c} \tag{2.31} $$
where $\omega_c = G_o/C$.
For the case that $H(\omega) = H_{LP}(\omega)$, we have
$$ \mathrm{ENBW} = \frac{1}{2\pi} \int_0^\infty \frac{1}{1 + \omega^2/\omega_c^2}\, d\omega = \frac{\omega_c}{4} = \frac{G_o}{4C} \tag{2.32} $$
Then the mean-square output voltage noise is
$$ V_{out,n}^2 = \frac{kT}{C}\,\xi \tag{2.33} $$
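As a numerical cross-check of (2.32) and (2.33), the following Python sketch evaluates the equivalent noise bandwidth integral of (2.28) for the first-order lowpass by quadrature. The component values (G_O, C, T, XI) are illustrative assumptions, not values taken from this work, and the function name enbw is hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant [J/K]

def enbw(h_mag_sq, omega_scale, n_steps=100_000):
    """Equivalent noise bandwidth (2.28): (1/2pi) * integral_0^inf |H(w)|^2 dw.

    The substitution w = omega_scale * tan(theta) maps [0, inf) onto
    [0, pi/2), which is then integrated with the midpoint rule.
    """
    d_theta = (math.pi / 2) / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        w = omega_scale * math.tan(theta)
        dw_dtheta = omega_scale / math.cos(theta) ** 2
        total += h_mag_sq(w) * dw_dtheta * d_theta
    return total / (2 * math.pi)

# Illustrative (assumed) component values
G_O = 1.0e-9   # nominal transconductance [S]
C = 5.0e-12    # integrating capacitance [F]
T = 300.0      # temperature [K]
XI = 1.0       # noise factor of a pure resistor

omega_c = G_O / C

def lowpass_mag_sq(w):
    """|H_LP(w)|^2 for the unity-gain first-order lowpass (2.31)."""
    return 1.0 / (1.0 + (w / omega_c) ** 2)

enbw_numeric = enbw(lowpass_mag_sq, omega_c)
enbw_closed_form = G_O / (4 * C)                               # (2.32)
noise_power = 4 * K_BOLTZMANN * T / G_O * XI * enbw_numeric    # (2.27)
kt_over_c = K_BOLTZMANN * T * XI / C                           # (2.33)

print(enbw_numeric, enbw_closed_form)
print(noise_power, kt_over_c)
```

The two printed pairs should agree closely. Applying the same routine to the pure integrator of (2.29) is not meaningful, since that integral diverges.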
If a filter has more than one transconductor, we assume that the noise produced
by each transconductor is statistically independent. In this case, by the principle
of superposition, we can sum the contributing effects of each noise source at the
output node by considering one noise source at a time. Suppose a filter has N
transconductors. Compute the transfer function from the input of each transconductor
to the output. Then multiply the input-referred power spectral density of
each transconductor by the square magnitude of its transfer function. At the output
node, sum the filtered power spectra from each noise source. Then integrate
to find the mean-square output noise voltage. A small example of a first-order
lowpass filter with two transconductors appears at the end of this chapter.
2.4 Distortion
Gray et al. (Gray et al., 1980) define a distortion measure as the assignment of a
nonnegative number to a pair of ideal and real quantities, where the real quantity
is intended to be a reasonable approximation of the ideal quantity. The distortion
measure must be zero whenever the two quantities match exactly. To be useful, a
distortion measure must satisfy the following three properties: 1) it is subjectively
meaningful, in the sense that large distortion corresponds to poor quality reproductions
and small distortion corresponds to good quality reproductions, 2) it is
mathematically tractable so that it leads to good design techniques, and 3) it can
be computed efficiently.
They state that the most common distortion measure is the squared error,
largely because it is tractable and computable. It is also the most common way of
dealing with distortion in the context of random signals. However, for low-bit-rate
speech systems, they assert that such a distortion measure does not appear to be
always subjectively meaningful. As an example, a "shh" sound is essentially a
white process and any typical waveform will sound the same. In order to satisfy
the property of being subjectively meaningful, several distortion measures have
been introduced which are based on the difference between the log spectra (Gray,
Jr. and Markel, 1976) of the ideal and real signals.
Although subjectively meaningful distortion measures are desirable when evaluating
the performance of audio processing systems, we do not yet have the mathematical
tools needed to evaluate and design continuous-time filters based on
magnitude spectra of the ideal versus real filter response. Perhaps a suitable measure
of distortion would be to use the absolute difference of power spectra, as
in
$$ V_{out,d}^2 = \frac{1}{2\pi} \int_0^\infty \left| S_{out,r}(\omega) - S_{out,i}(\omega) \right| d\omega \tag{2.34} $$
where $S_{out,r}$ is the actual output power spectrum including the effects of nonlinear
filtering, while $S_{out,i}$ is the output power spectrum that would have resulted from
an ideally linear system. This distortion measure is possibly more subjectively
meaningful than the classical mean-square measure. In addition, it is a power
measure, and therefore fits neatly into the framework for computing dynamic range,
which calls for the distortion power.
Nevertheless, we do not attempt in this work to compute the actual output
power spectrum, with all essential nonlinearities present in the filter. Therefore, in
this research, two mean-square error measures are applied to the transconductor
alone, but not to the overall filter response.
2.4.1 Input-Referred Distortion
Let the input voltage $V_{in}$ follow a known amplitude distribution with probability
density function $f(V_{in})$, where $V_{in} = 0$ is assumed to be the operating point.
This point may be fixed, or dynamically obtained through adaptation, where it
is assumed to introduce no distortion. The input voltage passes through a nonlinear
voltage-to-current function $I_{out}(V_{in})$. One can view the nonlinear function
as resulting from a nonlinear voltage transformation $h(V_{in})$ multiplied by a linear
transconductance $G_o$, as in
$$ I_{out}(V_{in}) = G_o\, h(V_{in}) \tag{2.35} $$
where
$$ G_o \equiv \left. \frac{\partial I_{out}(V_{in})}{\partial V_{in}} \right|_{V_{in}=0} \tag{2.36} $$
that is, $G_o$ is equal to the slope of $I_{out}(V_{in})$ at the operating point.
In Fig. 2.6 a sample transconductance function is plotted, along with a line
whose slope is the nominal transconductance, $G_o$.
First Distortion Measure
Referred to the input voltage, the first distortion measure is the mean-square error
of the actual input voltage minus the transformed input voltage $h(V_{in})$, as in
$$ V_{in,d1}^2 \equiv E\left[ \left( V_{in} - h(V_{in}) \right)^2 \right] \tag{2.37} $$

Figure 2.6: Sample transconductor output current as a function of input voltage.
The solid line is the actual nonlinear function; the dashed line is a linear
approximation for which the slope is the nominal transconductance, $G_o$.
[Plot: Output [nA] versus Input Amplitude [mV].]

In Fig. 2.7(a) the transformed input voltage $h(V_{in})$ is plotted as a function of
the input voltage. The first distortion measure computes the mean-square error
between this curve and $V_{in}$, which is the straight dashed line with a slope of unity.
Using (2.35) to eliminate $h(V_{in})$, the first distortion measure can be written in
the form
$$ V_{in,d1}^2 \equiv E\left[ \left( V_{in} - \frac{I_{out}(V_{in})}{G_o} \right)^2 \right]
 = E[V_{in}^2] + \frac{E[I_{out}^2(V_{in})]}{G_o^2} - \frac{2\,E[V_{in} I_{out}(V_{in})]}{G_o} \tag{2.38} $$
where the expectation operation may be with respect to a deterministic time-domain
signal, e.g., a pure tone, or with respect to a stationary input amplitude
distribution. For the case of the latter, we have
$$ V_{in,d1}^2 \equiv \int_{-\infty}^{\infty} f(V_{in}) \left( V_{in} - \frac{I_{out}(V_{in})}{G_o} \right)^2 dV_{in} \tag{2.39} $$
Earlier in this chapter, we computed the amplitude distribution of a pure tone, so
that even a pure tone may be cast in a random variable framework. As a result,
for each type of input signal considered, we may apply (2.39) for the computation
of the first distortion level.
Second Distortion Measure
The second distortion measure involves the minimization of the mean-square error
with respect to a gain $\zeta$ times the nonlinear transformation, $h(V_{in})$. It is defined as
$$ V_{in,d2}^2 \equiv \min_{\zeta} E\left[ \left( V_{in} - \zeta\, h(V_{in}) \right)^2 \right] \tag{2.40} $$
where $\zeta$ is a real number. In Fig. 2.7(b) the transformed input voltage $h(V_{in})$ times
a gain $\zeta = 0.84$ is plotted as a function of the input voltage. Assuming that this
value of $\zeta$ results in the lowest mean-square error, the second distortion measure
computes the mean-square error between this curve and $V_{in}$, which is the straight
dashed line with a slope of one.

Figure 2.7: Nonlinearly transformed input voltage, $h(V_{in})$, as a function of the
input voltage for a sample transconductor. (a) The first distortion measure is
computed as the mean-square error between $h(V_{in})$, the solid line, and $V_{in}$, the
dashed line. (b) The second distortion measure is computed as the minimum
mean-square difference between $h(V_{in})$ times a gain factor $\zeta$, the solid line, and
$V_{in}$, the dashed line. In this plot, $\zeta = 0.84$.
[Plots: (a) $h(V_{in})$ [mV] and (b) $\zeta\, h(V_{in})$ [mV], each versus Input Amplitude [mV].]
Using (2.35) to eliminate $h(V_{in})$, the second distortion measure can be written
in the form
$$ V_{in,d2}^2 = \min_{\zeta} E\left[ \left( V_{in} - \frac{\zeta\, I_{out}(V_{in})}{G_o} \right)^2 \right]
 = \min_{\zeta} \left( E[V_{in}^2] + \frac{\zeta^2\, E[I_{out}^2(V_{in})]}{G_o^2} - \frac{2\,\zeta\, E[V_{in} I_{out}(V_{in})]}{G_o} \right) \tag{2.41} $$
Taking the expectation with respect to a known amplitude distribution, $f(V_{in})$, we
have
$$ V_{in,d2}^2 \equiv \min_{\zeta} \int_{-\infty}^{\infty} f(V_{in}) \left( V_{in} - \frac{\zeta\, I_{out}(V_{in})}{G_o} \right)^2 dV_{in} \tag{2.42} $$
The optimal gain factor $\zeta^*$ is that gain factor which results in the minimum
mean-square error. Its value depends on both the transconductor and the input
distribution. It can be solved for, as in
$$ \zeta^* = \frac{E[G_o V_{in} I_{out}(V_{in})]}{E[I_{out}^2(V_{in})]} \tag{2.43} $$
Substituting (2.43) into (2.42), after some simplification, we get
$$ V_{in,d2}^2 = E[V_{in}^2] - \frac{E^2[V_{in} I_{out}(V_{in})]}{E[I_{out}^2(V_{in})]} \tag{2.44} $$
The second distortion measure can be written as a function of the first, as in
$$ V_{in,d2}^2 = V_{in,d1}^2 - (1 - \zeta^*)^2\, \frac{E[I_{out}^2(V_{in})]}{G_o^2} \tag{2.45} $$
From (2.45) we see that $V_{in,d2}^2 \leq V_{in,d1}^2$.
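The relations (2.38) and (2.43) through (2.45) can be checked numerically. The sketch below is only an illustration: it assumes a uniform input distribution and a hypothetical expanding sinh-shaped transconductor with positive nominal transconductance, and the values G_O, V_SCALE and HALF_WIDTH are made up for the example.

```python
import math

def expect_uniform(g, half_width, n=20_000):
    """E[g(V)] for V uniform on [-half_width, half_width] (midpoint rule)."""
    dv = 2 * half_width / n
    return sum(g(-half_width + (i + 0.5) * dv) for i in range(n)) / n

# Hypothetical transconductor and input range (assumed values)
G_O = 1.0e-9        # nominal transconductance [S]
V_SCALE = 36.7e-3   # voltage scale of the nonlinearity [V]
HALF_WIDTH = 50e-3  # input restricted to +/-50 mV

def i_out(v):
    """Nonlinear voltage-to-current function with slope G_O at v = 0."""
    return G_O * V_SCALE * math.sinh(v / V_SCALE)

def expect(g):
    return expect_uniform(g, HALF_WIDTH)

e_v2 = expect(lambda v: v * v)
e_i2 = expect(lambda v: i_out(v) ** 2)
e_vi = expect(lambda v: v * i_out(v))

d1 = expect(lambda v: (v - i_out(v) / G_O) ** 2)                    # (2.38)
zeta_opt = G_O * e_vi / e_i2                                        # (2.43)
d2_direct = expect(lambda v: (v - zeta_opt * i_out(v) / G_O) ** 2)  # (2.41) at the optimum
d2_closed = e_v2 - e_vi ** 2 / e_i2                                 # (2.44)
d2_from_d1 = d1 - (1 - zeta_opt) ** 2 * e_i2 / G_O ** 2             # (2.45)

print(f"zeta* = {zeta_opt:.4f}")
print(f"d1 = {d1:.3e} V^2, d2 = {d2_closed:.3e} V^2")
```

For an expanding nonlinearity such as this one the optimal gain comes out below unity, and the three ways of computing the second distortion level agree to within rounding.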
The second distortion measure is the error which results from a least-square
error estimate of the input voltage given the output current. This error is orthogonal
to the input signal (Papoulis, 1965). Thus, if the input signal is a pure tone,
the error represents harmonic distortion. Similarly, if the input signal is the sum
of two tones, the error represents a combination of harmonic and intermodulation
distortions. By contrast, for a sinusoidal input signal, the first distortion measure
is the sum of harmonic, intermodulation, and gain distortions.
Absolute Maximum Deviation
An alternate measure of distortion in transconductors is the absolute maximum
deviation. It is used in (Tanimoto et al., 1991). This distortion measure is more
restrictive than the two mean-square error measures (Gray et al., 1980). Let $G(V_{in})$
be the actual, nonlinear transconductance as a function of $V_{in}$, where
$$ G(V_{in}) \equiv \frac{\partial I_{out}(V_{in})}{\partial V_{in}} \tag{2.46} $$
and let $G_o$ be the nominal value.[4] Define the maximum normalized transconductance
distortion as
$$ D_G \equiv \max \left| \frac{G(V_{in})}{G_o} - 1 \right| \quad \forall\, V_{in} \tag{2.47} $$
In Fig. 2.8 the normalized transconductance is plotted as a function of the input
voltage for the same sample transconductor. Ideally, this function would be flat,
equal to unity. The maximum normalized transconductance distortion is the maximum
distance between the two curves for a given input amplitude range. For
example, if $V_{in}$ is restricted to only $\pm 5$ mV, the distortion is approximately 0.01 or
1%. On the other hand, if the range of $V_{in}$ is extended to $\pm 16$ mV, the distortion
is approximately 0.10 or 10%.
While this distortion measure is not a power measure, it is possible to compute
a power measure using it, by squaring $D_G$ and multiplying by the input signal
power, $V_{in,s}^2$. More formally, one can define an input-referred distortion measure,
$V_{in,dg}^2$, as
$$ V_{in,dg}^2 = V_{in,s}^2\, (D_G)^2 \tag{2.48} $$

[4] For the case that the transconductance function has more than one inflexion point, such
as an equiripple design, substitute the global maximum or global minimum for the nominal
transconductance.
Figure 2.8: The normalized transconductance plotted as a function of the input
voltage for a sample transconductor, solid line. The dashed line is the ideal normalized
transconductance, equal to unity. The maximum normalized transconductance
distortion is the maximum distance between the two curves.
[Plot: Normalized Transconductance versus Input Amplitude [mV].]

The chief disadvantage of such a distortion measure is that it applies only to
bounded input signal distributions, such as the cosine and uniform distributions.
Unbounded distributions would have to be clipped, i.e., distorted, prior to applying
such a distortion measure. In Chapter 3 we apply the maximum normalized
transconductance distortion measure to several transconductor designs.
2.4.2 Output-Referred Distortion
The input-referred distortion level can be referred to the output current of a
transconductor by multiplying by the square of the nominal transconductance.
However, it is not clear how the input-referred distortion level of a transconductor
relates to the distortion level of a transconductance-C filter, where the output
current is now integrated on a capacitor to become a voltage.[5]
In particular, let us consider the nontrivial example of a lowpass filter with
transfer function $H_{LP}(\omega) = 1/(1 + j\omega/\omega_c)$, and $\omega_c = G/C$. Now, if the input
frequency $\omega$ is much less than $\omega_c$, the output voltage will follow the input voltage,
no matter how much distortion is present in the transconductor $G$. However, for
$\omega \gg \omega_c$, the error in the output amplitude is on the order of $\omega_c/\omega$.
For the present work, we assume that the input-referred distortion is equal to
the output-referred distortion, regardless of the number of transconductors present
in the filter, except for the case of non-unity gain. In this case, we multiply by the
square of the gain.
2.5 Dynamic Range
Various performance ratios exist for quantifying a circuit's behavior. The best
known is the signal-to-noise ratio (SNR). Another common figure-of-merit is the
signal-to-distortion-plus-noise ratio (SDNR). In order to compute the distortion-free
and distortion-limited dynamic range, we also introduce the distortion-to-noise
(DNR) and signal-to-distortion (SDR) ratios. These four performance ratios are
summarized in Table 2.2.

Table 2.2: Performance ratios.
DNR              SDR              SNR              SDNR
$V_d^2/V_n^2$    $V_s^2/V_d^2$    $V_s^2/V_n^2$    $V_s^2/(V_n^2 + V_d^2)$

[5] One possible exception is the case of a constant-gain amplifier with only one transconductance
element. In this case, the output-referred distortion level is equal to the input-referred distortion
level times the square of the gain.
For convenience, we revisit the definitions of dynamic range as found in (2.2):
$$ \mathrm{DFDR} \equiv \left. \frac{V_s^2}{V_n^2} \right|_{V_d^2/V_n^2 \,=\, 1} $$
$$ \mathrm{DLDR} \equiv \left. \frac{V_s^2}{V_n^2} \right|_{V_d^2/V_s^2 \,=\, c^2} $$
Dynamic range can be viewed as the signal-to-noise ratio evaluated at the maximal
input level. For the distortion-free dynamic range, the maximum signal level is
the level such that the distortion-to-noise ratio is equal to one. By contrast, the
maximal signal level for the distortion-limited dynamic range is the level such that
the signal-to-distortion ratio is equal to a constant $c^2$. Because SDR is a ratio of
powers, $c$ will be a ratio of amplitudes, which permits comparison with the well
known measure of total harmonic distortion.
The signal-to-noise-plus-distortion ratio of a signal is the signal level divided
by the sum of the noise and distortion levels. The characteristic shape of this ratio is
that it rises with a slope of 20 dB/dec for small input signals, reaches a maximum
value when the distortion level becomes roughly equal to the noise level, and then
rapidly declines for large signal levels.
We could have written the performance ratios DNR, SDR, and SNR as functions
of the RMS value, $\sigma$. If we do so, an expression for the distortion-free dynamic
range is
$$ \mathrm{DFDR} = \left. \mathrm{SNR}(\sigma) \right|_{\mathrm{DNR}(\sigma) = 1} \tag{2.49} $$
If the distortion-to-noise ratio has an inverse function (which it typically has), we
can write
$$ \mathrm{DFDR} = \mathrm{SNR}(\mathrm{DNR}^{-1}(1)) \tag{2.50} $$
Additionally, an expression for the distortion-limited dynamic range can be
written as
$$ \mathrm{DLDR} = \left. \mathrm{SNR}(\sigma) \right|_{\mathrm{SDR}(\sigma) = c^2} \tag{2.51} $$
If the distortion-to-signal ratio has an inverse function (which it typically has), we
can write
$$ \mathrm{DLDR} = \mathrm{SNR}(\mathrm{SDR}^{-1}(c^2)) \tag{2.52} $$
2.6 Example: Self-biased Transconductance-C
Integrator
We have established the theoretical framework for computing the dynamic
range of a linear continuous-time transconductance-C integrator. In this section,
we compute the dynamic range for a particular design example: the self-biased
transconductance-C integrator.
In order to achieve a low-noise design, a class-AB transconductor without current
mirrors is sought. As a starting point, the two-transistor circuit of Fig. 2.9
is selected. This transconductor is tunable over a wide frequency range via the
supply voltages or substrate terminals. Because it operates in a push-pull fashion,
it has the most favorable noise properties among all CMOS transconductor configurations
at a given bias current (Groenewold, 1991; Vittoz, 1994). Moreover, all
even order harmonics are canceled by this push-pull effect. This circuit has been
used effectively by Nauta (Nauta, 1992) in very high frequency filtering applications
because it has no internal nodes. In the following subsections, the dynamic
range is derived for the self-biased transconductance-C integrator configured as a
lowpass filter.
2.6.1 Output Current and Transconductance
We want to find an expression for the output current $I_{out}$ in Fig. 2.9. From (A.12)
and noting that the source and substrate are at the same terminal for each transistor,
we can write
$$ I_n = I_0 S\, e^{\kappa (V_{in} + V_{dd}/2)/V_t} \tag{2.53} $$
$$ I_p = I_0 S\, e^{\kappa (V_{dd}/2 - V_{in})/V_t} \tag{2.54} $$
Let $I_b$ denote the current through each of the two transistors when $V_{in}$ is zero, i.e.,
$$ I_b = I_0 S\, e^{\kappa V_{dd}/(2V_t)} \tag{2.55} $$
Note that it is imperative that $V_{dd}/2$ be less than the threshold voltage of the
transistor, in order to ensure subthreshold operation of the device. Then the output
current, $I_{out}$, can be written as
$$ I_{out} = -2 I_b \sinh(\kappa V_{in}/V_t) \tag{2.56} $$
The slope of the output current function $I_{out}(V_{in})$ is given by
$$ G(V_{in}) \equiv \frac{\partial I_{out}}{\partial V_{in}} = \frac{-2 I_b \kappa}{V_t} \cosh(\kappa V_{in}/V_t) \tag{2.57} $$
while its nominal value is
$$ G_o = \frac{-2 I_b \kappa}{V_t} \tag{2.58} $$
In Fig. 2.10 the output current and transconductance are plotted for the self-biased
transconductor.
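The step from (2.53) through (2.55) to (2.56) can be made explicit. The short derivation below assumes that the output current is defined as $I_{out} = I_p - I_n$; this sign convention is inferred from the inverting symbol in Fig. 2.9(b) rather than stated in the text.

```latex
\begin{align*}
I_{out} &= I_p - I_n \\
        &= I_0 S\, e^{\kappa V_{dd}/(2V_t)}
           \left( e^{-\kappa V_{in}/V_t} - e^{\kappa V_{in}/V_t} \right) \\
        &= -2 I_b \sinh(\kappa V_{in}/V_t), \\
G(V_{in}) &= \frac{\partial I_{out}}{\partial V_{in}}
           = -\frac{2 I_b \kappa}{V_t} \cosh(\kappa V_{in}/V_t),
\qquad
G_o = G(0) = -\frac{2 I_b \kappa}{V_t}.
\end{align*}
```

Because $\cosh(x) \geq 1$, the magnitude $|G(V_{in})|$ is never smaller than $|G_o|$, which is the convexity visible in Fig. 2.10(b).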
Figure 2.9: Self-biased transconductance-C integrator (a) circuit and (b) symbol.

Figure 2.10: For the self-biased transconductor (a) output current in units of $I_b$ as a
function of the input voltage in units of $V_t/\kappa$, and (b) normalized transconductance
$G/G_o$ as a function of input voltage. Note that the transconductance function is
convex.
[Plots: (a) Output [$I_b$] and (b) Normalized G, each versus Input Amplitude [$V_t/\kappa$].]
2.6.2 Noise
For the case of the self-biased transconductor, we compute the input-referred
noise power spectral density using the model of Fig. 2.11. Fig. 2.11(a) models
the noise properties of a single integrator. Assuming the noise produced by each
of the two transistors is independent, the output current noise power spectrum
(one-sided) is given by
$$ S_{I_{out,n}}(\omega) = 2q(I_b + I_b) = 4qI_b \tag{2.59} $$
Note that the output current noise power spectrum is technically a function of the
input voltage, since the total current passing through the two transistors is not
constant ($= 2I_b$), but grows in the shape of a cosh function. This dependency is
a second-order effect, and perhaps even conflicts with the proposed framework for
computing dynamic range.
We relate the output current noise power spectrum to the input-referred noise
power spectrum as in
$$ S_{V_{in,n}}(\omega) = \frac{4qI_b}{G^2} = \frac{2qI_b}{|G|}\, \frac{V_t}{\kappa I_b} = \frac{4kT}{|G|}\, \xi \tag{2.60} $$
where $\xi = 1/(2\kappa)$ for this transconductor. To obtain the last equality, recall that
$V_t \equiv kT/q$.
Fig. 2.12(a) shows the circuit of a lowpass filter using the self-biased
transconductance-C integrator. We assume a linear small-signal model, where $G_1 = G_2 =
G_o$ is the nominal transconductance. Its transfer function is $H_{LP}(\omega) = 1/(1 +
j\omega/\omega_c)$, where $\omega_c = G_o/C$. As such, the transfer function from the input of $G_1$ is
$H_{LP}(\omega)$. The transfer function from the input of $G_2$ is also $H_{LP}(\omega)$. Therefore,
the noise power spectrum at the output node is
$$ S_{V_{out,n}}(\omega) = \frac{4kT}{|G_1|}\, \xi\, |H_{LP}(\omega)|^2 + \frac{4kT}{|G_2|}\, \xi\, |H_{LP}(\omega)|^2 \tag{2.61} $$

Figure 2.11: Noise model of self-biased transconductor (a) referenced to the output
current and (b) referred to the input voltage.

Figure 2.12: Self-biased transconductance-C inverting lowpass filter (a) circuit and
(b) noise model.

Integrating over all radian frequencies, the output-referred noise level is
$$ V_{out,n}^2 = \frac{2kT}{C}\, \xi \tag{2.62} $$
where, as before, $\xi = 1/(2\kappa)$.
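The chain from (2.59) to (2.62) can be traced numerically. The sketch below uses the nominal values quoted in this chapter (Vt = 25.7 mV, kappa = 0.7, C = 5 pF); the bias currents are assumed purely for illustration, and the function name is hypothetical.

```python
import math

Q_ELECTRON = 1.602176634e-19  # elementary charge [C]

def lowpass_output_noise(i_b, cap, kappa, v_t, n_transconductors=2):
    """Mean-square output noise [V^2] of the self-biased lowpass filter.

    Each transconductor contributes shot noise 4*q*Ib (2.59), referred to
    its input by 1/G^2 (2.60) and shaped by the first-order lowpass, whose
    equivalent noise bandwidth is |Go|/(4C) (2.32).
    """
    g_o = 2.0 * i_b * kappa / v_t               # |Go| from (2.58)
    s_vin = 4.0 * Q_ELECTRON * i_b / g_o ** 2   # input-referred PSD (2.60)
    enbw = g_o / (4.0 * cap)                    # equivalent noise bandwidth (2.32)
    return n_transconductors * s_vin * enbw

V_T = 25.7e-3   # thermal voltage [V]
KAPPA = 0.7     # subthreshold slope coefficient
CAP = 5.0e-12   # integrating capacitance [F]

for i_b in (1e-12, 1e-10, 1e-8):                # bias currents [A], assumed
    v2 = lowpass_output_noise(i_b, CAP, KAPPA, V_T)
    print(f"Ib = {i_b:.0e} A: Vout,n^2 = {v2:.3e} V^2, "
          f"rms = {math.sqrt(v2) * 1e6:.1f} uV")
```

The result does not change with the bias current and equals $kT/(\kappa C)$, roughly 34 uV rms for these values.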
This equation is perhaps familiar. The well known equipartition theorem tells
us that in a network of resistors with a single node capacitance, the noise level is
equal to $kT/C$, independent of the conductance values (Sarpeshkar et al., 1993). In
our network of diode-connected MOS transistors, the noise level is independent of
the bias current, $I_b$, and, consequently, the output impedance. However, the noise
level is proportional to $2\xi = 1/\kappa$, i.e., inversely proportional to $\kappa$. Thus, for
$\kappa < 1$ the noise level is higher as compared to a network of linear resistors. Devices
in which $\kappa$ has the value of unity, such as bipolar and junction-FET transistors
operating in the subthreshold region, yield the lowest noise possible in a solid-state device.
And yet, if there existed devices which had $\kappa > 1$, (2.62) predicts that the noise
level would be lower than that of a resistive network.
2.6.3 Distortion
Let the voltage-to-current function $I_{out}(V_{in}) = -2I_b \sinh(\kappa V_{in}/V_t)$ be the output
current of the self-biased transconductor, and $G_o = -2I_b\kappa/V_t$ the nominal
transconductance. From (2.39), the first distortion measure can be written as
$$ V_{in,d1}^2 \equiv \int_{-\infty}^{\infty} f(V_{in}) \left( V_{in} - \frac{V_t}{\kappa} \sinh(\kappa V_{in}/V_t) \right)^2 dV_{in} \tag{2.63} $$
where $f(V_{in})$ is the probability density function of the input voltage, $V_{in}$. Using
numerical techniques, this integral is computed for each of three input distributions
for a range of values of $\sigma$, the input RMS value.
For example, for the normal distribution, where
$f(V_{in}) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-V_{in}^2/2\sigma^2}$,
the first distortion measure is computed as
$$ V_{in,d1}^2 \equiv \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-V_{in}^2/2\sigma^2} \left( V_{in} - \frac{V_t}{\kappa} \sinh(\kappa V_{in}/V_t) \right)^2 dV_{in} \tag{2.64} $$
A plot of the first distortion level as a function of the input voltage RMS value
$\sigma$ is shown in Fig. 2.13 for the cosine amplitude, uniform, and normal distributions.
The cosine amplitude distribution consistently gives the lowest distortion, because
it has the tightest bound, $\pm\sqrt{2}\sigma$, while the normal distribution consistently results
in the highest distortion, because it is unbounded.
Using (2.39), the first distortion measure can also be rendered in the following
form
$$ V_{in,d1}^2 = \frac{V_t^2}{\kappa^2} \Big( E[(\kappa V_{in}/V_t)^2] + E[\sinh^2(\kappa V_{in}/V_t)]
 - 2\, E[(\kappa V_{in}/V_t) \sinh(\kappa V_{in}/V_t)] \Big) \tag{2.65} $$
Using the hyperbolic identity $\sinh^2(x) = -1/2 + (1/2)\cosh(2x)$, and expanding
$\sinh(x)$ and $\cosh(x)$ into a power series, we obtain the equation
$$ V_{in,d1}^2 = \frac{V_t^2}{\kappa^2} \left( \frac{E[(\kappa V_{in}/V_t)^6]\,\left(\frac{2^5}{6} - 2\right)}{5!}
 + \frac{E[(\kappa V_{in}/V_t)^8]\,\left(\frac{2^7}{8} - 2\right)}{7!} + \ldots \right) \tag{2.66} $$
From (2.66), we note that, independent of the input distribution, the first distortion
level depends on only even order moments beginning with the sixth. For
the uniform, cosine value, and normal distributions, the sixth moment is proportional
to $\sigma^6$. Thus, we anticipate a slope of 60 dB/dec for small values of $\sigma$. As $\sigma$
increases, this slope will increase also.
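The leading term of (2.66) and the anticipated 60 dB/dec slope can be checked against direct numerical integration. The sketch below does this for the uniform distribution only, working in the normalized variable x = kappa*Vin/Vt so that the result is in units of (Vt/kappa)^2; the function names are hypothetical.

```python
import math

def first_distortion_uniform(sigma, n=20_000):
    """E[(x - sinh x)^2] for x uniform with RMS value sigma (midpoint rule).

    x = kappa*Vin/Vt, so the result is in units of (Vt/kappa)^2.
    """
    a = math.sqrt(3.0) * sigma          # half-width of a uniform density
    dx = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * dx
        total += (x - math.sinh(x)) ** 2
    return total / n

def first_term_uniform(sigma):
    """Leading term of (2.66), with E[x^6] = 27*sigma^6/7 for a uniform density."""
    sixth_moment = (27.0 / 7.0) * sigma ** 6
    return sixth_moment * (2 ** 5 / 6 - 2) / math.factorial(5)

for sigma in (0.01, 0.1, 1.0):
    print(sigma, first_distortion_uniform(sigma), first_term_uniform(sigma))

# Slope over one decade of sigma, in dB/dec (power quantity, hence 10*log10)
slope_db_per_dec = 10 * math.log10(
    first_distortion_uniform(0.1) / first_distortion_uniform(0.01))
print(slope_db_per_dec)
```

For small sigma the two columns agree and the slope is close to 60 dB/dec; at sigma near unity the higher-order terms of the series become visible.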
We want to compute the input-referred distortion using the second measure
for the same input signal distributions and transconductance function. The optimal
gain factor $\zeta^*$ is that gain factor which minimizes the mean-square error
between the input voltage and the nonlinearly transformed input voltage. It can
be computed from (2.43) as
$$ \zeta^* = \frac{E[(\kappa V_{in}/V_t) \sinh(\kappa V_{in}/V_t)]}{E[\sinh^2(\kappa V_{in}/V_t)]} \tag{2.67} $$

Figure 2.13: The first distortion level as a function of $\sigma$ in units of $V_t/\kappa$, where
$\kappa = 0.7$ and $V_t = 25.7$ mV. Curves are drawn for three input distributions: uniform
(solid), cosine value (dashed), and normal (dotted).
[Plot: First Distortion Level [dB] versus Input Sigma [$V_t/\kappa$].]

Using numerical techniques, the optimal gain factor is computed for each of three
input distributions. For example, for the cosine amplitude distribution, the optimal
gain factor can be found as the ratio of two integrals, as in
$$ \zeta^* = \frac{\displaystyle \int_{-\sqrt{2}\sigma}^{\sqrt{2}\sigma} \frac{1}{\sqrt{2}\,\pi\sigma \sqrt{1 - V_{in}^2/2\sigma^2}}\; \frac{\kappa V_{in}}{V_t} \sinh\!\left(\frac{\kappa V_{in}}{V_t}\right) dV_{in}}
{\displaystyle \int_{-\sqrt{2}\sigma}^{\sqrt{2}\sigma} \frac{1}{\sqrt{2}\,\pi\sigma \sqrt{1 - V_{in}^2/2\sigma^2}}\; \sinh^2\!\left(\frac{\kappa V_{in}}{V_t}\right) dV_{in}} \tag{2.68} $$
The results of these computations are shown in Fig. 2.14.
Returning to (2.67), one can expand the functions $\sinh(x)$ and $\sinh^2(x)$ using
a Taylor series. If we denote
$$ D(V_{in}) = 1 + \frac{2^3\, E[(\kappa V_{in}/V_t)^4]}{4!\; E[(\kappa V_{in}/V_t)^2]} + \ldots \tag{2.69} $$
then a series approximation of the optimal gain factor is as follows
$$ \zeta^* = \frac{1}{D(V_{in})} \left( 1 + \frac{E[(\kappa V_{in}/V_t)^4]}{3!\; E[(\kappa V_{in}/V_t)^2]} + \ldots \right) \tag{2.70} $$
From (2.70) and (2.69), one can show that $\zeta^*$ approaches one as $\sigma$ gets very small.
This is precisely the behavior of the curves in Fig. 2.14.
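A minimal sketch of the optimal gain computation for the cosine amplitude distribution follows. Rather than integrating against the density directly, it assumes the equivalent view of a pure tone x = sqrt(2)*sigma*cos(theta) with theta uniform on [0, pi], which avoids the integrable singularities at the ends of the density. The input is normalized as x = kappa*Vin/Vt, and the function name is hypothetical.

```python
import math

def zeta_opt_cosine(sigma, n=20_000):
    """Optimal gain factor (2.67) for a pure tone of RMS value sigma.

    sigma is expressed in units of Vt/kappa. A tone
    x = sqrt(2)*sigma*cos(theta), theta uniform on [0, pi], has the cosine
    amplitude distribution, so expectations become averages over theta.
    """
    amp = math.sqrt(2.0) * sigma
    num = 0.0   # accumulates x * sinh(x)
    den = 0.0   # accumulates sinh(x)^2
    for i in range(n):
        theta = (i + 0.5) * math.pi / n
        x = amp * math.cos(theta)
        s = math.sinh(x)
        num += x * s
        den += s * s
    return num / den

for sigma in (0.01, 0.1, 1.0):
    print(sigma, zeta_opt_cosine(sigma))
```

The printed gain tends to one for small sigma and decreases as sigma grows, in line with the series approximation (2.70).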
The second distortion measure is computed using the optimal gain factor. It
can be computed from (2.42), as in
$$ V_{in,d2}^2 = \int_{-\infty}^{\infty} f(V_{in}) \left( V_{in} - \zeta^*\, \frac{V_t}{\kappa} \sinh(\kappa V_{in}/V_t) \right)^2 dV_{in} \tag{2.71} $$
where $f(V_{in})$ is the probability density function of the input voltage, $V_{in}$. In
Fig. 2.15 we plot the second distortion level as a function of $\sigma$ for three input
distributions.
The second distortion level is more difficult to estimate using series approximations
than the first distortion level. Starting from (2.44), and after much algebra,
we find
$$ V_{in,d2}^2 = \frac{1}{D(V_{in})}\, \frac{V_t^2}{\kappa^2} \left(
 \frac{E[(\kappa V_{in}/V_t)^6]\,\left(\frac{2^5}{6} - 2\right)}{5!}
 - \frac{E^2[(\kappa V_{in}/V_t)^4]}{E[(\kappa V_{in}/V_t)^2]}\, \frac{1}{3!\,3!}
 + \frac{E[(\kappa V_{in}/V_t)^8]\,\left(\frac{2^7}{8} - 2\right)}{7!}
 - \frac{E[(\kappa V_{in}/V_t)^4]\, E[(\kappa V_{in}/V_t)^6]}{E[(\kappa V_{in}/V_t)^2]}\, \frac{2}{3!\,5!}
 + \ldots \right) \tag{2.72} $$
If we compare the lowest order terms in the series approximations for the first and
second distortion levels, we find the second distortion level to be 4 to 10 dB less
than the first.

Figure 2.14: The optimal gain factor $\zeta^*$ as a function of $\sigma$ in units of $V_t/\kappa$, where
$\kappa = 0.7$ and $V_t = 25.7$ mV. Curves are drawn for three input distributions: uniform
(solid), cosine value (dashed), and normal (dotted).
[Plot: Optimal Gain, Zeta versus Input Sigma [$V_t/\kappa$].]

Figure 2.15: The second distortion level as a function of $\sigma$ in units of $V_t/\kappa$, where
$\kappa = 0.7$ and $V_t = 25.7$ mV. Curves are drawn for three input distributions: uniform
(solid), cosine value (dashed), and normal (dotted).
[Plot: Second Distortion Level [dB] versus Input Sigma [$V_t/\kappa$].]
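The 4 to 10 dB figure can be reproduced from the lowest-order terms of (2.66) and (2.72) alone. The sketch below assumes small sigma, so that D(Vin) is taken as one, and uses the standard even moments of the three distributions normalized to unit RMS value; the function name is hypothetical.

```python
import math

# Normalized even moments (E[x^2], E[x^4], E[x^6]) for unit RMS value
MOMENTS = {
    "uniform": (1.0, 9.0 / 5.0, 27.0 / 7.0),   # half-width sqrt(3)
    "cosine":  (1.0, 3.0 / 2.0, 5.0 / 2.0),    # tone of amplitude sqrt(2)
    "normal":  (1.0, 3.0, 15.0),
}

def second_to_first_db(m2, m4, m6):
    """Ratio of the leading term of (2.72) to that of (2.66), in dB.

    Assumes D(Vin) -> 1, which holds for small sigma.
    """
    first = m6 / 36.0                            # (2^5/6 - 2)/5! = 1/36
    second = m6 / 36.0 - m4 ** 2 / (36.0 * m2)   # 1/(3! 3!) = 1/36
    return 10.0 * math.log10(second / first)

ratios_db = {name: second_to_first_db(*m) for name, m in MOMENTS.items()}
for name, value in ratios_db.items():
    print(f"{name}: {value:.2f} dB")
```

The normal distribution sits at the 4 dB end of the range and the cosine amplitude distribution at the 10 dB end, with the uniform distribution in between.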
2.6.4 Dynamic Range
The signal-to-noise-plus-distortion ratio using the first and second distortion measures
is given in Fig. 2.16. From this figure we see qualitatively that there is an
optimum input signal level at which to operate, depending on the distortion measure
used and the input signal distribution. The optimum signal level occurs in
the vicinity of the peak value for the signal-to-noise-plus-distortion ratio. For input
levels below this peak, the signal-to-noise ratio is low, whereas for input levels
above this peak, the signal-to-distortion ratio is low.
From first-order approximations using the series approximations derived earlier
and truncating after the first term, we expect the distortion-free dynamic range
using the first distortion measure to have a 6.67 dB/dec slope as a function of the
integrating capacitance $C$ and a $-6.67$ dB/dec slope as a function of the body-effect
coefficient $\kappa$. For nominal parameter values of $C = 5$ pF and $\kappa = 0.7$, DFDR1 is
in the range 41.7 to 44.2 dB. In Fig. 2.17(a) we plot the first distortion-free dynamic
range as a function of capacitance for three input distributions. Fig. 2.17(b) shows
the first distortion-free dynamic range as a function of $\kappa$. Similar plots can be
generated using the second distortion measure, as in Fig. 2.18. The slope of the
second distortion-free dynamic range is identical to that of the first, whereas its
level is 1.3 to 3.4 dB higher.
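The first-order estimate of the distortion-free dynamic range can be sketched as follows. It keeps only the first term of (2.66), takes the noise level of (2.62) as $kT/(\kappa C)$ with $kT = q V_t$, and solves for the input level at which distortion equals noise. The helper name dfdr1_db is hypothetical, and since this is a truncated-series estimate it should only approximate the numerically computed curves.

```python
import math

Q_ELECTRON = 1.602176634e-19  # elementary charge [C]

# Sixth moments E[x^6] for unit RMS value
SIXTH_MOMENT = {"uniform": 27.0 / 7.0, "cosine": 5.0 / 2.0, "normal": 15.0}

def dfdr1_db(dist, cap, kappa, v_t=25.7e-3):
    """First-order distortion-free dynamic range [dB], first distortion measure."""
    noise = Q_ELECTRON * v_t / (kappa * cap)     # kT/(kappa*C), from (2.62)
    # Leading term of (2.66): (kappa/Vt)^4 * m6 * sigma^6 / 36.
    # Setting it equal to the noise level gives sigma^2 at DNR = 1.
    sigma_sq = (36.0 * noise * (v_t / kappa) ** 4
                / SIXTH_MOMENT[dist]) ** (1.0 / 3.0)
    return 10.0 * math.log10(sigma_sq / noise)

for dist in SIXTH_MOMENT:
    print(f"{dist}: {dfdr1_db(dist, 5e-12, 0.7):.1f} dB")

# Slopes per decade of C and of kappa
slope_c = dfdr1_db("normal", 50e-12, 0.7) - dfdr1_db("normal", 5e-12, 0.7)
slope_kappa = dfdr1_db("normal", 5e-12, 0.7) - dfdr1_db("normal", 5e-12, 0.07)
print(slope_c, slope_kappa)
```

With the nominal values the estimate falls close to the quoted range, with the normal distribution at the low end and the cosine amplitude distribution at the high end.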
Figure 2.16: The signal-to-noise-plus-distortion ratio using (a) the first and (b) the
second distortion measure as a function of $\sigma$ in units of $V_t/\kappa$, where $V_t = 25.7$ mV,
$\kappa = 0.7$ and $C = 5.0$ pF. Curves are drawn for each of three input distributions:
uniform (solid), cosine value (dashed), and normal (dotted).
[Plots: (a) SNDR1 [dB] and (b) SNDR2 [dB], each versus Input Sigma [$V_t/\kappa$].]

Figure 2.17: The distortion-free dynamic range using the first distortion measure
(a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where $C = 5.0$ pF.
$V_t = 25.7$ mV. Curves are drawn for each of three input distributions: uniform
(solid), cosine value (dashed), and normal (dotted).
[Plots: DFDR [dB] versus (a) Capacitance [pF] and (b) kappa.]

Figure 2.18: The distortion-free dynamic range using the second distortion measure
(a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where $C = 5.0$ pF.
$V_t = 25.7$ mV. Curves are drawn for each of three input distributions: uniform
(solid), cosine value (dashed), and normal (dotted).
[Plots: DFDR [dB] versus (a) Capacitance [pF] and (b) kappa.]

Using first-order approximations, one can show that the distortion-limited dynamic
range is a slightly stronger function of $C$ and $\kappa$ than the distortion-free
dynamic range. Specifically, we anticipate a 10 dB/dec slope as a function of $C$
and a $-10$ dB/dec slope as a function of $\kappa$. The distortion-limited dynamic range
using the first distortion measure is in the range of 45.5 to 49.35 dB for the nominal
parameter values of $C = 5.0$ pF and $\kappa = 0.7$ and a maximum of 2% amplitude
distortion ($c = 0.02$). In Fig. 2.19(a) we plot the first distortion-limited dynamic
range as a function of capacitance for three input distributions. Fig. 2.19(b) shows
the first distortion-limited dynamic range as a function of $\kappa$. Similar plots are
given using the second distortion measure in Fig. 2.20. The slope of the second
distortion-limited dynamic range is identical to that of the first, whereas its level
is 2.0 to 5.2 dB higher.
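A corresponding first-order estimate of the distortion-limited dynamic range is sketched below. It again keeps only the leading term of the series for the first distortion measure, and it assumes the input level is set where the distortion power equals c^2 times the signal power. The helper name dldr1_db is hypothetical.

```python
import math

Q_ELECTRON = 1.602176634e-19  # elementary charge [C]

# Sixth moments E[x^6] for unit RMS value
SIXTH_MOMENT = {"uniform": 27.0 / 7.0, "cosine": 5.0 / 2.0, "normal": 15.0}

def dldr1_db(dist, cap, kappa, c_amp=0.02, v_t=25.7e-3):
    """First-order distortion-limited dynamic range [dB], first distortion measure."""
    noise = Q_ELECTRON * v_t / (kappa * cap)   # kT/(kappa*C)
    # Leading distortion term divided by sigma^2:
    #   (kappa*sigma/Vt)^4 * m6 / 36 = c_amp^2
    # which gives sigma^2 = 6*c_amp*(Vt/kappa)^2 / sqrt(m6).
    sigma_sq = 6.0 * c_amp * (v_t / kappa) ** 2 / math.sqrt(SIXTH_MOMENT[dist])
    return 10.0 * math.log10(sigma_sq / noise)

for dist in SIXTH_MOMENT:
    print(f"{dist}: {dldr1_db(dist, 5e-12, 0.7):.1f} dB")

# Slope per decade of C
slope_c = dldr1_db("normal", 50e-12, 0.7) - dldr1_db("normal", 5e-12, 0.7)
print(slope_c)
```

For c = 0.02 and the nominal parameter values, the estimate lands near the quoted range and shows the 10 dB/dec dependence on capacitance.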
The self-biased transconductor has the rare property that the nominal transconductance
$G_o$ is the minimum transconductance, i.e., the transconductance function
is convex. One might expect that, in a cascade of such transconductors, as in a
cochlear filter bank implementation, the output current distortion would eventually
push the transistor into the above threshold region of operation. Simply stated,
these transconductors do not have a saturating output current, but a saturating
output voltage. As we shall see in Chapter 3, transconductors based on differential
pairs have a saturating current characteristic, and as such the nominal transconductance
is also the maximum transconductance. The output current distortion
for differential transconductors tends to reduce the actual output signal.
In the next chapter we explore differential transconductors. Using analysis techniques
introduced in Chapter 2, three new linearized transconductors are designed
and optimized for operation in subthreshold CMOS.
Figure 2.19: The distortion-limited dynamic range using the first distortion measure
(a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where
$C = 5.0$ pF. $V_t = 25.7$ mV and 2% amplitude distortion. Curves are drawn
for each of three input distributions: uniform (solid), cosine value (dashed), and
normal (dotted).
[Plots: DLDR [dB] versus (a) Capacitance [pF] and (b) kappa.]

Figure 2.20: The distortion-limited dynamic range using the second distortion
measure (a) as a function of $C$, where $\kappa = 0.7$, and (b) as a function of $\kappa$, where
$C = 5.0$ pF. $V_t = 25.7$ mV and 2% amplitude distortion. Curves are drawn
for each of three input distributions: uniform (solid), cosine value (dashed), and
normal (dotted).
[Plots: DLDR [dB] versus (a) Capacitance [pF] and (b) kappa.]
Chapter 3
Linearized Transconductors in
Subthreshold CMOS
Analog circuits implemented in subthreshold CMOS are attractive because of
their low power consumption and compatibility with standard digital CMOS processes
(Vittoz, 1994). Continuous-time linear filtering of audio signals, for applications
such as hearing aids, is one class of analog circuits to which subthreshold
CMOS poses a particular challenge. The reason is that subthreshold current in
a CMOS device depends exponentially on the gate voltage. As a case in point,
we show in section 3.2 that the linear range of the basic two-transistor differential
pair operating below threshold is less than $\pm 7.5$ mV. However, by applying several
linearizing techniques we are able to increase the linear range by as much as eight
times. Moreover, these techniques require only modest increases in silicon area
and power consumption.
Section 3.2 describes and analyzes the basic two-transistor differential pair. In
this section we define two performance measures for transconductors, linear range
and current efficiency. Sections 3.3 and 3.4 analyze in detail the four linearizing
techniques that are included in this research. Hints on improved transconductor
designs are given in section 3.5. The chapter concludes with experimental results
and a table summary.
A model for the current in an nMOS device operating below threshold is given
in Appendix A and is repeated here for convenience:
\[ I_{DS} = I_0 S \exp(\kappa V_{GB}/V_t) \left[ \exp(-V_{SB}/V_t) - \exp(-V_{DB}/V_t) \right] \tag{3.1} \]
For transistors operating in saturation, i.e., (VDB − VSB) ≥ 5Vt, the drain dependence
can be safely ignored.
3.1 The Transconductance-C Integrator
A differential-input, single-ended output transconductance-C integrator with no
linearization is shown in Fig. 3.1. It consists of a transconductor, a current mirror,
and an integrating capacitance. In this case the transconductor is a simple differential
pair, as found in (Liu, 1992). In Fig. 3.1 and for the remainder of this
work, a three-terminal MOS transistor is assumed to have the bulk terminal tied to
a common local substrate. The mirror is assumed noiseless in order to isolate the
behavior of the transconductor. Alternatively, the mirror can be replaced by a complementary
active stage, yielding a differential output. The noise is doubled, but so
is the nominal transconductance, so that no net noise is introduced (Groenewold,
1991).
Our goal is to maximize the dynamic range of the transconductance-C integrator.
Recall that dynamic range is the ratio of the maximum output voltage signal
power to the output voltage noise power, expressed in dB. If we assume a
lowpass configuration with cutoff frequency Go/C, the equivalent noise bandwidth
is Go/4C, as derived in Chapter 2. Then the output voltage noise power can be
computed most conveniently as the product of the input-referred noise density with
the equivalent noise bandwidth of the circuit.
[Figure 3.1 schematic omitted.]
Figure 3.1: The transconductance-C integrator using the basic differential pair.
3.2 The Differential Pair and Definitions
The basic differential pair is shown in Fig. 3.2(a). It consists of two matched
transistors M1 and M2 operating in saturation and a third transistor M3 operating
as a current source, Ib. Let V1 and V2 be defined by their common-mode and
differential-mode voltages with respect to the substrate potential, where
\[ \begin{aligned} V_{CM} &\equiv \frac{V_1 + V_2}{2} \\ V_{DM} &\equiv V_1 - V_2 \end{aligned} \tag{3.2} \]
Solving for the input voltages, we have
\[ \begin{aligned} V_1 &= V_{CM} + \frac{V_{DM}}{2} \\ V_2 &= V_{CM} - \frac{V_{DM}}{2} \end{aligned} \tag{3.3} \]
Let I1 and I2 be the currents passing through the two transistors as shown in
Fig. 3.2(a). We have the constraint
\[ I_b = I_1 + I_2 \tag{3.4} \]
Let VS be the voltage at the source of the differential pair. If we assume that VB,
the bulk potential, is at zero volts, we can write
\[ \begin{aligned} I_1 &= I_0 S \exp\!\left(\kappa (V_{CM} + V_{DM}/2)/V_t\right) \exp(-V_S/V_t) \\ I_2 &= I_0 S \exp\!\left(\kappa (V_{CM} - V_{DM}/2)/V_t\right) \exp(-V_S/V_t) \end{aligned} \tag{3.5} \]
Define the differential output current as
\[ I_{DM} \equiv I_1 - I_2 \tag{3.6} \]
Then
\[ I_{DM} = I_0 S \exp(\kappa V_{CM}/V_t) \exp(-V_S/V_t) \left[ \exp(\kappa V_{DM}/2V_t) - \exp(-\kappa V_{DM}/2V_t) \right] \tag{3.7} \]
[Figure 3.2 schematic omitted.]
Figure 3.2: The basic differential pair (a) circuit, and (b) simplified AC noise model.
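From (3.4), Ib can be written as
\[ I_b = I_0 S \exp(\kappa V_{CM}/V_t) \exp(-V_S/V_t) \left[ \exp(\kappa V_{DM}/2V_t) + \exp(-\kappa V_{DM}/2V_t) \right] \tag{3.8} \]
Normalizing IDM by the bias current and canceling common terms, we get
\[ \frac{I_{DM}}{I_b} = \frac{\exp(\kappa V_{DM}/2V_t) - \exp(-\kappa V_{DM}/2V_t)}{\exp(\kappa V_{DM}/2V_t) + \exp(-\kappa V_{DM}/2V_t)} \tag{3.9} \]
Recognizing the right-hand side as the hyperbolic tangent, we can write
\[ I_{DM} = I_b \tanh\!\left( \frac{\kappa V_{DM}}{2V_t} \right) \tag{3.10} \]
Define the transconductance of the differential pair as
\[ G(V_{DM}) \equiv \frac{\partial I_{DM}}{\partial V_{DM}} \tag{3.11} \]
where the nominal value Go is equal to G(VDM = 0). For the case of the basic
differential pair,
\[ G(V_{DM}) = \frac{I_b}{\cosh^2\!\left( \frac{\kappa V_{DM}}{2V_t} \right)} \, \frac{\kappa}{2V_t} \tag{3.12} \]
The nominal value occurs for VDM equal to zero. Thus,
\[ G_o = I_b \frac{\kappa}{2V_t} \tag{3.13} \]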
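As a quick numerical illustration of (3.10) through (3.13), the following minimal sketch evaluates the tanh transfer characteristic and checks the closed-form transconductance against a finite-difference derivative. The 1 nA bias current and the function names are illustrative assumptions, not values or names taken from this work; κ = 0.7 and Vt = 25.7 mV are the values used elsewhere in the chapter.

```python
import math

def i_dm(v_dm, i_b, kappa=0.7, v_t=0.0257):
    """Differential output current of the basic pair, Eq. (3.10)."""
    return i_b * math.tanh(kappa * v_dm / (2.0 * v_t))

def g(v_dm, i_b, kappa=0.7, v_t=0.0257):
    """Transconductance of the basic pair, Eq. (3.12)."""
    return i_b * kappa / (2.0 * v_t) / math.cosh(kappa * v_dm / (2.0 * v_t)) ** 2

i_b = 1e-9                      # hypothetical 1 nA bias current (assumption)
g_o = g(0.0, i_b)               # nominal transconductance, Eq. (3.13)

# Central finite difference of (3.10) at V_DM = 0 as a cross-check on (3.12).
h = 1e-6
g_fd = (i_dm(h, i_b) - i_dm(-h, i_b)) / (2.0 * h)
print(g_o, g_fd)
```

The two printed values should agree closely, since (3.12) is the exact derivative of (3.10).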
In Appendix A a small-signal noise model is given for the MOS transistor.
Assuming a fixed (noiseless) bias potential, Vbias, a simplified small-signal noise
model for the differential pair can be derived as in Fig. 3.2(b). Two equations
which can be used to solve for the differential noise current idn, where
\[ i_{dn} \equiv i_1 - i_2 \tag{3.14} \]
are as follows:
\[ \begin{aligned} i_{dn} &= i_{n1} - v_s (g_{m1} + g_{mb1} + g_{d1}) - i_{n2} + v_s (g_{m2} + g_{mb2} + g_{d2}) \\ i_{n1} + i_{n2} - i_{n3} &= v_s (2g_{m1} + 2g_{mb1} + 2g_{d1} + g_{d3}) \end{aligned} \tag{3.15} \]
The second equation is found by summing all currents at node vs. Let us assume
that transistors M1 and M2 are matched, so that their small-signal parameters are
equal. After simplification, we find that
\[ i_{dn} = i_{n1} - i_{n2} \tag{3.16} \]
It is interesting to note that noise in the bias current is effectively a common-mode
noise source and is canceled by the differential output current.
To find the power spectrum of the differential noise current, we add the effects
of each independent noise source. As such, we sum the power spectra of the
individual sources, each multiplied by the squared magnitude of its transfer function. In
this case the transfer functions are simply 1 and −1 for in1 and in2, respectively.
\[ S_{i_{dn}}(\omega) = (1)^2 S_{i_{n1}}(\omega) + (-1)^2 S_{i_{n2}}(\omega) = S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \tag{3.17} \]
From Appendix A, the power spectrum of the current noise in a single transistor
is 2qIDS. For the nominal input voltage VDM = 0, I1 = I2 = Ib/2. Therefore, we
have
\[ S_{i_{dn}}(\omega) = 2q\frac{I_b}{2} + 2q\frac{I_b}{2} = 2qI_b \tag{3.18} \]
From Chapter 2 the input-referred noise density is the differential output current
density divided by the square of the nominal transconductance. Thus,
\[ V_{in,n}^2 = \frac{S_{i_{dn}}(\omega)}{G_o^2} \tag{3.19} \]
Multiplying the input noise density by the equivalent noise bandwidth of a lowpass
filter, Go/4C, the output noise power is
\[ V_{out,n}^2 = \frac{S_{i_{dn}}(\omega)}{4 G_o C} \tag{3.20} \]
For the case of the differential pair, we substitute Sidn and Go into (3.20) to obtain
\[ V_{out,n}^2 = \frac{q V_t}{\kappa C} \tag{3.21} \]
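The substitution leading to (3.21) can be checked numerically. The sketch below evaluates (3.20) with the noise density from (3.18) and the nominal transconductance from (3.13), and confirms that the bias current cancels. The list of bias currents and the constant names are illustrative assumptions; C = 5.0 pF, Vt = 25.7 mV and κ = 0.7 are the nominal values quoted in the chapter.

```python
import math

Q = 1.602e-19      # electron charge [C]
V_T = 0.0257       # thermal voltage [V]
KAPPA = 0.7
C = 5.0e-12        # integrating capacitance [F]

def output_noise_power(i_b):
    """Eq. (3.20) with S_idn = 2 q I_b (3.18) and G_o = I_b kappa / (2 V_t) (3.13)."""
    s_idn = 2.0 * Q * i_b
    g_o = i_b * KAPPA / (2.0 * V_T)
    return s_idn / (4.0 * g_o * C)

closed_form = Q * V_T / (KAPPA * C)          # Eq. (3.21)
for i_b in (1e-10, 1e-9, 1e-8):              # hypothetical bias currents
    # The bias current cancels, so every value matches the closed form.
    assert math.isclose(output_noise_power(i_b), closed_form, rel_tol=1e-12)

v_rms = math.sqrt(closed_form)               # output noise voltage, volts rms
print(v_rms)
```

The output noise power depends only on Vt, κ and C, which is why the bias current can be used purely as a tuning parameter.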
The maximum normalized transconductance distortion measure, as defined in
Chapter 2, is applied to the transconductor. It is repeated here for convenience.
\[ D_G \equiv \max \left| \frac{G(V_{in}) - G_o}{G_o} \right| \quad \forall\, V_{in} \tag{3.22} \]
where Go is the nominal transconductance (see footnote 1). One can express DG as a percentage,
where a typical value is 1.0%. Referred to the input voltage, a distortion power
measure can be computed as
\[ V_{in,dg}^2 = V_{in,s}^2 (D_G)^2 \tag{3.23} \]
For the remainder of this chapter, the term distortion will refer to the maximum
normalized transconductance distortion measure, unless otherwise stated. As outlined
in section 2.4, other distortion measures are possible. This particular distortion
measure is the most conservative and also the simplest to compute of the
three considered in that section.
Define the maximum input voltage Vmax of the transconductor as the largest value
for which the maximum normalized distortion is less than or equal to a given constant
over the continuous set of differential input voltages |VDM| ≤ Vmax. By symmetry, the linear range of the
transconductor will be ±Vmax. Other constraints can be added to the definition of
linear range. For example, we may require a certain degree of smoothness in the
transconductance function at VDM = 0.
If the transconductance function is convex and takes its maximum value at
VDM = 0, then Vmax can be easily computed by finding that value of VDM which
achieves the maximum distortion, as in
\[ D_G = \left| \frac{G(V_{max}) - G_o}{G_o} \right| \tag{3.24} \]
Footnote 1: For the case that the transconductance function has more than one inflection point, such as
an equiripple design, substitute the global maximum for the nominal transconductance.
Supposing that the transconductance function has an inverse, G^{-1}(·), one can solve
for Vmax, as in
\[ V_{max} = G^{-1}\!\left( G_o (1 - D_G) \right) \tag{3.25} \]
If the inverse function does not exist, numerical techniques can be used to solve
the above equation.
Given DG, the maximum input voltage Vmax for the basic differential pair can
be determined analytically as
\[ V_{max} = \frac{2V_t}{\kappa} \cosh^{-1}\!\left( \frac{1}{\sqrt{1 - D_G}} \right) \tag{3.26} \]
Note that Vmax is proportional to Vt/κ. In fact, one can write Vmax in the form
\[ V_{max} = d(D_G) \frac{V_t}{\kappa} \tag{3.27} \]
where d(·) is a function of the percent distortion only. If we specify DG as 1%,
that function becomes a constant and we have
\[ V_{max} = 0.201 \frac{V_t}{\kappa} \tag{3.28} \]
For Vt = 25.7 mV and κ = 0.7 we obtain a linear range of ±7.37 mV. In Fig. 3.3
we plot normalized G as a function of VDM.
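The constant in (3.28) and the resulting linear range follow directly from (3.26). A minimal sketch, in which the function name `v_max_basic` is an illustrative assumption:

```python
import math

def v_max_basic(d_g, kappa=0.7, v_t=0.0257):
    """Maximum input voltage of the basic differential pair, Eq. (3.26), in volts."""
    return (2.0 * v_t / kappa) * math.acosh(1.0 / math.sqrt(1.0 - d_g))

# d(D_G) of Eq. (3.27) evaluated at 1% distortion.
d_const = 2.0 * math.acosh(1.0 / math.sqrt(1.0 - 0.01))
v_max = v_max_basic(0.01)
print(round(d_const, 3), round(v_max * 1e3, 2))   # constant, and V_max in mV
```

Because d(·) depends only on DG, the same constant can be reused for any Vt and κ.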
Let us define the current efficiency of a transconductor as the maximal linear
output current expressed as a fraction of the total bias current. If we write the
differential output current IDM as a function of the differential input voltage VDM,
then the current efficiency η_I can be expressed as
\[ \eta_I = \frac{I_{DM}(V_{DM} = V_{max})}{I_b} \times 100\% \tag{3.29} \]
where Vmax is the maximum differential input voltage. The current efficiency gives
us a measure of how much of the available current is usable for performing linear
computations. Note that the bias current serves the dual purpose of tuning the
nominal transconductance.
Given the maximum input voltage, the current efficiency of the basic differential
pair can be computed as
\[ \eta_I = \tanh\!\left( \frac{\kappa V_{max}}{2V_t} \right) \times 100\% \tag{3.30} \]
Applying hyperbolic identities, this equation simplifies to
\[ \eta_I = \sqrt{D_G} \times 100\% \tag{3.31} \]
Thus, for DG equal to 0.01, we find that the current efficiency of this design is
10.0%.
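The simplification from (3.30) to (3.31) uses tanh² = 1 − 1/cosh² together with (3.26). The following sketch checks the identity numerically for a few distortion levels; the levels other than 1% and the function name are illustrative assumptions.

```python
import math

def efficiency_basic(d_g, kappa=0.7, v_t=0.0257):
    """Current efficiency of the basic pair as a fraction, Eqs. (3.26) and (3.30)."""
    v_max = (2.0 * v_t / kappa) * math.acosh(1.0 / math.sqrt(1.0 - d_g))
    return math.tanh(kappa * v_max / (2.0 * v_t))

for d_g in (0.001, 0.01, 0.05):
    # Eq. (3.31): the efficiency equals sqrt(D_G), independent of kappa and V_t.
    assert math.isclose(efficiency_basic(d_g), math.sqrt(d_g), rel_tol=1e-9)

eta_percent = 100.0 * efficiency_basic(0.01)
print(eta_percent)
```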
The input amplitude is permitted to vary in the range of ±Vmax. If we assume
that the input is sinusoidal with amplitude Vmax, the maximum input signal power
is given by
\[ V_{in,s}^2 = \frac{V_{max}^2}{2} \tag{3.32} \]
For a transconductance-C filter configured as a unity-gain lowpass filter, the maximum
output signal power is equal to the maximum input signal power, so that
\[ V_{out,s}^2 = \frac{V_{max}^2}{2} \tag{3.33} \]
The dynamic range can be expressed as the maximum signal-to-noise ratio at
a given distortion DG, as in
\[ DR = \left. \frac{V_{out,s}^2}{V_{out,n}^2} \right|_{D_G} \tag{3.34} \]
Substituting (3.33) into the above equation, we get
\[ DR = \frac{1}{2} \frac{V_{max}^2}{V_{out,n}^2} \tag{3.35} \]
For the case of the differential pair, substituting (3.21) and (3.28), the dynamic
range can be written as
\[ DR = \frac{1}{2} \left( \frac{0.201\, V_t}{\kappa} \right)^2 \frac{\kappa C}{q V_t} = 0.0201 \frac{V_t C}{\kappa q} \tag{3.36} \]
[Figure 3.3 plot omitted. Axes: normalized G [S/S] versus Vdm [mV].]
Figure 3.3: Normalized transconductance for the basic differential pair as a function of VDM with Vt = 25.7 mV and κ = 0.7.
For a nominal capacitance value of 5.0 pF, Vt = 25.7 mV and κ = 0.7, the dynamic
range of the differential pair is 43.6 dB. From (3.36), we can identify certain trends
or relationships. Increasing the temperature of operation will increase Vt and thereby
extend the dynamic range. How does this happen? Increases in temperature
increase the noise power, but only linearly with T. However, increases in temperature
increase the signal power by the factor T². Taking the ratio of the signal
power to the noise power yields a net increase. Similarly, increases in C and/or
decreases in κ result in a higher dynamic range.
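The 43.6 dB figure and the trends noted above can be reproduced from (3.21), (3.26) and (3.35). This is a minimal sketch; the function name, the constant name and the particular perturbed values of C, Vt and κ are illustrative assumptions.

```python
import math

Q = 1.602e-19   # electron charge [C]

def dynamic_range_db(c, v_t=0.0257, kappa=0.7, d_g=0.01):
    """Dynamic range of the basic differential pair in dB, per Eq. (3.35)."""
    v_max = (2.0 * v_t / kappa) * math.acosh(1.0 / math.sqrt(1.0 - d_g))  # (3.26)
    signal = v_max ** 2 / 2.0                                             # (3.33)
    noise = Q * v_t / (kappa * c)                                         # (3.21)
    return 10.0 * math.log10(signal / noise)

dr = dynamic_range_db(5.0e-12)
dr_double_c = dynamic_range_db(10.0e-12)            # doubling C adds about 3 dB
dr_warmer = dynamic_range_db(5.0e-12, v_t=0.0280)   # larger V_t (higher temperature)
dr_low_kappa = dynamic_range_db(5.0e-12, kappa=0.6) # smaller kappa
print(round(dr, 1))
```

Since DR is proportional to Vt C / κ, each doubling of C or halving of κ contributes about 3 dB.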
While it is not immediately apparent how κ can be varied, the next section
details one possible means of reducing an "effective" κ using diode-connected transistors.
Alternatively, one might use the "back-gate", i.e., the substrate, of the MOS
transistor as the input terminal (Sarpeshkar et al., 1996), since the effective κ of
the back-gate is equal to (1 − κ).
3.3 Source Degeneration
Source degeneration can be accomplished by placing a conductance at the source
of the differential pair. In a standard digital CMOS process, no high-value resistors
exist. Therefore, resistors will be generated using transistors only. Three
techniques for improving the linear range of the basic differential pair using source
degeneration are outlined below.
3.3.1 Diode-Connected Transistors
A differential pair with source degeneration via diode-connected transistors is
shown in Fig. 3.4 (Watts, 1992). It consists of two pairs of matched transistors
M1-M2 and M3-M4 and a fifth transistor M5 operating as a current source, Ib.
Let VS now be the voltage at the source of the two diode-connected transistors,
referenced to ground. The voltages at the sources of M1 and M2 are VS1 and VS2,
respectively. We assume that all transistors are operating in saturation and that
VB, the bulk potential, is at zero volts. Applying (3.1) to transistors M1 and M3
in Fig. 3.4(a), and eliminating the voltage VS1, one can show that
\[ I_1 = I_0 S_{eff} \exp\!\left( \frac{\kappa^2 V_1}{(1+\kappa) V_t} \right) \exp\!\left( -\frac{V_S}{(1+\kappa) V_t} \right) \tag{3.37} \]
where Seff is the effective width-to-length ratio given by
\[ S_{eff} = \left( S_1^{\kappa} S_3 \right)^{1/(1+\kappa)} \tag{3.38} \]
We note from (3.37) that the width-to-length ratios of transistors M1 and M3 do
not control the current I1 independently. An equation analogous to (3.37) holds
for transistors M2 and M4 of Fig. 3.4(a).
Let V1 and V2 be defined as in (3.3). The constraint Ib = I1 + I2 still holds.
We find that
\[ I_{DM} = I_b \tanh\!\left( \frac{\kappa^2 V_{DM}}{(1+\kappa)\, 2V_t} \right) \tag{3.39} \]
The transconductance as a function of VDM can be written as
\[ G = \frac{I_b}{\cosh^2\!\left( \frac{\kappa^2 V_{DM}}{(1+\kappa)\, 2V_t} \right)} \, \frac{\kappa^2}{(1+\kappa)\, 2V_t} \tag{3.40} \]
The nominal value occurs for VDM equal to zero. Thus,
\[ G_o = I_b \frac{\kappa}{2V_t} \left( \frac{\kappa}{1+\kappa} \right) \tag{3.41} \]
To some extent, one can view the presence of the diode-connected transistor as
having reduced the effective κ by the factor κ/(1+κ). But it is not, unfortunately,
quite that simple for the computation of noise power.
A simplified small-signal noise model for the differential pair with source degeneration
via diode-connected transistors is shown in Fig. 3.4(b). Three equations
which can be used to solve for the differential noise current idn ≡ i1 − i2 are:
\[ \begin{aligned} i_{dn} &= i_{n1} - v_{s1}(g_{m1} + g_{mb1} + g_{d1}) - i_{n2} + v_{s2}(g_{m2} + g_{mb2} + g_{d2}) \\ i_{dn} &= i_{n3} + (v_{s1} - v_s)(g_{m3} + g_{d3}) - v_s g_{mb3} - i_{n4} - (v_{s2} - v_s)(g_{m4} + g_{d4}) + v_s g_{mb4} \\ i_{n3} + i_{n4} - i_{n5} &= v_s (g_{mb3} + g_{mb4} + g_{d5}) + (v_s - v_{s1})(g_{m3} + g_{d3}) + (v_s - v_{s2})(g_{m4} + g_{d4}) \end{aligned} \tag{3.42} \]
[Figure 3.4 schematic omitted.]
Figure 3.4: The differential pair with source degeneration via diode-connected transistors (a) circuit, and (b) small-signal noise model.
The third equation is found by summing all currents at node vs. Let us assume that
transistor pairs M1-M2 and M3-M4 are matched, i.e., their small-signal parameters
are the same. Then after some simplification, we find
\[ i_{dn} = \frac{(i_{n1} - i_{n2})(g_{m3} + g_{d3})}{g_{m1} + g_{mb1} + g_{d1} + g_{m3} + g_{d3}} + \frac{(i_{n3} - i_{n4})(g_{m1} + g_{mb1} + g_{d1})}{g_{m1} + g_{mb1} + g_{d1} + g_{m3} + g_{d3}} \tag{3.43} \]
To estimate the power spectrum of the differential noise current, we make the
following assumptions. Let gd1 = gd5 ≈ 0. From Appendix A, gm = κIDS/Vt and
gmb = (1 − κ)IDS/Vt, so that gm + gmb = IDS/Vt. Since the current IDS is the
same through the differential pair as through the diode-connected transistors, the
equation for idn reduces to
\[ i_{dn} = \frac{(i_{n1} - i_{n2})\,\kappa}{1+\kappa} + \frac{(i_{n3} - i_{n4})}{1+\kappa} \tag{3.44} \]
As before, we sum the power spectra of the independent noise sources, each multiplied
by the squared magnitude of its respective transfer function. We obtain
\[ S_{i_{dn}}(\omega) = \left( \frac{\kappa}{1+\kappa} \right)^2 \left[ S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \right] + \left( \frac{1}{1+\kappa} \right)^2 \left[ S_{i_{n3}}(\omega) + S_{i_{n4}}(\omega) \right] \tag{3.45} \]
From Appendix A, the power spectrum of the current noise in a single transistor
is 2qIDS. For VDM = 0, I1 = I2 = Ib/2. Therefore, each of the four noise current
sources has power spectrum 2qIb/2. We obtain
\[ S_{i_{dn}}(\omega) = \frac{1+\kappa^2}{(1+\kappa)^2}\, 2qI_b \tag{3.46} \]
For κ = 0.7, the fraction (1 + κ²)/(1 + κ)² is approximately 0.5. Therefore, for
a given bias current, the differential noise spectral density in this transconductor
is almost half that of a simple differential pair. Using the same type of models
described in this work, one can show that a current source with source degeneration
via a diode-connected transistor has a lower noise current density than a simple or
cascoded current source.
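The weighting in (3.45) and the resulting factor in (3.46) are easy to tabulate. A minimal sketch, where the function name is an illustrative assumption:

```python
def noise_factor_diode(kappa):
    """S_idn of Eq. (3.46) divided by 2 q I_b, the value for the basic pair (3.18)."""
    w_pair = (kappa / (1.0 + kappa)) ** 2     # weight on i_n1 and i_n2 in (3.45)
    w_diode = (1.0 / (1.0 + kappa)) ** 2      # weight on i_n3 and i_n4 in (3.45)
    # Each source has spectrum 2 q I_b / 2, so each matched pair of sources
    # contributes its weight times 2 q I_b.
    return w_pair + w_diode

factor = noise_factor_diode(0.7)
print(round(factor, 3))
```

For κ = 0.7 the factor evaluates to 1.49/2.89, consistent with the "approximately 0.5" quoted in the text.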
To compute the output-referred noise power for the differential pair with diode-connected
transistors, we substitute equations for Go and Sidn into (3.20) to obtain
\[ V_{out,n}^2 = \frac{q V_t}{\kappa C} \, \frac{1+\kappa^2}{\kappa + \kappa^2} \tag{3.47} \]
Thus, the output-referred noise power is increased by the factor (1 + κ²)/(κ + κ²)
compared to the simple differential pair. For κ = 0.7, this amounts to a 25%
increase.
As done for the simple differential pair, the maximum input voltage Vmax can
be determined analytically as in
\[ V_{max} = \frac{2V_t}{\kappa} \left( \frac{1+\kappa}{\kappa} \right) \cosh^{-1}\!\left( \frac{1}{\sqrt{1 - D_G}} \right) \tag{3.48} \]
If we set the distortion DG to 1.0%, as before, we can rewrite the expression
for Vmax leaving visible the parameters Vt and κ, where
\[ V_{max} = 0.201 \frac{V_t}{\kappa} \left( \frac{1+\kappa}{\kappa} \right) \tag{3.49} \]
For Vt = 25.7 mV and κ = 0.7 this relation predicts a linear range of ±17.9 mV.
In Fig. 3.5 we plot G as a function of VDM.
Given the maximum input voltage, one can show that the current efficiency
of the differential pair with source degeneration via diode-connected transistors is
identical to that of the basic transconductor, i.e., 10.0%.
[Figure 3.5 plot omitted. Axes: normalized G [S/S] versus Vdm [mV].]
Figure 3.5: Normalized transconductance for the differential pair with source degeneration via diode-connected transistors as a function of VDM with Vt = 25.7 mV and κ = 0.7.
To compute the dynamic range for the case of the differential pair with diode-connected
transistors, substitute (3.47) and (3.49) into (3.35) to obtain
\[ DR = 0.0201 \frac{V_t C}{\kappa q} \, \frac{(1+\kappa)^3}{\kappa (1+\kappa^2)} \tag{3.50} \]
For a nominal capacitance value of 5.0 pF and other parameters as before, the
dynamic range of the differential pair with diode-connected transistors is 50.4 dB,
representing a 6.8 dB increase over that of the basic differential pair.
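The dynamic range for the diode-connected design can be evaluated from (3.47), (3.48) and (3.35) without rounding the intermediate constants. In this sketch the function and constant names are illustrative assumptions; the unrounded improvement comes out slightly below the 6.8 dB obtained by differencing the two rounded dB figures.

```python
import math

Q = 1.602e-19   # electron charge [C]

def dr_db_diode(c, v_t=0.0257, kappa=0.7, d_g=0.01):
    """Dynamic range in dB for the diode-degenerated pair, Eqs. (3.47), (3.48), (3.35)."""
    d = 2.0 * math.acosh(1.0 / math.sqrt(1.0 - d_g))
    v_max = d * (v_t / kappa) * (1.0 + kappa) / kappa                         # (3.48)
    noise = Q * v_t / (kappa * c) * (1.0 + kappa ** 2) / (kappa + kappa ** 2)  # (3.47)
    return 10.0 * math.log10(0.5 * v_max ** 2 / noise)

kappa = 0.7
dr_diode = dr_db_diode(5.0e-12)
# Improvement over the basic pair: the extra factor in (3.50) relative to (3.36).
gain_db = 10.0 * math.log10((1.0 + kappa) ** 3 / (kappa * (1.0 + kappa ** 2)))
print(round(dr_diode, 1), round(gain_db, 2))
```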
The addition of diode-connected transistors cannot be viewed solely as a reduction
in κ as hinted in (Watts, 1992), in which the author defines a new κ′ =
κ²/(1 + κ), the chief reason being that the noise level is lower than anticipated
from this simple substitution.
The authors in (Watts, 1992) also propose source degeneration using more
than one diode-connected transistor in order to further enhance the linear range.
However, the cost of using one or more diode-connected transistors at the source
of a transistor is an increased supply voltage. A technique to adjust the threshold
voltage down to several hundred millivolts using floating gates would offset this
increase. Nevertheless, we would like to achieve an improved linear range without
having to increase the supply voltage and, hence, the total power consumption.
3.3.2 Single Diffusor
The diffusor was proposed in (Boahen and Andreou, 1992) and discussed extensively
in (Andreou and Boahen, 1994). Its diffusivity, or conductivity, is controlled
by an applied gate potential, VGC. A differential pair with source degeneration via
a single diffusor M3 is shown in Fig. 3.6. The same circuit topology, as applied to
above-threshold CMOS, can be found in (Tsividis et al., 1986).
Let VS1 and VS2 be the voltages at the sources of the differential pair, M1 and
M2, respectively. Let us define
\[ \begin{aligned} V_{S1} &\equiv V_{CMS} + \frac{V_{DMS}}{2} \\ V_{S2} &\equiv V_{CMS} - \frac{V_{DMS}}{2} \end{aligned} \tag{3.51} \]
where VCMS is the common-mode source voltage, and VDMS is the differential-mode
source voltage. They are given by the equations
\[ \begin{aligned} V_{CMS} &\equiv \frac{V_{S1} + V_{S2}}{2} \\ V_{DMS} &\equiv V_{S1} - V_{S2} \end{aligned} \tag{3.52} \]
We assume that all transistors except M3 operate in saturation and that VB,
the bulk potential, is at zero volts. Applying (3.1) to transistors M1 and M2 in
Fig. 3.6, we find that
\[ \begin{aligned} I_1 &= I_0 S_1 \exp\!\left( \frac{\kappa (V_{CM} + V_{DM}/2)}{V_t} \right) \exp\!\left( -\frac{V_{CMS} + V_{DMS}/2}{V_t} \right) \\ I_2 &= I_0 S_1 \exp\!\left( \frac{\kappa (V_{CM} - V_{DM}/2)}{V_t} \right) \exp\!\left( -\frac{V_{CMS} - V_{DMS}/2}{V_t} \right) \end{aligned} \tag{3.53} \]
Writing IDM = I1 − I2 and simplifying, we get an equation for IDM as follows
\[ I_{DM} = 2 I_0 S_1 \exp\!\left( \frac{\kappa V_{CM} - V_{CMS}}{V_t} \right) \sinh\!\left( \frac{\kappa V_{DM} - V_{DMS}}{2V_t} \right) \tag{3.54} \]
Let I12 be the current passing from node VS1 to VS2. It can be written as
\[ I_{12} = I_0 S_3 \exp\!\left( \frac{\kappa V_{GC}}{V_t} \right) \left[ \exp\!\left( -\frac{V_{CMS} - V_{DMS}/2}{V_t} \right) - \exp\!\left( -\frac{V_{CMS} + V_{DMS}/2}{V_t} \right) \right] \tag{3.55} \]
Using the constraint Ib = I1 + I2, we find
\[ I_{DM} = I_b \tanh\!\left( \frac{\kappa V_{DM}}{2V_t} - \frac{V_{DMS}}{2V_t} \right) \tag{3.56} \]
The voltage VDMS is best eliminated from (3.56), since it is a function of VDM. To
that end, apply Kirchhoff's current law at nodes VS1 and VS2 to obtain
\[ I_{12} = \frac{I_1 - I_2}{2} = \frac{I_{DM}}{2} \tag{3.57} \]
[Figure 3.6 schematic omitted.]
Figure 3.6: The differential pair with source degeneration via a single diffusor (a) circuit, and (b) small-signal noise model.
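Using (3.55) and (3.57) we can write another equation for IDM. After simplifying,
we get
\[ I_{DM} = 4 I_0 S_3 \exp\!\left( \frac{\kappa V_{GC} - V_{CMS}}{V_t} \right) \sinh\!\left( \frac{V_{DMS}}{2V_t} \right) \tag{3.58} \]
Equating (3.54) and (3.58) and applying hyperbolic identities, we solve for
VDMS
\[ V_{DMS} = 2V_t \tanh^{-1}\!\left[ \frac{\sinh\!\left( \frac{\kappa V_{DM}}{2V_t} \right)}{2\frac{S_3}{S_1} \exp\!\left( \frac{\kappa (V_{GC} - V_{CM})}{V_t} \right) + \cosh\!\left( \frac{\kappa V_{DM}}{2V_t} \right)} \right] \tag{3.59} \]
Let
\[ m \equiv \frac{S_3}{S_1} \tag{3.60} \]
be the relative width-to-length ratio and
\[ m_{eff} = \frac{S_3}{S_1} \exp\!\left( \frac{\kappa (V_{GC} - V_{CM})}{V_t} \right) \tag{3.61} \]
be the "effective" value of m. From (3.56) and (3.59) we note that the relative
width-to-length ratio and the term exp(κ(VGC − VCM)/Vt) have the same effect on the
voltage VDMS and hence the output current, IDM. We would like meff to be
constant, i.e., independent of the input signal. Therefore, we shall assume that
the voltage applied to the gate of the diffusor is exactly the common-mode voltage
of the input signals, i.e., VGC = VCM. Alternatively, one could set VGC higher to
achieve a higher effective value for m, or lower to simulate a lower one.
Substituting (3.59) into (3.56), we obtain a complete expression for IDM
\[ I_{DM} = I_b \tanh\!\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\!\left[ \frac{\sinh\!\left( \frac{\kappa V_{DM}}{2V_t} \right)}{2m + \cosh\!\left( \frac{\kappa V_{DM}}{2V_t} \right)} \right] \right) \tag{3.62} \]
Differentiating IDM, the transconductance function G can be written as
\[ G = \frac{I_b}{\cosh^2\!\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\!\left[ \frac{\sinh\left( \frac{\kappa V_{DM}}{2V_t} \right)}{2m + \cosh\left( \frac{\kappa V_{DM}}{2V_t} \right)} \right] \right)} \cdot \frac{4m^2 + 2m \cosh\!\left( \frac{\kappa V_{DM}}{2V_t} \right)}{4m^2 + 4m \cosh\!\left( \frac{\kappa V_{DM}}{2V_t} \right) + 1} \cdot \frac{\kappa}{2V_t} \tag{3.63} \]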
The relative width-to-length ratio m is the single parameter that we have to affect
the shape of the transconductance function.
Two possible criteria for optimizing the linear range are equiripple and maximal
flatness. The optimality criterion that we follow is that of maximal flatness, since
it is easier to derive analytically and it generally provides for a more robust design
strategy against device mismatch (see footnote 2). With one degree of freedom, the first nonzero
derivative of G will be set to zero. By design G is an even function of VDM.
Therefore its first derivative at VDM = 0 is zero. Setting the second derivative at VDM = 0 equal to zero,
we find the only positive root occurs at m = 0.25. This root is independent of Ib,
κ, and Vt.
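The maximal-flatness condition can be checked numerically. The sketch below evaluates (3.63) in the normalized variable a = κVDM/(2Vt), estimates the second derivative at a = 0 by a central difference, and bisects on m. The step size, the bracket [0.05, 1.0] and the function names are illustrative assumptions; this is a numerical cross-check, not the analytical derivation used in the text.

```python
import math

def g_norm(a, m):
    """G of Eq. (3.63) divided by I_b*kappa/(2 V_t); a = kappa*V_DM/(2 V_t)."""
    inner = a - math.atanh(math.sinh(a) / (2.0 * m + math.cosh(a)))
    ratio = (4.0 * m * m + 2.0 * m * math.cosh(a)) / (
        4.0 * m * m + 4.0 * m * math.cosh(a) + 1.0)
    return ratio / math.cosh(inner) ** 2

def curvature(m, h=1e-3):
    """Central-difference estimate of the second derivative of G at a = 0."""
    return (g_norm(h, m) - 2.0 * g_norm(0.0, m) + g_norm(-h, m)) / (h * h)

# Bisection on m; the bracket is an assumption chosen so the curvature changes sign.
lo, hi = 0.05, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if curvature(lo) * curvature(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
m_flat = 0.5 * (lo + hi)
print(round(m_flat, 3))
```

Because g_norm depends on VDM only through a, the root in m does not depend on Ib, κ or Vt.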
The nominal transconductance value occurs at VDM equal to zero. Thus,
\[ G_o = I_b \frac{\kappa}{2V_t} \left( \frac{2m}{1+2m} \right) = I_b \frac{\kappa}{2V_t} \left( \frac{1}{3} \right) \tag{3.64} \]
A simplified small-signal noise model for the differential pair with source degeneration
via a single diffusor is shown in Fig. 3.6(b). Three equations which can
be used to solve for the differential noise current idn ≡ i1 − i2 are:
\[ \begin{aligned} i_{dn} &= i_{n1} - i_{n2} - v_{s1}(g_{m1} + g_{mb1} + g_{d1}) + v_{s2}(g_{m2} + g_{mb2} + g_{d2}) \\ i_{n1} - i_{n4} - i_{nf3} + i_{nr3} &= v_{s1}(g_{m1} + g_{mb1} + g_{d1} + g_{d4}) + (v_{s1} - v_{s2})(g_{mf3} + g_{mbf3}) \\ i_{n2} - i_{n5} + i_{nf3} - i_{nr3} &= v_{s2}(g_{m2} + g_{mb2} + g_{d2} + g_{d5}) + (v_{s2} - v_{s1})(g_{mf3} + g_{mbf3}) \end{aligned} \tag{3.65} \]
The last two equations are found by summing all currents at nodes vs1 and vs2,
respectively. We assume that the differential pair M1-M2 and the two current
sources M4-M5 are matched, i.e., their small-signal parameters are equal. Solving
for idn, one can show that
\[ i_{dn} = \frac{(i_{n1} - i_{n2})(g_{d4} + 2g_{mf3} + 2g_{mbf3})}{g_{m1} + g_{mb1} + g_{d1} + 2g_{mf3} + 2g_{mbf3} + g_{d4}} + \frac{(i_{n4} - i_{n5} + 2i_{nf3} - 2i_{nr3})(g_{m1} + g_{mb1} + g_{d1})}{g_{m1} + g_{mb1} + g_{d1} + 2g_{mf3} + 2g_{mbf3} + g_{d4}} \tag{3.66} \]
Footnote 2: It remains to be proven through a detailed sensitivity analysis that the maximal flatness criterion
is indeed more robust than an equiripple design.
To estimate the power spectrum of the differential noise current, we make the
following assumptions. Set gd1 = gd4 ≈ 0. Recall that gm + gmb = IDS/Vt. From
Appendix A one can deduce that gmf3 = m gm1, where m is the scaling ratio S3/S1.
In this case the equation for idn reduces to
\[ i_{dn} = \frac{(i_{n1} - i_{n2})\, 2m}{1+2m} + \frac{(i_{n4} - i_{n5} + 2i_{nf3} - 2i_{nr3})}{1+2m} \tag{3.67} \]
As before, we sum the power spectra of the independent noise sources, each multiplied
by the squared magnitude of its respective transfer function, yielding
\[ S_{i_{dn}}(\omega) = \left( \frac{2m}{1+2m} \right)^2 \left[ S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \right] + \left( \frac{1}{1+2m} \right)^2 \left[ S_{i_{n4}}(\omega) + S_{i_{n5}}(\omega) \right] + \left( \frac{2}{1+2m} \right)^2 \left[ S_{i_{nf3}}(\omega) + S_{i_{nr3}}(\omega) \right] \tag{3.68} \]
From Appendix A, the power spectrum of the current noise in a single transistor
is 2qIDS. For VDM = 0, I1 = I2 = Ib/2, whereas IF = IR = mIb/2 for transistor
M3. We obtain
\[ S_{i_{dn}}(\omega) = \left( \frac{2m}{1+2m} \right)^2 2qI_b + \left( \frac{1}{1+2m} \right)^2 2qI_b + \left( \frac{2}{1+2m} \right)^2 2qmI_b = 2qI_b \tag{3.69} \]
This remarkable result indicates that, at a given bias current, the addition of a
diffusor between the sources of the two input transistors does not add any net
power to the differential output current noise. Whereas for m = 0 (no connection)
and m → ∞ (the basic differential pair), we expected the noise density to be 2qIb,
it is surprising that the same result holds for any value of m.
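The cancellation in (3.69) follows from (2m)² + 1 + 4m = (1 + 2m)². A short numerical check, in which the function name and the list of test values of m are illustrative assumptions:

```python
import math

def noise_ratio_single_diffusor(m):
    """S_idn of Eq. (3.69) divided by 2 q I_b, for relative aspect ratio m."""
    pair = (2.0 * m / (1.0 + 2.0 * m)) ** 2        # i_n1, i_n2 together: 2 q I_b
    sources = (1.0 / (1.0 + 2.0 * m)) ** 2         # i_n4, i_n5 together: 2 q I_b
    diffusor = (2.0 / (1.0 + 2.0 * m)) ** 2 * m    # i_nf3, i_nr3 together: 2 q m I_b
    return pair + sources + diffusor

for m in (0.0, 0.01, 0.25, 1.0, 10.0, 1000.0):
    # The three weighted contributions always sum to the basic-pair value.
    assert math.isclose(noise_ratio_single_diffusor(m), 1.0, rel_tol=1e-12)
```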
To compute the output-referred noise power for the differential pair with a
single diffusor, substitute equations for Go and Sidn into (3.20) to obtain
\[ V_{out,n}^2 = 3\, \frac{q V_t}{\kappa C} \tag{3.70} \]
We see that the output-referred noise power is increased by a factor of 3, which is due
to the reduction in Go.
The maximum input voltage Vmax can no longer be determined analytically.
If we set the distortion DG to 1.0%, as before, using numerical techniques to
determine the constant term, we can write an expression for the linear range as in
\[ V_{max} = 1.59 \frac{V_t}{\kappa} \tag{3.71} \]
For the same circuit conditions as before, the linear range is ±58.4 mV, or roughly
eight times that of the basic differential pair. The current efficiency of the differential
pair with a single diffusor is 26.5%, or more than 2.5 times that of the simple
differential pair. In Fig. 3.7 we plot G as a function of VDM.
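One way to carry out the numerical step is a simple bisection on the normalized transconductance. The sketch below uses (3.63) and (3.64) with m = 0.25 and solves G/Go = 1 − DG in the variable a = κVDM/(2Vt). The bracket [0, 3], the assumption that G/Go decreases monotonically on it, and the function names are assumptions of this sketch; the text does not state which numerical technique was used.

```python
import math

def g_rel_and_eff(a, m=0.25):
    """Return (G/G_o, I_DM/I_b) from Eqs. (3.62)-(3.64); a = kappa*V_DM/(2 V_t)."""
    inner = a - math.atanh(math.sinh(a) / (2.0 * m + math.cosh(a)))
    ratio = (4.0 * m * m + 2.0 * m * math.cosh(a)) / (
        4.0 * m * m + 4.0 * m * math.cosh(a) + 1.0)
    g_o = 2.0 * m / (1.0 + 2.0 * m)
    return ratio / math.cosh(inner) ** 2 / g_o, math.tanh(inner)

def solve_a_max(d_g=0.01, m=0.25):
    """Bisection for the a at which G/G_o drops to 1 - D_G (bracket is assumed)."""
    lo, hi = 0.0, 3.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g_rel_and_eff(mid, m)[0] > 1.0 - d_g:
            lo = mid
        else:
            hi = mid
    return lo

a_max = solve_a_max()
coeff = 2.0 * a_max                       # V_max = coeff * V_t / kappa
efficiency = g_rel_and_eff(a_max)[1]      # fraction of I_b at V_max
v_max_mv = coeff * 25.7 / 0.7             # linear range in mV for the stated conditions
print(round(coeff, 2), round(efficiency, 3), round(v_max_mv, 1))
```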
To compute the dynamic range for the case of the differential pair with a single
diffusor, substitute (3.70) and (3.71) into (3.35) to obtain
\[ DR = 0.422 \frac{V_t C}{\kappa q} \tag{3.72} \]
For a nominal capacitance value of 5.0 pF and other parameters as before, the
dynamic range of this transconductor is 56.8 dB, representing a 13.2 dB increase
over that of the basic differential pair, and a 6.4 dB increase over that of the
differential pair with diode-connected transistors.
A distinct disadvantage of this differential pair configuration is that it requires
additional common-mode circuitry to ensure that the input signals operate around
VGC. If for some reason the common-mode voltage drifts away from this value, the
linear range will be drastically reduced.
[Figure 3.7 plot omitted. Axes: normalized G [S/S] versus Vdm [mV].]
Figure 3.7: For the differential pair with source degeneration via a single diffusor, G normalized by the maximal transconductance as a function of VDM with Vt = 25.7 mV and κ = 0.7.
3.3.3 Double Diffusors
A differential pair with source degeneration via double diffusors M3 and M4 is
shown in Fig. 3.8. The basic topology for this circuit is derived from the work
described in (Krummenacher and Joehl, 1988). As for the single diffusor transconductor,
we will show that the double diffusor transconductor has one free parameter,
m = S3/S1, the relative aspect ratio of the two matched transistor pairs
M1-M2 and M3-M4. On the other hand, this differential pair configuration does
not require extra common-mode circuitry.
Let VS1 and VS2 be as defined in (3.51). We assume that transistors M1 and
M2 operate in saturation and that VB, the bulk potential, is at zero volts. Applying
(3.1) to transistors M1 and M2 in Fig. 3.8, we would find the same expressions
for I1, I2, and IDM = I1 − I2 as for the single diffusor case. The difference between
these two configurations comes in the expression for I12, the current passing from
node VS1 to VS2. An equation for I12 can be written as
\[ I_{12} = I_0 S_3 \left[ \exp\!\left( \frac{\kappa (V_{CM} + V_{DM}/2)}{V_t} \right) + \exp\!\left( \frac{\kappa (V_{CM} - V_{DM}/2)}{V_t} \right) \right] \left[ \exp\!\left( -\frac{V_{CMS} - V_{DMS}/2}{V_t} \right) - \exp\!\left( -\frac{V_{CMS} + V_{DMS}/2}{V_t} \right) \right] \tag{3.73} \]
Using (3.73) and (3.57) we can write another equation for IDM. After simplifying,
we get
\[ I_{DM} = 8 I_0 S_3 \exp\!\left( \frac{\kappa V_{CM} - V_{CMS}}{V_t} \right) \cosh\!\left( \frac{\kappa V_{DM}}{2V_t} \right) \sinh\!\left( \frac{V_{DMS}}{2V_t} \right) \tag{3.74} \]
Equating (3.54) and (3.74) and applying hyperbolic identities, we solve for
VDMS
\[ V_{DMS} = 2V_t \tanh^{-1}\!\left[ \frac{1}{4m+1} \tanh\!\left( \frac{\kappa V_{DM}}{2V_t} \right) \right] \tag{3.75} \]
Substituting (3.75) into (3.56), we obtain a complete expression for IDM
\[ I_{DM} = I_b \tanh\!\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\!\left[ \frac{1}{4m+1} \tanh\!\left( \frac{\kappa V_{DM}}{2V_t} \right) \right] \right) \tag{3.76} \]
[Figure 3.8 schematic omitted.]
Figure 3.8: The differential pair with source degeneration via double diffusors (a) circuit, and (b) small-signal noise model.
Differentiating IDM, the transconductance function G can be written as
\[ G = \frac{I_b}{\cosh^2\!\left( \frac{\kappa V_{DM}}{2V_t} - \tanh^{-1}\!\left[ \frac{1}{4m+1} \tanh\left( \frac{\kappa V_{DM}}{2V_t} \right) \right] \right)} \cdot \frac{(16m^2 + 8m) \cosh^2\!\left( \frac{\kappa V_{DM}}{2V_t} \right) - 4m}{(16m^2 + 8m) \cosh^2\!\left( \frac{\kappa V_{DM}}{2V_t} \right) + 1} \cdot \frac{\kappa}{2V_t} \tag{3.77} \]
The relative width-to-length ratio m is the single parameter with which we can
affect the shape of the transconductance function. Setting the second derivative
of G equal to zero, we find the only positive root occurs at m = 0.5. This root is
independent of Ib, κ, and Vt.
The nominal value of G occurs at VDM equal to zero. Thus,
\[ G_o = I_b \frac{\kappa}{2V_t} \, \frac{4m}{4m+1} = I_b \frac{\kappa}{2V_t} \, \frac{2}{3} \tag{3.78} \]
A simplified small-signal noise model for the differential pair with source degeneration
via double diffusors is shown in Fig. 3.8(b). The analysis of this circuit is
quite similar to the case of a single diffusor, except that there are four independent
noise sources between nodes vs1 and vs2, and the conductance between these two
nodes is doubled, reflecting the fact that there are two diffusors in parallel. Assuming
precise matching of the three transistor pairs, M1-M2, M3-M4, and M5-M6,
one can write an expression for the differential output noise current as
\[ i_{dn} = \frac{(i_{n1} - i_{n2})(g_{d5} + 4g_{mf3} + 4g_{mbf3})}{g_{m1} + g_{mb1} + g_{d1} + 4g_{mf3} + 4g_{mbf3} + g_{d5}} + \frac{(i_{n5} - i_{n6} + 2i_{nf3} - 2i_{nr3} + 2i_{nf4} - 2i_{nr4})(g_{m1} + g_{mb1} + g_{d1})}{g_{m1} + g_{mb1} + g_{d1} + 4g_{mf3} + 4g_{mbf3} + g_{d5}} \tag{3.79} \]
To estimate the power spectrum of the differential noise current, we make the
following assumptions. Set gd1 = gd5 ≈ 0. Recall that gm + gmb = IDS/Vt. From
Appendix A we can deduce that gmf3 = m gm1, where m is the scaling ratio S3/S1.
The equation for idn simplifies to
\[ i_{dn} = \frac{(i_{n1} - i_{n2})\, 4m}{1+4m} + \frac{(i_{n5} - i_{n6} + 2i_{nf3} - 2i_{nr3} + 2i_{nf4} - 2i_{nr4})}{1+4m} \tag{3.80} \]
As previously done, we sum the power spectra of the independent noise sources,
each multiplied by the squared magnitude of its respective transfer function.
\[ S_{i_{dn}}(\omega) = \left( \frac{4m}{1+4m} \right)^2 \left[ S_{i_{n1}}(\omega) + S_{i_{n2}}(\omega) \right] + \left( \frac{1}{1+4m} \right)^2 \left[ S_{i_{n5}}(\omega) + S_{i_{n6}}(\omega) \right] + \left( \frac{2}{1+4m} \right)^2 \left[ S_{i_{nf3}}(\omega) + S_{i_{nr3}}(\omega) + S_{i_{nf4}}(\omega) + S_{i_{nr4}}(\omega) \right] \tag{3.81} \]
From Appendix A, the power spectrum of the current noise in a single transistor
is 2qIDS. For VDM = 0, I1 = I2 = Ib/2, whereas IF = IR = mIb/2 for transistors
M3 and M4. We obtain
\[ S_{i_{dn}}(\omega) = \left( \frac{4m}{1+4m} \right)^2 2qI_b + \left( \frac{1}{1+4m} \right)^2 2qI_b + \left( \frac{2}{1+4m} \right)^2 (2qmI_b + 2qmI_b) = 2qI_b \tag{3.82} \]
As for the case of the single diffusor, the addition of two diffusors between the
sources of the differential pair does not add to the differential output current noise
density.
To compute the output-referred noise power for the differential pair with double
diffusors, substitute equations for Go and Sidn into (3.20) to obtain
\[ V_{out,n}^2 = \frac{3}{2}\, \frac{q V_t}{\kappa C} \tag{3.83} \]
We see that the output-referred noise power is increased by the factor 3/2, which
is due to a similar reduction in Go.
If we set the distortion DG to 1.0%, using numerical techniques one can
show that the maximum input voltage is
\[ V_{max} = 0.795 \frac{V_t}{\kappa} \tag{3.84} \]
For the same circuit conditions as before, we obtain a linear range of ±29.2 mV,
or exactly one-half that of the transconductor with a single diffusor. The current
efficiency of the differential pair with double diffusors is 26.5%, identical to that
of the single diffusor design. In Fig. 3.9 we plot normalized transconductance as a
function of VDM.
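The constant in (3.84) can be reproduced with the same kind of bisection, now applied to (3.76) through (3.78) with m = 0.5. As before, the bracket [0, 3], the assumed monotonic decrease of G/Go on it, and the function names are assumptions of this sketch rather than the method used in the text.

```python
import math

def g_rel_and_eff_double(a, m=0.5):
    """Return (G/G_o, I_DM/I_b) from Eqs. (3.76)-(3.78); a = kappa*V_DM/(2 V_t)."""
    inner = a - math.atanh(math.tanh(a) / (4.0 * m + 1.0))
    k = 16.0 * m * m + 8.0 * m
    ratio = (k * math.cosh(a) ** 2 - 4.0 * m) / (k * math.cosh(a) ** 2 + 1.0)
    g_o = 4.0 * m / (4.0 * m + 1.0)
    return ratio / math.cosh(inner) ** 2 / g_o, math.tanh(inner)

def solve_a_max_double(d_g=0.01, m=0.5):
    """Bisection for the a at which G/G_o drops to 1 - D_G (bracket is assumed)."""
    lo, hi = 0.0, 3.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g_rel_and_eff_double(mid, m)[0] > 1.0 - d_g:
            lo = mid
        else:
            hi = mid
    return lo

a_max = solve_a_max_double()
coeff = 2.0 * a_max                               # V_max = coeff * V_t / kappa
efficiency = g_rel_and_eff_double(a_max)[1]       # fraction of I_b at V_max
# Dynamic range relative to the basic pair: V_max^2 ratio over the 3/2 noise factor.
dr_gain_db = 10.0 * math.log10((coeff / 0.2007) ** 2 / 1.5)
print(round(coeff, 3), round(efficiency, 3), round(dr_gain_db, 1))
```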
The dynamic range for the differential pair with double diffusors can be found
by substituting (3.83) and (3.84) into (3.35), to obtain
\[ DR = 0.211 \frac{V_t C}{\kappa q} \tag{3.85} \]
For a nominal capacitance value of 5.0 pF and other parameters as before, the
dynamic range of this transconductor is 53.8 dB, representing a 10.2 dB increase
over that of the basic differential pair, and only a 3.0 dB decrease compared to
that of the single diffusor.
3.4 Multiple Differential Pairs
Another technique for linearizing the basic transconductor employs a multiplicity of
asymmetric differential pairs (Tanimoto et al., 1991). This technique was originally
applied to bipolar junction transistors (BJTs). Here it is extended to subthreshold
CMOS design, where CMOS device characteristics are similar to those of BJTs.
This technique is enhanced with the proposed use of the substrate terminal to
effectively modify the width-to-length ratio of the transistor.
3.4.1 Two Differential Pairs
A transconductor with two differential pairs is shown in Fig. 3.10. It consists of
two pairs of unequal size transistors and two current sources. The transistor M1a
is m times wider than M2a. Conversely, transistor M2b is m times wider than M1b.
The effect of sizing the transistors in this way is to create an intentional voltage
offset. Note that the same effect could be obtained by the use of floating gate
transistors.
Figure 3.9: For the differential pair with source degeneration via double diffusors, G normalized by the maximal transconductance as a function of $V_{DM}$, with $V_t$ = 25.7 mV and $\kappa$ = 0.7. [Plot: normalized G [S/S] from 0.98 to 1 versus Vdm [mV] from −30 to 30.]
Let $V_{S1}$ be the voltage at the source of the transistor pair M1a–M2a, and $V_{S2}$ be the voltage at the source of the transistor pair M1b–M2b. We assume that all transistors are operating in saturation and that $V_B$, the bulk potential, is at zero volts. For the present transconductor configuration, let the relative transistor ratio be defined as
$$m \equiv \frac{S_{1a}}{S_{2a}} \equiv \frac{S_{2b}}{S_{1b}} \tag{3.86}$$
It is helpful to write the relative transistor ratio as follows:
$$m = e^{\ln m} \tag{3.87}$$
Applying (3.1) to transistors M1a and M2a in Fig. 3.10, one finds that
$$\begin{aligned} I_{1a} &= I_0 S_{2a}\, e^{\ln m}\, e^{\kappa(V_{CM}+V_{DM}/2)/V_t}\, e^{-V_{Na}/V_t} \\ I_{2a} &= I_0 S_{2a}\, e^{\kappa(V_{CM}-V_{DM}/2)/V_t}\, e^{-V_{Na}/V_t} \end{aligned} \tag{3.88}$$
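where $S_{2a}$ is the width-to-length ratio of transistor M2a and $V_{Na}$ is the voltage at the common source node of the pair. From (3.88) and the constraint $I_{1a} + I_{2a} = I_b/2$, we can write an expression for the difference current, $I_{DMa}$, as
$$I_{DMa} = \frac{I_b}{2}\tanh\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right) \tag{3.89}$$
A similar, but complementary, equation can be derived for the transistor pair M1b–M2b.
The total differential current is therefore
$$I_{DM} = \frac{I_b}{2}\tanh\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right) + \frac{I_b}{2}\tanh\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right) \tag{3.90}$$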
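The transconductance as a function of $V_{DM}$ is
$$G = \frac{I_b}{2\cosh^2\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right)}\left(\frac{\kappa}{2V_t}\right) + \frac{I_b}{2\cosh^2\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right)}\left(\frac{\kappa}{2V_t}\right) \tag{3.91}$$
Figure 3.10: A transconductor with two asymmetric differential pairs. [Schematic: pair M1a–M2a sized m:1 and pair M1b–M2b sized 1:m, each biased with 0.5 Ib; inputs V1 and V2, output currents I1 and I2.]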
The relative width-to-length ratio m of the transistor pairs will be used to obtain a maximally flat shape for the transconductance function. Setting the second derivative of G with respect to $V_{DM}$ equal to zero at $V_{DM} = 0$, we find that the only positive root that is greater than one occurs at $m = 2 + \sqrt{3}$. Due to symmetry, a second root occurs at $m = 1/(2 + \sqrt{3})$. These roots are independent of $I_b$, $\kappa$, and $V_t$. In particular, for a value of $\kappa = 1$, the current-to-voltage characteristics of a CMOS transistor operating in the subthreshold saturation region look identical to those of a bipolar transistor operating in its active region. As a consequence, this optimum relative width-to-length ratio holds for both subthreshold CMOS and BJT design.
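The flatness condition can be checked numerically. The following sketch works with the normalized variable $x = \kappa V_{DM}/(2V_t)$ and drops the common factor $I_b\kappa/(4V_t)$, so that the two-pair transconductance reduces to $\mathrm{sech}^2(x+a) + \mathrm{sech}^2(x-a)$ with $a = \tfrac{1}{2}\ln m$. The finite-difference step and the function names are illustrative assumptions.

```python
import math

def g_shape(x, m):
    """Two-pair transconductance shape, sech^2(x + a) + sech^2(x - a).

    x stands for kappa*V_DM/(2*V_t) and a = ln(m)/2; the common scale
    factor I_b*kappa/(4*V_t) is omitted because it does not move the root.
    """
    a = 0.5 * math.log(m)
    return 1.0 / math.cosh(x + a) ** 2 + 1.0 / math.cosh(x - a) ** 2

def curvature_at_zero(m, h=1e-3):
    """Central finite-difference estimate of d^2G/dx^2 at x = 0."""
    return (g_shape(h, m) - 2.0 * g_shape(0.0, m) + g_shape(-h, m)) / h ** 2

curv_opt = curvature_at_zero(2.0 + math.sqrt(3.0))   # claimed optimum
curv_unit = curvature_at_zero(1.0)                   # ordinary differential pair
curv_four = curvature_at_zero(4.0)                   # rounded ratio

print(curv_opt, curv_unit, curv_four)
```

The curvature is negative for m = 1 (a single peak), essentially zero at $m = 2+\sqrt{3}$, and slightly positive for m = 4, which indicates a shallow dip at the origin once the optimum ratio is exceeded.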
The nominal value for G occurs at $V_{DM}$ equal to zero. Thus,
$$G_o = I_b\,\frac{\kappa}{2V_t}\left(\frac{4}{m + 2 + 1/m}\right) = I_b\,\frac{\kappa}{2V_t}\left(\frac{2}{3}\right) \tag{3.92}$$
It is not obvious, but, in fact, the transconductor with two asymmetric differential pairs with $m = 2 + \sqrt{3}$ has the same voltage-to-current relationship as that of the transconductor with double diffusors with m = 0.5. We omit the proof for brevity. However, the key to showing this identity is found by making use of the hyperbolic identity:
$$\tanh(a \pm b) = \frac{\tanh(a) \pm \tanh(b)}{1 \pm \tanh(a)\cdot\tanh(b)} \tag{3.93}$$
The output current noise of a transconductor with asymmetric differential pairs is essentially that due to the bias current (Tanimoto et al., 1991). A laborious small-signal noise model can be used to verify their result. As such, the equations for input-referred noise, output noise power, linear range, current efficiency, and dynamic range are the same as those for the double diffusors.
In particular, allowing 1.0% distortion, one can show using numerical techniques that the maximum input voltage is
$$V_{max} = 0.795\,\frac{V_t}{\kappa} \tag{3.94}$$
The linear range is ±29.2 mV, identical to that of the transconductor with double diffusors. The current efficiency is 26.5%, identical to that of the single and double diffusor designs. In Fig. 3.11 we plot normalized transconductance as a function of $V_{DM}$. It is indistinguishable from that shown in Fig. 3.9.
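The numerical step that yields the constant 0.795 can be sketched as follows. The sketch assumes that the distortion measure $D_G$ is the relative deviation of G from its nominal value $G_o$, so that the linear range ends where $G/G_o$ first departs from unity by 1%; that definition is given elsewhere in the dissertation and is taken here as an assumption. The search routine and its names are illustrative.

```python
import math

def g_shape(x, m):
    """sech^2(x + a) + sech^2(x - a) with a = ln(m)/2 and x = kappa*V_DM/(2*V_t)."""
    a = 0.5 * math.log(m)
    return 1.0 / math.cosh(x + a) ** 2 + 1.0 / math.cosh(x - a) ** 2

def linear_range_coeff(m, tol=0.01, step=1e-3):
    """Return c such that V_max = c * V_t / kappa for a relative deviation tol.

    Marches outward from x = 0 to bracket the first crossing of the
    tolerance, then refines the bracket by bisection. Since V_DM equals
    2*x*V_t/kappa, the coefficient is twice the crossing point.
    """
    g0 = g_shape(0.0, m)

    def excess(x):
        return abs(g_shape(x, m) / g0 - 1.0) - tol

    hi = step
    while excess(hi) < 0.0:
        hi += step
    lo = hi - step
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 * hi

coeff = linear_range_coeff(2.0 + math.sqrt(3.0))
v_max_mv = coeff * 25.7 / 0.7   # V_t = 25.7 mV, kappa = 0.7
print(f"coefficient {coeff:.3f}, linear range +/-{v_max_mv:.1f} mV")
```

Under the stated assumption the search reproduces both the coefficient and the ±29.2 mV linear range to the quoted precision.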
As for the double diffusor design, the dynamic range of the transconductor with two asymmetric differential pairs is
$$DR = 0.211\,\frac{V_t C}{\kappa q} \tag{3.95}$$
For a nominal capacitance value of 5.0 pF and other parameters as before, the dynamic range of this transconductor is 53.8 dB.
Similar to the double diffusor transconductor, no common-mode circuitry is required for this transconductor. A minor advantage of the double diffusor design over that of the two asymmetric differential pairs is that the optimum relative width-to-length ratio is very simple in the former case, m = 0.5, and quite complicated in the latter case, $m = 2 + \sqrt{3} = 3.73$. Indeed, a relative width-to-length ratio of 4 is adopted in many practical designs which use two asymmetric differential pairs.
3.4.2 Three Differential Pairs
It is possible to have more than two asymmetric differential pairs in order to increase the linear range and current efficiency (Tanimoto et al., 1991). As in the previous design, we find that the optimal width-to-length ratios and current source ratios are the same for bipolar circuits as for subthreshold CMOS, despite the presence of $\kappa$. In Fig. 3.12 we show the circuit for three asymmetric differential pairs.
Figure 3.11: For a transconductor with two asymmetric differential pairs, G normalized by the maximal transconductance as a function of $V_{DM}$, with $V_t$ = 25.7 mV and $\kappa$ = 0.7. [Plot: normalized G [S/S] from 0.98 to 1 versus Vdm [mV] from −30 to 30.]
Using the results from the case of two asymmetric differential pairs, we can write an equation for the total differential current as
$$I_{DM} = \alpha I_b \tanh\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right) + (1-2\alpha) I_b \tanh\left(\frac{\kappa V_{DM}}{2V_t}\right) + \alpha I_b \tanh\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right) \tag{3.96}$$
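where the relative width-to-length ratio is now defined as
$$m \equiv \frac{S_{1a}}{S_{2a}} \equiv \frac{S_{2c}}{S_{1c}} \tag{3.97}$$
The transconductance as a function of $V_{DM}$ is therefore
$$G = \frac{\alpha I_b}{\cosh^2\left(\frac{\kappa V_{DM}}{2V_t} + \frac{\ln m}{2}\right)}\left(\frac{\kappa}{2V_t}\right) + \frac{(1-2\alpha) I_b}{\cosh^2\left(\frac{\kappa V_{DM}}{2V_t}\right)}\left(\frac{\kappa}{2V_t}\right) + \frac{\alpha I_b}{\cosh^2\left(\frac{\kappa V_{DM}}{2V_t} - \frac{\ln m}{2}\right)}\left(\frac{\kappa}{2V_t}\right) \tag{3.98}$$
The relative width-to-length ratio m of two of the transistor pairs and the relative strength of the two sets of bias currents, $\alpha$, are the two parameters available to modify the shape of the transconductance function.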
As before, we follow the maximal flatness criterion for optimizing the linear range. Setting the second and fourth derivatives of G simultaneously equal to zero at $V_{DM} = 0$, we find that the only positive root of m that is greater than one occurs at $m = 4 + \sqrt{15}$. The only positive root of $\alpha$ occurs at $\alpha = 25/66$. These roots are independent of $I_b$, $\kappa$, and $V_t$.
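Both flatness conditions can be verified with closed-form derivatives of $\mathrm{sech}^2$. Writing $t = \tanh u$, the second derivative of $\mathrm{sech}^2 u$ is $(6t^2 - 2)(1 - t^2)$ and the fourth is $(16 - 120t^2 + 120t^4)(1 - t^2)$. The sketch below uses the normalized variable $x = \kappa V_{DM}/(2V_t)$ and omits the scale factor $I_b\kappa/(2V_t)$; the function names are illustrative.

```python
import math

def sech2_d2(u):
    """Second derivative of sech^2(u), expressed through t = tanh(u)."""
    t2 = math.tanh(u) ** 2
    return (6.0 * t2 - 2.0) * (1.0 - t2)

def sech2_d4(u):
    """Fourth derivative of sech^2(u), expressed through t = tanh(u)."""
    t2 = math.tanh(u) ** 2
    return (16.0 - 120.0 * t2 + 120.0 * t2 ** 2) * (1.0 - t2)

def flatness_residuals(m, alpha):
    """Second and fourth derivatives of the three-pair G at V_DM = 0.

    The two outer pairs (offsets +a and -a, a = ln(m)/2) contribute equally
    because even-order derivatives of sech^2 are even functions.
    """
    a = 0.5 * math.log(m)
    d2 = 2.0 * alpha * sech2_d2(a) + (1.0 - 2.0 * alpha) * sech2_d2(0.0)
    d4 = 2.0 * alpha * sech2_d4(a) + (1.0 - 2.0 * alpha) * sech2_d4(0.0)
    return d2, d4

M_OPT = 4.0 + math.sqrt(15.0)
ALPHA_OPT = 25.0 / 66.0

d2, d4 = flatness_residuals(M_OPT, ALPHA_OPT)

# Nominal transconductance relative to I_b*kappa/(2*V_t).
g0_factor = (2.0 * ALPHA_OPT / math.cosh(0.5 * math.log(M_OPT)) ** 2
             + (1.0 - 2.0 * ALPHA_OPT))
print(d2, d4, g0_factor)
```

Both residuals vanish to rounding error, and the same parameters give a nominal transconductance factor of 6/11.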
The nominal value of G occurs at $V_{DM}$ equal to zero. Thus,
$$G_o = I_b\,\frac{\kappa}{2V_t}\left(\frac{8\alpha}{m + 2 + 1/m} + 1 - 2\alpha\right) = \left(\frac{6}{11}\right) I_b\,\frac{\kappa}{2V_t} \tag{3.99}$$
Assuming that the differential output current noise is essentially due to the bias current (Tanimoto et al., 1991), the output-referred noise can be written as
$$V^2_{out,n} = \left(\frac{11}{6}\right)\frac{qV_t}{\kappa C} \tag{3.100}$$
where the factor 11/6 is due to a similar reduction in $G_o$.
If we set the distortion $D_G$ to 1.0%, as before, and use numerical techniques to determine the constant, we can write the expression for the linear range as
$$V_{max} = 1.34\,\frac{V_t}{\kappa} \tag{3.101}$$
For the same circuit conditions as before, we obtain a linear range of ±49.1 mV, which is 67% higher than the linear range of the transconductor with two asymmetric differential pairs. The current efficiency of the transconductor using three asymmetric differential pairs is 36.4%. In Fig. 3.13 we plot G as a function of $V_{DM}$.
To compute the dynamic range for the case of three asymmetric differential pairs, substitute (3.100) and (3.101) into (3.35) to obtain
$$DR = 0.486\,\frac{V_t C}{\kappa q} \tag{3.102}$$
For a nominal capacitance value of 5.0 pF and other parameters as before, the dynamic range of this transconductor is 57.5 dB, representing a 3.7 dB increase over two asymmetric differential pairs, and a 0.7 dB increase compared to that of the single diffusor.
The transconductor built using three asymmetric differential pairs seems to have every possible advantage. The dynamic range is very high and it does not require common-mode circuitry. However, the matching requirements now become increasingly severe. Not only must the differential pairs maintain a precise relative aspect ratio $m = 4 + \sqrt{15} = 7.87$, but also the current sources need a relative sizing of $\alpha$-to-$(1 - 2\alpha)$, or 1.56-to-1.
Figure 3.12: A transconductor with three asymmetric differential pairs. [Schematic: pair M1a–M2a sized m:1 and biased with αIb, pair M1b–M2b sized 1:1 and biased with (1−2α)Ib, pair M1c–M2c sized 1:m and biased with αIb; inputs V1 and V2, output currents I1 and I2.]
Figure 3.13: For a transconductor with three asymmetric differential pairs, G normalized by the maximal transconductance as a function of $V_{DM}$, with $V_t$ = 25.7 mV and $\kappa$ = 0.7. [Plot: normalized G [S/S] from 0.98 to 1 versus Vdm [mV] from −60 to 60.]
3.4.3 Substrate Biasing Technique
The CMOS transistor is a symmetric four-terminal device, which has effectively two voltage-control nodes, the gate and the substrate. The use of the substrate as a control node is not common, although it has recently found increasing use (Cohen and Andreou, 1992; Sarpeshkar et al., 1996). It seems that if a twin-tub process became available through low-cost foundry services, the substrate would be more fully exploited as a second input terminal.
The equation for the current through an NMOS device in saturation can be written in the form (see Appendix A),
$$I_{DS} = I_0 S\, e^{(1-\kappa)V_B/V_t}\, e^{(\kappa V_G - V_S)/V_t} \tag{3.103}$$
Note that both the width-to-length ratio S and the bulk voltage $V_B$ can be used to modulate the output current, independent of $V_G$ and $V_S$. If we define an effective width-to-length ratio as
$$S_{eff} = S\, e^{(1-\kappa)V_B/V_t} \tag{3.104}$$
then it is possible to continuously adjust $S_{eff}$ via the bulk potential. Note that the interface between the source and the bulk is a reverse-biased p-n junction. Therefore we do not want to increase $V_B$ relative to $V_S$. Rather, we must ensure that $V_B$ is always less than or equal to the minimum expected source voltage. In general, we are limited then to reducing the effective width-to-length ratio using the substrate biasing technique.
As a sample design, consider Fig. 3.14. We would like to achieve the scaling ratio of $(2+\sqrt{3})$-to-1, which results in a maximally flat transconductance function for the case of two asymmetric differential pairs. At the mask level, we give all four transistors the same aspect ratio S. We then reduce the bulk potential $V_{B2}$ of the two inside transistors (M2a and M1b) relative to the bulk potential $V_{B1}$ of the two outside transistors (M1a and M2b) such that
$$m_{eff} \equiv \frac{S_{eff,1a}}{S_{eff,2a}} = 2 + \sqrt{3} \tag{3.105}$$
Substituting (3.104) for each transistor in the above equation, we have
$$m_{eff} = \frac{S\, e^{(1-\kappa)V_{B1}/V_t}}{S\, e^{(1-\kappa)V_{B2}/V_t}} = e^{(1-\kappa)(V_{B1}-V_{B2})/V_t} = 2 + \sqrt{3} \tag{3.106}$$
For nominal parameter values of $V_t$ = 25.7 mV and $\kappa$ = 0.7, we find $V_{B1} - V_{B2}$ = 113 mV, i.e., we need to lower $V_{B2}$ by 113 mV with respect to $V_{B1}$.
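The bulk-voltage offset follows from taking the logarithm of the effective-ratio condition, giving $V_{B1} - V_{B2} = V_t \ln(m_{eff})/(1-\kappa)$. A minimal sketch of that arithmetic is shown below; the function name and the second value of $\kappa$ are illustrative assumptions chosen only to show the sensitivity of the result.

```python
import math

def bulk_offset_volts(m_eff, v_t=25.7e-3, kappa=0.7):
    """Solve exp((1 - kappa) * dV / V_t) = m_eff for dV = V_B1 - V_B2."""
    return v_t * math.log(m_eff) / (1.0 - kappa)

M_TARGET = 2.0 + math.sqrt(3.0)

dv_nominal = bulk_offset_volts(M_TARGET)               # kappa = 0.7
dv_low_kappa = bulk_offset_volts(M_TARGET, kappa=0.65) # hypothetical process shift

print(f"nominal offset: {dv_nominal * 1e3:.1f} mV")
print(f"kappa = 0.65:   {dv_low_kappa * 1e3:.1f} mV")
```

A change in $\kappa$ of 0.05 moves the required offset by roughly 16 mV, which illustrates why the bulk voltages cannot be fixed without knowledge of the process parameters.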
One notable drawback to this technique is that the optimal value of $V_{B1} - V_{B2}$ now depends on $\kappa$ and $V_t$. Therefore, these parameters either must be known a priori and be held constant, or a circuit is needed that continually adapts the bulk voltages to their desired values. Clearly, further research into this method is warranted.
3.5 Hints on Improved Transconductor Design
3.5.1 Use of the Gate Capacitance
In the CMOS process available through foundry services, the gate-oxide thickness is approximately half that of the capacitor-oxide thickness. As such, denser designs can be achieved through the use of the input gate as the integrating capacitance. In addition, by using large-area devices, flicker noise, which we have ignored for the most part, becomes less noticeable.
One possible drawback to the use of input gates as the integrating capacitance
is that the gate capacitance is weakly nonlinear. It may be, though, that by using
a constant gm input stage this nonlinearity will be canceled. More research is
necessary in this area.
Figure 3.14: A transconductor with two asymmetric differential pairs demonstrating the substrate biasing technique to achieve a maximally flat transconductance function. [Schematic: both pairs sized 1:1 and biased with 0.5 Ib; bulk potentials VB1 (outside transistors) and VB2 (inside transistors).]
3.5.2 Voltage-Splitting
One technique to double the linear range of a given transconductor design is known as voltage-splitting (Torrance et al., 1985). Its merits are that it is relatively simple to implement and that it automatically computes the common-mode input voltage. As such, this technique can be easily incorporated into a differential-input differential-output design with only the addition of a common-mode feedback circuit from the input of one stage to the output of the preceding stage. The common-mode feedback circuit adds to the overall power dissipation, and so needs to be carefully considered in terms of the overall improvement in dynamic range.
One good transconductor design that uses this technique is found in (Silva-Martinez et al., 1990). In their design the MOS transistors operate above threshold. Similar in topology, but different in operation, is the circuit shown in Fig. 3.15, in which all the transistors operate in subthreshold. In this figure, the voltage-splitting technique is applied to the optimum transconductor design using double diffusors. In simulation and analysis, we achieve a factor of two improvement in linear range.
We fabricated a ten-channel version of the Liu filter bank model described in Chapter 4 with the transconductor design of Fig. 3.15. Rather than applying common-mode feedback, we chose a single-ended output using a current mirror. Preliminary measurements indicate an improved linear range using this technique.
We have not performed a detailed noise analysis. However, from the filter structure, it appears that the noise will be greater than $2qI_b$ since a fraction of the bias current is consumed in computing the common-mode voltage. Another potential difficulty with this design is that the matching requirements now extend to the four current sources shown in Fig. 3.15, the six differential input transistors, as well as the four diffusors.
Figure 3.15: Application of the voltage-splitting technique to the transconductor with source degeneration via double diffusors.
3.5.3 Class-AB Operation
It is noted in (Groenewold, 1991) that class-AB transconductors have the lowest noise factor since there is no extra bias element. In particular, the output-referred noise current density is doubled, whereas the nominal transconductance is also doubled by the effect of the push-pull configuration. Referred to the input voltage, then, the noise factor is unchanged. On the other hand, if the transconductor is class-B, the output-referred noise current density is again doubled, but the nominal transconductance is unchanged. As a result, the noise factor is doubled.
Therefore, a low-noise transconductor design would, at least in theory, be complementary. In order for the complementary design to function well, it is also required that the parameters for the native and complementary devices be well matched. In a simulation with well-matched nMOS and pMOS transistors, we find that a class-AB complementary version of the transconductor with double diffusors operates with the desired result. This circuit is depicted in Fig. 3.16. In a fully-integrated implementation of this circuit, there is the possibility of a systematic difference in $\kappa$ between native and complementary devices, which would limit the achievable dynamic range. This difference would be largely due to differences in the dopant concentration between the bulk of the native device (the substrate), and that of the complementary device (the well).
3.6 Experimental Results
To date, two experiments have been made to verify the analysis and simulation of
the transconductors presented in this chapter.
3.6.1 Static Measurements
Figure 3.16: Fully complementary transconductor design based on source degeneration via double diffusors.
Static measurements of the basic differential pair, the differential pair with source degeneration using single diffusors, and the differential pair with source degeneration using double diffusors were made using ADC488 and DAC488 analog-to-digital and digital-to-analog converter modules under control of a PC-compatible computer. All transistors had a width of 2000 μm and a length of 20 μm. Chips were fabricated in a standard 2 μm n-well process. Relative transistor aspect ratios of 2 and 0.5 were achieved by combining transistors in parallel and in series, respectively.
The output voltages of the DAC488 were attenuated and lowpass filtered at a very low cutoff frequency before being presented to the differential inputs of the transconductors. The output currents were converted to voltages using low-value resistors and a buffer/amplifier. These voltages, in turn, were digitized by the ADC488. Care was taken to ensure that the common-mode voltage remained unchanged in these experiments. In Fig. 3.17, the differential output current data is plotted as a function of differential input voltage.
In software, a finite difference method was used to compute the normalized transconductance function. The output current data was first smoothed, and then differentiated with respect to the (known) differential input voltage. The DC offset and maximum transconductance values were estimated using a least-mean-squares fit of a subset of the data to a parabolic function. As such, in the plot of Fig. 3.18, the transconductance function is normalized by the maximum value, and the DC offset is removed from the data.
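The post-processing steps described above (smooth, differentiate, fit a parabola near the peak) can be sketched as follows. This is not the software used for the measurements; it is a minimal illustration run on synthetic data, assuming a basic differential pair with a 3 mV input offset, a boxcar smoother, and a fit window of ±15 mV. Voltages are in mV and currents in µA so that the normal equations stay well conditioned. All names are hypothetical.

```python
import math

def moving_average(y, half_width):
    """Boxcar smoothing; the window shrinks near the ends of the record."""
    out = []
    for i in range(len(y)):
        lo, hi = max(0, i - half_width), min(len(y), i + half_width + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out

def central_difference(x, y):
    """dy/dx at the interior points using a two-sided finite difference."""
    g = [(y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1])
         for i in range(1, len(x) - 1)]
    return x[1:-1], g

def fit_parabola(x, y):
    """Least-squares fit y ~ c0 + c1*x + c2*x^2 via the 3x3 normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):                       # Gauss-Jordan, partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

# Synthetic stand-in for measured data (units: mV and uA).
V_T_MV, KAPPA, I_B_UA, V_OFF_MV = 25.7, 0.7, 1.0, 3.0
v_dm = [-60.0 + 0.5 * k for k in range(241)]
i_dm = [I_B_UA * math.tanh(KAPPA * (v - V_OFF_MV) / (2.0 * V_T_MV)) for v in v_dm]

v_mid, g = central_difference(v_dm, moving_average(i_dm, 2))
window = [(v, gi) for v, gi in zip(v_mid, g) if abs(v) < 15.0]
c0, c1, c2 = fit_parabola([v for v, _ in window], [gi for _, gi in window])

v_offset_est = -c1 / (2.0 * c2)              # vertex position: DC offset [mV]
g_max_est = c0 - c1 ** 2 / (4.0 * c2)        # vertex height: peak G [uA/mV]
g_normalized = [gi / g_max_est for gi in g]
v_centered = [v - v_offset_est for v in v_mid]
print(f"offset {v_offset_est:.2f} mV, peak G {g_max_est:.5f} uA/mV")
```

On this noiseless input the estimated offset and peak transconductance land close to the values used to generate the data. With real measurements the smoothing width and the fit window would need to be chosen against the noise level.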
From Figs. 3.17 and 3.18 it is evident that an improvement in linear range is obtained for both the single and double diffusor transconductors, as compared to the basic differential pair. On the other hand, this improvement is not as high as one would hope for. One possible explanation is poor device matching.
Also note from Fig. 3.18 that the computed normalized transconductance function for the basic differential pair is essentially "noise-free," unlike those for the two transconductors with source degeneration. The latter designs exhibit somewhat strange peaks and valleys in their transconductance function. One possible explanation for these extraneous peaks and valleys is poor device matching.
Figure 3.17: Experimental data of the differential output current as a function of input voltage for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Each dot represents a sample point. [Plot title: Differential Pairs (2000um x 20um); Idm [uA] from −2 to 2 versus Vdm [mV] from −100 to 100.]
Figure 3.18: Normalized transconductance as a function of $V_{DM}$ computed from experimental data for (a) the basic differential pair, (b) the differential pair with source degeneration via double diffusors, and (c) the differential pair with source degeneration via a single diffusor. Solid lines show the predicted values. [Plot title: Normalized Transconductance (2000um x 20um); Gm/GmMax [S/S] from 0.97 to 1 versus Vdm from −60 to 60.]
3.6.2 Dynamic Measurements
The transconductance functions of the basic differential pair and the differential pair with a single diffusor were measured using an SR850 lock-in amplifier. All transistors had a width of 1377.6 μm and a length of 4.8 μm. Chips were fabricated in a standard 1.2 μm n-well process. We varied $V_{GC}$ at the gate of the diffusor in Fig. 3.6 in order to control the "effective" width-to-length ratio of the diffusor.
Two voltage signals originating from the SR850 were attenuated and summed externally with a summing amplifier. The complement to the input signal was also computed using a unity-gain inverting amplifier. The voltage signal seen across the differential inputs consisted of a 1 mV peak-to-peak sine wave at a frequency of 100 Hz superimposed upon a DC bias in the range of ±200 mV. The output currents were converted to a voltage through low-valued resistors, buffered, and then AC coupled to the input of the lock-in amplifier.
Fig. 3.19 shows the experimental data. The linear range of the basic differential pair is measured as 18.6 mV, while that of the transconductor with a single diffusor is 133.6 mV. The improvement in linear range is approximately seven times, whereas the expected improvement from first-order analysis was eight. This discrepancy may be attributed to such non-idealities as a nonzero drain-to-source conductance and variations in $\kappa$ as a function of bulk-to-source potential.
3.6.3 Summary of Results
We summarize the analytical results of this chapter with several tables which allow
easy comparison of several transconductor designs.
Figure 3.19: Experimental data of the normalized transconductance as a function of $V_{DM}$ for (a) the basic differential pair and (b) the differential pair with source degeneration via a single diffusor. [Plot: G normalized to Gmax from 0.98 to 1 versus Vdm [mV] from −75 to 75.]
For a fixed bias current $I_b$, the highest nominal transconductance $G_o$ is obtained by the basic differential pair. The linearization techniques discussed in this chapter effectively trade off transconductance for linear range, i.e., the transconductance value is decreased while the linear range is increased. Indeed, the author doubts that it is possible to increase the linear range while not adversely affecting the nominal transconductance value. Moreover, except for the differential pair with source degeneration via diode-connected transistors, the output current noise density for each of the linearization techniques is essentially constant at $2qI_b$.
For a constant corner frequency ($G_o/C$), and constant integrating capacitance C = 5 pF, we see in Table 3.1 the "cost" in terms of increased power consumption relative to that of the basic differential pair, and also the expected "benefit" in terms of improved dynamic range for each of the transconductors described in this chapter. The increased power consumption comes from the need to set $G_o$ equal to a constant for each of the transconductor designs. Since $G_o \propto I_b$, constant $G_o$ is achieved by increasing the power consumption of the linearized transconductors as compared to the basic differential pair.
Table 3.1: Summary of Linearization Techniques with Constant ($G_o/C$), C = 5 pF ($V_t$ = 25.7 mV, $\kappa$ = 0.7).
Technique                 Relative Ib    Dynamic Range at DG = 1%
Differential Pair         1              43.6 dB
Diode-Connected           2.43           50.4 dB
Single Diffusor           3              56.8 dB
Double Diffusors          1.5            53.8 dB
Two Asymmetric Pairs      1.5            53.8 dB
Three Asymmetric Pairs    1.83           57.5 dB
Another way of viewing the relative merit of each transconductor topology is to again fix the corner frequency ($G_o/C$), but this time to also fix the power consumption, $I_b$. Since fixing $I_b$ effectively fixes $G_o$, it is necessary to decrease the integrating capacitance for each of the linearized transconductors in order to maintain a constant corner frequency. Thus, below we see not only the "benefit" of each linearizing technique in terms of increased dynamic range, but also the "benefit" of reduced integrating capacitance.
Table 3.2: Summary of Linearization Techniques with Constant ($G_o/C$), $I_b$ ($V_t$ = 25.7 mV, $\kappa$ = 0.7).
Technique                 Integrating C [pF]   Dynamic Range at DG = 1%
Differential Pair         5                    43.6 dB
Diode-Connected           2.06                 46.5 dB
Single Diffusor           1.67                 52.1 dB
Double Diffusors          3.33                 52.1 dB
Two Asymmetric Pairs      3.33                 52.1 dB
Three Asymmetric Pairs    2.73                 54.8 dB
From the results presented above, it is clear that various tradeoffs exist between design criteria. For a fixed filter cutoff frequency ($G_o/C$) and fixed transconductor design, as the capacitance increases, so does the power consumption, since $I_b \propto G_o$. And as the integrating capacitance increases, so does the dynamic range, in a linear fashion. Thus, for a particular distortion-limited transconductor design, increases in dynamic range come only at a cost of increased power consumption and increased area.
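Because dynamic range scales linearly with C, and C must shrink by the same factor by which $G_o$ falls at fixed $I_b$, the fixed-bias figures can be derived from the fixed-capacitance ones. The sketch below takes the relative bias currents and dynamic ranges of Table 3.1 as inputs and applies that scaling; the dictionary layout and function name are illustrative. The results agree with Table 3.2 to within about 0.1 dB, the residual being rounding in the tabulated inputs.

```python
import math

# (relative I_b for equal G_o, dynamic range in dB at C = 5 pF), from Table 3.1
TABLE_3_1 = {
    "Differential Pair": (1.0, 43.6),
    "Diode-Connected": (2.43, 50.4),
    "Single Diffusor": (3.0, 56.8),
    "Double Diffusors": (1.5, 53.8),
    "Two Asymmetric Pairs": (1.5, 53.8),
    "Three Asymmetric Pairs": (1.83, 57.5),
}

def fixed_bias_row(rel_ib, dr_db, c_ref_pf=5.0):
    """Rescale one row to the fixed-I_b case.

    At fixed I_b the nominal transconductance is lower by rel_ib, so C is
    reduced by the same factor to hold G_o/C constant; DR is proportional
    to C and therefore drops by 10*log10(rel_ib) dB.
    """
    return c_ref_pf / rel_ib, dr_db - 10.0 * math.log10(rel_ib)

table_fixed_ib = {name: fixed_bias_row(*row) for name, row in TABLE_3_1.items()}

for name, (c_pf, dr_db) in table_fixed_ib.items():
    print(f"{name:24s} {c_pf:5.2f} pF  {dr_db:5.1f} dB")
```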
Choosing one linearization technique over another can have the hidden cost of
more complex circuit design, increased area, and increased sensitivity to device
mismatch. All of these secondary design criteria must be accounted for in the
selection of a particular transconductor.
One question that the tables above do not answer is the following: given a particular cutoff frequency ($G_o/C$) and either a constant integrating capacitance or constant bias current, what is the highest achievable dynamic range of any distortion-limited transconductance-C integrator topology?
In this chapter, we have introduced several new transconductor topologies to subthreshold CMOS design. Our intention is to apply these new designs in continuous-time filter architectures, such as the one described in the following chapter. Optimizing the dynamic range of the transconductor is the first step toward optimization of the entire filter structure.
Chapter 4
The Multi-Resolution Filter Bank Model
The efficiency and performance of any information processing system, both hardware and software, can be improved by incorporating prior knowledge in the design phase. Indeed, this is the keystone for success in all statistical speech recognition systems (Roe and Wilpon, 1994).
The ultimate goal of our research is the efficient extraction of information from sensory signals (auditory and visual) by hardware systems. As a tentative measure of goodness, we propose the following optimizing criterion: maximize the number of bits/sec/Watt, or bits/Joule. A high information rate is the desired outcome of the processing, while the power consumption represents the cost. Other cost measures are possible, such as size and weight, but very often they can be related to the overall power consumption.
Biological systems serve as working models of sensory information processing as they seem to achieve a very high information rate at very low levels of power consumption. Therefore, we stand to learn by abstracting from known functions and organizations of information processing in biological systems when we attempt to solve similar problems using VLSI. To date, several biologically-inspired VLSI systems for vision and audition have emerged from such an undertaking, including the silicon retina and the silicon cochlea (Mead, 1989; Andreou and Boahen, 1994; Liu et al., 1992b).
In this chapter, we present an analysis and design strategy for hardware cochlear filter bank models, addressing issues both at the architectural and circuit levels. Total power dissipation is a prime engineering constraint and, as such, this work finds applications in the areas of portable speech-recognition equipment, hearing aids, and cochlear implants.
4.1 Filter Bank Architecture
Prior knowledge is sometimes referred to as the "model" and represents the structure of the information processing system. A parametric description with a minimal number of parameters is desirable. A static, linear filter bank model of the basilar membrane in the cochlea has been proposed by Liu (Liu, 1992; Liu et al., 1993). A block diagram of the architecture is found in Fig. 4.1. The design is a result of the effective bandwidth concept and reproduces faithfully the results from hydrodynamic simulations of a one-dimensional transmission-line model of cochlear fluid mechanics (Allen, 1985). The filter bank structure has only four tuning parameters yet is flexible enough so that an appropriate set of parameters can be found to fit the neurophysiological data. In particular, the filter bank is tuned so that the model output closely resembles the auditory fibers' instantaneous firing rates (IFR) in response to steady-state sinusoids and tone bursts (Liu, 1992). To do so, it has been found that two second-order sections are necessary (Liu, 1992; Liu et al., 1992a; Liu et al., 1993) instead of one (Liu et al., 1992b).
However, the performance of any fixed-model based system will inevitably degrade if the operating environment does not match the environment under which the system was originally designed. For speech communication, the variability in the acoustic environment, variability between individual speakers, and variability in the hardware components (mismatch) necessitate the possibility of adaptation.
Figure 4.1: Block diagram of Liu's N-channel basilar membrane model consisting of a cascade of N lowpass sections with taps to two bandpass filters per output channel (Liu, 1992).
Two software systems which successfully adapt in order to counter variability are found in the area of speech recognition. Neti (Neti, 1994) used a software model of the basilar membrane proposed by Liu (Liu, 1992), followed by temporal feature extraction proposed by Yang (Yang et al., 1992), as a front-end for a large-vocabulary isolated-word speech recognition experiment. By adjusting the four tuning parameters of the basilar membrane in response to changes in the level of additive babble noise, Neti reported more than a 50% decrease in word error rate at moderate signal-to-noise ratios as compared with the more conventional acoustic processing scheme. More recently, Kamm et al. (Kamm et al., 1995) describe a system which attempts to adjust one filter bank parameter from the estimated vocal-tract length of the individual speaker. They reported an overall improvement in the word error rate of 5% for continuous-word speaker-independent telephone speech. Thus, although adaptation is not treated in this work, it is understood by the author that this topic must be addressed in a final design.
The filter bank can be viewed as a single-input, multiple-output system. With N output nodes, the transfer function from the input node to output n is given by:
$$H(n, s) = \left(\frac{\dfrac{s}{Q_3(n)\,\omega_c(n)}}{1 + \dfrac{s}{Q_3(n)\,\omega_c(n)} + \dfrac{s^2}{\omega_c(n)^2}}\right)^{2} \prod_{i=1}^{n}\left(\frac{1}{1 + \dfrac{s}{\omega_c(i)}}\right) \tag{4.1}$$
where $\omega_c(i)$ is the cutoff frequency of the lowpass filter and center frequency of the bandpass filters for the ith section, and $Q_3(i)$ is the 3-dB quality factor of the bandpass filters for the ith section. The four filter tuning parameters are the center frequency range, $\omega_c(1)$–$\omega_c(N)$, and the quality factor range, $Q_3(1)$–$Q_3(N)$.
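The channel transfer function can be evaluated directly on the $j\omega$ axis. The sketch below is an illustration only: the geometric interpolation of $\omega_c(i)$ and $Q_3(i)$ between their end-point values, the ten-channel size, and the numerical end points are all assumptions made for the example, since the text specifies only that the end points are the tuning parameters.

```python
import math

def geometric_taper(first, last, n_sections):
    """Hypothetical interpolation: geometric spacing between two end points."""
    ratio = (last / first) ** (1.0 / (n_sections - 1))
    return [first * ratio ** i for i in range(n_sections)]

def channel_response(n, s, w_c, q3):
    """H(n, s): n cascaded first-order lowpass sections followed by two
    identical second-order bandpass sections tuned to section n (1-based)."""
    lowpass = 1.0 + 0.0j
    for i in range(n):
        lowpass *= 1.0 / (1.0 + s / w_c[i])
    x = s / (q3[n - 1] * w_c[n - 1])
    bandpass = x / (1.0 + x + (s / w_c[n - 1]) ** 2)
    return bandpass ** 2 * lowpass

# Assumed example tuning: 8 kHz down to 100 Hz, Q3 from 1.0 to 0.5, N = 10.
N = 10
w_c = geometric_taper(2.0 * math.pi * 8000.0, 2.0 * math.pi * 100.0, N)
q3 = geometric_taper(1.0, 0.5, N)

gain_ch1 = abs(channel_response(1, 1j * w_c[0], w_c, q3))
gain_ch5 = abs(channel_response(5, 1j * w_c[4], w_c, q3))
print(gain_ch1, gain_ch5)
```

At $s = j\omega_c(n)$ each bandpass section has unity gain, so the channel gain at its own center frequency is set entirely by the lowpass cascade; for the first channel this is $1/\sqrt{2}$.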
4.2 RLC Proto-Type Filters
To achieve robust overall characteristics, the filter building block designs were based on RC and RLC proto-types. One of their desirable properties is the insensitivity of the peak response to component values, i.e., the peak gain is always unity. The second is a consequence of the first: in a cascade of such filters, noise from previous filters is never amplified by successive filters. Indeed, the noise level increases only linearly with the number of stages. Finally, the particular second-order section chosen is the optimum design for low-Q, wide-frequency-range filters (Nevarez-Lozano and Sanchez-Sinencio, 1991).
4.2.1 RC Proto-Type Lowpass
The first-order RC proto-type lowpass filter has a single pole on the negative real axis in the s-plane. The RC proto-type is shown in Fig. 4.2(a). Its transfer function is
$$H_{LP}(\omega) = \frac{1}{1 + \dfrac{j\omega}{\omega_c}} \tag{4.2}$$
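where $\omega_c$ is the corner frequency. Its square magnitude and group delay are
$$\begin{aligned} |H_{LP}(\omega)|^2 &= \frac{1}{1 + \dfrac{\omega^2}{\omega_c^2}} \\ \Delta t_{LP}(\omega) &= \frac{\dfrac{1}{\omega_c}}{1 + \dfrac{\omega^2}{\omega_c^2}} \end{aligned} \tag{4.3}$$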
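The group-delay expression can be cross-checked against its definition as the negative derivative of phase with respect to radian frequency. The sketch below compares a central-difference estimate with the closed form for the first-order lowpass section; the 1 kHz corner frequency and the step size are arbitrary choices for the example.

```python
import cmath
import math

def h_lp(w, w_c):
    """First-order lowpass response 1 / (1 + j*w/w_c)."""
    return 1.0 / (1.0 + 1j * w / w_c)

def group_delay_numeric(w, w_c, rel_step=1e-3):
    """-d(phase)/dw by central difference; the step is rel_step * w_c.

    The lowpass phase stays within (-pi/2, 0], so no unwrapping is needed.
    """
    h = rel_step * w_c
    dphi = cmath.phase(h_lp(w + h, w_c)) - cmath.phase(h_lp(w - h, w_c))
    return -dphi / (2.0 * h)

def group_delay_closed_form(w, w_c):
    """(1/w_c) / (1 + w^2/w_c^2)."""
    return (1.0 / w_c) / (1.0 + (w / w_c) ** 2)

W_C = 2.0 * math.pi * 1000.0          # assumed corner frequency: 1 kHz
w_test = 2.0 * W_C

delay_numeric = group_delay_numeric(w_test, W_C)
delay_closed = group_delay_closed_form(w_test, W_C)
power_gain = abs(h_lp(w_test, W_C)) ** 2
print(delay_numeric, delay_closed, power_gain)
```

At twice the corner frequency the squared magnitude is 1/5 and the group delay is one-fifth of its low-frequency value $1/\omega_c$.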
We choose to implement the resistance using a transconductor of value G, as shown in Fig. 4.2(b) (for a discussion on the choice of implementation, see Chapter 2). Then $\omega_c = G/C$. Let us consider the noise in this filter. The spectral density of the thermal noise in a single transconductor is
$$S_G(\omega) = \frac{4kT\gamma}{G} \tag{4.4}$$
where $\gamma$ is the noise factor, typically greater than unity.
Figure 4.2: RC first-order lowpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.
A noise model for the lowpass filter is shown in Fig. 4.2(c), where the input signal is also corrupted by Gaussian noise with power spectrum $S_{in}(\omega)$. We assume the input noise to be independent of the noise in the transconductor. Then, at the output node we will see Gaussian noise with power-spectral density
$$S_{out}(\omega) = \left[S_{in}(\omega) + S_G(\omega)\right] |H_{LP}(\omega)|^2 \tag{4.5}$$
If we integrate SLP (!) across all radian frequencies and divide by 2�, we �nd the
total noise power in units of square volts. As an example, suppose that SLP;in = 0,
that is, there is no noise in the input signal. Then the mean-square output noise
is
V 2out;n =
1
2�
Z 10
4kT�
G
1
1 + !2
!2c
d! (4.6)
=4kT�
G
!c4
=kT�
C
Note that the equivalent noise bandwidth of a lowpass �lter is !c=4.
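As a quick numerical check of (4.6), the following Python sketch integrates the
output noise spectrum of the transconductance-C lowpass and compares it with
the closed form. The component values (G = 1 nS, C = 5 pF), the temperature of
300 K, and a unity noise factor are illustrative assumptions, not design values
from this chapter.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant [J/K]


def lowpass_noise_closed_form(cap, temp=300.0, noise_factor=1.0):
    """Mean-square output noise of (4.6), kT*noise_factor/C, in V^2."""
    return K_BOLTZMANN * temp * noise_factor / cap


def lowpass_noise_numeric(g, cap, temp=300.0, noise_factor=1.0, steps=20000):
    """Evaluate (1/2pi) * integral_0^inf S_G |H_LP|^2 dw by the midpoint rule.

    The substitution w = wc * tan(theta) maps the infinite frequency axis
    onto theta in (0, pi/2).
    """
    wc = g / cap                                        # corner frequency [rad/s]
    s_g = 4.0 * K_BOLTZMANN * temp * noise_factor / g   # transconductor PSD, (4.4)
    dtheta = (math.pi / 2.0) / steps
    total = 0.0
    for n in range(steps):
        theta = (n + 0.5) * dtheta
        w = wc * math.tan(theta)
        h_sq = 1.0 / (1.0 + (w / wc) ** 2)              # |H_LP|^2 from (4.3)
        dw = wc / math.cos(theta) ** 2 * dtheta         # dw = wc sec^2(theta) dtheta
        total += s_g * h_sq * dw
    return total / (2.0 * math.pi)


if __name__ == "__main__":
    G, C = 1e-9, 5e-12  # assumed example values
    print(lowpass_noise_closed_form(C), lowpass_noise_numeric(G, C))
```

The result does not depend on G, which is the point of (4.6): for a fixed
capacitance, tuning the corner frequency leaves the total noise unchanged.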
In Chapter 2 we discussed the self-biased transconductance-C integrator as a
building block for implementing filter functions. Fig. 4.3(a) shows this
transconductor circuit and symbol. Fig. 4.3(b) is a first-order inverting lowpass filter
consisting of two such transconductors.
4.2.2 RLC Proto-Type Bandpass Filter
The second-order RLC proto-type bandpass filter has a pair of complex-conjugate
poles in the left-half s-plane and a zero at the origin. It is shown in Fig. 4.4(a).
The transfer function is given by
$$H_{BP}(\omega) = \frac{\dfrac{j\omega}{Q_3\omega_c}}{1 + \dfrac{j\omega}{Q_3\omega_c} - \dfrac{\omega^2}{\omega_c^2}} \qquad (4.7)$$
where $\omega_c$ is the center frequency and $Q_3$ is the 3-dB quality factor, defined as the
ratio of the center frequency to the 3-dB-down bandwidth. Note that $Q_3 = 0.5$
corresponds to two coincident real poles at $s = -\omega_c$. The square magnitude and group delay
are
$$|H_{BP}(\omega)|^2 = \frac{\left(\dfrac{\omega}{Q_3\omega_c}\right)^2}{\left(1 - \dfrac{\omega^2}{\omega_c^2}\right)^2 + \left(\dfrac{\omega}{Q_3\omega_c}\right)^2}$$
$$\Delta t_{BP}(\omega) = \frac{\dfrac{1}{Q_3\omega_c}\left(1 + \dfrac{\omega^2}{\omega_c^2}\right)}{\left(1 - \dfrac{\omega^2}{\omega_c^2}\right)^2 + \left(\dfrac{\omega}{Q_3\omega_c}\right)^2} \qquad (4.8)$$

Figure 4.3: Self-biased transconductance-C integrator: (a) circuit and symbol, (b) configured as first-order lowpass filter.

We adopt a transconductance-C implementation of the RLC prototype filter,
as shown in Fig. 4.4(b). We replace the R with a transconductor $G_1$, and the L
with a gyrator (generalized impedance converter) and second capacitor of value C.
The gyrator is formed with two transconductors $G_2$ and $G_3$ in negative feedback.
For ease in the design, we set $G_2 = G_3$. The equivalent inductance $L_{eq}$ seen at the
input to $G_2$ is $C/(G_2 G_3)$. This particular second-order section has been chosen
because it is the optimum design for low-Q, wide-frequency-range filters
(Nevarez-Lozano and Sanchez-Sinencio, 1991). The transfer characteristics of the first- and
second-order sections are summarized in Table 4.1.

Figure 4.4: RLC proto-type second-order bandpass filter (a) proto-type, (b) transconductance-C implementation, and (c) noise model.
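Table 4.1: Characteristics of the first- and second-order OTA-C filters

Parameter     First-order lowpass          Second-order bandpass
$H(s)$        $\omega_c/(s + \omega_c)$    $(\omega_c/Q_3)\,s \,/\, \left(s^2 + (\omega_c/Q_3)\,s + \omega_c^2\right)$
$\omega_c$    $G/C$                        $\sqrt{G_2 G_3}/C$
$Q_3$         (not applicable)             $\sqrt{G_2 G_3}/G_1$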
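The design relations of Table 4.1 can be inverted to size the transconductors
for a target center frequency and quality factor. The sketch below assumes
$G_2 = G_3$, as in the text; the function name and the example values (1 kHz,
Q of 2.6, 5 pF) are hypothetical and chosen only for illustration.

```python
import math


def ota_c_bandpass_design(f_center_hz, q3, cap):
    """Return transconductances for the second-order OTA-C bandpass section.

    Uses w_c = sqrt(G2*G3)/C and Q3 = sqrt(G2*G3)/G1 with G2 = G3.
    """
    w_c = 2.0 * math.pi * f_center_hz
    g2 = g3 = w_c * cap            # sqrt(G2*G3) = G2 when G2 = G3
    g1 = g2 / q3                   # from Q3 = sqrt(G2*G3)/G1
    l_eq = cap / (g2 * g3)         # equivalent gyrator inductance C/(G2*G3)
    return {"G1": g1, "G2": g2, "G3": g3, "L_eq": l_eq}


if __name__ == "__main__":
    design = ota_c_bandpass_design(1000.0, 2.6, 5e-12)
    for name, value in design.items():
        print(name, value)
```

Because $L_{eq} C = C^2/(G_2 G_3)$, the resonance $1/\sqrt{L_{eq} C}$ of the
simulated RLC circuit coincides with the $\omega_c$ entry of the table.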
In Fig. 4.4(c) we model the various noise sources in this particular VLSI
implementation. By superposition we can consider each noise source independently and
then add their resulting effects at the output node. Let $S_{G1}$, $S_{G2}$, and $S_{G3}$ be the
noise power spectral densities of the first, second, and third transconductors, respectively.
As before, assume that the input signal has noise power spectrum $S_{in}$. Let $H_{LP2}$
be the transfer function of a second-order unity-gain lowpass filter with the same
pole locations as the bandpass filter, i.e.,
$$H_{LP2}(\omega) = \frac{1}{1 + \dfrac{j\omega}{Q_3\omega_c} - \dfrac{\omega^2}{\omega_c^2}} \qquad (4.9)$$
Its square magnitude is
$$|H_{LP2}(\omega)|^2 = \frac{1}{\left(1 - \dfrac{\omega^2}{\omega_c^2}\right)^2 + \left(\dfrac{\omega}{Q_3\omega_c}\right)^2} \qquad (4.10)$$
One can show that the square magnitude transfer functions between the equivalent
noise sources and the output voltage are as given in Table 4.2.

Table 4.2: Square magnitude transfer functions from all noise sources to $V_{out}$ for the second-order OTA-C filter.

Noise source    Transfer function
$S_{in}$        $|H_{BP}(\omega)|^2$
$S_{G1}$        $|H_{BP}(\omega)|^2$
$S_{G2}$        $|H_{LP2}(\omega)|^2$
$S_{G3}$        $|H_{BP}(\omega)\,G_3/G_1|^2$

Setting $G_2 = G_3$, the ratio $G_3/G_1$ is just $Q_3$. Then the total output noise power
spectrum for the VLSI implementation is given by
$$S_{out}(\omega) = \left[S_{in}(\omega) + S_{G1}(\omega) + S_{G3}(\omega)\,Q_3^2\right] |H_{BP}(\omega)|^2 + S_{G2}(\omega)\,|H_{LP2}(\omega)|^2 \qquad (4.11)$$
We find the total noise power by integrating $S_{out}(\omega)$ across all frequencies. As an
example, suppose that $S_{in} = 0$, that is, there is no noise in the input signal. Using
the identities
$$\frac{1}{2\pi}\int_0^\infty |H_{BP}(\omega)|^2\,d\omega = \frac{\omega_c}{4Q_3}$$
$$\frac{1}{2\pi}\int_0^\infty |H_{LP2}(\omega)|^2\,d\omega = \frac{\omega_c Q_3}{4} \qquad (4.12)$$
the noise power in the output voltage is
$$V_{out}^2 = \left[\frac{4kT\gamma}{G_1} + Q_3^2\,\frac{4kT\gamma}{G_3}\right]\frac{\omega_c}{4Q_3} + \frac{4kT\gamma}{G_2}\,\frac{\omega_c Q_3}{4} = \frac{kT\gamma}{C}\,(1 + 2Q_3) \qquad (4.13)$$
Note that it is assumed that the noise factor $\gamma$ is the same for each transconductor.
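To see how (4.13) arises, the three transconductor contributions can be
evaluated separately using the identities (4.12). The sketch below does this
for $G_2 = G_3$; the temperature, unity noise factor, and the example filter
values are assumptions made for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant [J/K]


def bandpass_noise_terms(f_center_hz, q3, cap, temp=300.0, noise_factor=1.0):
    """Return the mean-square noise [V^2] contributed by G1, G2 and G3."""
    w_c = 2.0 * math.pi * f_center_hz
    g2 = g3 = w_c * cap                       # Table 4.1 with G2 = G3
    g1 = g2 / q3
    kt4 = 4.0 * K_BOLTZMANN * temp * noise_factor
    bw_bp = w_c / (4.0 * q3)                  # first identity of (4.12)
    bw_lp2 = w_c * q3 / 4.0                   # second identity of (4.12)
    n_g1 = (kt4 / g1) * bw_bp                 # S_G1 shaped by |H_BP|^2
    n_g2 = (kt4 / g2) * bw_lp2                # S_G2 shaped by |H_LP2|^2
    n_g3 = q3 ** 2 * (kt4 / g3) * bw_bp       # S_G3 shaped by |H_BP G3/G1|^2
    return n_g1, n_g2, n_g3


if __name__ == "__main__":
    terms = bandpass_noise_terms(1000.0, 2.6, 5e-12)
    print(terms, sum(terms))
```

With these assumptions $G_1$ contributes $kT\gamma/C$ while $G_2$ and $G_3$
each contribute $Q_3\,kT\gamma/C$, which is why the total grows as $1 + 2Q_3$.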
Fig. 4.5 shows the second-order bandpass circuit, consisting of six self-biased
transconductors. Together with the first-order lowpass circuit of Fig. 4.3(b), they
can be used as building blocks for the basilar membrane model of Fig. 4.1.
4.3 Complete Filter Bank Model
Having studied the elements of the filter bank, the first-order lowpass and the
second-order bandpass, we proceed to derive equations describing the behavior of
the entire system.
4.3.1 Transfer Function
The basilar membrane model can be viewed as a single-input, multiple-output filter
bank. Suppose there are N output nodes. From Fig. 4.1, the transfer function from
the input to the output n is given by:
$$H(n,\omega) = \left(\frac{\dfrac{j\omega}{Q_3(n)\,\omega_c(n)}}{1 + \dfrac{j\omega}{Q_3(n)\,\omega_c(n)} - \dfrac{\omega^2}{\omega_c(n)^2}}\right)^2 \prod_{i=1}^{n}\left(\frac{1}{1 + \dfrac{j\omega}{\omega_c(i)}}\right) \qquad (4.14)$$
where $\omega_c(i)$ is the cutoff frequency of the lowpass filter and the center frequency of
the bandpass filters for the ith output section and $Q_3(i)$ is the 3-dB quality factor
of the ith output section.

Figure 4.5: RLC proto-type second-order bandpass filter composed of six self-biased transconductors.
A practical VLSI implementation in subthreshold CMOS imposes the following
constraints: the spacing between $\omega_c(i)$ and $\omega_c(i+1)$ is exponential and the spacing
between $Q_3(i)$ and $Q_3(i+1)$ is exponential. Note that exponential spacing on a
linear scale is the same as linear spacing on a logarithmic scale. As a result, one
need only specify the range in these parameters as well as the number of output
sections N in order to completely determine the filter bank transfer characteristics.
Mathematically, we can write
$$\omega_c(i) = \omega_c(1)^{\frac{N-i}{N-1}}\;\omega_c(N)^{\frac{i-1}{N-1}} \qquad (4.15)$$
$$Q_3(i) = Q_3(1)^{\frac{N-i}{N-1}}\;Q_3(N)^{\frac{i-1}{N-1}} \qquad (4.16)$$
A more fundamental constraint is that $\omega_c(1) > \omega_c(N)$. This constraint specifies
the direction in which the acoustic wave travels.
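Equations (4.15) and (4.16) share one form, so a single helper can generate
either parameter set. The following sketch is illustrative only; the function
name is hypothetical, and the example endpoints are the 8000 Hz and 100 Hz
values quoted later for the 16-channel bank.

```python
def geometric_spacing(first, last, n_sections):
    """Return the n_sections values of (4.15)/(4.16), indexed i = 1..N."""
    n = n_sections
    return [first ** ((n - i) / (n - 1)) * last ** ((i - 1) / (n - 1))
            for i in range(1, n + 1)]


if __name__ == "__main__":
    centers_hz = geometric_spacing(8000.0, 100.0, 16)
    for i, f in enumerate(centers_hz, start=1):
        print(i, round(f, 1))
```

Adjacent values differ by a constant ratio, which is the sense in which the
spacing is linear on a logarithmic axis.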
A number of preliminary lowpass sections are needed before the first output
channel in order to establish more uniform magnitude and phase characteristics
within the filter bank. The exact number depends largely on the frequency spacing.
For the moment, let there be M such preliminary sections. In practice one can
achieve this goal by ignoring the first M output channels in the filter bank and
observing only outputs M + 1 through N, for a total of N − M channels. However,
this technique has two undesirable consequences. The first is wasted space and
power consumption, since the bandpass filters in the first M sections are unused.
The second is that the first several sections could have very high quality factors
that may introduce noise into the other circuits. For this reason, we specify M
preliminary lowpass sections prior to the N output sections, as in:
$$H(n,\omega) = \left(\frac{\dfrac{j\omega}{Q_3(n)\,\omega_c(n)}}{1 + \dfrac{j\omega}{Q_3(n)\,\omega_c(n)} - \dfrac{\omega^2}{\omega_c(n)^2}}\right)^2 \prod_{i=1}^{n}\left(\frac{1}{1 + \dfrac{j\omega}{\omega_c(i)}}\right) \prod_{i=1}^{M}\left(\frac{1}{1 + \dfrac{j\omega}{\omega_p(i)}}\right) \qquad (4.17)$$
where $\omega_p(i)$ is the cutoff frequency of the ith preliminary lowpass section.
Subject to the constraint of exponential spacing described earlier, an equation for
computing $\omega_p(i)$ is
$$\omega_p(i) = \omega_c(1)^{\frac{N+M-i}{N-1}}\;\omega_c(N)^{\frac{i-1-M}{N-1}} \qquad (4.18)$$
In Fig. 4.6 we plot the magnitude response and the group delay as a function of
frequency for a 16-channel basilar membrane with constant $Q_3$ and two preliminary
lowpass sections.
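The transfer function (4.17) is straightforward to evaluate with complex
arithmetic. The sketch below is a minimal illustration, assuming the parameter
values quoted in the caption of Fig. 4.6; the function names, the `BANK`
dictionary, and the frequency grid are hypothetical choices.

```python
import math


def section_param(first, last, n, i):
    """Exponentially spaced parameter of (4.15)/(4.16) for section i (1-based)."""
    return first ** ((n - i) / (n - 1)) * last ** ((i - 1) / (n - 1))


def prelim_corner(wc_first, wc_last, n, m, i):
    """Corner frequency (4.18) of preliminary lowpass section i (1-based)."""
    return wc_first ** ((n + m - i) / (n - 1)) * wc_last ** ((i - 1 - m) / (n - 1))


def channel_response(n_out, w, wc_first, wc_last, q_first, q_last, n, m):
    """Complex response H(n_out, w) of (4.17) at radian frequency w."""
    wc_n = section_param(wc_first, wc_last, n, n_out)
    q_n = section_param(q_first, q_last, n, n_out)
    x = 1j * w / (q_n * wc_n)
    h = (x / (1.0 + x - (w / wc_n) ** 2)) ** 2        # two cascaded bandpass filters
    for i in range(1, n_out + 1):                     # lowpass sections 1..n_out
        h /= 1.0 + 1j * w / section_param(wc_first, wc_last, n, i)
    for i in range(1, m + 1):                         # M preliminary lowpass sections
        h /= 1.0 + 1j * w / prelim_corner(wc_first, wc_last, n, m, i)
    return h


# Parameters taken from the caption of Fig. 4.6.
BANK = dict(wc_first=2 * math.pi * 8000.0, wc_last=2 * math.pi * 100.0,
            q_first=2.6, q_last=2.6, n=16, m=2)

if __name__ == "__main__":
    freqs = [10.0 * 2000.0 ** (k / 499.0) for k in range(500)]  # 10 Hz to 20 kHz
    for channel in (1, 8, 16):
        peak = max(abs(channel_response(channel, 2 * math.pi * f, **BANK))
                   for f in freqs)
        print(channel, peak)
```

Sweeping the frequency grid for each channel gives the peak gain of that
channel, which is below unity because the lowpass cascade attenuates the
signal near each bandpass center frequency.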
4.3.2 Filter Bank Tuning
Exponential spacing of the corner frequencies is easily achieved in subthreshold
CMOS design due to the exponential voltage-to-current relationship, as described
in Appendix A. The corner frequencies are proportional to G, the transconductance.
The transconductance, in turn, is proportional to the bias current. Hence,
it is our goal to exponentially space the bias current at each transconductor in
order to exponentially space the corner frequencies.

Figure 4.6: Response of 16-channel cochlear filter bank using exact equations, (a) magnitude [dB] versus frequency [Hz] and (b) group delay [ms] versus frequency [Hz]. Filter parameters are as follows: $f_c(1) = 8000$ Hz, $f_c(16) = 100$ Hz, $Q_3(1) = 2.6$, and $Q_3(16) = 2.6$. Two preliminary lowpass sections are added for better uniformity in the peak response.
The method for achieving exponential spacing is to linearly space one of the
terminal voltages of the CMOS transistor. A good current source will not have
any drain voltage dependence, leaving the other three nodes as possible candidates:
the source, the gate, and the substrate.
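The link between linear voltage steps and exponential current steps can be
illustrated with the commonly used subthreshold relation
$I = I_0 \exp(\kappa V_g / U_T)$. This is a simplified textbook form assumed
here for illustration; the model actually used in this work is the one in
Appendix A. The values of $I_0$, $\kappa$, $U_T$, and the voltage range below
are hypothetical.

```python
import math


def bias_currents(v_first, v_last, n_taps, i0=1e-15, kappa=0.7, u_t=0.0258):
    """Bias currents [A] for n_taps gate voltages linearly spaced on a ladder.

    Assumes the simplified subthreshold model I = i0 * exp(kappa * V / u_t).
    """
    step = (v_last - v_first) / (n_taps - 1)
    voltages = [v_first + k * step for k in range(n_taps)]
    return [i0 * math.exp(kappa * v / u_t) for v in voltages]


if __name__ == "__main__":
    currents = bias_currents(0.45, 0.30, 16)   # assumed ladder end voltages
    print([c2 / c1 for c1, c2 in zip(currents, currents[1:])])
```

Every tap-to-tap current ratio is the same, so corner frequencies proportional
to these currents are exponentially spaced.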
Linear spacing of the supply voltages is possible using a resistive ladder. An
example of such a biasing scheme is depicted in Fig. 4.7(a), where one section of a
lowpass cascade is shown. As such, the sources of both the nMOS and pMOS
transistors in a complementary design can be linearly graded. Note that if the
transconductor is not complementary, only one of the power supplies need be
graded. However, because the source of a transistor is a low-impedance node, the
impedance of the resistive ladder must of necessity be lower still. A low-impedance
resistive ladder will result in significant power dissipation. Therefore, this type of
resistive ladder is generally undesirable.

Figure 4.7: Single section of lowpass cascade showing tuning mechanism via (a) supply lines and (b) substrate lines.
A second method for exponentially spacing the bias current of a transconductor
is at the gate of a current source. However, the self-biased transconductance-C
integrator, as its name implies, does not have an independent current bias.
Transconductor designs based on the differential pair, on the other hand, do have
an independent current bias element. The method of linearly spacing the gate
voltage of the bias transistor has been used by Liu and others (Lyon and Mead,
1988; Watts, 1992) in the design of cochlear filter banks.
A final method of modulating the bias current in a current source is the
oft-overlooked substrate, or back-gate. A linear change in the substrate voltage results
in an exponential change in the bias current, albeit less effectively than the gate
potential. For a class-A or -B design, only one substrate potential need be
modulated. For a complementary transconductor, both substrates need resistive ladders.
If it were possible to bias both the n-substrate and the p-substrate, as in a twin-tub
process, the last solution would be the best for the case of the self-biased
transconductor. The substrate sinks and sources very little current, and hence little power
would be consumed in setting the bias currents. The substrate biasing scheme is
depicted in Fig. 4.7(b).
4.3.3 Filter Bank Noise
Each filter in a cascade contributes to the overall noise spectrum at the output
channel. Let us relate the transconductance noise spectra to the filter parameters,
assuming constant capacitance C. For the case of the ith first-order lowpass filter,
we have
$$S_G(i,\omega) = \frac{4kT\gamma}{C\,\omega_c(i)} \qquad (4.19)$$
For the ith bandpass filter, with $G_2(i) = G_3(i)$, we have
$$S_{G1}(i,\omega) = \frac{4kT\,Q_3(i)\,\gamma}{C\,\omega_c(i)}$$
$$S_{G2}(i,\omega) = \frac{4kT\gamma}{C\,\omega_c(i)} \qquad (4.20)$$
$$S_{G3}(i,\omega) = \frac{4kT\gamma}{C\,\omega_c(i)}$$
We can now write an iterative procedure for computing the noise power spectrum at the
output of the ith lowpass section:
$$S_{LP}(i,\omega) = \left[S_{LP}(i-1,\omega) + S_G(i,\omega)\right] |H_{LP}(i,\omega)|^2 \qquad (4.21)$$
where $H_{LP}(i,\omega)$ is the filter response of the ith lowpass section. The input to the ith
double-bandpass section is the output of the ith lowpass filter. Computing the output noise in
two steps we have
$$S_{BP1}(i,\omega) = \left[S_{LP}(i,\omega) + S_{G1}(i,\omega) + Q_3(i)^2\,S_{G3}(i,\omega)\right] |H_{BP}(i,\omega)|^2 + S_{G2}(i,\omega)\,|H_{LP2}(i,\omega)|^2$$
$$S_{BP2}(i,\omega) = \left[S_{BP1}(i,\omega) + S_{G1}(i,\omega) + Q_3(i)^2\,S_{G3}(i,\omega)\right] |H_{BP}(i,\omega)|^2 + S_{G2}(i,\omega)\,|H_{LP2}(i,\omega)|^2 \qquad (4.22)$$
Assuming ideally matched self-biased transconductors exhibiting only white
thermal noise, Fig. 4.8(a) shows the theoretical output noise power spectrum of
the entire 16-channel filter cascade with parameters $Q_3$ and frequency range as
before. The output channel noise is dominated by the quality factor of the
second-order sections. Integrating the power spectrum across all frequencies, the output
channels have almost constant RMS noise of approximately 0.105 mV, as shown in Fig. 4.8(b).
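The iterative procedure of (4.19) to (4.22) translates directly into a short
loop. The sketch below evaluates the output noise spectral density of one
channel at a single frequency. It assumes a noise-free input, a unity noise
factor, and a temperature of 300 K, and it omits the preliminary lowpass
sections, so it is an illustration of the recursion rather than a
reproduction of Fig. 4.8.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant [J/K]


def h_lp_sq(w, wc):
    """|H_LP|^2 of (4.3)."""
    return 1.0 / (1.0 + (w / wc) ** 2)


def h_bp_sq(w, wc, q3):
    """|H_BP|^2 of (4.8)."""
    damping = (w / (q3 * wc)) ** 2
    return damping / ((1.0 - (w / wc) ** 2) ** 2 + damping)


def h_lp2_sq(w, wc, q3):
    """|H_LP2|^2 of (4.10)."""
    return 1.0 / ((1.0 - (w / wc) ** 2) ** 2 + (w / (q3 * wc)) ** 2)


def channel_noise_psd(n_out, w, wc_list, q3_list, cap, temp=300.0, noise_factor=1.0):
    """Noise PSD [V^2/Hz] at radian frequency w for output channel n_out (1-based)."""
    kt4 = 4.0 * K_BOLTZMANN * temp * noise_factor
    s_lp = 0.0                                    # noise-free input assumed
    for wc in wc_list[:n_out]:                    # lowpass sections 1..n_out, (4.21)
        s_lp = (s_lp + kt4 / (cap * wc)) * h_lp_sq(w, wc)
    wc, q3 = wc_list[n_out - 1], q3_list[n_out - 1]
    s_g = kt4 / (cap * wc)                        # S_G2 = S_G3 from (4.20)
    s_g1 = q3 * s_g                               # S_G1 from (4.20)
    s_out = s_lp
    for _ in range(2):                            # two bandpass stages, (4.22)
        s_out = ((s_out + s_g1 + q3 ** 2 * s_g) * h_bp_sq(w, wc, q3)
                 + s_g * h_lp2_sq(w, wc, q3))
    return s_out


if __name__ == "__main__":
    n = 16
    wc = [2 * math.pi * 8000.0 ** ((n - i) / (n - 1)) * 100.0 ** ((i - 1) / (n - 1))
          for i in range(1, n + 1)]
    q3 = [2.6] * n
    print(math.sqrt(channel_noise_psd(8, wc[7], wc, q3, 5e-12)))
```

Integrating this density over frequency for each channel would give the RMS
noise per channel; the exact figure depends on the noise factor and on the
preliminary sections, both of which are simplified here.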
Figure 4.8: (a) Power spectral density of noise [µV/√Hz versus frequency in Hz] in 16-channel cochlear filter bank with C = 5.0 pF and other parameters as earlier defined, and (b) RMS noise [µV] as a function of center frequency [Hz].
4.4 Information Rate and Power Dissipation
The maximum number of bits per second, or channel capacity, can be computed
from the dynamic range of an analog continuous-time filter using information
theory (Shannon, 1948). In Chapter 5, we derive the following result:
$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\,DR\right) \qquad (4.23)$$
where $f_p$ is the bandwidth of the filter and DR is the dynamic range. The above
equation applies to linear systems under additive Gaussian noise conditions. It
assumes a peak amplitude constraint, which is appropriate for circuits which must
operate within a certain voltage range to avoid distortion and clipping.
Computing the exact distortion for the entire filter structure is beyond the
scope of this work. However, we can estimate the dynamic range of each channel
using the distortion measure of just a single integrator. Allowing a maximum of
2% distortion, the maximum RMS input signal is only 6.43 mV for a normally
distributed input. The peak gain of each output channel is approximately 0.42,
due to overlap between the lowpass and bandpass filters. As such, the
distortion-limited dynamic range for each channel is approximately
$(6.43 \times 0.42 / 0.105)^2$, or 28.2 dB.
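The arithmetic behind this estimate is short enough to write out. The sketch
below uses only the three numbers quoted in the text; the function name is a
hypothetical choice.

```python
import math


def dynamic_range_db(v_in_rms_max, peak_gain, v_noise_rms):
    """Distortion-limited dynamic range in dB: (max output RMS / noise RMS)^2."""
    ratio = v_in_rms_max * peak_gain / v_noise_rms
    return 10.0 * math.log10(ratio ** 2)


if __name__ == "__main__":
    # 6.43 mV maximum input, peak gain 0.42, 0.105 mV RMS noise (values from the text)
    print(dynamic_range_db(6.43e-3, 0.42, 0.105e-3))
```

As a power ratio the dynamic range is the square of the voltage ratio, which
is the quantity DR that enters (4.23).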
We can estimate the maximum information rate from (4.23), noting that the
message bandwidth $f_p$ is approximately $\omega_c/(2\pi Q_3)$ for each channel. Assuming
non-overlapping channels and independence between channels, the total maximum
information rate is
$$C_{tot} = \sum_{i=1}^{N} \frac{\omega_c(i)}{2\pi Q_3(i)} \log_2\left(1 + \frac{6}{\pi e}\,DR(i)\right) \qquad (4.24)$$
For the parameter values chosen earlier, the system capacity is calculated as
68 kbits/sec.
The current consumption in the filter bank, not including that needed for
tuning (which, admittedly, can be quite large), is theoretically 237 nA. Using a 1.5-V
power supply, the total power dissipated is approximately 355 nW. Finally, we
estimate the maximum information rate per watt, or number of bits per joule, as
0.19 bits/pJ.
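The power and energy-efficiency figures follow from the quoted current, supply
voltage, and capacity. A minimal sketch of the arithmetic, with hypothetical
function names:

```python
def dissipated_power(current_a, supply_v):
    """Static power [W] drawn from a single supply."""
    return current_a * supply_v


def bits_per_picojoule(capacity_bits_per_s, power_w):
    """Information rate per unit power, expressed in bits per picojoule."""
    return capacity_bits_per_s / power_w * 1e-12


if __name__ == "__main__":
    power = dissipated_power(237e-9, 1.5)     # 237 nA at 1.5 V (values from the text)
    print(power, bits_per_picojoule(68e3, power))
```

Bits per second divided by watts gives bits per joule, so the efficiency
figure is independent of how long the filter bank runs.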
4.5 Signal Power Distribution
We restrict ourselves to the design of constant-Q filter banks. In this state the
linear filter bank approximates wavelet analysis in a scale domain that preserves
good temporal resolution (Liu et al., 1993).
Given a particular input signal frequency distribution, $S_{Vin}(\omega)$, it seems
reasonable to divide the signal power among the N output channels uniformly. Let
us assume that
$$S_{Vin}(\omega) \approx V_o^2/\omega \qquad (4.25)$$
within the frequency band $\omega_c(N)$ to $\omega_c(1)$ of interest, where $V_o^2$ is a constant of
proportionality.
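A set of filter parameters which spreads the signal power evenly across all
channels can be expressed as
$$\omega_c(i) = \omega_c(1)^{\frac{N-i}{N-1}}\;\omega_c(N)^{\frac{i-1}{N-1}}$$
$$Q_3(i) = \left[\left(\frac{\omega_c(1)}{\omega_c(N)}\right)^{\frac{1}{2(N-1)}} - \left(\frac{\omega_c(N)}{\omega_c(1)}\right)^{\frac{1}{2(N-1)}}\right]^{-1} \qquad (4.26)$$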
where $\omega_c(i)$ and $Q_3(i)$ are the center frequency and quality factor for channel i.[2]
This set of filter parameters is constant-Q, since the equation for $Q_3(i)$ is not
a function of i. It also spans the entire frequency range from $\omega_c(N)$ to $\omega_c(1)$. The
effective 3-dB-down bandwidth of channel i can be expressed as follows:
$$BW_3(i) = \sqrt{\omega_c(i-1)\,\omega_c(i)} - \sqrt{\omega_c(i)\,\omega_c(i+1)} \qquad (4.27)$$
For the moment, ignore the two extreme channels, i = 1 and i = N. We assume
that the band limits of channel i are the geometric means of its center frequency with
those of its two neighboring channels, i.e., $\sqrt{\omega_c(i-1)\,\omega_c(i)}$ for the upper limit and
$\sqrt{\omega_c(i)\,\omega_c(i+1)}$ for the lower limit.

[2] We wish to make a fine but important distinction between the parameters of the filter bank, i.e., center frequency and quality factor of the first- and second-order sections, and the filter bank behavior. In particular, the center frequency of the output channel will be lower than that of the individual filters, because the lowpass cascade introduces a skew to the output filter shape. In addition, the output channel quality factor will be greater than that of the individual second-order sections, because there are two such filters in cascade. We are unable to derive an analytic expression relating the parametric description to the behavioral description. In this section, we are describing only the behavioral description.

Then the signal power at channel i is found by integrating the power spectrum
of the input signal over the bandwidth of channel i, i.e.,
$$V_{in,s}^2(i) = \frac{1}{2\pi}\int_{\sqrt{\omega_c(i+1)\,\omega_c(i)}}^{\sqrt{\omega_c(i-1)\,\omega_c(i)}} \frac{A(i)^2\,V_o^2}{\omega}\,d\omega \qquad (4.28)$$
$A(i)$ is the attenuation of channel i, and is assumed flat across the frequency band.
Integrating, we find
$$V_{in,s}^2(i) = \frac{A(i)^2\,V_o^2}{4\pi}\,\ln\left(\frac{\omega_c(i-1)}{\omega_c(i+1)}\right) \qquad (4.29)$$
Substituting $\omega_c(i)$ from (4.26), we obtain
$$V_{in,s}^2(i) = \frac{A(i)^2\,V_o^2}{2\pi(N-1)}\,\ln\left(\frac{\omega_c(1)}{\omega_c(N)}\right) \qquad (4.30)$$
Provided that $A(i)$ is the same across each channel, the output power will be the
same.
We see from the above equation that it is important that the shape and peak
frequency of each channel be approximately constant. It is for this reason that we
introduce preliminary lowpass filters into the design.
An attempt was made to verify the assumption that speech follows a 1/f spectrum over
the frequency range of interest. Using all of the training data from the TIMIT
database, we computed the average short-term spectrum. The result is shown in
Fig. 4.9. A simple parametric fit for the data (marked with x's) is a broad bandpass
filter with center frequency of 550 Hz. We conclude that a 1/f model is not wholly
appropriate for speech.
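The equal-power argument of (4.26) to (4.30) can be checked numerically for an
ideal 1/f input. The sketch below computes the band limits of each interior
channel from the geometric means of neighboring center frequencies, then
evaluates the quality factor and the signal power per channel. The function
names are hypothetical, and $V_o^2$ and $A(i)$ are set to unity for
illustration.

```python
import math


def center_frequencies(wc_first, wc_last, n):
    """Center frequencies of (4.26), returned as a 0-based list of length n."""
    return [wc_first ** ((n - i) / (n - 1)) * wc_last ** ((i - 1) / (n - 1))
            for i in range(1, n + 1)]


def constant_q(wc_first, wc_last, n):
    """Quality factor of (4.26); it does not depend on the channel index."""
    root = (wc_first / wc_last) ** (1.0 / (2 * (n - 1)))
    return 1.0 / (root - 1.0 / root)


def band_limits(centers, idx):
    """Upper and lower band limits of an interior channel (0-based idx)."""
    upper = math.sqrt(centers[idx - 1] * centers[idx])
    lower = math.sqrt(centers[idx] * centers[idx + 1])
    return upper, lower


def channel_signal_power(centers, idx, v_o_sq=1.0, attenuation=1.0):
    """Signal power (4.28) for a 1/f input spectrum, integrated in closed form."""
    upper, lower = band_limits(centers, idx)
    return attenuation ** 2 * v_o_sq / (2.0 * math.pi) * math.log(upper / lower)


if __name__ == "__main__":
    n = 16
    centers = center_frequencies(2 * math.pi * 8000.0, 2 * math.pi * 100.0, n)
    print(constant_q(centers[0], centers[-1], n))
    print([channel_signal_power(centers, k) for k in range(1, n - 1)])
```

Every interior channel receives the same power because each band spans the
same ratio of upper to lower frequency, and the integral of 1/ω depends only
on that ratio.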
Figure 4.9: Short-term averaged power spectrum (|H(f)| versus frequency in Hz) taken from the training set of the TIMIT database, normalized by the standard deviation. Experimental data are marked by x's. The dotted line is a fit using a bandpass filter with center frequency 550 Hz and Q ≈ 1.
4.6 Results and Discussion
Several versions of this filter bank architecture have been fabricated using
transconductor designs described in Chapter 3. At the time of this writing, they are
functional, but not fully tested.
The problem of optimal extraction of information from sensory signals by "real"
computing hardware in terms of maximum information rate per unit of power
consumed has not been completely resolved in this work. Rather, having experimented
with one transconductor circuit design and one architecture, we leave open the
possibility of future improvements by way of: (1) integrators with higher dynamic
range and/or lower power consumption, (2) enhanced filter architectures, and (3)
non-linear adaptation.
In the following chapter we explore fundamental limitations in low-power
information processing systems, such as the basilar membrane, by comparing analog
and digital implementations of a simple delay function. Our goal is to begin to
answer the question of why and how to process signals in an analog format.
Chapter 5
Comparison of Continuous and
Discrete Circuits
A fundamental question in information processing is: What is the most
power-efficient representation for processing information? For example, most, if not all,
speech recognition systems pass the output of the microphone to an anti-aliasing
filter and then to an analog-to-digital converter. Henceforth, all information
processing is performed in the digital domain. In particular, the extraction of features
and the classification of these features into phonetic segments is performed with
digital hardware.
It is our hypothesis that the efficient extraction of features from speech can be
performed using low-power, low-precision hardware. Classification, on the other
hand, is perhaps a task best suited to digital hardware, where memory and
bandwidth requirements are enormous.
In this chapter we begin to explore some of the fundamental differences between
analog and digital information processing systems. Following the paradigm set
forth by Hosticka in his paper (Hosticka, 1985), we will compare four different
information processing systems performing a very basic task. We will relate the
maximum signal-to-noise ratio, or dynamic range, to power dissipation. And then
we will compute the maximum information rate, or capacity, as a function of power.
Comparing these systems, we hope to shed light on the question of when and where
it is advantageous to use an analog rather than a digital signal representation.
The types of processing tasks that we ultimately wish to consider are those
found in the area of sensory communication. It is generally believed that the human
nervous system does not behave exactly like a digital computer. Indeed, at the
cellular level, many neurons could be classified as discrete in value, that is,
either spiking or not spiking, but not discrete in time. To date, much research has
been undertaken to solve problems such as automatic speech recognition or image
recognition using only digital computers. It is at least possible, if not probable,
that power-efficient algorithms and architectures will emerge as we explore
signal representations that are more common to biology, the best speech and image
recognition system in the world.
5.1 Capacity
The maximum information rate of error-free transmission that can be processed is
limited by the system capacity C, which is given by the signal-to-noise ratio S/N
and the message bandwidth $f_p$, as defined by the Hartley-Shannon law (Shannon,
1948)
$$C = f_p \log_2(1 + S/N) \qquad (5.1)$$
where S and N are the average signal and noise power, respectively. The dimension
of C is bits per second. This law applies to systems having an average signal power
constraint S within Gaussian noise of power N. According to (Shannon, 1948), in
order to approach this limiting rate of transmission, the transmitted signals must
approximate, in statistical properties, white noise.
For low-voltage signal processing circuits, perhaps a more common constraint
is peak amplitude, or peak power. In particular, let us assume that the signal is a
voltage which operates in the region from 0 to V volts. Let the reference voltage,
or signal ground, have the value V/2. Then the signal is constrained to fall in the
range ±V/2, with peak power $S_{peak} = V^2/4$.[1]
In the case of a peak power constraint, the equation for the channel capacity
C for a frequency band $f_p$ perturbed by white thermal noise of power N is given
by (Shannon, 1948)
$$C = f_p \log_2\left(1 + \frac{2\,S_{peak}}{\pi e N}\right) \qquad (5.2)$$
where the instantaneous signal power is limited to $S_{peak}$ at every sample point.[2]
The maximum entropy occurs if the samples are independent with a distribution
function which is uniform in the range $-\sqrt{S_{peak}}$ to $+\sqrt{S_{peak}}$. In our case, the
signal is limited to the range of 0 to V. Therefore, peak signal power is $V^2/4$ since
maximum amplitude range is ±V/2.
Let the input signal be uniformly distributed in the range 0 to V, sampled at
time intervals of $1/(2f_p)$ and bandlimited to $f_p$. Then the average signal power S
is approximately $V^2/12$.[3] Therefore we establish the relation
$$S_{peak} = 3S \qquad (5.3)$$
If the samples are independent of one another, the power spectrum of the input
will appear almost white for long time intervals. Its power spectral density is
$$\frac{V_{in}^2}{\Delta f} = \frac{V^2}{12 f_p} \qquad (5.4)$$
Substituting $S_{peak}$ into (5.2) we get
$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\,S/N\right) \qquad (5.5)$$

[1] Interestingly, a current signal does not generally suffer the same constraint and might be more appropriately modeled using an average power constraint.
[2] Sample points are assumed to be taken at the Nyquist rate, i.e., every $1/(2f_p)$ seconds.
[3] We consider the mean value of the input signal, V/2, to carry no information.
This is the equation for the capacity of a system subject to a peak power constraint,
$S_{peak}$.
The dynamic range is defined as the maximum possible signal-to-noise ratio.
Therefore, we can write (5.5) in the form
$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\,DR\right) \qquad (5.6)$$
This equation neglects an important facet of the input signal. While it is
uniformly distributed in the range 0 to V at the sample points, it will not be uniformly
distributed outside the sample points. More importantly, it is not constrained to
lie in the region 0 to V volts.[4]
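The two capacity expressions (5.1) and (5.5) differ only by the factor
$6/(\pi e)$ multiplying the signal-to-noise ratio. The sketch below evaluates
both and expresses that factor as an equivalent loss in dB; the function names
and the example bandwidth and S/N are hypothetical.

```python
import math

PEAK_FACTOR = 6.0 / (math.pi * math.e)   # multiplier on S/N in (5.5)


def capacity_average_power(f_p, snr):
    """Capacity [bits/s] under an average power constraint, (5.1)."""
    return f_p * math.log2(1.0 + snr)


def capacity_peak_power(f_p, snr):
    """Capacity [bits/s] under a peak power constraint, (5.5)."""
    return f_p * math.log2(1.0 + PEAK_FACTOR * snr)


def peak_constraint_penalty_db():
    """Equivalent S/N loss in dB implied by the factor 6/(pi*e)."""
    return -10.0 * math.log10(PEAK_FACTOR)


if __name__ == "__main__":
    f_p, snr = 1000.0, 1000.0             # assumed example: 1 kHz, 30 dB
    print(capacity_average_power(f_p, snr), capacity_peak_power(f_p, snr))
    print(peak_constraint_penalty_db())
```

At high S/N the two capacities differ by a nearly constant number of bits per
second per hertz, since the factor appears inside the logarithm.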
5.2 Four Signal Representations
There are four types of signal representations that we distinguish in this work,
ranging from what is commonly known as analog to what is commonly known
as digital. They are: continuous-value continuous-time (CVCT), continuous-value
discrete-time (CVDT), discrete-value continuous-time (DVCT), and discrete-value
discrete-time (DVDT).
Examples of these four circuits are given in Fig. 5.1. These circuits implement,
as nearly as possible, the same function: a simple delay. They are the subject
of study for the remainder of this chapter. In Fig. 5.1(a), a first-order lowpass
filter is an example of a CVCT delay. We shall also refer to this circuit type as
analog. Fig. 5.1(b) shows a CVDT delay function. This circuit type is often called
switched-capacitor, or, more simply, switched-cap. A delay function implemented
in a DVCT circuit is given in Fig. 5.1(c). It consists of an RC circuit with a
comparator. We will also refer to this type as time-domain.[5] In this work we
assume binary-valued signaling. The last type of circuit is DVDT, as shown in
Fig. 5.1(d), which is a parallel synchronous register.

[4] Histograms of randomly generated input signals indicate that the peak amplitude is approximately two times larger for samples taken at rates much higher than the Nyquist rate.
[5] We wish to distinguish between time-domain circuits and asynchronous logic. In time-domain circuits the information is contained in the time between events, as opposed to asynchronous logic, where the state of the event is of primary importance.

Figure 5.1: (a) CVCT RC lowpass circuit, (b) CVDT sample-and-hold circuit, (c) DVCT RC delay circuit, and (d) DVDT clocked M-bit delay.

Using such a simple computational element, the delay function, we hope to
illuminate some of the trends associated with these four circuit techniques.
Unfortunately, there is no simple equivalent to a pure delay in CVCT circuits; the
lowpass is possibly the fairest approximation. Of the four circuit types,
possibly the least well understood is time-domain processing. Interestingly,
time-domain processing appears to be an important means of communicating in neural
pathways between the cochlea and the brain (Rice et al., 1995).
In the case of the analog and switched-cap circuits in Fig. 5.1, the brick-wall
noiseless filter defines the message bandwidth $f_p$. The bandwidth of the DVDT
system is set by the sampling rate $f_c$. In the case of the two discrete-time systems,
the input signals must be band-limited by an anti-aliasing filter below the Nyquist
rate. For the case of the DVCT system, events have an average rate of $f_t$.
5.2.1 Continuous-Value Continuous-Time
The noise spectral density of a nominal resistor of value R (one-sided) is
$$\frac{V_n^2}{\Delta f} = 4kTR \qquad (5.7)$$
where k is Boltzmann's constant and T is absolute temperature.
An RC lowpass filter has transfer function
$$H(j2\pi f) \equiv \frac{V_c(j2\pi f)}{V_i(j2\pi f)} = \frac{1}{1 + j2\pi f\tau} \qquad (5.8)$$
where $\tau = RC$. The square magnitude and phase are
$$|H(j2\pi f)|^2 = \frac{1}{1 + (2\pi f\tau)^2}$$
$$\angle H(j2\pi f) = -\arctan(2\pi f\tau) \approx -2\pi f\tau + \frac{(2\pi f\tau)^3}{3} - \frac{(2\pi f\tau)^5}{5} + \ldots \qquad (5.9)$$
The noise equivalent bandwidth of an RC lowpass filter is found by integrating the
square magnitude over the entire frequency range. We obtain
$$ENBW = \int_0^\infty \frac{1}{1 + (2\pi f\tau)^2}\,df = \frac{1}{4\tau} \qquad (5.10)$$
The mean-square noise voltage on the capacitor C is then
$$V_{c,n}^2 = ENBW \cdot 4kTR = \frac{kT}{C} \qquad (5.11)$$
The main point to be made here is that the mean-square noise on a capacitor is a
function only of the capacitance. We make use of this relation when we consider
other circuit types.
For the RC circuit, however, the presence of the capacitor has no effect on
the theoretical channel capacity. The reason is that a filtering operation amounts
to no more than a coordinate transformation (Shannon, 1948). And because a
lowpass filter is not absolutely bandlimiting, a sufficiently complex receiver can
detect signals which have frequency components above the cutoff frequency. Also
note that the white noise produced by the lumped resistor is filtered in exactly
the same manner as the input signal. Therefore, the signal-to-noise ratio at each
frequency of the output signal is just the same as the ratio of the input signal
power to the thermal noise power.
Therefore, we can simplify our analysis considerably. Temporarily removing the
capacitor from Fig. 5.1(a), the output noise power is the resistive thermal noise
times the message bandwidth $f_p$
$$V_{out,n}^2 = 4kTRf_p \qquad (5.12)$$
independent of the cutoff frequency. The output signal power will be equal to the
input signal power, if the capacitor is removed. Assuming the signal is uniformly
distributed in the region 0 to V, that signal power is $V^2/12$, as stated earlier.
Therefore, the signal-to-noise ratio is
$$S/N = \frac{V^2}{48kTRf_p} \qquad (5.13)$$
The above equation for signal-to-noise ratio remains valid if the capacitor is
reintroduced in Fig. 5.1(a), because its effects on the signal power and the noise power
cancel.
In order to compute the mean power dissipation, we derive the mean-square voltage
across the resistor and then divide by the resistance R. The transfer function from
the input voltage $V_{in}$ to the voltage across the resistor $V_R$ is
$$H(j2\pi f) \equiv \frac{V_R(j2\pi f)}{V_{in}(j2\pi f)} = \frac{jf/f_o}{1 + jf/f_o} \qquad (5.14)$$
where $f_o = 1/(2\pi RC)$. The power spectrum of the voltage drop across the resistor
is found by multiplying the input power spectrum in (5.4) by the square magnitude
of the transfer function. We obtain
$$\frac{V_R^2}{\Delta f} = \frac{V_{in}^2}{\Delta f}\,\frac{(f/f_o)^2}{1 + (f/f_o)^2} \qquad (5.15)$$
The power dissipated in the resistor is found by integrating over the message
bandwidth and dividing by R, as in
$$P_m = \int_0^{f_p} \frac{V^2\,(f/f_o)^2}{12Rf_p\left(1 + (f/f_o)^2\right)}\,df = \frac{V^2}{12R}\left[1 - \frac{f_o}{f_p}\arctan\left(\frac{f_p}{f_o}\right)\right] \qquad (5.16)$$
Writing the mean power dissipation as a function of the maximum
signal-to-noise ratio, we have
$$P_m = 4kT\,(S/N)\,f_p\left[1 - \frac{f_o}{f_p}\arctan\left(\frac{f_p}{f_o}\right)\right] \qquad (5.17)$$
Similarly, one can write S/N as a function of $P_m$, as in
$$S/N = \frac{P_m}{4kTf_p}\,\frac{1}{\left[1 - \dfrac{f_o}{f_p}\arctan\left(\dfrac{f_p}{f_o}\right)\right]} \qquad (5.18)$$
If we compare the equation derived for S/N as found in (Hosticka, 1985) with the
above equation for the case of $f_p = f_o$, we find that (5.18) yields an S/N which is
6.7 dB higher, since the bracketed term is then $1 - \pi/4$. The discrepancy is largely
found in the manner in which the mean power dissipation was derived. The other
author assumes that all of the input signal power is dissipated in the resistor.
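The 6.7 dB figure can be reproduced from the bracketed term shared by (5.16)
to (5.18). The sketch below evaluates that term and the resulting S/N for a
given mean power; the function names, the temperature, and the example power
level are assumptions for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant [J/K]


def dissipation_factor(f_p, f_o):
    """Bracketed term of (5.16): fraction of V^2/(12R) dissipated in the resistor."""
    return 1.0 - (f_o / f_p) * math.atan(f_p / f_o)


def snr_from_power(p_m, f_p, f_o, temp=300.0):
    """Maximum signal-to-noise ratio (power ratio) from (5.18)."""
    return p_m / (4.0 * K_BOLTZMANN * temp * f_p) / dissipation_factor(f_p, f_o)


def snr_gain_db(f_p, f_o):
    """S/N advantage in dB over assuming all input power is dissipated."""
    return -10.0 * math.log10(dissipation_factor(f_p, f_o))


if __name__ == "__main__":
    print(snr_gain_db(1000.0, 1000.0))                               # case f_p = f_o
    print(10.0 * math.log10(snr_from_power(1e-12, 1000.0, 1000.0)))  # assumed 1 pW
```

The advantage grows as $f_p$ falls below $f_o$, because a slowly varying
input leaves very little voltage across the resistor.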
In addition, we wish to write the capacity as a function of the dissipated power.
Substituting (5.18) into (5.5), the capacity is
$$C = f_p \log_2\left(1 + \frac{6}{\pi e}\,\frac{P_m}{4kTf_p}\,\frac{1}{\left[1 - \dfrac{f_o}{f_p}\arctan\left(\dfrac{f_p}{f_o}\right)\right]}\right) \qquad (5.19)$$
Let W be the power-delay product. The delay in an RC circuit is theoretically
a function of frequency. If we suppose that the message bandwidth $f_p$ is much less
than $f_o$, then the equivalent delay is $\tau$. One arrives at that conclusion by taking
a Taylor series expansion of the actual phase, as in (5.9), and then truncating
after the first term. Note that the phase response of a pure delay $\Delta t$ is $-2\pi f\Delta t$.
Assuming that the delay in a lowpass filter is approximately $\tau = 1/(2\pi f_o)$, we get
$$W = \frac{2kT\,(S/N)}{\pi}\,\frac{f_p}{f_o}\left[1 - \frac{f_o}{f_p}\arctan\left(\frac{f_p}{f_o}\right)\right] \qquad (5.20)$$
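Equations (5.17), (5.19), and (5.20) can be cross-checked against one another
numerically. The sketch below assumes a temperature of 300 K and an arbitrary
example operating point; the function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant [J/K]


def _factor(f_p, f_o):
    """Bracketed term common to (5.17), (5.19) and (5.20)."""
    return 1.0 - (f_o / f_p) * math.atan(f_p / f_o)


def mean_power(snr, f_p, f_o, temp=300.0):
    """Mean dissipated power [W] from (5.17)."""
    return 4.0 * K_BOLTZMANN * temp * snr * f_p * _factor(f_p, f_o)


def capacity_from_power(p_m, f_p, f_o, temp=300.0):
    """Capacity [bits/s] as a function of dissipated power, (5.19)."""
    snr = p_m / (4.0 * K_BOLTZMANN * temp * f_p) / _factor(f_p, f_o)
    return f_p * math.log2(1.0 + 6.0 / (math.pi * math.e) * snr)


def power_delay_product(snr, f_p, f_o, temp=300.0):
    """Power-delay product [J] from (5.20), using delay tau = 1/(2*pi*f_o)."""
    return (2.0 * K_BOLTZMANN * temp * snr / math.pi) * (f_p / f_o) * _factor(f_p, f_o)


if __name__ == "__main__":
    snr, f_p, f_o = 1e4, 1e3, 1e4          # assumed example: 40 dB, f_p = f_o/10
    p_m = mean_power(snr, f_p, f_o)
    print(p_m, capacity_from_power(p_m, f_p, f_o), power_delay_product(snr, f_p, f_o))
```

The power-delay product is simply the mean power multiplied by the delay, so
(5.20) introduces no assumptions beyond the small-bandwidth approximation of
the delay.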
5.2.2 Continuous-Value Discrete-Time
The CVDT circuit to be analyzed is the sample-and-hold circuit of Fig. 5.1(b). The
input signal is band-limited by an anti-aliasing filter to frequency $f_s/2$, where $f_s$ is
the sampling rate. It is assumed that the switch is closed for a period much longer
than the correlation time $R_{sw}C/2$. Since this is the fundamental requirement
for a complete charge transfer anyway, the assumption can be made quite safely. On
the other hand, because $R_{sw}C/2$ is small, the equivalent noise bandwidth necessarily
exceeds the Nyquist rate and the noise is aliased.
The thermal noise of the MOSFET switch is aliased by the sampling process
into the baseband (0 to $f_s/2$), where $f_s$ is the sampling frequency. In order to
show this, we must convolve the noise spectrum at node $V_c$ with a pulse train
at frequencies $n f_s$, where $n = 0, \pm 1, \pm 2, \ldots$. At node $V_c$, the shape of the noise
spectrum is given by (5.10), where $\tau$ is replaced by $R_{sw}C$. To get the noise shape
at node $V_d$, we compute
\[
\begin{aligned}
\frac{\overline{V_d^2}}{\Delta f}
  &= \frac{1}{1 + (2\pi f \tau)^2} \ast \sum_{k=-\infty}^{\infty} \delta(k\, 2\pi f_c) \\
  &= \sum_{k=-\infty}^{\infty} \frac{1}{1 + (2\pi [f - k f_c]\, \tau)^2}
\end{aligned}
\tag{5.21}
\]
From the summation, we see that the noise which is outside the baseband is aliased
into the baseband, and that, in fact, none of the noise escapes this aliasing. Since
the total noise on a capacitor is given by (5.11), it follows that the noise in the
baseband is also given by this equation and that it is almost flat. Thus, the noise
spectrum at node $V_d$ is
\[
\frac{\overline{V_d^2}}{\Delta f} = \frac{kT}{C}\, \frac{2}{f_s} \tag{5.22}
\]
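The flatness of the aliased spectrum can be checked numerically. The sketch below truncates the sum in (5.21), taking the clock rate $f_c$ equal to $f_s$, and compares it with the level $1/(2\tau f_s)$; multiplying that level by the usual one-sided thermal noise density $4kTR_{sw}$ (an assumption here, since (5.10) is not reproduced in this section) gives $2kT/(C f_s)$, in agreement with (5.22). The parameter values and function name are illustrative only.

```python
import math

def aliased_shape(f, fs, tau, n_terms=100_000):
    """Truncated version of the sum in (5.21), evaluated at frequency f."""
    total = 0.0
    for k in range(-n_terms, n_terms + 1):
        total += 1.0 / (1.0 + (2.0 * math.pi * (f - k * fs) * tau) ** 2)
    return total

fs = 1.0e6                            # clock rate in Hz (illustrative)
tau = 0.05 / fs                       # time constant much shorter than 1/fs
flat_level = 1.0 / (2.0 * tau * fs)   # expected flat level of the sum

# Ratios close to 1 at several baseband frequencies indicate a flat spectrum.
for f in (0.0, 0.25 * fs, 0.5 * fs):
    print(f, aliased_shape(f, fs, tau) / flat_level)
```

The terms decay only as $1/k^2$, so the truncation length is chosen generously; the residual error is of the order of the neglected tail.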
Since the output node $V_{out}$ is a filtered version of $V_d$ at frequency $f_p$, the noise
power at the output is
\[
\overline{V_{out,n}^2} = \frac{2kT f_p}{C f_s} \tag{5.23}
\]
As before, we shall assume that the input signal is bandlimited to $f_p$ and is
almost white, in such a way that the peak power $S_{peak}$ is never exceeded at any
sample point, with samples spaced $1/(2f_p)$ apart. Note that if we sample at a rate
$f_c$ which is greater than the Nyquist rate, we will find that some of the samples
will exceed the peak power limitation.
The average input signal power S is approximately $V^2/12$, as stated earlier.
Dividing S by N in (5.23) we get
\[
S/N = \frac{C V^2 f_s}{24 kT f_p} \tag{5.24}
\]
In order to compute the mean power dissipated in the switch, we need to compute
the average voltage difference between samples. We perform this computation
assuming that the input frequency is not phase-locked to the clock frequency.
However, this is a slight violation of the assumption that the peak input power
does not exceed $S_{peak}$. For now, we will live with this discrepancy.
Let $V_c(n)$ be the sample at time $n/f_s$ and $V_c(n+1)$ be the next. We want to
compute the expected power of the voltage difference $(\Delta V_c)^2$, where
\[
\begin{aligned}
(\Delta V_c)^2 &\equiv \left(V_c(n) - V_c(n+1)\right)^2 \\
               &= \left(V_c(n)\right)^2 + \left(V_c(n+1)\right)^2 - 2\, V_c(n)\, V_c(n+1)
\end{aligned}
\tag{5.25}
\]
In expectation, the first two terms are each equal to the signal power, and the last
term is twice the input signal autocorrelation function sampled at $1/f_s$.
Now, the input signal is approximately white, with average power $V^2/12$.
Therefore, the autocorrelation function $R(\tau)$ is a sinc function, as in
\[
R(\tau) = \frac{V^2}{12}\, \frac{\sin(2\pi f_p \tau)}{2\pi f_p \tau} \tag{5.26}
\]
Substituting $\tau = 1/f_s$, we get the average voltage difference power as
\[
\overline{(\Delta V_c)^2} = \frac{V^2}{6} \left( 1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s} \right) \tag{5.27}
\]
When a capacitor is charged from 0 to $\Delta V_c$ volts, the energy stored on the
capacitor is $C(\Delta V_c)^2/2$ joules. That same amount of energy gets dissipated in
the switch, no matter how small the switch resistance (see the derivation leading
to (5.42)). Similarly, when the capacitor is discharged from $\Delta V_c$ to 0, $C(\Delta V_c)^2/2$
joules are dissipated in the switch. The average power dissipated in switching
events occurring at a rate $f_s$ will be
\[
P_m = \frac{C\, \overline{(\Delta V_c)^2}}{2}\, f_s
    = \frac{C V^2 f_s}{12} \left( 1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s} \right) \tag{5.28}
\]
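We can write the mean power dissipation in the channel, i.e. in the switch, as
a function of $S/N$ as follows:
\[
P_m = 2kT\,(S/N)\, f_p \left( 1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s} \right) \tag{5.29}
\]
The power-delay product is equal to $P_m$ times $1/f_s$, yielding
\[
W = \frac{2kT f_p\,(S/N)}{f_s} \left( 1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s} \right) \tag{5.30}
\]
The signal-to-noise ratio can be written as
\[
S/N = \frac{P_m}{2kT f_p}\, \frac{1}{\left( 1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s} \right)} \tag{5.31}
\]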
If we compare the equation for $S/N$ as a function of power in (5.31) to that
obtained in (Hosticka, 1985) for the case that $f_p = 0.5 f_s$, the equation derived here
is 6 dB higher. Again, the main reason for the discrepancy lies in the manner in
which the other author computes the mean power dissipation.
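The CVDT relations (5.29) and (5.31) can be exercised the same way. The following Python sketch is illustrative only; the function names, the 300 K default and the chosen operating point are assumptions made for the example.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def sinc_term(fp, fs):
    """sin(2*pi*fp/fs) / (2*pi*fp/fs), i.e. (5.26) at tau = 1/fs, normalised by V^2/12."""
    x = 2.0 * math.pi * fp / fs
    return math.sin(x) / x

def cvdt_power(snr, fp, fs, temp=300.0):
    """Mean switch power (W) for a linear S/N ratio, per (5.29)."""
    return 2.0 * K_B * temp * snr * fp * (1.0 - sinc_term(fp, fs))

def cvdt_snr(pm, fp, fs, temp=300.0):
    """Linear S/N ratio at mean power pm, per (5.31)."""
    return pm / (2.0 * K_B * temp * fp * (1.0 - sinc_term(fp, fs)))

# Nyquist-rate sampling, fp = 0.5 * fs: the sinc term vanishes.
fp, fs = 50e6, 100e6
pm = cvdt_power(1.0e4, fp, fs)        # power for S/N = 40 dB
print(pm, cvdt_snr(pm, fp, fs))
```

At $f_p = 0.5 f_s$ the expressions reduce to $P_m = 2kT\,(S/N)\,f_p$, which is the case compared against (Hosticka, 1985) in the text.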
The capacity can be expressed as
\[
C = f_p \log_2\!\left( 1 + \frac{6}{\pi e}\, \frac{P_m}{2kT f_p}\,
    \frac{1}{\left( 1 - \frac{\sin(2\pi f_p/f_s)}{2\pi f_p/f_s} \right)} \right) \tag{5.32}
\]
5.2.3 Discrete-Value Discrete-Time Circuit
For the case of the parallel M-bit register of Fig. 5.1(d), the signal-to-noise ratio
is a function of the number of bits used. The input samples must be quantized to
fit the finite register length M; herein lies the major source of noise.
Assume that the quantization noise $Q_n$ is uniformly distributed between the
two nearest quantization steps. Let the distance between quantization steps be 1
bit, or one count. Then the average quantization noise power is 1/12 square bits.
The signal power S is also computed assuming a uniformly distributed input
signal with i.i.d. samples. With $2^M$ levels, ranging from 0 to $2^M - 1$,
we have
\[
\begin{aligned}
S &= \sum_{k=1}^{2^M} \frac{1}{2^M}\, k^2 - \left( \sum_{k=1}^{2^M} \frac{1}{2^M}\, k \right)^2 \\
  &= \frac{2^M (2^M + 1)(2 \cdot 2^M + 1)}{6 \cdot 2^M} - \left( \frac{2^M (2^M + 1)}{2 \cdot 2^M} \right)^2 \\
  &= \frac{2^{2M} - 1}{12}
\end{aligned}
\tag{5.33}
\]
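The closed form in (5.33) is easy to confirm with exact rational arithmetic. The sketch below is only a sanity check; the function name is hypothetical.

```python
from fractions import Fraction

def uniform_level_variance(m_bits):
    """Variance of k drawn uniformly from 1..2**m_bits, as in (5.33)."""
    n = 2 ** m_bits
    mean_sq = Fraction(sum(k * k for k in range(1, n + 1)), n)
    mean = Fraction(sum(range(1, n + 1)), n)
    return mean_sq - mean ** 2

# Compare the direct sums with the closed form (2^(2M) - 1) / 12.
for m_bits in range(1, 9):
    closed_form = Fraction(2 ** (2 * m_bits) - 1, 12)
    assert uniform_level_variance(m_bits) == closed_form
```

Shifting the levels from 1..2^M to 0..2^M - 1 changes the mean but not the variance, so the same result applies to the level range stated in the text.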
Combining the last two results, the $S/N$ ratio is
\[
S/N = 2^{2M} - 1 \tag{5.34}
\]
The above computations assume fixed-point, rather than floating-point, arithmetic.
If $f_s$ now represents the signaling rate at which the entire M-bit message
is transmitted, the digital system capacity is
\[
C = M f_s \tag{5.35}
\]
The above equation is correct if we only consider the error introduced by
quantization noise. Should there be any additive noise in the channel, the binary
transmission will exhibit a certain bit error rate $P_e$. If we tried to reconstruct an
analog waveform from the received digital signal, we would find the resulting $S/N$
ratio to be lower.
Suppose that the probability of a single bit error is $P_e$ and that the probability
of more than one error occurring in a single M-bit transmission is negligible. The
expected square distance between the sent and received digital signal, $D^2$, is
\[
\begin{aligned}
D^2 &= P_e 1^2 + P_e 2^2 + \cdots + P_e (2^{M-2})^2 + P_e (2^{M-1})^2 \\
    &= \sum_{k=0}^{M-1} P_e (2^k)^2 = P_e\, \frac{2^{2M} - 1}{3}
\end{aligned}
\tag{5.36}
\]
If we now sum the noise contributions of the quantization step and the distortion
introduced by the M-bit register, the total "noise" is $1/12 + P_e(2^{2M} - 1)/3$.
Assuming the signal power to be approximately unchanged, we have
\[
S/N = \frac{2^{2M} - 1}{1 + 4 P_e (2^{2M} - 1)} \tag{5.37}
\]
In order for the distortion introduced by the M-bit register to be negligible, we
must satisfy the condition $4 P_e (2^{2M} - 1) \ll 1$.
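To see how quickly bit errors erode the quantization-limited S/N, the following sketch evaluates (5.36) and (5.37) for an 8-bit word. The function names and the sample error probabilities are illustrative assumptions.

```python
import math

def expected_square_distance(m_bits, p_e):
    """Term-by-term sum of (5.36)."""
    return sum(p_e * (2 ** k) ** 2 for k in range(m_bits))

def snr_with_bit_errors(m_bits, p_e):
    """Linear S/N of an M-bit word with single-bit error probability p_e, per (5.37)."""
    full_scale = 2 ** (2 * m_bits) - 1
    return full_scale / (1.0 + 4.0 * p_e * full_scale)

m_bits = 8
for p_e in (0.0, 1e-9, 1e-6, 1e-4):
    snr_db = 10.0 * math.log10(snr_with_bit_errors(m_bits, p_e))
    print(p_e, round(snr_db, 2))
```

The degradation becomes noticeable once $4 P_e (2^{2M} - 1)$ is no longer small compared with 1, which is the condition stated above.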
Until now we have followed directly the formulation of Hosticka (Hosticka, 1985)
for the DVDT circuit. Presently, we introduce a weighted probability of error $\epsilon$,
where
\[
\epsilon = 4 P_e (2^{2M} - 1) \tag{5.38}
\]
If $\epsilon$ is held constant, the degradation in $S/N$ ratio due to one processing step, such
as an M-bit register, is held constant, independent of M. In this case the $P_e$ for a
single gate must decrease as the number of bits increases.
Now we wish to relate the probability of error $P_e$ to the power consumption in
the M-bit register. Let us consider a stream of binary symbols with two permissible
states with a voltage separation V. At the receiver we are interested in knowing
whether a pulse of fixed amplitude V is present or not within a certain time interval.
Assuming Gaussian noise of power $\overline{V_n^2}$ which affects both states equally,
and a detection threshold set to $V/2$, the bit error rate is
\[
\begin{aligned}
P_e &= \int_{V/2}^{\infty} \frac{1}{\sqrt{2\pi \overline{V_n^2}}}
       \exp\!\left( -\frac{x^2}{2 \overline{V_n^2}} \right) dx \\
    &= \mathrm{erfc}\!\left( \frac{V}{2 \sqrt{\overline{V_n^2}}} \right)
\end{aligned}
\tag{5.39}
\]
where $\mathrm{erfc}(x) \equiv \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp(-y^2/2)\, dy$.
Note that the complementary error function is sometimes defined in a different manner.
Consider a logic gate consuming no quiescent current and calculate the energy
in an elementary switching event, $W_g$:
\[
W_g = \int_0^{\infty} I_{sw}(t)\, V_{sw}(t)\, dt \tag{5.40}
\]
where $I_{sw}$ and $V_{sw}$ are the current through and voltage across the active switch,
respectively. Let $R_{sw}$ be the small but finite resistance across the active switch
and $C_g$ be the parasitic capacitance of the gate of the next logic gate. Then the
equations for charging the capacitor voltage from 0 to V volts are
\[
\begin{aligned}
I_{sw}(t) &= \frac{V}{R_{sw}} \exp(-t/\tau_{sw}) \\
V_{sw}(t) &= V \exp(-t/\tau_{sw})
\end{aligned}
\tag{5.41}
\]
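where $\tau_{sw} = R_{sw} C_g$. Plugging into the equation for $W_g$ and integrating, we find
\[
W_g = \frac{C_g V^2}{2} \tag{5.42}
\]
The noise power in a single gate is $\overline{V_c^2} = kT/C_g$. Therefore, we have
\[
P_e = \mathrm{erfc}\!\left( \left( \frac{W_g}{2kT} \right)^{1/2} \right) \tag{5.43}
\]
Supposing that the inverse of the complementary error function exists, we have
\[
W_g = 2kT \left[ \mathrm{erfc}^{-1}\!\left( \frac{\epsilon}{4 (2^{2M} - 1)} \right) \right]^2 \tag{5.44}
\]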
If there are M bits in the register, and the clock is operating at the Nyquist
rate, then on average half of them will be switching and half will remain unchanged.
In this case we have
\[
\begin{aligned}
P_m &= \frac{M W_g}{2}\, f_s \\
    &= kT M f_s \left[ \mathrm{erfc}^{-1}\!\left( \frac{\epsilon}{4 (2^{2M} - 1)} \right) \right]^2
\end{aligned}
\tag{5.45}
\]
where $f_s$ is the sampling rate.
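Now the number of bits M relates to the $S/N$ ratio of an analog signal by the
equation
\[
M = \frac{1}{2} \log_2 (1 + S/N) \tag{5.46}
\]
Substituting (5.46) into (5.45) we obtain
\[
P_m = \frac{kT f_s}{2} \log_2 (1 + S/N)
      \left[ \mathrm{erfc}^{-1}\!\left( \frac{\epsilon}{4\, S/N} \right) \right]^2 \tag{5.47}
\]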
Solving for $S/N$ as a function of $P_m$ appears difficult, if not impossible. Therefore,
we will use numerical techniques to solve the above equation for a particular $P_m$.
Similarly, we can write an equation for $P_m$ as a function of the capacity, whereas
the inverse looks very difficult. Substituting $M = C/f_s$, we get
\[
P_m = kT C \left[ \mathrm{erfc}^{-1}\!\left( \frac{\epsilon}{4 (2^{2C/f_s} - 1)} \right) \right]^2 \tag{5.48}
\]
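One possible numerical treatment of (5.47) is sketched below; it is not necessarily the method used to produce the figures in this chapter. The erfc defined after (5.39) is the Gaussian tail probability, which equals half of the standard `math.erfc` evaluated at $x/\sqrt{2}$. Its inverse and the inverse of (5.47) are both obtained by bisection. All function names, search brackets and the 300 K default are assumptions made for the illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def gauss_tail(x):
    """erfc as defined after (5.39): the Gaussian tail probability beyond x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gauss_tail_inv(p, lo=0.0, hi=20.0):
    """Inverse of gauss_tail by bisection, for p between about 1e-88 and 0.5."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gauss_tail(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dvdt_power(snr, fs, eps, temp=300.0):
    """Mean register power (W) for a linear S/N ratio, per (5.47)."""
    x = gauss_tail_inv(eps / (4.0 * snr))
    return 0.5 * K_B * temp * fs * math.log2(1.0 + snr) * x * x

def dvdt_snr(pm, fs, eps, temp=300.0, lo=1.0, hi=1.0e12):
    """Solve (5.47) for S/N at a given power by bisection on a log scale."""
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if dvdt_power(mid, fs, eps, temp) < pm:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

fs, eps = 100e6, 1e-12
pm = dvdt_power(255.0, fs, eps)        # S/N = 2**8 - 1, i.e. M = 4 bits
print(pm, dvdt_snr(pm, fs, eps))
```

Bisection works here because the right-hand side of (5.47) increases monotonically with $S/N$: both the logarithm and the inverse tail function grow as $S/N$ grows.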
The energy dissipated per M-bit digital transmission is just
$P_m/f_s$, which can be computed easily from the above equations. The lower bound
on the amount of energy dissipated in a digital gate was derived by Landauer. He
estimates that dissipation of the order kT per logic step is required owing to
thermodynamic limits (Landauer, 1961). How close we operate to the thermodynamic
limit will directly influence the probability of error.
5.2.4 Discrete-Value Continuous-Time
Since the DVCT circuit of Fig. 5.1(c) was not discussed by Hosticka (Hosticka,
1985), care must be taken in order to adequately formulate the problem. Suppose
there is a binary source. At certain instants in time, the source changes state,
from either 0 to 1, or 1 to 0. The two important pieces of information are the
time between transitions and the direction of the transition. This type of signal
representation is generally referred to as zero-crossings. Logan's theorem states
that if a signal is strictly bandlimited to within one octave, then the signal can be
completely reconstructed from its zero-crossings to within a constant. Therefore,
we have included a noiseless single-octave bandpass filter at the input in Fig. 5.1(c).
In the circuit realization, suppose the source begins in state 0, outputting 0
volts. When the first event occurs, it switches state, outputting V volts. We add
two constraints to the source. The first is that the minimum time between discrete
events is $t_{min}$ seconds. If we like, we can set $t_{min}$ to 0. The second constraint is
that the average time between transitions is $t_{av}$.
We must answer two questions. What is the effect of the noise in the resistor
on the arrival time of the transition? In other words, what is the jitter? Also,
what is the distribution of the source which maximizes the entropy over such a
channel, that is, what is the capacity of this channel?
The major theoretical results of the DVCT channel are found in section 5.4.
They can be summarized as follows:
1. The jitter, or noise in the DVCT circuit can be approximated as Gaussian,
under the constraint V=� >> kT=C.
2. The maximum information rate, measured in bits/second, of any DVCT source
under an average power constraint is attained by a source for which the time
between transitions follows an exponential distribution.
3. Using the entropy-power inequality, a lower bound on the capacity of the
DVCT circuit is derived which becomes tight as the noise power is reduced.
Not coincidentally, the capacity lower bound is reached for the DVCT source
which follows an exponential distribution.
The energy dissipation per transition in the DVCT circuit is
\[
W = \frac{C V^2}{2} \tag{5.49}
\]
as derived for the digital circuit, provided that the transition is complete. The
power dissipation is W divided by the average time between transitions, or
\[
P_m = \frac{C V^2}{2 t_{av}} \tag{5.50}
\]
where $t_{av}$ is the average length of time between transitions.
The signal power is the variance of the input distribution. In order to approach
the lower bound on the capacity of the DVCT circuit, the input must follow an
exponential distribution. Its signal power is
\[
S = \int_{t_{min}}^{\infty} x^2 p_X(x)\, dx - \left( \int_{t_{min}}^{\infty} x\, p_X(x)\, dx \right)^2
  = (t_{av} - t_{min})^2 \tag{5.51}
\]
The noise power is just two times $\overline{t_{d,n}^2}$, as derived in (5.62) in section 5.4, or
\[
N = \sigma^2 = \frac{8kT \tau^2}{C V^2} \tag{5.52}
\]
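So the signal-to-noise ratio is
\[
S/N = \frac{(t_{av} - t_{min})^2}{\sigma^2} = \frac{(t_{av} - t_{min})^2\, C V^2}{8kT \tau^2} \tag{5.53}
\]
Also from section 5.4, a lower bound on the capacity of the DVCT circuit is given by
\[
C'' = \frac{1}{2 t_{av}} \log_2\!\left( 1 + \frac{e}{2\pi}\, S/N \right) \tag{5.54}
\]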
The signal-to-noise ratio can also be written as a function of the power dissipation.
In this case, we have
\[
S/N = \frac{(t_{av} - t_{min})^2\, t_{av}\, P_m}{4kT \tau^2} \tag{5.55}
\]
At this point it is more convenient to convert all units of seconds into frequencies.
Let $f_{av} = 1/t_{av}$, $f_o = 1/(2\pi\tau)$, and $f_{max} = 1/t_{min}$. Then we have
\[
S/N = \frac{(f_{max} - f_{av})^2 f_o^2\, \pi^2 P_m}{kT f_{max}^2 f_{av}^3} \tag{5.56}
\]
Substituting this expression into that of the approximate capacity, we obtain
\[
C'' = \frac{f_{av}}{2} \log_2\!\left( 1 + \frac{e\pi}{2}\,
      \frac{(f_{max} - f_{av})^2 f_o^2\, P_m}{kT f_{max}^2 f_{av}^3} \right) \tag{5.57}
\]
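The change of variables from times to frequencies can be verified numerically. The sketch below evaluates (5.55) and (5.56) at the same operating point and then the capacity bound (5.57). The function names, the 300 K default and the component values are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def dvct_snr_time(pm, t_av, t_min, tau, temp=300.0):
    """Linear S/N from (5.55), with all quantities expressed as times."""
    return (t_av - t_min) ** 2 * t_av * pm / (4.0 * K_B * temp * tau ** 2)

def dvct_snr_freq(pm, f_av, f_max, f_o, temp=300.0):
    """Linear S/N from (5.56), with all quantities expressed as frequencies."""
    num = (f_max - f_av) ** 2 * f_o ** 2 * math.pi ** 2 * pm
    return num / (K_B * temp * f_max ** 2 * f_av ** 3)

def dvct_capacity(pm, f_av, f_max, f_o, temp=300.0):
    """Capacity lower bound C'' in bits/second, per (5.54) and (5.57)."""
    snr = dvct_snr_freq(pm, f_av, f_max, f_o, temp)
    return 0.5 * f_av * math.log2(1.0 + math.e / (2.0 * math.pi) * snr)

# Illustrative operating point with t_min = 4 * tau, as required in section 5.4.1.
tau = 1.0e-9
t_min, t_av = 4.0 * tau, 10.0e-9
f_o, f_max, f_av = 1.0 / (2.0 * math.pi * tau), 1.0 / t_min, 1.0 / t_av
pm = 1.0e-9
snr_t = dvct_snr_time(pm, t_av, t_min, tau)
snr_f = dvct_snr_freq(pm, f_av, f_max, f_o)
print(snr_t, snr_f, dvct_capacity(pm, f_av, f_max, f_o))
```

The two S/N values agree to within floating-point rounding, which confirms the substitution leading from (5.55) to (5.56).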
One restriction of the above analysis is that it does not take into account the
possibility of spurious transitions; i.e., it assumes that the noise signal never
reaches a magnitude of $V/2$. If one wishes to operate at extremely low, or even
negative, signal-to-noise ratios (in dB), the above formulation must be augmented to
account for the possibility of spurious transitions. We believe that they can be
treated in much the same way as a digital error, except that the probability of error
must be integrated over the time interval.
5.3 Graphical Results
In order to compare the formulation of Hosticka with the one presented in this
chapter, we have duplicated two of the most significant graphs from the work
of Hosticka (Hosticka, 1985). We include the parameters chosen in his analysis,
and wherever possible have attempted to choose similar conditions in the present
reformulation. A notable exception is the choice of $\alpha$, where $\alpha$ is defined by the
relation (Hosticka, 1985)
\[
P_e = 2^{-2\alpha M} \tag{5.58}
\]
In displaying his results, Hosticka chooses $\alpha = 100$, which, for a modest 4-bit
register, results in a probability of error for a single gate equal to $1.5 \times 10^{-241}$.
That value does not seem realistic for state-of-the-art digital circuit design.
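The magnitude of this error probability is easy to reproduce. In the sketch below the parameter of (5.58) is written as `alpha` and the weighted probability of error of (5.38) as the return value of `weighted_error`; both names are chosen here for illustration only.

```python
def gate_error_probability(alpha, m_bits):
    """Single-gate error probability of (5.58): 2 ** (-2 * alpha * M)."""
    return 2.0 ** (-2.0 * alpha * m_bits)

def weighted_error(p_e, m_bits):
    """Weighted probability of error of (5.38): 4 * Pe * (2 ** (2M) - 1)."""
    return 4.0 * p_e * (2 ** (2 * m_bits) - 1)

# Hosticka's choice (parameter value 100) applied to a 4-bit register.
p_e = gate_error_probability(100, 4)
print(p_e, weighted_error(p_e, 4))
```

Both quantities remain representable as double-precision floats, although they lie far below any error rate that could be measured in practice.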
On the other hand, we have selected probabilities of error on the order of $10^{-14}$,
which seems more appropriate for digital technology. The parameter $\alpha$ of Hosticka
can be directly related to the weighted probability of error $\epsilon$ introduced in our
work via the equation
\[
\epsilon = 2^{-2\alpha M + 2} (2^{2M} - 1) \tag{5.59}
\]
Figure 5.2: Signal-to-noise ratio S/N (dB) as a function of mean power Pm (W). Results from (Hosticka, 1985). CVCT: solid (fp = 100 MHz); CVDT: dashed (2fp = fs = 100 MHz); DVDT ($\alpha$ = 100, 2fp = fs = 100 MHz): points labeled by number of bits.
Figure 5.3: System capacity (bits/s) as a function of mean power dissipation Pm (W). Results from (Hosticka, 1985). CVCT: solid (fp = 100 MHz); CVDT: dashed (2fp = fs = 100 MHz); DVDT ($\alpha$ = 100, 2fp = fs = 100 MHz): points labeled by number of bits.
5.4 Detailed Analysis of the DVCT Circuit
We have reserved until now some of the detailed analysis of the DVCT circuit
under question. The three main results follow.
5.4.1 Jitter in a DVCT Channel
Assume all nodes in the channel are at 0 V. Let the input source transition from 0
to V at time t = 0. Assuming a noiseless channel, the voltage at the node of the
capacitance is then $V_c(t) = V(1 - \exp(-t/\tau))$. However, the resistor R introduces
noise. A lumped resistor can be viewed as a white Gaussian voltage source in series
with a noiseless resistor. Let $V_{c,n}$ be the noise at node $V_c$. Its spectral shape is
that of a lowpass filter, while its mean-square value is $kT/C$. Using the principle
of superposition, we sum the effect of the two sources together to obtain
\[
V_c(t) = V \left( 1 - \exp\!\left( -\frac{t}{\tau} \right) \right) + V_{c,n} \tag{5.60}
\]
where $\tau = RC$.
Figure 5.4: Re-formulated signal-to-noise ratio S/N (dB) as a function of mean power Pm (W) (fp = 0.5fs = 100 MHz, $\epsilon$ = 1E-12). CVCT: solid; CVDT: dashed; DVDT: points labeled by number of bits.
Let the comparator have a threshold $V/2$. To solve for $t_d$, the time at which
the comparator makes a transition, equate (5.60) with the threshold. We find
\[
t_d = \tau \left[ \ln 2 - \ln\!\left( 1 + \frac{2 V_{c,n}}{V} \right) \right] \tag{5.61}
\]
Using the approximation $\ln(1 + x) \approx x$ for small x, we find that the average value
of $t_d$ is $\tau \ln 2$, while the mean-square noise $\overline{t_{d,n}^2}$ is approximately
\[
\overline{t_{d,n}^2} \approx \frac{4\, \overline{V_{c,n}^2}\, \tau^2}{V^2} = \frac{4kT \tau^2}{C V^2} \tag{5.62}
\]
Note that for this analysis to hold, one must wait at least $4\tau$ for the voltage $V_c$ to
settle to either its low or high state before the source makes the next transition.
Thus, $t_{min} = 4\tau$. If the symbols we wish to communicate are the time intervals
between transitions, the noise power will then be twice $\overline{t_{d,n}^2}$.
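The small-noise approximation behind (5.62) can be checked with a short Monte Carlo experiment. The sketch below draws Gaussian samples of $V_{c,n}$ with variance $kT/C$, applies (5.61) exactly, and compares the sample variance of $t_d$ with the prediction of (5.62). The component values, the number of trials and the fixed seed are illustrative assumptions.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def simulate_jitter(v_step, cap, tau, temp=300.0, n_trials=100_000, seed=1):
    """Draw V_c,n ~ N(0, kT/C) and return the mean and variance of t_d from (5.61)."""
    rng = random.Random(seed)
    sigma_v = math.sqrt(K_B * temp / cap)
    samples = []
    for _ in range(n_trials):
        v_noise = rng.gauss(0.0, sigma_v)
        samples.append(tau * (math.log(2.0) - math.log(1.0 + 2.0 * v_noise / v_step)))
    mean = sum(samples) / n_trials
    var = sum((t - mean) ** 2 for t in samples) / n_trials
    return mean, var

v_step, cap, tau = 0.1, 1.0e-15, 1.0e-9     # 100 mV step, 1 fF, 1 ns
mean_td, var_td = simulate_jitter(v_step, cap, tau)
predicted_var = 4.0 * K_B * 300.0 * tau ** 2 / (cap * v_step ** 2)   # (5.62)
print(mean_td / (tau * math.log(2.0)), var_td / predicted_var)
```

With these values the rms noise voltage is about 2 mV, so $2V_{c,n}/V$ stays small and both printed ratios come out close to 1; a smaller step V or a smaller capacitance would make the linearization less accurate.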
Figure 5.5: Re-formulated system capacity (bits/s) as a function of mean power Pm (W) (fp = 0.5fs = 100 MHz, $\epsilon$ = 1E-12). CVCT: solid; CVDT: dashed; DVDT: points labeled by number of bits.
5.4.2 Differential Entropy for a DVCT Source
A standard definition of differential entropy of a random variable X can be rendered
in units of bits/symbol, as in
\[
h_2(x) = -\int_{-\infty}^{\infty} p_X(x) \log_2 p_X(x)\, dx \tag{5.63}
\]
where $p_X(x)$ is the probability distribution of X. On the other hand, we prefer the
units of differential entropy to be in bits/second. In a discrete-time channel the
conversion is simple: divide $h_2(x)$ by the time each symbol occupies the channel,
which is the reciprocal of the clock rate. For a discrete-value continuous-time
source, the time occupied by each symbol is the symbol itself.
Suppose there is a discrete-value continuous-time source which can take one of
two possible values, 0 or V. The symbols that the source produces are the times
between successive transitions. Each symbol occupies the channel for the length of
that symbol. Thus the average time each symbol occupies the channel is just the
average length of each symbol. Therefore, to convert $h_2(x)$ to units of bits/second,
we divide it by the average length of each symbol. Let $\tilde{h}_2(x)$ denote the differential
entropy of a DVCT source in units of bits/second, where
\[
\tilde{h}_2(x) = \frac{-\int_0^{\infty} p_X(x) \log_2 p_X(x)\, dx}{\int_0^{\infty} x\, p_X(x)\, dx} \tag{5.64}
\]
We impose two physical constraints on our DVCT source. These constraints
actually arise from considerations of the DVCT channel. The first is that
transitions cannot occur less than $t_{min}$ seconds apart. If later we choose to relax this
constraint, we can set $t_{min} = 0$. The second constraint is that the average number
of transitions per unit time is constant. Each transition consumes a certain
amount of energy to charge or discharge the gate capacitance. Thus, fixing the
average number of transitions per unit time is equivalent to fixing the average power
dissipation in the channel. Because the symbols are the times between transitions,
the average number of transitions per unit time is just the reciprocal of the average
symbol length. Let $t_{av}$ be the average symbol length. Then
\[
t_{av} = \int_{t_{min}}^{\infty} x\, p_X(x)\, dx \tag{5.65}
\]
where $p_X(x)$ is a probability density in the range $t_{min}$ to $\infty$.
Combining the constraints with the definition of differential entropy in units of
bits/second, we now pose our problem as
\[
\max_{p_X(x)} \tilde{h}_2(x) = \max_{p_X(x)}
  \frac{-\int_{t_{min}}^{\infty} p_X(x) \log_2 p_X(x)\, dx}{\int_{t_{min}}^{\infty} x\, p_X(x)\, dx} \tag{5.66}
\]
Now we prove that the probability distribution which maximizes the differential
entropy under these constraints is exponential, of the following form:
\[
p_X(x) = \frac{1}{t_{av} - t_{min}} \exp\!\left( -\frac{x - t_{min}}{t_{av} - t_{min}} \right) \tag{5.67}
\]
It satisfies the constraint (5.65).
Let X be a random variable with probability distribution $p_X(x)$, and Y be
a second random variable that follows any probability distribution $q_Y(x)$ in the
region $(t_{min}, \infty)$ and which also satisfies the constraint (5.65). We will compute
the differential entropy of X and then the differential entropy of Y. We then show
that $\tilde{h}_2(y) - \tilde{h}_2(x) \le 0$, with equality if and only if $q_Y(x) = p_X(x)$ at every value
x.
\[
\begin{aligned}
\tilde{h}_2(x)
  &= \frac{-\int_{t_{min}}^{\infty} p_X(x) \log_2 p_X(x)\, dx}{\int_{t_{min}}^{\infty} x\, p_X(x)\, dx} \\
  &= \frac{\int_{t_{min}}^{\infty} p_X(x) \log_2 (1/p_X(x))\, dx}{t_{av}} \\
  &= \frac{\int_{t_{min}}^{\infty} p_X(x) \left[ \ln 2\, \log_2 (t_{av} - t_{min}) + (x - t_{min})/(t_{av} - t_{min}) \right] / \ln 2\; dx}{t_{av}} \\
  &= \frac{\log_2 (t_{av} - t_{min}) + 1/\ln 2}{t_{av}}
   = \frac{\log_2 \left[ e\, (t_{av} - t_{min}) \right]}{t_{av}}
\end{aligned}
\tag{5.68}
\]
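It is easily verified using the same approach as above that
\[
\tilde{h}_2(x) = \frac{\log_2 \left[ e\, (t_{av} - t_{min}) \right]}{t_{av}}
  = \frac{-\int_{t_{min}}^{\infty} q_Y(x) \log_2 p_X(x)\, dx}{\int_{t_{min}}^{\infty} x\, q_Y(x)\, dx} \tag{5.69}
\]
Computing the difference of the differential entropies, we have
\[
\begin{aligned}
\tilde{h}_2(y) - \tilde{h}_2(x)
  &= \frac{-\int_{t_{min}}^{\infty} q_Y(x) \log_2 q_Y(x)\, dx}{\int_{t_{min}}^{\infty} x\, q_Y(x)\, dx}
   - \frac{-\int_{t_{min}}^{\infty} q_Y(x) \log_2 p_X(x)\, dx}{\int_{t_{min}}^{\infty} x\, q_Y(x)\, dx} \\
  &= \frac{\int_{t_{min}}^{\infty} q_Y(x) \log_2 (1/q_Y(x))\, dx}{t_{av}}
   + \frac{\int_{t_{min}}^{\infty} q_Y(x) \log_2 p_X(x)\, dx}{t_{av}} \\
  &= \frac{\int_{t_{min}}^{\infty} q_Y(x) \log_2 (p_X(x)/q_Y(x))\, dx}{t_{av}} \\
  &\le \frac{\int_{t_{min}}^{\infty} q_Y(x) \left( p_X(x)/q_Y(x) - 1 \right) dx}{\ln 2\; t_{av}} = 0
\end{aligned}
\tag{5.70}
\]
where the inequality follows from $\log_2 z \le (z - 1)/\ln 2$,
with equality if and only if $q_Y(x) = p_X(x)$.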
Therefore the differential entropy of a source, under the constraints that the
symbols are longer than $t_{min} \ge 0$ and that the average symbol length is
$t_{av} > t_{min}$, is maximized by an exponential distribution.
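The result can be illustrated numerically. The sketch below evaluates the entropy rate (5.66) for the exponential density (5.67) by midpoint integration, compares it with the closed form (5.68), and contrasts it with a uniform density that has the same minimum and mean symbol length. The uniform comparison density, the function names and the arbitrary time units are assumptions made for the illustration.

```python
import math

def entropy_rate_exponential(t_av, t_min):
    """Closed form (5.68): log2(e * (t_av - t_min)) / t_av."""
    return math.log2(math.e * (t_av - t_min)) / t_av

def entropy_rate_uniform(t_av, t_min):
    """Same measure for a uniform density on [t_min, 2*t_av - t_min] (mean t_av)."""
    return math.log2(2.0 * (t_av - t_min)) / t_av

def entropy_rate_numeric(t_av, t_min, n_steps=200_000):
    """Midpoint-rule evaluation of (5.66) for the exponential density (5.67)."""
    scale = t_av - t_min
    upper = t_min + 40.0 * scale            # neglected tail is of order exp(-40)
    dx = (upper - t_min) / n_steps
    h_sum, mean_sum = 0.0, 0.0
    for i in range(n_steps):
        x = t_min + (i + 0.5) * dx
        p = math.exp(-(x - t_min) / scale) / scale
        h_sum -= p * math.log2(p) * dx
        mean_sum += x * p * dx
    return h_sum / mean_sum

t_av, t_min = 10.0, 4.0     # arbitrary time units
print(entropy_rate_exponential(t_av, t_min), entropy_rate_numeric(t_av, t_min))
print(entropy_rate_uniform(t_av, t_min))
```

The uniform source yields a lower entropy rate because $2 < e$, consistent with the inequality (5.70).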
5.4.3 Approximate Capacity of DVCT Channel
The capacity of the DVCT system with random input symbol X and random
output symbol Y is defined as follows:
\[
C = \max_{p_X} \frac{I(X \wedge Y)}{\overline{X}} \tag{5.71}
\]
where $I(X \wedge Y)$ is the mutual information between X and Y and $\overline{X}$ is the mean
value of X.
Let $Y = X + N$, where N is Gaussian noise (not necessarily white) with mean
zero and variance $\sigma^2$. Then (Shannon, 1948)
\[
\begin{aligned}
I(X \wedge Y) &= h_2(y) - h_2(y|x) \\
              &= h_2(y) - h_2(n)
\end{aligned}
\tag{5.72}
\]
where
\[
h_2(n) = \frac{1}{2} \log_2 \left( 2\pi e \sigma^2 \right) \tag{5.73}
\]
Let the source have a probability distribution $p_X(x)$. We constrain the source so
that its average symbol rate is fixed, as in
\[
t_{av} = \int_0^{\infty} x\, p_X(x)\, dx \tag{5.74}
\]
If we include these properties of the channel into the computation for the channel
capacity, we have
\[
C = \max_{p_X} \frac{h_2(y) - h_2(n)}{t_{av}} \tag{5.75}
\]
Thus, to maximize the capacity of the DVCT channel, we need to maximize the
differential entropy of the output signal Y with respect to the probability density
of X. This maximization is quite difficult to carry out. Therefore, we look for a
lower bound that becomes tight as the power of the noise goes to zero.
The entropy-power inequality states that (Blahut, 1987)
\[
h_2(y) \ge \frac{1}{2} \log_2 \left( 2^{2 h_2(x)} + 2^{2 h_2(n)} \right) \tag{5.76}
\]
Re-writing the right-hand side in order to more easily view the entropy of X, we
have
\[
h_2(y) \ge h_2(x) + \frac{1}{2} \log_2 \left( 1 + 2^{-2 (h_2(x) - h_2(n))} \right) \tag{5.77}
\]
Now we are in a position to define an approximate capacity $C' \le C$ for the
DVCT channel,
\[
C' = \max_{p_X} \frac{h_2(x) - h_2(n) + \frac{1}{2} \log_2 \left( 1 + 2^{-2 (h_2(x) - h_2(n))} \right)}{t_{av}} \tag{5.78}
\]
The third term in (5.78) will in general be small compared to $h_2(x) - h_2(n)$; as
such, we simplify our computation still further. Let $C'' \le C' \le C$ be defined as
\[
C'' = \frac{1}{2 t_{av}} \log_2 \left( 1 + 2^{-2 (h_2(x) - h_2(n))} \right)
      + \max_{p_X} \frac{h_2(x) - h_2(n)}{t_{av}} \tag{5.79}
\]
Now the maximization is just with respect to the differential entropy of X.
Substituting $h_2(x) = \log_2 \left[ e\, (t_{av} - t_{min}) \right]$ from the previous subsection and (5.73), we can
write the approximate capacity for the DVCT channel as
\[
C'' = \frac{1}{2 t_{av}} \log_2 \left( 1 + \frac{e}{2\pi}\, \frac{(t_{av} - t_{min})^2}{\sigma^2} \right) \tag{5.80}
\]
The units of $C''$ are bits/second.
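The step from (5.79) to (5.80) combines two logarithms, which can be confirmed numerically. The sketch below evaluates $C''$ both from the entropies and from the closed form; the function names and the sample values of $t_{av}$, $t_{min}$ and $\sigma$ are illustrative assumptions.

```python
import math

def capacity_from_entropies(t_av, t_min, sigma):
    """C'' from (5.79), using h2(x) of the exponential source and h2(n) from (5.73)."""
    h_x = math.log2(math.e * (t_av - t_min))
    h_n = 0.5 * math.log2(2.0 * math.pi * math.e * sigma ** 2)
    correction = 0.5 * math.log2(1.0 + 2.0 ** (-2.0 * (h_x - h_n)))
    return (h_x - h_n + correction) / t_av

def capacity_closed_form(t_av, t_min, sigma):
    """C'' from the closed form (5.80)."""
    ratio = (t_av - t_min) ** 2 / sigma ** 2
    return math.log2(1.0 + math.e / (2.0 * math.pi) * ratio) / (2.0 * t_av)

t_av, t_min, sigma = 10.0e-9, 4.0e-9, 0.1e-9    # seconds
c_entropy = capacity_from_entropies(t_av, t_min, sigma)
c_closed = capacity_closed_form(t_av, t_min, sigma)
print(c_entropy, c_closed)
```

Writing $A = \frac{e}{2\pi} (t_{av} - t_{min})^2 / \sigma^2$, the two terms of (5.79) become $\frac{1}{2 t_{av}} \left[ \log_2 A + \log_2 (1 + 1/A) \right]$, which equals $\frac{1}{2 t_{av}} \log_2 (1 + A)$, i.e. (5.80).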
Chapter 6
Summary and Future Research
In Chapter 2, we presented a framework for computing the dynamic range of
a CMOS transconductance-C integrator, paying close attention to the topics of
input signal statistics, sources of noise, measures of distortion, and methods for
computing dynamic range. Our as yet unreached goal is to optimize the circuit
realization of an analog filter in terms of its achievable dynamic range, given
constraints on power consumption. Nevertheless, the framework is an important step
in the right direction: with it in hand, it is possible to compare
two filter designs in terms of their achievable dynamic range.
In our analysis, we have assumed that speech signals can be adequately de-
scribed by a normal distribution. Without constraints on the input signal range,
the double-gamma distribution, which is a better description of speech, does not
converge for the integral calculations of section 2.4.1. In future work, we plan to
introduce a clipped version of the double-gamma distribution to alleviate conver-
gence problems.
One might ask which distortion measure is more appropriate for speech
processing. The main advantage of the first mean-square distortion measure is that
it requires no a priori information about the input signal. In addition, it provides a
worst-case scenario since gain error is ignored. The main disadvantage of the first
distortion measure is that it results in a low value for the linear range of nonlinear
functions with convex or concave first derivatives, such as sinh(x) or tanh(x).
The second distortion measure, on the other hand, requires a model of the input
signal. The model, which in our experiments consists of a single-parameter
distribution, is needed to compute the optimal gain factor. The benefit of computing
the optimal gain factor is an improved linear range because gain error is removed.
Most, if not all, commonly used distortion measures either discount gain error, or
at least distinguish it from harmonic and intermodulation distortion error. Thus,
in order to obtain a higher linear range and to be able to make direct comparisons
with other common distortion measures, we favor the second distortion measure
over the first.
Chapter 3 introduces and analyzes several CMOS transconductor designs
operating in the subthreshold region. At least three of them have never been used
in subthreshold design. These three techniques, which are promising for use in
low-power continuous-time filtering applications, are 1) source degeneration using
a single diffusor, 2) source degeneration using double diffusors, and 3) multiple
asymmetric differential pairs. These linearizing schemes offered significantly higher
current efficiency, as compared with the basic transconductor or the transconductor
with source degeneration via diode-connected transistors. The single diffusor
gave the highest linear range (116.8 mV); however, it requires extra common-mode
voltage circuitry. The double diffusors and two asymmetric differential pairs offer
half the linear range, but no common-mode circuitry is required. Finally, three
asymmetric differential pairs gave the highest current efficiency (36.4%) of all the
linearizing schemes discussed in this research, with only a modest increase in the
complexity of the circuit. Its linear range (98.1 mV) is comparable to that of the
single diffusor, whereas it requires no common-mode voltage circuitry.
The optimal scaling ratios of the diffusor circuits, which were m = 0.25 and
m = 0.5, are easy to implement in VLSI. On the other hand, the optimal scaling
ratios for the multiple asymmetric differential pairs were not rational
($m = 2 + \sqrt{3}$ and $m = 4 + \sqrt{15}$). As such, their scaling ratio must be rounded
to the nearest convenient scale factor in a practical design.
One particularly useful property of the subthreshold transconductors analyzed
in this work is that linear range, current efficiency and optimal transistor scaling
are independent of the bias current. In that way, a single layout can be used
repeatedly in a large-scale system that consists of hundreds of transconductors
biased at current levels which vary over several orders of magnitude (Liu et al.,
1992b).
Further work needs to be done to characterize the tolerance of these designs to
structural variability, i.e. mismatch in transistor parameters.
Just as Tanimoto et al. (Tanimoto et al., 1991) extended the technique of two
asymmetric differential pairs to three or more, so we believe that there is an
extension to the technique of source degeneration via double diffusors. The analysis
and optimization of these circuits promises to be very challenging.
Chapter 4 applies the techniques and circuits of the two previous chapters to
the silicon implementation of a proposed cochlear model (Liu, 1992) using analog
very-large-scale integrated circuit technology. The model is static, i.e. linear,
whereas we recognize the need for adaptation even at early stages of processing.
Two properties of a real basilar membrane, which are interrelated, are not
included in our current architecture. The first is that the real cochlea has an
enormous dynamic range of 100 dB or more. In the linear circuits described in
this work, using typical component and parameter values, we estimate a dynamic
range of roughly 40-60 dB for the self-biased transconductance-C integrator. It is
unlikely that further optimizations will increase the dynamic range more than a
few dB unless we are willing to expend enormous amounts of power and area. On
the other hand, one can argue that speech communication is generally performed
over an acoustic level not exceeding 60 dB, in which case the finest linear model
may suffice in real applications.
The second property of a real basilar membrane is that it contains nonlinear
signal processing. In particular, as the amplitude of the acoustic signal increases,
the membrane becomes more stiff, changing its filtering properties dramatically.
Future analysis will attempt to quantify these issues, looking at models and
simulations to qualitatively capture the two properties just mentioned.
Chapter 5 discusses limitations in information processing using continuous
and discrete systems. Our analysis does not consider the power dissipated in the
encoder or decoder. As a case in point, the Brick-Wall noiseless filter of Fig. 5.1(a)
can be viewed as the first step of an analog decoder. Similarly, the quantizer in
Fig. 5.1(d) can be viewed as a very basic digital encoder. Thus, we have ignored
power dissipation outside of the channel.
Additionally, our analysis does not entertain the possibility of applying
error-correcting codes to the digital message before transmitting. With such codes
it would be possible to run the register at a higher clock rate, with much greater
noise, and still have the same capacity. On the other hand, the process of encoding
and decoding introduces major sources of power dissipation. These issues will be
addressed in future research.
A future trend in digital technology is the movement toward multi-level or
multi-valued logic. In other words, digital signals will begin to look surprisingly
more analog, although still maintaining discrete-time. On the other hand, one
could envision the advent of multi-resolution analog processing, whereby an
incoming signal is broken down and processed according to its scale. In the auditory
periphery, for example, there are nerve fibers which are sensitive at low sound
pressure levels and those which are sensitive at high sound pressure levels. Thus,
with low dynamic range (approximately 30 dB) components, one can construct a
system with a much broader dynamic range.
Several circuit techniques have not been explored in detail in this work:
MOSFET-C, current-mode processing, and log-domain processing, to name a few.
In a way, each technique deserves the same consideration as the transconductance-C
technique for implementing low-power continuous-time linear filter banks.
A measure of goodness is proposed for circuit design, bits/sec/watt. It is not
flattering to analog design in general, where dynamic range, not information
capacity, is directly proportional to power dissipation. Taking the logarithm of the
dynamic range means that, in general, low power designs will result in the most
efficient processing schemes.
The problem of optimal extraction of information from sensory signals by "real"
computing hardware in terms of maximum information rate per unit of power
consumed has not been completely resolved in this work. Rather, having experimented
with one transconductor circuit design and one architecture, we leave open the
possibility of future improvements by way of: (1) integrators with higher dynamic
range and/or lower power consumption, (2) enhanced filter architectures, and (3)
non-linear adaptation. In the future, we hope to address these three issues.
Appendix A
MOS Technology
Very-large-scale integrated circuit fabrication facilities are expensive. In recent
years, fabrication of VLSI circuits has become readily accessible to universities and
organizations that have no fabrication facilities of their own. This availability is
accomplished through the MOSIS(TM) Service established by the Defense Advanced
Research Projects Agency (DARPA) and the National Science Foundation (NSF).
MOSIS, located at the University of Southern California in Los Angeles, serves as
a silicon broker which collects a number of relatively small projects from different
organizations and finds a manufacturer that fabricates them on a silicon wafer.
Therefore the overhead fabrication cost, shared by many small projects, is greatly
reduced.
As of 1995, the most common (inexpensive) fabrication process offered by the
MOSIS foundry is the 2-μm CMOS technology (see footnote 1), although 1.2-μm and
0.8-μm processes are available at a slightly higher fee. Most of the circuits and
chips described in this dissertation are fabricated in a low-noise n-well double-poly
BiCMOS process, in which the following devices can be made on the same substrate:
1) n-channel (in p-substrate) and p-channel (in n-well) MOSFETs; 2) vertical NPN
bipolar transistors; 3) capacitors using the two polysilicon layers; and 4)
depletion-mode FETs that can also be used to implement charge-coupled devices.
Footnote 1: The feature size, 2 μm, implies that the minimum channel length of MOSFETs, L, is 2 μm.
Table A.1: List of MOS device parameters and quantities

IDS    drain-to-source current
VGB    gate-to-bulk voltage
VSB    source-to-bulk voltage
VDB    drain-to-bulk voltage
VGS    gate-to-source voltage
VBS    substrate-to-source voltage
Vth    threshold voltage, typically 0.7-1.0 V
Vt     kT/q, thermal voltage, about 26 mV at 300 K
k      Boltzmann's constant, 1.38 x 10^-23 J/K
T      temperature, in K
q      electron charge, 1.6 x 10^-19 C
I0     current coefficient for subthreshold operation
W      effective transistor channel width
L      effective transistor channel length
S      effective transistor channel width-to-length ratio
κ      gate effectiveness measure, typically 0.7
V0     Early voltage, approximately proportional to L
μ0     charge mobility
K'     Cox μ0/2, current coefficient for above-threshold
Cox    gate-oxide capacitance per unit area
Cdep   depletion region capacitance per unit area
A.1 MOS Transistor Model
Fig. A.1(a) shows a cross-sectional view of a CMOS transistor. It has four ter-
minals, the source, drain, gate, and bulk. The structure of a MOSFET is usually
symmetric, i.e. the assignment of source and drain is determined by the applied
voltages in the circuit. Fig. A.1(b) gives the symbol for an nMOS transistor,
with all four terminals. The MOS device drain-to-source current can be written
as a function F of the terminal voltages, with a general functional form for the
current-voltage relationship, valid for all regions of operation, given by:
$$I_{DS} \propto F(V_{GB}, V_{SB}) - F(V_{GB}, V_{DB}) \tag{A.1}$$
This functional form was first introduced by Meyer (Meyer, 1971) and is also
discussed in (Tsividis, 1987). For an n-type device, F is a nonnegative, monoton-
ically increasing function of VGB and a monotonically decreasing function of VSB
(or VDB).
[Figure A.1 graphic: (a) cross-section with Drain, Source, Gate, and Bulk labeled; (b) symbol with terminals G, S, D, and B.]
Figure A.1: (a) View of an nMOS transistor on the substrate and (b) symbol.
Two modes of operation are possible for MOS transistors, subthreshold and
above-threshold. The threshold voltage Vth, typically 0.7 to 1.1 V, is the gate voltage
above which mobile charge is induced in the transistor channel, and below which
the channel current results from charges jumping over the energy barrier formed by
the gate. That is, the subthreshold conduction mechanism is diffusion, as opposed
to drift in the above-threshold region. The region between subthreshold and
above-threshold operation is often referred to as the transition region, where both
drift and diffusion currents are nonnegligible. For a typical MOSFET of square
geometry (W/L = 1), the subthreshold region is defined for channel currents below
10 to 100 nA.
An equation for an MOS device operating above threshold is
$$I_{DS} = \frac{K'}{2} S \left[ \max\left(0, [V_{GS} - V_{TH}(V_{BS})]\right)^2 - \max\left(0, [V_{GD} - V_{TH}(V_{BD})]\right)^2 \right] \tag{A.2}$$
where K' = Cox μ0/2 is the above-threshold current coefficient. This equation
shows explicitly the symmetry of the output current, as the difference of two
quadratics. In the region that one of the terms is zero, the device is said to
be in saturation. In the region where both terms are nonzero, the device is said to
be in the ohmic region. These regions are not discontinuous.
In the subthreshold region, a further factorization of F is possible (Boahen and
Andreou, 1992):
$$I_{DS} \propto G(V_{GB}) \left[ H(V_{SB}) - H(V_{DB}) \right] \tag{A.3}$$
where G and H are exponential functions. This equation shows that the source-
driven and drain-driven components are controlled independently by VSB and VDB.
And VGB controls both components in a symmetric and multiplicative manner. In
this mode of operation the MOS transistor has been called a diffusor (Boahen
and Andreou, 1992), analogous to the variable conductance electrical junctions in
biological systems.
An expression for the current in an nMOS transistor operating in subthreshold
can thus be written as:
$$I_{DS} = I_0 S e^{\kappa V_{GB}/V_t} \left[ e^{-V_{SB}/V_t} - e^{-V_{DB}/V_t} \right] \tag{A.4}$$
The terminal voltages VGB, VSB, and VDB are referenced to the substrate. The constant
I0 depends on the mobility (μ0) and other physical properties of silicon. S is a
geometry factor, the width W to length L ratio of the device. Current through a
pMOS device is given by:
$$I_{SD} = I_0 S e^{-\kappa V_{GB}/V_t} \left[ e^{V_{SB}/V_t} - e^{V_{DB}/V_t} \right] \tag{A.5}$$
It is not guaranteed that I0 and κ be the same for both pMOS and nMOS devices.
Variation in I0 has been extensively studied (Pavasovic et al., 1991), while variation
in κ is not so well documented.
The parameter κ is defined as
$$\kappa = \frac{C_{ox}}{C_{ox} + C_{dep}} \tag{A.6}$$
The physical significance of κ is apparent if the observation is made that the oxide
and depletion capacitances form a capacitive divider between the gate and bulk
terminals that determines the surface potential. The parameter κ takes values
between 0.6 and 0.9.
If we define two currents IF and IR, the forward and reverse currents,
respectively, such that
$$\begin{aligned} I_F &= I_0 S \exp(\kappa V_{GB}/V_t) \exp(-V_{SB}/V_t) \\ I_R &= I_0 S \exp(\kappa V_{GB}/V_t) \exp(-V_{DB}/V_t) \end{aligned} \tag{A.7}$$
then we have
$$I_{DS} = I_F - I_R \tag{A.8}$$
IF and IR can be written in the form:
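$$\begin{aligned} I_F &= I_0 S \exp((1-\kappa) V_{BS}/V_t) \exp(\kappa V_{GS}/V_t) \\ I_R &= I_0 S \exp((1-\kappa) V_{BD}/V_t) \exp(\kappa V_{GD}/V_t) \end{aligned} \tag{A.9}$$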
These equations show explicitly the dependence on VBS and VBD in which the role
of the bulk is as a back-gate. These equations are useful for devices which operate
as gate-controlled conductors, or diffusors (Boahen and Andreou, 1992).
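The equivalence of the bulk-referenced form (A.7) and the source/drain-referenced form (A.9) can be checked numerically. The following is a minimal sketch: the value of I0 (1e-15 A), the unit geometry factor, and the terminal voltages are hypothetical, chosen only for illustration, while κ = 0.7 and Vt = 26 mV follow Table A.1.

```python
import math

V_T = 0.026  # thermal voltage kT/q in volts (about 26 mV at 300 K, Table A.1)

def currents_bulk_referenced(v_gb, v_sb, v_db, i0=1e-15, s=1.0, kappa=0.7):
    """Forward and reverse currents from equation (A.7)."""
    gate_term = i0 * s * math.exp(kappa * v_gb / V_T)
    i_f = gate_term * math.exp(-v_sb / V_T)
    i_r = gate_term * math.exp(-v_db / V_T)
    return i_f, i_r

def currents_source_drain_referenced(v_gs, v_gd, v_bs, v_bd, i0=1e-15, s=1.0, kappa=0.7):
    """Forward and reverse currents from equation (A.9)."""
    i_f = i0 * s * math.exp((1 - kappa) * v_bs / V_T) * math.exp(kappa * v_gs / V_T)
    i_r = i0 * s * math.exp((1 - kappa) * v_bd / V_T) * math.exp(kappa * v_gd / V_T)
    return i_f, i_r

# Hypothetical terminal voltages in volts, all measured against a common ground.
v_g, v_s, v_d, v_b = 0.6, 0.1, 0.3, 0.0
i_f_a, i_r_a = currents_bulk_referenced(v_g - v_b, v_s - v_b, v_d - v_b)
i_f_b, i_r_b = currents_source_drain_referenced(v_g - v_s, v_g - v_d, v_b - v_s, v_b - v_d)
i_ds = i_f_a - i_r_a  # equation (A.8)
print(f"I_F = {i_f_a:.3g} A, I_R = {i_r_a:.3g} A, I_DS = {i_ds:.3g} A")
```

Both functions return the same currents because κ VGB - VSB = κ VGS + (1 - κ) VBS, which is the algebraic step linking (A.7) to (A.9).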
A small-signal model derived from IF and IR can be constructed in such a way
as to preserve the symmetry between the source and drain, as in Fig. A.2(a), where
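$$\begin{aligned} g_{mf} &\equiv \frac{\partial I_F}{\partial V_{GS}} = \frac{\kappa}{V_t} I_F, & g_{mbf} &\equiv \frac{\partial I_F}{\partial V_{BS}} = \frac{(1-\kappa)}{V_t} I_F \\ g_{mr} &\equiv \frac{\partial I_R}{\partial V_{GD}} = \frac{\kappa}{V_t} I_R, & g_{mbr} &\equiv \frac{\partial I_R}{\partial V_{BD}} = \frac{(1-\kappa)}{V_t} I_R \end{aligned} \tag{A.10}$$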
For an nMOS device that is biased with VDS ≥ 5Vt, i.e. in saturation, IF >> IR
and the drain current is given approximately by:
$$I_{DS} = I_F = I_0 S \exp(-(1-\kappa) V_{SB}/V_t) \exp(\kappa V_{GS}/V_t) \tag{A.11}$$
This equation is most often used for circuit designs where devices operate in
saturation as transconductance amplifiers. However, channel-length modulation (the
Early effect), which we have completely ignored thus far, becomes significant in
saturation. As such, the device equation can be augmented with:
$$I_{DS} = I_{n0} S_n \exp(-(1-\kappa_n) V_{SB}/V_t) \exp(\kappa_n V_{GS}/V_t) (1 + V_{DS}/V_0) \tag{A.12}$$
where V0 is the Early voltage. A small-signal model of an nMOS device in satura-
tion includes only the small-signal forward current parameters, with the addition
of:
$$g_d \equiv \frac{\partial I_{DS}}{\partial V_{DS}} \approx \frac{1}{V_0} I_{DS} \tag{A.13}$$
the output conductance. It can be seen in Fig. A.2(b).
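A short numerical sketch of the saturation model may help fix the magnitudes involved. It evaluates equation (A.12) and the small-signal parameters of (A.10) and (A.13) at one bias point. The values of I0 (1e-15 A) and of the Early voltage (20 V) are hypothetical; κ = 0.7 and Vt = 26 mV follow Table A.1.

```python
import math

V_T = 0.026  # thermal voltage kT/q in volts (Table A.1)

def i_ds_saturation(v_gs, v_sb, v_ds, i0=1e-15, s=1.0, kappa=0.7, v0=20.0):
    """Equation (A.12): subthreshold saturation current with the Early effect."""
    return (i0 * s * math.exp(-(1 - kappa) * v_sb / V_T)
            * math.exp(kappa * v_gs / V_T) * (1 + v_ds / v0))

def small_signal(v_gs, v_sb, v_ds, kappa=0.7, v0=20.0):
    """Forward small-signal parameters at a bias point, in siemens."""
    i_ds = i_ds_saturation(v_gs, v_sb, v_ds, kappa=kappa, v0=v0)
    g_m = kappa * i_ds / V_T          # gate transconductance, from (A.10)
    g_mb = (1 - kappa) * i_ds / V_T   # back-gate transconductance, from (A.10)
    g_d = i_ds / v0                   # output conductance, from (A.13)
    return i_ds, g_m, g_mb, g_d

i_ds, g_m, g_mb, g_d = small_signal(v_gs=0.5, v_sb=0.0, v_ds=1.0)
print(f"I_DS = {i_ds:.3g} A, g_m/g_d = {g_m / g_d:.1f}")
```

In this model the ratio g_m/g_d equals κ V0/Vt and is independent of the bias current. The expression for g_d is approximate: differentiating (A.12) exactly gives IDS/(V0 + VDS), which approaches IDS/V0 when VDS is small compared with V0.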
The noise in a subthreshold MOS transistor can be reasonably well modeled
as a bias-dependent shot noise having a flat spectrum. Recently, the intimate
relationship between thermal noise and shot noise has been illuminated by
Landauer (Landauer, 1993). In particular, he shows that shot noise and thermal noise
are special limits of a more general noise formula. We are therefore not excluding
any fundamental noise effects by treating shot noise alone.
The shot noise in a MOS transistor has two independent components, forward
and reverse. They have one-sided power spectra given by (Sarpeshkar et al.,
1993; Tsividis, 1987)
$$\begin{aligned} S_{if,sh}(\omega) &= 2 q I_F \\ S_{ir,sh}(\omega) &= 2 q I_R \end{aligned} \tag{A.14}$$
[Figure A.2 graphic: two small-signal schematics, (a) and (b), with terminals vg, vs, vd, and vb. Labels include gd, gmf, gmbf, gmr, gmbr, gm, gmb, and the noise sources Sif = 2qIF, Sir = 2qIR, and Si = 2qIDS.]
Figure A.2: MOS small-signal subthreshold model including sources of shot noise only, (a) as a diffusor, and (b) in saturation.
This type of noise has been experimentally confirmed for subthreshold currents up
to 100 pA (Sarpeshkar et al., 1993) in large-area square devices. The small-signal
model of Fig. A.2(a) includes these two noise sources. In saturation, only Sif is
significant, as depicted in Fig. A.2(b).
For sub-threshold currents between 1 nA and 100 nA, it appears necessary to
include the effects of flicker noise for mid-to-low frequencies. A model for flicker
noise which is referenced to the gate voltage is given by (Vittoz, 1994):
$$S_{v,f}(\omega) = \frac{4kT\rho}{WL} \frac{2\pi}{\omega} \tag{A.15}$$
where ρ is a process-dependent parameter. According to Vittoz the parameter ρ
is often larger for nMOS than for pMOS transistors, and may range between 0.02
and 2 F/m^2. A model for flicker noise referenced to the output current is given
by (Tsividis, 1987):
$$S_{i,f}(\omega) = \frac{M g_m^2}{C'_{ox} W L} \frac{2\pi}{\omega} \tag{A.16}$$
where gm is the forward transconductance in saturation, C'ox the gate capacitance
per unit area, and M a process-dependent constant with units of Joules. The two
models are related through the relation 4kT/ρ = M/C'ox. Something which is not
clear from these two models is the effect of flicker noise in the region that IF ≈ IR.
It is possible that these two noise currents may be correlated and at least partially
cancel one another.
Assuming shot and flicker noise are independent, a complete noise model for a
transistor operating in subthreshold saturation is
$$S_i(\omega) = S_{i,sh}(\omega) + S_{i,f}(\omega) \tag{A.17}$$
Fig. A.3 shows the power spectral density for a PMOS transistor, where W =
1148 μm and L = 4 μm. The model is given by solid lines; the data are marked by
x's. The three curves correspond to nominal current values of (a) 1 nA, (b) 10 nA,
and (c) 100 nA for an equivalent square device. One free process-dependent
parameter M is used to model the flicker noise. Note that, at low enough current
levels, flicker noise cannot be detected within the audio frequency range. This
property is seen for curve (a) of Fig. A.3 in which there is little evidence of flicker
noise for frequencies above 50 Hz.
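The dependence of the flicker-noise corner on bias current can be sketched from equations (A.14), (A.16), and (A.17). The example below uses W, L, κ, and M as quoted for Fig. A.3. The oxide capacitance per unit area (1.5e-3 F/m^2) is an assumed order-of-magnitude value chosen for illustration, not a figure taken from this work, and the function names are placeholders.

```python
import math

Q = 1.6e-19   # electron charge in coulombs (Table A.1)
V_T = 0.026   # thermal voltage in volts (Table A.1)

def shot_psd(i_ds):
    """Equation (A.14) in saturation: one-sided PSD in A^2/Hz."""
    return 2 * Q * i_ds

def flicker_psd(freq_hz, i_ds, width, length, m, c_ox, kappa=0.7):
    """Equation (A.16) with omega = 2*pi*f, so that 2*pi/omega = 1/f."""
    g_m = kappa * i_ds / V_T
    return m * g_m ** 2 / (c_ox * width * length * freq_hz)

def total_psd(freq_hz, i_ds, width, length, m, c_ox, kappa=0.7):
    """Equation (A.17): independent shot and flicker contributions add."""
    return shot_psd(i_ds) + flicker_psd(freq_hz, i_ds, width, length, m, c_ox, kappa)

def corner_frequency(i_ds, width, length, m, c_ox, kappa=0.7):
    """Frequency in Hz at which the flicker PSD equals the shot PSD."""
    g_m = kappa * i_ds / V_T
    return m * g_m ** 2 / (c_ox * width * length * shot_psd(i_ds))

W, L, M = 1148e-6, 4e-6, 4.0e-26   # geometry in meters and M in joules (Fig. A.3)
C_OX_ASSUMED = 1.5e-3              # F/m^2, assumed for illustration only

for i_square in (1e-9, 10e-9, 100e-9):   # current of an equivalent square device
    i_total = i_square * W / L
    f_c = corner_frequency(i_total, W, L, M, C_OX_ASSUMED)
    print(f"{i_square * 1e9:.0f} nA per square: corner near {f_c:.3g} Hz")
```

Because g_m is proportional to the current in subthreshold, the corner frequency in this model scales linearly with bias current, which is consistent with flicker noise dropping out of the audio band at low current levels.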
[Figure A.3 graphic: log-log plot of PSD (A/Hz^1/2, 10^-13 to 10^-11) versus frequency (Hz, 10^2 to 10^4), with curves labeled (a), (b), and (c).]
Figure A.3: Noise data taken from a PMOS transistor with W/L = 1148/4. Solid lines are noise model, x's are data. Curve (a) corresponds to 1 nA for an equivalent square device, (b) 10 nA, and (c) 100 nA. (κ = 0.7, Cox = 1500 F/m^2, and M = 4.0E-26 J.)
A.2 Other Monolithic Elements
Passive resistors and capacitors can be realized on the chip using various layers
already available in the fabrication process.
Resistors can be realized on chip using either the polysilicon or the diffusion
layer. For a typical process, the sheet resistance is about 20 to 25 Ω/square for
polysilicon and 30 to 60 Ω/square for diffusion. Even though these resistors can be
relatively well-matched, the designer has no control over the exact resistance value. Therefore,
a reasonable circuit design should not depend on the absolute resistance of any
on-chip resistor.
Capacitors are easily implemented using the thin oxide layer sandwiched between
the two polysilicon layers in a double-poly process. Typical capacitance is
about 0.5 fF/μm^2 and is usually well-matched. It should be mentioned, however,
that the bottom plate (the first polysilicon layer) has about 0.05 fF/μm^2 of capacitance
to the substrate. It is thus desirable to design circuits in which the bottom plates
are connected to a low-impedance node, or to use only grounded capacitors. In
situations where neither of the above is possible, a bootstrapping technique may
be necessary to cancel this parasitic capacitance.
No practical inductors can be made on the chip, but as we have seen in an
earlier chapter, an active inductor can be implemented using a gyrator (generalized
impedance converter) and a capacitor.
Other devices not mentioned in this appendix are available. Most notable are
vertical NPN bipolar transistors, lateral PNP transistors, photo-diodes and
photo-transistors, junction field-effect transistors, and more. The author regrets only the
lack of an isolated diode (using n-doped and p-doped polysilicon, for example) and
the lack of a true twin-tub process, whereby both nMOS and pMOS transistors
reside in floating wells. If one is willing to go to the extra effort of post-processing
ICs, more elaborate electro-mechanical devices can also be integrated onto the
same chip.
Appendix B
Cochlear Experimental Setup
B.1 Experiments with the Hopkins Electronic EAR
B.1.1 Abstract
We have developed hardware and software for continuous long-term recordings
from the Hopkins Electronic EAR (HEEAR), an analog VLSI model of the auditory
periphery designed in our laboratory (Liu et al., 1992b). Figure B.1 shows the
experimental setup. Previously recorded audio signals are used to stimulate the
cochlear model. These signals can be downloaded to the first PC's hard disk over
an Ethernet link. The PC converts the data file into analog values using a
digital-to-analog converter module. The only limit to the length of the input file is the
size of the first PC's hard disk. The second PC can store up to 30 minutes of
multi-channel analog signals from the model, sampling either 32 signals at 12 kHz
or 16 signals at 24 kHz. The recorded data are sent back to a workstation for
further analysis over a second ethernet link. All processing is performed by the
HEEAR chip set in real-time, consuming less than 25 milliwatts of power, including
external potentiometers used to set the parameters of the hardware model. We
have designed the experimental setup with the goal of processing four half-hour
segments from a standard database using the silicon cochlea. The outputs of the
HEEAR chips will be used by another research group to train and test a large
vocabulary speech recognition system.
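The recording capacity quoted above can be checked with simple arithmetic. The sketch below assumes two bytes per stored sample, which is an assumption made here for illustration (the storage format is not stated in this section), and compares the result with the 2.2 GByte disk shown for the second PC in Figure B.1.

```python
SECONDS = 30 * 60       # one half-hour segment
BYTES_PER_SAMPLE = 2    # assumption: each sample stored as a 16-bit word

def recording_bytes(channels, sample_rate_hz, seconds=SECONDS,
                    bytes_per_sample=BYTES_PER_SAMPLE):
    """Raw storage needed for a multi-channel recording, in bytes."""
    return channels * sample_rate_hz * seconds * bytes_per_sample

size_32 = recording_bytes(32, 12_000)   # 32 signals at 12 kHz
size_16 = recording_bytes(16, 24_000)   # 16 signals at 24 kHz
print(f"{size_32 / 1e9:.2f} GB and {size_16 / 1e9:.2f} GB")
```

Both configurations have the same aggregate rate of 384,000 samples per second, so under the two-byte assumption each half-hour recording needs the same amount of disk space.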
We present preliminary results from a series of experiments which are being
conducted using the Hopkins Electronic EAR. In our first study, the silicon cochlea
is stimulated using tone bursts with amplitudes which vary over one order of magni-
tude, in order to demonstrate the properties of adaptation and signal compression
characteristic of auditory processors. The output of the cochlea is also studied
in response to pure tones with and without additive white noise. In a third ex-
periment, speech segments of one male and one female speaker are taken from a
standard database. The clean speech is degraded with successively larger amounts
of band-limited white noise, to obtain signal-to-noise ratios of 30 dB, 20 dB, 12 dB,
6 dB, and 0 dB. Using images similar to the neurogram (Secker-Walker and Searle,
1990), our intention is to give a qualitative demonstration of the HEEAR chips'
ability to process speech robustly in the presence of noise.
[Figure B.1 graphic: block diagram. Labels: 486 PC with 160 MByte hard disk and Ethernet link; DT2821 D/A converter; speaker; microphone; pre-amplifier; silicon cochlea (basilar membrane; hair cells and synapses); current-to-voltage converters; anti-aliasing filter; buffer amplifiers; custom interface; DT2839 32-channel A/D converter; 486 PC with 2.2 GByte hard disk and Ethernet link. Signal paths after the cochlea are marked 31. Sample screen text: "Processing file ... cello.dd", "Sampling rate ... 48000".]
Figure B.1: Experimental setup for simultaneously stimulating and recording from the HEEAR chip set. One PC outputs a previously recorded speech signal via the D/A converter module. The analog speech signal is attenuated before being presented to the silicon basilar membrane. Thirty-one output channels are fed into independent hair-cell synapse circuits. The outputs from the HEEAR chip set are digitized and stored on a second PC after passing through a custom analog interface. Synchronization is achieved by recording the input signal along with the 31 output channels. Depending on the application, a microphone can be connected directly to the input of the pre-amplifier.
B.1.2 Preliminary Results with the HEEAR Chip Set
Three series of experiments have been conducted in order to give a qualitative
demonstration of the HEEAR chip set as an auditory processor. In the near future
we plan to process speech from a standard database using the silicon cochlea. The
outputs of the HEEAR chips will be used by another research group to train and
test a large vocabulary speech recognition system.
Experiment I: Tone Bursts of Varying Amplitudes
The silicon cochlea is stimulated using tone bursts of increasing amplitude. From
Figs. B.2 and B.3, we note that as the stimulus input increases, the onset of the
response increases. However, the steady-state response (after 8 msec) is compressed
at the higher input amplitudes. The time scale over which the adaptation process
takes place is roughly 3-5 msec.
Experiment II: Sinusoids in Noise
The output of the silicon cochlea is recorded in response to steady-state sinusoids
with additive white noise (flat, 0-8 kHz). As shown in Figs. B.4 and B.5, the shape
of the silicon cochlea response does not appear to be greatly disturbed, even at a
signal-to-noise ratio of 0dB.
Experiment III: Male and Female Speech in Noise
Acoustically similar speech segments of one male and one female speaker are taken
from a standard database. The syllable chosen, /jh er/ comes from the word
'adjourned'. The clean speech is degraded with successively larger amounts of
white noise, with a minimum signal-to-noise ratio of 0dB. Figs. B.6 and B.7 show
results for the male speaker, while Figs. B.8 and B.9 are for the female speaker.
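One way to degrade a clean signal to a prescribed signal-to-noise ratio is to scale the noise so that the ratio of average powers matches the target. The sketch below is a hedged illustration, not the procedure used in these experiments: it assumes SNR is defined as the ratio of mean-square values in dB, uses a 200 Hz sinusoid as a stand-in for speech, and does not band-limit the noise. All names and the sampling rate are hypothetical.

```python
import math
import random

def mean_power(samples):
    """Mean-square value of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, defined here as a ratio of mean powers."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))

def scale_noise_for_snr(signal, noise, target_snr_db):
    """Return a scaled copy of noise that yields the target SNR against signal."""
    gain = math.sqrt(mean_power(signal) / (mean_power(noise) * 10 ** (target_snr_db / 10)))
    return [gain * n for n in noise]

random.seed(0)
SAMPLE_RATE = 16_000                       # hypothetical sampling rate in Hz
signal = [math.sin(2 * math.pi * 200 * k / SAMPLE_RATE) for k in range(1600)]
noise = [random.gauss(0.0, 1.0) for _ in signal]

SNR_LEVELS_DB = (30, 20, 12, 6, 0)         # levels listed in Section B.1.1
noisy = {
    level: [s + n for s, n in zip(signal, scale_noise_for_snr(signal, noise, level))]
    for level in SNR_LEVELS_DB
}
```

Each entry of `noisy` holds the same clean waveform with progressively more noise added, down to equal signal and noise power at 0 dB.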
[Figure B.2 graphic: "Silicon Cochlea Response to 1.0 kHz Tone Burst at 1/4 Full-Scale". Horizontal axis: Time (0 to 0.04); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.2: Silicon cochlea response to a 1 kHz tone burst at 1/4 full scale. The characteristic frequencies of the output channels are shown above half of the traces. Only one channel (668 Hz) appears to be adapting to the stimulus.
[Figure B.3 graphic: "Silicon Cochlea Response to 1.0 kHz Tone Burst at 1/2 Full-Scale". Horizontal axis: Time (0 to 0.04); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.3: Silicon cochlea response to a 1 kHz tone burst at 1/2 full scale. The characteristic frequencies of the output channels are shown above half of the traces. The response of one channel (668 Hz) is high during the first three cycles of the tone burst, but reduces to roughly one half its initial value by the tenth cycle.
[Figure B.4 graphic: "Silicon Cochlea Response to 200 Hz Tone 1/4 Full-Scale RMS 12 dB SNR". Horizontal axis: Time (0 to 0.025); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.4: Silicon cochlea response to a 200 Hz tone at 1/4 full scale RMS and 12 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces.
[Figure B.5 graphic: "Silicon Cochlea Response to 200 Hz Tone 1/4 Full-Scale RMS 0 dB SNR". Horizontal axis: Time (0 to 0.025); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.5: Silicon cochlea response to a 200 Hz tone at 1/4 full scale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces.
[Figure B.6 graphic: "HEEAR Response to Male /jh er/ at 1/5 Full-Scale RMS and 6 dB SNR". Horizontal axis: Time (0 to 0.16); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.6: Silicon cochlea response to male token of /jh er/ at 1/5 full scale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/.
[Figure B.7 graphic: "HEEAR Response to Male /jh er/ at 1/5 Full-Scale RMS and 0 dB SNR". Horizontal axis: Time (0 to 0.16); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.7: Silicon cochlea response to male token of /jh er/ at 1/5 full scale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ appears to be buried in the noise.
[Figure B.8 graphic: "HEEAR Response to Female /jh er/ at 1/5 Full-Scale RMS and 6 dB SNR". Horizontal axis: Time (0 to 0.16); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.8: Silicon cochlea response to female token of /jh er/ at 1/5 full scale RMS and 6 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. A high-frequency burst marks the release of /jh/.
[Figure B.9 graphic: "HEEAR Response to Female /jh er/ at 1/5 Full-Scale RMS and 0 dB SNR". Horizontal axis: Time (0 to 0.16); vertical axis: Counts (Offsets Added) (-200 to 500). Traces labeled 6.91 kHz, 3.85 kHz, 2.15 kHz, 1.20 kHz, 668 Hz, 372 Hz, 208 Hz, 116 Hz, and input.]
Figure B.9: Silicon cochlea response to female token of /jh er/ at 1/5 full scale RMS and 0 dB SNR. The characteristic frequencies of the output channels are shown above half of the traces. The consonant /jh/ is barely discernible in the noise.
B.2 Harmonic and Intermodulation Distortion
As stated in chapter 2, the second distortion measure computes harmonic and
intermodulation distortions for a sinusoidal input signal. An alternate method for
computing these distortions is to expand the output signal in a power series and
then use trigonometric identities to "fold" it back up.
We would like to point out a potentially useful property of intermodulation
distortion in speech processing. Suppose that we are given a speech signal for which
the fundamental frequency is absent. A good example is a male voice transmitted
over a telephone line. When human subjects listen to the received speech signal,
they have no trouble inferring the missing fundamental frequency. One manner
in which the detection of the missing frequency might take place is by deliberate
nonlinear processing in which the fundamental frequency arises as one of the main
distortion products. For example, if two tones were used as input to the self-
biased transconductor with harmonic integers n = 2 and m = 3, at a large enough
signal level, the output current would contain a discernible distortion product at
harmonic integer |2n - m| = 1, i.e., the fundamental. Evidence to support this
type of processing in the inner ear of the mammal is the appearance of small
peaks in the synchrony of the auditory nerve fibers at characteristic frequencies
of 2F1, 3F1, and (2F1 - F2), where F1 and F2 are the first and second formant
frequencies (Young and Sachs, 1979).
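The appearance of the missing fundamental as a distortion product can be illustrated with the power-series approach. The sketch below drives a memoryless cubic nonlinearity, y = x + c3 x^3, with equal-amplitude tones at harmonics 2 and 3 and measures the component at harmonic 1. The cubic characteristic and its coefficient are stand-ins assumed for illustration; they are not the characteristic of the self-biased transconductor, and the function names are hypothetical.

```python
import math

def harmonic_amplitude(samples, harmonic):
    """Amplitude of one harmonic, from samples spanning exactly one period."""
    n = len(samples)
    c = sum(y * math.cos(2 * math.pi * harmonic * k / n) for k, y in enumerate(samples))
    s = sum(y * math.sin(2 * math.pi * harmonic * k / n) for k, y in enumerate(samples))
    return 2 * math.hypot(c, s) / n

def two_tone_response(amplitude, c3=-1.0 / 3.0, n_samples=512):
    """Output of y = x + c3*x**3 for tones at harmonics n = 2 and m = 3."""
    out = []
    for k in range(n_samples):
        theta = 2 * math.pi * k / n_samples
        x = amplitude * (math.cos(2 * theta) + math.cos(3 * theta))
        out.append(x + c3 * x ** 3)
    return out

fundamental = harmonic_amplitude(two_tone_response(1.0), 1)
print(f"component at |2n - m| = 1: {fundamental:.4f}")
```

Expanding the cube with trigonometric identities gives a term (3/4) c3 A^3 cos(θ) at the difference frequency 2n - m, so the product grows with the cube of the input amplitude and is negligible at small signal levels.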
B.3 Current-to-Voltage Converter
It is our experience that current-to-voltage conversion is not as straightforward as
voltage-to-current conversion. After four revisions, we finally adopted the design
which is shown in Fig. B.10. During the design phase, we were unable to find a
simulation program that could accurately predict the performance for these circuits.
Most notably, one reputable simulation program predicted a bandwidth of
400 kHz for these devices; we measured it as 40 kHz.
The operation of the circuit is as follows. The power supplies are set to Vdd =
2.5 V and Vss = -2.5 V. The voltage Vbias sets the bias current for the two amplifiers.
Its nominal value is Vdd - 0.95 V. The voltage Voffst establishes an offset current
in the current-to-voltage converter so that bipolar currents can be measured, with
the only nuisance being a DC offset at the output voltage. Its nominal value is
Vdd - 0.85 V. The voltage Vref establishes the clamping voltage for the input current.
In our setup we set this to GND = 0 V. The input current is labeled Iin. Its
nominal range is 0 to 50 nA. The voltage Vgain provides an additional current gain
of exp[(Vgain - Vref)/Vt]. Because this gain depends exponentially on the voltage,
Vgain must be set with care. Vgain is the only voltage input, other than
the power supplies, that must be low impedance. In our design, we did not need
the additional gain, and therefore set Vgain = Vref = 0 V. Finally, keep in
mind that the output voltage Vout must be buffered before it can venture off the
chip. The reason for this precaution is that the parasitic capacitance associated
with a pad and wire would adversely affect circuit performance.
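The sensitivity of the additional gain to Vgain is easy to underestimate. The sketch below evaluates exp[(Vgain - Vref)/Vt] for a few small offsets, taking Vt = 26 mV from Table A.1; the function names and the sample offsets are chosen here for illustration.

```python
import math

V_T = 0.026  # thermal voltage kT/q in volts (Table A.1)

def extra_current_gain(v_gain, v_ref=0.0, v_t=V_T):
    """Additional current gain exp[(Vgain - Vref)/Vt] of the converter."""
    return math.exp((v_gain - v_ref) / v_t)

def offset_for_gain(gain, v_t=V_T):
    """Vgain - Vref, in volts, that produces a given additional gain."""
    return v_t * math.log(gain)

for millivolts in (0, 5, 18, 60):
    print(f"{millivolts} mV -> gain {extra_current_gain(millivolts / 1000):.2f}")
```

With this thermal voltage, an offset of roughly 60 mV on Vgain already changes the gain by a factor of ten, which is why a low-impedance, well-controlled source is needed for this input.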
B.4 A BiCMOS Voltage Buffer
In designing low-power, mostly CMOS analog VLSI systems, high-drive capability
is not an issue until one wants to store and analyze the outputs digitally. An
example of such a system is an analog VLSI cochlea with 30 frequency channels (Liu,
1992; Liu et al., 1992b). Internal signal processing is done with small voltages
(approximately 100 mV) and tiny currents (approximately 20 nA); however, in order
to view the outputs externally, we need to drive capacitances up to 300 pF and
resistive loads down to 10 kΩ. Indeed, some type of buffer amplifier is required.
Specific design goals for our buffer amplifier are listed in the first column of
Table B.1. A trade-off exists among these design goals, and we found no single
architecture and layout to match every possible requirement. We are also
constrained by the available technology. Through MOSIS we have available a 2.0 μm
double-poly n-well BiCMOS process, which permits the fabrication of floating
capacitors, as well as vertical NPN transistors exhibiting betas in the range of 50 to
100.

[Figure B.10 graphic: transistor-level schematic. Node labels: Vss, Vout, Vref, Vbias, Voffst, Iin, Vgain. Device labels: 52/6 (six devices), 45/12 (two), 30/12, 30/10, 24/10 (two), 96/10 (two), 612/3, 4.0p, and 40k.]
Figure B.10: Schematic of current-to-voltage converter as used in the computer interface to the Hopkins Electronic EAR.
We used a class-AB source follower output stage for our buffer amplifier to
ensure a low quiescent current. The main drawback of using source followers at
the output is a reduced linear range (Gregorian and Temes, 1986). If a strictly
CMOS implementation is used, one stands to lose 1.5 V to 2 V at either supply rail.
Thus, using a ±2.5 V power supply, the linear range might be only ±0.75 V
when driving a large load. Next we considered stacking an NPN transistor with a
PMOS transistor in its own well. The linear range improved to about ±1.0 V in
this case. Finally we examined a compound PMOS/NPN transistor to replace the
PMOS transistor. In this case the linear range of the buffer amplifier increased to
±1.25 V.

Table B.1: Specifications, simulation results, and measurements for the BiCMOS buffer amplifier.

Feature                                        Design Goal   Simulation   Measurement
Power Supply                                   ≤ ±2.5 V      ±2.5 V       ±2.5 V
Supply Current                                 ≤ 100 μA      304 μA       300 μA
Total Area                                     ≤ 0.04 mm^2   NA           0.165 mm^2
Offset (mean)                                  ≤ 0.5 mV      -1.3 mV      -1.6 mV
Offset (4 s.d.)                                ≤ 5.0 mV      NA           5.3 mV
Output Range @ RL = 1.0K                       ≥ ±1.5 V      ±1.25 V      ±1.25 V
Open-Loop Phase Margin @ CL = 300 pF           ≥ 70 deg      86 deg       NA
Open-Loop Gain Margin @ CL = 300 pF            ≥ 10 dB       14 dB        NA
Max. DC Gain Error @ Gain = 1, RL = 1.0K       ≤ 1.0%        0.9%         2.2%
Bandwidth @ Max DC Gain Error                  ≥ 20 kHz      20 kHz       20 kHz
Total Harm. Distortion @ 1 kHz, RL = 1.0K      ≤ 1.0%        -            0.4%
Slew Rate @ RL = 1.0K, CL = 300 pF             ≥ 1 V/μs      1.1 V/μs     0.97 V/μs

[Figure B.11 graphic: schematic symbol with currents Ic, Ie, Ib, Isd and voltages Vgs, Vbe, Vce, Vsd marked.]
Figure B.11: Compound PMOS/NPN Transistor.
B.4.1 Compound PMOS/NPN Transistors
In a compound PMOS/NPN transistor (see Figure B.11) the source and well of
the PMOS transistor and the collector of the NPN transistor are all in common,
while the drain of the PMOS transistor drives the base of the NPN transistor. The
control node is the gate of the PMOS transistor. The voltage/current character-
istics of the compound PMOS/NPN are analogous to those of a very wide PMOS
transistor.
Below, the voltage/current characteristics of the compound PMOS/NPN are
derived for the case that the PMOS transistor operates below threshold. The
subthreshold region for square CMOS devices is defined as being for currents less than
approximately 30 nA (Vittoz, 1994; Andreou and Boahen, 1994).
For a PMOS device that has the source and well at the same potential and
is biased in the saturation region (VSD ≥ 4Vt), the source-drain current is given
by (Andreou and Boahen, 1994):
$$I_{SD} = I_0 S e^{-\kappa V_{GS}/V_t} + g_d V_{SD} \tag{B.1}$$
where:
VGS = VG - VS is the gate to source voltage,
VSD = VS - VD is the source to drain voltage,
S ≡ W/L is the width to length ratio,
I0 is the zero-bias current,
κ is the body effect coefficient,
gd is the drain conductance,
Vt ≡ kT/q is the thermal voltage.
We model the NPN bipolar transistor as a current-input/current-output device
with a non-zero output conductance (Andreou and Boahen, 1994):
$$I_C = \beta I_B + g_0 V_{CE} \tag{B.2}$$
where:
VCE = VC - VE is the collector to emitter voltage,
β is the current gain,
g0 is the output conductance.
A large signal model of the compound PMOS/NPN device can be derived by
realizing that ISD = IB and IC + IB = IE. Then substituting equation (B.1) into
equation (B.2), we find the result:
$$I_E = I_0 (1+\beta) S e^{-\kappa V_{GS}/V_t} + (1+\beta) g_d V_{SD} + g_0 V_{CE} \tag{B.3}$$
The first term in equation (B.3) will dominate the latter two. Hence, we see
that the compound PMOS/NPN behaves similarly to a PMOS transistor with
width-to-length ratio scaled by (1 + β). In effect, we have increased the maximum
subthreshold current of a PMOS device with width-to-length ratio S from 30 nA x S
to 30 nA x (1 + β)S.
We identify two reasons for operating the PMOS transistor below threshold in
the compound PMOS/NPN transistor. Firstly, a PMOS transistor requires a lower
gate-to-source voltage when operating below threshold than when operating above.
It follows that we will obtain a larger output swing using a sub-threshold device
than an above-threshold device. Secondly, we want our design to be as symmetric
as possible in order to achieve good linearity (Gregorian and Temes, 1986). If we
consider the NPN transistor as a voltage-input/current-output device, a simplified
large signal model is as follows (Andreou and Boahen, 1994):
$$I_C = I_{ES} \frac{\beta}{1+\beta} e^{V_{BE}/V_t} + g_0 V_{CE} \tag{B.4}$$
where:
VBE = VB - VE is the base to emitter voltage,
IES is the saturation current,
and other parameters are as de�ned earlier. We see that the exponential voltage-
to-current relationship of equation (B.3) is similar (and complementary) to that of
an NPN transistor in equation (B.4). We exploit this similarity to achieve a more
symmetric design.
On a test chip we fabricated individual compound PMOS/NPN transistors and
NPN transistors of the same dimensions as those used in the design of the buffer.
Figure B.12 shows the measured I-V characteristics for the case that VCE = 2.0 V.
For both transistor types the output current is exponentially dependent on the
control voltage over approximately 7 orders of magnitude. In addition, we see that
the NPN I-V curve has a steeper slope than that of the compound PMOS/NPN.
The difference in slope is due to the extra κ term in the exponent of equation (B.3)
which is not found in equation (B.4).

[Figure B.12 graphic: semilog plot of Ie, Ic (Amps, 10^-11 to 0.001) versus Vsg, Vbe (Volts, 0 to 1.2), with curves labeled I(P_NPN1) and I(NPN1).]
Figure B.12: Current as a function of voltage for a compound PMOS/NPN transistor (solid line) and an NPN transistor (dashed line). Two transistors of each type were measured, but the resulting curves were so similar for each type that they would be indistinguishable on this graph. The saturation of the NPN transistor output current near 10 mA was caused by limitations in the measurement equipment. In all cases, VCE = 2.0 V.
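The difference in slope between the two I-V curves can be estimated from the dominant exponential terms of equations (B.3) and (B.4). The sketch below computes the control voltage needed per decade of current for each device. It takes κ = 0.7 (the typical value in Table A.1), β = 75 (mid-range of the 50 to 100 quoted for this process), and hypothetical prefactors I0 and IES of 1e-15 A; the slopes do not depend on the prefactors.

```python
import math

V_T = 0.026  # thermal voltage kT/q in volts (Table A.1)

def compound_emitter_current(v_sg, i0=1e-15, s=1.0, kappa=0.7, beta=75.0):
    """Dominant first term of equation (B.3), written with V_SG = -V_GS."""
    return i0 * (1 + beta) * s * math.exp(kappa * v_sg / V_T)

def npn_collector_current(v_be, i_es=1e-15, beta=75.0):
    """First term of equation (B.4)."""
    return i_es * beta / (1 + beta) * math.exp(v_be / V_T)

def volts_per_decade(current_fn, v=0.5, dv=1e-3):
    """Control-voltage increase that raises the current by a factor of ten."""
    return dv / math.log10(current_fn(v + dv) / current_fn(v))

npn_slope = volts_per_decade(npn_collector_current)
compound_slope = volts_per_decade(compound_emitter_current)
print(f"NPN: {npn_slope * 1e3:.1f} mV/decade, compound: {compound_slope * 1e3:.1f} mV/decade")
```

In this simplified model the compound device needs 1/κ times more control voltage per decade than the NPN transistor, which corresponds to the shallower slope observed for the compound PMOS/NPN.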
B.4.2 Circuit Description
Figure B.13 shows the schematic of the bu�er including device geometries. The
input stage consists of a large-area PMOS di�erential pair in a oating well. In
order to reduce systematic o�set, transistors M9 and M10 copy the voltage from
the drain of M3 to that of M4. In this way, the mirroring ratio between transistors
M3 and M4 is very close to unity.
[Figure B.13 schematic: MOS transistors M1-M16, NPN transistors Q1-Q4 (16 x 64 each) and two 15pF capacitors, connected between Vdd and Vss with terminals Vbias, Vin and Vout. Device geometries legible in the labels include M3 50/10, M4 50/10, M5 50/10, M6 200/10, M7 124/10, M8 10/6, M9 10/10, M10 15/6, M11 124/10, M12 500/10 and M16 10/10.]
Figure B.13: BiCMOS Buffer Amplifier.
Transistor pairs M5-M6 and M11-M12 mirror
and amplify the differential current by a factor of four, establishing a push/pull
configuration. The output stage consists of NPN transistor Q2 biased one diode
drop above the output voltage stacked with the compound PMOS/NPN transistor
M14/Q4 biased one diode drop below the output voltage. Frequency compensation
is achieved using pole/zero cancellation (Gregorian and Temes, 1986) with two
15pF capacitors and transistors M15-M16 operating in the triode region.
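The factor-of-four current gain of the mirrors can be checked against the device geometries. The sketch below assumes an ideal mirror whose gain is the ratio of output W/L to input W/L, and it assumes that the labels in Figure B.13 pair with the devices as M5 50/10, M6 200/10, M11 124/10 and M12 500/10. The function name is hypothetical.

```python
def mirror_gain(w_in, l_in, w_out, l_out):
    """Ideal current-mirror gain: (W/L of output device) / (W/L of input device)."""
    return (w_out / l_out) / (w_in / l_in)

# W/L values as read from the Figure B.13 labels (label-to-device pairing assumed)
gain_m5_m6 = mirror_gain(50, 10, 200, 10)
gain_m11_m12 = mirror_gain(124, 10, 500, 10)

print(f"M5-M6 mirror gain:   {gain_m5_m6:.2f}")
print(f"M11-M12 mirror gain: {gain_m11_m12:.2f}")
```

With these labels the two ratios come out at 4 and roughly 4.03, consistent with the factor of four quoted above.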
B.4.3 Results
Circuit simulations were performed with SPICE3e using BSIM and BIPOLAR
models provided by MOSIS. Results of simulation experiments can be found in
column 3 of Table B.1. Eighteen buffers were arranged in a 2mm x 2mm tiny chip
and placed in a 40-pin DIP package. Electrical measurements are averaged from
four such chips and can be found in the last column of Table B.1.
We note one minor discrepancy between simulated and measured results. The
maximum DC gain error, which we anticipated to be lower than 1%, is actually
greater than 2% for the maximum load. One factor contributing to this discrepancy
is the long metal line connecting the output of the buffer to the pad. We omitted
this 5 Ω parasitic resistance from our simulations. However, even in the condition
of no load the gain error is measured to be 0.7%. It follows that the open-loop gain
must be lower than anticipated, but we have no means of measuring it directly.
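A rough estimate shows what the no-load figure implies. The sketch below assumes an idealized unity-gain feedback model in which the closed-loop gain is A / (1 + A), so the DC gain error equals 1 / (1 + A). It ignores loading, offset and the pad resistance, and the function name is hypothetical.

```python
import math

def open_loop_gain_from_error(gain_error):
    """Invert gain_error = 1 / (1 + A) for a unity-gain buffer with open-loop gain A."""
    return 1.0 / gain_error - 1.0

a_no_load = open_loop_gain_from_error(0.007)   # measured no-load gain error of 0.7%
a_no_load_db = 20.0 * math.log10(a_no_load)

print(f"Implied open-loop gain: {a_no_load:.0f} ({a_no_load_db:.1f} dB)")
```

Under this simplified model a 0.7% error corresponds to an open-loop gain of about 142, or roughly 43 dB.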
Bibliography
Allen, J. (1985). Cochlear modeling. IEEE ASSP Magazine, 2(1):3-29.
Andreou, A. and Boahen, K. (1994). Neural information processing II. In Ismail,
M. and Fiez, T., editors, Analog VLSI Signal and Information Processing.
McGraw-Hill.
Andreou, A. and Liu, W. (1993). BiCMOS circuits for silicon cochleas. In Dedieu,
H., editor, 1993 European Conference on Circuit Theory and Design, pages
503-508, Davos, Switzerland. Elsevier Science B.V.
Andreou, A. G. (1995). Low power analog VLSI systems for sensory information
processing. In Sheu, B. J., Ismail, M., Sanchez-Sinencio, E., and Wu, T. H.,
editors, Microsystems Technology for Multimedia Applications: An Introduction,
ISCAS '95 Tutorial Sessions, pages 501-522. IEEE, Seattle, WA.
Bhadkamkar, N. (1993). A variable resolution, nonlinear silicon cochlea. Technical
Report CSL-TR-93-558, Stanford University, Stanford.
Blahut, R. (1987). Principles and Practice of Information Theory. Addison-Wesley,
Reading, Mass. Pages 280-282.
Boahen, K. and Andreou, A. (1992). A contrast sensitive silicon retina with
reciprocal synapses. In Moody, J., Hanson, S., and Lippmann, R., editors,
Advances in Neural Information Processing Systems 4. Morgan Kaufmann, San Mateo,
CA.
Cohen, M. and Andreou, A. (1992). Current-mode subthreshold MOS implementation
of the Herault-Jutten autoadaptive network. IEEE J. Solid-State Circuits,
27(5):714-727.
Furth, P. and Andreou, A. (1995). Linearised differential transconductors in
subthreshold CMOS. Electronics Letters, 31(7):545-547.
Furth, P., Goel, N., Andreou, A., and Goldstein, Jr., M. (1994). Experiments
with the Hopkins Electronic EAR. In 14th Speech Research Symposium, pages
183-189, Baltimore MD.
Ghitza, O. (1986). Auditory nerve representation as a front-end for speech
recognition in a noisy environment. Computer Speech and Language, 1:109-130.
Gray, R., Buzo, A., Gray, Jr., A., and Matsuyama, Y. (1980). Distortion
measures for speech processing. IEEE Trans. Acoust., Speech, Signal Processing,
28(4):367-376.
Gray, Jr., A. and Markel, J. (1976). Distance measures for speech processing.
IEEE Trans. Acoust., Speech, Signal Processing, 24(5):380-391.
Gregorian, R. and Temes, G. (1986). Analog MOS Integrated Circuits for Signal
Processing. John Wiley & Sons.
Groenewold, G. (1991). Optimal dynamic range integrators. IEEE Trans. Circuits
Syst. I, 39(8):614-627.
Hosticka, B. (1985). Performance comparison of analog and digital circuits. Proc.
IEEE, 73(1):25-29.
Kamm, T., Andreou, A., and Cohen, J. (1995). Vocal tract normalization in speech
recognition: compensating for speaker variability. In 15th Speech Research
Symposium, pages 175-178, Baltimore MD.
Krummenacher, F. and Joehl, N. (1988). A 4-MHz CMOS continuous-time filter
with on-chip automatic tuning. IEEE J. Solid-State Circuits, 23(3):750-758.
Landauer, R. (1961). Irreversibility and heat generation in the computing process.
IBM J. Res. Devel., 5:183-191.
Landauer, R. (1993). Solid-state shot noise. Physical Review B, 47(24):16427-16432.
Lazzaro, J., Wawrzynek, J., and Kramer, A. (1994). Systems technologies for
silicon auditory models. IEEE Micro, pages 7-15.
Lazzaro, J., Wawrzynek, J., Mahowald, M., Sivilotti, M., and Gillespie, D. (1993).
Silicon auditory processors as computer peripherals. IEEE Trans. Neural
Networks, 4(3):523-528.
Lin, J., Ki, W., Edwards, T., and Shamma, S. (1994). Analog VLSI implementation
of auditory wavelet transforms using switched-capacitor circuits. IEEE Trans.
Circuits Syst. I, 41(9):572-583.
Liu, W. (1992). An Analog Cochlear Model: Signal Representation and VLSI
Realization. PhD thesis, Johns Hopkins University, Baltimore.
Liu, W., Andreou, A., and Goldstein, Jr., M. (1992a). Multiresolution speech
analysis with an analog cochlear model. In IEEE-SP International Symposium
on Time-Frequency and Time-Scale Analysis, pages 433-436, Victoria, BC,
Canada.
Liu, W., Andreou, A., and Goldstein, Jr., M. (1992b). Voiced-speech representation
by an analog silicon model of the auditory periphery. IEEE Trans. Neural
Networks, 3(3):477-487.
Liu, W., Andreou, A., and Goldstein, Jr., M. (1993). Analog cochlear model
for multiresolution speech analysis. In Hanson, S., Cowan, J., and Giles, C.,
editors, Advances in Neural Information Processing Systems 5, pages 666-673.
Morgan Kaufmann, San Mateo, CA.
Lyon, R. and Mead, C. (1988). An analog electronic cochlea. IEEE Trans. Acoust.,
Speech, and Signal Proc., 36(7):1119-1134.
Max, J. (1960). Quantizing for minimum distortion. IRE Trans. Inform. Theory,
6:7-12.
Mead, C. (1989). Analog VLSI and Neural Systems. Addison-Wesley, Reading,
MA.
Meng, H. and Zue, V. (1990). A comparative study of acoustic representations of
speech for vowel classification using multi-layer perceptrons. In Int'l Conf. on
Spoken Language Processing, pages 1053-1056.
Meyer, J. (1971). MOS models and circuit simulation. RCA Review, 32:42-63.
Nauta, B. (1992). A CMOS transconductance-C filter technique for very high
frequencies. IEEE J. Solid-State Circuits, 27(2):142-153.
Neti, C. (1994). Neuromorphic speech processing for noisy environments. In IEEE
Intl. Conf. on Neural Networks, pages 4425-4430, Orlando, FL.
Nevarez-Lozano, H. and Sanchez-Sinencio, E. (1991). Minimum parasitic effects
biquadratic OTA-C filter architectures. Analog Integrated Circuits and Signal
Processing, 1(4):297-319.
Paez, M. and Glisson, T. (1972). Minimum mean-squared error quantization in
speech PCM and DPCM systems. IEEE Trans. Comm., 20:225-230.
Papoulis, A. (1965). Probability, Random Variables, and Stochastic Processes.
McGraw-Hill, New York. P. 219.
Park, J., Abel, C., and Ismail, M. (1993). Design of silicon cochlea using MOS
switched-current techniques. In Dedieu, H., editor, 1993 European Conference
on Circuit Theory and Design, pages 269-273, Davos, Switzerland. Elsevier
Science B.V.
Pavasovic, A., Andreou, A., and Westgate, C. (1991). Characterization of CMOS
process variations by measuring subthreshold current. In Green, R. and Ruud,
C., editors, Nondestructive Characterization of Materials IV. Plenum Press,
New York.
Rice, J. (1988). Mathematical Statistics and Data Analysis. Wadsworth &
Brooks/Cole, Pacific Grove, CA.
Rice, J., Young, E., and Spirou, G. (1995). Auditory-nerve encoding of
pinna-based spectral cues: Rate representation of high-frequency stimuli. J. Acoust.
Soc. Am., 97:1764-1776.
Roe, D. and Wilpon, J., editors (1994). Voice Communication Between Humans
and Machines. National Academy Press, Washington, D.C.
Ross, S. (1988). A First Course in Probability. Macmillan, New York, third edition.
Sarpeshkar, R., Delbruck, T., and Mead, C. (1993). White noise in MOS transistors
and resistors. IEEE Circuits Devices Mag., 9(6):23-29.
Sarpeshkar, R., Lyon, R., and Mead, C. (1996). An analog VLSI cochlea with
new transconductance amplifiers and nonlinear gain control. In ISCAS-96,
Atlanta, GA.
Secker-Walker, H. and Searle, C. (1990). Time-domain analysis of
auditory-nerve-fiber firing rates. J. Acoust. Soc. Am., 88:1427-1436.
Shannon, C. (1948). A mathematical theory of communication. Bell Syst. Tech.
J., 27:379-423, 623-656.
Silva-Martinez, J., Steyaert, M., and Sansen, W. (1990). A high frequency large
signal very low distortion transconductor. In IEEE ESSCIRC-90, pages 169-172.
Tanimoto, H., Koyama, M., and Yoshida, Y. (1991). Realization of a 1-V active
filter using a linearization technique employing plurality of emitter-coupled
pairs. IEEE J. Solid-State Circuits, 26(7):937-945.
Torrance, R., Viswanathan, T., and Hanson, J. (1985). CMOS voltage to current
transducers. IEEE Trans. Circuits Syst., 32(11):1097-1104.
Tsividis, Y., Czarnul, A., and Fang, S. (1986). MOS transconductors and
integrators with high linearity. Electron. Lett., 22:245-246. Errata, vol. 22, p. 619,
May, 1986.
Tsividis, Y. P. (1987). Operation and Modeling of the MOS Transistor. McGraw-
Hill, New York. P. 343.
Vittoz, E. (1994). Micropower techniques. In Franca, J. and Tsividis, Y., editors,
Design of MOS VLSI Circuits for Telecommunications and Signal Processing.
Prentice-Hall, 2nd edition.
Watts, L. (1992). Cochlear Mechanics: Analysis and Analog VLSI. PhD thesis,
California Institute of Technology, Pasadena, CA.
Watts, L., Kern, D., Lyon, R., and Mead, C. (1992). Improved implementation of
the silicon cochlea. IEEE J. Solid-State Circuits, 27(5):692-700.
Yang, X., Yang, K., and Shamma, S. (1992). Auditory representations of acoustic
signals. IEEE Trans. Information Theory, 38(2):824-839.
Young, E. and Sachs, M. (1979). Representation of steady-state vowels in the
temporal aspects of the discharge patterns of populations of auditory-nerve
fibers. J. Acoust. Soc. Am., 66:1381-1403.
Vita
Paul Matthew Furth was born in Washington, DC on June 27, 1963. He received
the B.A. in French from Grinnell College in 1984 and the B.S. in engineering (elec-
trical) from the California Institute of Technology, as part of a combined 5-year
liberal arts/engineering program. From 1985 to 1989 he worked as an electronics
project engineer for TRW Technar, a company which manufactured airbag crash
sensors for automobiles. His responsibilities were hardware and software design for
computer-controlled shock test equipment, as well as documentation and training.
In 1989, he entered the Ph.D. program in the Electrical and Computer Engineering
Department of Johns Hopkins University, where he received the M.Sc.Eng. degree
in 1992. Since 1989, he has been a teaching assistant and instructor for the depart-
ment and a research assistant in the Sensory Communications Laboratory, under
the direction of Professor Andreas Andreou. His research interests are low power
analog circuit design, speech processing, and circuit and device limitations.