CHAPTER 1
INTRODUCTION
1.1 COCHLEAR IMPLANT
A cochlear implant (CI) is an electronic prosthetic device surgically
implanted into the inner ear that restores partial hearing to the profoundly
deaf. Unlike commercial hearing aids, which benefit patients with conductive
hearing loss, a CI also benefits patients with sensorineural
hearing impairment. It bypasses the normal hearing mechanism and directly
stimulates the auditory nerve by delivering
electrical signals to an electrode array implanted inside the cochlea. These
electrical signals are derived from external sound picked up by a
microphone. The sound signals are first processed by an external speech
processor and then transmitted across a transcutaneous link in the form of
electromagnetic waves to the inner ear, where they are finally converted into
electrical pulses (Mishra and Hubbard 2002). With high success rates and
increasing worldwide demand for implants, CI research has seen substantial
growth and progress in the last two decades.
CI devices demand an ultra-low-power, high-performance signal
processing unit to enhance the patient’s comfort, optimize speech
intelligibility and improve sound quality. To meet this goal, the digital
speech processor was developed using low-power design techniques and
dedicated processing architectures. A general trend for new electronic
ground-breaking products appears to be that they consist of highly integrated
2) Internal Function: The internal receiver receives signals from
the external transmitter, the stimulator sends impulses to the
inside of the cochlea, the electrodes stimulate the cochlear or
auditory nerve, and the signals are then passed to the brain.
Also, the magnet holds the external components in place.
1.3 HISTORY OF CI
An overview of the history gives an understanding of the slow
progression and the breakthroughs the CI has undergone from its beginnings to the
present. Table 1.1 shows the events that led to the Conceptualization (1800-1949),
Research and Development (1950-1979) and Commercialization (1980-2013)
of the CI.
1.3.1 Conceptualization
In 1800, Alessandro Volta experimented on his own ears by passing a
stimulating current of the order of microamperes through wire leads
connected across a 50 V battery; he experienced a jolt and heard a noise
similar to the boiling of thick soup. The electrical unit, the volt, was
named in honour of Volta's work. For roughly another 150 years, research on
stimulating the auditory nerve was done only by Direct Current (DC)
stimulation. Only in the late 1930s did Stevens and his colleagues identify
three mechanisms underlying the stimulation of the auditory nerve
(Fan-Gang Zeng 2008). The first mechanism
was the function of the eardrum in converting electrical stimulation into an acoustic
signal. The second mechanism was related to the “electromechanical effect,”
in which electric stimulation caused the hair cells in the cochlea to vibrate,
resulting in a perceived tonal pitch at the signal frequency as if it were
acoustically stimulated. Only the third mechanism was related to direct
electric activation of the auditory nerve, as the subjects reported noise-like
sensations in response to sinusoidal electrical stimulation, much steeper
loudness growth with electric currents, and occasional activation of
non-auditory facial nerves.
1.3.2 Research and Development
The research and development phase legitimized the utility and
safety of electric stimulation. In the middle of last century, physicians became
the driving force to translate these early research efforts into clinical practice.
In 1957, the French physician Djourno and his colleagues reported successful
hearing using electric stimulation in two totally deaf patients. In 1964, Blair
Simmons at Stanford implanted a cluster of six stainless-steel electrodes into
the auditory nerve through the modiolus in a 60-year-old man with profound
hearing loss. In 1971, Robin Michelson in San Francisco implanted a
form-fitting (from human temporal bones) single-channel electrode pair in four deaf
patients. In 1978, Graeme Clark in Australia developed a 20-electrode
(platinum rings) CI system and implanted it in two deaf patients.
1.3.3 Commercialization
To settle the scientific issues, Michael Merzenich and colleagues
organized the First International Conference on Electrical Stimulation of the
Acoustic Nerve as a Treatment for Profound Sensorineural Deafness in Man
in San Francisco in 1973 (Fan-Gang Zeng 2008). As shown in Table 1.1, an
outcome of this conference was an intensified research effort in CIs,
particularly using animal experiments, from the mid 1970s to the 1980s. In 1975, to
settle the safety and efficacy issues, the National Institutes of Health (NIH)
commissioned Bilger and colleagues at the University of Pittsburgh to
evaluate objectively and independently the audiological performance of the
world’s first group of single-electrode CI recipients, including 11 implanted
by House and 2 by Michelson. Bilger’s report confirmed that
single-electrode CIs provided useful hearing in terms of aiding lip-reading and
identifying common environmental sounds, but these devices could not
provide open-set speech recognition. These events led to the
commercialization phase, which saw widespread use of electric stimulation
in treating sensorineural hearing loss.
Table 1.1 Events in the development of CI (Fan-Gang Zeng et al 2008)
Conceptualization: 1800-1949
  1800           Volta used his battery to show that electric stimulation can invoke sensations including hearing.
  1937-1940      Stevens and colleagues identified three mechanisms underlying electric sensations.
Research & Development: 1950-1979
  1957           Djourno and Eyries introduced electric hearing in two deaf patients.
  1961           House implanted two patients.
  1964           Simmons implanted a patient.
  1971           Michelson implanted four patients.
  1973           First International Conference on Electrical Stimulation, San Francisco.
  1977           Bilger report confirmed the effectiveness of CIs.
  1978           Clark implanted a multi-channel device in two patients.
Commercialization: 1980-2008
  1983           First of the Biennial Conferences under the auspices of the Gordon Research Conference.
  Nov 26, 1984   3M/House became the first FDA approved CI.
  Oct 31, 1985   Nucleus 22 became the first FDA approved multi-channel device.
  1988           First NIH Consensus Statement on CIs.
  Jun 27, 1990   FDA approved Nucleus 22 device in children.
  Jun 26, 1997   FDA approved Clarion device.
  Aug 20, 2001   FDA approved Med-El device.
In the US, 3M commercialised the CI as a product and was the first to
win US Food and Drug Administration (FDA) approval, in 1984, for the
3M/House single-electrode implant; it became the industrial leader
with several hundred users in the mid 1980s. Meanwhile, supported by
a grant from the Australian Department of Productivity, the University of
Melbourne and Nucleus Limited (a medical device company focusing on
pacemakers) entered a public-private cooperative agreement in 1979 to
manufacture and market the 22-electrode CI. In the mid 1980s, the NIH also
helped speed up the acceptance of multi-electrode CIs by funding
device development at the University of Melbourne and hosting the first consensus
conference. The conference concluded that “multichannel implants
may have some superior features in adults when compared with the
single-channel type”. Time proved that the multi-electrode devices not only
performed far better than the single-electrode devices but
also eventually displaced them from the market.
In addition to the Nucleus device, several other multi-electrode
devices were developed. The University of Utah developed a
six-electrode implant with a percutaneous plug interface, referred to in the
literature as either the Ineraid or the Symbion device, which was uniquely
suited for research purposes. The University of Antwerp in Belgium developed the
Laura device, which could deliver either 8-channel bipolar or 15-channel
monopolar stimulation. These devices were later bought out and are no
longer available commercially. The MXM laboratories in France also
developed a 15-channel monopolar device, the Digisonic MX20, which is
marketed by Neurelec (www.neurelec.com).
The three major CI manufacturers are Advanced
Bionics Corporation, USA; Med-El Corporation, Austria; and Cochlear
Corporation, Australia, with Cochlear being the dominant player, controlling
70–80% of the CI market worldwide. There are also several manufacturers
developing advanced and low-cost multi-electrode CIs, like Advanced
Cochlear Systems in Seattle, Washington, Nurobiosys Corporation in Seoul,
Korea and Nurotron Biotechnology Inc. based in both Irvine, CA and
Hangzhou, China.
Figure 1.3 Generations of CI systems (Patrick 2006)
1.4 SPEECH PROCESSOR OF CI
Figure 1.3 shows the development stages of the Nucleus® series
over the years. The speech processor became progressively
smaller with each new generation. As Figure 1.3 shows, around 1980 the
processor was large and was worn on the waist. Owing to the rapid
progress in semiconductor technology, the CI was gradually
miniaturised, as shown in Figure 1.4, and its external speech processor now sits
just behind the ear. As the speech processor in the implant
became smaller, the lifetime of the batteries that power it was also extended,
relieving the patient of the burden of changing batteries frequently.
Figure 1.4 Road map of CIs from generation I to the present generation
In the near future, these devices are expected to be fully implanted inside the
body, so that deaf people will be indistinguishable from everyone else in both
appearance and ability to hear. When the device must be miniaturised to
fit inside the narrow ear canal, Very Large Scale Integration Digital Signal
Processing (VLSI DSP) concepts must be applied to design the individual
blocks of the speech processor according to the speech processing strategy
adopted.
1.4.1 Speech Processing Strategy
A speech processing strategy, or speech coding strategy, is one
of the key features affecting the overall performance of the device
(Loizou 1998). The speech processor emulates the functioning of the inner ear
by dividing the speech signal into 12 to 22 frequency bands and
extracting the signal strength in each band to excite the implanted electrodes
accordingly. Depending upon the speech processing strategy, the speech
processor extracts various parameters from the acoustic signals and converts
them into electrical signals. Various speech processing strategies for
cochlear prostheses have been developed and reported in the literature over
time (Wilson 1991, Loizou 2003, Wilson et al 2007). They include Continuous
Interleaved Sampling (CIS), Spectral Peak (SPEAK) (Ahmad et al 2009), Advanced
Combination Encoder (ACE), Spectral Maxima Sound Processor (SMSP),
Simultaneous Analog Strategy (SAS), Paired Pulsatile Sampler (PPS),
Quadruple Pulsatile Sampler (QPS) and various formant-based strategies.
Numerous algorithms based on the Wavelet Transform (Paglialonga et al 2006),
Wavelet Packets (Nogueira et al 2006), the Bionic Wavelet Transform (Yao
2002) and Auditory Models (Grayden et al 2004) are also found in the literature.
In contrast to these traditional approaches, various algorithms have been
developed especially for tonal languages, which emphasize the extraction of
maximum tonal and pitch information from speech (Lan 2004, Wilson et al
2007a). The two most commonly used speech processing strategies are Continuous
Interleaved Sampling (CIS) and Advanced Combination Encoder (ACE). Both
strategies can be realised using either a filter bank approach or an
FFT approach. The major CI manufacturers use the CIS processing strategy.
Hence, this study adopts the filter bank approach
to the CIS strategy.
1.4.1.1 Continuous interleaved sampling strategy
The CIS strategy filters the incoming speech into the required number
of bands, obtains the speech envelope and compresses the signal for each
channel. Stimulation consists of interleaved digital pulses that sweep rapidly
through the channels at a rate of 833 pulses per second per channel; with all
eight channels in use, the maximum aggregate pulse rate is 6664 pulses per second
(8 × 833 = 6664). With the CIS strategy, rapid changes in the speech signal
are tracked by rapid variations in pulse amplitude. The pulses are delivered to
consecutive channels in sequence, never simultaneously, to avoid channel
interaction. Hence the CIS strategy is regarded as one of the most useful and
well-known speech processing strategies in CI. Figure 1.5 shows the functional
block diagram for speech processing using CIS.
Figure 1.5 Functional block diagram of CIS strategy for speech
processing
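The interleaving arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processor designed in this study: the function names, the logarithmic compression map and its parameters (`threshold`, `comfort`, `e_min`, `e_max`) are hypothetical, and only the figures of eight channels and 833 pulses per second come from the text.

```python
import math

CHANNELS = 8            # active channels (from the text)
RATE_PER_CHANNEL = 833  # pulses per second on each channel (from the text)


def total_pulse_rate(channels=CHANNELS, rate=RATE_PER_CHANNEL):
    """Aggregate pulse rate when every channel fires once per cycle."""
    return channels * rate


def pulse_schedule(cycles, channels=CHANNELS, rate=RATE_PER_CHANNEL):
    """Return (time_in_seconds, channel) pairs.

    Each time slot carries exactly one pulse, so no two channels are
    ever stimulated at the same instant.
    """
    slot = 1.0 / total_pulse_rate(channels, rate)
    return [((c * channels + ch) * slot, ch)
            for c in range(cycles) for ch in range(channels)]


def compress(envelope, threshold, comfort, e_min=1e-3, e_max=1.0):
    """Hypothetical logarithmic map of an envelope sample onto the
    electrical range [threshold, comfort] of one electrode."""
    e = min(max(envelope, e_min), e_max)
    frac = math.log(e / e_min) / math.log(e_max / e_min)
    return threshold + frac * (comfort - threshold)
```

Calling `pulse_schedule(1)` lists one sweep across channels 0 to 7, and the same channel is revisited every 1/833 s. In a full CIS chain the amplitude of each pulse would be the compressed envelope of the corresponding band at that instant.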
1.5 MOTIVATION TO USE CIS ALGORITHM
In 1998, early work on auditory signal processing was carried out at the
University of Pennsylvania as the doctoral research of Ahmed Ali.
His work proposed replicating cochlea-like filtering behaviour to
recognise more spontaneous speech than the existing
systems, which had very limited perplexity and artificial grammar constraints.
Cheng and Edelman (2002) attempted to implement 36 analog programmable
cascaded filters on a Xilinx Field Programmable Gate Array (FPGA) chip.
However, their design, which required roughly 80,000 gates, was too
complex for the single proposed chip, which contained only 5,000 gates.
Mishra et al (2002) designed a digital cochlear filter and implemented it on a
Xilinx XC4010 FPGA. The filter, a tenth-order recursive filter implemented
as a parallel combination of low-order elements, gave a good fit to
biological data. Leong (2002) used a module generator tool to develop an
FPGA-based implementation of an electronic cochlea filter with arbitrary
precision. FPGA-based implementations of the cochlea filter offered shorter
design times, improved dynamic range, higher accuracy and a simpler
computer interface. Lee (2001) attempted the first design of a 16-channel
filter system. Because of time limitations, his designs were only simulated
and not implemented on an FPGA chip. Chen et al (2002) continued the
project based on Lee's (2001) work. They were able to download a 16-channel
first-order band-pass filter system onto a single FPGA. Their design showed
only minimal promise in the eleventh filter of their work. Time constraints
left them unable to optimize their system design. Hinck and Todd (1999)
unsuccessfully attempted to implement a 6 to 10 channel
digital cochlea filter on an FPGA. Watts and Lyod (2000) built a
functioning real-time, high-resolution, 240-tap, 10-octave cochlear model
with 44 kHz sampling on multiple FPGAs. Rekha et al (2008) designed and
implemented a digital cochlea filter on an FPGA that gave a good fit to real-time
data with efficient hardware usage. All these previous works aimed
to design low-power, low-area and high-performance architectures for the filter
bank of the speech processor.
CIS is one of the most useful and well-known speech processing
strategies used in the speech processors of CIs. Realizing the CIS algorithm in
hardware is a laborious task because of its high computational cost, and
this is the motivation for this work. Real-time operation and low-power
design demand an optimized hardware realization of the algorithm. In CIS
algorithms, the programs for filtering the incoming signal are non-terminating
and the same programs are executed repetitively on the same hardware for an
infinite time series. This non-terminating nature is exploited
to design a more efficient speech processor by exploiting the dependencies
among tasks both within an iteration and across multiple iterations. The long
critical paths in the individual filters of the filter bank limit the performance
of the processor. Using suitable VLSI DSP transformations, the filter algorithms are
transformed for high-speed, low-area and low-power
implementation. Such transformed CIS algorithms make the speech
processor of the CI compact and less power hungry, with enhanced
performance.
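One way to quantify the dependency across iterations mentioned above is the iteration bound of the filter's data-flow graph, a standard VLSI DSP measure that the text does not name explicitly. The sketch below is illustrative only: the function name, the loop list and the unit operator delays are assumptions, not values from this study.

```python
def iteration_bound(loops):
    """Return the iteration bound of a recursive data-flow graph.

    `loops` holds one (computation_time, delay_count) pair per loop:
    computation_time is the sum of the operator delays around the loop
    and delay_count is the number of registers in that loop.
    """
    return max(time / delays for time, delays in loops)


# Hypothetical operator delays in arbitrary time units.
T_MUL, T_ADD = 2, 1

# Hypothetical second-order recursive section with two feedback loops:
# one through a single register, one through two registers.
example_loops = [
    (T_MUL + T_ADD, 1),
    (T_MUL + 2 * T_ADD, 2),
]
bound = iteration_bound(example_loops)
```

Under these assumptions the first loop dominates, and retiming alone cannot shorten the sample period of such a section below `bound`.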
1.6 OBJECTIVES
The primary objective of this study is to develop new DSP
architectures for the filter in the speech processor of a CI with the following
features:
1) Minimized area of the filter.
2) Reduced power consumption of the speech processor.
3) Enhanced performance of the filter.
This objective is achieved, with trade-offs, by using
VLSI DSP techniques such as the folding transformation, numerical strength
reduction, retiming and pipelining, which lead to
low-cost and low-power CIs. Because the degree of hearing defect varies from
one patient to another, the cochlear filter must be designed with coefficients
suited to the particular patient. Hence, the
digital filter design relies on reconfigurability, which allows a filter
with particular specifications to be obtained with less effort than a
new design and helps deliver an economical product.
1.7 METHODOLOGY
In the filter bank design of the speech processor, the long critical path
in each filter, which limits the performance of the system, must be reduced by
selecting appropriate VLSI DSP transformations. Transformations
such as pipelining, retiming and data broadcasting significantly reduce the critical
path, which ultimately enhances the performance of the system.
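The effect of data broadcasting on the critical path can be illustrated with a small behavioural model. This is a sketch under stated assumptions: it uses a generic FIR filter rather than the filters designed in this study, and the function names and the operator delays passed to the critical-path helpers are hypothetical.

```python
def fir_direct(h, x):
    """Direct form: a delay line on the input feeds a chain of adders."""
    taps = [0] * len(h)
    out = []
    for sample in x:
        taps = [sample] + taps[:-1]
        out.append(sum(c * t for c, t in zip(h, taps)))
    return out


def fir_data_broadcast(h, x):
    """Data-broadcast (transposed) form: the input is broadcast to every
    multiplier and the registers sit between the adders."""
    n = len(h)
    regs = [0] * (n - 1)  # regs[i] holds the partial sum feeding adder i
    out = []
    for sample in x:
        products = [c * sample for c in h]
        out.append(products[0] + (regs[0] if regs else 0))
        regs = [products[i + 1] + (regs[i + 1] if i + 2 < n else 0)
                for i in range(n - 1)]
    return out


def critical_path_direct(n_taps, t_mul, t_add):
    """One multiplier followed by a chain of (n_taps - 1) adders."""
    return t_mul + (n_taps - 1) * t_add


def critical_path_broadcast(t_mul, t_add):
    """One multiplier and one adder between any two registers."""
    return t_mul + t_add
```

Both functions compute the same output sequence, but in the broadcast form the longest register-to-register path no longer grows with the number of taps.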
In synthesizing the filter architecture, it is necessary to minimize
silicon area, which is achieved by reducing the number of functional units such as
adders, multipliers, registers, multiplexers and interconnection wires. By
executing multiple algorithm operations on a single functional unit, the number
of functional units in the implementation is reduced, resulting in low silicon
area. This technique is called the folding transformation and is used in
this study to reduce the silicon area of the complete speech processor, in which
the filter bank is the major processing block.
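The idea of folding can be sketched behaviourally as follows. This is an illustration under stated assumptions, not the folded architecture of this study: it uses a generic second-order IIR section, the function names are hypothetical, and `a` holds the two feedback coefficients (a1, a2).

```python
def biquad_direct(b, a, x):
    """Reference section: five multiplications and four additions are
    available in parallel for every input sample."""
    x1 = x2 = y1 = y2 = 0
    out = []
    for s in x:
        y = b[0] * s + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, s
        y2, y1 = y1, y
        out.append(y)
    return out


def biquad_folded(b, a, x):
    """Folded section: one multiplier and one adder are time-multiplexed
    over five clock cycles per input sample (folding factor 5)."""
    coeffs = [b[0], b[1], b[2], -a[0], -a[1]]
    x1 = x2 = y1 = y2 = 0
    out, cycles = [], 0
    for s in x:
        operands = [s, x1, x2, y1, y2]  # multiplexer inputs
        acc = 0
        for c, v in zip(coeffs, operands):
            product = c * v      # the single shared multiplier
            acc = acc + product  # the single shared adder
            cycles += 1
        x2, x1 = x1, s
        y2, y1 = y1, acc
        out.append(acc)
    return out, cycles
```

The folded version returns the same samples as the reference while reusing one multiplier and one adder, at the cost of five clock cycles per sample. That is the area-for-time trade-off that folding makes.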
Numerical transformation techniques are used to reduce the
strength of the filter computation. These transformations rely on the Canonical
Signed Digit (CSD) representation and sub-expression elimination of the filter
coefficients to restructure the computation so that its performance
in terms of speed, power and area is improved. Figure
1.6 gives an overview of the study.
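The CSD representation mentioned above can be sketched as follows. This is a minimal illustration, assuming coefficients that have already been quantised to non-negative integers; the function names are hypothetical and sub-expression elimination is not covered.

```python
def to_csd(value):
    """Return the canonical signed digit form of a non-negative integer
    as a list of digits in {-1, 0, 1}, least significant digit first."""
    digits = []
    while value != 0:
        if value & 1:
            # +1 when value % 4 == 1, -1 when value % 4 == 3
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(x, coeff):
    """Multiply x by an integer coefficient using only shifts and
    additions or subtractions, one per non-zero CSD digit."""
    acc = 0
    for shift, digit in enumerate(to_csd(coeff)):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc
```

For example, the coefficient 7 (binary 111, three non-zero bits) becomes 8 - 1 in CSD, so a multiplication by 7 needs one shift and one subtraction instead of two additions.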
Figure 1.6 Overview of the study
[Block diagram. Chapter I: cochlear implant speech processor filter bank. Chapter II: 16-channel DA-based FIR filter bank (FIR realization of the auditory filter, splitting of the speech spectrum into 16 bands). Chapter III: IIR realization of GTF (critical path reduction, enhanced performance). Chapter IV: IIR realization of GTF (folding transformation, low area, low power). Chapter V: fractional delay FIR filter (numerical strength reduction, low area, low power, high speed). Chapter VI: conclusion.]
1.8 AIMS AND CONTRIBUTIONS OF THIS THESIS
The aim of this study is to determine new architectures for the filter
used in the speech processor of a CI.
The filter bank occupies the major portion of the speech
processor, which forms the external part of the cochlear
implant, and so it should occupy as little area as possible. It is
therefore imperative to design an optimized digital VLSI
architecture tailored to these requirements, as discussed by
Mishra and Hubbard (2002). Mahalakshmi and Reddy (2010) designed
narrow band-pass FIR filters for 16 channels at the algorithm
level using a Kaiser window with a sharp transition band to
decompose the audio signals into multiple frequency bands.
This work investigates the design and hardware
implementation of narrow band-pass FIR filters for the speech processor of a CI
using the Xilinx System Generator (XSG) tool on an FPGA. The filter in each
channel of the filter bank is designed with a Kaiser window of length 877 and
a stop-band attenuation of approximately 60 dB. Only with such a high-order
filter is the required sharp, narrow-band output obtained, which
makes extraction of the spectral energy possible. The spectral
energy of each band is needed to generate the stimulating pulses that excite the
microelectrodes placed at different locations along the cochlea. The input audio
signal is fed to a Distributed Arithmetic (DA) FIR filter block designed for the
critical band from 3145 Hz to 3655 Hz, which provides the filtered output for
that band. The other filters in the
filter bank are designed in the same way for the remaining critical bands,
sixteen in all. A random signal source is also used to verify the function of
the filter bank. The FIR filter in each channel splits the signal into critical
bands of frequencies, as the basilar membrane does in the biological cochlea. The
XSG-based DA FIR filter shows that design, analysis and testing of the filter
with a real-time signal can be completed in a short time. The FIR-based filter banks used
100% of the available resources on a Virtex II Pro board and 41% of the available
resources on a Virtex 7 board. Hence, IIR realisations of the auditory filter
bank are used in the remainder of this investigation.
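The Kaiser-window design step described above can be sketched in plain Python. This is a minimal illustration, not the XSG design of this study: the sampling rate of 16 kHz is an assumption (the text does not state it), the function names are hypothetical, and the window parameter is obtained from Kaiser's empirical formula for the attenuation region above 50 dB.

```python
import math


def bessel_i0(x, terms=40):
    """Zeroth-order modified Bessel function (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser_beta(attenuation_db):
    """Kaiser's empirical shape parameter, valid above 50 dB."""
    return 0.1102 * (attenuation_db - 8.7)


def kaiser_bandpass(num_taps, f_low, f_high, fs, attenuation_db=60.0):
    """Windowed-sinc band-pass design; num_taps is assumed to be odd."""
    beta = kaiser_beta(attenuation_db)
    m = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - m
        if t == 0:
            ideal = 2.0 * (f_high - f_low) / fs
        else:
            ideal = (math.sin(2 * math.pi * f_high * t / fs)
                     - math.sin(2 * math.pi * f_low * t / fs)) / (math.pi * t)
        arg = max(0.0, 1.0 - (t / m) ** 2)
        taps.append(ideal * bessel_i0(beta * math.sqrt(arg)) / bessel_i0(beta))
    return taps


def gain(taps, freq, fs):
    """Magnitude of the frequency response at one frequency."""
    re = sum(h * math.cos(2 * math.pi * freq * n / fs) for n, h in enumerate(taps))
    im = sum(h * math.sin(2 * math.pi * freq * n / fs) for n, h in enumerate(taps))
    return math.hypot(re, im)


FS = 16000  # assumed sampling rate in Hz, not given in the text
band_taps = kaiser_bandpass(877, 3145.0, 3655.0, FS)
```

Evaluating `gain(band_taps, 3400.0, FS)` near the band centre and `gain(band_taps, 2000.0, FS)` well outside the band shows how the long window gives a narrow pass band with steep skirts.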
To obtain the best outcome on auditory signals, the more
suitable Differential All Pole Gammatone Filter (DAPGF) is
designed in the digital domain, where reconfigurability can
be attained. Also, to obtain the non-linear phase response of the
target filter for mimicking the bionic ear, an IIR filter realization is
used. The high performance of this filter is accomplished by
using highly efficient adders, multipliers and appropriate delays.
To limit the power consumed by the large adder
circuitry, a ripple carry adder is used.
Compared with other multipliers, the
proposed multiplier with a regular partial
product array yields an adequate improvement in area, delay
and power consumption. The optimized results for the
filter architecture are achieved using VLSI DSP concepts such as
data broadcasting and retiming. The filter is modeled in Verilog
Hardware Description Language (HDL) and synthesized
using the Cadence Register Transfer Level (RTL) Compiler in
0.13 µm technology. Simulation results show that the area and
power consumption of the designed filter are 0.105 mm² and
760 µW respectively.
VLSI DSP transformations such as folding can reduce
the cost of the CI. DSP algorithms are used repetitively in these
processors for filtering and encoding operations. The critical
paths in these algorithms limit the performance of the speech
processor. The algorithms need to be transformed to obtain a
high-speed, low-area and low-power processor.
This can be realized by designing suitable auditory filter
banks for the processor based on digital VLSI signal processing
concepts. The folding algorithm is applied to the second-order
digital gammatone filter, which reduces the number of
multipliers from five to one and the number of adders from
three to one in the design without changing the characteristics
of the filter.
The delay-and-sum beamforming method used with the dual Technical
Pro (TP) microphone for speech enhancement
requires a specified delay to be applied to the signal from
one microphone, because the two microphones are placed 0.01 m
apart. This delay is not an integer number of samples, so a
fractional delay (FD) is required. Previously, the FD for this application
was generated at the algorithm level only (Chen Yousheng 2011),
whereas this work implements it at the architecture level. FD filters are
realised as FIR approximations and are categorised as Fixed Fractional
Delay (FFD) FIR and Variable Fractional Delay (VFD) FIR filters. These filters
are implemented on a Spartan 3E FPGA kit. To determine
a suitable method, such as the maximally flat FFD FIR, for
generating the FD required to compensate for the delay
introduced in the signal from one of the microphones, a general
analysis is carried out on the different methods available in the FFD
filter category. In the VFD category (Laakso et al 1996), in which
the delay value is updated online, only one method, Farrow-structure-based
Lagrange interpolation, is analysed and found
suitable for generating the required FD for the signal originating
from one of the microphones. The proposed work involves the
design of a maximally flat FFD FIR filter and a VFD FIR filter
using a CSD multiplier, which applies the numerical strength
reduction technique to reduce the area occupied and the power
consumed and to increase the speed of operation. It is found that
the maximally flat FFD FIR filter produced a fractional delay of 1.427
samples (x(n-1.427)) instead of x(n-1) when the distance
between the two microphones is 0.01 m and the system clock period
is 22.7 µs, whereas the VFD FIR filter produced an FD of only 1.4 samples
for the same specifications. For the maximally flat FFD FIR filter,
the CSD-based design has 35% less area and 96.67% less power
consumption, with a 5% increase in speed, compared with the
conventional direct-form FFD FIR filter.
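The maximally flat fractional delay FIR filter mentioned above corresponds to Lagrange interpolation, whose coefficients have a simple closed form. The following sketch is illustrative only: the function names and the filter order of 3 are assumptions, and only the delay of 1.427 samples comes from the text.

```python
def lagrange_fd_coefficients(delay, order):
    """Coefficients h[0..order] of the Lagrange (maximally flat)
    fractional delay FIR filter for a delay of `delay` samples:
    h[n] is the product over k != n of (delay - k) / (n - k)."""
    coeffs = []
    for n in range(order + 1):
        h = 1.0
        for k in range(order + 1):
            if k != n:
                h *= (delay - k) / (n - k)
        coeffs.append(h)
    return coeffs


def apply_fir(h, x):
    """Filter x with h, returning only fully overlapped output samples
    (the first output corresponds to input index len(h) - 1)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)))
            for n in range(len(h) - 1, len(x))]


# Delay from the text; the order of 3 is an assumption.
fd_taps = lagrange_fd_coefficients(1.427, 3)
ramp = [float(n) for n in range(10)]
delayed = apply_fir(fd_taps, ramp)
```

On the ramp input every output sample equals the input index minus 1.427, which is the behaviour x(n-1.427) described in the text. A Farrow structure would evaluate the same coefficients as polynomials in the delay so that the delay can be changed online.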
1.9 ORGANIZATION OF THE STUDY
This introductory chapter gives the background on the physiology
of the human ear, the history of CI, and the motivation, objectives, methodology and
contribution of the study. The different types of speech processing strategies
for CI are discussed, and the CIS algorithm is identified as the most common and
well-known speech processing strategy used in the speech processor of a CI.
Chapter 2 proposes DA based FIR filter architecture for the filter
banks used in the speech processor of CI using XSG tool.
Chapter 3 proposes a data-broadcast structure for the IIR realization of
the Gammatone Filter (GTF) to minimise the critical path and enhance
performance. The data-broadcast structure is implemented with a proposed
multiplier that occupies minimum area and has a low power-delay product.
Existing adders, namely the 4-2 compressor, Chinese abacus
adder, carry save adder and ripple carry adder, are briefly reviewed to determine the
better adder for summing the partial products of the multiplier and
for adding the speech signal components of the filter.
Chapter 4 proposes a folded architecture for the IIR realisation of the GTF
to reduce area and power and to enhance speed. The folding transformation is
applied to reduce hardware complexity by time multiplexing the hardware
components of the filter, such as the adder, multiplier and delay elements. The folded
architecture is implemented with the same multiplier and adder used in the
previous chapter.
Chapter 5 proposes a fractional delay FIR filter architecture for the
speech processor of a CI using numerical strength reduction techniques such as
CSD and Common Subexpression Elimination (CSE).
Chapter 6 draws conclusions and outlines future extensions of the
work.