
CHAPTER 1

INTRODUCTION

1.1 COCHLEAR IMPLANT

A cochlear implant (CI) is an electronic prosthetic device that is surgically implanted into the inner ear and restores partial hearing to the profoundly deaf. Unlike commercial hearing aids, which benefit patients with conductive hearing loss, a CI also benefits patients with sensorineural hearing impairment. It bypasses the normal hearing mechanism and directly stimulates the auditory nerve by delivering electrical signals to an electrode array implanted inside the cochlea. These electrical signals are derived from external sound picked up by a microphone. The sound signals are first processed by an external speech processor and then transmitted across a transcutaneous link, in the form of electromagnetic waves, to the inner ear, where they are finally converted into electrical pulses (Mishra and Hubbard 2002). With high success rates and increasing worldwide demand for implants, CI research has seen substantial growth and progress over the last two decades.

CI devices demand an ultra-low-power, high-performance signal processing unit to enhance the patient’s comfort, optimize speech intelligibility and improve sound quality. To meet this goal, the digital speech processor was developed using low-power design techniques and dedicated processing architectures. A general trend for new electronic ground-breaking products appears to be that they consist of highly integrated

2) Internal Function: The internal receiver receives signals from the external transmitter, the stimulator sends impulses to the inside of the cochlea, the electrodes stimulate the cochlear or auditory nerve, and the signals are then passed to the brain. Also, the magnet holds the external components in place.

1.3 HISTORY OF CI

An overview of the history shows the slow progression and the breakthroughs that the CI has undergone from its beginnings to the present day. Table 1.1 lists the events that led to the Conceptualization (1800-1949), Research and Development (1950-1979) and Commercialization (1980-2013) of the CI.

1.3.1 Conceptualization

Early in the year 1800, Alessandro Volta experimented on his own ears: he passed a stimulating current, of the order of microamperes, through wire leads connected across a 50 V battery, felt a jolt, and heard a noise similar to the boiling of thick soup. In honour of Volta’s work, the electrical unit, the volt, was named after him. For another 150 years, research on stimulating the auditory nerve was done only with Direct Current (DC) stimulation. It was not until 1937-1940 that Stevens and his colleagues identified three mechanisms underlying the stimulation of the auditory nerve (Fan-Gang Zeng 2008). The first mechanism was the action of the eardrum in converting the electrical stimulation into an acoustic signal. The second mechanism was related to the “electromechanical effect,” in which electric stimulation caused the hair cells in the cochlea to vibrate, resulting in a perceived tonal pitch at the signal frequency, as if the ear had been acoustically stimulated. Only the third mechanism was related to direct electric activation of the auditory nerve: the subjects reported a noise-like sensation in response to sinusoidal electrical stimulation, much steeper loudness growth with electric current, and occasional activation of non-auditory facial nerves.

1.3.2 Research and Development

The research and development phase legitimized the utility and safety of electric stimulation. In the middle of the last century, physicians became the driving force in translating these early research efforts into clinical practice. In 1957, the French physician Djourno and his colleagues reported successful hearing using electric stimulation in two totally deaf patients. In 1964, Blair Simmons at Stanford implanted a cluster of six stainless-steel electrodes into the auditory nerve, through the modiolus, in a 60-year-old man with profound hearing loss. In 1971, Robin Michelson in San Francisco implanted a form-fitting (from human temporal bones) single-channel electrode pair in four deaf patients. In 1978, Graeme Clark in Australia developed a 20-electrode (platinum rings) CI system and implanted it in two deaf patients.

1.3.3 Commercialisation

To settle the scientific issues, Michael Merzenich and colleagues organized the First International Conference on Electrical Stimulation of the Acoustic Nerve as a Treatment for Profound Sensorineural Deafness in Man in San Francisco in 1973 (Fan-Gang Zeng 2008). As shown in Table 1.1, one outcome of this conference was an intensified research effort in CIs, particularly using animal experiments, from the mid-1970s to the 1980s. In 1975, to settle the safety and efficacy issues, the National Institutes of Health (NIH) commissioned Bilger and colleagues at the University of Pittsburgh to evaluate, objectively and independently, the audiological performance of the world’s first group of single-electrode CI recipients, including 11 implanted by House and 2 by Michelson. Bilger’s report confirmed that single-electrode CIs provided useful hearing, in that they aided lip reading and the identification of common environmental sounds, but that these devices could not provide open-set speech recognition. These events led to the commercialization phase, which saw widespread use of electric stimulation in treating sensorineural hearing loss.

Table 1.1 Events in the development of CI (Fan-Gang Zeng et al 2008)

Conceptualization: 1800-1949
  1800           Volta used his battery to show that electric stimulation can invoke sensations, including hearing.
  1937-1940      Stevens and colleagues identified three mechanisms underlying electric sensations.

Research & Development: 1950-1979
  1957           Djourno and Eyries introduced electric hearing in two deaf patients.
  1961           House implanted two patients.
  1964           Simmons implanted a patient.
  1971           Michelson implanted four patients.
  1973           First International Conference of Electrical Stimulation, San Francisco.
  1977           Bilger report confirmed the effectiveness of CIs.
  1978           Clark implanted a multi-channel device in two patients.

Commercialization: 1980-2008
  1983           First of the Biennial Conferences under the auspices of the Gordon Research Conference.
  Nov 26, 1984   3M/House became the first FDA approved CI.
  Oct 31, 1985   Nucleus 22 became the first FDA approved multi-channel device.
  1988           First NIH Consensus Statement on CIs.
  Jun 27, 1990   FDA approved Nucleus 22 device in children.
  Jun 26, 1997   FDA approved Clarion device.
  Aug 20, 2001   FDA approved Med-El device.


In the US, the manufacturer 3M commercialised the CI as a product and was the first to win US Food and Drug Administration (FDA) approval, in 1984, for the 3M/House single-electrode implant; it became the industry leader, with several hundred users in the mid-1980s. Meanwhile, supported by a grant from the Australian Department of Productivity, the University of Melbourne and Nucleus Limited (a medical device company focusing on pacemakers) entered a public and private cooperative agreement in 1979 to manufacture and market the 22-electrode CI. In the mid-1980s, the NIH also helped speed up the acceptance of multi-electrode CIs by funding the University of Melbourne device development and hosting the first consensus conference. The conference concluded that “multichannel implants may have some superior features in adults when compared with the single-channel type”. Time proved that the multi-electrode devices not only performed much better than the single-electrode devices but also eventually displaced them from the market.

In addition to the Nucleus device, several other multi-electrode devices were developed. The University of Utah developed a six-electrode implant with a percutaneous plug interface, referred to in the literature as either the Ineraid or the Symbion device, which was uniquely suited for research purposes. The University of Antwerp in Belgium developed the Laura device, which could deliver either 8-channel bipolar or 15-channel monopolar stimulation. These devices were later bought out and are no longer available commercially. The MXM laboratories in France also developed a 15-channel monopolar device, the Digisonic MX20, which is marketed by Neurelec (www.neurelec.com).

The three major CI manufacturers are Advanced Bionics Corporation (USA), Med-El Corporation (Austria) and Cochlear Corporation (Australia), with Cochlear the dominant player, controlling 70–80% of the worldwide CI market. Several other manufacturers are developing advanced and low-cost multi-electrode CIs, such as Advanced Cochlear Systems in Seattle, Washington, Nurobiosys Corporation in Seoul, Korea, and Nurotron Biotechnology Inc., based in both Irvine, CA and Hangzhou, China.

Figure 1.3 Generations of CI systems (Patrick 2006)

1.4 SPEECH PROCESSOR OF CI

Figure 1.3 shows the development stages of the Nucleus® series over the years. The speech processor became progressively smaller with each new generation. As Figure 1.3 shows, in 1980 the processor was large and was worn on the waist. Owing to rapid progress along the semiconductor technology roadmap, the CI was gradually miniaturised, as shown in Figure 1.4, and its external speech processor now sits just behind the ear. As the speech processor shrank, the lifetime of the batteries that power it was extended, relieving the patient of the burden of changing batteries frequently.

Figure 1.4 Road map of CIs from generation I to the present generation

In the near future, these devices will be fully implanted inside the body, so that deaf people will be indistinguishable from everyone else in both appearance and ability to hear. Hence, when the area must be miniaturised to fit inside the narrow ear canal, Very Large Scale Integration Digital Signal Processing (VLSI DSP) concepts must be applied to design the individual blocks of the speech processor according to the speech processing strategy adopted.

1.4.1 Speech Processing Strategy

A speech processing strategy, or speech coding strategy, is one of the key features that affect the overall performance of the device (Loizou 1998). The speech processor emulates the functioning of the inner ear by dividing the speech signal into 12 to 22 frequency bands and extracting the signal strength in each band in order to excite the corresponding implanted electrode. Depending on the speech processing strategy, the speech processor extracts various parameters from the acoustic signals and converts them into electrical signals. Various speech processing strategies for cochlear prostheses have been developed and reported in the literature over time (Wilson 1991, Loizou 2003, Wilson et al 2007). These include Continuous Interleaved Sampling (CIS), Spectral Peak (SPEAK) (Ahmad et al 2009), Advanced Combination Encoder (ACE), Spectral Maxima Sound Processor (SMSP), Simultaneous Analog Stimulation (SAS), Paired Pulsatile Sampler (PPS), Quadruple Pulsatile Sampler (QPS) and various formant-based strategies. Numerous algorithms based on the Wavelet Transform (Paglialonga et al 2006), Wavelet Packets (Nogueira et al 2006), the Bionic Wavelet Transform (Yao 2002) and Auditory Models (Grayden et al 2004) are also found in the literature. In contrast to these traditional approaches, various algorithms have been developed especially for tonal languages, which emphasize the extraction of maximum tonal and pitch information from speech (Lan 2004, Wilson et al 2007a). The two most commonly used speech processing strategies are Continuous Interleaved Sampling (CIS) and Advanced Combination Encoder (ACE). Both strategies can be realised using either a filter bank approach or an FFT approach. The major CI manufacturers use the CIS processing strategy. Hence, the work proposed in this study adopts the filter bank approach to the CIS strategy.

1.4.1.1 Continuous interleaved sampling strategy

The CIS strategy filters the incoming speech into the required number of bands, obtains the speech envelope and compresses the signal for each channel. Stimulation consists of interleaved digital pulses that sweep rapidly through the channels at a rate of 833 pulses per second per channel; when all eight channels are used, the maximum overall pulse rate is 6664 pulses per second (8 × 833 = 6664). With the CIS strategy, rapid changes in the speech signal are tracked by rapid variations in pulse amplitude. The pulses are delivered to consecutive channels in sequence, never simultaneously, to avoid channel interaction. Hence the CIS strategy is regarded as one of the most useful and well-known speech processing strategies in CI. Figure 1.5 shows the functional block diagram for speech processing using CIS.

Figure 1.5 Functional block diagram of CIS strategy for speech processing
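The interleaving arithmetic of the CIS strategy can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name cis_pulse_schedule is hypothetical, and the even spacing of the pulses within each stimulation cycle is assumed rather than taken from the cited strategy.

```python
from fractions import Fraction

def cis_pulse_schedule(num_channels, rate_per_channel, num_cycles):
    """Return (time_in_seconds, channel) pairs for interleaved stimulation.

    Each channel fires once per cycle and no two channels fire together.
    Even spacing of pulses within a cycle is an assumption for illustration.
    """
    total_rate = num_channels * rate_per_channel   # overall pulses per second
    slot = Fraction(1, total_rate)                 # time between consecutive pulses
    schedule = []
    for cycle in range(num_cycles):
        for channel in range(num_channels):
            index = cycle * num_channels + channel
            schedule.append((index * slot, channel))
    return schedule

# Eight channels at 833 pulses per second each, as in the text.
schedule = cis_pulse_schedule(num_channels=8, rate_per_channel=833, num_cycles=2)
print(8 * 833)                           # 6664
print([ch for _, ch in schedule[:8]])    # [0, 1, 2, 3, 4, 5, 6, 7]
```

Because every pulse has its own time slot, each channel is revisited after exactly 1/833 s, and at no instant are two electrodes driven together.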

1.5 MOTIVATION TO USE CIS ALGORITHM

In 1998, primary work on auditory signal processing was carried out at the University of Pennsylvania as the doctoral research of Ahmed Ali. His work proposed replicating cochlea-like filtering behaviour in order to recognise more spontaneous speech than could be recognised by the existing system, which had very limited perplexity and artificial grammar constraints. Cheng and Edelman (2002) attempted to implement 36 analog programmable cascaded filters on a Xilinx Field Programmable Gate Array (FPGA) chip. However, their design, which required roughly 80,000 gates, was too complex for the single proposed chip, which contained only 5,000 gates. Mishra et al (2002) designed a digital cochlear filter and implemented it on a Xilinx XC4010 FPGA. The filter, a tenth-order recursive filter implemented as a parallel combination of low-order elements, gave a good fit to biological data. Leong (2002) used a module generator tool to develop an FPGA-based implementation of an electronic cochlea filter with arbitrary precision. FPGA-based implementations of the cochlea filter offered shorter design times, improved dynamic range, higher accuracy and a simpler computer interface. Lee (2001) attempted the first design of a 16-channel filter system. Because of time limitations, his designs were only simulated and not implemented on an FPGA chip. Chen et al (2002) continued the project based on Lee’s (2001) work. They were able to download a 16-channel first-order band-pass filter system onto a single FPGA. Their design showed only minimal promise in the eleventh filter of their work, and time constraints left them unable to optimize their system design. Hinck and Todd (1999) unsuccessfully attempted to implement a 6 to 10 channel digital cochlea filter on an FPGA. Watts and Lyod (2000) built a functioning real-time, high-resolution, 240-tap, 10-octave cochlear model, sampling at 44 kHz, on multiple FPGAs. Rekha et al (2008) designed and implemented a digital cochlea filter on an FPGA that gave a good fit to real-time data with efficient hardware usage. All these previous works aimed to design low-power, low-area and high-performance architectures for the filter bank of the speech processor.

CIS is one of the most useful and well-known speech processing strategies used in the speech processors of CIs. Realizing the CIS algorithm in hardware is a laborious task because of its high computational cost, and this is the motivation for this work. Real-time constraints and low-power design demand an optimized hardware realization of the algorithm. In CIS algorithms, the programs that filter the incoming signal are non-terminating: the same programs are executed repetitively on the same hardware over an infinite time series. This non-terminating nature is exploited to design a more efficient speech processor by taking advantage of the dependencies between tasks, both within an iteration and across multiple iterations. The long critical paths in the individual filters of the filter bank limit the performance of the processor. Using suitable VLSI DSP transformations, the filter algorithms are transformed to obtain high-speed, low-area and low-power implementations. These transformed CIS algorithms make the speech processor of the CI compact and less power hungry, and also enhance its performance.

1.6 OBJECTIVES

The primary objective of this study is to develop new DSP architectures for the filter in the speech processor of a CI with the following features:

minimized area of the filter.

reduced power consumption of the speech processor.

enhanced performance of the filter.

This objective is achieved, with trade-offs, by using VLSI DSP techniques such as the folding transformation, numerical strength reduction, retiming and pipelining, which lead to low-cost and low-power CIs. Because the degree of hearing defect varies from one patient to another, the cochlear filter must be designed with coefficients suited to the particular degree of hearing defect of each patient. Hence, reconfigurability plays a major role in the design of the digital filter: it allows the filter to be designed to particular specifications, reduces the effort required for a new design, and helps deliver an economical product.


1.7 METHODOLOGY

In the filter bank of the speech processor, the long critical path in each filter limits the performance of the system and must be shortened by selecting appropriate VLSI DSP transformations. Transformations such as pipelining, retiming and data broadcasting significantly reduce the critical path, which ultimately enhances the performance of the system.
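The effect of data broadcasting on the critical path can be illustrated with a small Python model of an FIR filter. This is a behavioural sketch under stated assumptions, not the architecture developed in this study: the coefficient values, the delay figures t_mult and t_add, and the assumption of a linear (non-tree) adder chain in the direct form are all chosen for illustration.

```python
def fir_direct(coeffs, samples):
    """Direct form: y[n] = sum_k h[k] * x[n-k], computed from a delay line."""
    delay_line = [0.0] * len(coeffs)
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]
        out.append(sum(h * d for h, d in zip(coeffs, delay_line)))
    return out

def fir_data_broadcast(coeffs, samples):
    """Data-broadcast (transposed) form: x[n] is broadcast to every multiplier,
    and a register sits between each pair of adders."""
    n = len(coeffs)
    regs = [0.0] * (n - 1)            # registers holding partial sums
    out = []
    for x in samples:
        products = [h * x for h in coeffs]
        y = products[0] + (regs[0] if regs else 0.0)
        # Each new register value needs only one product and one addition.
        regs = [products[i + 1] + (regs[i + 1] if i + 1 < n - 1 else 0.0)
                for i in range(n - 1)]
        out.append(y)
    return out

def critical_path(num_taps, t_mult, t_add, data_broadcast):
    """Longest combinational delay for num_taps >= 2 (linear adder chain assumed)."""
    if data_broadcast:
        return t_mult + t_add
    return t_mult + (num_taps - 1) * t_add

h = [0.5, 0.25, -0.125, 0.0625]       # illustrative coefficients
x = [1.0, 2.0, -3.0, 4.0, 0.0, 5.0]
print(fir_direct(h, x) == fir_data_broadcast(h, x))
print(critical_path(4, t_mult=10, t_add=2, data_broadcast=False))  # 16
print(critical_path(4, t_mult=10, t_add=2, data_broadcast=True))   # 12
```

Both forms compute the same output sequence; only the position of the registers changes, so the data-broadcast form has a critical path of one multiplication and one addition regardless of the number of taps.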

In synthesizing the filter architecture, it is necessary to minimize silicon area, which is achieved by reducing the number of functional units such as adders, multipliers, registers and multiplexers, as well as the interconnection wires. Executing multiple algorithm operations on a single functional unit reduces the number of functional units in the implementation and hence the silicon area. This technique, called the folding transformation, is used in this study to reduce the silicon area of the whole speech processor, whose major processing block is the filter bank.
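The idea of folding can be sketched in Python by time-multiplexing the five multiplications of a second-order IIR section onto one multiplier and one adder. The sketch is a behavioural model only: the direct-form structure, the coefficient values and the function names are assumptions made for illustration, and the schedule shown is not the folding set derived in this study.

```python
def biquad_parallel(b, a, samples):
    """Reference model: all five multiplications happen in the same clock cycle.

    Transfer function: (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2),
    with b = (b0, b1, b2) and a = (a1, a2).
    """
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def biquad_folded(b, a, samples):
    """Folded model: one multiplier and one adder reused over five cycles per sample."""
    coeffs = [b[0], b[1], b[2], -a[0], -a[1]]
    x1 = x2 = y1 = y2 = 0.0
    out = []
    multiplier_uses = 0
    for x in samples:
        operands = [x, x1, x2, y1, y2]
        acc = 0.0
        for cycle in range(5):                          # folding factor of 5
            product = coeffs[cycle] * operands[cycle]   # the single shared multiplier
            acc = acc + product                         # the single shared adder
            multiplier_uses += 1
        x2, x1 = x1, x
        y2, y1 = y1, acc
        out.append(acc)
    return out, multiplier_uses

b = (0.25, 0.5, 0.25)      # illustrative coefficients
a = (-0.5, 0.25)           # poles at radius 0.5, so the section is stable
x = [1.0, 0.0, 0.0, 2.0, -1.0, 0.5]
folded, uses = biquad_folded(b, a, x)
print(folded == biquad_parallel(b, a, x), uses)
```

The saving in area is paid for in time: the shared multiplier must complete five operations per input sample, so the folded hardware has to be clocked five times faster than the sample rate.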

Numerical transformation techniques are used to reduce the strength of the filter computation. These transformations rely on the Canonical Signed Digit (CSD) representation and sub-expression elimination of the filter coefficients to restructure the computation so that its performance, in terms of speed, power and area, is improved. Figure 1.6 gives an overview of the study.
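The CSD representation can be illustrated with a short Python sketch that recodes an integer coefficient into digits from {-1, 0, +1} and then multiplies using only shifts, additions and subtractions. The function names to_csd and csd_multiply are hypothetical, the coefficients are assumed to be non-negative integers (for example, quantised filter coefficients), and sub-expression elimination is not shown.

```python
def to_csd(value):
    """Return the CSD digits (-1, 0, +1) of a non-negative integer,
    least significant digit first."""
    digits = []
    while value != 0:
        if value % 2 == 0:
            digit = 0
        else:
            digit = 2 - (value % 4)   # +1 if value % 4 == 1, -1 if value % 4 == 3
            value -= digit
        digits.append(digit)
        value //= 2
    return digits

def csd_multiply(x, coeff):
    """Multiply integer x by a non-negative integer coefficient without a multiplier."""
    result = 0
    for shift, digit in enumerate(to_csd(coeff)):
        if digit == 1:
            result += x << shift
        elif digit == -1:
            result -= x << shift
    return result

print(to_csd(7))            # [-1, 0, 0, 1], i.e. 7 = 8 - 1
print(csd_multiply(5, 7))   # 35
```

In plain binary, 7 = 4 + 2 + 1 needs two additions, whereas the CSD form 8 - 1 needs a single subtraction. Fewer non-zero digits mean fewer adders in a hardwired constant multiplier, which is the source of the area and power savings.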

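Figure 1.6 Overview of the study
[Block diagram linking the Cochlear Implant, Speech Processor and Filter Bank blocks to Chapters I to VI. Chapter II: 16-channel DA-based FIR filter bank (FIR realization of auditory filter, splitting of speech spectrum into 16 bands). Chapter III: IIR realization of GTF (critical path reduction, enhanced performance). Chapter IV: IIR realization of GTF (folding transformation, low area, low power). Chapter V: fractional delay FIR filter (numerical strength reduction, low area, low power, high speed). Chapter VI: conclusion.]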

1.8 AIMS AND CONTRIBUTIONS OF THIS THESIS

The aim of this study is to determine new architectures for the filter used in the speech processor of a CI.

The filter bank occupies the major portion of the speech processor, which forms the external part of the cochlear implant, and so it should occupy as little area as possible. It is therefore imperative to design an optimized digital VLSI architecture that is tailor-made to meet these requirements, as discussed by Mishra and Hubbard (2002). Mahalakshmi and Reddy (2010) designed narrow band-pass FIR filters for 16 channels at the algorithm level, using a Kaiser window with a sharp transition band, to decompose audio signals into multiple frequency bands.

This work investigates the design and hardware implementation of a narrow band-pass FIR filter for the speech processor of a CI using the Xilinx System Generator (XSG) tool on an FPGA. The filter in each channel of the filter bank is designed with a Kaiser window of length 877 and a stop-band attenuation of approximately 60 dB. Only a filter of this high order gives the sharp, narrow band-pass output that makes extraction of the spectral energy possible. The spectral energy of each band is needed to generate the stimulating pulses that excite the microelectrodes placed at different locations along the cochlea. The input audio signal is fed to a DA FIR filter block designed for the critical band from 3145 Hz to 3655 Hz, which provides the filtered output for that particular frequency band. The other filters in the filter bank are designed in the same way for the sixteen critical bands of frequencies. A random signal source is also used to verify the function of the filter bank. The FIR filter in each channel splits the signal into critical bands of frequencies, as the basilar membrane does in the biological cochlea. The XSG-based DA FIR filter shows that design, analysis and testing of the filter with real-time signals can be completed in a short time. The FIR-based filter banks used 100% of the available resources on a Virtex II Pro board and 41% of the available resources on a Virtex 7 board. Hence, IIR realisations of the auditory filter bank are used in the rest of this investigation.

To obtain the best outcome on auditory signals, the more suitable Differential All Pole Gammatone Filter (DAPGF) is designed in the digital domain, where reconfigurability can be attained. An IIR filter realization is used to obtain the non-linear phase response that the target filter needs in order to mimic the bionic ear. The high performance of this filter is achieved by using highly efficient adders, multipliers and appropriate delays. To minimize the power consumed by the large adder circuitry, a ripple carry adder is used. Compared with other multipliers, the proposed multipliers with a regular partial product array give an adequate improvement in area, delay and power consumption. The optimized results for the filter architecture are achieved using VLSI DSP concepts such as data broadcasting and retiming. The filter is modeled in Verilog Hardware Description Language (HDL) and synthesized using the Cadence Register Transfer Level (RTL) Compiler in 0.13 µm technology. Simulation results show that the area and power consumption of the designed filter are 0.105 mm² and 760 µW respectively.

VLSI DSP transformations such as folding can reduce the cost of the CI. DSP algorithms are executed repetitively in these processors for filtering and encoding operations, and the critical paths in these algorithms limit the performance of the speech processor. The algorithms therefore need to be transformed to obtain a high-speed, low-area and low-power processor. This can be realized by designing suitable auditory filter banks for the processor based on digital VLSI signal processing concepts. The folding algorithm is applied to the second-order digital gammatone filter, which reduces the number of multipliers from five to one and the number of adders from three to one without changing the characteristics of the filter.

The delay-sum beamforming method used in the dual Technical Pro (TP) microphone for speech enhancement requires a specified delay to be applied to the signal coming from one microphone, as the two microphones are placed 0.01 m apart. The specified delay is not an integer number of samples, so a fractional delay (FD) is required. Previously, the FD for this application was generated at the algorithm level only (Chen Yousheng 2011), whereas this work implements it at the architecture level. FD filters are realised as FIR approximations and are categorised as Fixed Fractional Delay (FFD) FIR and Variable Fractional Delay (VFD) FIR filters. These filters are implemented on a Spartan 3E FPGA kit. A general analysis of the methods available in the FFD category is carried out to determine a suitable method, such as the maximally flat FFD FIR filter, for generating the FD required to compensate for the delay introduced in the signal from one of the microphones. In the VFD category (Laakso et al 1996), in which the delay value is updated online, only one method, Farrow-structure-based Lagrange interpolation, is analysed, and it is found suitable for generating the required FD for the signal originating from one of the microphones. The proposed work involves the design of a maximally flat FFD FIR filter and a VFD FIR filter using a CSD multiplier, which applies the numerical strength reduction technique to reduce the area occupied and the power consumed and to enhance the speed of operation. It is found that the maximally flat FFD FIR filter produced a fractional delay of 1.427 samples (x(n-1.427)) instead of x(n-1) when the distance between the two microphones is 0.01 m and the system clock period is 22.7 µs, whereas the VFD FIR filter produced an FD of only 1.4 samples for the same specifications. For the maximally flat FFD FIR filter, the CSD-based implementation has 35% less area and 96.67% less power consumption, with a 5% increase in speed, compared with the conventional direct-form FFD FIR filter.
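A maximally flat fractional delay FIR filter can be obtained from the Lagrange interpolation formula, h[n] = product over k ≠ n of (D - k)/(n - k), for n = 0..N, where D is the desired delay in samples and N is the filter order. The Python sketch below computes these coefficients for the delay of 1.427 samples quoted above. The filter order of 3 and the function names are assumptions made for illustration; the sketch does not reproduce the CSD-based architecture or the Farrow structure of this study.

```python
def lagrange_fd_coeffs(delay, order):
    """Coefficients h[0..order] of a Lagrange (maximally flat) fractional delay FIR filter."""
    coeffs = []
    for n in range(order + 1):
        h = 1.0
        for k in range(order + 1):
            if k != n:
                h *= (delay - k) / (n - k)
        coeffs.append(h)
    return coeffs

def apply_fir(coeffs, samples):
    """y[n] = sum_k h[k] * x[n-k], with zeros assumed before the first sample."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out

D = 1.427          # delay in samples, as quoted in the text
ORDER = 3          # assumed order; Lagrange filters are most accurate near D = ORDER / 2
h = lagrange_fd_coeffs(D, ORDER)
ramp = [float(n) for n in range(10)]
delayed = apply_fir(h, ramp)
print([round(c, 4) for c in h])
print(round(delayed[5], 3))   # a ramp delayed by D samples: 5 - 1.427 = 3.573
```

Lagrange interpolation is exact for polynomial inputs up to the filter order, so once the filter has filled, a ramp x[n] = n comes out as n - D. For an integer delay the coefficients reduce to a single unit tap, which is the x(n-1) case mentioned in the text.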

1.9 ORGANIZATION OF THE STUDY

This introductory chapter gives the background on the physiology of the human ear, the history of CI, and the motivation, objectives, methodology and contributions of the study. The different types of speech processing strategies for CI are discussed, establishing that the CIS algorithm is the most common and well-known speech processing strategy used in the speech processor of a CI.

Chapter 2 proposes a DA-based FIR filter architecture, designed with the XSG tool, for the filter banks used in the speech processor of a CI.

Chapter 3 proposes a data-broadcast structure for the IIR realization of the Gammatone filter (GTF) to minimise the critical path and enhance performance. The data-broadcast structure is implemented with a proposed multiplier that occupies minimum area and has a low power-delay product. Existing adders, namely the 4-2 compressor, Chinese abacus adder, carry save adder and ripple carry adder, are briefly reviewed to determine the better adder for summing the partial products of the multiplier and for adding the speech signal components of the filter.

Chapter 4 proposes a folded architecture for the IIR realisation of the GTF to reduce area and power and to enhance speed. The folding transformation is applied to reduce hardware complexity by time-multiplexing the hardware components of the filter, such as the adder, the multiplier and the delay elements. The folded architecture is implemented with the same multiplier and adder used in the previous chapter.

Chapter 5 proposes a fractional delay FIR filter architecture for the speech processor of a CI using numerical strength reduction techniques such as CSD and common sub-expression elimination (CSE).

Chapter 6 draws conclusions and outlines future extensions of the work.