ANALYSIS OF AUDITORY EVOKED POTENTIAL
by
VIKASH DAGA, B.E.
A THESIS
IN
ELECTRICAL ENGINEERING
Submitted to the Graduate Faculty of Texas Tech University in
Partial Fulfillment of the Requirements for
the Degree of
MASTER OF SCIENCE
IN
ELECTRICAL ENGINEERING
Approved
December, 2002
ACKNOWLEDGEMENTS
I am thankful to my advisor Dr. Mary Baker for her help, support and guidance
throughout the project. I am grateful to Dr. Dwayne Paschall for providing the data set,
guiding me, and agreeing to be on my graduate committee to evaluate my work. I am also
grateful to Dr. Thomas Trost for being on my graduate committee and evaluating my
work.
I am thankful to all my friends who provided me support throughout my work and
study, and especially thankful to Sri Raja for guiding me through some bad times. I am
thankful to my parents and my family for their support, without which I would not be where I
am right now. It is because of their love, guidance and encouragement that I was able to
make it so far.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS ii
LIST OF TABLES v
LIST OF FIGURES vi
CHAPTER
1. INTRODUCTION 1
1.1 Aim of the thesis 1
1.2 Introduction to EEG and ABR 1
1.3 Earlier work and thesis outline 2
2. AUDITORY BRAINSTEM RESPONSE 3
2.1 Electroencephalography 3
2.2 Auditory evoked potential 3
2.3 Auditory brainstem response 5
2.4 Instrumentation 6
2.5 Interpreting ABR waveforms 9
2.6 Absolute latency 11
2.7 Interwave latency intervals 12
2.8 Latency curve 13
2.9 Clinical application 14
3. WAVELET TRANSFORMATION 16
3.1 Mathematical transformation 16
3.2 Fourier transform 16
3.3 Short-time Fourier transform 17
3.4 Wavelet transform 17
3.5 Scale 18
3.6 Time and frequency resolution 20
3.7 Discrete wavelet transform 23
4. PROCEDURE 27
4.1 Introduction of data set 27
4.2 Data collection 27
4.3 Preprocessing 29
4.4 Slope method 30
4.5 Wavelet transform 32
4.6 Spectrogram approach 35
5. RESULT AND CONCLUSION 36
5.1 Slope method 36
5.2 Wavelet analysis 36
5.3 Conclusion 41
5.4 Future work 41
REFERENCES 43
LIST OF TABLES
2.1 Summary of auditory evoked potentials 4
2.2 Normal values for adult females 10
5.1 Average and standard deviation of wave V latency 41
LIST OF FIGURES
2.1 Neural generators of the ABR in humans 6
2.2 Electrode placement for ABR recording 7
2.3 Block diagram of ABR recording 10
2.4 Normal ABR waveform of an adult male 12
2.5 Latency curve 13
2.6 Latency curve comparison 15
3.1 A cosine wave for s =1 19
3.2 A cosine wave for s > 1 19
3.3 A cosine wave for s < 1 20
3.4 Time and frequency diagram for wavelet transform 21
3.5 Time and frequency diagram of STFT 22
3.6 Time and frequency diagram of Fourier transform 22
3.7 Different levels of wavelet decomposition of signal 26
4.1 10-20 International system for electrode placement 28
4.2 Test details 29
4.3 A typical waveform 30
4.4 Slope method 32
4.5 Daubechie wavelets 33
4.6 Reconstructed waveform 34
4.7 Spectrogram 35
5.1 Reconstructed waveform at different wavelet coefficients 37
5.2 Latency curve comparison 38
5.3 Wave V latency for 80 dB 38
5.4 Wave V latency for 60 dB 39
5.5 Wave V latency for 55 dB 39
5.6 Wave V latency for 40 dB 40
5.7 Wave V latency for 20 dB 40
CHAPTER 1
INTRODUCTION
1.1 Aim of the thesis
Auditory brainstem response is the response of the auditory nervous system to an
acoustic stimulus. It is characterized by a series of peaks. The time of occurrence of these
peaks has clinical importance in assessing hearing disorders of patients and identifying
the region of disorder. The proper identification of the occurrence of these peaks is very
difficult due to noise in the signal as well as background EEG signals. The aim of the
thesis is to identify one of the several peaks (wave V) with the help of various signal
processing techniques.
1.2 Introduction to EEG and ABR
Electroencephalography measures electrical signals from the brain, recorded with the
help of electrodes placed on the scalp. EEG recordings are used to diagnose various
neurological disorders. The EEG signal can be analyzed visually by neuroscientists to
yield some information, but computerized methods have been developed to analyze the
signals to get a better understanding of the human brain.
Auditory brainstem response occurs in the first 10-15 ms of the auditory evoked
potential. The auditory brainstem response is used in the following:
1. estimating the hearing level in different patients,
2. screening newborns who are at risk for hearing loss and
3. evaluating patients with suspected retrocochlear pathology.
The auditory brainstem response, or ABR as it is commonly known, is identified
by the presence of various peaks which are produced when the stimulus is encoded by
neurons along the ascending auditory pathway. The various peaks in the ABR are named
wave I, II, III and so on in order of their occurrence. The most important of these peaks
is the wave V peak, which is produced by the lateral lemniscus [1]. These peaks can
be identified in normal cases, but in abnormal cases they are hidden by the
presence of noise. The diagnosis is done by looking at the time of occurrence of the peak,
which is known as the latency of the peak.
1.3 Earlier work and thesis outline
Automatic recognition of wave V in the auditory brainstem response has been the
subject of research for the past few years, and a lot of work has been done in the field.
Various approaches such as pattern recognition, neural networks and signal processing
techniques have been used for automatic detection of wave V peaks. Habrakan [2] used
neural networks to extract features to identify peak V in the brainstem auditory evoked
potential. Wilson [3] used discrete wavelet analysis for peak identification.
In this thesis, Chapter 2 describes the evoked potential and gives a brief
description of the auditory brainstem response. Chapter 3 describes the basics of the wavelet
transformation and the reason for its use in the thesis. Chapter 4 gives details about
the procedure followed, from data collection to processing. Chapter 5 summarizes the
results and conclusions as well as the future scope of the work.
CHAPTER 2
AUDITORY BRAINSTEM RESPONSE
2.1 Electroencephalography
Neurons exchange information in the form of electric signals. The recording of
this electrical activity of the neurons is called electroencephalography, or EEG. The
German psychiatrist Hans Berger was the first to measure these signals in the late
1920s.
EEG signals are used for clinical and research purposes. The most common
method of recording an EEG signal is by placing surface electrodes on the scalp. The
electrical response to specific stimuli is known as the evoked potential. The evoked
potential can be a visual evoked potential caused by a flash of light or an auditory evoked
potential caused by an audio click.
2.2 Auditory evoked potential
Electrical potentials recorded due to an acoustic stimulus are known as auditory
evoked potentials. The auditory stimulus can be a click, tone burst, white noise, or
others. Latency is the time interval between the presentation of the stimulus and the
occurrence of the corresponding response. Auditory evoked potentials are usually divided
into three time epochs each with different latencies. These are [4]:
1. fast response - latency of 0-10 ms,
2. middle response - latency of 10-50 ms and
3. slow response - latency of 50-500 ms.
Table 2.1 gives a list of all the auditory evoked potentials that occur in the first
500 ms after the onset of the stimulus along with their latencies, neural generators and the
places where the potentials are recorded [1].
Table 2.1 Summary of auditory evoked potentials [1]

Response | Latency | Recording site | Generator(s) | Stimulus
Electrocochleography | 0.2-4.0 ms | Middle ear at the promontory | Cochlea (CM and SP), VIIIth nerve (AP) | Pulses
Auditory Brainstem Response | 1.5-10.0 ms | Vertex to earlobe or mastoid | VIIIth nerve (AP), brainstem nuclei and tracts | Pulses, tonebursts
Frequency Following Response | 6-25 ms | Vertex to earlobe or mastoid | VIIIth nerve (AP), brainstem nuclei and tracts | Tones
SN10 Response | 8-12 ms | Vertex to earlobe or mastoid | Not known | Tonebursts
Middle Latency Response (MLR) | 10-80 ms | Vertex to earlobe or mastoid | Thalamus, primary auditory cortex? | Pulses, tonebursts, tones
40 Hz Response | Every 25 ms | Vertex to earlobe or mastoid | Not known | Pulses, tonebursts, tones
Late Potentials (N1-P2, P300, CNV) | 80-500 ms | Vertex to earlobe or mastoid | Primary auditory and association cortex | Pulses, tonebursts, tones
2.3 Auditory brainstem response
The ABR is a sequence of electrical potentials that are generated in the brainstem
and central auditory pathway in response to a stimulus in the ear. The ABR occurs as the fast
response of the auditory evoked potential [4]. The ABR reflects the transmission of the
stimuli through the brainstem auditory pathways. A normal waveform is usually
characterized by a series of 5 to 7 peaks that occur within 12 ms after the onset of the
stimulus. These peaks are labeled in Roman numerals from wave I to wave VII. These
peaks are the response of the various neurons present in the brainstem auditory pathway
to the stimuli. The neurons responsible for the peaks are these [1]:
a. VIIIth nerve for wave I,
b. lateral lemniscus for wave II,
c. superior olivary complex for wave III,
d. lateral lemniscus for wave IV and
e. inferior colliculus for wave V.
The most prominent peaks are waves I, III and V. The latencies of these peaks for
normal persons are generally [1]:
a. 1.6 ms for wave I,
b. 3.7 ms for wave III and
c. 5.4 ms for wave V.
Figure 2.1 shows the ascending auditory pathway and the neural generators
responsible for the various peaks. The figure shows the cochlear nerve fibers and the
neurons responsible for the generation of the peaks [4].
Figure 2.1 Neural generators of the ABR in humans (reproduced from "The Auditory Brainstem Response" edited by John T. Jacobson, P.27)
2.4 Instrumentation
An instrument that captures and stores auditory evoked potentials has to perform
four major functions: generating an acoustic stimulus, amplifying the electric potential,
averaging the signal and storing the signal. These are explained in detail
below.
2.4.1 Generation of an acoustic stimulus and recording
The stimulus most often used in ABR recording is an acoustic stimulus, which is
a click with a time duration of 100 microseconds. The intensity of the stimulus is measured
in decibels and ranges from 80 dB to 20 dB.
The ABR is recorded by attaching electrodes to the surface of the patient's scalp.
The most common method of clinical ABR recording uses four electrodes: one at the
vertex (referred to as G1), two electrodes at the inner surface of both earlobes or over
the mastoids (G2), and the ground electrode at the forehead. The electrode arrangement is
shown in Figure 2.2. In the International 10-20 system, which is most commonly used,
the G1 electrode position is called Cz, while the G2 positions are called A1 and A2,
respectively, for the left and right ears.
Figure 2.2 Electrode placement for ABR recording (reproduced from "Auditory Evoked Potentials" by Lind J. Hood and Charles I. Berlin, P. 12)
Electrodes are placed after cleaning the skin and then placing conductive
electrode paste or gel on the skin and electrode. The electrode is securely attached with
tape. The impedance of the electrodes is measured to determine whether it is within the
acceptable limits.
2.4.2 Amplification
The amplifier used in measuring auditory evoked potential is a differential
amplifier. It amplifies only the difference voltage of the electrodes and cancels the
common voltage. This helps in reducing the noise from, for example, the building power
lines, which is common to both the electrodes. Evoked potentials have their amplitudes
in the order of 0.1-0.5 microvolt, so the gain of the amplifier is on the order of 100,000.
This increases the amplitude to a range where it can be better identified and analyzed [4].
2.4.3 Signal averaging
Before averaging, the signal is passed through an analog-to-digital converter. A
signal averager is then used to increase the signal-to-noise ratio (SNR). The signal is
assumed to be a sum of the ABR and noise. Let x(j) be the EEG signal, a(j) the ABR signal
and n(j) the noise, where j varies from 1 to k (the total number of sample points).
The EEG signal can be written mathematically as
x(j) = a(j) + n(j) 2.1
and represents one recorded waveform. If m records are collected, a matrix of k columns
and m rows can be formed. Each sampled point can be represented as:
x(i,j) = a(i,j) + n(i,j) 2.2
where i goes from 1 to m. Since the ABR signal is identical in each of these records, the
equation can be rewritten as:
x(i,j) = a(j) + n(i,j) 2.3
The noise is assumed to be random, so it will differ in each record, and hence the index i
cannot be removed from the noise part. The SNR can be taken as the ratio of the RMS
energy of the ABR signal to the RMS energy of the noise. The RMS of the ABR signal is a(j),
since for a particular value of time the ABR signal is constant. The noise is assumed to have a
Gaussian distribution with mean value zero and variance σ². The RMS value of n(i,j)
equals its standard deviation σ. So for a particular value of time the SNR of a record is
a(j)/σ. When the average of all the records is taken, the variance of the noise, which is m
independent samples from the same distribution, is σ²/m, and hence its RMS value is
σ/m^(1/2). So the overall SNR of the averaged ABR signal is m^(1/2)·a(j)/σ, which is m^(1/2) times the
SNR of a single record. Finally, the resultant waveform is stored in a digitized format in a
computer. The various functions performed by the instrument are shown in Figure 2.3.
2.5 Interpreting ABR waveforms
Some well-defined peaks characterize an ABR waveform. Parameters associated
with these peaks which help in the diagnosis of hearing disorders are (i) latency (in ms), (ii)
latency difference between primary peaks (interpeak latency), (iii) peak amplitude
(in microvolts), (iv) the I-V amplitude ratio, and (v) waveform morphology. The values of these
parameters depend on the age as well as the sex of the subject. Table 2.2 shows values
of these parameters for adult females [1].
[Block diagram: a stimulus section (stimulus generator, calibrated attenuator, earphones) drives the subject; a recording and amplification section (scalp electrodes, amplifiers and filters, A/D converter) feeds an averaging section (averager), followed by a storing section (D/A converter, computer, plotter).]
Figure 2.3 Block diagram of ABR recording
Table 2.2 Normal values for adult females [1]

Parameter | Normal value
Presence of waveform components | Usually waves I, III, V at slow rates
Expected latency at 75 dB: wave I | 1.6 ms +/- 0.2 ms
Expected latency at 75 dB: wave III | 3.7 ms +/- 0.2 ms
Expected latency at 75 dB: wave V | 5.6 ms +/- 0.2 ms
Interwave latency interval I-III | 2.0 ms +/- 0.4 ms
Interwave latency interval III-V | 1.8 ms +/- 0.4 ms
Interwave latency interval I-V | 3.8 ms +/- 0.4 ms
Wave V latency-intensity function between 50 and 70 dB | 0.3 ms/10 dB
Wave V/I amplitude ratio | Greater than 1.0
2.6 Absolute latency
The absolute latency is the time difference between the onset of the acoustic
stimulus and the time of occurrence of the peak of the averaged response. Figure 2.4
shows the concept of measuring the absolute latency and the interpeak latency [4]. A
peak is considered to be that part of the wave where the downward slope starts. Most of the
time the waveforms have well-defined peaks, but sometimes two peaks merge with each
other.
The absolute latency of wave V has received the most widespread clinical
attention in the differential diagnosis of otoneurological disorders as well as in estimating
hearing sensitivity. The importance of the wave V peak is due to its robust character and
reliability under varying measurement conditions, and due to its predictability with
decreasing stimulus intensity. An increase in latency with a decrease in stimulus
intensity is common to all neural systems; that is, neural firing becomes less frequent as
the magnitude of the stimulus decreases [1].
Figure 2.4 shows a normal ABR waveform. The peaks are marked by Roman
numerals. Labeled in the figure are the absolute wave V latency, the wave V amplitude, and the
interpeak latencies of I-III, I-V and III-V. The diagram has been labeled following the
Jewett scheme for peak labeling [4].
Figure 2.4 Normal ABR waveform of an adult male. (Reproduced from "The Auditory Brainstem Response" edited by John T. Jacobson, P.66)
2.7 Interwave latency intervals
The time difference between different peaks of the ABR waveform is known
as the interwave latency interval. The typical value of the interwave latency interval for
waves I-III and III-V is approximately 2 ms, and for the wave I-V interval it is about 4.0 ms.
Interwave latency intervals provide information regarding the synchrony and integrity of
the brainstem pathways from one nerve to the other nuclei, since each peak is the response
of a group of neurons.
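The interwave intervals are simple differences of the absolute latencies. A small Python sketch using the normal adult latencies quoted in Section 2.3 (illustrative values only, not patient data):

```python
# Absolute latencies (ms) for a normal adult, from Section 2.3
latency = {"I": 1.6, "III": 3.7, "V": 5.4}

def interwave(a, b, lat=latency):
    """Interwave latency interval between wave a and wave b, in ms."""
    return round(lat[b] - lat[a], 2)

print(interwave("I", "III"))  # about 2 ms
print(interwave("III", "V"))  # about 2 ms
print(interwave("I", "V"))    # about 4 ms
```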
2.8 Latency curve
The latency of the various waveforms increases as the intensity is reduced. The
change in latency is slow for stimulus intensities from 90 dB to 60 dB but is rapid
between 60 dB and 25 dB. A function can be plotted using the wave V latency for
different values of intensity; such a graph is called a latency-intensity function. Figure 2.5
shows a typical latency-intensity plot. The solid line is the latency curve of wave V for
normal-hearing subjects. The dotted line represents the limits of +/- 3 standard deviations.
[Plot: wave V latency (5.0-8.5 ms) against stimulus intensity, 20-90 dB HL (56-126 dB SPL).]
Figure 2.5 Latency curve (reproduced from "Auditory Evoked Potentials" by Lind J. Hood and Charles I. Berlin, pg 19)
2.9 Clinical application
The latency-intensity plot is used to identify cochlear hearing loss. The slope of
the latency curve of a person having cochlear loss will be steeper than the latency curve
of a normal person. A delay in the latency of all major peaks with interpeak latencies within
normal limits signifies conductive hearing loss. The shift in latency is due to the
reduction of the signal intensity which arrives at the cochlea. This can also be caused by
human error, such as improper placement of earphones over the ear canals or shifts in
earphones due to the patient's movement. Figure 2.6 shows a comparison of latency curves
for normal hearing, cochlear loss and conductive hearing loss [1].
Prolongation of peak latencies, prolongation of interpeak latencies and waveform
deformation (e.g., absence of peaks) signify the presence of tumors in the VIIIth nerve
pathway. Tumors can also be indicated if the amplitude ratio of wave V/I is less than 0.5
[1].
[Plot: wave V latency-intensity curves for normal hearing, cochlear loss, and conductive or retrocochlear loss, 20-90 dB HL (56-126 dB SPL).]
Figure 2.6 Latency curve comparison (reproduced from "Auditory Evoked Potentials" by Lind J. Hood and Charles I. Berlin, P.21)
CHAPTER 3
WAVELET TRANSFORMATION
3.1 Mathematical transformation
A mathematical transform is often used to extract information from a signal which
is not observable in the raw signal. A "raw signal" is one which has not been processed.
The signal obtained by applying any of the available mathematical transformations is a
processed signal. Some commonly used mathematical transforms are the Fourier transform,
the Hilbert transform, and the wavelet transform.
3.2 Fourier transform
The Fourier transform is used to extract the frequency information of a time-based
signal. The mathematical representation of the Fourier transform is:

X(ω) = ∫_{−∞}^{∞} x(t) exp(−jωt) dt 3.1

x(t) = (1/2π) ∫_{−∞}^{∞} X(ω) exp(jωt) dω. 3.2

In the above equations, t represents time, ω represents frequency in radians per second,
x denotes the signal in the time domain and X denotes the signal in the frequency domain.
Equation (3.1) is called the Fourier transform of x(t), whereas equation (3.2) is called the
inverse Fourier transform of X(ω), which recovers x(t).
A Fourier transform gives information about the frequencies present in the signal. It
does not give any temporal resolution.
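The lack of temporal resolution can be made concrete: two synthetic signals containing the same pair of tones, one with both tones present throughout and one with them played in sequence, yield magnitude spectra that peak at the same frequencies. A minimal Python sketch (illustrative signals, not thesis data; the thesis's own analysis was in Matlab):

```python
import numpy as np

fs = 1000                                  # sampling rate in Hz
t = np.arange(0, 1.0, 1 / fs)
half = len(t) // 2

# Both tones for the whole second vs. 10 Hz then 50 Hz in sequence
together = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
sequential = np.concatenate([np.sin(2 * np.pi * 10 * t[:half]),
                             np.sin(2 * np.pi * 50 * t[:half])])

freqs = np.fft.rfftfreq(len(t), 1 / fs)
spec_tog = np.abs(np.fft.rfft(together))
spec_seq = np.abs(np.fft.rfft(sequential))

# Both spectra peak at 10 Hz and 50 Hz; nothing in either spectrum
# says *when* each tone occurred.
peaks_tog = freqs[np.argsort(spec_tog)[-2:]]
peaks_seq = freqs[np.argsort(spec_seq)[-2:]]
print(sorted(float(p) for p in peaks_tog),
      sorted(float(p) for p in peaks_seq))
```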
3.3 Short-time Fourier transform
The short-time Fourier transform (STFT) is a modified version of the Fourier transform.
In the Fourier transform the entire signal is multiplied by the sine and cosine functions. In
the STFT, the signal is divided into small time windows and the Fourier transform is
applied to each windowed signal. The time precision of the signal depends on the length of
the window. If the length of the window is infinite, then the Fourier transform is obtained
and the time precision is minimal. The mathematical representation of the STFT is:

STFT_x(t′, ω) = ∫_{−∞}^{∞} [x(t) W*(t − t′)] exp(−jωt) dt. 3.3
where W(t) is a window function and * denotes its complex conjugate.
The STFT maps a time-domain signal into a two-dimensional function of time
and frequency. The fixed length of the window, which gives limited time
precision to the transformed signal, limits the use of the STFT.
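Equation 3.3 can be sketched directly: slide a window along the signal and take the Fourier transform of each windowed segment. A minimal Python sketch (hand-rolled for illustration; the window length and hop size are arbitrary choices, not values from the thesis):

```python
import numpy as np

def stft(x, fs, win_len=128, hop=64):
    """Magnitude STFT: window the signal, then FFT each segment."""
    w = np.hanning(win_len)                      # W(t - t') in eq. 3.3
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = x[start:start + win_len] * w       # x(t) W*(t - t')
        frames.append(np.abs(np.fft.rfft(seg)))
    freqs = np.fft.rfftfreq(win_len, 1 / fs)
    return np.array(frames), freqs

fs = 1000
t = np.arange(0, 1.0, 1 / fs)
x = np.concatenate([np.sin(2 * np.pi * 10 * t[:500]),   # 10 Hz first half
                    np.sin(2 * np.pi * 50 * t[:500])])  # 50 Hz second half

mag, freqs = stft(x, fs)
# Dominant frequencies of the first and last frames differ -- unlike the
# plain Fourier transform, the STFT localizes the tones in time.
print(freqs[np.argmax(mag[0])], freqs[np.argmax(mag[-1])])
```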
3.4 Wavelet transform
EEG signals usually consist of low-frequency components of long duration and
high-frequency components of short duration. Since high-frequency components occur for
short durations, they should be analyzed with a narrow window, while low-frequency
components call for a wide window. To overcome the disadvantages of the Fourier
transform and the short-time Fourier transform, the wavelet transform is used. The wavelet
transform utilizes wavelet functions, giving temporal information on the coefficients.
The mathematical representation of the wavelet transformation is:

Ψ_x(τ, s) = (1/√|s|) ∫_{−∞}^{∞} x(t) ψ*((t − τ)/s) dt 3.4

where Ψ_x(τ, s) is the wavelet transformation of the signal x(t) and ψ(t) is the transforming
function, called the wavelet.
A wavelet is similar to the sine and cosine functions of the Fourier transform. In
the Fourier transform, the sine and cosine waves have various frequencies. Similarly, in
the wavelet transform, the wavelets have dilated and shifted versions. The difference
between the sine or cosine function of the Fourier transform and the wavelet of the wavelet
transformation is that the sine and cosine functions are of infinite duration whereas the
wavelet is of finite duration. This property of the wavelet gives it its time resolution
property.
3.5 Scale
A wavelet transform maps a time-based signal into functions of scale and time.
The scale is similar to the frequency of the signal and is inversely related to frequency:

scale = 1/frequency. 3.5

A scale basically compresses or dilates the wavelet and hence controls the frequency
component. Large scales correspond to dilated or stretched-out signals and small scales
correspond to compressed signals. In terms of mathematics, if f(t) is a signal, then f(st) is
a compressed signal if s > 1 and a stretched-out signal if s < 1 [5].
For example, let f(st) be cos(s·(π/4)·t); if s assumes the values 1, 2 and 1/2, then the
resultant waveforms are shown in Figures 3.1-3.3.
Figure 3.1 A cosine wave for s = 1
Figure 3.2 A cosine wave for s > 1
Figure 3.3 A cosine wave for s < 1
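The compression and dilation shown in Figures 3.1-3.3 can be reproduced numerically by counting the oscillations of f(st) = cos(s·(π/4)·t) for the three scale values. A minimal Python sketch (illustrative only):

```python
import numpy as np

def zero_crossings(s, t_end=100.0, n=10000):
    """Count sign changes of f(st) = cos(s * (pi/4) * t) over [0, t_end]."""
    t = np.linspace(0.0, t_end, n)
    f = np.cos(s * (np.pi / 4) * t)
    return int(np.sum(np.signbit(f[1:]) != np.signbit(f[:-1])))

# s > 1 compresses the wave (more oscillations over the same interval),
# s < 1 dilates it (fewer oscillations)
for s in (1.0, 2.0, 0.5):
    print(s, zero_crossings(s))
```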
3.6 Time and frequency resolution
The biggest advantage of the wavelet transform over other transforms is its
varying resolution in time and frequency. The different resolutions of time and frequency
can best be explained by Figure 3.4. In the figure, the abscissa is time and the ordinate is
frequency. Each box in the diagram represents a value of the wavelet transform for a
different resolution of frequency and time. Although each box has a different width and
height, the area of each box is the same. This signifies that each value of the wavelet
transform covers an equal portion of the time-frequency plane but with a different resolution.
At low frequencies, the width is greater than the height, which means that the wavelet
transform gives good frequency resolution but poor time resolution at low frequencies;
similarly, at high frequencies the transform gives better time resolution but poor frequency
resolution [5]. In the diagram, it is seen that none of the boxes has zero area, which means
the value at a particular point in time and frequency cannot be known.
Figure 3.4 Time and frequency diagram for wavelet transform
The time and frequency diagram of the STFT will consist of squares, since it has fixed
resolution in time and frequency. This is shown in Figure 3.5.
Figure 3.5 Time and frequency diagram of STFT
The time and frequency diagram for the Fourier transform will consist of
rectangles whose width covers the entire time scale, because no information about time is
present in a Fourier transform. This is shown in Figure 3.6.
Figure 3.6 Time and frequency diagram of Fourier Transform
3.7 Discrete wavelet transform
The wavelet transform is calculated using the continuous wavelet transform
formula given in equation 3.4. This formula can be used only for continuous-time signals.
For discrete-time signals, the discrete wavelet transform is used [5].
In the discrete wavelet transform, the signal is passed through various low-pass
and high-pass filters. The frequency resolution is controlled by the cut-off frequencies of
these filters. After passing the signal through the filters, the signal is upsampled or
downsampled to control its time resolution. Downsampling is removing
some of the samples from the signal, which decreases the sampling rate. For
example, to downsample a signal by 2, every other sample point of the signal is
ignored. Upsampling is increasing the sampling rate by adding new sample points to
the signal by linear interpolation [5].
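Downsampling and upsampling as just described can be sketched in a few lines (illustrative Python helpers, not the thesis's Matlab code):

```python
import numpy as np

def downsample2(x):
    """Keep every other sample, halving the sampling rate."""
    return x[::2]

def upsample2(x):
    """Double the sampling rate, filling new points by linear interpolation."""
    n = len(x)
    old_t = np.arange(n)
    new_t = np.arange(0, n - 0.5, 0.5)   # 0, 0.5, 1, 1.5, ...
    return np.interp(new_t, old_t, x)

x = np.array([0.0, 2.0, 4.0, 6.0])
print(downsample2(x))       # [0. 4.]
print(upsample2(x))         # [0. 1. 2. 3. 4. 5. 6.]
```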
In the discrete wavelet transform, the signal is passed first through a low-pass filter
and a high-pass filter. Filtering a signal is mathematically convolving the impulse response
of the filter with the signal and can be expressed as:

y[n] = x[n] * h[n] = Σ_{k=−∞}^{∞} x[k] h[n − k] 3.6

where y[n] is the resultant signal, x[n] is the original signal and h[n] is the impulse
response of the filter.
After passing through the filters, the signal is subsampled to satisfy the Nyquist
criterion. Initially the signal spans from 0 to π radians per second. When the signal is
passed through a high-pass filter, the resultant signal has frequency components from
π/2 to π radians per second, and hence the frequency resolution has doubled. Similarly, the
signal resulting from the low-pass filter will have a frequency range from 0 to π/2 radians per
second. After passing the signal through the filters, it is downsampled by 2. This
downsampling reduces the time resolution of the signal by 2. This is the first level of
decomposition. The wavelet coefficients obtained for the signal passed through the high-pass
filter are called the detail coefficients, and for the low-pass filter they are called the
approximate coefficients. The low-pass-filtered signal is again passed through a high-pass
and a low-pass filter, resulting in a second-level decomposition, and is then subsampled. As
the decomposition level increases, the frequency resolution increases and the time
resolution decreases due to the decrease in sample points. This process is also known as
subband coding.
For example, suppose a signal x[n] has 512 points and is passed through the first-level
decomposition, with a high-pass filter with impulse response h[n] producing signal
yH1[n] and a low-pass filter with impulse response l[n] producing signal yL1[n],
where the subscript 1 denotes the first level of decomposition. The resultant signals are
subsampled by 2, leaving both yH1[n] and yL1[n] with 256 points each and thus halving the
resolution in time. The frequency range of yH1[n] is π/2 to π radians per second and that of
yL1[n] is 0 to π/2 radians per second. The low-pass signal is again passed through the high-pass
and low-pass filters, yielding yH2[n] and yL2[n], each having 128 sample points, with
frequency bands spanning π/4 to π/2 radians per second and 0 to π/4 radians per second.
The signal in the second decomposition level has half the time resolution and twice the
frequency resolution of the first-level decomposition. As the level of decomposition
increases, the time resolution decreases and the frequency resolution increases by a factor
of 2. The decomposition of the low-pass-filtered signal continues until the
number of sample points is 2. For this example the number of decomposition levels will be 8 [5].
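The worked example above can be sketched with the Haar filter pair, the simplest choice of l[n] and h[n] (the thesis itself uses Daubechies wavelets, so this Python version is illustrative only):

```python
import numpy as np

# Haar analysis filters: l[n] low-pass, h[n] high-pass
LO = np.array([1.0, 1.0]) / np.sqrt(2.0)
HI = np.array([1.0, -1.0]) / np.sqrt(2.0)

def dwt_level(x):
    """One decomposition level: filter (eq. 3.6), then downsample by 2."""
    approx = np.convolve(x, LO)[1::2]   # approximate coefficients
    detail = np.convolve(x, HI)[1::2]   # detail coefficients
    return approx, detail

rng = np.random.default_rng(1)
x = rng.normal(size=512)

approx, details = x, []
while len(approx) > 2:                  # keep decomposing the low-pass branch
    approx, d = dwt_level(approx)
    details.append(d)

# 512 -> 256 -> ... -> 2 points: 8 levels, as in the worked example
print(len(details), len(approx))
```

Because the Haar pair is orthogonal, the total energy of the coefficients equals the energy of the original 512-point signal.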
Figure 3.7 shows the decomposition of the signal. In the figure,
h[n] is a high-pass filter, whereas l[n] is a low-pass filter. These two filters filter the
signal, which is then downsampled by 2. The notation f = 0 - π shows that the filtered
signal's frequency span is from 0 to π radians per second.
[Diagram: at each decomposition level the signal x[n] passes through h[n] (f = π/2 - π, then π/4 - π/2, then π/8 - π/4) and l[n] (f = 0 - π/2, then 0 - π/4, then 0 - π/8), with downsampling by 2 after each filter; the low-pass branch feeds the next level.]
Figure 3.7 Different levels of wavelet decomposition of signal
CHAPTER 4
PROCEDURE
4.1 Introduction of data set
The data were obtained from the Communication Disorders department of the Texas
Tech Health Sciences Center. There were 10 patients, of whom 1 was a normal case and 9 were
abnormal cases. Each patient was considered a dataset, with a different number of records.
Each record has 256 points with different time spans.
4.2 Data collection
The subjects were seated in a recliner chair in a relaxed position or asleep.
Before applying the electrodes, the ear canals were checked with an otoscope
to see that they were clean. Electrodes were applied to the Cz and both mastoids. In
accordance with the 10-20 International System for scalp mapping, these sites correspond to
the abbreviations Fz, A2 (for the right earlobe) and A1 (for the left earlobe).
The EEG data were stored in a host computer. The files were stored with an
extension of *.ep. The files were accessed by a DOS batch file, epcfg.bat. The signal
values were then extracted into ASCII format with the file EP2ASC.BAT. The ASCII
values were saved into Excel files. The entire analysis of the dataset was done in Matlab,
which was used to extract the data from the Excel files and do the processing.
Figure 4.1 shows the 10-20 International System for electrode positioning.
[Diagram: electrode sites on the scalp, from NASION (front) to INION (back), left and right sides marked.]
Figure 4.1 10-20 International system for electrode placement [4].
Figure 4.2 shows the details of the tests performed. The patient is identified as
"Subject 3". Each record is shown as either gray or white. The two items on the first row
of each record are the date and the time of the test. The first column
shows the number of the record as well as the channel. The second column shows the
name of the test performed; AEP is auditory evoked potential. The third column is the
total time span of the record. The next column shows the ear to which the stimulus is
applied. The next column is the type of stimulus used, in this case a toneburst. The
next column is the stimulus frequency, followed by the intensity of the stimulus and the
stimulus rate. The last two columns show the number of readings taken for a record and
the number of artifacts (times when the signal went to the instrument limits).
[Screen capture: ten AEP records for Subject 3, each with a 15.10 ms epoch, right-ear toneburst stimuli at 1 kHz, intensities from 80 dB down to 20 dB, stimulus rate 11.3/s, 1200 stimuli per record, and per-record artifact counts.]
Figure 4.2 Test details
4.3 Preprocessing
The data were normalized in terms of power, so that each record has the same power.
The formula used for normalization was:

x_norm_i = x_i / sqrt(x_1^2 + x_2^2 + ... + x_n^2),

where x_i is the original value of each sample, n is 256, the total number of sample
points, and x_norm_i is the normalized value.
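A minimal sketch of this normalization step (the thesis performed the equivalent computation in MATLAB; this NumPy version is an illustrative assumption):

```python
import numpy as np

def normalize_power(x):
    """Divide each sample by the square root of the record's total
    power so that every normalized record has unit energy."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(np.sum(x ** 2))

record = np.array([1.0, -2.0, 2.0, 4.0])   # toy 4-sample record
normalized = normalize_power(record)
```

After normalization the sum of squared samples of every record equals 1, so records collected at different gains can be compared directly.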
The normalized data were used to extract various features from the signal, which
were then used to identify the wave V peak. Two approaches were used to identify the
wave V peak: the slope method and the wavelet transform.
4.4 Slope method
The aim was to pick a peak, and slope, as a characteristic associated with a peak,
can be used to identify it. Most past work [2, 3] on automatic detection of wave V
was done on normal waveforms, but the data used here were either highly affected by
noise or came from patients with abnormal responses. A typical waveform is shown in
Figure 4.3.
Figure 4.3 A typical waveform ("ja", record 2, original signal)
The first step was filtering the data to eliminate noise from the ABR signal.
The power spectrum was drawn for all the signals, and the cutoff frequency of
the signal was determined to be around 1750 Hz. While recording values from patients,
a filter with a range of 0 - 3000 Hz had been used, and the signals contained 60 Hz
power line noise. Hence a band pass filter with a range of 70 - 1750 Hz was used.
The responses of various filters were observed, and the Chebyshev Type II filter
produced the least distorted filtered waveform. The frequency range of the filter
used was consistent with the frequency range given by [3] for the various peaks.
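A sketch of such a band pass filter using SciPy. The thesis does not state the filter order, stopband attenuation, or sampling rate; the values below, including a sampling rate derived from 256 samples per 15.1 ms epoch, are illustrative assumptions:

```python
import numpy as np
from scipy.signal import cheby2, filtfilt

fs = 256 / 0.0151                    # assumed sampling rate (~17 kHz)

# Chebyshev Type II band pass with stopband edges at 70 and 1750 Hz;
# order 4 and 40 dB stopband attenuation are assumed values.
b, a = cheby2(4, 40, [70, 1750], btype="bandpass", fs=fs)

t = np.arange(256) / fs
record = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
filtered = filtfilt(b, a, record)    # zero-phase: peak latencies preserved
```

Zero-phase filtering (filtfilt) is used here because an ordinary causal filter would shift the peak latencies that the later analysis depends on.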
After filtering the signals, the local maxima and minima were identified and the
slopes associated with the maxima were calculated. To calculate a slope, two values
were required on the amplitude scale: the local maximum and the local minimum.
The minimum in this case was taken as the point where the waveform shape changes,
not the immediate dip after the peak. For example, Figure 4.4 shows the original
waveform on top and the filtered waveform at bottom, with the maxima and minima
marked. The slope was calculated for peak A of the waveform.
Figure 4.4 Slope method
The formula for the slope is

Slope = (P_A - D_C) / (T_C - T_A)

where P_A is the amplitude of peak A, D_C is the amplitude of dip C, T_C is the time of
occurrence of C, and T_A is the time of occurrence of A.
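A sketch of this slope computation in Python (an assumption of the method; for brevity the dip C is taken as the first local minimum after the peak, whereas the text uses the point where the waveform shape changes):

```python
import numpy as np
from scipy.signal import argrelextrema

def peak_slopes(x, t):
    """Return {peak index: slope}, where slope = (P_A - D_C)/(T_C - T_A)
    for each local maximum A and the next local minimum C."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    slopes = {}
    for a_idx in maxima:
        later = minima[minima > a_idx]
        if later.size:
            c_idx = later[0]            # first dip after the peak
            slopes[int(a_idx)] = (x[a_idx] - x[c_idx]) / (t[c_idx] - t[a_idx])
    return slopes
```

For a waveform such as [0, 2, 1, 0, 3, 1, 0] sampled at unit intervals, the peak at index 1 and the dip at index 3 give a slope of (2 - 0)/(3 - 1) = 1.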
4.5 Wavelet transform
The normalized signal was decomposed into 5 levels using the MATLAB
function "wavedec". Only the low pass filtered signal was repeatedly decomposed by
the filter. The mother wavelet used for the decomposition was Daubechies 5 (db5),
the most widely used mother wavelet for EEG signals. The shapes of various Daubechies
wavelets are shown below in Figure 4.5. After each level of decomposition the number of
points in the signal decreased by a factor of 2, so the first level had 128
points, the second level had 64 points, and so on. The signal at a particular
level was then reconstructed using the command "waverec". These reconstructed
waveforms were used to extract the various features:
i. slope of the peaks (the difference in the amplitude of the wave peak and the next
possible minimum; the downward slope is taken),
ii. interpeak latencies (the time differences between all peaks),
iii. amplitude ratio (the ratio of the amplitudes of the various peaks), and
iv. peak latencies (the times of occurrence of the peaks).
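The repeated low-pass-and-downsample idea behind "wavedec" can be sketched as follows. This is only an illustration: it uses the two-tap Haar low pass filter instead of the ten-tap db5 filter the thesis actually used, but it shows the halving of the number of points at each level described above.

```python
import numpy as np

LOWPASS = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar low-pass filter taps

def approximations(x, levels):
    """Return the approximation signal at each level: convolve with the
    low-pass taps, then keep every other sample (downsample by 2)."""
    a = np.asarray(x, dtype=float)
    out = []
    for _ in range(levels):
        a = np.convolve(a, LOWPASS, mode="full")[1::2]
        out.append(a)
    return out
```

For a 256-point record this yields 128, 64, 32, 16 and 8 points at levels 1 through 5, matching the halving described in the text.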
[Figure 4.5 here: the shapes of the Daubechies wavelets db2 through db10.]
Figure 4.5 Daubechies wavelets
These features were extracted and tabulated in an Excel file. Peak latency was taken
as the feature to be used for classification. The algorithm followed to pick the wave V
peak was as follows:
1. The waveform was normalized so that the lowest point of the waveform is at
zero amplitude.
2. The mean value of all the available peaks was taken.
3. The peaks whose amplitude was greater than the mean value were selected as
the possible peaks.
4. The peaks from the short listed ones were seen in time frame in the following
order: 6-7 msec, 7-9 msec, 5-6 msec and 9-10 msec.
5. The first peak to fall in this range was selected as peak V; if there was no
peak in these time intervals, it was assumed that the wave V peak was absent.
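The five steps above can be sketched as follows (a simplified assumption: peak detection on the reconstructed waveform is taken to have already produced candidate peak times, in milliseconds, and amplitudes):

```python
import numpy as np

# Time windows searched for wave V, in priority order (step 4).
WINDOWS_MS = [(6, 7), (7, 9), (5, 6), (9, 10)]

def pick_wave_v(peak_times_ms, peak_amps):
    times = np.asarray(peak_times_ms, dtype=float)
    amps = np.asarray(peak_amps, dtype=float)
    amps = amps - amps.min()              # step 1: lowest point at zero
    mean_amp = amps.mean()                # step 2: mean of available peaks
    candidates = times[amps > mean_amp]   # step 3: above-mean peaks only
    for lo, hi in WINDOWS_MS:             # step 4: search windows in order
        hits = candidates[(candidates >= lo) & (candidates < hi)]
        if hits.size:
            return float(hits.min())      # step 5: first peak in the window
    return None                           # no peak found: wave V absent
```

For example, candidate peaks at 3.0, 6.5 and 8.0 ms with amplitudes 0.2, 1.0 and 0.9 give 6.5 ms, since that is the first above-mean peak in the highest-priority window.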
[Figure 4.6 here: "ja" record 1, the original signal (top) and the reconstruction at level a3 using db5 (bottom), with peak V marked on each; amplitude versus time in seconds.]
Figure 4.6 Reconstructed waveform.
4.6 Spectrogram approach
Before identifying wave V from the signal, an attempt was made to classify
the data set into normal and abnormal cases. The data set was known to have 1 normal
case and 9 abnormal cases. The STFT method was used to determine to what degree it
could corroborate the clinical results. The total length of the signal was divided into 41
windows using a Kaiser windowing function. Among the dataset, only "ja" in Figure 4.7
was normal. There was a visible difference between the normal and abnormal cases
in the frequency range of 2000 to 4000 Hz.
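A sketch of this STFT computation with SciPy. Assumptions: the sampling rate below, a 16-sample kaiser(16, 5) window matching the figure title, and an overlap of 10 samples, which makes a 256-point record yield exactly 41 windows:

```python
import numpy as np
from scipy.signal import spectrogram

fs = 256 / 0.0151                        # assumed sampling rate (~17 kHz)

rng = np.random.default_rng(0)
record = rng.standard_normal(256)        # stand-in for one ABR record

# hop = nperseg - noverlap = 6, so (256 - 16) / 6 + 1 = 41 windows
f, t, Sxx = spectrogram(record, fs=fs, window=("kaiser", 5),
                        nperseg=16, noverlap=10)

# Energy per window in the 2000-4000 Hz band, where the visible
# normal/abnormal difference was reported.
band = (f >= 2000) & (f <= 4000)
band_energy = Sxx[band].sum(axis=0)
```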
[Figure 4.7 here: spectrogram of "ja" record 1 using a kaiser(16,5) window; frequency 0-8000 Hz versus time.]
Figure 4.7 Spectrogram
CHAPTER 5
RESULT AND CONCLUSION
The previous chapters discussed the theory of ABR and the wavelet transform
and the various processing steps taken to identify the wave V peak. In this chapter we
discuss the results of those processing steps.
5.1 Slope method
Only about 30% of the peaks identified by the expert were selected by the slope
method. Slope alone cannot be used as a feature to select peak V.
5.2 Wavelet analysis
The wavelet transform, used to filter the signal and then locate the peak with
the algorithm described above, was much more successful than the slope method. Out of
10 patients, wave V identification accuracy within a limit of 0.3 ms was 80% for 7
patients and 60% for 3 patients. In terms of overall results, out of the 120 waveforms
used, 95 were identified correctly, giving an overall 78% classification rate. Figure 5.1
shows the reconstructed wave for "ag" record 10 at the 8th level of decomposition. In the
figure, "s" is the original signal, "a8" is the approximation coefficient at level 8, and
"d8", "d7", "d6", "d5", "d4", "d3", "d2" and "d1" are the detail coefficients at levels
8 to 1, respectively.
Figure 5.2 shows the latency curve for the clinically identified peak V and the
wavelet-identified peak V.
Figures 5.3 through 5.7 compare the wave V peak latency
determined by our clinical specialist and the peak V identified by the wavelet transform
for different intensities of stimulus. The X-axis shows the record number and the
Y-axis the wave V latency.
[Figure 5.2 here: wave V latency (ms) versus stimulus intensity (dB), comparing the clinical value and the wavelet value.]
Figure 5.2 Latency curve comparison
[Figure 5.3 here: wave V latency (ms) per record at 80 dB, wavelet identified versus clinically identified.]
Figure 5.3 Wave V latency for 80 dB
[Figure 5.4 here: wave V latency (ms) per record at 60 dB, wavelet identified versus clinically identified.]
Figure 5.4 Wave V latency for 60 dB
[Figure 5.5 here: wave V latency (ms) per record at 55 dB, wavelet identified versus clinically identified.]
Figure 5.5 Wave V latency for 55 dB
[Figure 5.6 here: wave V latency (ms) per record at 40 dB, wavelet identified versus clinically identified.]
Figure 5.6 Wave V latency for 40 dB
[Figure 5.7 here: wave V latency (ms) per record at 20 dB, wavelet identified versus clinically identified.]
Figure 5.7 Wave V latency for 20 dB
Table 5.1 compares the average values and standard deviations of wave V latency
between clinically identified peaks and wavelet-identified peaks.
Table 5.1 Average and standard deviation of wave V latency (ms)

Intensity (dB)                       80     60     55     40     20
Clinically identified: average       7.29   7.46   7.02   7.47   7.10
Clinically identified: std. dev.     0.97   0.70   0.55   0.81   0.70
Wavelet identified: average          7.36   7.92   6.92   7.32   6.91
Wavelet identified: std. dev.        0.76   0.87   0.50   0.91   0.79
The other features, such as the slope of the reconstructed waveform, the interpeak
latencies, and the amplitude ratio, yielded no results. The spectrogram approach showed
some promise, but the values obtained from the spectrogram could not be quantified;
hence the result was qualitative rather than quantitative.
5.3 Conclusion
The reconstructed wave of wavelet decomposition level 3 using the Daubechies 5
mother wavelet can be used as a feature to extract the wave V peak from the waveform.
Wilson [3] has shown that the wavelet used does not significantly affect peak
identification.
5.4 Future work
With one or more additional features, the automated identification accuracy of wave V
can be improved. Other wavelets can potentially be used to extract features for
identification, and the spectrogram approach could be used to classify the cases into
normal and abnormal.
REFERENCES
1. Linda J. Hood, Charles I. Berlin, Auditory Evoked Potentials, Austin, Pro-ed, 1986.
2. Habraken JBA, van Gils MJ, Cluitmans PJM. Identification of Peak V in Brainstem Auditory Evoked Potentials with Neural Networks. Computers in Biology and Medicine, 23 (5), pp. 369-380, 1993.
3. Wayne J. Wilson, Mark Winter, Gill Kert and Farzin Aghdasi, "Signal Processing of the Auditory Brainstem Response: Investigating into the use of Discrete Wavelet Analysis", Procs. South African IEEE Symposium on Communications and Signal Processing - COMSIG'98, University of Cape Town, Rondebosch, September 7-8 1998, pp 17-22.
4. John T. Jacobson, The Auditory Brainstem Response, College-Hill Press, Inc, San Diego, CA, 1985.
5. Wavelet tutorial http://engineering.rowan.edu/~polikar/WAVELETS/WTtutorial.html
6. Ernest J. Moore, Bases of Auditory Brain-Stem Evoked Responses, Grune & Stratton, New York, 1983.
7. Barbara Burke Hubbard, The World According to Wavelets, A K Peters, Wellesley, MA, 1998.
8. Agostino Abbate, Casimer M. DeCusatis, Pankaj K. Das, Wavelets and Subbands, Birkhauser, Boston, MA, 2002.
9. J. Robert Boston, "Automated Interpretation of Brainstem Auditory Evoked Potentials: A Prototype System", IEEE Transactions on Biomedical Engineering, vol. 36, pp. 528-532, May 1989
10. Wayne J. Wilson, Mark Winter, Carmel Nohr and Farzin Aghdasi, "Signal Processing of the Auditory Brainstem Response: Clinical effects of variations in Fast Fourier Transform Analysis", Procs. South African IEEE Symposium on Communications and Signal Processing - COMSIG'98, University of Cape Town, Rondebosch, September 7-8 1998, pp 23-28.
11. M.J. van Gils. Peak Identification in Auditory Evoked Potentials using Artificial Neural Networks, Ph.D. Thesis, Eindhoven University of Technology, The Netherlands, 1995.
12. Wilson, W.J., and Aghdasi, F., "Discrete Wavelet Analysis of the Auditory Brainstem Response: Effects of Stimulus Intensity and Subject Age Gender and Test Ear", Proceedings of the IEEE AFRICON'99 conference, Vol. 1, pp. 291-296, September 1999.
14. Hanrahan, H. E. (1990). Extraction of features in auditory brainstem response (ABR) signals. COMSIG 90, Proceedings of the third South African Conference on Communications and Signal Processing, IEEE catalog number 90TH0314-5/90, 61-66.
15. http://www.vanaga.es/fcps/filtered.htm
16. http://www.audiospeech.ubc.ca/haplab/aep.htm