Accuracy and Time Efficiency of Two ASSR Analysis Methods Using Clinical Test Protocols
DOI: 10.3766/jaaa.20.7.5
Kathy R. Vander Werff*
Abstract
Background: The number of commercially available evoked potential systems implementing multiple-frequency auditory steady-state response (ASSR) techniques has increased over the last several
years. The majority of data in the multiple-frequency ASSR literature have been obtained using time-domain averaging and Fast Fourier Transform (FFT) techniques with F-test statistical analysis. Another
commercially available analysis method has been introduced using an adaptive filtering algorithm called the Fourier Linear Combiner (FLC). No previous investigation has evaluated the performance of
the FLC method, nor compared the two techniques. In addition, there is a need for evaluation of clinical protocols for ASSR testing using these available commercial systems that balance time efficiency and
accuracy in estimating threshold.
Purpose: (1) To determine whether ASSR thresholds, the relationship between ASSR and behavioral
thresholds, and clinical test time are affected by the ASSR analysis method when comparing two commercially available systems for multiple-frequency ASSR. (2) To investigate the use of clinical
ASSR test protocols of varying recording length, and the effect on accuracy and time efficiency, using these two commercially available analysis methods.
Research Design and Study Sample: ASSR threshold searches were completed on a group of 20 normal-hearing and 20 hearing-impaired adult participants using two different analysis methods, FFT
and FLC, under separate, independent tests as well as under simultaneous recording conditions.
Data Collection and Analysis: Three experiments were completed: (1) independent assessment of ASSR thresholds using the FFT and FLC methods separately, (2) simultaneous recording of ASSR for
both the FFT and FLC methods, and (3) an automated threshold search protocol using the FLC method. Variables analyzed for Experiments 1 and 3 included ASSR thresholds, the difference between ASSR
and behavioral threshold, and total test time. For Experiment 2, the number of detected ASSRs per
method, the agreement between methods, and the time per detected ASSR were evaluated.
Results and Conclusions: ASSR thresholds and the relationship between ASSR and behavioral thresholds were found to be in line with those reported in the literature for multiple-frequency ASSR for both the
FLC and FFT methods. ASSR thresholds were found to be significantly higher for the FLC method for the low frequencies, but not for the high frequencies, when tested independently. Correlations between ASSR
and behavioral thresholds, however, were found to be the same across methods. Overall, it did not appear that either analysis method held an advantage in terms of accuracy or overall test time in independent
comparisons using the protocol implemented in the current study. The time benefits of an automated protocol were significant, although with compromised test accuracy. The results of this study suggest
critical clinical decision making is a necessary part of the ASSR protocol in order to decrease false positive and false negative responses and to increase overall efficiency.
Key Words: Analysis method, auditory evoked response, auditory steady-state response, hearing threshold, objective audiometry, recording time
Abbreviations: ASSR = auditory steady-state response; EEG = electroencephalographic; FFT = Fast
Fourier Transform; FLC = Fourier Linear Combiner; HI = hearing impaired/-ment; MASTER = Multiple Auditory Steady-State Response; NH = normal hearing
Kathy R. Vander Werff, Department of Communication Sciences and Disorders, 805 S. Crouse Ave, Room 200, Syracuse, NY 13244; Phone: 315-443-7403; Fax: 315-443-1113; E-mail: [email protected]
*Department of Communication Sciences and Disorders, Syracuse University, Syracuse, NY
This work was supported by the American Academy of Audiology Foundation (New Investigator Research Award) and the Marvin and Carol Schneller Fund, Syracuse University College of Arts and Sciences.
Portions of this research were presented at the American Auditory Society Annual Meeting, March 5–7, 2006, Scottsdale, AZ, and at AudiologyNOW! 2007, Denver, CO.
J Am Acad Audiol 20:433–452 (2009)
The auditory steady-state response (ASSR) has
transitioned from a research method into an
accepted diagnostic technique for evaluating
hearing in populations that cannot be tested using
behavioral methods. Many clinics have implemented
ASSR as part of the audiological test battery due to the
technique’s advantages as an objective measure,
including frequency specificity and automatic statisti-
cal detection of evoked responses.
The literature relating ASSR thresholds to behav-
ioral thresholds in adults and children has expanded
considerably in the past few years. However, there are
still reasons to hesitate in implementing ASSR as a
stand-alone clinical tool. There remain gaps in the
evidence base for ASSR threshold levels in infants with
hearing loss confirmed by behavioral data. There is a
lack of data regarding the use of bone-conduction
ASSR and identifying type of hearing loss with the
technique in hearing-impaired infants. In addition,
despite the possibility of testing multiple frequencies
simultaneously and objective response detection, ASSR
testing can still be time-consuming. ASSR is a small-
amplitude evoked potential, and the accuracy of
threshold estimation is known to be significantly
improved by longer recording times, due to lower
noise levels and higher SNR (John et al, 1998; John
and Picton, 2000; Luts and Wouters, 2005). Long
recording times may not be clinically feasible, and it
is important to evaluate whether an acceptable
balance between test time and accuracy can be
achieved. Despite these unresolved issues, there are
several commercial systems now available and in
clinical use. There are no accepted standards for
ASSR equipment, and each manufacturer is free to
implement its own test paradigm. The performance of
some of these methods and systems has not been
independently evaluated.
ASSRs are responses from the brain evoked by
continuous acoustic signals, most frequently sinusoi-
dally modulated carrier tones. These responses remain
stable in amplitude and phase over a long time period,
as neurons phase lock to the modulation rate of the
stimulus. ASSR can be recorded either to carrier tones
presented singly or to multiple tones presented
simultaneously. Both single- and multiple-frequency
ASSR techniques have advantages, and there are
commercial systems available utilizing both approach-
es. Previous research suggests that the accuracy of
single- and multiple-frequency techniques is similar
(Luts and Wouters, 2005), but that the relationship
between ASSR threshold and behavioral threshold
may be unique to the stimulation method (Luts and
Wouters, 2005; Vander Werff et al, 2008). That is,
clinicians would be advised to consider norms specific
to the type of stimulation method (single vs. multiple)
used by their device.
Within the category of multiple-frequency ASSR
techniques, manufacturers have implemented differ-
ent methods of recording and analyzing the response.
In general, multiple-frequency techniques estimate the
amplitude and/or the phase of each possible ASSR
component, corresponding to each of the modulation
frequencies used in the stimulus, in the ongoing
electroencephalographic (EEG) activity. If the estimate
of the component at a specific modulation frequency is
determined to be statistically different from random
background activity at a specified confidence level, an
ASSR is considered to be present for that stimulus
frequency. The way the estimates of the amplitude and
phase at each modulation frequency are determined,
the type of statistical test, and the time over which it is
applied vary across commercial systems.
The majority of multiple-frequency ASSR studies in
the literature have been conducted using the MASTER
(Multiple Auditory Steady-State Response) system
developed by John and Picton at the Rotman Research
Institute at the University of Toronto (John et al, 1998;
John and Picton, 2000). A version of the MASTER
system has been implemented commercially in the Bio-
logic Navigator Pro evoked potential system by Natus.
Similar analysis methods are incorporated in other
commercial devices such as the SmartEP system by
Intelligent Hearing Systems and the Audix by Neoro-
nic, SA. In these systems, incoming EEG data is
divided into sections or epochs of about 1 sec. In the
case of the MASTER system, 16 epochs are linked
together (after rejecting any epoch that exceeds
artifact rejection criteria) to form a sweep of data for
analysis. Fast Fourier Transform (FFT) calculations
are performed on each sweep of data to convert the
data to the frequency domain, and an F-ratio is used to
statistically evaluate the energy at each modulation
frequency compared to the energy in surrounding
frequency bins. That is, an estimate of the possible
ASSR signal is compared to an estimate of the
background noise at neighboring frequencies. Each
additional sweep of data is added to the prior sweeps in
the time domain, and the results are submitted to FFT
analysis and the F-test after each sweep. This method
of time-domain averaging serves to improve the signal-
to-noise ratio (SNR). When the F-ratio at a particular
modulation frequency becomes significant at the
specified level (typically p < 0.05), an ASSR is
considered to be present. It is up to the user to
determine the minimum and/or maximum number of
sweeps to include in the average before the ASSR
response is considered to be a stable, or real, response.
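The F-ratio detection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MASTER implementation: the function names, the number of neighboring bins (`n_neighbors`), and the synthetic record are all hypothetical choices made for the example. It relies on the fact that, when the noise has equal power across nearby bins, the ratio of the power in one bin to the mean power in N neighboring bins follows an F distribution with 2 and 2N degrees of freedom, whose upper-tail probability has the closed form (1 + 2F/d2)^(-d2/2).

```python
import cmath
import math
import random


def bin_power(samples, k):
    """Power of DFT bin k (k whole cycles per record), computed directly."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return abs(acc) ** 2


def f_ratio(samples, signal_bin, n_neighbors=10):
    """Return (F, p) for the signal bin against the mean power of nearby bins."""
    half = n_neighbors // 2
    neighbors = [signal_bin + d for d in range(-half, half + 1) if d != 0]
    noise = sum(bin_power(samples, k) for k in neighbors) / len(neighbors)
    f = bin_power(samples, signal_bin) / noise
    d2 = 2 * len(neighbors)  # denominator degrees of freedom
    p = (1.0 + 2.0 * f / d2) ** (-d2 / 2.0)  # exact upper tail of F(2, d2)
    return f, p


# Hypothetical 1 s record sampled at 1000 Hz: an 80 Hz component plus Gaussian noise.
random.seed(1)
record = [math.sin(2 * math.pi * 80 * i / 1000) + random.gauss(0.0, 1.0)
          for i in range(1000)]
f_value, p_value = f_ratio(record, signal_bin=80)
print(f"F = {f_value:.1f}, p = {p_value:.2g}")
```

In a time-domain averaging scheme, `record` would be the running average of all sweeps collected so far, and the test would be repeated after each new sweep is added.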
Other multiple-frequency ASSR systems have em-
ployed adaptive filtering algorithms, rather than time-
domain averaging and FFT analysis, to analyze the
EEG signal. One such example is a detection strategy
called RapidASSR, implemented commercially by GN
Journal of the American Academy of Audiology/Volume 20, Number 7, 2009
Otometrics in the Chartr EP Evoked Potential system.
This specific strategy implements an algorithm called
Fourier Linear Combiner (FLC) to adjust the ampli-
tude and phase of sine waves in its model to estimate
actual ASSR amplitude and phase. The model is adaptively
evaluated against the EEG signal using a circular T²
statistic once per second. As soon as the statistic
reaches a specified confidence level (typically 95%), the
test is automatically stopped and a “positive ASSR” is
reported. If the specified confidence is not reached
within the designated maximum search time, the
results are reported as a “negative ASSR.”
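As a rough illustration of the adaptive-filter idea, the sketch below implements a generic least-mean-squares (LMS) Fourier linear combiner for a single modulation frequency. It is a textbook-style sketch and should not be read as the RapidASSR algorithm: the function name, the step size `mu`, the sampling rate, and the synthetic test tone are assumptions chosen for the example.

```python
import math


def flc_estimate(samples, freq_hz, fs_hz, mu=0.01):
    """Track amplitude and phase of a sinusoid at freq_hz with an LMS update."""
    w_sin, w_cos = 0.0, 0.0  # adaptive weights on the sine and cosine references
    for i, x in enumerate(samples):
        ref_sin = math.sin(2 * math.pi * freq_hz * i / fs_hz)
        ref_cos = math.cos(2 * math.pi * freq_hz * i / fs_hz)
        error = x - (w_sin * ref_sin + w_cos * ref_cos)  # model mismatch
        w_sin += 2 * mu * error * ref_sin  # LMS weight update
        w_cos += 2 * mu * error * ref_cos
    amplitude = math.hypot(w_sin, w_cos)
    phase = math.atan2(w_cos, w_sin)  # model is amplitude * sin(wt + phase)
    return amplitude, phase


# Hypothetical noise-free tone: amplitude 0.5, phase 0.3 rad, 90 Hz, 1000 Hz sampling.
fs = 1000
tone = [0.5 * math.sin(2 * math.pi * 90 * i / fs + 0.3) for i in range(5000)]
amp, ph = flc_estimate(tone, freq_hz=90, fs_hz=fs)
print(f"amplitude ~ {amp:.3f}, phase ~ {ph:.3f} rad")
```

A smaller `mu` gives smoother but slower-converging estimates; with real EEG the weights would fluctuate around the response and a statistic would be needed to decide whether they differ from noise.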
One difference between the two techniques just
described is the influence of the clinical strategy
implemented by the audiologist for determining
whether a response is present or absent. For both
methods, the clinician can determine a maximum
recording time after which, if the statistical criterion
is not met, ASSR will be considered absent. With the
FFT method, it is possible to implement strategies
based on always collecting data for a preset period of
time or number of sweeps, and making the decision
based on the F-ratio at the end of this time period.
Alternatively, ASSR testing could proceed until a
statistically significant ASSR is detected by evaluating
the F-ratio after each sweep. While this method is
more practical in clinical settings with limited test
time, the application of multiple sequential statistical
comparisons increases the likelihood that the result
could reach significance by chance. These types of
errors could be compensated for by changing the
significance level (e.g., using a Bonferroni correction);
however, it has been shown that this type of
correction may lead to errors of not detecting real
ASSRs (Luts et al, 2007). Some combination of these
strategies may be the best solution in terms of test time
and accuracy. For example, recording could be contin-
ued for a predetermined length of time or be stopped
when the response remains significant for several
sweeps in a row or when the noise floor falls under a
certain criterion level.
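The effect of repeated testing can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that each sweep-by-sweep test is independent; because successive tests on a running average share most of their data, the true inflation is expected to be smaller than these figures. The function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false detection across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


for sweeps in (1, 5, 15, 45):
    print(sweeps,
          round(familywise_error(0.05, sweeps), 3),
          round(bonferroni_alpha(0.05, sweeps), 5))
```

The corrected per-test level shrinks quickly as the number of looks grows, which is consistent with the concern that such corrections can cause real responses to be missed.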
The FLC method, by contrast, is continuously
updated in 1 sec increments and automatically stops
recording when statistical criteria are reached. The
circular T² statistic, which is geared toward repeatability,
is not subject to the same types of errors of
multiple comparisons. However, to this author’s
knowledge, there are no reports in the literature of
the accuracy of this technique in estimating thresh-
olds. This study, therefore, provides a first report in
the literature of the use of the FLC analysis technique
for recording ASSR. It directly compares the FLC and
FFT analysis methods as implemented commercially
under independent conditions to optimize each proto-
col as well as under simultaneous recording condi-
tions, controlling for subject state as an extraneous
variable. Finally, this study attempts to utilize
clinically relevant strategies of variable recording
times to evaluate the balance between the test
accuracy and completion of recording in a feasible
amount of time.
METHODS
Subjects
Two groups of adult subjects were recruited for this
study, one group with normal hearing (NH), and one
group with hearing impairment (HI) for at least one of
the frequencies tested. A total of 43 adults were
enrolled, and 20 NH subjects (16 female) and 20 HI
subjects (7 female) completed the study. One NH
subject and two HI subjects did not complete the study
due to excessive noise levels even after allowing a
considerable period to relax/sleep and/or were unwill-
ing to return for further testing. Both ears were tested
and included in the analysis for each subject, with the
exception of one HI subject with a unilateral conduc-
tive component that exceeded the maximum stimulus
levels for the first experiment in this study at all test
frequencies, in which case the ear with mixed loss was
excluded. For the purposes of this study, NH was
defined as thresholds of 25 dB HL (American National
Standards Institute, 2004) or better for all frequencies
from 250 to 4000 Hz, and HI was defined as thresholds
>25 dB HL at one or more frequencies.
Recruitment focused on individuals with a somewhat
limited range of hearing loss due to the fact that the
upper limit of stimulation under simultaneous record-
ing conditions was 60 dB HL. The degree of hearing
loss for the HI group varied from mild to severe,
although most subjects had primarily high-frequency
hearing loss. Mean thresholds for the HI group were
21, 29, 36, and 52 dB HL for 500, 1000, 2000, and
4000 Hz respectively. The age range for the NH
subjects was 19–74 years (mean 29.3 ± 13.4), and the
age range for the hearing-impaired subjects was 26–82
years (mean 70.7 ± 12.0). These groups therefore
represent different ages as well as differences in
hearing status. All 40 participants completed Experi-
ments 1 and 2, and 36 of the participants also
completed Experiment 3 (17 NH and 19 HI). The
protocol used in this study was approved by the
Institutional Review Board of Syracuse University,
and informed consent was obtained from all partici-
pants.
Stimulus Parameters
ASSR stimuli were amplitude- and frequency-mod-
ulated (100% AM, 20% FM) tones at four carrier
frequencies (500, 1000, 2000, and 4000 Hz) per ear,
ASSR Analysis Methods/Vander Werff
presented to both ears simultaneously. Modulation
frequencies were approximately 80, 85, 90, and 95 Hz
for the left ear and 78, 83, 87, and 92 Hz for the right
ear, adjusted slightly for an integer number of cycles
within each epoch. ASSR stimuli were calibrated
separately for each carrier frequency in dB HL using
a Bruel & Kjaer Type 2209 sound level meter in linear
mode with a Bruel & Kjaer DB0138 2 cc coupler. ANSI
S3.6-1996 (American National Standards Institute,
1996) corrections for Etymotic ER-3A earphones were
used to convert from dB SPL to dB HL.
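The adjustment of modulation frequencies to an integer number of cycles per epoch can be sketched as follows. This is an assumption-laden illustration: it uses the 853 msec epoch reported for the MASTER system in this study and simply rounds to the nearest whole number of cycles; the exact adjusted values used by either system are not given in the text, and `adjust_modulation` is a hypothetical helper.

```python
def adjust_modulation(nominal_hz, epoch_s):
    """Nearest modulation rate that fits a whole number of cycles in one epoch."""
    cycles = round(nominal_hz * epoch_s)
    return cycles / epoch_s


EPOCH_S = 0.853  # epoch length reported for the MASTER system
for nominal in (80, 85, 90, 95, 78, 83, 87, 92):
    print(nominal, round(adjust_modulation(nominal, EPOCH_S), 2))
```

Fitting a whole number of cycles into each epoch places the response energy in a single FFT bin once epochs are linked into a sweep, rather than spreading it across neighboring bins.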
For Experiment 1, stimuli were generated by each
individual evoked potential system, the MASTER for
FFT method and the RapidASSR system for the FLC
method. For Experiment 2, under simultaneous re-
cordings of FFT and FLC, stimuli were generated and
presented by the MASTER system. Experiment 3
involved only the FLC method in an automated
threshold search protocol, and stimuli were generated by
the RapidASSR system. Output levels were approxi-
mately 5 dB lower for the RapidASSR system com-
pared to the MASTER system; therefore, corrections
were made to adjust for differences between the output
levels of the two systems for the independent FLC
recordings. Frequency spectra for the stimuli generat-
ed by each system were comparable.
Instrumentation and Recording Parameters
Single-channel ASSRs were recorded using an active
electrode placed on the vertex (Cz), linked reference
electrodes on both mastoids (M1 and M2), and a
ground electrode on the low forehead (Fpz). All inter-electrode
impedance values were less than 5 kΩ and
were within 1.5 kΩ of each other. During simultaneous
recordings in Experiment 2, electrodes were linked via
jumper cables to the pre-amplifiers of both evoked
potential systems.
MASTER System (FFT method)
The Bio-logic MASTER system (research software v
2.02-R) implemented multiple ASSR using FFT anal-
ysis. EEG signals were amplified with a gain of 10,000
and filtered using a band pass of 30 to 150 Hz. The
basic unit of each stimulus was an epoch of 853 msec,
and a ‘‘sweep’’ of data in the average consisted of 16
epochs. Artifact rejection with a rejection level of 20 µV
was utilized to reject epochs with excessive myogenic
interference. The maximum number of sweeps recorded
per intensity level was 45 for stimulus levels up to
70 dB HL, but was limited to a maximum of 36 sweeps
at 80 dB HL. The maximum recording time for each
level was, therefore, up to 10.5 min.
The F-ratio of the energy at each modulation
frequency versus the energy in neighboring bands
was calculated after each sweep. Response detection
level was set at p < 0.05. The ongoing display indicated
the amplitude, phase, F-statistic, and noise floor level
for each carrier/modulation frequency combination
after each sweep. Averaging could be stopped manu-
ally at any time after 1 sweep or continued until the
full 45 sweeps were completed. The intensity level was
automatically decreased by 10 dB after the maximum
number of sweeps was completed or when recording
was halted manually.
A clinical protocol of variable recording lengths was
employed based on two main criteria, although a
minimum of 15 sweeps were always collected. Once
15 sweeps had been collected, the recording could be
stopped before the maximum number of sweeps if one of
the two criteria was met: (1) an ASSR remained
statistically significant for five sweeps in a row or (2)
no stable ASSR was present, and the average noise floor
was below 10 nV. One of these two criteria had to be
fulfilled for all eight frequencies (four in each ear);
otherwise the maximum number of sweeps was collected.
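The variable-length protocol above can be expressed as a small decision function. This is a sketch of one reading of the criteria, not software used in the study: the function and constant names are hypothetical, and "no stable ASSR" is interpreted here as "the response has not been significant for the last five sweeps", which is an assumption.

```python
MIN_SWEEPS = 15
MAX_SWEEPS = 45
CONSECUTIVE_SIGNIFICANT = 5
NOISE_CRITERION_NV = 10.0
ALPHA = 0.05


def frequency_done(p_values, noise_nv):
    """True if one stimulus frequency meets either stopping criterion."""
    recent = p_values[-CONSECUTIVE_SIGNIFICANT:]
    stable = (len(recent) == CONSECUTIVE_SIGNIFICANT
              and all(p < ALPHA for p in recent))
    quiet_and_absent = (not stable) and noise_nv < NOISE_CRITERION_NV
    return stable or quiet_and_absent


def should_stop(p_histories, noise_levels_nv):
    """p_histories holds one list of per-sweep p values per stimulus frequency."""
    sweeps = len(p_histories[0])
    if sweeps >= MAX_SWEEPS:
        return True
    if sweeps < MIN_SWEEPS:
        return False
    return all(frequency_done(p, n)
               for p, n in zip(p_histories, noise_levels_nv))


# Hypothetical case: seven frequencies stably significant, one absent with low noise.
significant = [0.01] * 15
absent = [0.50] * 15
print(should_stop([significant] * 7 + [absent], [20.0] * 7 + [8.0]))
```

Requiring the criteria to hold for all eight frequencies means a single noisy or borderline frequency keeps the recording running for every other frequency as well.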
RapidASSR System (FLC method)
The RapidASSR system (EP v 4.0, ASSR DLL v
1.1.05) implemented multiple ASSR using an adaptive
filter analysis method called FLC. EEG signals were
amplified with a gain of 100,000 and filtered from 10 to
105 Hz. The artifact reject level was not adjustable
with this system. The probability of a significant ASSR
was calculated by comparing the amplitude and phase
of model sine waves to the incoming EEG data
approximately once per second using a circular T²
statistic with the criterion level set at 95%. Once the
probability reached 95%, no further analysis was
conducted at that carrier/modulation frequency (re-
cording was stopped automatically and the ASSR
considered present), although analysis continued for
any other frequencies not yet significant. If the
criterion 95% probability level was not reached,
recording continued until the maximum time was
reached. The maximum recording time was adjustable
in minute increments and was set to 11 min to match
the MASTER system as closely as possible.
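One way to picture a circular T²-type test is to treat each successive estimate of the response as a complex number (amplitude and phase) and ask whether the mean of those estimates differs from zero relative to their scatter. The sketch below is an illustration under assumptions, since the paper does not describe how the RapidASSR system forms its estimates or its probability value: it assumes M independent estimates whose real and imaginary parts carry equal Gaussian noise, in which case M(M-1)|mean|²/Σ|z - mean|² follows an F distribution with 2 and 2M-2 degrees of freedom. All names and data are hypothetical.

```python
import cmath


def circular_t2_probability(estimates):
    """Confidence (1 - p) that the mean complex estimate differs from zero."""
    m = len(estimates)
    mean = sum(estimates) / m
    scatter = sum(abs(z - mean) ** 2 for z in estimates)
    if scatter == 0.0:
        return 1.0  # identical estimates: perfectly repeatable
    f = m * (m - 1) * abs(mean) ** 2 / scatter
    d2 = 2 * m - 2
    p = (1.0 + 2.0 * f / d2) ** (-d2 / 2.0)  # exact upper tail of F(2, d2)
    return 1.0 - p


# Hypothetical estimates: tightly clustered phases versus phases spread around the circle.
consistent = [cmath.rect(1.0, ph) for ph in (0.3, 0.4, 0.5, 0.6, 0.7)]
scattered = [cmath.rect(1.0, ph) for ph in (0.0, 1.3, 2.5, 3.8, 5.0)]
print(circular_t2_probability(consistent) >= 0.95)
print(circular_t2_probability(scattered) >= 0.95)
```

The statistic rewards repeatability: estimates that agree in amplitude and phase yield a high confidence value, while estimates of similar size but inconsistent phase do not.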
The information available on the screen during
recording included the response probability at each
carrier frequency updated in 1 sec intervals. Once 95%
probability was reached, the display indicated that a
present ASSR had been identified. If ASSRs were
identified at all frequencies, recording stopped for that
intensity level. The intensity level then automatically
decreased by 10 dB. Although the FLC method
eliminated most user control for determining recording
length (other than setting the maximum time), for the
purposes of this study an additional criterion was
developed during pilot testing. If more than 2 min
elapsed during which the response probability remained
lower than 50%, recording was manually halted and the
result was considered an absent ASSR. Again, criteria
had to be met for all frequencies where ASSRs had not
already been detected. The time (in seconds) taken to
identify each response was saved automatically in a log
file available for off-line analysis.
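The recording-length rules used with the FLC method at one intensity level can be sketched as below. The sketch rests on assumptions: response probability is sampled once per second, "more than 2 min" is read as more than 120 consecutive seconds below 50%, and the function and parameter names are hypothetical rather than taken from the RapidASSR software.

```python
def run_level(traces, max_s=660, detect=0.95, low=0.50, low_run_s=120):
    """Return (stop time in s, detection time per frequency or None).

    Each trace lists one frequency's response probability per second and must
    extend to max_s unless that frequency is detected earlier.
    """
    n = len(traces)
    detected_at = [None] * n
    low_run = [0] * n  # consecutive seconds spent below the low criterion
    for t in range(max_s):
        for i, trace in enumerate(traces):
            if detected_at[i] is not None:
                continue  # analysis already stopped for this frequency
            p = trace[t]
            if p >= detect:
                detected_at[i] = t + 1
            elif p < low:
                low_run[i] += 1
            else:
                low_run[i] = 0
        pending = [i for i in range(n) if detected_at[i] is None]
        if not pending:
            return t + 1, detected_at  # every frequency detected
        if all(low_run[i] > low_run_s for i in pending):
            return t + 1, detected_at  # manual halt: remaining ASSRs absent
    return max_s, detected_at  # maximum recording time reached


# Hypothetical traces: one response detected at 30 s, one that never exceeds 20%.
detected_trace = [0.6] * 29 + [0.96]
flat_trace = [0.2] * 660
print(run_level([detected_trace, flat_trace]))
```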
Procedures
Experiment 1: Independent Threshold Assessment
by Analysis Method
ASSR threshold searches were conducted for each
subject with each of the two systems independently in
separate recordings to evaluate performance under the
optimal conditions for each method. In all cases, a
descending threshold search was conducted using
10 dB steps. The lowest level tested was 10 dB HL.
For NH subjects and those with mild hearing loss, the
search protocol began at 60 dB HL. For most HI
subjects the search began at 80 dB HL. The two
systems differed on how stimuli over 60 dB HL were
presented. The MASTER system allowed presentation
of multiple stimuli up to 80 dB HL, while the
RapidASSR system limited presentation of multiple
stimuli to 60 dB HL. Therefore stimuli were presented
singly at 70 and 80 dB HL for the RapidASSR system.
If configured to present single stimuli, the MASTER
system defaulted to modulation frequencies below
70 Hz; therefore, the decision was made to collect data
using multiple stimuli for the MASTER and single
stimuli for the RapidASSR at 70 and 80 dB HL, rather
than change the modulation frequencies. The maxi-
mum intensity level tested for independent recordings
of both systems in Experiment 1 was 80 dB HL.
The criteria described above for each method were used
to determine the length of recording at each intensity
level, and the investigator descended to the next intensity
level as soon as any of these criteria were reached for all
frequencies. The RapidASSR system automatically
stopped presenting the stimulus for an individual carrier
frequency as soon as a response reached the 95%
confidence level, meaning that the number of stimuli
presented at one time changed over the course of the
recording. Each system was also set up to automatically
descend to the next intensity step when the maximum
time was reached. Testing continued in descending steps
until no ASSRs were obtained for two consecutive
intensity levels or to the lowest level of 10 dB HL.
Experiment 2: Simultaneous Comparison of ASSR
Detection by Analysis Method
Simultaneous recordings were conducted in order to
compare ASSR detection rates of the two systems
under exactly the same subject and noise conditions.
As in the independent recordings, a descending
intensity series was conducted from 60 to 10 dB HL
in 10 dB steps. Higher intensity levels were not tested
due to the differences in protocols for the two systems
at these levels. As a result, ASSR thresholds were not
obtained in the high frequencies for a number of HI
subjects.
Stimuli were presented via the MASTER system
signal generation and insert earphones. The Rapi-
dASSR recording system was calibrated by engineers
from GN Otometrics to incorporate exact stimulus
frequency and phase parameters of the MASTER
stimuli into the response detection model algorithm.
The two evoked potential systems were not directly
linked in any way other than by jumpers between the
pre-amplifiers. Stimulus presentation and recording
were started through the MASTER system, and immediately
following, recording was begun with the RapidASSR
system. Recording by the two systems was begun
as close to simultaneously as possible; however, precise
triggering between the two systems was not required due
to the nature of the FLC detection method.
Criteria used to determine recording length were the
same for simultaneous recording as described in
Experiment 1, although stimulus presentation contin-
ued until criteria were met for both systems. For the
MASTER system, this meant that recording continued
while “waiting” for the RapidASSR system to meet
criteria. These additional sweeps collected after crite-
ria were met for the FFT were not included in the
calculations. Due to the way the RapidASSR system
operates in automatically stopping recording when
responses are detected, if FLC criteria were met before
FFT criteria the RapidASSR system was paused to
prevent data collection at the next intensity level.
Experiment 3: Automated Threshold Search
Protocol Using the FLC Method
The RapidASSR system offers an automated threshold
search protocol called “Quick Search.” The Quick
Search protocol adjusts the stimulus level indepen-
dently for each carrier frequency once a present ASSR
is detected. The user defines the upper and lower limits
of the search, and the software algorithm is designed to
bracket threshold within this range, beginning with a
level in the middle of the selected range. If a present
ASSR is detected, the level will automatically decrease
for that frequency as much as 20 dB (unless this is
below the lower limit). If no ASSR is detected following
the maximum time, the presentation level is increased
by 5–10 dB. Because the level is adjusted for each
frequency separately, the intensity levels of each and
the number of stimulus frequencies being tested
vary over time. The automated search protocol was
implemented with an upper limit of 80 dB HL and a
lower limit of 10 dB HL, using a minimum 5 dB step
size and independent testing of each ear.
General Procedures
Experiments 1 and 2 were completed first. In order
to complete these two experiments, subjects were
tested under three conditions in random order: FFT
independently, FLC independently, and simultaneous
FFT and FLC. Experiment 3 was then completed if
time permitted, always after the first two experiments
were completed. Most subjects required two test
sessions to complete testing for all three experiments;
however, each individual test condition was completed
within a single session. Subjects were seated in a
comfortable, reclining chair in a double-walled sound-
treated booth with lights dimmed. They were asked to
relax with their eyes closed and try to sleep if possible.
No formal assessment was made of subject state,
although most subjects were able to sleep or doze
during the recording sessions. For all experiments,
testing continued until ASSR threshold was deter-
mined for all frequencies. ASSR thresholds were
defined as the lowest intensity level where a statisti-
cally significant ASSR was detected, provided that
suprathreshold ASSRs were also present within 20 dB
(unless this exceeded the highest level tested). If
ASSRs were absent for two consecutive intensity steps
above the lowest present ASSR, it was considered to be
a false positive response.
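The threshold definition above can be written as a short function. This is one interpretation offered as a sketch: a detection is accepted when a higher-level response is present within 20 dB, or when the 20 dB confirmation range runs past the highest level tested; otherwise it is treated as a false positive and the search moves up. The function name and the example intensity series are hypothetical.

```python
def assr_threshold(detections, confirm_range_db=20):
    """Return the accepted threshold level in dB HL, or None if no response.

    detections maps each tested level (dB HL) to True (ASSR present) or False.
    """
    levels = sorted(detections)
    highest = levels[-1]
    for level in levels:  # ascending, so the first accepted level is the lowest
        if not detections[level]:
            continue
        if level + confirm_range_db > highest:
            return level  # confirmation range runs past the highest level tested
        above = [lv for lv in levels if level < lv <= level + confirm_range_db]
        if any(detections[lv] for lv in above):
            return level  # confirmed by a suprathreshold response
        # Otherwise absent at the steps above: treat as a false positive.
    return None


# Hypothetical descending series with an isolated detection at 30 dB HL.
series = {60: True, 50: False, 40: False, 30: True, 20: False, 10: False}
print(assr_threshold(series))
```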
Analysis
ASSR thresholds were analyzed by frequency,
participant group, and test method (FLC or FFT) for
all three experiments. Difference scores between
behavioral and evoked potential thresholds were
calculated for each individual by subtracting behav-
ioral threshold from ASSR threshold. Means and
standard deviations of these values were determined
by frequency and subject group. Cases where threshold
could not be determined due to absent ASSR at the
upper limits of testing were not included in the
average. However, in order to compare test performance
using repeated measures analysis by test methods,
values 10 dB above the highest stimulus levels were
assigned to no-response conditions (i.e., 90 dB assigned
for no-response conditions in Experiment 1 and 70 dB
HL for Experiment 2). The time required to complete
the entire threshold search using each analysis method
was analyzed for independent recordings in Experiment
1 and Experiment 3. For Experiment 2, the number of
detected ASSRs per method, the agreement between
methods, and the time per detected ASSR were
evaluated. For repeated measures comparisons of time
per detected ASSR, no-response conditions were as-
signed a value of 46 sweeps (one greater than the
maximum number of sweeps collected). Statistical
analyses are described in the results for each experi-
ment. Due to the fact that much of the data was not
normally distributed, nonparametric tests were utilized
where required.
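The two treatments of no-response cases described above (exclusion from the averaged difference scores, substitution for the repeated measures comparisons) can be sketched as follows. The substitution values of 90 and 70 dB HL come from the text; the function names and the example thresholds are hypothetical and are not data from the study.

```python
from statistics import mean

# Values assigned to no-response conditions, by experiment number.
NO_RESPONSE_DB = {1: 90, 2: 70}


def difference_scores(assr, behavioral):
    """ASSR minus behavioral threshold, skipping cases with no ASSR threshold."""
    return [a - b for a, b in zip(assr, behavioral) if a is not None]


def impute_no_response(assr, experiment):
    """Replace missing ASSR thresholds for repeated measures comparisons."""
    fill = NO_RESPONSE_DB[experiment]
    return [fill if a is None else a for a in assr]


# Hypothetical thresholds in dB HL; None marks an absent ASSR at the upper limit.
assr_thresholds = [30, 40, None, 60]
behavioral_thresholds = [20, 25, 75, 50]
scores = difference_scores(assr_thresholds, behavioral_thresholds)
print(scores, round(mean(scores), 1))
print(impute_no_response(assr_thresholds, experiment=1))
```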
RESULTS
Experiment 1: Independent Threshold
Assessment by Analysis Method
ASSR versus Behavioral Thresholds
Mean ASSR thresholds for the FFT and FLC
methods as recorded independently are shown in
Table 1 for the NH and HI groups for each stimulus
frequency. The shaded rows in Table 1 provide the
mean difference scores, that is, the mean difference
amount that ASSR thresholds are raised over behav-
ioral thresholds for each group. Table 1 reveals a trend
for higher ASSR thresholds, and correspondingly
larger difference scores, for the FLC method over the
FFT method. Friedman Repeated Measures ANOVA
(analysis of variance) on Ranks showed that ASSR
thresholds were significantly higher for the FLC
method compared to the FFT method at 500 Hz (p = 0.009)
and 1000 Hz (p < 0.001), but there was no
significant difference by method for 2000 Hz (p = 0.133)
or 4000 Hz (p = 0.961). There was not a significant
difference in ASSR-behavioral threshold difference
scores for the NH compared to the HI subjects for
either the FLC or the FFT method at any frequency
when compared by Wilcoxon signed-rank test (p > 0.05
for all comparisons).
The top two panels of Figure 1 show the relation-
ships between ASSR and behavioral thresholds for the
FFT and FLC methods under independent recording
conditions in Experiment 1. Data for all four stimulus
frequencies are combined, and symbols may represent
multiple overlapping data points. The solid line in each
panel of Figure 1 represents equal ASSR and behavioral
thresholds. As expected, ASSR thresholds were gener-
ally higher than behavioral thresholds. Thresholds
falling above the solid line (ASSR threshold lower than
behavioral threshold) were considered to be false
positives for the purpose of this study. By these criteria,
there were eight false positives for the FFT method
(seven of these in the HI group) and only four false
positives for the FLC method (three in the HI group).
Although there were fewer false positives for the FLC
method, the largest false positive (20 dB) was obtained
with this method for an HI subject at 2000 Hz.
The dotted line in each panel of Figure 1 shows the
regression between ASSR and behavioral thresholds.
There was a strong overall correlation between ASSR
and behavioral thresholds for both methods under independent recording conditions, with Pearson correlation
coefficients for all frequencies combined of 0.83
and 0.81 for the FFT and FLC methods respectively.
Table 1 lists Pearson correlation coefficients between
ASSR and behavioral thresholds by individual carrier
frequency. Statistical comparison using 95% confidence
intervals around the r-values indicated that the correlation
at 500 Hz was significantly poorer than for all other frequencies for the FFT method (p < 0.001). The
correlation at 500 Hz for the FLC method was significantly
poorer than for 2000 and 4000 Hz (p < 0.001),
although not significantly different from the correlation
at 1000 Hz for the FLC (p = 0.075). Correlations were
not significantly different at any of the four frequencies
for FFT compared to FLC (p > 0.05), however.
An overall mean difference score was calculated for
each ear by averaging the difference scores at each of
the four stimulus frequencies in each ear. Figure 2
compares mean difference scores for the FFT and FLC
methods for the independent recording conditions in
Experiment 1. The solid line in this figure indicates
equal difference scores for the two methods, while the
dotted line indicates the regression fit to the data.
Average difference scores were significantly correlated
between analysis methods, with a Pearson correlation
coefficient of r = 0.74. Mean difference scores were not
significantly different between the two methods when
evaluated by paired t-test (p = 0.102).
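The difference-score calculation described above can be illustrated with a minimal Python sketch. The function names and the example thresholds are hypothetical and are not taken from the study data. Frequencies without a measurable ASSR threshold are simply skipped, which is an assumption about how missing values might be handled.

```python
from statistics import mean

FREQS_HZ = (500, 1000, 2000, 4000)


def difference_scores(assr_db_hl, behavioral_db_hl):
    """Return {frequency: ASSR threshold minus behavioral threshold} for one ear."""
    return {
        f: assr_db_hl[f] - behavioral_db_hl[f]
        for f in FREQS_HZ
        if f in assr_db_hl and f in behavioral_db_hl  # assumption: skip missing values
    }


def mean_difference_score(assr_db_hl, behavioral_db_hl):
    """Average the per-frequency difference scores for one ear."""
    return mean(difference_scores(assr_db_hl, behavioral_db_hl).values())


# Hypothetical thresholds for one ear (dB HL), for illustration only.
assr = {500: 40, 1000: 30, 2000: 30, 4000: 40}
behavioral = {500: 10, 1000: 10, 2000: 15, 4000: 30}

print(difference_scores(assr, behavioral))
print(mean_difference_score(assr, behavioral))
```

A positive difference score means the ASSR threshold lies above the behavioral threshold, which is the expected direction; a negative score corresponds to the false positives described earlier in this section.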
Total Test Time
Mean total test time for all subjects (NH and HI
subjects combined) required to complete threshold
testing for all frequencies in both ears was 46.1 ±
15.2 min for the FFT method and 43.6 ± 12.2 min for
the FLC method. Overall, Friedman Repeated Measures
ANOVA on Ranks showed that the total test
times were not significantly different between the FFT
and FLC methods in independent test sessions (p =
0.739). As can be seen in Figure 3, when broken down
by subject group, total test times were shorter for both
methods for the NH subjects compared to HI subjects.
The test time for NH subjects was slightly shorter on
average for the FFT method (38.0 ± 10.5 min) than for
the FLC method (39.2 ± 10.3 min). Test times for the
HI subjects were longer on average for the FFT
method (54.9 ± 13.5 min) than for the FLC method
(47.4 ± 12.9 min). Kruskal-Wallis one-way ANOVA on
ranks with Dunn’s method of post-hoc comparison
showed that it took longer to complete threshold
testing for the HI subjects using the FFT method than
for NH subjects using either the FFT or the FLC
method (p < 0.05). No other comparisons for total time
by group or method were statistically significant.
The results for Experiment 1, therefore, showed that
under independent test conditions ASSR thresholds
were significantly higher in the low frequencies for the
FLC method compared to the FFT method. However,
correlations between ASSR and behavioral thresholds
were similar for the two analysis methods, and the
average difference score across frequencies was well
correlated between the two methods. The total time
needed to complete testing for the entire audiogram
was longer for HI compared to NH subjects but was not
significantly different between methods.
Table 1. Mean ASSR Thresholds and Difference Scores, Experiment 1
Experiment 2: Simultaneous Comparison of
ASSR Detection by Analysis Method
ASSR versus Behavioral Thresholds
ASSR thresholds were also determined under simultaneous
recording conditions. Recall that the highest
stimulus level for the simultaneous condition was 60 dB
HL. Table 2 shows the mean ASSR thresholds and
difference scores for the simultaneous condition. When
compared under the same recording conditions, ASSR
thresholds were significantly higher for the FLC condition
at 500 Hz (p < 0.001), 1000 Hz (p = 0.013), and
2000 Hz (p < 0.001), but not at 4000 Hz (p = 0.670) by
Friedman Repeated Measures ANOVA on Ranks. The
significant difference at 2000 Hz, however, is likely due
to the smaller number of HI subjects with measurable
thresholds at this frequency. Because of the 60 dB HL
upper limit for the simultaneous condition, thresholds
were obtained in a limited number of HI subjects. Mean
ASSR thresholds, therefore, appear to be better for the
simultaneous than the independent condition (Table 1)
for each method, but only due to the number of subjects
not included in the simultaneous average. This is
particularly true at 4000 Hz, where only 10 and 11 ears
contributed to the average for the FFT and FLC methods
respectively, while for the independent recordings,
ASSR thresholds could be determined for 28 and 25 ears
respectively.
Figure 1. Scatterplots representing the relationship between individual ASSR thresholds in dB HL obtained for the FFT (left column) and FLC methods (right column) and behavioral thresholds in dB HL. Thresholds obtained for all four stimulus frequencies and both ears for all subjects are included, and symbols may represent multiple overlapping data points. The top two panels show results from independent recordings in Experiment 1, while the bottom two panels show data obtained under simultaneous recording conditions in Experiment 2. Data were not included for any individuals/frequencies where threshold exceeded the upper limits of stimulation (80 dB HL for Experiment 1 and 60 dB HL for Experiment 2). Solid diagonal lines in each panel represent equal ASSR and behavioral thresholds. Dotted lines show the regression of all data in each panel. Pearson correlation coefficients and the regression equations for each are indicated in the upper left of each panel.
The relationship between ASSR thresholds obtained
during simultaneous recording and the behavioral
thresholds for each individual is shown in the bottom
two panels of Figure 1. The correlation between ASSR
and behavioral thresholds was poorer for the simultaneous
recording conditions (r = 0.74 and 0.72 for FFT
and FLC, respectively) than for the independent
recordings (top panels), likely due to missing data
and restrictions placed on this recording method.
Under simultaneous recording conditions, correlations
were weakest for 500 Hz and strongest at 2000 Hz for
both the FFT and the FLC methods (Table 2). As with
the independent recordings, there were no significant
differences in the strength of the correlation for FFT
compared to FLC for any of the frequencies for the
simultaneous comparison (p > 0.05 for all comparisons).
Detected ASSRs by Analysis Method
While comparisons of ASSR thresholds between
methods are limited under the simultaneous recording
condition, this method allows for a direct comparison of
the ASSR detection rate by method under the same
subject and recording conditions. Each subject had the
possibility of 48 present ASSRs during the entire test
session if all intensity levels were tested and present
ASSRs were detected at all levels and frequencies (six
intensity levels × four frequencies × two ears).
Because not all subjects were tested at all levels, some
subjects had fewer possible ASSRs. Figure 4 shows the
number of detected (statistically significant) ASSRs for
each individual subject for both the FFT (filled
symbols) and FLC (open symbols) methods, in individ-
ual order by subjects from the lowest to highest number
of detected ASSRs within each subject group. The
number of ASSRs detected during the total test session
per subject ranged from 2 to 45. For almost all subjects,
the number of ASSRs detected by the FFT method was
higher than the number detected by the FLC method.
The inset of Figure 4 shows the average number of
detected ASSRs by group and analysis method.
Figure 2. Scatterplot of the relationship between the mean difference scores for each individual ear for the FFT and FLC methods. Difference scores were calculated by subtracting behavioral threshold in dB HL from ASSR threshold in dB HL. The mean difference score was obtained for each ear tested by averaging the difference score for 500, 1000, 2000, and 4000 Hz. The solid diagonal line represents equal mean difference scores for the two methods. The dotted line indicates the regression for all data in the figure, and the Pearson correlation coefficient is indicated in the upper left of the panel.
Figure 3. Box plots of the total time required to complete threshold testing for all frequencies in both ears by test method and subject group for Experiment 1. Outer limits of each box represent the 25th and 75th percentiles, with the median shown as the line within the box. Whiskers (error bars) indicate the 10th and 90th percentiles, with filled circles showing the 5th and 95th percentiles. Results for the NH subjects are shown in the left panel, and the HI subjects in the right panel.
Figure 5 shows the detection rate in terms of
percentage of present ASSRs out of the total possible
as a function of intensity level for each carrier
frequency. For the NH subjects (filled symbols), the
percentage of detected ASSRs increases steeply with
intensity level for both the FFT (circles) and FLC
(triangles) methods. The FLC method detected an
equal or slightly lower percentage of ASSRs than the
FFT method for most comparisons, although the
difference between methods was generally greatest at
the highest intensity levels (60 dB HL). For the HI
subjects, a smaller percentage of ASSRs was detected
as compared to the NH subjects, particularly at 2000
and 4000 Hz due to the fact that behavioral thresholds
were elevated in this group. The percentage of detected
ASSRs for the HI subjects tended to be higher
for the FFT method over the FLC method, although
this pattern was less consistent across intensities for
some frequencies (e.g., 1000 Hz).
Figure 4. The total number of present ASSRs (significant responses) detected per individual subject for the FFT (filled symbols) and the FLC (open symbols) methods for Experiment 2. Subjects with ASSRs at all frequencies in both ears at all stimulus levels would have a possible 48 present ASSRs. Subjects are sorted in order from fewest to most present ASSRs in each subject group. The inset of the figure shows the mean and standard deviation of the number of present ASSRs for the FFT (black bars) and FLC methods (gray bars) for the NH and HI subjects.
Table 2. Mean ASSR Thresholds and Difference Scores, Experiment 2
Despite these trends for a higher ASSR detection
rate for the FFT method over the FLC method, chi-square
tests revealed no significant difference in the
number of present ASSRs between the FFT and FLC
methods by subject group (χ² = 0.586, df = 1, p =
0.444), test frequency (χ² = 0.0572, df = 3, p = 0.996),
or intensity level (χ² = 0.911, df = 5, p = 0.969) for
these simultaneous comparisons.
Because behavioral thresholds varied among sub-
jects, the detection rate for the two methods was also
compared by examining results at comparable intensi-
ties in terms of equal sensation levels (dB SL). Due to
the fact that ASSR was collected in 10 dB steps, while
behavioral thresholds were in 5 dB steps, “equal”
sensation levels were considered to be within a range
of 5 dB. For the frequencies of 1000–4000 Hz, the
percentage of detected ASSRs was evaluated at either
15 or 20 dB SL for all subjects. Because ASSRs at
500 Hz tend to be further elevated above threshold
relative to the higher frequencies, detection rates were
evaluated for levels of 25 or 30 dB SL. When analyzed
at this low and equal SL, a slightly higher percentage
of ASSRs were detected by the FFT method (60, 65, 70,
and 60% for 500, 1000, 2000, and 4000 Hz respectively)
than the FLC method (60% at 500, 1000, and 2000 Hz;
50% at 4000 Hz) for the NH subjects. For HI subjects,
ASSRs were detected at a low SL in 70, 65, 72, and 36%
of cases for the FFT method compared to 68, 57, 59,
and 38% for the FLC method. Chi-square tests
revealed, however, that there were no significant
differences in detection rate by method at a low SL
for either the NH (χ² = 0.236, df = 3, p = 0.971) or the
HI subjects (χ² = 0.197, df = 3, p = 0.978), nor was
there a significant difference by subject group overall
(χ² = 0.005, df = 1, p = 0.941).
Figure 5. The percentage of detected ASSRs by stimulus frequency and intensity level across subjects for each analysis method. Filled symbols represent the NH subjects, while open symbols represent HI subjects for the FLC method (circles) and the FFT method (triangles). Percentages were calculated as the number of detected ASSRs out of the total number tested across subjects.
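The selection of a comparable low sensation level can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the tested levels of 10 to 60 dB HL in 10 dB steps are an assumption based on the 60 dB HL upper limit and the 10 dB step size mentioned in the text.

```python
def sensation_level(level_db_hl, behavioral_threshold_db_hl):
    """Sensation level (dB SL) of a stimulus relative to the behavioral threshold."""
    return level_db_hl - behavioral_threshold_db_hl


def low_sl_test_level(tested_levels_db_hl, behavioral_threshold_db_hl, freq_hz):
    """Return the tested level (dB HL) that falls at the target low SL, or None.

    Target SLs follow the text: 25 or 30 dB SL at 500 Hz,
    15 or 20 dB SL at 1000 to 4000 Hz.
    """
    targets = (25, 30) if freq_hz == 500 else (15, 20)
    for level in sorted(tested_levels_db_hl):
        if sensation_level(level, behavioral_threshold_db_hl) in targets:
            return level
    return None


# Assumed test levels: 10 to 60 dB HL in 10 dB steps.
LEVELS_DB_HL = range(10, 70, 10)

print(low_sl_test_level(LEVELS_DB_HL, 5, 1000))   # behavioral threshold 5 dB HL
print(low_sl_test_level(LEVELS_DB_HL, 10, 2000))  # behavioral threshold 10 dB HL
print(low_sl_test_level(LEVELS_DB_HL, 5, 500))    # 500 Hz uses the higher SL target
```

Because behavioral thresholds fall on a 5 dB grid and ASSR levels on a 10 dB grid, at most one tested level can match each pair of target SLs, which is why a 5 dB range was treated as "equal".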
Along with the overall number of present ASSRs, it
is important to know whether the two methods
obtained the same result (present or absent ASSR)
for a particular test. For example, for an individual
subject, when testing at 500 Hz in the left ear at
30 dB HL, is the same result obtained by the FFT and
FLC methods? The agreement between the FFT and
FLC methods by intensity level for each carrier/
modulation frequency was analyzed by calculating
whether (1) both methods detected a present, statisti-
cally significant ASSR, (2) both methods failed to
detect an ASSR, or (3) a present ASSR was found by
one method while no ASSR was detected by the other
method. Overall, the test outcome of present or absent
ASSR was in agreement between the two methods for
89% of tests. The rates of agreement and disagreement
can be seen in Figure 6 for the NH (left panels) and HI
(right panels) groups. For the NH subjects, the two
methods agreed in 91% of cases overall. Disagreements
between the methods were found in 7.5, 11.3, 7.9, and
6.7% of cases for 500, 1000, 2000, and 4000 Hz
respectively.
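The three-way classification of test outcomes described above can be sketched in a few lines of Python. The names and the example outcomes are hypothetical and serve only to illustrate the bookkeeping; they are not study data.

```python
from collections import Counter


def classify(fft_present, flc_present):
    """Classify one test (subject/ear/frequency/level) by method agreement."""
    if fft_present and flc_present:
        return "both_present"
    if not fft_present and not flc_present:
        return "both_absent"
    return "disagree"


def agreement_summary(results):
    """Return (counts per category, percent agreement) for (fft, flc) pairs."""
    counts = Counter(classify(fft, flc) for fft, flc in results)
    total = sum(counts.values())
    agree = counts["both_present"] + counts["both_absent"]
    return counts, 100.0 * agree / total


# Hypothetical outcomes for four tests: (FFT present?, FLC present?)
example = [(True, True), (True, False), (False, False), (True, True)]
counts, percent_agree = agreement_summary(example)
print(dict(counts), percent_agree)
```

Counting both-present and both-absent outcomes together as agreement mirrors how the overall agreement percentage is described in the text.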
The number of ASSRs detected was lower in the HI
subjects overall due to fewer intensity levels tested as
well as absent ASSRs due to elevated hearing thresh-
olds. Results from the FFT and FLC methods still
agreed in the majority of cases (87%). The percentage of
disagreements was higher, however, for the HI subjects
compared to the NH subjects, with 10.1, 13.8, 12.4, and
11.9% of cases for 500, 1000, 2000, and 4000 Hz. Three-way
ANOVA (group × frequency × intensity) with
Holm-Sidak pairwise comparisons showed that the
number of disagreements did not vary significantly
based on carrier frequency (F = 1.678, df = 3, 15, p =
0.214) or intensity level (F = 0.654, df = 1, 15, p =
0.663), but the number of disagreements between
methods was significantly greater for the HI group over
the NH group (F = 4.553, df = 5, 15, p = 0.050).
It is possible that disagreements between methods
could relate to differences in false positive or false
negative response rates between methods. For the
purposes of this study, a false positive response was
defined as a statistically significant ASSR detected at
any level below the behavioral threshold, or one that
occurred more than 20 dB (2 test levels) below the
next-highest phase-locked ASSR. Out of a total of 960
possible tests, there were seven false positive respons-
es detected in the NH subject group (0.73%), including
four for the FFT method and three for FLC. There were
37 false positives out of 872 tests in the HI group
(4.24%), 22 by FFT, and 15 by FLC. Three of these false
positives occurred for both FFT and FLC, which
resulted in agreements between methods. Therefore,
it does not appear that there was increased between-
method variability due to false positives.
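The false-positive definition and the reported rates can be checked with a short sketch. The function names are hypothetical, and the rule below compares each significant level only with the next-higher significant level, which is one possible reading of the definition in the text rather than the study's actual procedure.

```python
def false_positive_levels(significant_levels_db_hl, behavioral_threshold_db_hl,
                          max_gap_db=20):
    """Return the significant levels that would count as false positives.

    A level is flagged if it lies below the behavioral threshold, or if the
    next-higher significant level is more than max_gap_db above it.
    """
    levels = sorted(significant_levels_db_hl)
    flagged = []
    for i, level in enumerate(levels):
        below_threshold = level < behavioral_threshold_db_hl
        isolated = i + 1 < len(levels) and levels[i + 1] - level > max_gap_db
        if below_threshold or isolated:
            flagged.append(level)
    return flagged


def percent(count, total):
    """Percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Hypothetical cases: significant ASSR levels (dB HL) and behavioral threshold.
print(false_positive_levels([10, 50, 60], 30))      # 10 dB HL is below threshold
print(false_positive_levels([20, 50, 60], 10))      # 20 dB HL is isolated by 30 dB
print(false_positive_levels([30, 40, 50, 60], 20))  # nothing flagged

# Rates quoted in the text.
print(6 * 4 * 2 * 20)    # possible tests for 20 NH subjects
print(percent(7, 960))   # NH false-positive rate
print(percent(37, 872))  # HI false-positive rate
```

The arithmetic reproduces the 960 possible tests for the NH group (48 per subject for 20 subjects) and the 0.73% and 4.24% rates given above.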
There were also a few instances of absent ASSRs at
intensity levels well above behavioral threshold and/or
despite present ASSRs at lower intensity levels (e.g.,
absent ASSR at 60 dB HL and present ASSRs at 50, 40,
and 30 dB HL). These cases could be considered false
negatives. If it is assumed that, for example, all NH
subjects should have a present ASSR at 60 dB HL, false
negatives occurred in 18 cases or 11.25% for the FLC
method, and only 5 cases or 3.13% for the FFT method.
For example, no significant ASSRs were detected for
NH subject A4 to the 1000 Hz stimulus by the FLC
method at any of the tested intensity levels in the right
ear despite a behavioral threshold of 0 dB HL at this
frequency. ASSRs were obtained at 40, 50, and 60 dB
HL at this same frequency by the FFT method.
However, no ASSRs were detected for this subject at
60 dB HL for 4000 Hz in the right ear by either FLC or
FFT but were detected at lower levels. This subject’s
noise floor started out fairly high and did not fall under
10 nV at any point during the recording session.
It is possible, therefore, that false negatives may
have been due to higher subject noise levels at the
beginning of the test session when the highest
intensity levels were tested. However, subjects with
false negatives at 60 dB had a mean noise floor of 11.1
± 4.0 mV, while those without any false negatives at
60 dB had a mean noise floor of 13.6 ± 3.7 mV. There
was also not a significant correlation between individual
subject noise floor and the number of agreements
between the two methods for either NH (p = 0.080, r =
0.16) or HI subjects (p = 0.558, r = 0.06). In general,
therefore, any difference in results between the two
methods did not appear to relate to the noise floor of
the individual subject.
Time per Detected ASSR by Analysis Method
Comparing the time per detected ASSR during
simultaneous recording is complicated both by the
differences between methods including sweep length
and by the decision criteria used to determine present
ASSR. The FLC method updated statistical results in
1 sec increments, while FFT data are in sweeps of
13.65 sec. Unlike the FLC method implemented by the
RapidASSR system, the MASTER system and FFT
method were not stopped automatically when the response
criterion first reached significance. The current data
were collected using a minimum of 15 sweeps for the
FFT method, which results in considerably longer
averaging times per ASSR than the FLC method.
Instead of requiring 15 sweeps, FFT data could be
analyzed in terms of the first significant sweep, or after
a required number of consecutive significant sweeps.
Figure 7 shows a comparison between the recording
times for each individual ASSR detected by the FFT
and FLC methods in terms of sweeps using two types of
FFT criteria. In order to compare the two methods
more directly, the FLC data was rounded up to the
nearest “sweep” of 13.65 sec. That is, if the FLC
method obtained a present ASSR at 15 sec, it was
considered to be two sweeps. The converted FLC data
was compared to the FFT data after the first
significant sweep and the fifth consecutive significant
sweep to compare the time per detected ASSR. For
data to be included in this comparison, both the FFT
and FLC methods had to result in a present ASSR for the
carrier frequency and intensity level tested (e.g., one
data point represents subject A1 at 500 Hz, 60 dB HL,
left ear). The gray shaded area in Figure 7 represents
response detection times within five sweeps (68.24 sec)
between the two methods.
Figure 6. Agreement in response presence or absence between the FFT and FLC methods by intensity level for the NH (left panels) and the HI (right panels) at each frequency in Experiment 2. Black portions of the bars represent the number of cases where the present ASSRs were detected by both methods. Light gray areas indicate the number of cases where ASSRs were found to be absent (no response) by both analysis methods. The dark gray areas at the top of the bars represent cases of disagreement, where a present ASSR was detected by one method and no response was detected by the other method at a particular frequency/intensity.
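The conversion of FLC detection times to FFT sweeps described above amounts to rounding up to a whole number of 13.65 sec sweeps. A minimal sketch follows; the function names are hypothetical, and the inclusive five-sweep tolerance is an assumption about how "within five sweeps" was applied.

```python
import math

SWEEP_SEC = 13.65  # FFT sweep length given in the text


def seconds_to_sweeps(seconds, sweep_sec=SWEEP_SEC):
    """Round an FLC detection time up to a whole number of FFT sweeps."""
    return math.ceil(seconds / sweep_sec)


def within_sweeps(fft_sweeps, flc_sweeps, tolerance=5):
    """True if the two detection times differ by no more than `tolerance` sweeps."""
    return abs(fft_sweeps - flc_sweeps) <= tolerance


print(seconds_to_sweeps(15))  # the example from the text: 15 sec counts as 2 sweeps
print(seconds_to_sweeps(40))
print(within_sweeps(5, 2))
```

Rounding up rather than to the nearest sweep means the FLC time is never credited with less time than it actually took, which keeps the comparison conservative with respect to the FLC method.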
The filled symbols in Figure 7 show the time when
the ASSR first became significant by the FFT method,
called “FFT (1st),” compared to the time rounded up to the
next sweep when the significance criterion was met
for the FLC method. Almost all data points lie below
the solid line, indicating that the first significant
sweep for the FFT method was earlier than when the
response reached statistical criteria for the FLC
method. The time for FFT (1st) was significantly faster
than the time for the FLC method for all test
frequencies (p < 0.05 for each) by Friedman Repeated
Measures ANOVA on Ranks. However, the ASSR often
did not remain significant for subsequent sweeps for
the FFT, and this first significant response would be
considered an error if it did not remain for five
consecutive sweeps. The percentage of cases where
the first significant FFT response resulted in an error
ranged from 19% at 500 Hz to 39% at 4000 Hz. The use
of the first significant sweep for the FFT method as
criterion would not be recommended; however, this
comparison highlights that the time when the FLC
analysis became significant did not correspond to the
first significant sweep for the FFT.
Figure 7. The relationship between the time per detected ASSR for the FLC method and the FFT method under simultaneous comparisons in Experiment 2. FLC times were converted to “sweeps” by rounding to the nearest FFT sweep length of 13.54 sec. Filled symbols indicate the time in sweeps for the first significant response by the FFT method. Open symbols show the time in sweeps for the FFT method to meet the requirement of five consecutive significant sweeps. The solid diagonal line in each panel represents equal times in sweeps for the two methods, while the gray shaded area includes results for the FFT and FLC methods that are within five sweeps of each other.
The second criterion evaluated was the requirement
for five consecutive significant sweeps for FFT, called
“FFT (5sig),” for which the minimum time for the FFT
method would be five sweeps (68.24 seconds). The time
per detected ASSR for the FLC method correlated more
strongly with the FFT (5sig), as shown by the open
symbols in Figure 7, than for FFT (1st). If data points
falling within the gray area are considered to represent
comparable times between methods (within five
sweeps or just over 1 min), the times were similar for
between 74% (500 Hz) and 82% (4000 Hz) of cases.
Overall, the FLC method detected the ASSR faster
than FFT (5sig) 61% of the time; however, in only 5% of
the total cases was the FLC method faster by over five
sweeps (shown by data points above the diagonal
shaded area). The FFT (5sig) method was faster in 32%
of cases overall, and in 16% of the total comparisons
the FFT (5sig) method was faster by over five sweeps.
The time in sweeps per detected ASSR was significantly
shorter for the FLC method compared to the
FFT (5sig) criteria for all four carrier frequencies
across intensity levels by Friedman Repeated Measures
ANOVA on Ranks (p < 0.05 for each). However,
the FLC method and the FFT (5sig) method were
within five sweeps of each other in 79% of cases overall,
meaning present ASSRs were detected by the two
methods within just over 1 min of each other.
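The two FFT stopping criteria compared above, FFT (1st) and FFT (5sig), can be illustrated with a small sketch operating on a per-sweep record of significance. The function names and the example record are hypothetical, and treating the first significant sweep as an error unless it begins a run of five consecutive significant sweeps is an assumption about how the text's description would be applied.

```python
def first_significant(sig):
    """1-based index of the first significant sweep, or None if there is none."""
    for i, s in enumerate(sig, start=1):
        if s:
            return i
    return None


def five_consecutive(sig, run_length=5):
    """1-based index of the sweep completing the first run of `run_length`
    consecutive significant sweeps, or None if no such run occurs."""
    run = 0
    for i, s in enumerate(sig, start=1):
        run = run + 1 if s else 0
        if run == run_length:
            return i
    return None


def first_is_error(sig, run_length=5):
    """True if the first significant sweep does not begin a full run."""
    first = first_significant(sig)
    if first is None:
        return False
    window = sig[first - 1:first - 1 + run_length]
    return len(window) < run_length or not all(window)


# Hypothetical per-sweep significance record for one ASSR.
record = [False, True, False, True, True, True, True, True, True]
print(first_significant(record), five_consecutive(record), first_is_error(record))
```

In this example the response first reaches significance at sweep 2 but loses it at sweep 3, so the FFT (1st) criterion would stop six sweeps earlier than FFT (5sig) on a result that did not persist.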
Note that considering the criterion for a minimum of
15 sweeps for the FFT method would result in all data
points below 15 sweeps in Figure 7 being moved up to
this value. The time difference for the FLC compared
to the FFT 15 sweep requirement was significant for
all test frequencies (p < 0.05), with the FLC method
faster 76% of the time overall and over five sweeps faster
in 61% of these cases. The choice of stopping criteria for
the FFT method had an effect not only on the time
taken to detect a response but also on whether an
ASSR would be considered present at a given intensity
level. There were cases in which the criteria of five
consecutive significant sweeps resulted in a different
decision regarding ASSR presence/absence than if a
minimum of 15 sweeps were also required. In total,
this happened in only 13 instances out of a total of
1,920 individual tests (including all subjects/frequen-
cies/levels/ears) during the simultaneous recordings.
There were 37 instances in which there were five
consecutive significant sweeps at some point during
the test, followed by nonsignificant sweeps. In 20 of
those cases, the ASSR threshold would have changed
depending on whether recording would have stopped
after the first five consecutive sweeps.
In summary, the results for the simultaneous
recordings in Experiment 2 in which test conditions
were exactly the same for the two methods revealed
few significant differences between the FFT and FLC
methods. As with the independent recordings in the
first experiment, ASSR thresholds were found to be
significantly higher for the FLC method in the low
frequencies. Mean ASSR thresholds for the FLC
method were also found to be significantly higher at
2000 Hz for this experiment, although this result is
likely influenced by the reduced upper intensity limit
and correspondingly fewer ASSR thresholds obtained
for HI subjects at this frequency. The FFT method
tended to detect a greater number of ASSRs overall,
although this trend did not reach significance by
intensity, frequency, or subject group. In terms of time
per detected ASSR, the first significant sweep for the
FFT method was significantly faster than the time for
a significant ASSR by the FLC method, although this
criterion resulted in a large number of errors. If
criteria for five consecutive significant sweeps for
FFT were used, with or without a 15-sweep minimum,
the FLC method was significantly faster in detecting
an ASSR. However, in the majority of cases the two
methods detected ASSRs within a few sweeps, or just
over 1 min of each other.
Experiment 3: Automated Threshold Search
Protocol Using the FLC Method
For Experiment 3, an automated threshold search
protocol implemented by the RapidASSR system using
the FLC analysis method was evaluated. Because this
search protocol allowed for independent changes of
stimulus level by frequency and in each ear, the
automated algorithm has the potential to complete
threshold testing across frequencies in a more time-
efficient manner. The automated threshold search was
implemented as described in the methods section for 36
of the participants who had completed Experiments 1
and 2 (17 NH and 19 HI).
Figure 8 shows ASSR thresholds obtained using the
FLC automatic search protocol compared to behavioral
thresholds. The overall correlation of r = 0.74 is poorer
than those obtained with either the FLC or FFT
independent recordings in Experiment 1 (Figure 1).
Correlation coefficients of r = 0.63, 0.71, 0.88, and 0.77
were found for 500, 1000, 2000, and 4000 Hz respec-
tively. With the exception of 500 Hz, these were lower
than those found in Experiment 1. In addition,
Figure 8 shows that in a few instances ASSR thresh-
olds underestimated behavioral thresholds by a large
amount (up to 45 dB at 4000 Hz in one case).
The correlation between ASSR-behavioral threshold
difference scores for the automated search compared to
the independent threshold search in Experiment 1 was
r = 0.60 overall. Mean difference scores were found to
be 27.3 ± 13.3, 21.97 ± 13.6, 19.8 ± 10.8, and 15.4 ±
15.6 dB across all subjects (NH and HI combined) for 500,
1000, 2000, and 4000 Hz, respectively. Although mean
difference scores were generally similar for the
automated search as compared to the FFT or FLC
independent recordings in Experiment 1, standard
deviations were slightly larger, indicating higher
variability for this experiment. Friedman repeated
measures ANOVA on ranks with post-hoc Tukey tests
showed that ASSR thresholds for the automated
search did not differ significantly from those obtained
using the FLC method in Experiment 1 at any of the
four frequencies (p > 0.05 for all). However, ASSR
thresholds for the FLC automated search were significantly
higher than the FFT thresholds in Experiment
1 at 500, 1000, and 2000 Hz (p < 0.05).
While the correlation between ASSR and behavioral
thresholds was slightly poorer for the automated
search, the total test time was significantly decreased
compared to Experiment 1. Figure 9 shows box plots of
the total test time for the NH and HI subjects. The
mean test time for the NH group was 24.9 ± 11.0 min,
while the total time for the HI subjects was 27.5 ±
6.6 min. When compared to the test times shown in
Figure 3, there was a significant effect of the type of
search protocol on total test time by Friedman
Repeated Measures ANOVA on Ranks (p < 0.001).
The total test time for the automated protocol was
significantly shorter than for either the FLC or the FFT
independent tests (p < 0.05 each).
The results of Experiment 3 show that the FLC
automated search protocol with independent adjust-
ment of intensity for each carrier frequency was
significantly faster than a descending threshold search
in which criteria had to be met for all frequencies at a
certain intensity level before testing at the next level in
the series. However, results were slightly more
variable with poorer correlations between ASSR and
behavioral thresholds as compared to Experiment 1.
DISCUSSION
In this study, two multiple-frequency ASSR analysis
methods were compared under independent and
simultaneous test conditions in normal-hearing and
hearing-impaired adults. ASSR thresholds obtained by
both methods, as well as the relationship between
ASSR and behavioral thresholds, were found to be in
line with those reported in the literature for multiple-
frequency ASSR. The mean difference scores in
Tables 1 and 2 are similar to those reported in many
studies using multiple-frequency ASSR (Picton et al,
1998; Picton et al, 2001; Vander Werff and Brown,
2005), although higher than those reported in others
(Lins et al, 1996; Herdman and Stapells, 2001; Perez-
Abalo et al, 2001; Dimitrijevic et al, 2002; Luts et al,
2004). Across studies in the literature, errors in
estimating audiometric threshold using the ASSR tend
to be greater in normal-hearing individuals compared
to those with hearing loss. The reader is referred to
several published reviews on this topic (Herdman and
Stapells, 2003; Picton et al, 2003; Tlumak et al, 2007;
Vander Werff et al, 2008). In the current study,
difference scores and standard deviations were similar
between the NH and HI groups for both the FFT and
FLC methods. This result is likely due to the limited
range of hearing loss of the subjects in the HI group.
Most subjects had normal or nearly normal thresholds
in the low frequencies, and the degree of hearing loss
was limited to accommodate the upper test limit of 60 dB
HL for the simultaneous comparison.
Figure 8. Scatterplot representing the relationship between behavioral thresholds in dB HL and individual ASSR thresholds in dB HL obtained using the automated search protocol for the FLC method in Experiment 3. Format is the same as for Figure 1.
Figure 9. Box plots of the total time required to complete threshold testing for all frequencies in both ears using the automated search protocol in Experiment 3. Box format is the same as for Figure 3.
Most of the multiple-frequency ASSR studies in the
literature have utilized a method of time-domain
averaging and FFT analysis. The current study is the
first report of ASSR thresholds obtained using a
commercial implementation of an adaptive filtering
algorithm (FLC) to analyze the EEG signal, rather than
time-domain averaging and FFT analysis. When tested
independently, ASSR thresholds tended to be higher for
the FLC method as implemented by the RapidASSR
system compared to those obtained using the FFT
method implemented by the MASTER system. The
variability in mean thresholds was similar for the two
methods. The difference in thresholds between meth-
ods was found to be significant for the low frequencies,
where ASSR thresholds obtained using the FLC
method were higher, but there was no significant
difference for the high frequencies. Average difference
scores across frequencies were the same for the two
methods, however. When tested under simultaneous
recording conditions, the ASSR thresholds remained
significantly higher for the FLC method at 500 and
1000 Hz. Although there was also a significant
difference at 2000 Hz under the simultaneous condi-
tion, this result was likely influenced by the small
number of thresholds obtained in the HI subjects at
this frequency.
Correlations between behavioral thresholds and
ASSR thresholds, however, were not significantly
different between the FFT and FLC methods. For both
the FFT and FLC methods, the poorest correlations
were obtained for 500 Hz. The 500 Hz correlations of 0.53
and 0.59 for the FFT and FLC methods, respectively, in
the current study are somewhat lower than those
reported in previous studies, although consistent with
trends across ASSR studies in the literature for the
poorest correlations and highest variability at 500 Hz
(Dimitrijevic et al, 2002; Herdman and Stapells, 2003;
Hsu et al, 2003; Van Maanen and Stapells, 2005;
Vander Werff and Brown, 2005).
These results overall suggest that ASSR thresholds
obtained using the FLC method estimate the behav-
ioral audiogram as accurately as those obtained using
the FFT method. While ASSR thresholds were signif-
icantly higher in the low frequencies for the FLC
method compared to the FFT method, it is important to
note that the relative difference in ASSR thresholds
and difference scores for FLC compared to FFT was
generally small. Under simultaneous recording condi-
tions, the difference scores for the FFT and FLC
methods were the same for 65% of cases, and within
10 dB (the minimum step size) of each other for 94%
of ears tested. When independent recordings were
compared, the variability increased slightly. Difference
scores for the FFT and FLC methods were
within 10 dB of each other for 68% of cases, and within
20 dB for 92%. This higher variability would be
expected due to differences in subject state and other
noise conditions across test sessions for the indepen-
dent recordings.
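The agreement percentages above amount to counting the cases in which the two difference scores fall within a given tolerance of each other. A minimal Python sketch of that tally follows; the function name and the example values are hypothetical and are not data from this study.

```python
from typing import Sequence


def fraction_within(fft_scores: Sequence[float],
                    flc_scores: Sequence[float],
                    tolerance_db: float) -> float:
    """Fraction of paired difference scores (dB) that agree within tolerance_db."""
    if len(fft_scores) != len(flc_scores) or len(fft_scores) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    hits = sum(1 for a, b in zip(fft_scores, flc_scores)
               if abs(a - b) <= tolerance_db)
    return hits / len(fft_scores)


# Hypothetical difference scores (ASSR minus behavioral threshold, in dB)
fft_example = [10, 20, 10, 30]
flc_example = [10, 30, 30, 30]

same = fraction_within(fft_example, flc_example, 0)        # identical scores
within_10 = fraction_within(fft_example, flc_example, 10)  # one 10 dB step
within_20 = fraction_within(fft_example, flc_example, 20)  # two 10 dB steps
```

Because thresholds were sought in 10 dB steps, tolerances of 0, 10, and 20 dB correspond to zero, one, and two steps of disagreement.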
There are limitations in comparing thresholds
obtained by the two methods for either independent
or simultaneous recording conditions. As mentioned
above, subject and test variability could influence the
independent comparisons. Under simultaneous record-
ing conditions, the upper limit for testing affected the
number of individuals with measurable ASSRs for the
higher frequencies. The fact that ASSR thresholds
obtained using the FLC method were more elevated
above behavioral threshold at 500 and 1000 Hz than
for the FFT method under both simultaneous and
independent conditions suggests that clinicians should
consider correction factors specific to the analysis
method and protocol when estimating behavioral
threshold based on ASSR.
The simultaneous recording condition in Experiment
2 allowed for a direct comparison of the detection rate
and recording time for an individual ASSR, given that
subject and environmental factors were exactly the
same. Overall, more ASSRs were detected by the FFT
method than the FLC method across subjects. This
trend was apparent across most frequencies and
intensity levels under simultaneous test conditions.
However, detection rates for the two methods were not
found to be significantly different whether compared
by equal levels in dB HL or dB SL. In most cases, the
results of the two methods were in agreement in terms
of ASSR presence or absence for each frequency/
intensity combination (91% for NH, 87% for HI). A
significantly higher number of disagreements between
methods occurred for the HI group, although these
disagreements did not occur more frequently at
threshold or below threshold levels overall.
There were instances of false positives (present ASSR
below threshold) and false negatives (absent ASSR at
clearly suprathreshold level) for both the FFT and FLC
methods. The criterion used to classify a response as a
false positive was somewhat strict, as it would be
expected with normal variation that some ASSR
thresholds may fall below behavioral threshold. Al-
though there were slightly fewer cases of false positives
for the FLC method, the largest false positives of 20 dB
below behavioral threshold were obtained with this
method. In addition, false positives of 25 and 45 dB
below threshold were obtained during the automated
search using the FLC method. In each of these cases, the
FLC method detected a present ASSR within the first
22–70 sec of testing. For subject HA39, for example, a
present ASSR was found in 22 sec at 20 dB HL, after no
ASSR was detected during the maximum recording
time at 30 dB HL. A clinical strategy to reduce false
positives therefore may be to repeat an ASSR test if an
ASSR is detected in such a short period of time at a
relatively low intensity level, particularly if results do
not correspond with those at higher levels.
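One way to express the consistency check suggested above is to flag any level at which an ASSR was judged present even though it was absent at a higher level in the same intensity series. The Python sketch below is illustrative only; the function name and data layout are assumptions, and it does not encode a detection-time criterion because the source gives no specific cutoff.

```python
from typing import Dict, List


def inconsistent_detections(results: Dict[int, bool]) -> List[int]:
    """Return levels (dB HL) with a present ASSR below a level where it was absent.

    results maps stimulus level in dB HL to True (ASSR present) or False (absent).
    """
    absent_levels = [level for level, present in results.items() if not present]
    if not absent_levels:
        return []
    highest_absent = max(absent_levels)
    # A present response below the highest absent level is a candidate for retest.
    return sorted(level for level, present in results.items()
                  if present and level < highest_absent)


# Pattern like the one described for subject HA39:
# absent at 30 dB HL, then present at 20 dB HL.
flagged = inconsistent_detections({40: True, 30: False, 20: True})
```

Levels returned by such a check would be candidates for repetition rather than automatic rejection, since the response at the higher level may itself have been a false negative.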
False negatives also occurred for both the FFT and
FLC methods. Although the true number of false
negatives is not known, there were several cases of
absent ASSRs at 60 dB HL in the NH group that could be
considered false negatives. False negatives also occurred
more frequently for the FLC method than the FFT
method. Because the highest intensity levels were tested
at the beginning of the session, subject state and noise
floor may have influenced these results, although there
was not a significant relationship between noise floor and
the agreement in results between the two methods.
Total test time under independent recording condi-
tions in Experiment 1 was not significantly different
for the two methods overall, although test time was
longer for the FFT method for the HI subjects. This
result is most likely due to the difference in recording
protocols between the two systems for the highest
stimulus levels. The MASTER system allowed for
simultaneous presentation of all four stimuli up to
80 dB HL, while the RapidASSR system limited the
simultaneous presentation to 60 dB HL. For HI
individuals, particularly those with larger differences
in threshold across frequencies, testing each frequency
independently may be a more efficient method. The
commercial MASTER system allows for testing single
frequencies, although the modulation frequency chang-
es with this configuration. Overall, it did not appear
that either analysis method held an advantage in
terms of overall test time in independent comparisons
using the protocol implemented in the current study.
Under simultaneous recording conditions, it is
difficult to evaluate the relative efficiency in terms of
time between the two methods for a specific test due to
differences in the two detection algorithms and the
clinical test protocol chosen. The FLC method updated
statistical test results in 1 sec increments, while the
FFT method updated results after each sweep of
13.65 sec. The FFT method also must be halted
manually by the user or complete a maximum number
of sweeps, while the FLC method stops as soon as
statistical significance is reached.
Using the minimum time segment of 13.65 sec, or 1
sweep, as the smallest unit of time allows for a
reasonable, though not ideal, comparison of individual
recording times per detected ASSR. The time in sweeps
when the FFT method first became significant was
almost universally faster than the time to the nearest
sweep when the FLC method became significant. How-
ever, as reviewed above, this method would likely result
in considerable error in threshold estimation. While the
time per detected ASSR was more similar when compar-
ing the FLC with the fifth consecutive significant sweep
for the FFT method, the FLC method was faster in 61% of
these comparisons. This time difference in sweeps was
statistically significant, although the two methods were
within a few sweeps of each other for the majority of
comparisons. Therefore, depending on the decision
criteria used, when ASSRs were present by both methods,
the FLC was slightly more time efficient in detecting the
response. However, this result may be balanced out
overall by a larger number of false negatives and higher
thresholds in general for the FLC method. Test time for
an absent ASSR, such as false negatives and subthresh-
old tests, may be longer than for present ASSR, unless
subject noise floor is quite low.
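The sweep-based comparison described above can be sketched in Python as follows. The 13.65 sec sweep duration is taken from the text; the function names are hypothetical, and rounding to the nearest sweep with a floor of one sweep is an assumption about how the conversion might be done.

```python
SWEEP_SEC = 13.65  # duration of one FFT sweep, as reported in the text


def flc_time_to_sweeps(seconds: float, sweep_sec: float = SWEEP_SEC) -> int:
    """Express an FLC detection time in whole sweeps (nearest sweep, minimum 1)."""
    if seconds <= 0:
        raise ValueError("detection time must be positive")
    return max(1, int(seconds / sweep_sec + 0.5))


def faster_method(flc_seconds: float, fft_sweeps: int) -> str:
    """Compare FLC detection time with the FFT sweep at which the criterion was met."""
    flc_sweeps = flc_time_to_sweeps(flc_seconds)
    if flc_sweeps < fft_sweeps:
        return "FLC"
    if flc_sweeps > fft_sweeps:
        return "FFT"
    return "tie"
```

For example, an FLC detection at 22 sec corresponds to 2 sweeps under this rounding, so it would count as faster than an FFT response that met its criterion on the fifth sweep.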
While ASSR detection is objective, decisions on when
to accept an ASSR as present or absent affect the
overall accuracy of threshold estimation. In most
research studies using the FFT method, response
presence or absence has been judged after a fixed
number of sweeps have been recorded. Studies have
shown that the longer the test duration, the smaller
the difference scores and their associated standard
deviations (John et al, 1998; John and Picton, 2000;
Luts and Wouters, 2005), due to lower noise levels and
higher SNR. However, clinical test time is limited,
especially when testing young, sleeping infants. There
is a need for compromise and more time-efficient
strategies such as stopping the recording as soon as a
significant response is detected rather than waiting for
a predetermined length of time. In the current study,
two different analysis methods were applied using
protocols designed to be clinically feasible. The FLC
method did not require any clinical judgment for a
present ASSR, as recording was halted as soon as the
statistical results reached significance. A decision
criterion was implemented to halt testing after a
specified time if the probability of response remained
low. The results of this study show that the accuracy of
the FLC method was in line with previously published
multiple-frequency ASSR studies.
For the FFT method, a protocol designed to increase
time efficiency was evaluated in which averaging was
continued until the statistical results were significant for
five consecutive sweeps, with a 15 sweep minimum and a
noise floor criterion for no response decisions. This
method requires sequential application of multiple
statistical testing, which has been shown to increase
error rates (Sturzebecher et al, 2005). Compared to
previous ASSR studies in the literature, ASSR thresholds
and difference scores were within the range of previous
reports, although on the higher end. Correlations
between ASSR and behavioral thresholds were slightly
poorer than in some studies, but also within the range
across the literature (Vander Werff et al, 2008). These
results suggest that the clinical protocol added some
variability as compared to research protocols, but without
major adverse effects for threshold estimation in this
particular subject group. There were, however, examples
of significant errors in some individual subjects.
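The stopping rule described above for the FFT method can be sketched as below. This is a minimal illustration, not the MASTER system's implementation: the function name and the 0.05 significance level are assumptions, the response is assumed to be accepted at the first sweep at or beyond the minimum where the most recent five sweeps were all significant, and the noise floor criterion for no-response decisions is omitted because its details are not given here.

```python
from typing import Optional, Sequence


def fft_stop_sweep(p_values: Sequence[float],
                   alpha: float = 0.05,
                   n_consecutive: int = 5,
                   min_sweeps: int = 15) -> Optional[int]:
    """Return the 1-based sweep at which a response is accepted, or None.

    p_values[i] is the F-test p value after sweep i + 1 of the running average.
    """
    run = 0  # length of the current run of consecutive significant sweeps
    for sweep, p in enumerate(p_values, start=1):
        run = run + 1 if p < alpha else 0
        if run >= n_consecutive and sweep >= min_sweeps:
            return sweep
    return None


# Hypothetical p-value sequence: significant from sweep 4 onward
example = [0.2] * 3 + [0.01] * 20
with_minimum = fft_stop_sweep(example)                   # held until sweep 15
without_minimum = fft_stop_sweep(example, min_sweeps=1)  # accepted at sweep 8
```

Comparing the two calls shows how the minimum-sweep requirement lengthens a recording that would otherwise have been stopped earlier.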
Luts et al (2008) found that error rate, detection
rate, and recording time can vary significantly with
even small changes in the protocol. These authors
evaluated the use of fixed recording lengths, variable
lengths requiring a number of consecutive significant
responses, and the variable lengths plus the use of a
minimum number of sweeps. For variable length
recordings, error rates increased as the number of
consecutive significant sweeps required decreased.
When a fixed number of sweeps was required, error
rates were 5% or less, in agreement with the
significance level of the statistical test. For the variable
recording lengths, error rates were 29.8, 17.9, 9.7, 4.4,
and 1.3% for requirements of 1, 2, 4, 8, and 16
consecutive significant sweeps respectively. Requiring
a minimum of 8 sweeps decreased the error rates
slightly for 1, 2, and 4 consecutive sweeps to 19.2, 14.6,
and 8.9%. In the current study, the addition of a 15
sweep minimum requirement changed the result in a
small number of cases, which would be consistent with
a slight reduction in the error rate.
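A simplified calculation helps show why stopping at the first significant sweep can produce error rates well above the nominal 5% level. As an assumption not made in the source, suppose each of k looks at the data were an independent test at significance level α. The probability of at least one spurious significant result would then be:

```latex
% Illustrative only: assumes k independent tests, each at level \alpha.
\[
P(\text{at least one false positive in } k \text{ tests}) = 1 - (1 - \alpha)^{k}
\]
% Example with \alpha = 0.05 and k = 10:
\[
1 - 0.95^{10} \approx 0.40
\]
```

Successive sweeps of a cumulative average are correlated rather than independent, so the actual inflation would generally be smaller than this, and requiring several consecutive significant sweeps reduces it further.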
The results of this study, as well as those of Luts et
al (2008), suggest that a compromise may be possible
where recording time is limited without a large
sacrifice in accuracy. In a recent meta-analysis of
ASSR literature, Tlumak et al (2007) reported that the
maximum number of sweeps was not significantly
related to variability across studies in the difference
between ASSR and behavioral thresholds at any
carrier frequency. However, our results highlight the
need for critical clinical decision making as part of the
ASSR protocol. While ASSR testing is “objective,” it is
clear that the best results (fewest errors) are obtained
over long recording times in quiet subjects. In order to
strike a balance between shorter recording times and
high ASSR detection rates, clinicians need to strive to
improve clinical decision making. For example, in the
presence of higher subject noise, longer minimum
recording times would be required. In addition,
clinicians may change to stricter criteria if significance
levels are changing frequently with additional sweeps.
Repetition of particular tests that seem “suspicious”
would also add confidence to ASSR results. For
example, if an ASSR is detected quickly at a low
intensity level, the test should be repeated and/or
evaluated at the next highest stimulus level. In
addition, clinicians should pay particular attention to
obtaining high-quality recordings by making sure
electrode impedances are low and balanced, subject
noise is reduced as much as possible, and electrical
interference is eliminated or reduced.
Even with the use of clinical protocols with variable
recording times, however, total test times were consid-
erable for the independent tests: around 40 min for NH
subjects and 50 min for HI subjects for both methods.
The total time may have been negatively affected by the
chosen test protocol for this research study (e.g.,
straight descent intensity series, requiring a minimum
of 15 sweeps for FFT), but does not seem to suggest a
clinical time savings overall. One factor that has limited
the potential time savings of multiple-frequency ASSR
is that, at a given intensity level, the stimulus may be
above threshold at some frequencies but below threshold
at other frequencies. Test time for that intensity
level therefore depends on the smallest amplitude ASSR
out of the eight possible in the two ears. The final
experiment in the current study evaluated the use of an
automated threshold search protocol using the FLC
method, which allowed for independent testing of each
stimulus frequency. Given the objective statistical
nature of determining present or absent ASSR, an
automated independent protocol holds promise for
decreasing total test time, particularly in cases where
thresholds vary considerably across frequency such as
in sloping SNHL. In the case of the FLC method, the
automated search protocol that independently con-
trolled stimulus level across frequencies was found to
be significantly faster than the descending search
protocols used in the independent tests in Experiment
1 (for either the FLC or the FFT method). Mean test
time was under 30 min for both the NH and the HI
groups. This decreased test time would likely represent
a significant clinical advantage.
However, the time savings was somewhat counter-
acted by decreased test accuracy as shown by poorer
correlations to behavioral threshold and larger stan-
dard deviations for the automated search method
compared to the manual methods from Experiment 1.
There were examples of large underestimations of
threshold during the automated search, where ASSRs
were detected in short amounts of time at subthreshold
intensity levels. In a clinical setting, the audiologist
could decide that this response was unlikely to
represent a true ASSR and either disregard it, repeat
that specific test, or complete additional intensity
levels to improve the accuracy of the automated
protocol. Limitations of the independent control of
stimulus level across frequencies must also be consid-
ered, such as interactions between stimuli of varying
intensities and masking effects. The time benefits of
the automated protocol appear to be significant;
therefore, techniques to improve performance should
be further explored.
CONCLUSIONS
This study has shown that both the FFT and FLC
analysis methods for ASSR detection, as imple-
mented commercially, can be used for behavioral
threshold estimation with approximately equal accu-
racy. The use of correction factors specific to the test
may be necessary, however, when converting ASSR
thresholds to behavioral threshold due to the larger difference scores for the FLC method in the low
frequencies. Under simultaneous recording conditions,
controlling for subject state as an extraneous variable,
the ASSR detection rate was not significantly different
between methods. Clinical protocols with variable
recording lengths were used in an attempt to balance
time efficiency and accuracy, resulting in performance
similar to previous studies in the ASSR literature. Although an automated threshold search protocol
using the FLC method based on independent adjust-
ment of stimulus level by frequency resulted in
significantly decreased test time, there was an associ-
ated decrease in the correlation between ASSR and
behavioral threshold. Further evaluation of this type of
protocol is warranted given the time efficiency if false
positive responses and variability could be improved. Variability in individual ASSR thresholds and reduced
correlations to behavioral threshold, as well as signif-
icant errors in some individuals, in this study highlight
the need for clinical decision making to improve test
accuracy for both the FFT and FLC methods when
these clinical protocols are used.
Acknowledgments. The author would like to gratefully
acknowledge all the participants in this study and sincerely
thank Margaret Overman, Heather Schwartz, Kerrie Nes-
bitt, Kyle Wilson, and Kristen Burns for their assistance in
collecting and analyzing data for this project.
REFERENCES
American National Standards Institute. (1996) Specification for Audiometers (ANSI S3.6-1996). New York: American National Standards Institute.
American National Standards Institute. (2004) Specification for Audiometers (ANSI S3.6-2004). New York: American National Standards Institute.
Dimitrijevic A, John MS, Van Roon P, Purcell DW, Adamonis J, Ostroff J, Nedzelski JM, Picton TW. (2002) Estimating the audiogram using multiple auditory steady-state responses. J Am Acad Audiol 13(4):205–224.
Herdman AT, Stapells DR. (2001) Thresholds determined using the monotic and dichotic multiple auditory steady-state response technique in normal-hearing subjects. Scand Audiol 30(1):41–49.
Herdman AT, Stapells DR. (2003) Auditory steady-state response thresholds of adults with sensorineural hearing impairments. Int J Audiol 42(5):237–248.
Hsu WC, Wu HP, Liu TC. (2003) Objective assessment of auditory thresholds in noise-induced hearing loss using steady-state evoked potentials. Clin Otolaryngol 28(3):195–198.
John MS, Lins OG, Boucher BL, Picton TW. (1998) Multiple auditory steady-state responses (MASTER): stimulus and recording parameters. Audiology 37(2):59–82.
John MS, Picton TW. (2000) MASTER: a windows program for recording multiple auditory steady-state responses. Comput Methods Programs Biomed 61(2):125–150.
Lins OG, Picton TW, Boucher BL, Durieux-Smith A, Champagne SC, Moran LM, Perez-Abalo MC, Martin V, Savio G. (1996) Frequency-specific audiometry using steady-state responses. Ear Hear 17(2):81–96.
Luts H, Desloovere C, Kumar A, Vandermeersch E, Wouters J. (2004) Objective assessment of frequency-specific hearing thresholds in babies. Int J Pediatr Otorhinolaryngol 68(7):915–926.
Luts H, Van Dun B, Alaerts J, Wouters J. (2007) Objective detection of ASSR: do’s and don’ts. Paper presented at the International Evoked Response Audiometry Study Group (IERASG) XXth Biennial Symposium, Bled, Slovenia.
Luts H, Van Dun B, Alaerts J, Wouters J. (2008) The influence of the detection paradigm in recording auditory steady-state responses. Ear Hear 29(4):638–650.
Luts H, Wouters J. (2005) Comparison of MASTER and AUDERA for measurement of auditory steady-state responses. Int J Audiol 44(4):244–253.
Perez-Abalo MC, Savio G, Torres A, Martin V, Rodriguez E, Galan L. (2001) Steady state responses to multiple amplitude-modulated tones: an optimized method to test frequency-specific thresholds in hearing-impaired children and normal-hearing subjects. Ear Hear 22(3):200–211.
Picton TW, Dimitrijevic A, John MS, Van Roon P. (2001) The use of phase in the detection of auditory steady-state responses. Clin Neurophysiol 112(9):1698–1711.
Picton TW, Durieux-Smith A, Champagne SC, Whittingham J, Moran LM, Giguere C, Beauregard Y. (1998) Objective evaluation of aided thresholds using auditory steady-state responses. J Am Acad Audiol 9(5):315–331.
Picton TW, John MS, Dimitrijevic A, Purcell D. (2003) Human auditory steady-state responses. Int J Audiol 42(4):177–219.
Sturzebecher E, Cebulla M, Elberling C. (2005) Automated auditory response detection: statistical problems with repeated testing. Int J Audiol 44(2):110–117.
Tlumak AI, Rubinstein E, Durrant JD. (2007) Meta-analysis of variables that affect accuracy of threshold estimation via measurement of the auditory steady-state response (ASSR). Int J Audiol 46(11):692–710.
Van Maanen A, Stapells DR. (2005) Comparison of multiple auditory steady-state responses (80 versus 40 Hz) and slow cortical potentials for threshold estimation in hearing-impaired adults. Int J Audiol 44(11):613–624.
Vander Werff KR, Brown CJ. (2005) Effect of audiometric configuration on threshold and suprathreshold auditory steady-state responses. Ear Hear 26(3):310–326.
Vander Werff KR, Johnson TJ, Brown CJ. (2008) Behavioural threshold estimation for auditory steady-state response. In: Rance G, ed. Auditory Steady-State Response: Generation, Recording, and Clinical Applications. San Diego: Plural Publishing, 125–147.