
Accuracy and Time Efficiency of Two ASSR Analysis Methods Using Clinical Test Protocols

DOI: 10.3766/jaaa.20.7.5

Kathy R. Vander Werff*

Abstract

Background: The number of commercially available evoked potential systems implementing multiple-frequency auditory steady-state response (ASSR) techniques has increased over the last several

years. The majority of data in the multiple-frequency ASSR literature have been obtained using time-domain averaging and Fast Fourier Transform (FFT) techniques with F-test statistical analysis. Another

commercially available analysis method has been introduced using an adaptive filtering algorithm called the Fourier Linear Combiner (FLC). No previous investigation has evaluated the performance of

the FLC method, nor compared the two techniques. In addition, there is a need for evaluation of clinical protocols for ASSR testing using these available commercial systems that balance time efficiency and

accuracy in estimating threshold.

Purpose: (1) To determine whether ASSR thresholds, the relationship between ASSR and behavioral

thresholds, and clinical test time are affected by the ASSR analysis method when comparing two commercially available systems for multiple-frequency ASSR. (2) To investigate the use of clinical

ASSR test protocols of varying recording length, and the effect on accuracy and time efficiency, using these two commercially available analysis methods.

Research Design and Study Sample: ASSR threshold searches were completed on a group of 20 normal-hearing and 20 hearing-impaired adult participants using two different analysis methods, FFT

and FLC, under separate, independent tests as well as under simultaneous recording conditions.

Data Collection and Analysis: Three experiments were completed: (1) independent assessment of ASSR thresholds using the FFT and FLC methods separately, (2) simultaneous recording of ASSR for

both the FFT and FLC method, and (3) an automated threshold search protocol using the FLC method. Variables analyzed for Experiments 1 and 3 included ASSR thresholds, the difference between ASSR

and behavioral threshold, and total test time. For Experiment 2, the number of detected ASSRs per

method, the agreement between methods, and the time per detected ASSR were evaluated.

Results and Conclusions: ASSR thresholds and the relationship between ASSR and behavioral thresholds were found to be in line with those reported in the literature for multiple-frequency ASSR for both the

FLC and FFT methods. ASSR thresholds were found to be significantly higher for the FLC method for the low frequencies, but not for the high frequencies, when tested independently. Correlations between ASSR

and behavioral thresholds, however, were found to be the same across methods. Overall, it did not appear that either analysis method held an advantage in terms of accuracy or overall test time in independent

comparisons using the protocol implemented in the current study. The time benefits of an automated protocol were significant, although with compromised test accuracy. The results of this study suggest

critical clinical decision making is a necessary part of the ASSR protocol in order to decrease false positive and false negative responses and to increase overall efficiency.

Key Words: Analysis method, auditory evoked response, auditory steady-state response, hearing threshold, objective audiometry, recording time

Abbreviations: ASSR = auditory steady-state response; EEG = electroencephalographic; FFT = Fast

Fourier Transform; FLC = Fourier Linear Combiner; HI = hearing impaired/-ment; MASTER = Multiple Auditory Steady-State Response; NH = normal hearing

Kathy R. Vander Werff, Department of Communication Sciences and Disorders, 805 S. Crouse Ave, Room 200, Syracuse, NY 13244; Phone: 315-443-7403; Fax: 315-443-1113; E-mail: [email protected]

*Department of Communication Sciences and Disorders, Syracuse University, Syracuse, NY

This work was supported by the American Academy of Audiology Foundation (New Investigator Research Award) and the Marvin and Carol Schneller Fund, Syracuse University College of Arts and Sciences.

Portions of this research were presented at the American Auditory Society Annual Meeting, March 5–7, 2006, Scottsdale, AZ, and at AudiologyNOW! 2007, Denver, CO.

J Am Acad Audiol 20:433–452 (2009)


The auditory steady-state response (ASSR) has

transitioned from a research method into an

accepted diagnostic technique for evaluating

hearing in populations that cannot be tested using

behavioral methods. Many clinics have implemented

ASSR as part of the audiological test battery due to the

technique’s advantages as an objective measure,

including frequency specificity and automatic statisti-

cal detection of evoked responses.

The literature relating ASSR thresholds to behav-

ioral thresholds in adults and children has expanded

considerably in the past few years. However, there are

still reasons to hesitate in implementing ASSR as a

stand-alone clinical tool. There remain gaps in the

evidence base for ASSR threshold levels in infants with

hearing loss confirmed by behavioral data. There is a

lack of data regarding the use of bone-conduction

ASSR and identifying type of hearing loss with the

technique in hearing-impaired infants. In addition,

despite the possibility of testing multiple frequencies

simultaneously and objective response detection, ASSR

testing can still be time-consuming. ASSR is a small-

amplitude evoked potential, and the accuracy of

threshold estimation is known to be significantly

improved by longer recording times, due to lower

noise levels and higher SNR (John et al, 1998; John

and Picton, 2000; Luts and Wouters, 2005). Long

recording times may not be clinically feasible, and it

is important to evaluate whether an acceptable

balance between test time and accuracy can be

achieved. Despite these unresolved issues, there are

several commercial systems now available and in

clinical use. There are no accepted standards for

ASSR equipment, and each manufacturer is free to

implement its own test paradigm. The performance of

some of these methods and systems has not been

independently evaluated.

ASSRs are responses from the brain evoked by

continuous acoustic signals, most frequently sinusoi-

dally modulated carrier tones. These responses remain

stable in amplitude and phase over a long time period,

as neurons phase lock to the modulation rate of the

stimulus. ASSR can be recorded either to carrier tones

presented singly or to multiple tones presented

simultaneously. Both single- and multiple-frequency

ASSR techniques have advantages, and there are

commercial systems available utilizing both approach-

es. Previous research suggests that the accuracy of

single- and multiple-frequency techniques is similar

(Luts and Wouters, 2005), but that the relationship

between ASSR threshold and behavioral threshold

may be unique to the stimulation method (Luts and

Wouters, 2005; Vander Werff et al, 2008). That is,

clinicians would be advised to consider norms specific

to the type of stimulation method (single vs. multiple)

used by their device.

Within the category of multiple-frequency ASSR

techniques, manufacturers have implemented differ-

ent methods of recording and analyzing the response.

In general, multiple-frequency techniques estimate the

amplitude and/or the phase of each possible ASSR

component, corresponding to each of the modulation

frequencies used in the stimulus, in the ongoing

electroencephalographic (EEG) activity. If the estimate

of the component at a specific modulation frequency is

determined to be statistically different from random

background activity at a specified confidence level, an

ASSR is considered to be present for that stimulus

frequency. The way the estimates of the amplitude and

phase at each modulation frequency are determined,

the type of statistical test, and the time over which it is

applied vary across commercial systems.

The majority of multiple-frequency ASSR studies in

the literature have been conducted using the MASTER

(Multiple Auditory Steady-State Response) system

developed by John and Picton at the Rotman Research

Institute at the University of Toronto (John et al, 1998;

John and Picton, 2000). A version of the MASTER

system has been implemented commercially in the Bio-

logic Navigator Pro evoked potential system by Natus.

Similar analysis methods are incorporated in other

commercial devices such as the SmartEP system by

Intelligent Hearing Systems and the Audix by Neoro-

nic, SA. In these systems, incoming EEG data is

divided into sections or epochs of about 1 sec. In the

case of the MASTER system, 16 epochs are linked

together (after rejecting any epoch that exceeds

artifact rejection criteria) to form a sweep of data for

analysis. Fast Fourier Transform (FFT) calculations

are performed on each sweep of data to convert the

data to the frequency domain, and an F-ratio is used to

statistically evaluate the energy at each modulation

frequency compared to the energy in surrounding

frequency bins. That is, an estimate of the possible

ASSR signal is compared to an estimate of the

background noise at neighboring frequencies. Each

additional sweep of data is added to the prior sweeps in

the time domain, and the results are submitted to FFT

analysis and the F-test after each sweep. This method

of time-domain averaging serves to improve the signal-

to-noise ratio (SNR). When the F-ratio at a particular

modulation frequency becomes significant at the

specified level (typically p < 0.05), an ASSR is

considered to be present. It is up to the user to

determine the minimum and/or maximum number of

sweeps to include in the average before the ASSR

response is considered to be a stable, or real, response.
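The F-ratio comparison described in this paragraph can be sketched in a few lines. This is only an illustrative sketch, not the MASTER implementation: the function names, the synthetic sweep, and the use of 10 noise bins on each side of the signal bin are assumptions made for the example. It relies on the fact that, under the null hypothesis, the ratio of the power in one bin to the mean power in N neighboring bins follows an F distribution with 2 and 2N degrees of freedom, whose upper tail probability is (1 + F/N)^(-N).

```python
import cmath
import math
import random


def dft_bin(samples, k):
    """Complex DFT coefficient of `samples` at integer bin k."""
    n = len(samples)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(samples))


def f_ratio_test(samples, signal_bin, bins_per_side=10, alpha=0.05):
    """Compare power at signal_bin with the mean power of neighboring bins.

    bins_per_side is a hypothetical setting chosen for this sketch.
    Returns (F, p, detected).
    """
    signal_power = abs(dft_bin(samples, signal_bin)) ** 2
    neighbors = [signal_bin + d
                 for d in range(-bins_per_side, bins_per_side + 1) if d != 0]
    noise_power = sum(abs(dft_bin(samples, k)) ** 2
                      for k in neighbors) / len(neighbors)
    f_value = signal_power / noise_power
    n_noise = len(neighbors)
    # Upper tail of F(2, 2N): P(F > f) = (1 + f / N) ** (-N)
    p_value = (1.0 + f_value / n_noise) ** (-n_noise)
    return f_value, p_value, p_value < alpha


# Synthetic "sweep": a sinusoid with exactly 40 cycles plus Gaussian noise.
random.seed(1)
N_SAMPLES = 512
sweep = [math.sin(2 * math.pi * 40 * i / N_SAMPLES) + random.gauss(0.0, 1.0)
         for i in range(N_SAMPLES)]
f_value, p_value, detected = f_ratio_test(sweep, signal_bin=40)
```

In a clinical recording the input would be the time-domain average of all sweeps collected so far, so the test would be repeated on a progressively less noisy waveform after each sweep.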

Other multiple-frequency ASSR systems have em-

ployed adaptive filtering algorithms, rather than time-

domain averaging and FFT analysis, to analyze the

EEG signal. One such example is a detection strategy

called RapidASSR, implemented commercially by GN

Journal of the American Academy of Audiology/Volume 20, Number 7, 2009


Otometrics in the Chartr EP Evoked Potential system.

This specific strategy implements an algorithm called

Fourier Linear Combiner (FLC) to adjust the amplitude

and phase of sine waves in its model to estimate the

actual ASSR amplitude and phase. The model is adaptively

evaluated against the EEG signal using a circular T²

statistic once per second. As soon as the statistic

reaches a specified confidence level (typically 95%), the

test is automatically stopped and a ‘‘positive ASSR’’ is

reported. If the specified confidence is not reached

within the designated maximum search time, the

results are reported as a ‘‘negative ASSR.’’
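The general idea of a Fourier Linear Combiner can be illustrated with a textbook least-mean-squares (LMS) version. The commercial algorithm is not described in detail here, so everything in this sketch is an assumption: the function name, the step size `mu`, the sampling rate, and the noise-free test signal are hypothetical and chosen only to show how sine and cosine weights adapt toward the amplitude and phase of a component at a known frequency.

```python
import math


def flc_estimate(samples, freq_hz, fs_hz, mu=0.01):
    """Track one sinusoidal component with an LMS Fourier Linear Combiner.

    Returns (amplitude, phase_rad) for the model
    amplitude * sin(2 * pi * freq_hz * t + phase_rad).
    mu is a hypothetical adaptation step size.
    """
    w_sin = 0.0
    w_cos = 0.0
    for n, sample in enumerate(samples):
        angle = 2.0 * math.pi * freq_hz * n / fs_hz
        ref_sin, ref_cos = math.sin(angle), math.cos(angle)
        # Error between the incoming sample and the current model output.
        error = sample - (w_sin * ref_sin + w_cos * ref_cos)
        # LMS update of the two Fourier weights.
        w_sin += 2.0 * mu * error * ref_sin
        w_cos += 2.0 * mu * error * ref_cos
    return math.hypot(w_sin, w_cos), math.atan2(w_cos, w_sin)


FS_HZ = 1000.0   # assumed sampling rate
MOD_HZ = 80.0    # one modulation frequency
clean = [0.5 * math.sin(2.0 * math.pi * MOD_HZ * n / FS_HZ + 0.7)
         for n in range(3000)]
amplitude, phase = flc_estimate(clean, MOD_HZ, FS_HZ)
```

With real EEG the weights would fluctuate around the true values, and a statistic such as the circular T² would be needed to decide whether the estimates are consistent enough to count as a response.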

One difference between the two techniques just

described is the influence of the clinical strategy

implemented by the audiologist for determining

whether a response is present or absent. For both

methods, the clinician can determine a maximum

recording time after which, if the statistical criterion

is not met, ASSR will be considered absent. With the

FFT method, it is possible to implement strategies

based on always collecting data for a preset period of

time or number of sweeps, and making the decision

based on the F-ratio at the end of this time period.

Alternatively, ASSR testing could proceed until a

statistically significant ASSR is detected by evaluating

the F-ratio after each sweep. While this method is

more practical in clinical settings with limited test

time, the application of multiple sequential statistical

comparisons increases the likelihood that the result

could reach significance by chance. These types of

errors could be compensated for by changing the

significance level (e.g., using a Bonferroni correction);

however, it has been shown that this type of

correction may lead to errors of not detecting real

ASSRs (Luts et al, 2007). Some combination of these

strategies may be the best solution in terms of test time

and accuracy. For example, recording could be contin-

ued for a predetermined length of time or be stopped

when the response remains significant for several

sweeps in a row or when the noise floor falls under a

certain criterion level.
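The trade-off described above can be made concrete with a short calculation. The sketch below assumes up to 45 sequential tests (the maximum number of sweeps used for the FFT method in this study) and, as a simplification, treats the tests as independent. Tests on a cumulative average are in fact correlated, so the uncorrected figure should be read as an upper bound rather than the actual error rate. The function names are hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


def familywise_error_if_independent(alpha, n_tests):
    """Chance of at least one false detection across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


MAX_SWEEPS = 45
corrected_alpha = bonferroni_alpha(0.05, MAX_SWEEPS)
uncorrected_risk = familywise_error_if_independent(0.05, MAX_SWEEPS)
```

The corrected per-test level is very strict, which is consistent with the concern that such a correction may cause real ASSRs to be missed.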

The FLC method, by contrast, is continuously

updated in 1 sec increments and automatically stops

recording when statistical criteria are reached. The

circular T² statistic, which is geared toward repeatability,

is not subject to the same types of errors of

multiple comparisons. However, to this author’s

knowledge, there are no reports in the literature of

the accuracy of this technique in estimating thresh-

olds. This study, therefore, provides a first report in

the literature of the use of the FLC analysis technique

for recording ASSR. It directly compares the FLC and

FFT analysis methods as implemented commercially

under independent conditions to optimize each proto-

col as well as under simultaneous recording condi-

tions, controlling for subject state as an extraneous

variable. Finally, this study attempts to utilize

clinically relevant strategies of variable recording

times to evaluate the balance between the test

accuracy and completion of recording in a feasible

amount of time.

METHODS

Subjects

Two groups of adult subjects were recruited for this

study, one group with normal hearing (NH), and one

group with hearing impairment (HI) for at least one of

the frequencies tested. A total of 43 adults were

enrolled, and 20 NH subjects (16 female) and 20 HI

subjects (7 female) completed the study. One NH

subject and two HI subjects did not complete the study

due to excessive noise levels even after allowing a

considerable period to relax/sleep and/or were unwill-

ing to return for further testing. Both ears were tested

and included in the analysis for each subject, with the

exception of one HI subject with a unilateral conduc-

tive component that exceeded the maximum stimulus

levels for the first experiment in this study at all test

frequencies, in which case the ear with mixed loss was

excluded. For the purposes of this study, NH was

defined as thresholds of 25 dB HL (American National

Standards Institute, 2004) or better for all frequencies

from 250 to 4000 Hz, and HI was defined as thresholds

>25 dB HL at one or more frequencies.

Recruitment focused on individuals with a somewhat

limited range of hearing loss due to the fact that the

upper limit of stimulation under simultaneous record-

ing conditions was 60 dB HL. The degree of hearing

loss for the HI group varied from mild to severe,

although most subjects had primarily high-frequency

hearing loss. Mean thresholds for the HI group were

21, 29, 36, and 52 dB HL for 500, 1000, 2000, and

4000 Hz respectively. The age range for the NH

subjects was 19–74 years (mean 29.3 ± 13.4), and the

age range for the hearing-impaired subjects was 26–82

years (mean 70.7 ± 12.0). These groups therefore

represent different ages as well as differences in

hearing status. All 40 participants completed Experi-

ments 1 and 2, and 36 of the participants also

completed Experiment 3 (17 NH and 19 HI). The

protocol used in this study was approved by the

Institutional Review Board of Syracuse University,

and informed consent was obtained from all partici-

pants.

Stimulus Parameters

ASSR stimuli were amplitude- and frequency-mod-

ulated (100% AM, 20% FM) tones at four carrier

frequencies (500, 1000, 2000, and 4000 Hz) per ear,

ASSR Analysis Methods/Vander Werff


presented to both ears simultaneously. Modulation

frequencies were approximately 80, 85, 90, and 95 Hz

for the left ear and 78, 83, 87, and 92 Hz for the right

ear, adjusted slightly for an integer number of cycles

within each epoch. ASSR stimuli were calibrated

separately for each carrier frequency in dB HL using

a Bruel & Kjaer Type 2209 sound level meter in linear

mode with a Bruel & Kjaer DB0138 2 cc coupler. ANSI

S3.6-1996 (American National Standards Institute,

1996) corrections for Etymotic ER-3A earphones were

used to convert from dB SPL to dB HL.
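The adjustment of modulation frequencies to an integer number of cycles per epoch can be sketched as follows. The sketch assumes the 853 msec epoch reported for the MASTER system in this study and assumes rounding to the nearest whole number of cycles; the exact rule used by either system is not stated, and the function name is hypothetical.

```python
def adjust_to_integer_cycles(nominal_hz, epoch_s):
    """Nearest frequency that completes a whole number of cycles per epoch."""
    cycles = round(nominal_hz * epoch_s)
    return cycles / epoch_s


EPOCH_S = 0.853  # epoch duration reported for the MASTER system
nominal_left = [80.0, 85.0, 90.0, 95.0]
adjusted_left = [adjust_to_integer_cycles(f, EPOCH_S) for f in nominal_left]
```

Because each adjusted frequency then falls exactly on a frequency bin of the transformed epoch, the response energy is not smeared across neighboring bins.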

For Experiment 1, stimuli were generated by each

individual evoked potential system, the MASTER for

FFT method and the RapidASSR system for the FLC

method. For Experiment 2, under simultaneous re-

cordings of FFT and FLC, stimuli were generated and

presented by the MASTER system. Experiment 3

involved only the FLC method in an automated

threshold search protocol, and stimuli generated by

the RapidASSR system. Output levels were approxi-

mately 5 dB lower for the RapidASSR system com-

pared to the MASTER system; therefore, corrections

were made to adjust for differences between the output

levels of the two systems for the independent FLC

recordings. Frequency spectra for the stimuli generat-

ed by each system were comparable.

Instrumentation and Recording Parameters

Single-channel ASSRs were recorded using an active

electrode placed on the vertex (Cz), linked reference

electrodes on both mastoids (M1 and M2), and a

ground electrode on the low forehead (Fpz). All inter-electrode

impedance values were less than 5 kΩ and

were within 1.5 kΩ of each other. During simultaneous

recordings in Experiment 2, electrodes were linked via

jumper cables to the pre-amplifiers of both evoked

potential systems.

MASTER System (FFT method)

The Bio-logic MASTER system (research software v

2.02-R) implemented multiple ASSR using FFT anal-

ysis. EEG signals were amplified with a gain of 10,000

and filtered using a band pass of 30 to 150 Hz. The

basic unit of each stimulus was an epoch of 853 msec,

and a ‘‘sweep’’ of data in the average consisted of 16

epochs. Artifact rejection with a rejection level of 20 µV

was utilized to reject epochs with excessive myogenic

interference. The maximum number of sweeps record-

ed per intensity level was 45 for stimulus levels up to

70 dB HL, but were limited to a maximum of 36 sweeps

at 80 dB HL. The maximum recording time for each

level was, therefore, up to 10.5 min.

The F-ratio of the energy at each modulation

frequency versus the energy in neighboring bands

was calculated after each sweep. Response detection

level was set at p < 0.05. The ongoing display indicated

the amplitude, phase, F-statistic, and noise floor level

for each carrier/modulation frequency combination

after each sweep. Averaging could be stopped manu-

ally at any time after 1 sweep or continued until the

full 45 sweeps were completed. The intensity level was

automatically decreased by 10 dB after the maximum

number of sweeps was completed or when recording

was halted manually.

A clinical protocol of variable recording lengths was

employed based on two main criteria, although a

minimum of 15 sweeps were always collected. Once

15 sweeps had been collected, the recording could be

stopped before the maximum number of sweeps if one of

the two criteria were met: (1) an ASSR remained

statistically significant for five sweeps in a row or (2)

no stable ASSR was present, and the average noise floor

was below 10 nV. One of these two criteria had to be

fulfilled for all eight frequencies (four in each ear);

otherwise the maximum number of sweeps was collected.
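The variable-length stopping rule described above can be expressed as a small decision function. This is a sketch of one reading of the protocol, not software used in the study: the names are hypothetical, and "no stable ASSR" is interpreted here as the response not being significant on the most recent sweep.

```python
MIN_SWEEPS = 15
MAX_SWEEPS = 45
CONSECUTIVE_REQUIRED = 5
NOISE_CRITERION_NV = 10.0


def trailing_significant_run(significant_by_sweep):
    """Number of most recent consecutive sweeps with a significant F-ratio."""
    run = 0
    for flag in reversed(significant_by_sweep):
        if not flag:
            break
        run += 1
    return run


def frequency_resolved(significant_by_sweep, noise_floor_nv):
    """True if criterion 1 or criterion 2 is met for one frequency."""
    run = trailing_significant_run(significant_by_sweep)
    stable_response = run >= CONSECUTIVE_REQUIRED
    # Assumed reading of criterion 2: not currently significant and quiet.
    quiet_no_response = run == 0 and noise_floor_nv < NOISE_CRITERION_NV
    return stable_response or quiet_no_response


def should_stop(histories, noise_floors_nv):
    """histories: one list of per-sweep significance flags per frequency."""
    n_sweeps = len(histories[0])
    if n_sweeps >= MAX_SWEEPS:
        return True
    if n_sweeps < MIN_SWEEPS:
        return False
    return all(frequency_resolved(h, nf)
               for h, nf in zip(histories, noise_floors_nv))


# Two-frequency example: one stable response, one quiet non-response.
present = [False] * 10 + [True] * 5
absent = [False] * 15
stop_now = should_stop([present, absent], [12.0, 8.0])
```

In the study the same check would cover all eight carrier frequencies (four per ear) before the level was lowered.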

RapidASSR System (FLC method)

The RapidASSR system (EP v 4.0, ASSR DLL v

1.1.05) implemented multiple ASSR using an adaptive

filter analysis method called FLC. EEG signals were

amplified with a gain of 100,000 and filtered from 10 to

105 Hz. The artifact reject level was not adjustable

with this system. The probability of a significant ASSR

was calculated by comparing the amplitude and phase

of model sine waves to the incoming EEG data

approximately once per second using a circular T²

statistic with the criterion level set at 95%. Once the

probability reached 95%, no further analysis was

conducted at that carrier/modulation frequency (re-

cording was stopped automatically and the ASSR

considered present), although analysis continued for

any other frequencies not yet significant. If the

criterion 95% probability level was not reached,

recording continued until the maximum time was

reached. The maximum recording time was adjustable

in minute increments and was set to 11 min to match

the MASTER system as closely as possible.

The information available on the screen during

recording included the response probability at each

carrier frequency updated in 1 sec intervals. Once 95%

probability was reached, the display indicated that a

present ASSR had been identified. If ASSRs were

identified at all frequencies, recording stopped for that

intensity level. The intensity level then automatically

decreased by 10 dB. Although the FLC method

eliminated most user control for determining recording

length (other than setting the maximum time), for the

purposes of this study an additional criterion was

developed during pilot testing. If more than 2 min



elapsed during which the response probability remained

lower than 50%, recording was manually halted and the

result was considered an absent ASSR. Again, criteria

had to be met for all frequencies where ASSRs had not

already been detected. The time (in seconds) taken to

identify each response was saved automatically in a log

file available for off-line analysis.
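The decision rules for a single frequency under the FLC method, including the additional 2 min criterion, can be sketched as below. The function and constant names are hypothetical, and the 2 min rule is interpreted as an unbroken run of 1 sec updates with probability below 50%; in the study this rule was applied manually by the investigator.

```python
PRESENT_CRITERION = 0.95
LOW_PROBABILITY = 0.50
LOW_PROBABILITY_LIMIT_S = 120
MAX_TIME_S = 11 * 60


def classify_flc_run(probabilities):
    """probabilities: response probability at successive 1 sec updates.

    Returns (outcome, seconds), where outcome is 'present', 'absent',
    or 'undecided' if the data end before any rule applies.
    """
    low_run_s = 0
    for second, prob in enumerate(probabilities, start=1):
        if prob >= PRESENT_CRITERION:
            return "present", second
        low_run_s = low_run_s + 1 if prob < LOW_PROBABILITY else 0
        if low_run_s > LOW_PROBABILITY_LIMIT_S:
            return "absent", second
        if second >= MAX_TIME_S:
            return "absent", second
    return "undecided", len(probabilities)


quick_detection = classify_flc_run([0.2, 0.6, 0.96])
```

The returned time corresponds to the per-response detection time that the system logged for off-line analysis.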

Procedures

Experiment 1: Independent Threshold Assessment

by Analysis Method

ASSR threshold searches were conducted for each

subject with each of the two systems independently in

separate recordings to evaluate performance under the

optimal conditions for each method. In all cases, a

descending threshold search was conducted using

10 dB steps. The lowest level tested was 10 dB HL.

For NH subjects and those with mild hearing loss, the

search protocol began at 60 dB HL. For most HI

subjects the search began at 80 dB HL. The two

systems differed on how stimuli over 60 dB HL were

presented. The MASTER system allowed presentation

of multiple stimuli up to 80 dB HL, while the

RapidASSR system limited presentation of multiple

stimuli to 60 dB HL. Therefore stimuli were presented

singly at 70 and 80 dB HL for the RapidASSR system.

If configured to present single stimuli, the MASTER

system defaulted to modulation frequencies below

70 Hz; therefore, the decision was made to collect data

using multiple stimuli for the MASTER and single

stimuli for the RapidASSR at 70 and 80 dB HL, rather

than change the modulation frequencies. The maxi-

mum intensity level tested for independent recordings

of both systems in Experiment 1 was 80 dB HL.

The criteria described above for each method were used

to determine the length of recording at each intensity

level, and the investigator descended to the next intensity

level as soon as any of these criteria were reached for all

frequencies. The RapidASSR system automatically

stopped presenting the stimulus for an individual carrier

frequency as soon as a response reached the 95%

confidence level, meaning that the number of stimuli

presented at one time changed over the course of the

recording. Each system was also set up to automatically

descend to the next intensity step when the maximum

time was reached. Testing continued in descending steps

until no ASSRs were obtained for two consecutive

intensity levels or to the lowest level of 10 dB HL.
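The descending search and its stopping condition can be sketched as a simple loop. The sketch is simplified and hypothetical: `detect` stands in for a full recording at one intensity level and returns True if any ASSR was obtained at that level.

```python
def descending_search(detect, start_db_hl, lowest_db_hl=10, step_db=10):
    """Descend in fixed steps until two consecutive levels show no ASSR
    or the lowest level has been tested.

    Returns a dict mapping each tested level (dB HL) to True/False.
    """
    results = {}
    consecutive_absent = 0
    level = start_db_hl
    while level >= lowest_db_hl and consecutive_absent < 2:
        present = detect(level)
        results[level] = present
        consecutive_absent = 0 if present else consecutive_absent + 1
        level -= step_db
    return results


# Hypothetical listener whose responses disappear below 40 dB HL.
example_run = descending_search(lambda level: level >= 40, start_db_hl=60)
```

For this hypothetical listener the search ends after 30 and 20 dB HL both yield no response, so 10 dB HL is never presented.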

Experiment 2: Simultaneous Comparison of ASSR

Detection by Analysis Method

Simultaneous recordings were conducted in order to

compare ASSR detection rates of the two systems

under exactly the same subject and noise conditions.

As in the independent recordings, a descending

intensity series was conducted from 60 to 10 dB HL

in 10 dB steps. Higher intensity levels were not tested

due to the differences in protocols for the two systems

at these levels. As a result, ASSR thresholds were not

obtained in the high frequencies for a number of HI

subjects.

Stimuli were presented via the MASTER system

signal generation and insert earphones. The Rapi-

dASSR recording system was calibrated by engineers

from GN Otometrics to incorporate exact stimulus

frequency and phase parameters of the MASTER

stimuli into the response detection model algorithm.

The two evoked potential systems were not directly

linked in any way other than by jumpers between the

pre-amplifiers. Stimulus presentation and recording

was started through the MASTER system, and immedi-

ately following, recording was begun with the Rapi-

dASSR system. Recording by the two systems was begun

as close to simultaneously as possible; however, precise

triggering between the two systems was not required due

to the nature of the FLC detection method.

Criteria used to determine recording length were the

same for simultaneous recording as described in

Experiment 1, although stimulus presentation contin-

ued until criteria were met for both systems. For the

MASTER system, this meant that recording continued

while ‘‘waiting’’ for the RapidASSR system to meet

criteria. These additional sweeps collected after crite-

ria were met for the FFT were not included in the

calculations. Due to the way the RapidASSR system

operates in automatically stopping recording when

responses are detected, if FLC criteria were met before

FFT criteria the RapidASSR system was paused to

prevent data collection at the next intensity level.

Experiment 3: Automated Threshold Search

Protocol Using the FLC Method

The RapidASSR system offers an automated thresh-

old search protocol called ‘‘Quick Search.’’ The Quick

Search protocol adjusts the stimulus level indepen-

dently for each carrier frequency once a present ASSR

is detected. The user defines the upper and lower limits

of the search, and the software algorithm is designed to

bracket threshold within this range, beginning with a

level in the middle of the selected range. If a present

ASSR is detected, the level will automatically decrease

for that frequency as much as 20 dB (unless this is

below the lower limit). If no ASSR is detected following

the maximum time, the presentation level is increased

by 5–10 dB. Because the level is adjusted for each

frequency separately, the intensity levels of each and

the number of stimulus frequencies being tested

vary over time. The automated search protocol was



implemented with an upper limit of 80 dB HL and a

lower limit of 10 dB HL, using a minimum 5 dB step

size and independent testing of each ear.

General Procedures

Experiments 1 and 2 were completed first. In order

to complete these two experiments, subjects were

tested under three conditions in random order: FFT

independently, FLC independently, and simultaneous

FFT and FLC. Experiment 3 was then completed if

time permitted, always after the first two experiments

were completed. Most subjects required two test

sessions to complete testing for all three experiments;

however, each individual test condition was completed

within a single session. Subjects were seated in a

comfortable, reclining chair in a double-walled sound-

treated booth with lights dimmed. They were asked to

relax with their eyes closed and try to sleep if possible.

No formal assessment was made of subject state,

although most subjects were able to sleep or doze

during the recording sessions. For all experiments,

testing continued until ASSR threshold was deter-

mined for all frequencies. ASSR thresholds were

defined as the lowest intensity level where a statisti-

cally significant ASSR was detected, provided that

suprathreshold ASSRs were also present within 20 dB

(unless this exceeded the highest level tested). If

ASSRs were absent for two consecutive intensity steps

above the lowest present ASSR, it was considered to be

a false positive response.
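The threshold definition and the false positive rule can be sketched for a single carrier frequency as follows. This is one interpretation of the rule with hypothetical names: a detected level counts as threshold if it is the highest level tested or if at least one tested level within 20 dB above it also shows a response; otherwise it is skipped as a false positive.

```python
def assr_threshold(detected_by_level, support_range_db=20):
    """detected_by_level: dict mapping tested level (dB HL) to True/False.

    Returns the lowest supported response level, or None if there is none.
    """
    highest_tested = max(detected_by_level)
    for level in sorted(detected_by_level):
        if not detected_by_level[level]:
            continue
        above = [other for other in detected_by_level
                 if level < other <= level + support_range_db]
        supported = any(detected_by_level[other] for other in above)
        if level == highest_tested or supported:
            return level
        # Isolated response with no support above: treated as false positive.
    return None


# The response at 10 dB HL is isolated (20 and 30 dB HL absent).
series = {60: True, 50: True, 40: True, 30: False, 20: False, 10: True}
threshold = assr_threshold(series)
```

Here the response at 10 dB HL is rejected and the threshold is taken as 40 dB HL.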

Analysis

ASSR thresholds were analyzed by frequency,

participant group, and test method (FLC or FFT) for

all three experiments. Difference scores between

behavioral and evoked potential thresholds were

calculated for each individual by subtracting behav-

ioral threshold from ASSR threshold. Means and

standard deviations of these values were determined

by frequency and subject group. Cases where threshold

could not be determined due to absent ASSR at the

upper limits of testing were not included in the

average. However, in order to compare test performance

using repeated measures analysis by test methods,

values 10 dB above the highest stimulus levels were

assigned to no-response conditions (i.e., 90 dB assigned

for no-response conditions in Experiment 1 and 70 dB

HL for Experiment 2). The time required to complete

the entire threshold search using each analysis method

was analyzed for independent recordings in Experiment

1 and Experiment 3. For Experiment 2, the number of

detected ASSRs per method, the agreement between

methods, and the time per detected ASSR were

evaluated. For repeated measures comparisons of time

per detected ASSR, no-response conditions were as-

signed a value of 46 sweeps (one greater than the

maximum number of sweeps collected). Statistical

analyses are described in the results for each experi-

ment. Due to the fact that much of the data was not

normally distributed, nonparametric tests were utilized

where required.
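The handling of difference scores and no-response conditions can be sketched as below. The function names and the example values are hypothetical; only the rules themselves (ASSR minus behavioral threshold, exclusion of no-response cases from averages, and assignment of a value 10 dB above the highest level for repeated measures comparisons) come from the description above.

```python
def difference_score(assr_db_hl, behavioral_db_hl):
    """ASSR threshold minus behavioral threshold, in dB."""
    return assr_db_hl - behavioral_db_hl


def impute_no_response(assr_db_hl, highest_level_db_hl, penalty_db=10):
    """Replace a missing threshold (None) for repeated measures comparisons."""
    if assr_db_hl is None:
        return highest_level_db_hl + penalty_db
    return assr_db_hl


def mean_excluding_missing(values):
    """Mean of the available values; no-response cases (None) are left out."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical thresholds for three ears at one frequency.
assr = [40, None, 50]
behavioral = [25, 70, 40]
scores = [difference_score(a, b) if a is not None else None
          for a, b in zip(assr, behavioral)]
mean_score = mean_excluding_missing(scores)
experiment1_values = [impute_no_response(a, 80) for a in assr]
```

The same imputation idea applies to test time in Experiment 2, where no-response conditions were assigned 46 sweeps.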

RESULTS

Experiment 1: Independent Threshold

Assessment by Analysis Method

ASSR versus Behavioral Thresholds

Mean ASSR thresholds for the FFT and FLC

methods as recorded independently are shown in

Table 1 for the NH and HI groups for each stimulus

frequency. The shaded rows in Table 1 provide the

mean difference scores, that is, the mean difference

amount that ASSR thresholds are raised over behav-

ioral thresholds for each group. Table 1 reveals a trend

for higher ASSR thresholds, and correspondingly

larger difference scores, for the FLC method over the

FFT method. Friedman Repeated Measures ANOVA

(analysis of variance) on Ranks showed that ASSR

thresholds were significantly higher for the FLC

method compared to the FFT method at 500 Hz (p = 0.009)

and 1000 Hz (p < 0.001), but there was no

significant difference by method for 2000 Hz (p = 0.133)

or 4000 Hz (p = 0.961). There was not a significant

difference in ASSR-behavioral threshold difference

scores for the NH compared to the HI subjects for

either the FLC or the FFT method at any frequency

when compared by Wilcoxon signed-rank test (p > 0.05

for all comparisons).

The top two panels of Figure 1 show the relation-

ships between ASSR and behavioral thresholds for the

FFT and FLC methods under independent recording

conditions in Experiment 1. Data for all four stimulus

frequencies are combined, and symbols may represent

multiple overlapping data points. The solid line in each

panel of Figure 1 represents equal ASSR and behavioral

thresholds. As expected, ASSR thresholds were gener-

ally higher than behavioral thresholds. Thresholds

falling above the solid line (ASSR threshold lower than

behavioral threshold) were considered to be false

positives for the purpose of this study. By these criteria,

there were eight false positives for the FFT method

(seven of these in the HI group) and only four false

positives for the FLC method (three in the HI group).

Although there were fewer false positives for the FLC

method, the largest false positive (20 dB) was obtained

with this method for an HI subject at 2000 Hz.

The dotted line in each panel of Figure 1 shows the

regression between ASSR and behavioral thresholds.



There was a strong overall correlation between ASSR

and behavioral thresholds for both methods under independent recording conditions, with Pearson correlation

coefficients for all frequencies combined of 0.83

and 0.81 for the FFT and FLC methods respectively.

Table 1 lists Pearson correlation coefficients between

ASSR and behavioral thresholds by individual carrier

frequency. Statistical comparison using 95% confidence

intervals around the r-values indicated that the correlation

at 500 Hz was significantly poorer than for all

other frequencies for the FFT method (p < 0.001). The

correlation at 500 Hz for the FLC method was significantly

poorer than for 2000 and 4000 Hz (p < 0.001),

although not significantly different from the correlation

at 1000 Hz for the FLC (p = 0.075). Correlations were

not significantly different at any of the four frequencies

for FFT compared to FLC (p > 0.05), however.
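
The interval-based comparison of correlations described above can be illustrated with a short sketch. It assumes the common Fisher z-transform approach to building a 95% confidence interval around a Pearson r; the article does not state which procedure was used, so this is only one plausible reading. The function names and the sample size of 100 are hypothetical placeholders, not values from the study.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r via Fisher's z.

    Assumption: Fisher's transform is used; the source does not say so.
    """
    z = math.atanh(r)                # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)      # standard error of z
    low, high = z - z_crit * se, z + z_crit * se
    return math.tanh(low), math.tanh(high)   # back-transform to the r scale

def intervals_overlap(ci_a, ci_b):
    """True when two (low, high) intervals share any values."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Overall correlations reported for Experiment 1; n = 100 is a placeholder.
ci_fft = fisher_ci(0.83, 100)
ci_flc = fisher_ci(0.81, 100)
print(ci_fft, ci_flc, intervals_overlap(ci_fft, ci_flc))
```

Overlapping intervals are consistent with, but do not by themselves prove, the absence of a significant difference between two correlations.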

An overall mean difference score was calculated for

each ear by averaging the difference scores at each of

the four stimulus frequencies in each ear. Figure 2

compares mean difference scores for the FFT and FLC

methods for the independent recording conditions in

Experiment 1. The solid line in this figure indicates

equal difference scores for the two methods, while the

dotted line indicates the regression fit to the data.

Average difference scores were significantly correlated

between analysis methods, with Pearson correlation

coefficient of r = 0.74. Mean difference scores were not

significantly different between the two methods when

evaluated by paired t-test (p = 0.102).
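
As a small worked illustration of the mean difference score, the sketch below subtracts the behavioral threshold from the ASSR threshold at each of the four frequencies and averages the result for one ear. The threshold values and function names are hypothetical and are not study data.

```python
FREQS_HZ = (500, 1000, 2000, 4000)

def difference_scores(assr_db, behavioral_db):
    """ASSR threshold minus behavioral threshold (dB) at each frequency."""
    return {f: assr_db[f] - behavioral_db[f] for f in FREQS_HZ}

def mean_difference_score(assr_db, behavioral_db):
    """Average of the four per-frequency difference scores for one ear."""
    scores = difference_scores(assr_db, behavioral_db)
    return sum(scores.values()) / len(scores)

# Hypothetical thresholds in dB HL for one ear (illustrative only).
behavioral = {500: 10, 1000: 5, 2000: 15, 4000: 40}
assr_fft = {500: 30, 1000: 20, 2000: 30, 4000: 50}
print(mean_difference_score(assr_fft, behavioral))  # 15.0
```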

Total Test Time

Mean total test time for all subjects (NH and HI

subjects combined) required to complete threshold

testing for all frequencies in both ears was 46.1 ±

15.2 min for the FFT method and 43.6 ± 12.2 min for

the FLC method. Overall, Friedman Repeated Mea-

sures ANOVA on Ranks showed that the total test

times were not significantly different between the FFT

and FLC methods in independent test sessions (p =

0.739). As can be seen in Figure 3, when broken down

by subject group, total test times were shorter for both

methods for the NH subjects compared to HI subjects.

The test time for NH subjects was slightly shorter on

average for the FFT method (38.0 ± 10.5 min) than for

the FLC method (39.2 ± 10.3 min). Test times were

longer on average for the HI subjects by the FFT

method (54.9 ± 13.5 min) compared to the FLC method

(47.4 ± 12.9 min). Kruskal-Wallis one-way ANOVA on

ranks with Dunn’s method of post-hoc comparison

showed that it took longer to complete threshold

testing for the HI subjects using the FFT method than

for NH subjects using either the FFT or the FLC

method (p < 0.05). No other comparisons for total time

by group or method were statistically significant.

The results for Experiment 1, therefore, showed that

under independent test conditions ASSR thresholds

were significantly higher in the low frequencies for the

FLC method compared to the FFT method. However,

correlations between ASSR and behavioral thresholds

were similar for the two analysis methods, and the

average difference score across frequencies was well

correlated between the two methods. The total time

needed to complete testing for the entire audiogram

was longer for HI compared to NH subjects but was not

significantly different between methods.

Experiment 2: Simultaneous Comparison of ASSR Detection by Analysis Method

ASSR versus Behavioral Thresholds

ASSR thresholds were also determined under simul-

taneous recording conditions. Recall that the highest

Table 1. Mean ASSR Thresholds and Difference Scores, Experiment 1



stimulus level for the simultaneous condition was 60 dB

HL. Table 2 shows the mean ASSR thresholds and

difference scores for the simultaneous condition. When

compared under the same recording conditions, ASSR

thresholds were significantly higher for the FLC condition

at 500 Hz (p < 0.001), 1000 Hz (p = 0.013), and

2000 Hz (p < 0.001), but not at 4000 Hz (p = 0.670) by

Friedman Repeated Measures ANOVA on Ranks. The

significant difference at 2000 Hz, however, is likely due

to the smaller number of HI subjects with measurable

thresholds at this frequency. Because of the 60 dB HL

upper limit for the simultaneous condition, thresholds

were obtained in a limited number of HI subjects. Mean

ASSR thresholds, therefore, appear to be better for the

simultaneous than the independent condition (Table 1)

for each method, but only due to the number of subjects

not included in the simultaneous average. This is

particularly true at 4000 Hz, where only 10 and 11 ears

contributed to the average for the FFT and FLC methods

respectively, while for the independent recordings,

ASSR thresholds could be determined for 28 and 25 ears

respectively.

The relationship between ASSR thresholds obtained

during simultaneous recording and the behavioral

Figure 1. Scatterplots representing the relationship between individual ASSR thresholds in dB HL obtained for the FFT (left column) and FLC methods (right column) and behavioral thresholds in dB HL. Thresholds obtained for all four stimulus frequencies and both ears for all subjects are included, and symbols may represent multiple overlapping data points. The top two panels show results from independent recordings in Experiment 1, while the bottom two panels show data obtained under simultaneous recording conditions in Experiment 2. Data were not included for any individuals/frequencies where threshold exceeded the upper limits of stimulation (80 dB HL for Experiment 1 and 60 dB HL for Experiment 2). Solid diagonal lines in each panel represent equal ASSR and behavioral thresholds. Dotted lines show the regression of all data in each panel. Pearson correlation coefficients and the regression equations for each are indicated in the upper left of each panel.



thresholds for each individual are shown in the bottom

two panels of Figure 1. The correlation between ASSR

and behavioral thresholds was poorer for the simultaneous

recording conditions (r = 0.74 and 0.72 for FFT

and FLC respectively) than for the independent

recordings (top panels), likely due to missing data

and restrictions placed on this recording method.

Under simultaneous recording conditions, correlations

were weakest for 500 Hz and strongest at 2000 Hz for

both the FFT and the FLC methods (Table 2). As with

the independent recordings, there were no significant

differences in the strength of the correlation for FFT

compared to FLC for any of the frequencies for the

simultaneous comparison (p > 0.05 for all comparisons).

Detected ASSRs by Analysis Method

While comparisons of ASSR thresholds between

methods are limited under the simultaneous recording

condition, this method allows for a direct comparison of

the ASSR detection rate by method under the same

subject and recording conditions. Each subject had the

possibility of 48 present ASSRs during the entire test

session if all intensity levels were tested and present

ASSRs were detected at all levels and frequencies (six

intensity levels × four frequencies × two ears).

Because not all subjects were tested at all levels, some

subjects had fewer possible ASSRs. Figure 4 shows the

number of detected (statistically significant) ASSRs for

each individual subject for both the FFT (filled

symbols) and FLC (open symbols) methods, in individ-

ual order by subjects from the lowest to highest number

of detected ASSRs within each subject group. The

number of ASSRs detected during the total test session

per subject ranged from 2 to 45. For almost all subjects,

the number of ASSRs detected by the FFT method was

higher than the number detected by the FLC method.

The inset of Figure 4 shows the average number of

detected ASSRs by group and analysis method.

Figure 2. Scatterplot of the relationship between the mean difference scores for each individual ear for the FFT and FLC methods. Difference scores were calculated by subtracting behavioral threshold in dB HL from ASSR threshold in dB HL. The mean difference score was obtained for each ear tested by averaging the difference score for 500, 1000, 2000, and 4000 Hz. The solid diagonal line represents equal mean difference scores for the two methods. The dotted line indicates the regression for all data in the figure, and the Pearson correlation coefficient is indicated in the upper left of the panel.

Figure 3. Box plots of the total time required to complete threshold testing for all frequencies in both ears by test method and subject group for Experiment 1. Outer limits of each box represent the 25th and 75th percentiles, with the median shown as the line within the box. Whiskers (error bars) indicate the 10th and 90th percentiles, with filled circles showing the 5th and 95th percentiles. Results for the NH subjects are shown in the left panel, and the HI subjects in the right panel.



Figure 5 shows the detection rate in terms of

percentage of present ASSRs out of the total possible

as a function of intensity level for each carrier

frequency. For the NH subjects (filled symbols), the

percentage of detected ASSRs increased steeply with

intensity level for both the FFT (circles) and FLC

(triangles) methods. The FLC method detected an

equal or slightly lower percentage of ASSRs than the

FFT method for most comparisons, although the

difference between methods was generally greatest at

the highest intensity levels (60 dB HL). For the HI

subjects, a smaller percentage of ASSRs was detected

as compared to the NH subjects, particularly at 2000

and 4000 Hz due to the fact that behavioral thresholds

were elevated in this group. The percentage of detected

ASSRs for the HI subjects tended to be higher

for the FFT method over the FLC method, although

this pattern was less consistent across intensities for

some frequencies (e.g., 1000 Hz).

Despite these trends for a higher ASSR detection

rate for the FFT method over the FLC method, chi-

square tests revealed no significant difference in the

Figure 4. The total number of present ASSRs (significant responses) detected per individual subject for the FFT (filled symbols) and the FLC (open symbols) methods for Experiment 2. Subjects with ASSRs at all frequencies in both ears at all stimulus levels would have a possible 48 present ASSRs. Subjects are sorted in order from fewest to most present ASSRs in each subject group. The inset of the figure shows the mean and standard deviation of the number of present ASSRs for the FFT (black bars) and FLC methods (gray bars) for the NH and HI subjects.

Table 2. Mean ASSR Thresholds and Difference Scores, Experiment 2



number of present ASSRs between the FFT and FLC

methods by subject group (χ² = 0.586, df = 1, p =

0.444), test frequency (χ² = 0.0572, df = 3, p = 0.996),

or intensity level (χ² = 0.911, df = 5, p = 0.969) for

these simultaneous comparisons.

Because behavioral thresholds varied among sub-

jects, the detection rate for the two methods was also

compared by examining results at comparable intensi-

ties in terms of equal sensation levels (dB SL). Due to

the fact that ASSR was collected in 10 dB steps, while

behavioral thresholds were in 5 dB steps, “equal”

sensation levels were considered to be within a range

of 5 dB. For the frequencies of 1000–4000 Hz, the

percentage of detected ASSRs was evaluated at either

15 or 20 dB SL for all subjects. Because ASSRs at

500 Hz tend to be further elevated above threshold

relative to the higher frequencies, detection rates were

evaluated for levels of 25 or 30 dB SL. When analyzed

at this low and equal SL, a slightly higher percentage

of ASSRs were detected by the FFT method (60, 65, 70,

and 60% for 500, 1000, 2000, and 4000 Hz respectively)

than the FLC method (60% at 500, 1000, and 2000 Hz;

50% at 4000 Hz) for the NH subjects. For HI subjects,

ASSRs were detected at a low SL in 70, 65, 72, and 36%

of cases for the FFT method compared to 68, 57, 59,

and 38% for the FLC method. Chi-square tests

revealed, however, that there were no significant

differences in detection rate by method at a low SL

Figure 5. The percentage of detected ASSRs by stimulus frequency and intensity level across subjects for each analysis method. Filled symbols represent the NH subjects, while open symbols represent HI subjects for the FLC method (circles) and the FFT method (triangles). Percentages were calculated as the number of detected ASSRs out of the total number tested across subjects.



for either the NH (χ² = 0.236, df = 3, p = 0.971) or the

HI subjects (χ² = 0.197, df = 3, p = 0.978), nor was

there a significant difference by subject group overall

(χ² = 0.005, df = 1, p = 0.941).
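
The matching of test levels to "equal" sensation levels can be sketched as follows. This is a minimal illustration that assumes the simultaneous recordings used six levels from 10 to 60 dB HL in 10 dB steps (the text gives six levels, 10 dB steps, and a 60 dB HL maximum, but not the list itself). The function names are hypothetical.

```python
# Assumed test levels: six 10 dB steps ending at the 60 dB HL upper limit.
ASSR_LEVELS_DB_HL = (10, 20, 30, 40, 50, 60)

def target_sl(freq_hz):
    """Sensation levels (dB SL) treated as 'equal' for a carrier frequency."""
    return (25, 30) if freq_hz == 500 else (15, 20)

def level_at_equal_sl(freq_hz, behavioral_db_hl):
    """Return the test level (dB HL) whose SL is in the target pair, or None."""
    for level in ASSR_LEVELS_DB_HL:
        if level - behavioral_db_hl in target_sl(freq_hz):
            return level
    return None

print(level_at_equal_sl(1000, 5))    # 20 (15 dB SL)
print(level_at_equal_sl(500, 10))    # 40 (30 dB SL)
print(level_at_equal_sl(4000, 50))   # None (no tested level reaches 15 dB SL)
```

Because behavioral thresholds fall on 5 dB steps and test levels on 10 dB steps, at most one tested level can land in each 5 dB target pair.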

Along with the overall number of present ASSRs, it

is important to know whether the two methods

obtained the same result (present or absent ASSR)

for a particular test. For example, for an individual

subject, when testing 500 Hz in the left ear at

30 dB HL, is the same result obtained by the FFT and

FLC methods? The agreement between the FFT and

FLC methods by intensity level for each carrier/

modulation frequency was analyzed by calculating

whether (1) both methods detected a present, statisti-

cally significant ASSR, (2) both methods failed to

detect an ASSR, or (3) a present ASSR was found by

one method while no ASSR was detected by the other

method. Overall, the test outcome of present or absent

ASSR was in agreement between the two methods for

89% of tests. The rates of agreement and disagreement

can be seen in Figure 6 for the NH (left panels) and HI

(right panels) groups. For the NH subjects, the two

methods agreed in 91% of cases overall. Disagreements

between the methods were found in 7.5, 11.3, 7.9, and

6.7% of cases for 500, 1000, 2000, and 4000 Hz

respectively.
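
The three-way classification of test outcomes described above can be sketched in a few lines. The sketch assumes each test is stored as a pair of booleans (FFT present, FLC present); the data and names below are hypothetical and only show how a percentage agreement would be tallied.

```python
from collections import Counter

def classify(fft_present, flc_present):
    """Label one test by the outcome of both analysis methods."""
    if fft_present and flc_present:
        return "both_present"
    if not fft_present and not flc_present:
        return "both_absent"
    return "disagree"

def agreement_rate(tests):
    """Return outcome counts and the percentage of tests in agreement."""
    counts = Counter(classify(fft, flc) for fft, flc in tests)
    agree = counts["both_present"] + counts["both_absent"]
    return counts, 100.0 * agree / len(tests)

# Hypothetical outcomes for four frequency/level tests (illustrative only).
tests = [(True, True), (True, False), (False, False), (True, True)]
counts, pct = agreement_rate(tests)
print(dict(counts), pct)
```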

The number of ASSRs detected was lower in the HI

subjects overall due to fewer intensity levels tested as

well as absent ASSRs due to elevated hearing thresh-

olds. Results from the FFT and FLC methods still

agreed in the majority of cases (87%). The percentage of

disagreements was higher, however, for the HI subjects

compared to the NH subjects, with 10.1, 13.8, 12.4, and

11.9% of cases for 500, 1000, 2000, and 4000 Hz. Three-way

ANOVA (group × frequency × intensity) with

Holm-Sidak pairwise comparisons showed that the

number of disagreements did not vary significantly

based on carrier frequency (F = 1.678, df = 3, 15, p =

0.214) or intensity level (F = 0.654, df = 1, 15, p =

0.663), but the number of disagreements between

methods was significantly greater for the HI group over

the NH group (F = 4.553, df = 5, 15, p = 0.050).

It is possible that disagreements between methods

could relate to differences in false positive or false

negative response rates between methods. For the

purposes of this study, a false positive response was

defined as a statistically significant ASSR detected at

any level below the behavioral threshold, or one that

occurred more than 20 dB (2 test levels) below the

next-highest phase-locked ASSR. Out of a total of 960

possible tests, there were seven false positive respons-

es detected in the NH subject group (0.73%), including

four for the FFT method and three for FLC. There were

37 false positives out of 872 tests in the HI group

(4.24%), 22 by FFT, and 15 by FLC. Three of these false

positives occurred for both FFT and FLC, which

resulted in agreements between methods. Therefore,

it does not appear that there was increased between-

method variability due to false positives.
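
One way to read the false positive definition above is sketched below. The sketch flags a detected response if its level is below the behavioral threshold, or if the next-highest detected response is more than 20 dB above it. How a response with no detected response above it should be handled is not stated in the text; here it is assumed that the gap rule does not apply in that case. All names and data are hypothetical.

```python
def false_positive_levels(detected, behavioral_db_hl, max_gap_db=20):
    """Return levels (dB HL) whose detections meet either false-positive rule.

    detected maps each tested level in dB HL to True (significant ASSR)
    or False (no significant ASSR).
    """
    present = sorted(level for level, hit in detected.items() if hit)
    flagged = []
    for i, level in enumerate(present):
        below_behavioral = level < behavioral_db_hl
        # Gap rule: compare with the next-highest detected response, if any.
        # Assumption: the highest detected response is never flagged by it.
        isolated = i + 1 < len(present) and present[i + 1] - level > max_gap_db
        if below_behavioral or isolated:
            flagged.append(level)
    return flagged

# Hypothetical ear: behavioral threshold 25 dB HL, stray response at 10 dB HL.
detected = {10: True, 20: False, 30: False, 40: True, 50: True, 60: True}
print(false_positive_levels(detected, behavioral_db_hl=25))  # [10]
```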

There were also a few instances of absent ASSRs at

intensity levels well above behavioral threshold and/or

despite present ASSRs at lower intensity levels (e.g.,

absent ASSR at 60 dB HL and present ASSRs at 50, 40,

and 30 dB HL). These cases could be considered false

negatives. If it is assumed that, for example, all NH

subjects should have a present ASSR at 60 dB HL, false

negatives occurred in 18 cases or 11.25% for the FLC

method, and only 5 cases or 3.13% for the FFT method.

For example, no significant ASSRs were detected for

NH subject A4 to the 1000 Hz stimulus by the FLC

method at any of the tested intensity levels in the right

ear despite a behavioral threshold of 0 dB HL at this

frequency. ASSRs were obtained at 40, 50, and 60 dB

HL at this same frequency by the FFT method.

However, no ASSRs were detected for this subject at

60 dB HL for 4000 Hz in the right ear by either FLC or

FFT but were detected at lower levels. This subject’s

noise floor started out fairly high and did not fall under

10 nV at any point during the recording session.

It is possible, therefore, that false negatives may

have been due to higher subject noise levels at the

beginning of the test session when the highest

intensity levels were tested. However, subjects with

false negatives at 60 dB had a mean noise floor of 11.1

± 4.0 mV, while those without any false negatives at

60 dB had a mean noise floor of 13.6 ± 3.7 mV. There

was also not a significant correlation between individ-

ual subject noise floor and the number of agreements

between the two methods for either NH (p = 0.080, r =

0.16) or HI subjects (p = 0.558, r = 0.06). In general,

therefore, any difference in results between the two

methods did not appear to relate to the noise floor of

the individual subject.

Time per Detected ASSR by Analysis Method

Comparing the time per detected ASSR during

simultaneous recording is complicated both by the

differences between methods including sweep length

and by the decision criteria used to determine present

ASSR. The FLC method updated statistical results in

1 sec increments, while FFT data are in sweeps of

13.65 sec. Unlike the FLC method implemented by the

RapidASSR system, the MASTER system and FFT

method were not stopped automatically when response

criteria first reached significance. The current data

were collected using a minimum of 15 sweeps for the

FFT method, which results in considerably longer

averaging times per ASSR than the FLC method.

Instead of requiring 15 sweeps, FFT data could be

analyzed in terms of the first significant sweep, or after

a required number of consecutive significant sweeps.



Figure 7 shows a comparison between the recording

times for each individual ASSR detected by the FFT

and FLC methods in terms of sweeps using two types of

FFT criteria. In order to compare the two methods

more directly, the FLC data were rounded up to the

nearest “sweep” of 13.65 sec. That is, if the FLC

method obtained a present ASSR at 15 sec, it was

considered to be two sweeps. The converted FLC data

were compared to the FFT data after the first

significant sweep and the fifth consecutive significant

sweep to compare the time per detected ASSR. For

data to be included in this comparison, both the FFT

and FLC methods resulted in a present ASSR for the

carrier frequency and intensity level tested (e.g., one

Figure 6. Agreement in response presence or absence between the FFT and FLC methods by intensity level for the NH (left panels) and the HI (right panels) at each frequency in Experiment 2. Black portions of the bars represent the number of cases where the present ASSRs were detected by both methods. Light gray areas indicate the number of cases where ASSRs were found to be absent (no response) by both analysis methods. The dark gray areas at the top of the bars represent cases of disagreement, where a present ASSR was detected by one method and no response was detected by the other method at a particular frequency/intensity.



data point represents subject A1 at 500 Hz, 60 dB HL,

left ear). The gray shaded area in Figure 7 represents

response detection times within five sweeps (68.24 sec)

between the two methods.
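
The conversion of FLC detection times to whole sweeps can be written as a short sketch. It uses the 13.65 sec sweep length given in the text and rounds up, so a detection at 15 sec counts as two sweeps. Treating a difference of exactly five sweeps as "within five sweeps" is an assumption, and the function names are hypothetical.

```python
import math

SWEEP_SEC = 13.65  # FFT sweep length reported in the text

def flc_time_to_sweeps(seconds):
    """Round an FLC detection time up to a whole number of FFT sweeps."""
    return math.ceil(seconds / SWEEP_SEC)

def within_five_sweeps(fft_sweeps, flc_seconds, window=5):
    """True when the two detection times differ by at most `window` sweeps."""
    return abs(fft_sweeps - flc_time_to_sweeps(flc_seconds)) <= window

print(flc_time_to_sweeps(15))      # 2
print(within_five_sweeps(8, 40))   # True (8 sweeps versus 3 sweeps)
```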

The filled symbols in Figure 7 show the time when

the ASSR first became significant by the FFT method,

called “FFT (1st),” compared to the time rounded to the

next highest sweep when significance criteria were met

for the FLC method. Almost all data points lie below

the solid line, indicating that the first significant

sweep for the FFT method was earlier than when the

response reached statistical criteria for the FLC

method. The time for FFT (1st) was significantly faster

than the time for the FLC method for all test

frequencies (p , 0.05 for each) by Friedman Repeated

Measures ANOVA on Ranks. However, the ASSR often

did not remain significant for subsequent sweeps for

the FFT, and this first significant response would be

considered an error if it did not remain for five

consecutive sweeps. The percentage of cases where

the first significant FFT response resulted in an error

ranged from 19% at 500 Hz to 39% at 4000 Hz. The use

of the first significant sweep for the FFT method as

criterion would not be recommended; however, this

comparison highlights that the time when the FLC

analysis became significant did not correspond to the

first significant sweep for the FFT.
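
The two FFT stopping criteria can be contrasted with a brief sketch. It assumes the per-sweep results are available as a list of booleans (one per sweep, True when the response was significant after that sweep). That data layout and the function names are hypothetical, and the example sequence is not study data.

```python
def first_significant(flags):
    """1-based index of the first significant sweep, or None."""
    for i, sig in enumerate(flags, start=1):
        if sig:
            return i
    return None

def first_sustained(flags, run=5):
    """1-based index of the sweep completing the first run of `run`
    consecutive significant sweeps, or None if no such run occurs."""
    streak = 0
    for i, sig in enumerate(flags, start=1):
        streak = streak + 1 if sig else 0
        if streak >= run:
            return i
    return None

def first_is_error(flags, run=5):
    """True when the first significant sweep does not begin a full run."""
    start = first_significant(flags)
    if start is None:
        return False
    window = flags[start - 1:start - 1 + run]
    return len(window) < run or not all(window)

# Hypothetical sweep-by-sweep significance for one frequency and level.
flags = [False, True, False, True, True, True, True, True]
print(first_significant(flags), first_sustained(flags), first_is_error(flags))
# 2 8 True
```

In this example the response first reaches significance at sweep 2 but is not sustained, so the five-consecutive criterion is not met until sweep 8.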

The second criterion evaluated was the requirement

for five consecutive significant sweeps for FFT, called

“FFT (5sig),” for which the minimum time for the FFT

Figure 7. The relationship between the time per detected ASSR for the FLC method and the FFT method under simultaneous comparisons in Experiment 2. FLC times were converted to “sweeps” by rounding to the nearest FFT sweep length of 13.65 sec. Filled symbols indicate the time in sweeps for the first significant response by the FFT method. Open symbols show the time in sweeps for the FFT method to meet the requirement of five consecutive significant sweeps. The solid diagonal line in each panel represents equal times in sweeps for the two methods, while the gray shaded area includes results for the FFT and FLC methods that are within five sweeps of each other.



method would be five sweeps (68.24 seconds). The time

per detected ASSR for the FLC method correlated more

strongly with the FFT (5sig), as shown by the open

symbols in Figure 7, than for FFT (1st). If data points

falling within the gray area are considered to represent

comparable times between methods (within five

sweeps or just over 1 min), the times were similar for

between 74% (500 Hz) and 82% (4000 Hz) of cases.

Overall, the FLC method detected the ASSR faster

than FFT (5sig) 61% of the time; however, in only 5% of

the total cases was the FLC method faster by over five

sweeps (shown by data points above the diagonal

shaded area). The FFT (5sig) method was faster in 32%

of cases overall, and in 16% of the total comparisons

the FFT (5sig) method was faster by over five sweeps.

The time in sweeps per detected ASSR was signifi-

cantly shorter for the FLC method compared to the

FFT (5sig) criteria for all four carrier frequencies

across intensity levels by Friedman Repeated Measures

ANOVA on Ranks (p < 0.05 for each). However,

the FLC method and the FFT (5sig) method were

within five sweeps of each other in 79% of cases overall,

meaning present ASSRs were detected by the two

methods within just over 1 min of each other.

Note that considering the criterion for a minimum of

15 sweeps for the FFT method would result in all data

points below 15 sweeps in Figure 7 being moved up to

this value. The time difference for the FLC compared

to the FFT 15 sweep requirement was significant for

all test frequencies (p < 0.05), with the FLC method

faster 76% of the time overall and over five sweeps faster

in 61% of these cases. The choice of stopping criteria for

the FFT method had an effect not only on the time

taken to detect a response but also on whether an

ASSR would be considered present at a given intensity

level. There were cases in which the criteria of five

consecutive significant sweeps resulted in a different

decision regarding ASSR presence/absence than if a

minimum of 15 sweeps were also required. In total,

this happened in only 13 instances out of a total of

1,920 individual tests (including all subjects/frequen-

cies/levels/ears) during the simultaneous recordings.

There were 37 instances in which there were five

consecutive significant sweeps at some point during

the test, followed by nonsignificant sweeps. In 20 of

those cases, the ASSR threshold would have changed

depending on whether recording would have stopped

after the first five consecutive sweeps.

In summary, the results for the simultaneous

recordings in Experiment 2 in which test conditions

were exactly the same for the two methods revealed

few significant differences between the FFT and FLC

methods. As with the independent recordings in the

first experiment, ASSR thresholds were found to be

significantly higher for the FLC method in the low

frequencies. Mean ASSR thresholds for the FLC

method were also found to be significantly higher at

2000 Hz for this experiment, although this result is

likely influenced by the reduced upper intensity limit

and correspondingly fewer ASSR thresholds obtained

for HI subjects at this frequency. The FFT method

tended to detect a greater number of ASSRs overall,

although this trend did not reach significance by

intensity, frequency, or subject group. In terms of time

per detected ASSR, the first significant sweep for the

FFT method was significantly faster than the time for

a significant ASSR by the FLC method, although this

criterion resulted in a large number of errors. If

criteria for five consecutive significant sweeps for

FFT were used, with or without a 15-sweep minimum,

the FLC method was significantly faster in detecting

an ASSR. However, in the majority of cases the two

methods detected ASSRs within a few sweeps, or just

over 1 min of each other.

Experiment 3: Automated Threshold Search Protocol Using the FLC Method

For Experiment 3, an automated threshold search

protocol implemented by the RapidASSR system using

the FLC analysis method was evaluated. Because this

search protocol allowed for independent changes of

stimulus level by frequency and in each ear, the

automated algorithm has the potential to complete

threshold testing across frequencies in a more time-

efficient manner. The automated threshold search was

implemented as described in the methods section for 36

of the participants who had completed Experiments 1

and 2 (17 NH and 19 HI).

Figure 8 shows ASSR thresholds obtained using the

FLC automatic search protocol compared to behavioral

thresholds. The overall correlation of r = 0.74 is poorer

than those obtained with either the FLC or FFT

independent recordings in Experiment 1 (Figure 1).

Correlation coefficients of r = 0.63, 0.71, 0.88, and 0.77

were found for 500, 1000, 2000, and 4000 Hz respec-

tively. With the exception of 500 Hz, these were lower

than those found in Experiment 1. In addition,

Figure 8 shows that in a few instances ASSR thresh-

olds underestimated behavioral thresholds by a large

amount (up to 45 dB at 4000 Hz in one case).

The correlation between ASSR-behavioral threshold

difference scores for the automated search compared to

the independent threshold search in Experiment 1 was

r = 0.60 overall. Mean difference scores were found to

be 27.3 ± 13.3, 21.97 ± 13.6, 19.8 ± 10.8, and 15.4 ±

15.6 across all subjects (NH and HI combined) for 500,

1000, 2000, and 4000 Hz respectively. Although mean

difference scores were generally similar for the

automated search as compared to the FFT or FLC

independent recordings in Experiment 1, standard

deviations were slightly larger, indicating higher



variability for this experiment. Friedman repeated

measures ANOVA on ranks with post-hoc Tukey tests

showed that ASSR thresholds for the automated

search did not differ significantly from those obtained

using the FLC method in Experiment 1 at any of the

four frequencies (p > 0.05 for all). However, ASSR

thresholds for the FLC automated search were signif-

icantly higher than the FFT thresholds in Experiment

1 at 500, 1000, and 2000 Hz (p < 0.05).

While the correlation between ASSR and behavioral

thresholds was slightly poorer for the automated

search, the total test time was significantly decreased

compared to Experiment 1. Figure 9 shows box plots of

the total test time for the NH and HI subjects. The

mean test time for the NH group was 24.9 ± 11.0 min,

while the total time for the HI subjects was 27.5 ±

6.6 min. When compared to the test times shown in

Figure 3, there was a significant effect of the type of

search protocol on total test time by Friedman

Repeated Measures ANOVA on Ranks (p < 0.001).

The total test time for the automated protocol was

significantly faster than either the FLC or the FFT

independent tests (p < 0.05 each).

The results of Experiment 3 show that the FLC

automated search protocol with independent adjust-

ment of intensity for each carrier frequency was

significantly faster than a descending threshold search

in which criteria had to be met for all frequencies at a

certain intensity level before testing at the next level in

the series. However, results were slightly more

variable with poorer correlations between ASSR and

behavioral thresholds as compared to Experiment 1.

DISCUSSION

In this study, two multiple-frequency ASSR analysis

methods were compared under independent and

simultaneous test conditions in normal-hearing and

hearing-impaired adults. ASSR thresholds obtained by

both methods, as well as the relationship between

ASSR and behavioral thresholds, were found to be in

line with those reported in the literature for multiple-

frequency ASSR. The mean difference scores in

Tables 1 and 2 are similar to those reported in many

studies using multiple-frequency ASSR (Picton et al,

1998; Picton et al, 2001; Vander Werff and Brown,

2005), although higher than those reported in others

(Lins et al, 1996; Herdman and Stapells, 2001; Perez-

Abalo et al, 2001; Dimitrijevic et al, 2002; Luts et al,

2004). Across studies in the literature, errors in

estimating audiometric threshold using the ASSR tend

to be greater in normal-hearing individuals compared

to those with hearing loss. The reader is referred to

several published reviews on this topic (Herdman and

Stapells, 2003; Picton et al, 2003; Tlumak et al, 2007;

Vander Werff et al, 2008). In the current study,

difference scores and standard deviations were similar

between the NH and HI groups for both the FFT and

FLC methods. This result is likely due to the limited

range of hearing loss of the subjects in the HI group.

Most subjects had normal or nearly normal thresholds

in the low frequencies, and the degree of hearing loss

was limited to accommodate the upper test limit of 60 dB

HL for the simultaneous comparison.

Figure 8. Scatterplot representing the relationship between behavioral thresholds in dB HL and individual ASSR thresholds in dB HL obtained using the automated search protocol for the FLC method in Experiment 3. Format is the same as for Figure 1.

Figure 9. Box plots of the total time required to complete threshold testing for all frequencies in both ears using the automated search protocol in Experiment 3. Box format is the same as for Figure 3.



Most of the multiple-frequency ASSR studies in the

literature have utilized a method of time-domain

averaging and FFT analysis. The current study is the

first report of ASSR thresholds obtained using a

commercial implementation of an adaptive filtering

algorithm (FLC) to analyze the EEG signal, rather than

time-domain averaging and FFT analysis. When tested

independently, ASSR thresholds tended to be higher for

the FLC method as implemented by the RapidASSR

system compared to those obtained using the FFT

method implemented by the MASTER system. The

variability in mean thresholds was similar for the two

methods. The difference in thresholds between meth-

ods was found to be significant for the low frequencies,

where ASSR thresholds obtained using the FLC

method were higher but there was no significant

difference for the high frequencies. Average difference

scores across frequencies were the same for the two

methods, however. When tested under simultaneous

recording conditions, the ASSR thresholds remained

significantly higher for the FLC method at 500 and

1000 Hz. Although there was also a significant

difference at 2000 Hz under the simultaneous condi-

tion, this result was likely influenced by the small

number of thresholds obtained in the HI subjects at

this frequency.

Correlations between behavioral thresholds and

ASSR thresholds, however, were not significantly

different between the FFT and FLC methods. For both

the FFT and FLC methods, the poorest correlations

were obtained for 500 Hz. The 500 Hz correlations of 0.53

and 0.59 for the FFT and FLC methods respectively in

the current study are somewhat lower than those

reported in previous studies, although consistent with

trends across ASSR studies in the literature for the

poorest correlations and highest variability at 500 Hz

(Dimitrijevic et al, 2002; Herdman and Stapells, 2003;

Hsu et al, 2003; Van Maanen and Stapells, 2005;

Vander Werff and Brown, 2005).

These results overall suggest that ASSR thresholds

obtained using the FLC method estimate the behav-

ioral audiogram as accurately as those obtained using

the FFT method. While ASSR thresholds were signif-

icantly higher in the low frequencies for the FLC

method compared to the FFT method, it is important to

note that the relative difference in ASSR thresholds

and difference scores for FLC compared to FFT was

generally small. Under simultaneous recording condi-

tions, the difference scores for the FFT and FLC

methods were the same for 65% of cases, and within

10 dB (the minimum step size) of each other for 94%

of ears tested. When independent recordings were

compared, the variability increased slightly. Differ-

ence scores for the FFT and FLC methods were

within 10 dB of each other for 68% of cases, and within

20 dB for 92%. This higher variability would be

expected due to differences in subject state and other

noise conditions across test sessions for the indepen-

dent recordings.
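
The agreement figures above can be reproduced from per-ear thresholds with a short calculation. The following is a minimal sketch in Python, assuming the usual convention that a difference score is the ASSR threshold minus the behavioral threshold; the function names and the example thresholds are hypothetical and are not data from this study.

```python
def difference_scores(assr_db_hl, behavioral_db_hl):
    """Difference score per case: ASSR threshold minus behavioral threshold (dB)."""
    return [a - b for a, b in zip(assr_db_hl, behavioral_db_hl)]


def agreement_rate(scores_a, scores_b, tolerance_db):
    """Fraction of cases in which two methods' difference scores lie within tolerance_db."""
    pairs = list(zip(scores_a, scores_b))
    within = sum(1 for a, b in pairs if abs(a - b) <= tolerance_db)
    return within / len(pairs)


# Hypothetical thresholds (dB HL) for four cases; illustration only.
behavioral = [10, 20, 35, 50]
fft_assr = [20, 30, 40, 60]
flc_assr = [30, 30, 50, 60]

fft_scores = difference_scores(fft_assr, behavioral)
flc_scores = difference_scores(flc_assr, behavioral)

same_rate = agreement_rate(fft_scores, flc_scores, tolerance_db=0)
within_10_rate = agreement_rate(fft_scores, flc_scores, tolerance_db=10)
print(f"identical: {same_rate:.0%}, within 10 dB: {within_10_rate:.0%}")
```

Because the behavioral threshold is common to both methods for a given case, the gap between the two difference scores equals the gap between the two ASSR thresholds themselves.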

There are limitations in comparing thresholds

obtained by the two methods for either independent

or simultaneous recording conditions. As mentioned

above, subject and test variability could influence the

independent comparisons. Under simultaneous record-

ing conditions, the upper limit for testing affected the

number of individuals with measurable ASSRs for the

higher frequencies. The fact that ASSR thresholds

obtained using the FLC method were more elevated

above behavioral threshold at 500 and 1000 Hz than

for the FFT method under both simultaneous and

independent conditions suggests that clinicians should

consider correction factors specific to the analysis

method and protocol when estimating behavioral

threshold based on ASSR.

The simultaneous recording condition in Experiment

2 allowed for a direct comparison of the detection rate

and recording time for an individual ASSR, given that

subject and environmental factors were exactly the

same. Overall, more ASSRs were detected by the FFT

method than the FLC method across subjects. This

trend was apparent across most frequencies and

intensity levels under simultaneous test conditions.

However, detection rates for the two methods were not

found to be significantly different whether compared

by equal levels in dB HL or dB SL. In most cases, the

results of the two methods were in agreement in terms

of ASSR presence or absence for each frequency/

intensity combination (91% for NH, 87% for HI). A

significantly higher number of disagreements between

methods occurred for the HI group, although these

disagreements did not occur more frequently at

threshold or below threshold levels overall.

There were instances of false positives (present ASSR

below threshold) and false negatives (absent ASSR at

clearly suprathreshold level) for both the FFT and FLC

methods. The criterion used to classify a response as a

false positive was somewhat strict, as it would be

expected with normal variation that some ASSR

thresholds may fall below behavioral threshold. Al-

though there were slightly fewer cases of false positives

for the FLC method, the largest false positives of 20 dB

below behavioral threshold were obtained with this

method. In addition, false positives of 25 and 45 dB

below threshold were obtained during the automated

search using the FLC method. In each of these cases, the

FLC method detected a present ASSR within the first

22–70 sec of testing. For subject HA39, for example, a

present ASSR was found in 22 sec at 20 dB HL, after no

ASSR was detected during the maximum recording

time at 30 dB HL. A clinical strategy to reduce false

positives therefore may be to repeat an ASSR test if an

ASSR is detected in such a short period of time at a


relatively low intensity level, particularly if results do

not correspond with those at higher levels.
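
The repeat-test strategy described above could be expressed as a simple screening rule. The sketch below is one possible formulation, not a feature of either commercial system: the `Trial` record, the `flag_for_repeat` function, and the 70 sec cutoff (taken from the upper end of the 22–70 sec range noted above) are assumptions made for illustration, and the 600 sec duration used for the absent response is a placeholder for the maximum recording time.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    level_db_hl: int   # stimulus level
    detected: bool     # True if the ASSR reached significance
    seconds: float     # recording time until the test stopped


def flag_for_repeat(trials, fast_seconds=70.0):
    """Return detections that were fast AND contradicted by an absent
    response at a higher level for the same frequency.

    fast_seconds is an assumed cutoff, not a validated clinical criterion.
    """
    flagged = []
    for t in trials:
        if not t.detected or t.seconds > fast_seconds:
            continue
        higher_absent = any(
            o.level_db_hl > t.level_db_hl and not o.detected for o in trials
        )
        if higher_absent:
            flagged.append(t)
    return flagged


# Pattern resembling the example above: absent at 30 dB HL after the full
# recording time (600 sec is a placeholder), then present in 22 sec at 20 dB HL.
session = [Trial(30, False, 600.0), Trial(20, True, 22.0)]
suspicious = flag_for_repeat(session)
print([t.level_db_hl for t in suspicious])
```

A rule of this kind only marks tests for repetition; whether the repeated result is accepted remains a clinical judgment.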

False negatives also occurred for both the FFT and

FLC methods. Although the true number of false

negatives is not known, there were several cases of

absent ASSRs at 60 dB HL in the NH group that could be

considered false negatives. False negatives also occurred

more frequently for the FLC method than the FFT

method. Because the highest intensity levels were tested

at the beginning of the session, subject state and noise

floor may have influenced these results, although there

was not a significant relationship between noise floor and

the agreement in results between the two methods.

Total test time under independent recording condi-

tions in Experiment 1 was not significantly different

for the two methods overall, although test time was

longer for the FFT method for the HI subjects. This

result is most likely due to the difference in recording

protocols between the two systems for the highest

stimulus levels. The MASTER system allowed for

simultaneous presentation of all four stimuli up to

80 dB HL, while the RapidASSR system limited the

simultaneous presentation to 60 dB HL. For HI

individuals, particularly those with larger differences

in threshold across frequencies, testing each frequency

independently may be a more efficient method. The

commercial MASTER system allows for testing single

frequencies, although the modulation frequency chang-

es with this configuration. Overall, it did not appear

that either analysis method held an advantage in

terms of overall test time in independent comparisons

using the protocol implemented in the current study.

Under simultaneous recording conditions, it is

difficult to evaluate the relative efficiency in terms of

time between the two methods for a specific test due to

differences in the two detection algorithms and the

clinical test protocol chosen. The FLC method updated

statistical test results in 1 sec increments, while the

FFT method updated results after each sweep of

13.65 sec. The FFT method also must be halted

manually by the user or complete a maximum number

of sweeps, while the FLC method stops as soon as

statistical significance is reached.

Using the minimum time segment of 13.65 sec, or 1

sweep, as the smallest unit of time allows for a

reasonable, though not ideal, comparison of individual

recording times per detected ASSR. The time in sweeps

when the FFT method first became significant was

almost universally faster than the time to the nearest

sweep when the FLC method became significant. How-

ever, as reviewed above, this method would likely result

in considerable error in threshold estimation. While the

time per detected ASSR was more similar when compar-

ing the FLC with the fifth consecutive significant sweep

for the FFT method, the FLC method was faster in 61% of

these comparisons. This time difference in sweeps was

statistically significant, although the two methods were

within a few sweeps of each other for the majority of

comparisons. Therefore, depending on the decision

criteria used, when ASSRs were present by both methods,

the FLC was slightly more time efficient in detecting the

response. However, this result may be balanced out

overall by a larger number of false negatives and higher

thresholds in general for the FLC method. Test time for

an absent ASSR, such as false negatives and subthresh-

old tests, may be longer than for present ASSR, unless

subject noise floor is quite low.
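
The sweep-based comparison described above amounts to expressing the FLC detection time in units of 13.65 sec sweeps and comparing it with the sweep at which the FFT criterion was met. A minimal sketch follows; rounding up to whole sweeps is an assumption made here (the text specifies only "the nearest sweep"), and the function names are hypothetical.

```python
import math

SWEEP_SECONDS = 13.65  # duration of one FFT sweep, as stated in the text


def seconds_to_sweeps(seconds):
    """Express a detection time in whole sweeps, rounding up (assumed)."""
    return math.ceil(seconds / SWEEP_SECONDS)


def faster_method(flc_seconds, fft_sweep):
    """Compare FLC detection time with the FFT sweep at which the chosen
    criterion (e.g., fifth consecutive significant sweep) was met."""
    flc_sweeps = seconds_to_sweeps(flc_seconds)
    if flc_sweeps < fft_sweep:
        return "FLC"
    if flc_sweeps > fft_sweep:
        return "FFT"
    return "tie"


# Illustrative values only: FLC significant after 70 sec, FFT criterion met at sweep 9.
print(seconds_to_sweeps(70), faster_method(70, 9))
```

Rounding up reflects that an FFT result is available only at the end of a sweep; rounding to the nearest sweep instead would shift some comparisons by one sweep.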

While ASSR detection is objective, decisions on when

to accept an ASSR as present or absent affect the

overall accuracy of threshold estimation. In most

research studies using the FFT method, response

presence or absence has been judged after a fixed

number of sweeps have been recorded. Studies have

shown that the longer the test duration, the smaller

the difference scores and their associated standard

deviations (John et al, 1998; John and Picton, 2000;

Luts and Wouters, 2005), due to lower noise levels and

higher SNR. However, clinical test time is limited,

especially when testing young, sleeping infants. There

is a need for compromise and more time-efficient

strategies such as stopping the recording as soon as a

significant response is detected rather than waiting for

a predetermined length of time. In the current study,

two different analysis methods were applied using

protocols designed to be clinically feasible. The FLC

method did not require any clinical judgment for a

present ASSR, as recording was halted as soon as the

statistical results reached significance. A decision

criterion was implemented to halt testing after a

specified time if the probability of response remained

low. The results of this study show that the accuracy of

the FLC method was in line with previously published

multiple-frequency ASSR studies.

For the FFT method, a protocol designed to increase

time efficiency was evaluated in which averaging was

continued until the statistical results were significant for

five consecutive sweeps, with a 15 sweep minimum and a

noise floor criterion for no response decisions. This

method requires sequential application of multiple

statistical testing, which has been shown to increase

error rates (Sturzebecher et al, 2005). Compared to

previous ASSR studies in the literature, ASSR thresholds

and difference scores were within the range of previous

reports, although on the higher end. Correlations

between ASSR and behavioral thresholds were slightly

poorer than some studies, but also within the range

across the literature (Vander Werff et al, 2008). These

results suggest that the clinical protocol added some

variability as compared to research protocols, but without

major adverse effects for threshold estimation in this

particular subject group. There were, however, examples

of significant errors in some individual subjects.
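
The FFT stopping protocol described at the start of this discussion of the FFT method can be sketched as a decision loop over cumulative sweeps. This is an illustration under stated assumptions rather than the MASTER implementation: the significance level of 0.05, the noise criterion value, the application of the 15 sweep minimum to both present and absent decisions, and the treatment of an exhausted recording as absent are all assumed here.

```python
def fft_stopping_decision(p_values, noise_nv, alpha=0.05, consecutive=5,
                          min_sweeps=15, noise_criterion_nv=10.0):
    """Return (decision, sweep) for one stimulus.

    p_values[i] and noise_nv[i] are the statistical result and noise
    estimate after sweep i + 1 of cumulative averaging.
    noise_criterion_nv is a placeholder value, not taken from the study.
    """
    run = 0  # current count of consecutive significant sweeps
    for sweep, (p, noise) in enumerate(zip(p_values, noise_nv), start=1):
        run = run + 1 if p < alpha else 0
        if sweep < min_sweeps:
            continue  # no decision before the minimum number of sweeps
        if run >= consecutive:
            return ("present", sweep)
        if run == 0 and noise <= noise_criterion_nv:
            return ("absent", sweep)
    # Assumed: reaching the end of the recording without a decision counts as absent.
    return ("absent", len(p_values))


# Hypothetical recording: significance is reached from sweep 13 onward.
p_late = [0.5] * 12 + [0.01] * 5
late_result = fft_stopping_decision(p_late, [20.0] * len(p_late))

# Hypothetical recording: an early significant run that does not persist.
p_early = [0.01] * 5 + [0.5] * 15
early_result = fft_stopping_decision(p_early, [20.0] * 14 + [8.0] * 6)
print(late_result, early_result)
```

In the second example, five consecutive significant sweeps occur before the minimum is reached and are therefore not accepted, which is the kind of case in which a minimum sweep requirement changes the outcome.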


Luts et al (2008) found that error rate, detection

rate, and recording time can vary significantly with

even small changes in the protocol. These authors

evaluated the use of fixed recording lengths, variable

lengths requiring a number of consecutive significant

responses, and the variable lengths plus the use of a

minimum number of sweeps. For variable length

recordings, error rates increased as the number of

consecutive significant sweeps required decreased.

When a fixed number of sweeps was required, error

rates were 5% or less, in agreement with the

significance level of the statistical test. For the variable

recording lengths, error rates were 29.8, 17.9, 9.7, 4.4,

and 1.3% for requirements of 1, 2, 4, 8, and 16

consecutive significant sweeps respectively. Requiring

a minimum of 8 sweeps decreased the error rates

slightly for 1, 2, and 4 consecutive sweeps to 19.2, 14.6,

and 8.9%. In the current study, the addition of a 15

sweep minimum requirement changed the result in a

small number of cases, which would be consistent with

a slight reduction in the error rate.
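
The reason repeated testing inflates the error rate can be illustrated with a simplified calculation. The sketch below assumes that each look at the data is an independent test at significance level alpha; this assumption is made only for illustration and does not hold for cumulative averaging, where successive looks share data, so the values it produces are not comparable to the error rates reported by Luts et al (2008).

```python
def false_detection_rate(alpha, looks):
    """Probability of at least one significant result among `looks`
    independent tests at level `alpha` when no response is present."""
    return 1.0 - (1.0 - alpha) ** looks


# A single fixed-length test keeps the nominal rate; each added look raises it.
for looks in (1, 2, 4, 8, 16):
    print(looks, round(false_detection_rate(0.05, looks), 3))
```

Requiring several consecutive significant sweeps, or a minimum number of sweeps before any decision, reduces the number of opportunities for a chance detection, which is consistent with the direction of the effects described above.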

The results of this study, as well as those of Luts et

al (2008), suggest that a compromise may be possible

where recording time is limited without a large

sacrifice in accuracy. In a recent meta-analysis of

ASSR literature, Tlumak et al (2007) reported that the

maximum number of sweeps was not significantly

related to variability across studies in the difference

between ASSR and behavioral thresholds at any

carrier frequency. However, our results highlight the

need for critical clinical decision making as part of the

ASSR protocol. While ASSR testing is ‘‘objective,’’ it is

clear that the best results (fewest errors) are obtained

over long recording times in quiet subjects. In order to

strike a balance between shorter recording times and

high ASSR detection rates, clinicians need to strive to

improve clinical decision making. For example, in the

presence of higher subject noise, longer minimum

recording times would be required. In addition,

clinicians may change to stricter criteria if significance

levels are changing frequently with additional sweeps.

Repetition of particular tests that seem ‘‘suspicious’’

would also add confidence to ASSR results. For

example, if an ASSR is detected quickly at a low

intensity level, the test should be repeated and/or

evaluated at the next highest stimulus level. In

addition, clinicians should pay particular attention to

obtaining high-quality recordings by making sure

electrode impedances are low and balanced, subject

noise is reduced as much as possible, and electrical

interference is eliminated or reduced.

Even with the use of clinical protocols with variable

recording times, however, total test times were consid-

erable for the independent tests: around 40 min for NH

subjects and 50 min for HI subjects for both methods.

The total time may have been negatively affected by the

chosen test protocol for this research study (e.g.,

straight descent intensity series, requiring a minimum

of 15 sweeps for FFT), but does not seem to suggest a

clinical time savings overall. One limitation of multiple-

frequency ASSR that has limited potential time savings

is that for a certain intensity level, the stimulus may be

above threshold at some frequencies but below thresh-

old at other frequencies. Test time for that intensity

level therefore depends on the smallest amplitude ASSR

out of the eight possible in the two ears. The final

experiment in the current study evaluated the use of an

automated threshold search protocol using the FLC

method, which allowed for independent testing of each

stimulus frequency. Given the objective statistical

nature of determining present or absent ASSR, an

automated independent protocol holds promise for

decreasing total test time, particularly in cases where

thresholds vary considerably across frequency such as

in sloping SNHL. In the case of the FLC method, the

automated search protocol that independently con-

trolled stimulus level across frequencies was found to

be significantly faster than the descending search

protocols used in the independent tests in Experiment

1 (for either the FLC or the FFT method). Mean test

time was under 30 min for both the NH and the HI

groups. This decreased test time would likely represent

a significant clinical advantage.

However, the time savings was somewhat counter-

acted by decreased test accuracy as shown by poorer

correlations to behavioral threshold and larger stan-

dard deviations for the automated search method

compared to the manual methods from Experiment 1.

There were examples of large underestimations of

threshold during the automated search, where ASSRs

were detected in short amounts of time at subthreshold

intensity levels. In a clinical setting, the audiologist

could decide that this response was unlikely to

represent a true ASSR and either disregard it, repeat

that specific test, or complete additional intensity

levels to improve the accuracy of the automated

protocol. Limitations of the independent control of

stimulus level across frequencies must also be consid-

ered, such as interactions between stimuli of varying

intensities and masking effects. The time benefits of

the automated protocol appear to be significant;

therefore, techniques to improve performance should

be further explored.

CONCLUSIONS

This study has shown that both the FFT and FLC

analysis methods for ASSR detection, as imple-

mented commercially, can be used for behavioral

threshold estimation with approximately equal accu-

racy. The use of correction factors specific to the test

may be necessary, however, when converting ASSR


thresholds to behavioral threshold due to the larger difference scores for the FLC method in the low

frequencies. Under simultaneous recording conditions,

controlling for subject state as an extraneous variable,

the ASSR detection rate was not significantly different

between methods. Clinical protocols with variable

recording lengths were used in an attempt to balance

time efficiency and accuracy, resulting in performance

similar to previous studies in the ASSR literature. Although an automated threshold search protocol

using the FLC method based on independent adjust-

ment of stimulus level by frequency resulted in

significantly decreased test time, there was an associ-

ated decrease in the correlation between ASSR and

behavioral threshold. Further evaluation of this type of

protocol is warranted given the time efficiency if false

positive responses and variability could be improved. Variability in individual ASSR thresholds and reduced

correlations to behavioral threshold, as well as signif-

icant errors in some individuals, in this study highlight

the need for clinical decision making to improve test

accuracy for both the FFT and FLC methods when

these clinical protocols are used.

Acknowledgments. The author would like to gratefully

acknowledge all the participants in this study and sincerely

thank Margaret Overman, Heather Schwartz, Kerrie Nes-

bitt, Kyle Wilson, and Kristen Burns for their assistance in

collecting and analyzing data for this project.

REFERENCES

American National Standards Institute. (1996) Specification for Audiometers (ANSI S3.6-1996). New York: American National Standards Institute.

American National Standards Institute. (2004) Specification for Audiometers (ANSI S3.6-2004). New York: American National Standards Institute.

Dimitrijevic A, John MS, Van Roon P, Purcell DW, Adamonis J, Ostroff J, Nedzelski JM, Picton TW. (2002) Estimating the audiogram using multiple auditory steady-state responses. J Am Acad Audiol 13(4):205–224.

Herdman AT, Stapells DR. (2001) Thresholds determined using the monotic and dichotic multiple auditory steady-state response technique in normal-hearing subjects. Scand Audiol 30(1):41–49.

Herdman AT, Stapells DR. (2003) Auditory steady-state response thresholds of adults with sensorineural hearing impairments. Int J Audiol 42(5):237–248.

Hsu WC, Wu HP, Liu TC. (2003) Objective assessment of auditory thresholds in noise-induced hearing loss using steady-state evoked potentials. Clin Otolaryngol 28(3):195–198.

John MS, Lins OG, Boucher BL, Picton TW. (1998) Multiple auditory steady-state responses (MASTER): stimulus and recording parameters. Audiology 37(2):59–82.

John MS, Picton TW. (2000) MASTER: a windows program for recording multiple auditory steady-state responses. Comput Methods Programs Biomed 61(2):125–150.

Lins OG, Picton TW, Boucher BL, Durieux-Smith A, Champagne SC, Moran LM, Perez-Abalo MC, Martin V, Savio G. (1996) Frequency-specific audiometry using steady-state responses. Ear Hear 17(2):81–96.

Luts H, Desloovere C, Kumar A, Vandermeersch E, Wouters J. (2004) Objective assessment of frequency-specific hearing thresholds in babies. Int J Pediatr Otorhinolaryngol 68(7):915–926.

Luts H, Van Dun B, Alaerts J, Wouters J. (2007) Objective detection of ASSR: do’s and don’ts. Paper presented at the International Evoked Response Audiometry Study Group (IERASG) XXth Biennial Symposium, Bled, Slovenia.

Luts H, Van Dun B, Alaerts J, Wouters J. (2008) The influence of the detection paradigm in recording auditory steady-state responses. Ear Hear 29(4):638–650.

Luts H, Wouters J. (2005) Comparison of MASTER and AUDERA for measurement of auditory steady-state responses. Int J Audiol 44(4):244–253.

Perez-Abalo MC, Savio G, Torres A, Martin V, Rodriguez E, Galan L. (2001) Steady state responses to multiple amplitude-modulated tones: an optimized method to test frequency-specific thresholds in hearing-impaired children and normal-hearing subjects. Ear Hear 22(3):200–211.

Picton TW, Dimitrijevic A, John MS, Van Roon P. (2001) The use of phase in the detection of auditory steady-state responses. Clin Neurophysiol 112(9):1698–1711.

Picton TW, Durieux-Smith A, Champagne SC, Whittingham J, Moran LM, Giguere C, Beauregard Y. (1998) Objective evaluation of aided thresholds using auditory steady-state responses. J Am Acad Audiol 9(5):315–331.

Picton TW, John MS, Dimitrijevic A, Purcell D. (2003) Human auditory steady-state responses. Int J Audiol 42(4):177–219.

Sturzebecher E, Cebulla M, Elberling C. (2005) Automated auditory response detection: statistical problems with repeated testing. Int J Audiol 44(2):110–117.

Tlumak AI, Rubinstein E, Durrant JD. (2007) Meta-analysis of variables that affect accuracy of threshold estimation via measurement of the auditory steady-state response (ASSR). Int J Audiol 46(11):692–710.

Van Maanen A, Stapells DR. (2005) Comparison of multiple auditory steady-state responses (80 versus 40 Hz) and slow cortical potentials for threshold estimation in hearing-impaired adults. Int J Audiol 44(11):613–624.

Vander Werff KR, Brown CJ. (2005) Effect of audiometric configuration on threshold and suprathreshold auditory steady-state responses. Ear Hear 26(3):310–326.

Vander Werff KR, Johnson TJ, Brown CJ. (2008) Behavioural threshold estimation for auditory steady-state response. In: Rance G, ed. Auditory Steady-State Response: Generation, Recording, and Clinical Applications. San Diego: Plural Publishing, 125–147.

