BINAURAL HEARING and INTELLIGIBILITY in AUDITORY DISPLAYS
_________________________________________________________
Durand R. Begault
Human Factors Research & Technology Division
NASA Ames Research Center
Moffett Field, California
1. Binaural hearing phenomena
2. Newly developed auditory displays that exploit spatial hearing for improving
- speech intelligibility
- alarm intelligibility
in aviation applications
Physical characteristics of sound and perceived attributes
• Frequency (perceived pitch)
• Intensity (loudness)
• Spectral content (timbre)
• FIS, plus binaural differences (localization)
** All characteristics are important in the identification and
discrimination of auditory signals and for speech intelligibility
in communication contexts
Two important functions of the binaural hearing system
• Localization
(lateral and 3-dimensional)
• Binaural release from masking:
Echo suppression, room perception
Binaural hearing (localization; signal separation & detection):
forming spatial auditory events from acoustical (bottom-up) and psychological (top-down) inputs
Model of the binaural hearing system
Acoustic signal-driven:
• Filtering of acoustic signal by pinnae, ear canal
• Filtering by inner ear; frequency-specific neuron firings
• Physiological evaluation of interaural timing and level differences
Psychologically-driven:
• Multi-sensory information; cognition
• ILD (interaural level difference)
• ITD (interaural time difference)
“Duplex” theory of localization
Lateral localization of auditory images
• ILD (interaural level difference) caused by head shadow at frequencies above about 1.5 kHz
[Figure: lateral spatial image shift as a function of level difference (dB)]
• ITD (interaural time difference)
Lateral image shift
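To give a feel for the size of the ITD cue, here is a minimal sketch using the classic spherical-head (Woodworth) approximation. The formula, the head radius, and the speed of sound are assumptions introduced for illustration; none of them appear on the slides.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C
HEAD_RADIUS_M = 0.0875      # assumed: a commonly quoted average head radius


def woodworth_itd(azimuth_deg, head_radius_m=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_S):
    """Approximate ITD (seconds) for a distant source on the horizontal plane.

    Spherical-head approximation: ITD = (a / c) * (theta + sin(theta)),
    intended for azimuths between -90 and +90 degrees (0 = straight ahead).
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must lie between -90 and +90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))


for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> {woodworth_itd(az) * 1e6:6.1f} microseconds")
```

Under these assumptions the ITD grows from zero straight ahead to several hundred microseconds for a source directly to one side.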
[Figure: HRTF log magnitude (dB, -50 to 10) vs. frequency (100 Hz to 16000 Hz) for three source positions: right 30°, elevated; right 90°, ear level; right 120°, below]
Head-related transfer functions (HRTFs) provide cues for front-back discrimination and elevation
[Figure: source positions labeled 45°, 0° and 135°, 0°]
Basis of 3-D audio signal processing in auditory displays
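A minimal sketch of how HRTF-based 3-D audio rendering is commonly done: the mono signal is convolved with a left-ear and a right-ear head-related impulse response (the time-domain form of the HRTF). The impulse responses below are toy values (a pure delay and a gain), not measured HRTFs, and the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a pair of ear-specific impulse responses."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's right (assumed values):
# the right ear receives the sound first and at full level, the left ear
# receives it three samples later and at half amplitude (about 6 dB lower).
hrir_right = [1.0, 0.0, 0.0, 0.0]
hrir_left = [0.0, 0.0, 0.0, 0.5]

left, right = render_binaural([1.0, -1.0, 0.5], hrir_left, hrir_right)
print("left: ", left)
print("right:", right)
```

Measured HRTFs would replace the toy responses; they also carry the spectral shaping from the pinnae that supports front-back and elevation judgments, which a delay-and-gain pair cannot reproduce.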
[Diagram labels]
Vibration source (internal, external)
Ground, structure response
Hearing, feeling, seeing
Airborne sound
Response: qualitative assessment
Performance metric
Walls, windows, objects
Chairs, tables, floor
Walls, windows, plants
3-D audio display
Head-mounted visual display
Expectation; inter-modal coordination; identification; experience-adaptation
Sound sources can be ‘felt’ and ‘seen’ as well as heard
Applications of spatial sound for improving intelligibility in auditory displays
Using binaural hearing advantage for separating multiple auditory “streams” (simultaneous sources)
3-D communication system patented, developed for NASA-KSC
[Figure: speech intelligibility advantage (dB, 0.0 to 8.0) compared to one-ear listening vs. azimuth of spatialized signal (mean of left & right sides), for full frequency bandwidth and telephone bandwidth]
Hearing loss for target users: 64 active commercial airline pilots
[Figure: percentage of "yes" answers (0% to 100%) by age range of pilots (no. of subjects): 35-44 (11), 45-54 (23), 55-64 (30), and general population; Question 1 ("Told by a doctor") and Question 2 ("Personally suspect...)]
Audiogram data summary for 20 active commercial pilots (age range 35-64; not corrected for presbycusis)
[Figure: audiogram data; % of 20 pilots evaluated (0 to 100) vs. frequency (kHz), for % > 20 dB HL and % > 25 dB HL]
Use of auditory icons (AI) and left-right spatialization for information redundancy, situational awareness of actions of crew (CRM) and haptic feedback substitution
[Figure: touch screen checklist and FUEL menu display. Checklist items: Check Oil Press (Color ___, Value ___); Retard Throttle (___ Retarded); Identify Which Throttle (Lft. or Rgt.) ___; Engine Shutdown (O.K.). Display labels include Oil Pressure (0-30, 30-90), Shut Off Valve, ISOVALVES, F. Press, Totalizer, Lbs, APU, Fuel Heat, Eng Lft, Eng Rgt.]
"Page Through" auditory icon (For selecting menu pages from tabs)
"Click" auditory icon (For selecting items that change orientation within the menu display)
"Latch" auditory icon (For actions that correspond to changes in the aircraft electrical, hydraulic, or engine systems)
“Page-through” & “switch” AIs for touch screen checklist
“Mechanical latch” AIs for actions corresponding to electrical, fuel, hydraulic systems
NASA ARC advanced cab simulator
Head up auditory display for TCAS
[Figure: 3-D audio alert positions (15°, 30°, 45°, 60° to each side) shown against the visual field of view (60° left to 60° right) across the captain's screen, common screen, and first officer's screen]
Application of 3-D audio head-up display for Traffic Collision Avoidance System (TCAS II) investigated.
Target acquisition times can decrease by 0.5 to 2.2 s.
[Figures: target acquisition time (s) for "3-D Audio No Map Display" vs. "Monotic Audio No Map Display", and for head-up 3-D audio display vs. head-down map display]
Mean target acquisition times (2.63 vs. 2.13 s) and standard deviations for second TCAS experiment. The 3-D audio cues were not exaggerated, and there were three categories of elevation cues.
Mean target acquisition times (4.7 vs. 2.5 s) and standard deviations for first TCAS experiment. The 3-D audio cues were exaggerated in azimuth relative to the visual target, and no elevation cues were supplied.
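The quoted range of 0.5 to 2.2 s follows directly from the two pairs of means in the captions above. A small worked check, using only the numbers given; the captions do not say which display condition each mean belongs to, so the values are simply labeled "longer" and "shorter" here.

```python
# Mean target acquisition times in seconds, taken from the two captions.
experiments = {
    "first TCAS experiment": (4.7, 2.5),
    "second TCAS experiment": (2.63, 2.13),
}


def reduction_s(longer_s, shorter_s):
    """Difference between the two mean acquisition times, in seconds."""
    return round(longer_s - shorter_s, 2)


for name, (longer_s, shorter_s) in experiments.items():
    diff = reduction_s(longer_s, shorter_s)
    percent = 100.0 * diff / longer_s
    print(f"{name}: {diff} s shorter ({percent:.1f}% of the longer mean)")
```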
Head-up auditory display with head-up visual display
Reduction in taxi time: advantage of 3-D audio
[Figure: time (sec), mean of 3 routes (-60 to 60), for crews 1 to 12]
Spatially-modulated auditory alerts
In an auditory display, how to ensure that an alarm is audible?
- “Common sense” engineering approach: make the alarm a lot louder than the background noise for wide-area coverage
Fire alarm and horn from ca. 1933
- ISO 7731 (“Danger signals for work places - Auditory danger signals”) specifies signal to be >= 13 dB re masked threshold in a 1/3 octave band (0.3-3.0 kHz)
- Recipe for “startle effect”, high overall SPLs, and potentially low performance in a high-stress environment
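The level criterion as summarized on this slide (signal at least 13 dB above masked threshold in a 1/3 octave band between 0.3 and 3.0 kHz) can be sketched as a simple per-band check. This covers only the slide's one-line summary, not the full standard; the function names and all band levels below are hypothetical.

```python
def margin_db(signal_db, masked_threshold_db):
    """Signal level relative to the masked threshold in one band, in dB."""
    return signal_db - masked_threshold_db


def bands_meeting_criterion(signal_db_by_band, threshold_db_by_band,
                            required_margin_db=13.0, lo_hz=300.0, hi_hz=3000.0):
    """Return the band center frequencies (Hz) that satisfy the margin.

    Only bands inside [lo_hz, hi_hz] with a known masked threshold count.
    """
    passing = []
    for center_hz in sorted(signal_db_by_band):
        if not lo_hz <= center_hz <= hi_hz:
            continue
        if center_hz not in threshold_db_by_band:
            continue
        signal_db = signal_db_by_band[center_hz]
        threshold_db = threshold_db_by_band[center_hz]
        if margin_db(signal_db, threshold_db) >= required_margin_db:
            passing.append(center_hz)
    return passing


# Hypothetical 1/3 octave band levels in dB SPL, keyed by center frequency.
alarm_levels = {250: 95.0, 800: 88.0, 2500: 80.0, 4000: 90.0}
masked_thresholds = {250: 70.0, 800: 72.0, 2500: 70.0, 4000: 60.0}

passing = bands_meeting_criterion(alarm_levels, masked_thresholds)
print("Bands meeting the 13 dB margin:", passing)
```

In noisy environments the masked threshold is already high, so meeting a fixed margin by level alone pushes the alarm toward the high overall SPLs noted above.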
Current approach
- Improve detection of an alarm (signal) against ambient sound (noise) using signal processing techniques other than level increase
Requirement / Caveat
- Technique should apply to currently-used alarms (to avoid “relearning” semantic content of new auditory signals).
Technique
- Three methods addressed in patent application (pending) for accomplishing this.
Alarm (basic stimulus)
737-300 alarm: two successive square waves (preceding verbal “wind shear” alert)
300 ms at 200 Hz, then 300 ms at 764 Hz
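A minimal sketch of the basic stimulus as described: two successive 300 ms square waves at 200 Hz and 764 Hz. The sample rate, the amplitude, and the naive (non-band-limited) square wave are assumptions made for illustration, not details of the actual aircraft alarm.

```python
SAMPLE_RATE_HZ = 44100  # assumed; the slides do not state a sample rate


def square_wave(freq_hz, duration_s, sample_rate=SAMPLE_RATE_HZ, amplitude=1.0):
    """Naive square wave: +amplitude for the first half of each cycle."""
    n_samples = int(round(duration_s * sample_rate))
    samples = []
    for i in range(n_samples):
        phase = (i * freq_hz / sample_rate) % 1.0  # position within the cycle
        samples.append(amplitude if phase < 0.5 else -amplitude)
    return samples


def basic_alarm(sample_rate=SAMPLE_RATE_HZ):
    """Two successive 300 ms square waves: 200 Hz, then 764 Hz."""
    first = square_wave(200.0, 0.300, sample_rate)
    second = square_wave(764.0, 0.300, sample_rate)
    return first + second


alarm = basic_alarm()
print(f"{len(alarm)} samples, {len(alarm) / SAMPLE_RATE_HZ:.3f} s")
```

A naive square wave aliases at high harmonics; a band-limited generator would be the better choice for actual listening tests.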
Stimuli
[Figure: left-ear and right-ear amplitude vs. time for 0 Hz, 1.6 Hz, and 3.3 Hz jitter]
Summed L+R RMS levels equivalent for all stimuli; but jittered stimuli have + 5 dB peaks re unjittered due to HRTF.
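The idea that the summed left-plus-right level can stay fixed while the per-ear peaks rise can be illustrated with a simplified stand-in for the HRTF-based stimuli. The sketch below moves a signal between the ears with constant-power amplitude panning at the jitter rate; this panning law is an assumption chosen for illustration, not the HRTF processing used in the study, so its per-ear peak increase (about 3 dB) differs from the 5 dB quoted above. "Summed level" is interpreted here as the sum of the two ears' mean-square power, which is also an assumption.

```python
import math


def spatially_modulate(mono, rate_hz, sample_rate):
    """Sweep a mono signal between the ears at rate_hz (0 = centered)."""
    left, right = [], []
    for i, x in enumerate(mono):
        pan = math.sin(2.0 * math.pi * rate_hz * i / sample_rate)  # -1 .. +1
        angle = (pan + 1.0) * math.pi / 4.0  # 0 (full left) .. pi/2 (full right)
        left.append(x * math.cos(angle))     # cos^2 + sin^2 = 1, so the
        right.append(x * math.sin(angle))    # summed power stays constant
    return left, right


def mean_square(samples):
    return sum(s * s for s in samples) / len(samples)


def summed_power_db(left, right):
    """Level of the summed left and right mean-square power, in dB."""
    return 10.0 * math.log10(mean_square(left) + mean_square(right))


def peak(samples):
    return max(abs(s) for s in samples)


sample_rate = 8000
mono = [1.0] * sample_rate  # one second of a constant-amplitude test signal
for rate_hz in (0.0, 1.6, 3.3):
    left, right = spatially_modulate(mono, rate_hz, sample_rate)
    print(f"{rate_hz} Hz: summed {summed_power_db(left, right):.2f} dB, "
          f"left-ear peak {20.0 * math.log10(peak(left)):.2f} dB")
```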
[Figure: attenuation (dB, 0 to -20) vs. octave band center frequency (2000 Hz to 16000 Hz)]
[Figure: threshold re noise level (dB, -35 to -10) for three conditions: loudspeaker (existing); headphone, 1.6 Hz trajectory; headphone, 3.3 Hz trajectory]
Results (1)
Headphone with jittered signal has 13.4 dB advantage over monaural loudspeaker (existing condition on aircraft), partly due to attenuation of noise by headphone
Results: Headphone attenuation
Sennheiser HD 480 vented
[Figure: bar chart (scale 0 to 10) with components labeled "Spatial unmasking" and "Peak level due to 90° HRTF"]
Results (2)
Headphone with jittered signal has a significant (p < .001) 7.8 dB advantage over headphone without jittered signal. No significant difference between 1.6 and 3.3 Hz modulation.
Results: source of unmasking (?)
[Figure: threshold re noise level (dB, 0 to -25) vs. trajectory velocity (0.0 Hz, 1.6 Hz, 3.3 Hz) for conditions 4, 5, 6 (headphones); dB advantage indicated]
Conclusions
A new approach to designing alerts for auditory displays in high-stress interfaces: use of spatial modulation for improved detection.
Headphones + spatial modulation lower threshold by 13.4 dB.
Spatial modulation lowers threshold by 7.8 dB. Of this, 5 dB is due to HRTF interaural level difference if instantaneous (peak) level differences are assumed. This amount is reduced as a function of longer temporal integration periods. Remaining advantage is due to time-varying interaural cross-correlation.
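The figures in these conclusions can be laid out as a simple dB budget. This is a worked-arithmetic sketch that assumes the contributions add in dB and that the peak-level (instantaneous) assumption holds; the split of the remainder is inferred from the stated numbers and is not a separately reported measurement.

```python
# Threshold advantages quoted in the conclusions, in dB.
total_advantage_db = 13.4     # headphones + spatial modulation vs. loudspeaker
spatial_modulation_db = 7.8   # jittered vs. unjittered headphone presentation
peak_ild_db = 5.0             # attributed to HRTF interaural level difference

# Assumption: contributions add in dB.
headphone_alone_db = round(total_advantage_db - spatial_modulation_db, 1)
remaining_db = round(spatial_modulation_db - peak_ild_db, 1)

print(f"Headphone presentation alone: {headphone_alone_db} dB")
print(f"Peak interaural level difference: {peak_ild_db} dB")
print(f"Remainder (time-varying interaural cross-correlation): {remaining_db} dB")
```

With longer temporal integration the 5 dB level-difference share shrinks, so the remainder would grow correspondingly under the same additive assumption.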
BINAURAL LOCALIZATION
• INTELLIGIBILITY IMPROVEMENT: binaural release from masking
• DISCRIMINATION and SELECTIVE ATTENTION IMPROVEMENT: the "cocktail party" effect
• ALTERNATIVE or REDUNDANT DISPLAY for VISUALLY-ACQUIRED INFORMATION
• IMMEDIATE SITUATIONAL AWARENESS (WITH HEADS-UP ADVANTAGE)
+ ACTIVE NOISE CANCELLATION
+ HEARING CONSERVATION
= Benefits: increased aviation safety & efficiency