Audio Engineering Society

Convention Paper
Presented at the 117th Convention
2004 October 28–31, San Francisco, CA, USA

This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

Modeling Auditory Localization of Subwoofer Signals in Multi-Channel Loudspeaker Arrays

Jonas Braasch¹, William L. Martens¹, and Wieslaw Woszczyk¹

¹CIRMMT, Faculty of Music, McGill University, Montreal, Canada

Correspondence should be addressed to Jonas Braasch (jb@music.mcgill.ca)

ABSTRACT

For economic reasons, home entertainment surround sound systems are usually equipped with a single subwoofer channel. The main argument for this practice is the presumed inability of the auditory system to localize low frequencies in small reverberant rooms. However, a psychoacoustic localization test conducted using a standard 5-channel set-up with subwoofers showed that the listeners were able to determine the lateral displacement (left, center, or right) of the loudspeaker presenting the test stimulus (an octave-band noise burst at 31.5-Hz, 63-Hz, or 125-Hz center frequency). Using a binaural model simulating human perception, recordings of subwoofer signals at different positions were analyzed. As expected, the interaural level differences remained nearly constant for different subwoofer positions in the low-frequency range. On the basis of interaural time differences, however, the model was able to predict the position of the loudspeaker in the left-right dimension, verifying the outcome of the listening test. The results indicate the importance of considering more than one subwoofer in multi-channel audio systems.

1. INTRODUCTION

In its recommendation ITU-R BS 775-1 for multichannel stereophonic sound systems, the International Telecommunication Union (ITU) proposes the use of a single loudspeaker for the low-frequency range below [...] Hz as a cost-effective solution. While such an approach is generally accepted for home entertainment systems, it should be questioned whether this procedure has audible disadvantages when used in studio monitoring systems and other high-end applications.

Several recent articles address this topic from different angles, and different conclusions were drawn. Kügler and Theile [...] investigated whether substantial differences in sound are perceived between a full-range 2-channel sound system and a 2.1-channel sound system that are identical except that the 2.1-channel system uses one subwoofer. The authors conclude from the outcome of an AB comparison test with different music samples that the differences are negligible if the crossover frequency is under [...] Hz. Zacharov et al. [...] came to a similar conclusion and recommend that for domestic environments the loudspeaker set-up is less critical for a one-subwoofer system (tested crossover frequency: [...] Hz), and the audible differences between the experimental one-subwoofer case and the multiple-subwoofer case were found to be small. In the opinion of the authors, the use of multiple subwoofers can, on the other hand, lead to "spatial and temporal smearing" if the subwoofers are not properly aligned.

Noussaine [...] and Welti [...] conclude that a multiple-subwoofer set-up is advantageous over a one-subwoofer set-up after examining both options with respect to room-mode excitation. Both authors believe that the use of multiple subwoofers leads to a more "even" or "balanced" excitation of room modes. Griesinger [...] and Martens [...] conclude that multiple-subwoofer arrays can be useful in decorrelating low-frequency sounds, which is important for generating the impression of apparent source width (ASW) and listener envelopment (LEV).

When discussing spatial properties of sound, sound localization is probably the most frequently discussed issue in this field. The first systematic investigations date back to the late 19th century. With regard to low-frequency localization in loudspeaker arrays, however, our knowledge is still incomplete. This paper addresses the question of whether the multi-channel information of sound reproduction systems should be extended to low frequencies for reasons of localizability. For this purpose, a localization experiment was conducted, which is described in the next section, followed by an extensive model simulation to support the psychoacoustical findings (Section 3).

2. PSYCHOACOUSTIC EXPERIMENT

While designing the listening test, we found it important to maintain the typical features of stereophonic listening situations to make our conclusions applicable for the practicing audio engineer. For this reason, we chose neither to hide the architecture of the speaker set-up (e.g., by blindfolding the listeners) nor to restrict the listeners from using head movements. In the test, the listeners were asked to indicate the speaker closest to the direction of the auditory event. Strictly speaking, our test resembled a source identification test rather than a pure localization experiment. Nonetheless, our aim was to show whether true multichannel sound should be provided at low frequencies for reasons of localizability, and not to investigate the general performance of the auditory system when localizing low-frequency sounds.

2.1. Methods

2.1.1. Listeners

Four listeners, including the first two authors (one female, three males), participated in the experiment. Their ages ranged from [...] to [...] years. None of the listeners had known hearing disorders.

2.1.2. Apparatus and Stimuli

In the source identification test, the signals were presented to the listeners through subwoofer loudspeakers. The subwoofers were custom-built by Bang & Olufsen for use with the Beolab [...] loudspeakers, but in the test only the subwoofers were addressed. The loudspeakers were set up in MARLAB (Multi-Channel Audio Research Laboratory) at McGill University according to ITU recommendation ITU-R BS 775-1 at the following azimuth angles: 0°, ±30°, and ±110° ([...] m radius). The speakers were placed on the floor, and the center of the driver ([...] in. diameter) was at a height of [...] cm. The following reverberation times were measured for MARLAB: [...] ([...] Hz), [...] ([...] Hz), and [...] ([...] Hz).

The test stimuli were generated in Matlab on a personal computer (Hewlett Packard Pavilion a[...]n, Windows XP). For this purpose, white noise bursts of [...] ms duration were created at [...] kHz sampling frequency and [...]-bit resolution. Afterwards, the signals were filtered with an octave-wide FIR filter ([...] coefficients) at different center frequencies (31.5 Hz, 63 Hz, and 125 Hz). The signals had [...]-ms cos² on- and offset ramps to prevent the listeners from using onset cues caused by spectral widening. The maximum sound pressure level of the loudspeaker signals was set to [...] dB(C), measured at the center position of the set-up. In the 31.5-Hz center frequency condition, the sound pressure level had to be reduced to [...] dB(C) to avoid audible non-linear effects of the speakers. Each loudspeaker was calibrated individually using a mixing console (Yamaha [...]D). A level rove (−3 dB, 0 dB, and +3 dB) was included to prevent the listeners from identifying the loudspeaker signal by its sound pressure level.

[Figure 1: grid of response scatter plots, one column per listener (L1 to L4) and one row per center frequency (31.5 Hz, 63 Hz, 125 Hz); x-axis: presented speaker (sl, l, c, r, sr); y-axis: localized speaker (sl, l, c, r, sr); circle-size legend: 100%, 50%, 25%, 10%, 5%, 1%.]

Fig. 1. Localization performance of four listeners (L1–L4) localizing low-frequency octave-band noise bursts at different center frequencies as indicated in the right-hand labels.
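The onset/offset shaping and the level rove described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab code: the burst length, ramp length, and random seed are hypothetical demo values, the octave-band FIR filtering is omitted, and the ramp is assumed to be a squared-sinusoid rise whose mirror image gives the cos² fall.

```python
import math
import random

def cos2_ramp(n_ramp):
    """Rising ramp from 0 towards 1; its mirror image is a cos^2-shaped fall."""
    return [math.sin(0.5 * math.pi * i / n_ramp) ** 2 for i in range(n_ramp)]

def apply_ramps(signal, n_ramp):
    """Fade a burst in and out with ramps of n_ramp samples each."""
    ramp = cos2_ramp(n_ramp)
    out = list(signal)
    for i, gain in enumerate(ramp):
        out[i] *= gain          # onset ramp
        out[-1 - i] *= gain     # offset ramp (mirror image of the onset)
    return out

def rove_gain(level_db):
    """Linear amplitude factor for a relative level given in dB."""
    return 10.0 ** (level_db / 20.0)

# Hypothetical demo values (not taken from the paper).
rng = random.Random(1)
burst = [rng.gauss(0.0, 1.0) for _ in range(1000)]
shaped = apply_ramps(burst, 100)

# One copy of the shaped burst per roved level (-3 dB, 0 dB, +3 dB).
roved = {db: [rove_gain(db) * v for v in shaped] for db in (-3.0, 0.0, 3.0)}
```

Roving the level in this way changes only the overall gain, so any level-based cue to the loudspeaker identity becomes unreliable while the interaural cues stay untouched.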

2.2. Procedure

The experiment was divided into several sessions. In each session, the center frequency was kept constant, and only the azimuth of the signal was varied pseudo-randomly. The session started with a training phase in which the listener could become familiar with the experimental procedure. After a stimulus had been presented, the listener reported the direction of the auditory event on a graphical interface by clicking with a mouse on one of five buttons, which represented the five loudspeakers. In those cases in which the auditory event did not match the position of any speaker, the listeners were asked to indicate the speaker closest to the auditory event.

After the response, the next stimulus was presented. At the end of the training phase, the recording of the listener's responses began. In each session, every stimulus was presented twelve times (four times for each of the three roved levels: −3 dB, 0 dB, and +3 dB). The duration of each session was about five minutes. No feedback was provided to the listeners during either the training phase or the recording phase. Listeners 1 and 2 participated in five sessions for each of the three center frequencies. Listeners 3 and 4 took part in only two sessions each for the two highest center frequencies (63 Hz and 125 Hz).
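As an illustration of the session layout just described, the following Python sketch builds one pseudo-randomized trial list. The function name, the seed handling, and the use of Python's `random` module are assumptions made for this sketch; the paper does not describe how the randomization was implemented.

```python
import random

SPEAKERS = ["sl", "l", "c", "r", "sr"]   # five loudspeaker positions
ROVE_DB = [-3, 0, 3]                     # three roved levels in dB
REPEATS_PER_LEVEL = 4                    # four presentations per level

def make_session(seed):
    """Return a shuffled list of (speaker, level_db) trials for one session."""
    trials = [(spk, db)
              for spk in SPEAKERS
              for db in ROVE_DB
              for _ in range(REPEATS_PER_LEVEL)]
    random.Random(seed).shuffle(trials)  # pseudo-random presentation order
    return trials

session = make_session(seed=0)
```

With five positions, three levels, and four repetitions, each session contains 60 trials, i.e. twelve presentations per loudspeaker.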

[Figure 2: two bar charts, "left side" and "right side"; x-axis: listener ID (1 to 4); y-axis: correct answers [%], 0 to 100.]

Fig. 2. Percentage of correctly identified front-back directions. The left graph shows the results for the left and surround-left subwoofers, the right graph for the right and surround-right subwoofers. The black bars show the 31.5-Hz center frequency condition, the gray bars the 63-Hz center frequency condition, and the white bars the 125-Hz center frequency condition. In both plots, the horizontal dot-dashed line marks the percent-correct level above which performance is significantly better than chance would allow.

2.3. Results

The results of the localization experiment are shown separately for each of the four listeners (column-wise) and different center frequencies (row-wise) in Fig. 1. The data for all sound-pressure level settings were combined to construct the response percentages plotted in this graph. The x-axis depicts the presented speaker positions: surround left (sl), left (l), center (c), right (r), and surround right (sr). The y-axis shows the reported speaker locations. The size of each black circle is proportional to the number of times the speaker position (given on the y-axis at the circle's center) was chosen for the presented speaker position (given on the x-axis at the circle's center). A legend at the top right relates the circle sizes to the percentage of answers given for each combination of presented and perceived direction. The top-left panel, for example, shows the localization results for Listener 1 (L1) for the 31.5-Hz center frequency condition. If the listener had been able to identify all speaker positions correctly, we would expect five large circles aligned along the diagonal, where localized and presented speaker positions correspond. For the center speaker, the listener's response pattern is quite close to this case, but for all remaining speakers we observe a number of confusions between the front and the corresponding surround speakers, e.g., left/surround-left and right/surround-right. The remaining panels in Fig. 1 show very similar patterns. Evidently, the listeners are fairly good at discriminating in the left-right dimension, but do not do so well in distinguishing between front and back. Only two significant exceptions were found: the [...]-Hz center frequency conditions for L2 and L[...]. In this condition, both listeners are able to discriminate between the left speaker and the surround-left speaker and, to some extent, also between the right speaker and the surround-right speaker.

To elaborate on the details of this effect, the data of the localization experiment are shown in the graph on the left side of Fig. 2. Each bar gives the percentage of correct responses given for the left and the surround-left speakers. The dashed and the dash-dotted lines show the 50% chance level and the [...]% detection threshold, respectively. As Fig. 1 already revealed, only for L2 and L[...] do the percentages of correct responses clearly exceed the [...]% threshold in the [...]-Hz condition. In two further cases, the percentage of correct responses is just above the [...]% threshold (L[...], [...]-Hz condition and L[...], [...]-Hz condition). In all other cases, the percentages of correct responses are below the [...]% threshold. The right graph of Fig. 2 shows the percentages of correct responses given for the right and the surround-right speakers, analogous to the left graph of the same figure. Here, only in two cases did the percentages of correct responses exceed the [...]% threshold: L2 for the [...]-Hz condition and L[...] for the [...]-Hz condition.

[Figure 3: three bar charts titled "left, center, right", "surround left, surround right", and "left, right"; x-axis: listener ID (1 to 4); y-axis: correct answers [%], 0 to 100.]

Fig. 3. Percentage of correctly identified left-right directions. The top-left graph shows the results for the left, center, and right subwoofers, the top-right graph for the surround-left and surround-right subwoofers, and the bottom-left graph for the left and right subwoofers. The black bars show the 31.5-Hz center frequency condition, the gray bars the 63-Hz center frequency condition, and the white bars the 125-Hz center frequency condition.
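A percent-correct level that is "significantly better than chance" for a two-alternative front-back decision can be derived from the binomial distribution. The following Python sketch is an illustration only: the paper does not state which test or significance level was used, so the one-sided binomial test, the 5% level, and the trial count of 20 are all assumptions.

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """Probability of k or more correct answers out of n when guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_correct_for_significance(n, alpha=0.05, p=0.5):
    """Smallest number of correct answers whose tail probability is <= alpha."""
    for k in range(n + 1):
        if upper_tail(n, k, p) <= alpha:
            return k
    return None  # n is too small to reach significance at this alpha

# Hypothetical example: 20 front-back trials, chance level 50%.
n_trials = 20
k_min = min_correct_for_significance(n_trials)
threshold_percent = 100.0 * k_min / n_trials
```

The threshold moves closer to the 50% chance level as the number of trials grows, so listeners who completed more sessions can reach significance with a lower percentage of correct answers.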

Regarding the left-right dimension, the percentages of correct responses depicted in the top-left graph of Fig. 3 are near the possible maximum of 100%, indicating that the listeners had no difficulty judging the lateral displacement of the sound sources. In this figure, the data were collapsed across the stimulus presentations over the three front speakers. Those cases in which the listeners falsely indicated the surround-left speaker instead of the front-left speaker, and the cases in which the listeners reported the surround-right speaker instead of the front-right speaker, were counted as correct responses because the sidedness of the judgement (left, center, or right) remained correct. The left-center-right discrimination task should be easier with the two surround loudspeakers, since larger ITDs occur here. However, the performance cannot get much better, and only in the case with the lowest percentage of correct answers (L1, 31.5-Hz center frequency condition) was this value noticeably higher for the surround speaker case (Fig. 3, top-right graph). This observation is not affected when we compare the responses for both surround speakers to those for the front-left and front-right speakers only (Fig. 3, bottom-left graph).
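The sidedness scoring rule described above, in which front/surround confusions on the same side still count as correct, can be written down compactly. The following Python sketch uses hypothetical names and made-up demo responses; it only illustrates the rule and is not the authors' analysis code.

```python
# Map each loudspeaker label to its side (hypothetical helper table).
SIDE = {"sl": "left", "l": "left", "c": "center", "r": "right", "sr": "right"}

def left_right_percent_correct(trials):
    """Percentage of (presented, response) pairs that agree in sidedness."""
    hits = sum(1 for presented, response in trials
               if SIDE[presented] == SIDE[response])
    return 100.0 * hits / len(trials)

# Made-up responses: a front/surround confusion ("l" answered as "sl")
# still counts as correct, a left/right confusion does not.
demo = [("l", "sl"), ("l", "l"), ("c", "c"), ("r", "l"), ("r", "sr")]
score = left_right_percent_correct(demo)
```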

When analyzing the data, we were also interested in whether varying the sound-pressure level changes the performance of the listeners. Figure 4 shows the results of the localization experiment for L1 (first two columns from left) and L2 (first two columns from right). The scatter plots are shown in the same way as in Fig. 1, but now each level setting is depicted separately. To save space, we only show the −3 dB and +3 dB level conditions for L1 and L2. Recall that L3 and L4 participated in only two instead of five sessions for each center frequency. The corresponding plots for the −3 dB and +3 dB level conditions look very similar, with two exceptions. Firstly, L1 seems to have fewer confusions between the surround speakers and the matching front-left/right speakers in the 31.5-Hz center frequency condition (first two top panels from left). Secondly, L2 appears to indicate the front-right speaker instead of the surround-right speaker when the relative level was set to [...] dB. One possible explanation for this degradation in performance is that L2 was monitoring monaural level cues and expected the level in both surround speakers to be lower than in the frontal speaker, as would result from shadowing effects of the head.

[Figure 4: grid of response scatter plots for L1 and L2 at −3 dB and +3 dB (columns) and center frequencies 31.5 Hz, 63 Hz, 125 Hz (rows); x-axis: presented speaker (sl, l, c, r, sr); y-axis: localized speaker (sl, l, c, r, sr).]

Fig. 4. Same as Fig. 1, but with separate plots for different level settings (Listeners 1 and 2 only).

In the left graph of Fig. 5, the percentages of correct responses regarding the front-back direction are shown. The method of presenting these data is analogous to Fig. 2, except that this time the data for the left and right speakers were combined and, more importantly, all three level settings are now shown separately. For this reason, the shading of the bars encodes the relative sound pressure level setting (−3 dB, 0 dB, and +3 dB) of the stimuli rather than their center frequency. The latter is now depicted in separate rows, as indicated by the right-hand labels. The only noteworthy change in the response patterns caused by level variation of the stimuli is the previously mentioned case of L1 in the 31.5-Hz center frequency condition. Some improvement is also observable for L[...] in the [...]-Hz condition.

In concluding this section, the percentage of correct responses regarding the left-right direction is depicted in the right graph of Fig. 5 in an analogous way to Fig. 3, but again all three level settings are shown separately. As already observed in Figs. 4 and 5 (left graph), the level variation has a minor influence on the performance of the listeners. Only in three out of ten cases was the improvement at the highest level above [...]% compared to the lowest level (L1, 31.5-Hz center frequency; L2, 31.5-Hz center frequency; and L[...], [...]-Hz center frequency).

[Figure 5: two columns of bar charts, "front-back discrimination" (left) and "left-right discrimination" (right), one row per center frequency (31.5 Hz, 63 Hz, 125 Hz); x-axis: listener ID (1 to 4); y-axis: correct answers [%], 0 to 100.]

Fig. 5. Left graph: percentage of correctly identified front-back directions for different level settings; right graph: percentage of correctly identified left-right directions for different level settings. In both graphs, the black bars show the −3 dB level rove condition, the gray bars the 0 dB level rove condition, and the white bars the +3 dB level rove condition.

3. MODEL ALGORITHMS

3.1. Introduction

Wightman and Kistler showed in a localization test that interaural time difference (ITD) cues dominate interaural level difference (ILD) cues for natural ITD-ILD combinations if low frequencies are present [...]. For that reason, our analysis focuses on cross-correlation algorithms based on ITD cues. In addition, stages simulating the auditory periphery and an algorithm for estimating the target position, which combines the output of the single frequency bands, were implemented.

3.2. Stimuli

In principle, the same test signals as in the psychoacoustic experiment were used to evaluate the model. A dummy head (Brüel & Kjær) served as the "ears" for the binaural model. For this purpose, the binaural room impulse response (BRIR) was measured between the dummy head, placed in the center, and a subwoofer loudspeaker (Yamaha SW [...]) positioned sequentially at all five locations of the surround set-up. For the impulse-response measurement, the Yamaha subwoofer was chosen over the Bang & Olufsen subwoofer for the simple reason that it allowed higher sound pressure levels, which results in a better signal-to-noise ratio of the measured impulse responses. On the other hand, the Yamaha speaker could not be used in the listening test because not enough speakers were available for a five-channel set-up. Nevertheless, the model and the psychoacoustic test environments should be comparable, since the room and the loudspeaker positions were the same. The impulse responses were measured using custom-built software, based on a cross-correlation algorithm and MLS signals, which was executed on a personal computer with an external sound device (Gateway Pentium [...]; Mark of the Unicorn MOTU [...]). Afterwards, the same signals that were previously used in the listening test were convolved with the BRIRs and then analyzed using the binaural model.

[Figure 6: two panels, "left channel" and "right channel"; x-axis: Time [ms], 0 to 100; y-axis: Rel. Amplitude, −1 to 1.]

Fig. 6. Binaural room impulse response measured between the right speaker ([...]° azimuth, [...]° elevation) and a dummy head placed in the center of the loudspeaker set-up.

A second set of BRIRs was measured in the TV studio of the Instructional Communication Centre (ICC studio) at McGill University. The ICC studio has the following reverberation times: [...] ([...] Hz), [...] ([...] Hz), and [...] ([...] Hz). For this measurement, a [...]-channel sound reproduction system was used, which was developed within the Valorisation-Recherche Québec (VRQ) project Real-time Communication of High-resolution Multi-sensory Content via Broadband Networks. The system is basically an augmented standard surround system with [...] vertically aligned rings of custom-built ribbon loudspeakers. Five subwoofers (D-Box, Mini Mammoth), which were set up on the studio floor in a standard ITU 5-channel configuration, are used to reproduce low frequencies. Full-frequency-range room impulse responses were measured using the subwoofers and those ribbon speakers that were mounted directly above them. Again, the dummy head was positioned in the center of the loudspeaker array. Figure 6 shows an example of a measured BRIR.

[Figure 7: block diagram. For each ear (left channel, right channel), an inner-ear stage with a bandpass filter bank and a hair-cell simulation (HC) per frequency band (1st, 2nd, ..., i-th, ..., n-th); a central-nervous-system stage with one cross-correlation unit (CC) per band feeding a decision device.]

Fig. 7. General model structure of the localization algorithm.

3.3. Model Structure

Stages to simulate the auditory periphery were implemented in the cross-correlation model (Fig. 7) in order to simulate human hearing with sufficient accuracy. The model is similar to an earlier model proposed by Blauert and Cobben [...]. The transformation from the sound sources to the eardrums is taken into account by filtering the sounds with BRIRs from a specific direction, as described in the previous section. Basilar-membrane and hair-cell behavior are simulated with a gammatone filter bank of [...] bands at a sampling frequency of [...] kHz ([...]-bit resolution), as described by Patterson et al. [...], and a simple half-wave rectification. With the exception of Fig. 8, only the low-frequency bands [...] to [...] ([...] Hz to [...] Hz) are analyzed.

[Figure 8: two panels; x-axis: ITD [ms], −1.50 to 1.50; y-axis: Freq. [Hz], 47 to 3105.]

Fig. 8. Demonstration of the decompression algorithm that was introduced in the cross-correlation model to sharpen the peaks by taking the cross-correlation function to the power of [...]. The top graph shows the output of the cross-correlation model without the decompression stage, the bottom graph with the decompression stage.

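After the half-wave rectification, the interaural cross-correlation was estimated within each frequency band over the whole stimulus duration $t_0$ to $t_1$:

$$\Psi_{Y_{l,r}}(f,\tau) = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} Y_l(f,t)\, Y_r(f,t+\tau) \qquad (1)$$

where $Y_l$ and $Y_r$ denote the half-wave rectified left- and right-ear signals in the frequency band $f$, and $\tau$ is the internal delay.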

To observe a cross-correlation peak at low frequencies in the ITD range from −1.5 ms to +1.5 ms, the cross-correlation functions had to be decompressed. This was done by raising the cross-correlation function to the power of [...]. Otherwise, the cross-correlation functions would have resulted in a plateau. The difference the decompression stage makes is shown in Fig. 8. Since we are primarily interested in how the auditory system localizes low-frequency signals, we chose a constant decompression rate. In a full-range model, the decompression rate could be adapted with decreasing values at higher frequencies, such that the widths of the cross-correlation peaks correlate with variables like apparent source width (ASW) or listener envelopment (LEV). Unlike in previous studies by the first author [...], the cross-correlation functions are not remapped from ITD-based to azimuth-based functions before the decision device, for two reasons. Firstly, the actual location of the auditory event is not of primary interest since, strictly speaking, the aim is to predict the outcome of a sound source identification experiment rather than to simulate a pure localization test. Therefore, it is sufficient to show whether the model can discriminate between different loudspeaker positions in a similar way to human listeners. The model should assign the test stimuli to speakers with an accuracy similar to that found in the listening test.
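For readers who want to experiment with the cross-correlation and decompression stages, a minimal single-band Python sketch follows. It is not the authors' implementation: the gammatone filter bank is omitted, the exponent of 10, the noise input, and the 3-sample delay are hypothetical demo values, and the sign convention (the right-ear signal is read at t + τ, so a lagging right ear gives a positive τ) is an assumption.

```python
import random

def half_wave_rectify(x):
    """Keep positive samples and set negative samples to zero."""
    return [v if v > 0.0 else 0.0 for v in x]

def cross_correlation(yl, yr, max_lag):
    """Time-averaged product of yl[t] and yr[t + tau] for each integer lag."""
    n = len(yl)
    lags = list(range(-max_lag, max_lag + 1))
    values = []
    for tau in lags:
        acc = 0.0
        for t in range(max_lag, n - max_lag):
            acc += yl[t] * yr[t + tau]
        values.append(acc / (n - 2 * max_lag))
    return lags, values

def decompress(values, power):
    """Normalize to a peak of one, then raise to a power to sharpen the peak."""
    peak = max(values)
    return [(v / peak) ** power for v in values]

# Hypothetical demo: the right-ear signal lags the left one by 3 samples.
rng = random.Random(0)
left = [rng.gauss(0.0, 1.0) for _ in range(400)]
delay = 3
right = [0.0] * delay + left[:-delay]

lags, cc = cross_correlation(half_wave_rectify(left),
                             half_wave_rectify(right), max_lag=8)
sharpened = decompress(cc, power=10)   # exponent is an assumed demo value
estimated_lag = lags[sharpened.index(max(sharpened))]
```

Because the rectified signals are non-negative, the cross-correlation never drops to zero, which is why the function looks like a plateau before the exponent is applied; the exponent leaves the peak position unchanged and only suppresses the pedestal around it.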

In the decision device, the average of the interaural cross-correlation (ICC) functions is calculated and normalized to one. Beforehand, the ICC peak in each frequency band is scaled to the sound pressure level measured in this band. The model estimates the sound source's direction of incidence from the position of the peak of the averaged ICC function.
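The decision stage can be sketched in a few lines of Python. The code below is an interpretation, not the authors' code: it assumes that "scaled to the sound pressure level" means a linear per-band weight applied to a peak-normalized ICC function, and all numbers in the demo are made up.

```python
def decision_device(icc_per_band, band_weights, lags_ms):
    """Weight, average, and normalize band-wise ICC functions; return peak ITD."""
    n_lags = len(lags_ms)
    summed = [0.0] * n_lags
    for icc, weight in zip(icc_per_band, band_weights):
        peak = max(icc)
        for i in range(n_lags):
            # Scale each band so that its peak equals the band weight.
            summed[i] += weight * icc[i] / peak
    averaged = [v / len(icc_per_band) for v in summed]
    top = max(averaged)
    normalized = [v / top for v in averaged]       # peak normalized to one
    return lags_ms[averaged.index(top)], normalized

# Made-up two-band example on a coarse ITD grid (in ms).
lags_ms = [-1.0, -0.5, 0.0, 0.5, 1.0]
bands = [[0.1, 0.2, 0.9, 0.4, 0.1],
         [0.2, 0.3, 0.5, 1.0, 0.3]]

itd_a, curve_a = decision_device(bands, [1.0, 0.25], lags_ms)
itd_b, curve_b = decision_device(bands, [0.25, 1.0], lags_ms)
```

In the demo, swapping the band weights moves the estimated ITD from the peak of the first band to the peak of the second, which shows how the level weighting lets the strongest band dominate the estimate.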

3.4. Results

The outputs of the cross-correlation algorithm in different frequency bands are shown in Fig. 9. For both the test stimuli at 31.5-Hz and 125-Hz center frequencies and for each frequency band, we observe a clearly visible shift of the cross-correlation peaks as we move the speaker position from the center speaker to the surround-left speaker via the left speaker (top to bottom row). In the active region, the peaks of the cross-correlation functions for different frequency bands are more or less aligned at the same internal delay (ITD). Naturally, not only does the 31.5-Hz stimulus activate lower frequency bands than the 125-Hz stimulus, but its cross-correlation functions also appear wider, because the width of the cross-correlation peak depends on the wavelength of the analyzed signal.
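The dependence of peak width on wavelength can be made concrete with a short calculation in Python. The sketch compares the period of each center frequency with the width of the ITD axis; the 3-ms window (−1.5 ms to +1.5 ms) is taken from the axis range of the figures and is an assumption of this illustration.

```python
def period_ms(frequency_hz):
    """Period of a sinusoid in milliseconds."""
    return 1000.0 / frequency_hz

ITD_WINDOW_MS = 3.0  # assumed span of the ITD axis, -1.5 ms to +1.5 ms

# Fraction of one signal period that fits into the ITD window.
fractions = {fc: ITD_WINDOW_MS / period_ms(fc) for fc in (31.5, 63.0, 125.0)}

for fc, fraction in fractions.items():
    print(f"{fc:6.1f} Hz: period {period_ms(fc):5.1f} ms, "
          f"window covers {fraction:.2f} periods")
```

At the lowest center frequency the whole ITD window spans less than a tenth of a period, so the raw cross-correlation function changes very little across the window.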

Figure � shows the results of the model simulationafter the cross�correlation functions were summed

AES ���th Convention� San Francisco� CA� USA� ���� October �����

Page � of �

Page 10: =31.5 Hzsymphony.arch.rpi.edu/~braasj/.inside/... · As exp ected the in teraural lev el di erences remained nearly constan t for di eren tsub w ... temp oral smearing if the sub

Braasch et al Modeling Localization of Subwoofer Signals

−1.50 −1.00 −0.50 0.00 0.50 1.00 1.50 23

47

76

105

141

176

Freq. [Hz]

ITD [ms]

−1.50 −1.00 −0.50 0.00 0.50 1.00 1.50 23

47

76

105

141

176

Freq. [Hz]

ITD [ms]

−1.50 −1.00 −0.50 0.00 0.50 1.00 1.50 23

47

76

105

141

176

Freq. [Hz]

ITD [ms]

−1.50 −1.00 −0.50 0.00 0.50 1.00 1.50 23

47

76

105

141

176

Freq. [Hz]

ITD [ms]

−1.50 −1.00 −0.50 0.00 0.50 1.00 1.50 23

47

76

105

141

176

Freq. [Hz]

ITD [ms]

−1.50 −1.00 −0.50 0.00 0.50 1.00 1.50 23

47

76

105

141

176

Freq. [Hz]

ITD [ms]

center

left

surround�

left

center

left

surround�

left

�����Hz center frequency ����Hz center frequency

Fig� �� Outputs of the cross�correlation algorithm for dierent speaker positions in MARLAB �row wise�and stimuli with dierent center frequencies �column wise��

over all frequency bands. According to the implementation of the decision device, the estimated stimulus position corresponds to the maximum of the cross-correlation peak. Since we are not interested in the height of the peak, the peaks of the cross-correlation functions were normalized to one for better readability of the graphs. The figure shows all simulated test conditions and not only the ones previously depicted in Fig. � (from top to bottom: 31.5 Hz, 63 Hz, 125 Hz center frequencies). In each graph, the average cross-correlation function for each speaker position is shown. The legend in the graph for the ���Hz center frequency condition, which is valid for all three frequency conditions, gives the line type used for each speaker location. Again, it is obvious that the peak of the cross-correlation functions widens with decreasing frequency. It might be surprising to some readers that the cross-correlation peak for the center speaker is not always located at �ms ITD (���Hz and ���Hz center frequency condition), and that the peaks for the left and right speakers (respectively surround-left and surround-right speaker) are not equidistant from the position for the center speaker. However, we should consider that room reflections can account for the observed asymmetrical effects.

[Figure: three stacked panels (f_c = 31.5 Hz, 63 Hz, 125 Hz) showing ICC amp. (0.00 to 1.00) versus ITD [ms] (−1.50 to 1.50), with curves for sl, l, c, r, sr.]

Fig. �: Model performance of the cross-correlation model for the lateralization of an octave-band-wide noise burst for different speaker positions in MARLAB (sl=surround left, l=left, c=center, r=right, and sr=surround right, as indicated in the legend) and center frequencies (from top to bottom row: 31.5 Hz, 63 Hz, 125 Hz).
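The decision step described above (take the lag of the cross-correlation maximum as the position estimate, and normalize the peak to one for plotting) can be sketched in a few lines. This is a minimal broadband illustration and not the model used in the paper: the gammatone filter bank and the averaging over frequency bands are omitted. The function name `estimate_itd`, the ±1.5 ms search range (read off the plotted ITD axis), and the sign convention (a positive ITD means the right-ear signal lags) are assumptions made here.

```python
import numpy as np

def estimate_itd(left, right, fs, max_itd_s=1.5e-3):
    """Estimate the ITD as the lag of the interaural cross-correlation maximum.

    Assumed convention: a positive ITD means `right` lags `left`.
    Returns (itd_seconds, lags_seconds, ccf_normalized_to_peak_one).
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = int(round(max_itd_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)

    # np.correlate(a, v, "full")[k + len(v) - 1] = sum_n a[n + k] * v[n]
    full = np.correlate(right, left, mode="full")
    zero = len(left) - 1                      # index of lag 0
    ccf = full[zero - max_lag: zero + max_lag + 1]

    ccf = ccf / np.max(ccf)                   # peak height is not of interest
    itd = lags[np.argmax(ccf)] / fs
    return itd, lags / fs, ccf
```

Restricting the search to a small lag window keeps the estimate inside the physiologically plausible ITD range; at subwoofer frequencies the peak is broad, so a finer lag grid (higher sampling rate or interpolation) may be needed to separate neighbouring speaker positions.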

In order to investigate the effect of reverberation more thoroughly, the simulation was rerun, with the exception that this time the BRIRs from the measurement in the ICC studio were used. The results are shown in Fig. �� (left column). They are plotted in the same way as in the previous figure. Since the ICC studio is more reverberant than MARLAB, greater deviations are to be expected. Most interesting is the fact that in the ���Hz condition the peak position for the center speaker is found to be left of the peak position for the left speaker, and in the �����Hz condition, the peak position for the surround-right speaker is located left of the peak position for the right speaker. The peak positions for the center speaker and the right speaker nearly match for the ���Hz condition.

In the next step, the ICC-studio BRIRs for all measured positions were truncated just before the arrival of the first prominent early reflection. The first prominent early reflection usually arrived on the order of ten milliseconds after the direct sound. The idea was to eliminate the room reflections as well as possible. After truncating the BRIRs, the simulation was rerun, and the results are depicted in the right column of Fig. ��. In comparison to the reverberant condition, the positions of the peaks are now more symmetrically spaced, and the peak positions for the center speaker are closer to midline. In addition, the peak widths decreased in the absence of the reflections, and the peaks moved further outward. A look at the cross-correlation functions that were measured in the single frequency bands reveals that the peaks are more aligned after the reflections have been removed (Fig. �).
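The truncation step can be illustrated as follows. This is a sketch under stated assumptions, not the procedure used for the paper: the onset index of the direct sound (`direct_idx`) is taken as given, the 10 ms window is only the approximate figure quoted above, and the short raised-cosine fade-out is an addition made here to avoid a hard cut. The function name `truncate_brir` is hypothetical.

```python
import numpy as np

def truncate_brir(brir, fs, direct_idx, keep_ms=10.0, fade_ms=1.0):
    """Zero a BRIR after `keep_ms` following the direct sound.

    brir       : array of shape (n_samples,) or (n_channels, n_samples)
    direct_idx : sample index of the direct-sound onset (assumed known)
    fade_ms    : raised-cosine fade-out before the cut (an assumption here)
    """
    brir = np.asarray(brir, dtype=float)
    n = brir.shape[-1]
    end = min(direct_idx + int(round(keep_ms * 1e-3 * fs)), n)
    n_fade = min(int(round(fade_ms * 1e-3 * fs)), end)

    out = np.zeros_like(brir)
    out[..., :end] = brir[..., :end]          # keep direct sound, drop the rest
    if n_fade > 0:
        fade = 0.5 * (1.0 + np.cos(np.linspace(0.0, np.pi, n_fade)))
        out[..., end - n_fade:end] *= fade    # smooth transition to zero
    return out
```

In practice the cut point would be chosen per measurement, just before the first prominent reflection, rather than from a fixed window length.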

Concluding the model analysis, the ILDs for each speaker position, estimated in each frequency band, are shown in Fig. ��. The legend of the graph provides the line type used for each speaker location. The measured ILDs, with maximum magnitude values of approximately two decibels, remain very low in the plotted frequency range, and the curves for the three frontal speakers cross each other several times.
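For reference, a band-wise ILD of this kind can be computed as the level ratio of the two ear signals in decibels. The sketch below assumes the signals have already been band-pass filtered (the paper uses a gammatone filter bank, which is not reproduced here); the function name `ild_db` and the sign convention (positive values mean the left-ear signal is stronger) are assumptions.

```python
import numpy as np

def ild_db(left, right, eps=1e-12):
    """Interaural level difference in dB for one frequency band.

    Assumed convention: positive values mean the left-ear level is higher.
    `eps` guards against division by zero for silent inputs.
    """
    rms_l = np.sqrt(np.mean(np.square(left)))
    rms_r = np.sqrt(np.mean(np.square(right)))
    return 20.0 * np.log10((rms_l + eps) / (rms_r + eps))
```

On this scale, the ILDs of about two decibels reported above correspond to an amplitude ratio of roughly 1.26 between the ears.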

�. DISCUSSION

One of the strongest arguments for the use of only one subwoofer in a standard two- or five-channel loudspeaker set-up is the inability of humans to localize low-frequency sounds, which is often believed to occur in reverberant listening spaces. The outcome of the psychoacoustic experiment gives evidence that this is not the case for the listening space of the experimental set-up. Even though the listeners had difficulties in discriminating between the left and surround-left speakers, and the right and surround-right speakers, they showed no difficulty in determining whether the sound arrived from left, center or right.

The model simulation was established to decide whether the psychoacoustic findings are in line with our present knowledge of how our auditory system operates, or whether our psychoacoustical results cannot be explained theoretically so far. It should be pointed out that the human ability to localize low-frequency sounds is not questioned in general. For example, low-frequency ILDs cause a great lateral displacement of the auditory event. It is rather questioned whether the cues provided in a real environment are sufficient for the auditory system to estimate the position of the sound source. For low frequencies, the ILDs are very small, because the head dimensions here are orders of magnitude smaller than the wavelength of the sound, and head shadowing effects are not effective anymore. For ITDs on the other hand, reverberation can distort the cue, and furthermore, the physically measured phase differences become relatively small. An ITD of �� ms (which is determined by the distance between both eardrums, approximately �� cm, and which is about the limit found in nature) results in phase differences of ������ at 31.5 Hz center frequency, ��� at 63 Hz center frequency, and ��� at 125 Hz center frequency. More importantly, the peak of the cross-correlation function becomes very wide at low frequencies. The model results, however, show that the sounds can still be localized relatively well, which becomes more apparent after the cross-correlation peaks were decompressed to effectively reduce the width of the cross-correlation peaks.

[Figure: two columns of three stacked panels (left column: with reverb; right column: without reverb; rows: f_c = 31.5 Hz, 63 Hz, 125 Hz) showing ICC amp. (0.00 to 1.00) versus ITD [ms] (−1.50 to 1.50), with curves for sl, l, c, r, sr.]

Fig. ��: Model performance of the cross-correlation model for the lateralization of an octave-band-wide noise burst for different speaker positions in ICC-studio (sl=surround left, l=left, c=center, r=right, and sr=surround right, as indicated in the legend) and center frequencies (from top to bottom row: 31.5 Hz, 63 Hz, 125 Hz). The left graph shows the results for the complete impulse responses (including reverberation); in the right graph, the impulse responses were truncated to eliminate early reflections and reverberation.
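The dependence of the interaural phase difference on frequency can be written out explicitly. The relation below is a rough guide only: it assumes a sinusoid at the band's center frequency, whereas the stimuli were octave-band noise bursts. The symbols $f_c$ (center frequency) and $\tau$ (ITD) are introduced here for illustration.

```latex
\Delta\varphi = 360^{\circ} \cdot f_c \cdot \tau
```

For a fixed $\tau$, the phase difference grows linearly with $f_c$, so it roughly doubles from 31.5 Hz to 63 Hz and again from 63 Hz to 125 Hz. This is why the same ITD produces a much smaller physical phase cue in the lowest band.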

In the model simulation, the cross-correlation peak varied with the speaker position in such a way that the lateral displacement of the peak always matched the speaker position tested. The lateral displacement of the speakers from the center listening position (surround left, left, center, right, and surround right) was maintained in the same order by the peak positions of the cross-correlation functions. In particular, the distance between the peaks for the front-left and front-right speakers to the corresponding surround speakers was of the same order as the distance to the peak for the center speaker. This finding suggests that the listeners should be able to distinguish between the left and the surround-left speakers and the right and surround-right speakers, if they can distinguish between the left, center and right speaker. At this point, it can only be speculated why such a response pattern was not observed in most listening conditions. One explanation is that the density of coincidence cells is believed to be larger for small ITDs than for large ITDs [�], and therefore the ability to discriminate between ITDs decreases with the ITD magnitudes. The assumption that the listeners cannot remap the ITDs of such low frequencies to a distinct azimuth value might serve as a second explanation.

[Figure: grid of cross-correlation output panels; x-axis ITD [ms] (−1.50 to 1.50), y-axis Freq. [Hz] (23 to 176); rows: left, surround-left; columns: with reverb, without reverb.]

Fig. �: Outputs of the cross-correlation algorithm for different speaker positions in ICC-studio (row-wise) for octave-band-wide noise bursts at ���Hz center frequency. The left graphs show the results for the complete impulse responses (including reverberation). In the right graphs, the impulse responses were truncated to eliminate early reflections and reverberation.

Another interesting finding should be noted. In the ���Hz center frequency condition of the listening test (Fig. �), three out of four listeners (L�, L�, and L�) showed a better performance in discriminating between the left and surround-left speakers than for the right and surround-right speakers. The findings of the model simulation point in a similar direction. Here the cross-correlation peaks for the left and the surround-left speakers are closer to each other than is the case for the right and the surround-right speakers (Fig. �, bottom). This finding also supports the assumption that the listeners did not discriminate between the surround speakers and their corresponding front speakers on the basis of front-back discrimination but rather by cues regarding the lateral displacement. While ITD cues are very similar for the same left-right position in the frontal and the rear hemisphere, ILD cues and monaural cues typically provide the information for front-back discrimination. Since the stimuli applied were only an octave wide and low in frequency, it is unlikely that monaural cues were utilized by the listeners. Also, the absolute signal level can be excluded as a cue, because a level rove was applied during the experiment. The use of ILD cues is not very likely either. The difference in ILDs between the surround speakers and the corresponding front speakers is on the order of one decibel, which is near the absolute threshold in ILD discrimination tasks.

Regarding the center frequency of the stimuli, the distances between the cross-correlation peaks appear to be larger for the very low frequencies (Fig. �). The explanation why the front-back discrimination performance rather improved with increasing stimulus center frequency, therefore, should be connected to the decreasing peak width, assuming that this leads to a better discriminability of adjacent peak positions.

[Figure: ILD [dB] (−3 to 3) versus frequency [Hz] (50 to 200), with curves for sl, l, c, r, sr.]

Fig. ��: Interaural level differences measured with a binaural model for different loudspeaker positions: surround left (sl), left (l), center (c), right (r), surround right (sr), as indicated in the legend. The ILDs are shown as a function of the center frequency of the frequency bands of the gammatone filter bank.

So far, it was not discussed to what extent the findings of this investigation can be generalized to other spaces. After comparing the data for MARLAB (Fig. �) with the data for ICC studio (Fig. ��, left panel), it can be assumed that the cues provided for the same listening situation are less reliable in the ICC studio. After reducing the room reflections, though, the model performance greatly improved. In the near future, further localization experiments in more reverberant spaces than MARLAB are planned. The results of these listening tests will hopefully show whether the human auditory system can utilize the precedence effect for such low frequencies or whether the performance of the listeners degrades as the reverberant condition of the model simulation in ICC studio would suggest.

It should also be discussed to what extent the binaural cues in the subwoofer frequency range (typically # � Hz) contribute to the position of the auditory event for sounds of broader bandwidths. In the psychoacoustic experiment, the frequency range of the noise bursts was limited to an octave, but in nature most sounds extend to higher frequencies. Research has been previously conducted to determine how the auditory system weights information across frequency when determining the lateral position of auditory cues on an ITD basis. The spectral dominance region was found to be at � Hz, and the information in other frequency bands is weighted less the more its center frequency deviates from this value [�]. In this context it is noteworthy that even though conflicting cues caused by one-subwoofer presentations might not significantly influence the position of the auditory event, the conflicting cues could lead to larger apparent source widths and even split auditory events.

In summary, our findings do not support the general use of one-subwoofer systems in high-end audio applications. The question remains why the investigations of Kügler and Theile [�] and Zacharov et al. [��] came to the opposite conclusion. This discrepancy can be explained by the strong influence of reverberation. The space used in our investigation is likely to have been less reverberant than the spaces used by Kügler and Theile and Zacharov et al. Another difference is that in our investigation the upper stimulus frequency was below the crossover frequency of the subwoofer, while in the other two investigations this was not the case. In the latter case, the binaural cues at higher frequencies might have masked the low-frequency cues. Differences are also found between the three investigations regarding general methodology. Zacharov et al. report on an informal listening test, and the outcome of Kügler and Theile's experiment might have been different if it had been designed as a forced-choice experiment. In the graphs shown by the authors (Figs. �, �, and �), the percentage of audible differences is higher for all tested crossover frequencies than in the control condition, and the recommendation to avoid crossover frequencies above � Hz does not appear to have been made on the basis of a statistical test or another systematic procedure.

�. CONCLUSION

As the results of our investigation show, low-frequency signals are often localizable, and we therefore recommend considering the use of two or more subwoofers or a full-range speaker system. When using two subwoofers in a �-channel surround set-up, it is often advantageous to place the subwoofers at the side (��� and ��) rather than at the front-left and front-right position. In the first case, larger ITDs can be generated and, because of the physical absence of ILDs, cues for front-back discriminations are missing for frequencies below � Hz.

�. ACKNOWLEDGEMENT

This investigation was supported by a Grant of the Government of Québec within the project Real-time Communication of High-resolution Multi-sensory Content via Broadband Networks of the program Valorisation-recherche Québec (VRQ). We would like to thank our anonymous listeners for participating in the listening tests and Durand Begault and his students for providing us with the reverberation time measurements, which were measured in Durand's class at McGill University.

�. REFERENCES

[�] F. A. Bilsen and J. Raatgever, "Spectral dominance in binaural lateralization," Acustica ��, pp. ������ (����).

[�] J. Blauert and W. Cobben, "Some consideration of binaural cross correlation analysis," Acustica ��, pp. ���� � (����).

[�] J. Braasch, "Localization in the presence of a distracter and reverberation in the frontal horizontal plane. II. Model algorithms," ACUSTICA · acta acustica ��, pp. ������� (� ��).

[�] J. Braasch, "Localization in the presence of a distracter and reverberation in the frontal horizontal plane. III. The role of interaural level differences," ACUSTICA · acta acustica ��, pp. ������ (� ��).

[�] H. S. Colburn, "Theory of binaural interaction based on auditory-nerve data. II. Detection of tones in noise," J. Acoust. Soc. Am. ��, pp. ������ (����).

[�] D. Griesinger, "Objective measures of spaciousness and envelopment," in: Proc. Audio Engineering Society, ��th Int. Conf. on Spatial Sound Reproduction, Rovaniemi, Finland (����), pp. ����.

[�] C. Kügler, G. Theile, "Loudspeaker Reproduction: Study on the Subwoofer Concept," presented at the �nd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts) ��, p. ��� (���� May), preprint ����.

[�] W. L. Martens, "Subjective evaluations of auditory spatial imagery associated with decorrelated subwoofer signals," Proc. of the Int. Conf. of Auditory Display, July ��� �, Kyoto, Japan (� ��).

[�] T. Noussaine, "Multiple subwoofers for home theater," presented at the � �rd Convention of the Audio Engineering Society (����), preprint ���� (H-�).

[��] R. D. Patterson, M. H. Allerhand and C. Giguère, "Time-domain modeling of peripheral auditory processing: A modular architecture and software platform," J. Acoust. Soc. Am. ��, pp. ��� ���� (����).

[��] T. Welti, "How many subwoofers are enough," presented at the ��rd Convention of the Audio Engineering Society (� ��), preprint �� �.

[��] F. L. Wightman, D. J. Kistler, "The dominant role of low-frequency interaural time differences in sound localization," J. Acoust. Soc. Am. ��, pp. �������� (����).

[��] N. Zacharov, S. Bech, D. Meares, "The use of subwoofers in the context of surround sound program reproduction," J. Audio Eng. Soc. ��, pp. ����� (����).
