
THE ASSESSMENT OF SPEECH INTELLIGIBILITY IN ROOM ACOUSTICS FOR EFFICIENT APPLICATION IN COMPUTER MODELLING AND IMPROVED ENCLOSED SPACES

Volume 1 of 2: Text

Christos Nestoras

Supervisors: Dr Stephen Dance, Prof Bridget Shield

A thesis submitted in partial fulfilment of the requirements of London South Bank University for the degree of Doctor of Philosophy

September 2009


Preface


Reviewed by:
First supervisor: Dr Stephen Dance, Senior lecturer, London South Bank University
Second supervisor: Prof Bridget Shield, Professor of Acoustics, London South Bank University
External examiner: Prof John Turner, Pro Vice-Chancellor, Portsmouth University
External examiner: Mr Peter Mapp, Principal consultant, Peter Mapp Associates
Internal examiner: Mr Ken Rotter, Senior lecturer, London South Bank University


Preface - Abstract

Abstract

The aim of this study is the development of computer models that are capable of consistently predicting primary and speech intelligibility specific parameters for variable source configurations within lecture rooms. Four main parts can be highlighted in the study: consistently measuring the acoustic environment in rooms using a generic measurement methodology; a series of low level measurements to establish the effect of a continuously reducing S/N on the measurement results; the development and validation of efficient and consistent computer models; and the designation of a new hybrid method for an objective validation of auralization.

Multiple computer simulations have been processed for ten test rooms and validated by using measurements taken for different source types and positions. Two contributions in the model development have been made: the simulation validation/calibration methodology using EDT, and the development of accurate computer models based on a single reference parameter i.e. EDT.

The acoustic characteristics within ten test rooms have been measured and analyzed for four source configurations to determine the level of consistency among the different methodologies. The assessment also provided validation data for the computer modelling purposes. Complementing measurements were taken using an open loop system, to quantify the effectiveness and accuracy compared to traditional closed loop measurement systems.

Finally, an objective validation method was developed to enable an efficient clear-cut assessment of auralization quality, in particular for speech intelligibility parameters. A comparative assessment of auralizations from ten test rooms was consequently undertaken to determine the quality level incorporated. The procedure was complemented by a typical subjective evaluation via listening tests for a broader view of the results.


Preface - Acknowledgements

Acknowledgements

The core financial support for this work was provided by an LSBU research scholarship. The Royal Academy of Engineering is also gratefully acknowledged for the award of a travel grant to promote the dissemination of information and development of research networks, among others.

A number of people have contributed their time and effort over the duration of this project. I would like to thank primarily my supervisor Dr Stephen Dance for his constructive criticism and continuous support throughout this study. His enthusiastic approach has been invaluable. I also thank Prof Bridget Shield for her critical view, particularly in the early stages of the study, and her continuous encouragement. Mr Lars Morset of Morset Sound Development Norway (WinMLS) provided direct support when needed with software licensing issues. Dr Bengt-Inge Dalenbäck of CATT Acoustics Sweden offered his expert knowledge in different simulation aspects.

Broad technical support was provided by the technicians in the Acoustics Group lab. I would like to thank in particular our retired technician and current enterprise consultant, Mr Salih Hassan, for his help at various instances during the acoustic measurement sessions. I also thank Mr Louis Gomez, KTP associate, for the long hours in field tests for the London Underground and the synergy created to facilitate the audition of concept ideas into actual working environments. I want to thank all who offered their time to take part in the subjective testing sessions.

Numerous people, too many to list here, have also offered their moral support, a contribution that I highly value. This project would have been impossible without the support of my family. I want to thank Naoko for her patience and understanding.


Preface – List of tables and figures

LIST OF TABLES

Page

Table 2.1. Matrix used for the determination of MTF in the 14 modulation frequencies and seven octave bands, p. 29
Table 2.2. STI scale and equivalent subjective perception of speech intelligibility (current), p. 32
Table 2.3. STIPA modulation frequencies, p. 34
Table 2.4. Weighting factors adopted by STIPA, p. 34
Table 3.1. Room list for room acoustics measurements, p. 60
Table 3.2. Average BGNL over ten rooms (Leq, 1min), p. 61
Table 3.3. Acoustic parameters measured in ten test rooms and statistical summary, p. 61
Table 3.4. Standard deviation for T30 and EDT among the four source types in Room 1, p. 63
Table 3.5. Standard deviation for T30 and EDT among the four source types in Room 2, p. 63
Table 3.6. Standard deviation for T30 and EDT among the four source types in Room 3, p. 63
Table 3.7. Standard deviation for T30 and EDT among the four source types in Room 4, p. 63
Table 3.8. Standard deviation for T30 and EDT among the four source types in Room 5, p. 63
Table 3.9. Standard deviation for T30 and EDT among the four source types in Room 6, p. 64
Table 3.10. Standard deviation for T30 and EDT among the four source types in Room 7, p. 64
Table 3.11. Standard deviation for T30 and EDT among the four source types in Room 8, p. 64
Table 3.12. Standard deviation for T30 and EDT among the four source types in Room 9, p. 64
Table 3.13. Standard deviation for T30 and EDT among the four source types in Room 10, p. 64
Table 4.1. Sample threshold efficient S/N ratios using a sine sweep in a test room, p. 97
Table 4.2. Comparison of STI for reference and experimental (marginal) conditions, p. 98
Table 4.3. Threshold efficient S/N for the six system configurations derived from marginal T30 data in N11 (reverberation chamber), p. 128
Table 4.4. Threshold efficient S/N for the six system configurations derived from marginal T30 data in Room 10, p. 128
Table 5.1. Example mean values for single omni directional source (prediction against measurement), p. 142
Table 5.2. T30 for actual and predicted conditions (simple) in Room 8, p. 144
Table 5.3. EDT for actual and predicted conditions (simple) in Room 8, p. 144
Table 5.4. C50 for actual and predicted conditions (simple) in Room 8, p. 144
Table 5.5. T30 for actual and predicted conditions (CAD) in Room 8, p. 145
Table 5.6. EDT for actual and predicted conditions (CAD) in Room 8, p. 145
Table 5.7. C50 for actual and predicted conditions (CAD) in Room 8, p. 145
Table 5.8. Example of average error for ‘Simple’ model using an alternative source configuration in Room 8 (sound system, SS4), p. 146
Table 5.9. Example of average error for ‘CAD’ model using an alternative source configuration in Room 8 (sound system, SS4), p. 146
Table 5.10. Comparison of prediction data to room acoustics measurements in Room 1 (omni source), averaged over all receiver positions, p. 150
Table 5.11. Comparison of prediction data to room acoustics measurements in Room 1 (sound system), averaged over all receiver positions, p. 150
Table 5.12. Comparison of prediction data to room acoustics measurements in Room 2 (omni source), averaged over all receiver positions, p. 151
Table 5.13. Comparison of prediction data to room acoustics measurements in Room 2 (sound system), averaged over all receiver positions, p. 151
Table 5.14. Comparison of prediction data to room acoustics measurements in Room 3 (omni source), averaged over all receiver positions, p. 152
Table 5.15. Comparison of prediction data to room acoustics measurements in Room 3 (sound system), averaged over all receiver positions, p. 152
Table 5.16. Comparison of prediction data to room acoustics measurements in Room 4 (omni source), averaged over all receiver positions, p. 153
Table 5.17. Comparison of prediction data to room acoustics measurements in Room 4 (sound system), averaged over all receiver positions, p. 153
Table 5.18. Comparison of prediction data to room acoustics measurements in Room 5 (omni source), averaged over all receiver positions, p. 154
Table 5.19. Comparison of prediction data to room acoustics measurements in Room 5 (sound system), averaged over all receiver positions, p. 154


Table 5.20. Comparison of prediction data to room acoustics measurements in Room 6 (omni source), averaged over all receiver positions, p. 155
Table 5.21. Comparison of prediction data to room acoustics measurements in Room 6 (sound system), averaged over all receiver positions, p. 155
Table 5.22. Comparison of prediction data to room acoustics measurements in Room 7 (omni source), averaged over all receiver positions, p. 156
Table 5.23. Comparison of prediction data to room acoustics measurements in Room 7 (sound system), averaged over all receiver positions, p. 156
Table 5.24. Comparison of prediction data to room acoustics measurements in Room 8 (omni source), averaged over all receiver positions, p. 157
Table 5.25. Comparison of prediction data to room acoustics measurements in Room 8 (sound system), averaged over all receiver positions, p. 157
Table 5.26. Comparison of prediction data to room acoustics measurements in Room 9 (omni source), averaged over all receiver positions, p. 158
Table 5.27. Comparison of prediction data to room acoustics measurements in Room 9 (sound system), averaged over all receiver positions, p. 158
Table 5.28. Comparison of prediction data to room acoustics measurements in Room 10 (omni source), averaged over all receiver positions, p. 159
Table 5.29. Comparison of prediction data to room acoustics measurements in Room 10 (sound system), averaged over all receiver positions, p. 159
Table 6.1. Result variation example between the two auralization validation methods (averaged over ten test rooms for omni source 1), p. 173
Table 6.2. Result variation example between the two auralization validation methods (averaged over ten test rooms for omni source 2), p. 173
Table 6.3. Prediction and auralization data comparison example (Room 2, S1), p. 175
Table 6.4. Prediction and auralization data comparison example (Room 2, SS4), p. 176
Table 6.5. Example of prediction and auralization based acoustic parameter differences for ‘Simple’ model, Single source (S1) in Room 8, p. 178
Table 6.6. Example of prediction and auralization based acoustic parameter differences for ‘CAD’ model, Single source (S1) in Room 8, p. 178
Table 6.7. Example of prediction and auralization based acoustic parameter differences for ‘Simple’ model, Multi source (SS4) in Room 8, p. 179
Table 6.8. Example of prediction and auralization based acoustic parameter differences for ‘CAD’ model, Multi source (SS4) in Room 8, p. 179
Table 7.1. Assessment uncertainty via error margins, calculated by considering the data origin at the assessment stages, p. 190

LIST OF FIGURES

Page

Figure 2.1 Impulse response example, p. 9
Figure 2.2 Delta function pulse in time domain, p. 10
Figure 2.3 Continuous signal as a function of Delta function pulses, p. 11
Figure 2.4 Exponentially swept sine example, a) Input b) System response to input, p. 12
Figure 2.5 STIPA signal sample incorporating a pink spectrum, a) Signal spectrum, b) Signal time history (5sec), c) Sample of typical speech time history (5sec), p. 15
Figure 2.6 Multi sloped sound decay accounting for a 60dB level drop, p. 17
Figure 2.7 Multi sloped sound decay accounting for a 30dB level drop, p. 18
Figure 2.8 Typical audio and modulation spectra for speech, p. 24
Figure 2.9 Input/Output comparison with respect to modulation depth and resulting MTF spectrum, p. 29
Figure 2.10 Effects of binaural hearing, I) ILD, based on level difference (for higher frequencies), II) ITD, based on phase difference (for lower frequencies), p. 35
Figure 2.11 The Common Intelligibility Scale as determined by the IEC, p. 44
Figure 2.12 Ray tracing principle and example sound propagation paths, p. 47
Figure 2.13 Image source principle, p. 48
Figure 2.14 Image source and resultant sound ray examples in a room, p. 50
Figure 3.1 Sound source configurations (I-IV), p. 58
Figure 3.2 Examples of classroom population used in the study, p. 60
Figure 3.3 Source efficiency in terms of C50 - Room 1, p. 66


Figure 3.4 Source efficiency in terms of C50 - Room 2, p. 66
Figure 3.5 Source efficiency in terms of C50 - Room 3, p. 66
Figure 3.6 Source efficiency in terms of C50 - Room 4, p. 67
Figure 3.7 Source efficiency in terms of C50 - Room 5, p. 67
Figure 3.8 Source efficiency in terms of C50 - Room 6, p. 67
Figure 3.9 Source efficiency in terms of C50 - Room 7, p. 68
Figure 3.10 Source efficiency in terms of C50 - Room 8, p. 68
Figure 3.11 Source efficiency in terms of C50 - Room 9, p. 68
Figure 3.12 Source efficiency in terms of C50 - Room 10, p. 69
Figure 3.13 STI for four source configurations in Room 1, I) Primary - II) Post processed, p. 70
Figure 3.14 STI for four source configurations in Room 2, I) Primary - II) Post processed, p. 71
Figure 3.15 STI for four source configurations in Room 3, I) Primary - II) Post processed, p. 71
Figure 3.16 STI for three source configurations in Room 4, I) Primary - II) Post processed, p. 71
Figure 3.17 STI for four source configurations in Room 5, I) Primary - II) Post processed, p. 71
Figure 3.18 STI for four source configurations in Room 6, I) Primary - II) Post processed, p. 72
Figure 3.19 STI for four source configurations in Room 7, I) Primary - II) Post processed, p. 72
Figure 3.20 STI for four source configurations in Room 8, I) Primary - II) Post processed, p. 72
Figure 3.21 STI for four source configurations in Room 9, I) Primary - II) Post processed, p. 72
Figure 3.22 STI for four source configurations in Room 10, I) Primary - II) Post processed, p. 73
Figure 3.23 Relation of Clarity to MTI in ten test rooms, I) C50 to MTI without background noise, II) C80 to MTI without background noise, III) C50 to MTI with background noise, IV) C80 to MTI with background noise, p. 75
Figure 3.24 MTI relation to space reverberance in ten test rooms (no noise), p. 76
Figure 3.25 Relation of EDT to T30 for four source configurations in ten test rooms (S1, S2, SS4loudspeakers, SS2loudspeakers), p. 77
Figure 3.26 Relation of EDT to T30 for four source configurations after excluding Rooms 9-10 (S1, S2, SS4loudspeakers, SS2loudspeakers), p. 77
Figure 3.27 Relation of C50 to EDT in ten test rooms, p. 78
Figure 3.28 Measurement system, I) Closed loop configuration, II) Open loop configuration, p. 79
Figure 3.29 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 1 (Four source configurations), p. 81
Figure 3.30 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 2 (Four source configurations), p. 81
Figure 3.31 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 3 (Four source configurations), p. 83
Figure 3.32 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 4 (Three source configurations), p. 84
Figure 3.33 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 5 (Four source configurations), p. 85
Figure 3.34 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 6 (Four source configurations), p. 86
Figure 3.35 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 7 (Four source configurations), p. 87
Figure 3.36 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 8 (One source configuration), p. 88
Figure 3.37 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 9 (Four source configurations), p. 89
Figure 3.38 Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 10 (Four source configurations), p. 90
Figure 4.1 Six system configurations (I-VI), p. 96
Figure 4.2 Test room (reverberation chamber) schematic with source-receiver positioning, p. 97
Figure 4.3 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 1, Signal from Sound system (no simulated noise), p. 101
Figure 4.4 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 2, Signal from Sound system (no simulated noise), p. 102
Figure 4.5 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 3, SS (signal), SS (noise), p. 103
Figure 4.6 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 4, SS (signal), SS (noise), p. 104
Figure 4.7 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 5, SS (signal), SS (noise), p. 105


Figure 4.8 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 6, SS (signal), SS (noise), p. 106
Figure 4.9 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 7, SS (signal), SS (noise), p. 107
Figure 4.10 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 8, SS (signal), SS (noise), p. 108
Figure 4.11 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 9, SS (signal), SS (noise), p. 109
Figure 4.12 T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 10, SS (signal), SS (noise), p. 110
Figure 4.13 MTI data for Room 1, p. 113
Figure 4.14 MTI data for Room 2, p. 114
Figure 4.15 MTI data for Room 3, p. 115
Figure 4.16 MTI data for Room 4, p. 116
Figure 4.17 MTI data for Room 5, p. 117
Figure 4.18 MTI data for Room 6, p. 118
Figure 4.19 MTI data for Room 7, p. 119
Figure 4.20 MTI data for Room 8, p. 120
Figure 4.21 MTI data for Room 9, p. 121
Figure 4.22 MTI data for Room 10, p. 122
Figure 4.23 STI in Room 1 (no frequency weighting), p. 123
Figure 4.24 STI in Room 2 (no frequency weighting), p. 123
Figure 4.25 STI in Room 3 (no frequency weighting), p. 123
Figure 4.26 STI in Room 4 (no frequency weighting), p. 123
Figure 4.27 STI in Room 5 (no frequency weighting), p. 123
Figure 4.28 STI in Room 6 (no frequency weighting), p. 124
Figure 4.29 STI in Room 7 (no frequency weighting), p. 124
Figure 4.30 STI in Room 8 (no frequency weighting), p. 124
Figure 4.31 STI in Room 9 (no frequency weighting), p. 124
Figure 4.32 STI in Room 10 (no frequency weighting), p. 124
Figure 4.33 Threshold efficient S/N trend in relation to T30, 125-8kHz octave band data in ten test rooms, p. 126
Figure 4.34 Threshold efficient S/N trend in relation to EDT, 125-8kHz octave band data in ten test rooms, p. 126
Figure 4.35 Test rooms used in the assessment of result repeatability, I) N11, II) Room 10, p. 128
Figure 5.1 Example room geometry, I) Simple representation II) Enhanced detail, p. 135
Figure 5.2 Measured directivity response for Yamaha HS50M monitors at 1m in free field conditions (balloon), p. 136
Figure 5.3 Measured directivity response for Yamaha HS50M monitors at 1m in free field conditions (polar), p. 137
Figure 5.4 Top view of test rooms for the validation of T30 calibration methodology, p. 141
Figure 5.5 Example of model detail resolution in Room 8, I) Via coordinate system, II) Via CAD software, p. 143
Figure 5.6 Mean EDT for actual and predicted conditions (simple) in Room 8, p. 144
Figure 5.7 STI comparison for actual and predicted conditions (simple) in Room 8, p. 144
Figure 5.8 Mean EDT for actual and predicted conditions (CAD) in Room 8, p. 145
Figure 5.9 STI comparison for actual and predicted conditions (CAD) in Room 8, p. 145
Figure 5.10 Example STI error in multi source conditions (S4) for ‘Simple’ model in Room 8, p. 147
Figure 5.11 Example STI error in multi source conditions (S4) for ‘CAD’ model in Room 8, p. 147
Figure 5.12 Model detail example, I) Full, II) Optimized, p. 148
Figure 5.13 Geometry representation of Room 1 in simulation software, p. 150
Figure 5.14 Geometry representation of Room 2 in simulation software, p. 151
Figure 5.15 Geometry representation of Room 3 in simulation software, p. 152
Figure 5.16 Geometry representation of Room 4 in simulation software, p. 153
Figure 5.17 Geometry representation of Room 5 in simulation software, p. 154
Figure 5.18 Geometry representation of Room 6 in simulation software, p. 155
Figure 5.19 Geometry representation of Room 7 in simulation software, p. 156
Figure 5.20 Geometry representation of Room 8 in simulation software, p. 157
Figure 5.21 Geometry representation of Room 9 in simulation software, p. 158
Figure 5.22 Geometry representation of Room 10 in simulation software, p. 159
Figure 6.1 Objective validation of auralization schematic, p. 170
Figure 6.2 Binaural recording setup, p. 171
Figure 6.3 Head and torso simulator in measurement position, p. 171
Figure 6.4 Comparison of T30 from prediction output and auralization validation (simple) in Room 8, p. 178
Figure 6.5 Comparison of EDT from prediction output and auralization validation (simple) in Room 8, p. 178


Figure 6.6 Comparison of T30 from prediction output and auralization validation (CAD) in Room 8, p. 178
Figure 6.7 Comparison of EDT from prediction output and auralization validation (CAD) in Room 8, p. 178

Page 10: Final_1]2_ phd

Preface – Abbreviations and definitions

Abbreviations and definitions

Abbreviation

Definition

%Alcons Percentage articulation loss of consonants, see p.22. A parameter for the assessment of speech intelligibility.

AES Audio Engineering Society

AI Articulation index, see p.22. A parameter for the assessment of speech intelligibility.

ANSI American National Standards Institute

BGNL Background noise level

BRIR Binaural room impulse response. See also RIR. BRIR relates to a binaural listener model that involves two impulse responses at a listener position. Binaural denotes stereophonic and equivalent to human hearing audio processing.

BSI British Standards Institution

C Clarity index (dB). C describes the clarity of the signal on propagation, at a receiver position. This energy ratio is defined as the ratio of early to late arriving sound. The index e.g. 50 relates to the time threshold for defining late arriving sound (in milliseconds). A 50ms limit is commonly used for speech intelligibility applications.

CAD Computer aided design. Denotes software implementation for 3D model generation, commonly for architectural design purposes. In the context of the study, 'CAD' also denotes a computer model incorporating a somewhat more detailed definition of room geometry.

CIS Common intelligibility scale. Published by the IEC, CIS relates a number of speech intelligibility measures on a single scale, also including common subjective assessment methods.

CVC Consonant-vowel-consonant. Denotes a sequence of words as arranged by their starting letter. CVC word lists are used in subjective speech intelligibility assessments such as MRT, DRT, PB word lists etc. (see related glossary entries).

D Definition (%). D is an energy ratio (early energy / total energy) relating to the distinctness of sound in a room. The measure defines a condition as a percentage and has been found to have good correlation with speech intelligibility.

DRT Diagnostic rhyme test. DRT is a subjective method for the assessment of speech intelligibility.

EDT Early Decay Time. Reverberation time derived from the first 10 dB of level decay, normalized to a 60dB decay. See p.18.

G Strength (dB). G is a measure relating to the overall energy transferred from the sound source to the receiver after subtracting the influence of the direct field.

HATS Head and torso simulator. HATS is typically used for binaural recordings/measurements to account for the influence of the human body on sound propagation, as would be perceived by a listener.

HRTF Head related transfer function. HRTF relates to a filter characteristic that simulates the influence of the human head on sound propagation when reaching a listener.

I/O Input/output

IACC Interaural cross correlation. IACC is an acoustic parameter relating to spaciousness. It is obtained using a binaural receiver and provides information on the correlation between the signals received at the two ears. See p. 19.

ILD Interaural level difference. ILD is a psychoacoustic effect based on the signal level difference at the ears due to shadowing effects caused by the head. See section 2.5.

ISO International Organization for Standardization.

ITD Interaural time difference. ITD is a psychoacoustic effect based on the phase shift caused by the interaural time delay. It is effectively a function of time delay for sound arriving at the two ears due to the relative position of the source with regard to the listener. See section 2.5.

ITU International Telecommunication Union.

JND Just noticeable difference. JND is used to describe the perception threshold for changes in a particular condition on a subjective basis.

L10 L10 is a statistical measure to describe the sound level exceeded for 10% of the measurement duration. Effectively, L10 is a description of peak noise.

LAeq Leq measured in dBA. See also Leq.

Leq Leq (dB) is the continuous noise level equivalent of a sound event, as an average over a period of time.

m(F) Modulation transfer function. m(F) describes/quantifies a transmission path by the decrease of the modulation depth, via a comparison of the test signal’s modulation index, mi, to the modulation index at the receiver, mo, as a function of modulation frequency. The result relates to a subjective perception of speech intelligibility. See section 2.4.3.

MLS Maximum length sequence. MLS is a type of test signal also used in acoustic measurements, described as a periodic pseudorandom signal. It is considered as largely efficient on a number of aspects relating to acoustic measurements. See section 2.2.1.2.

MRT Modified rhyme test. MRT is a subjective method for the assessment of speech intelligibility.

MTF Modulation transfer function. The abbreviation is normally used to refer to the general MTF theory. See also m(F) and section 2.4.3.

MTI Modulation transfer index. MTI is the average TI per octave band k. A direct averaging of MTI values will result in a basic STI, however a number of corrections are normally applied for a more realistic STI. See TI, STI entries and section 2.4.3.

P.A. Public address system. P.A. refers to a sound system aimed for public address.

PB word lists Phonetically balanced word lists. PB word lists comprise purposely designed word lists forming the basis for a subjective methodology for the assessment of speech intelligibility.

RIR Room impulse response. The impulse response of a system (room) is the mathematical function that describes the output signal when the input is excited by a unit pulse.

RT Reverberation time. RT is defined as the time taken for a sound to decay by 60dB (T60) after the excitation source has ceased.

S/N Signal to noise ratio. S/N is a fundamental parameter in acoustic measurements, directly relating to speech intelligibility.

SII Speech intelligibility index, see p.22. A parameter for the assessment of speech intelligibility.

SIL Speech interference level, see p.23. A parameter for the assessment of speech intelligibility.

SNR See S/N.

SPL Sound pressure level. SPL denotes the rms sound pressure of a signal with reference to the threshold of hearing.

STD See σ.

STI Speech transmission index. STI is one of the primary acoustic parameters for the assessment of speech intelligibility. It is based on the determination of m(F) values at 98 data points. See section 2.4.3.1.

STIPA Speech transmission index for public address systems. STIPA is an STI derivative specifically aimed for assessing speech intelligibility for transmission channels involving a sound system, see section 2.4.3.2. The abbreviation is also used to refer to the test signal purposely developed for use with the method, see section 2.2.1.4.

T30 Reverberation time as defined by a 30dB decay (T30) normalized to a 60dB decay equivalent. See also RT.

Threshold efficient S/N An S/N that is marginally efficient in terms of the signal level required for an accurate measurement. See also S/N.

TI Transmission index. TI is determined from the apparent S/N ratio, specific for octave band k and modulation frequency f. It forms the basis for the derivation of MTI and subsequent STI. See MTI, STI entries and section 2.4.3.


Ts Centre time (ms). Ts is an acoustic parameter defined as the time of the centre of gravity of the squared impulse response, see p.21.

σ Standard deviation. σ is a statistical measure relating to the dispersion of a group of values from the mean.
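
As an illustration of how several of the parameters defined above relate to a single room impulse response, the following minimal sketch derives EDT, T30, C50, D50, Ts, m(F) and a single-band MTI from a synthetic decay. It is not the measurement or simulation chain used in the study. All function and variable names are hypothetical, the impulse response is an idealised noise-free exponential decay rather than measured data, no octave band filtering is applied, T30 is assumed to be fitted between -5 dB and -35 dB of the backward-integrated decay curve, m(F) is assumed to follow from the squared impulse response, and TI is assumed to use the usual ±15 dB limits on the apparent S/N.

```python
import numpy as np

# The 14 STI modulation frequencies in Hz (1/3-octave steps, 0.63 to 12.5 Hz).
STI_MOD_FREQS = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                 3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

def decay_curve_db(h):
    """Backward-integrated energy decay curve, in dB relative to its start."""
    energy = np.cumsum(h[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy[0])

def decay_time(h, fs, start_db, end_db):
    """Line fit between two decay levels, extrapolated to a 60 dB decay (s)."""
    edc = decay_curve_db(h)
    t = np.arange(len(h)) / fs
    fit = (edc <= start_db) & (edc >= end_db)
    slope = np.polyfit(t[fit], edc[fit], 1)[0]  # dB per second, negative
    return -60.0 / slope

def early_late_energy(h, fs, limit_ms):
    """Energy arriving before and after the time limit."""
    n = int(round(fs * limit_ms / 1000.0))
    return np.sum(h[:n] ** 2), np.sum(h[n:] ** 2)

def clarity_db(h, fs, limit_ms=50.0):
    """C: ratio of early to late energy in dB."""
    early, late = early_late_energy(h, fs, limit_ms)
    return 10.0 * np.log10(early / late)

def definition_percent(h, fs, limit_ms=50.0):
    """D: ratio of early to total energy as a percentage."""
    early, late = early_late_energy(h, fs, limit_ms)
    return 100.0 * early / (early + late)

def centre_time_ms(h, fs):
    """Ts: centre of gravity of the squared impulse response, in ms."""
    t = np.arange(len(h)) / fs
    return 1000.0 * np.sum(t * h ** 2) / np.sum(h ** 2)

def mtf(h, fs, mod_freq):
    """m(F) from the squared impulse response (noise-free assumption)."""
    t = np.arange(len(h)) / fs
    e = h ** 2
    return np.abs(np.sum(e * np.exp(-2j * np.pi * mod_freq * t))) / np.sum(e)

def transmission_index(m):
    """TI from the apparent S/N, limited to the range -15 dB to +15 dB."""
    m = np.clip(m, 1e-6, 1.0 - 1e-6)
    snr_apparent = 10.0 * np.log10(m / (1.0 - m))
    return (np.clip(snr_apparent, -15.0, 15.0) + 15.0) / 30.0

# Synthetic test case: ideal exponential decay with a 0.6 s reverberation time.
fs = 8000                                    # sample rate in Hz (assumed)
rt = 0.6                                     # reverberation time in s (assumed)
t = np.arange(int(2.0 * fs)) / fs
h = np.exp(-3.0 * np.log(10.0) * t / rt)     # energy is 60 dB down at t = rt

edt = decay_time(h, fs, 0.0, -10.0)          # first 10 dB of decay
t30 = decay_time(h, fs, -5.0, -35.0)         # 30 dB evaluation range
c50 = clarity_db(h, fs)
d50 = definition_percent(h, fs)
ts = centre_time_ms(h, fs)
m2 = mtf(h, fs, 2.0)
mti = np.mean([transmission_index(mtf(h, fs, f)) for f in STI_MOD_FREQS])

print(f"EDT={edt:.2f} s  T30={t30:.2f} s  C50={c50:.2f} dB  D50={d50:.1f} %")
print(f"Ts={ts:.1f} ms  m(2 Hz)={m2:.3f}  MTI={mti:.3f}")
```

For this single-slope decay EDT and T30 coincide, whereas in a real room with a multi sloped decay the two would generally differ. Repeating the MTI step for each of the seven octave bands would give the 98 m(F) data points on which the full STI is based.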


Contents

CONTENTS

Page

ABSTRACT, p. iii

CHAPTER 1 INTRODUCTION, p. 4
  1.1 Aims, p. 4
  1.2 Outline of thesis, p. 6

CHAPTER 2 LITERATURE REVIEW, p. 8
  2.1 Introduction, p. 8
  2.2 The impulse response theory, p. 8
    2.2.1 Test signals, p. 10
      2.2.1.1 Dirac or Delta function pulse, p. 10
      2.2.1.2 Maximum Length Sequence (MLS), p. 11
      2.2.1.3 Sine sweep (exponential), p. 12
      2.2.1.4 STIPA signal, p. 14
      2.2.1.5 Other signals, p. 15
    2.2.2 Acoustic Parameters, p. 16
      2.2.2.1 Reverberation Times Indexes (RT), p. 16
      2.2.2.2 Spaciousness Parameters, p. 19
      2.2.2.3 Energy Ratios, p. 20
      2.2.2.4 Levels, p. 21
      2.2.2.5 Speech Intelligibility Parameters, p. 22
  2.3 Fundamental attributes of speech, p. 23
  2.4 Speech intelligibility measurement methodologies, p. 24
    2.4.1 Statistical (Direct) Measures of Speech Intelligibility, p. 25
    2.4.2 Objective (Indirect) Measures of Speech Intelligibility, p. 26
    2.4.3 The Modulation Transfer Function (MTF), p. 27
      2.4.3.1 The Speech Transmission Index (STI), p. 30
      2.4.3.2 The STI Public Address (STIPA) method, p. 33
  2.5 Implications of binaural hearing, p. 34
  2.6 Sound fields in enclosed spaces for speech intelligibility, p. 36

2.6.1 Sound fields in natural acoustics ............................................................................................................. 37 2.6.2 Sound system assisted sound fields ......................................................................................................... 38 2.6.3 (University) classroom acoustics............................................................................................................. 40 2.6.4 Relations between measures of speech intelligibility .............................................................................. 42

2.7 Sound field simulation .................................................................................................................................... 44 2.7.1 Geometrical acoustics ............................................................................................................................. 44

2.7.1.1 Ray tracing (stochastic) ................................................................................................................... 46 2.7.1.2 Image source method (deterministic) .............................................................................................. 48 2.7.1.3 Hybrid models (deterministic-stochastic) ........................................................................................ 51

2.7.2 Auralization............................................................................................................................................. 52 2.8 Summary and conclusions............................................................................................................................... 53

CHAPTER 3.................................................................................................................................................................. 56 ROOM ACOUSTICS MEASUREMENTS................................................................................................................. 56

3.1 Introduction..................................................................................................................................................... 56

  3.2 Acoustic conditions for the different source configurations
    3.2.1 Measurement methodology
    3.2.2 Equipment list
    3.2.3 Test rooms
    3.2.4 Results
  3.3 Measurement output accounting for typical speech and BGNL
  3.4 Parameter interrelations
    3.4.1 Clarity (C) energy ratios versus STI
    3.4.2 Room reverberance (EDT, T30) versus STI
    3.4.3 EDT versus T30
    3.4.4 EDT versus C50
  3.5 Comparison of measurements for closed/open loop
    3.5.1 Open loop measurement methodology
    3.5.2 Supplementary equipment list for open loop measurements
    3.5.3 Two system data comparison (Closed-Open loop data)
    3.5.4 Comments on Open loop measurement sessions
  3.6 Conclusions
CHAPTER 4 LOW LEVEL MEASUREMENTS
  4.1 Introduction
  4.2 Measurement methodology
    4.2.1 Threshold efficient signal to noise ratio (S/N) measurements
    4.2.2 Noise source incorporation
  4.3 Initial investigation and screening sessions
  4.4 Low level measurements in ten test rooms
    4.4.1 Measure interrelations and performance
    4.4.2 Correlation of T30 (and EDT) with threshold efficient S/N
    4.4.3 Repeatability of results with/without simulated noise floor
  4.5 Conclusions
CHAPTER 5 COMPUTER MODELLING OF TEST SPACES
  5.1 Introduction
  5.2 Preparation methodology
    5.2.1 Model design
      5.2.1.1 Model detail resolution
      5.2.1.2 Source directivity
      5.2.1.3 Definition of absorption and scattering coefficients
    5.2.2 Model validation/calibration methodology
  5.3 Experimental results
    5.3.1 Basis for model validation/calibration
      5.3.1.1 Test methodology - Room acoustics measurements
      5.3.1.2 Test methodology - T30 calibration
      5.3.1.3 Test methodology - Results via output comparison
      5.3.1.4 Session conclusions
    5.3.2 Model resolution
      5.3.2.1 Assessment preparation and the impact of detail resolution
      5.3.2.2 Assessment result for a single omni directional source
      5.3.2.3 Use of Alternative Source Configurations
      5.3.2.4 Discussion
      5.3.2.6 Session conclusions
  5.4 Prediction results
    5.4.1 Room 1 data
    5.4.2 Room 2 data
    5.4.3 Room 3 data
    5.4.4 Room 4 data
    5.4.5 Room 5 data

    5.4.6 Room 6 data
    5.4.7 Room 7 data
    5.4.8 Room 8 data
    5.4.9 Room 9 data
    5.4.10 Room 10 data
  5.5 Discussion
    5.5.1 Prediction results
    5.5.2 Simulation calibration using reference EDT
    5.5.3 Model design and preparation
      5.5.3.1 Detail resolution requirement
      5.5.3.2 Room acoustics modelling guideline
    5.5.4 Comments on modelling sessions
  5.6 Conclusions
CHAPTER 6 AURALIZATION
  6.1 Introduction
  6.2 Objective validation of auralized responses
    6.2.1 Objective validation of auralized responses using a swept sine
    6.2.2 Evaluation of results
  6.3 Subjective validation of auralized responses
    6.3.1 Recording room responses
    6.3.2 Equipment list
    6.3.3 Comparison of recordings to predicted auralization
  6.4 Auralization study in ten test rooms
    6.4.1 Comparison of objective validation methods
    6.4.2 Objective assessment of auralizations
    6.4.3 Subjective assessment of auralizations
  6.5 Auralization accuracy and relation to model detail
    6.5.1 Objective assessment of convolution quality from ‘simple’ and ‘CAD’ models
    6.5.2 Assessment of auralization realism by subjective means
  6.6 Conclusions
CHAPTER 7 SUMMARY AND CONCLUSIONS
  7.1 Overview
  7.2 Room acoustics measurement methodology
  7.3 Low level measurements
  7.4 Development of an optimized methodology for improved computer models
  7.5 Auralization accuracy assessment
  7.6 Further work
  7.7 Overall conclusions
REFERENCES

Chapter 1 - Introduction

CHAPTER 1

Introduction

1.1 Aims

The aim of this study is to develop and validate an efficient methodology comprising

acoustic measurements and computer modelling for the prediction of acoustic parameters,

in particular speech intelligibility, in lecture rooms.

The focus of the research is principally on post evaluation of conditions, i.e. improving

existing environments, for a number of different source configurations including

multi-source conditions (sound system). An assessment on this basis depends on the numerical

output of the prediction that will be used to identify existing or potential acoustic

conditions. Prediction consistency is in turn dependent on the accuracy of room acoustics

measurements that are typically used as a reference for performance, enabling calibration

and fine tuning of the simulation. An auralization of an enclosed space provides a means

for a fast subjective evaluation of the room acoustics. Thus precision in the prediction of

the room response is a fundamental requirement.

Room acoustics measurements, being the first stage of an assessment, require a consistent

measurement methodology that is adequately generic to realistically account for the

actual potential conditions within a space e.g. not invalidated by loudspeaker distortion or

low signal to noise ratios (S/N). The measurement results would thus enable their use as a

general indicator of acoustic conditions, as opposed to a measurement outcome that is

invalidated on this basis. Accordingly, it is necessary to identify a reliable methodology

to obtain reference measurement results that accurately represent a space and can thus be

suitably post processed to derive speech intelligibility specific parameters for different

conditions i.e. different speech level, speaker gender and background noise level

(BGNL).

Validated computer models can be used to predict the acoustic environment in

significantly altered conditions e.g. alternative source types and positions. This option

requires a simulation that is capable of retaining consistency with measurements under the

altered conditions; a suitable simulation validation method is essential for this approach

to be effective. The level of detail incorporated in a computer model is an additional

factor in the development process that has a direct effect on the resultant simulation

efficiency e.g. processing speed and prediction accuracy. Commonly, rules of thumb

relating to the required detail resolution concern larger rooms that fundamentally do not

apply in typical university classrooms; consequently, there is no clear guidance for model

construction of such spaces. It is thus necessary to establish the influence of such variants

on the simulation efficiency to identify the most competent approach.

In a similar context, the auralization generated via a computer simulation is typically

assumed to be consistent with the predicted acoustic parameters, accurately representing

a given space. However, an objective validation methodology is evidently required to

enhance confidence in the result, as subjective methodologies involving listening tests are

usually not cost effective and difficult to implement. Subjective testing also cannot give

absolute results. Currently, one such methodology exists, but it is limited to assessing the

accuracy of an intermediate product in the auralization process i.e. the room impulse

response. An improved methodology is required to enable an assessment of the end

product of the process, the auralized material. This would provide enhanced confidence

in the consistency of the auralization in relation to the prediction outcome, and

consequently to the measured parameters as well. The current method is furthermore

limited by the capabilities of the required host measurement software, as only a Dirac

pulse can be used in the necessary deconvolution process. Increased flexibility of an

objective validation method in these terms would be beneficial by enabling a broader

application of the method, leading to more accurate auralization based assessments.

This study addresses the processes, from room acoustics measurements to auralization via

computer simulations, as linked by a prospective speech intelligibility predictive

assessment for improved enclosed spaces.

1.2 Outline of thesis

The work in this study has been divided into seven chapters. Chapter 1 gives a synopsis of the

work undertaken, introducing the primary objectives. Chapter 2 gives a general review of

the literature related to the study. The fundamental concept of the impulse response

theory and measurement techniques is primarily addressed, while the main acoustic

parameters used in the study are presented and defined in this context. The notion of

speech intelligibility and basic psychoacoustic and phonological attributes of speech

signals are given to complement the related measurement techniques. Subjective

measurement methodologies are also briefly outlined. The main current objective

measurement techniques i.e. STI, STIPA are presented in detail, following an

introduction to the underlying modulation transfer function theory (MTF). The

implications of speech intelligibility in the acoustic conditions of enclosed spaces are

addressed in terms of both natural acoustics and sound system assisted sound fields. The

interrelation between measures relating to speech intelligibility is reviewed in the context

of classroom acoustics. The chapter concludes with an introduction to computer

modelling in terms of ray tracing, image source methods and hybrid approaches, as

related to the study.

Chapter 3 introduces the room acoustics measurements in ten test rooms. The primary

objective is to establish the acoustic character of the rooms considered, while EDT, T30,

C and STI are analyzed to determine their interrelationship. Using in turn four source

configurations the effect of the sound source type and position on the derived acoustic

parameters is examined to determine the consistency of measurement data

obtained under these different conditions. Open loop based measurement data is

compared on the same basis against the typical closed loop system equivalent.

In Chapter 4, low level measurements in ten test rooms are examined to determine the

effect of a continuously reducing S/N on the associated measurement. The relation of S/N

to room reverberance is thus examined to discern an efficient approach to obtaining

usable data from low level measurements.

Chapter 5 introduces the development of computer models for ten test rooms. Focus is

primarily on two aspects of the simulation process i.e. the validation/calibration in terms

of an appropriate reference parameter and the required detail resolution in a model. T30

and EDT are examined as possible reference parameters for the validation/calibration

process. An efficient methodology is presented and the influence of the reference

parameter, T30 and EDT, in terms of the resultant simulation accuracy is evaluated.

The required detail resolution is determined with the use of models incorporating

different levels of detail for the same room. Enhanced detail for smaller rooms appears

advantageous in terms of consistency with measurements. However, given the wide

availability of CAD software which can significantly simplify the generation process, the

balance in terms of detail level, processing speed and overall efficiency is often disturbed

by users. Issues relating to the model development time and resultant processing speed

are thus addressed to streamline the process of model development.

Chapter 6 discusses the need for an objective auralization validation process, as opposed

to typical subject based approaches. A new hybrid method is proposed and compared

with what is currently the only available objective practice, to establish the gain offered by the

new approach. Auralizations in ten test rooms are subsequently assessed to examine

consistency with the predictions’ numerical output. The influence of the model detail

resolution is further examined through auralizations. The sessions are complemented by a

subjective assessment via listening tests for a broader view of the results.

Finally, a synopsis of the work achieved in this study and suggestions for further work

are given in Chapter 7.

Chapter 2 - Literature review

CHAPTER 2

Literature review

2.1 Introduction

Speech intelligibility is a complex concept: it is a dynamic process whose key

parameters relate directly to the potential acoustic performance of a room. Given the

scope of the current study, different methodologies for the measurement of acoustic

parameters, and speech intelligibility in particular, are presented and the discussion

expands to include the general framework of the impulse response theory and related

measurement techniques. Acoustic parameters related to the study are briefly reviewed

and the MTF theory, the theoretical basis for key measures of intelligibility, is

introduced, following a synopsis of critical elements of speech sounds. A review and

discussion on classroom acoustics, as well as an introduction to computer modelling and

auralization conclude the chapter.

2.2 The impulse response theory

The impulse response of a system is the mathematical function that describes the output

signal when the input is excited by a unit pulse [1] (Figure 2.1). In acoustical science the

theory is applied in an analogous way by regarding an enclosed space as a filter having an

input that is the test signal source S(t) and an output S’(t), represented by a given receiver

position within the room.

Figure 2.1. Impulse response example

For a system that is linear and time invariant (LTI) the distinct alterations that the

traveling signal undergoes become the means to determine and characterize the acoustic

properties of the space under consideration. S’(t), in this context, is considered as a

function of time consisting of sequential intensity components, given that for a linear system an

infinite number of ideal pulses (see section 2.2.1.1) arriving at different time delays can

be used to represent the decaying signal. A reverse integration technique, originally

introduced by Schroeder in 1965 for RT estimation [2], formed the basis for further

processing methodologies to allow derivation of the majority of acoustic parameters from

an impulse response measurement, as the latter is considered an accurate and complete

description of the acoustic properties of a transmission system [1]. S’(t) can therefore be

calculated by convolving the source signal S(t) with the impulse response g(t), as given by:

$$ S'(t) = \int_{-\infty}^{t} g(t - t')\, S(t')\, dt' \tag{2.1} $$
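
As a minimal illustrative sketch (not part of the measurement software used in this study), the discrete form of Eq 2.1 and the reverse integration of an impulse response can be written in Python as shown below. The function names, the synthetic exponentially decaying impulse response, the 1 kHz sampling rate and the 0.5 s decay time are all assumptions chosen purely for illustration; a measured room response would also contain noise and would need truncation before integration.

```python
import math


def convolve(source, impulse_response):
    """Discrete counterpart of Eq 2.1: out[n] = sum over k of g[k] * S[n - k]."""
    out = [0.0] * (len(source) + len(impulse_response) - 1)
    for n, s in enumerate(source):
        for k, g in enumerate(impulse_response):
            out[n + k] += s * g
    return out


def schroeder_decay_db(impulse_response):
    """Reverse-integrated energy decay curve, in dB relative to the total energy."""
    remaining = []
    total = 0.0
    for g in reversed(impulse_response):
        total += g * g          # accumulate squared pressure from the tail backwards
        remaining.append(total)
    remaining.reverse()
    return [10.0 * math.log10(r / remaining[0]) for r in remaining]


def decay_time(decay_db, fs, start_db=-5.0, stop_db=-35.0):
    """Extrapolate the time for a 60 dB decay from the start_db..stop_db range."""
    i_start = next(i for i, v in enumerate(decay_db) if v <= start_db)
    i_stop = next(i for i, v in enumerate(decay_db) if v <= stop_db)
    slope = (decay_db[i_stop] - decay_db[i_start]) * fs / (i_stop - i_start)  # dB/s
    return -60.0 / slope


# Hypothetical example: an ideal exponential decay reaching -60 dB after 0.5 s.
fs = 1000        # assumed sampling rate in Hz
rt_true = 0.5    # assumed decay time in seconds
g = [math.exp(-math.log(1000.0) * n / (fs * rt_true)) for n in range(fs)]

received = convolve([1.0, 0.0, 0.5], g)   # a unit pulse followed by a weaker repeat
rt_estimate = decay_time(schroeder_decay_db(g), fs)
```

With this noise-free synthetic response, rt_estimate should recover the assumed decay time closely, since the -5 dB to -35 dB evaluation range lies well above the truncation point of the response.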

Moreover, parameters accounting for the directional response of a room can be based on a

binaural room impulse response (BRIR) measurement using a pair of matching

microphones, or a head and torso simulator (HATS) to more closely approximate a

human listener. The obvious advantage of the second case is that wave effects attributed

to sound transmission around the body of the listener are accounted for without the need

for a correction factor i.e. a head related transfer function (HRTF). This is particularly

useful if considering binaural speech intelligibility measurements, see section 2.4.3.1.

In the context of the aforementioned, HATS based measurements can also be used for the

purposes of a realistic recording/auralization, see section 6.3.

2.2.1 Test signals

In measuring a room’s impulse response a test signal by definition comprises an impulsive

sound that is reproducible with respect to sound power radiation (directivity and frequency

content). The quality of the test signal in this sense directly relates to

measurement consistency.

2.2.1.1 Dirac or Delta function pulse

The ideal form of a test signal is a Dirac i.e. Delta function pulse signal which,

theoretically, can be defined as an impulsive signal of infinitely short duration, infinitely high

power and unit energy (Figure 2.2, Eq 2.2). On this basis, an infinitely broad spectrum is

also implied.

$$ \int_{-\infty}^{\infty} \delta(t)\, dt = 1 \tag{2.2} $$

Figure 2.2. Delta function pulse in time domain

where δ(t) = 0 for t ≠ 0 and δ(t) tends to infinity at t = 0.
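
The broad, flat spectrum implied by the Delta function can be illustrated with its discrete counterpart, the unit sample sequence. The Python sketch below is an illustration under stated assumptions only: the 16-sample length, the 3-sample delay and the helper name dft_magnitudes are arbitrary choices, not taken from this study, and a sampled sequence can only approximate the ideal continuous pulse.

```python
import cmath


def dft_magnitudes(x):
    """Magnitude of the discrete Fourier transform, evaluated directly."""
    n = len(x)
    return [
        abs(sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n)))
        for m in range(n)
    ]


# Discrete stand-ins for the Delta function: unit area, a single non-zero sample.
pulse = [1.0] + [0.0] * 15
delayed_pulse = [0.0] * 3 + [1.0] + [0.0] * 12

pulse_spectrum = dft_magnitudes(pulse)
delayed_spectrum = dft_magnitudes(delayed_pulse)
```

Both magnitude spectra are uniform across all frequency bins; delaying the pulse changes only the phase, which is why the arrival time of each reflection in an impulse response does not alter its spectral weighting.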

A loudspeaker, nonetheless, faces physical constraints in reproducing very short

sounds, rendering Delta functions unfeasible as an actual test signal. However,

any signal can be perceived as a close succession of short impulses (Figure 2.3), a

characteristic that is central in methodologies using reverse integration (as discussed in

the preface of section 2.2).

Figure 2.3. Continuous signal represented by a series of Delta function pulses

2.2.1.2 Maximum Length Sequence (MLS)

The Maximum Length Sequence (MLS) is a periodic pseudorandom signal, widely

accepted as a largely efficient test signal on a number of aspects. Comprising effectively

of Delta function pulses, MLS has the desired property that its frequency spectrum over

one period is as flat as an ideal pulse, resulting in a clean output (no residual noise). Fast

Hadamard transform (FHT) is used to perform the required correlation process, as first

described within the context of room acoustics by Alrutz and Schroeder [3]. The

deterministic nature of MLS enables synchronous averaging thus allowing an increase in

signal to noise ratio (S/N). This can be achieved as, theoretically, exactly the same results

are expected for subsequent MLS periods when the measurement is repeated under the

same conditions. By repetition, the sequence will add up in phase with the previous ones

while background noise that is not correlated for the number of periods considered will

effectively be reduced by 3dB per doubling of the number of averages taken [4]. The S/N

gain in this case is shown by:

$$ 10 \log_{10} N \ \text{dB} \qquad (2.3) $$

where N is the number of averages.
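The gain of Eq. 2.3 can be tabulated for a few numbers of averages to show the 3dB improvement per doubling. This is a minimal sketch; the function name `averaging_gain_db` is an assumption made for illustration.

```python
import math


def averaging_gain_db(n_averages):
    """S/N gain in dB from synchronously averaging n_averages periods (Eq. 2.3)."""
    if n_averages < 1:
        raise ValueError("n_averages must be at least 1")
    return 10.0 * math.log10(n_averages)


# Gain for successive doublings of the number of averages.
gains = {n: round(averaging_gain_db(n), 2) for n in (1, 2, 4, 8, 16)}
```

Each doubling adds approximately 3dB, so sixteen averages yield a gain of roughly 12dB over a single period.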

Similarly, S/N gain will also relate to the order of the MLS sequence as it determines the

length of the signal. Pre-emphasis applied to the signal (in the form of e.g. a pink noise

spectrum) further increases the potential gain and noise rejection capabilities of the

system while it is particularly useful for the common situation where the background

noise spectrum is not flat [5].

Limitations for the technique are found when considering system non-linearities and time

variances. The former can be considered negligible in the context of building acoustics [4]

while more care should be taken to avoid time variances, as measurement errors can be

introduced. A practical aspect of MLS signals however is that absolute level can be easily

determined using a sound level meter (SLM), a characteristic that is essential for

measurements in need of level calibration.

The use of MLS cannot be recommended for deriving a general purpose impulse

response but should rather be considered for specialized tasks such as speech parameters’

measurement on an absolute level basis [6]. Given a level calibrated system nonetheless,

this approach can also be actualized by post-processing of the measurements. The latter

can be obtained using any suitable form of excitation signal thus, again not necessitating

the use of MLS signals in this context.

2.2.1.3 Sine sweep (exponential)

The exponential sine sweep test signal (or log-Time Stretched Pulse) is a sine wave

sweeping exponentially in time over the desired frequency range (Figure 2.4).

Figure 2.4. Exponentially swept sine example, a) Input b) System response to input

The idea of sine wave utilization for impulse response measurement is not new (see

Griesinger [7]). However, the method was refined and further developed to its current

form by Farina [8], who showed that an exponentially (rather than linearly) varying

frequency sweep allows the simultaneous deconvolution of a system’s linear impulse

response and of separate impulse responses for each harmonic distortion order. The method

is based on generating and using an accurate inverse filter f(t) that is capable of packing

the input signal S(t) into a delayed Dirac delta function δ(t). Deconvolution of the

system’s impulse response, and separate distortion components, can then be performed

by convolving the output signal S’(t) with the inverse filter f(t). Another study relating

harmonic distortion induced errors and signal level has shown that a reduced error in the

impulse response can be expected with the use of exponential sweep, as compared to

linear, consequently implying that a higher than before signal level can be used while

improving confidence in the measurement [9]. Considering, moreover, specific

applications such as recordings for auralization purposes that require in excess of 90dB

S/N [10] the exponentially swept sine would be the only option for accurately obtaining

such results.
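The excitation signal itself can be sketched as follows, using the commonly quoted closed form for a sine whose frequency rises exponentially from a start to a stop frequency. The function names, the 100Hz to 10kHz range and the 48kHz sampling rate are assumptions chosen for illustration; the inverse filter f(t) and the deconvolution step are not shown.

```python
import math


def exponential_sweep(f_start, f_stop, duration, sample_rate):
    """Samples of a sine whose frequency rises exponentially from f_start to f_stop."""
    n_samples = int(round(duration * sample_rate))
    rate = math.log(f_stop / f_start)  # natural log of the frequency ratio
    scale = 2.0 * math.pi * f_start * duration / rate
    return [
        math.sin(scale * (math.exp(rate * n / (sample_rate * duration)) - 1.0))
        for n in range(n_samples)
    ]


def instantaneous_frequency(f_start, f_stop, duration, t):
    """Sweep frequency at time t, i.e. the phase derivative divided by 2*pi."""
    return f_start * (f_stop / f_start) ** (t / duration)


# Illustrative values only: a 1 s sweep from 100 Hz to 10 kHz sampled at 48 kHz.
sweep = exponential_sweep(100.0, 10000.0, 1.0, 48000)
mid_frequency = instantaneous_frequency(100.0, 10000.0, 1.0, 0.5)
```

Half-way through this sweep the frequency is the geometric mean of the two limits, so equal time is spent in each octave.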

Furthermore, time aliasing problems are dealt with by including a segment of silence at

the end of the sweep. Also, as opposed to MLS, if the time window for data

analysis is not made long enough, the (late) decay tail is simply lost during the

deconvolution process rather than folding back to the beginning of the impulse response

and altering its character.

Another advantage in terms of time variance is that a sine sweep can be made longer to

compensate for minor time variances of a system. By avoiding the multiple average

approach the resulting impulse response is not affected (as opposed to MLS) by the time

variation, as only a single measurement is involved. The effective S/N can be

significantly improved in this case as more energy is spread over a longer time. BS EN

ISO 18233: 2006 [11] recommends the use of a single very long sweep to enhance

confidence in the measurement.

It has been experimentally demonstrated that a longer signal duration is superior to the

averaging technique. Nonetheless, new advancements in the sine sweep methodology [12],

although not implying preference, provide a fix to reduce the error to an acceptable level

for cases where averaging cannot be avoided.

Finally, another significant feature is that sine sweeps do not require a tight

synchronization of the sampling clock for the signal generation and recording units. As

such, measurements can be easily performed even if using external sources (i.e. different

than the measurement processing unit). It has been shown that common mismatches

between the sampling clocks of playback and recording devices in such cases do not

result in estimation errors of the system’s impulse response [8].

The assumption that exponential sine sweep based measurements are the most

advantageous choice for the majority of transfer function measurement conditions is

supported in the literature [10].

2.2.1.4 STIPA signal

The STIPA signal was formed for use with the STIPA method, intended for the

prediction of speech intelligibility of sound systems (see section 2.4.3.2). The signal is

spectrally shaped (to approximate a generic speech spectrum) pseudo-random noise

(Figure 2.5), having the distinctive attribute of incorporating the modulation frequencies

(two modulation frequencies per octave, as defined in IEC 60268-16 [13]) that are used in

the MTF theory (see section 2.4.3). This characteristic, while facilitating measurements

without the requirement for ‘source-receiver’ synchronization in the time scale, also

enables a significant simplification of the measurement procedure [14, 15, 16] giving the

signal its main or perhaps the only significant advantage over the use of other test signals.

In contrast, due to the fact that the two modulation frequencies are used simultaneously,

the negative effect of fluctuating or impulsive noise in a given measurement session is

increased [13].

The STIPA signal is less affected, or not affected at all, by certain types of online

signal processing (with some exceptions, e.g. compression) that are typically found in sound

system configurations and that normally affect signals such as MLS and sine sweeps (see Mapp [17]).

However, disabling the particular units during measurements (this not being a

straightforward action on occasions) solves the problem for the latter.

Figure 2.5. STIPA signal sample incorporating a pink spectrum, a) Signal spectrum, b) Signal time history (5sec), c) Sample of typical speech time history (5sec)

As previously noted [18], potential differences exist between STIPA signals compiled by

different manufacturers (e.g. pink or speech shaped spectrum). Mapp performed a series

of measurements comparing a number of STIPA meters and showed that the signal

differences among the manufacturers could to some extent be responsible for

discrepancies in what would be expected to be comparable results [19]. In the same

publication it is pointed out that the problem was acknowledged by the manufacturers,

who subsequently revised their signals, thus eliminating any concerns in this respect.

2.2.1.5 Other signals

Alternative test signals that were widely used prior to the introduction of more elaborate

options typically include random noise, filtered to give a white (flat) or pink (-3 dB per

octave) spectrum; the latter employed to improve the S/N for lower frequency bands. Due

to the randomness characteristic in this case, a measurement of adequate length was

needed so as to reduce residual noise in the impulse response. The methods are however

considered largely outdated and normally not used.

Other types of source can be used in an attempt to approximate an ideal pulse (e.g.

starting pistols, balloons). These commonly do not give the required accuracy with

respect to directivity control and frequency content reproducibility, thus compromising

measurement consistency. From a practical point of view, however, such sources may be

acceptable for screening purposes, as they simplify the measurement procedure to an extent.

It should be noted that different spectra can be applied to particular test signals (e.g. MLS,

Sine sweep etc.) to account for male or female type sources. This is done in an attempt to

more closely approximate the effect of different talkers by increasing the S/N at the

desired frequency ranges for a more efficient measurement methodology. Post processing

of the data, however, can also allow for a more explicit approach if necessary.

2.2.2 Acoustic Parameters

There is a multitude of parameters that can be used to acoustically describe a space.

While most of these parameters can be directly derived from the space’s impulse

response a number of additional measures complement the principal record, considering

also the existence of different categories that have a more specialized focus. The

following paragraphs discuss the parameters that are most relevant in the scope of the

current study while incorporating in addition some epigrammatic historical information

on the process of acoustic measure advancements.

2.2.2.1 Reverberation Time Indexes (RT)

Reverberation time (RT) is defined as the time taken for a sound to decay by 60dB (T60)

after the excitation source has ceased (Figure 2.6). According to classical theory as

proposed by Sabine in 1922, RT relates to the volume and total absorption of the

enclosed space considered, by the well known Sabine’s empirical formula of:

$$ RT_{60} = \frac{0.161\, V}{A} \qquad (2.4) $$

where V is the volume of the enclosure and A is the sum of absorption units derived from

the different surfaces within the room (air absorption is also included when very large

spaces are considered).
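A worked example of Eq. 2.4 is sketched below. The room dimensions and absorption coefficients are hypothetical values chosen only to illustrate the calculation, and air absorption is ignored.

```python
def sabine_rt60(volume_m3, surfaces):
    """Estimate RT60 in seconds from Eq. 2.4.

    surfaces is a list of (area_m2, absorption_coefficient) pairs.
    """
    absorption = sum(area * alpha for area, alpha in surfaces)
    if absorption <= 0:
        raise ValueError("total absorption must be positive")
    return 0.161 * volume_m3 / absorption


# Hypothetical 10 m x 8 m x 3 m room with illustrative absorption coefficients.
room = [
    (80.0, 0.10),   # floor
    (80.0, 0.60),   # ceiling
    (108.0, 0.05),  # walls (36 m perimeter x 3 m height)
]
rt60 = sabine_rt60(240.0, room)
```

With a total absorption of 61.4 units this gives an RT of approximately 0.63s.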

The calculation of RT indexes through classical theory is subject to inherent limitations

in the Sabine formula, which introduce errors for conditions that deviate from the simple case. Thus,

other researchers have attempted to redefine RT to enhance confidence in the results.

Refining specific aspects that are characteristic of the space considered,

such approaches include the Norris-Eyring formula, cited in Skarlatos [20] (for large

spaces having the same absorption on all surfaces), the Fitzroy formula [21] (for spaces

whose surfaces have differing absorption coefficients) and the Sette-Millington

formula, cited in Kang [22] (for the estimation of absorption coefficients that

always result in values less than one, and also for RT estimation in spaces whose surfaces

have differing absorption coefficients). However, none of these early formulas

is able to account for the early reflections and direct sound of a sound field or for

non-diffuse field characteristics.

Going back to the definition of T60, it is often difficult in practice to obtain a decay of this

magnitude due to dynamic range constraints. For this reason, T20 and T30 are commonly

derived (through linear regression on the decay) to quantify RT, as they are the

equivalent decays between -5dB to -25dB and -5dB to -35dB respectively, normalized to

a 60dB decay (Figure 2.7). The first 5dB of the decay are excluded in this case to avoid

the influence of potential strong early reflections and/or direct sound. RT indexes can

provide some information on the diffusivity of a space as similar values among different

indexes would suggest a linear decay and therefore the presence of optimum diffusion.

Figure 2.6. Multi sloped sound decay accounting for a 60dB level drop

Figure 2.7. Multi sloped sound decay accounting for a 30dB level drop

Early Decay Time (EDT) is another measure relating to reverberance, defined in 1970

by Jordan [23] as the equivalent RT measured over the first 10dB of decay (0dB to -10dB).

It is a parameter of particular value with its significance summed in the fact that it

represents the time taken for the early reflections to reach the receiver. It therefore

comprises a more meaningful description of the early part of the sound decay. EDT is

considered an important cue in judging the character (and from a psychoacoustics point

of view, the size) of a room. Its importance on the basis of subjective perception (also of

early energy in general) is widely acknowledged and exploited in the area of speech

intelligibility (see sections 2.4.3 and 3.4.2).

As discussed in sections 2.2 and 2.2.1.1, RT indexes can be derived from the impulse

response through reverse integration. Following from Eq. 2.1, given that the excitation

source ceases at t=0, the sound intensity at a receiver position is given by:

$$ S'(t) = \int_{-\infty}^{0} g(t - t')\, s(t')\, dt' = \int_{t}^{\infty} g(x)\, s(t - x)\, dx = \int_{t}^{\infty} g(x)\, s(x - t)\, dx \qquad (2.5) $$

To obtain the sound pressure level decay range that is necessary for the RT derivation

sound intensity is converted by:

$$ SPL(t) = 10 \log \int_{t}^{\infty} f(x)\, s(x - t)\, dx \qquad (2.6) $$

It follows that the range of reverberation time indices, Tx, and EDT can be calculated

from the decay corresponding to a given receiver position.
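The derivation of EDT, T20 and T30 from a decay can be sketched as below. This is a simplified illustration that backward-integrates the squared impulse response and fits a straight line to the relevant part of the decay. The function names and the synthetic, purely exponential response with a nominal RT of 1s are assumptions, and no noise compensation or octave band filtering is included.

```python
import math


def schroeder_decay_db(impulse_response):
    """Backward-integrate the squared response; return the decay in dB re its start."""
    energy = [x * x for x in impulse_response]
    remaining, total = [], 0.0
    for e in reversed(energy):
        total += e
        remaining.append(total)
    remaining.reverse()
    return [10.0 * math.log10(r / remaining[0]) for r in remaining if r > 0.0]


def decay_time(decay_db, sample_rate, upper_db, lower_db):
    """Fit a line to the decay between upper_db and lower_db, normalized to 60 dB."""
    points = [(n / sample_rate, level) for n, level in enumerate(decay_db)
              if lower_db <= level <= upper_db]
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


# Synthetic response (assumption): amplitude falls by 60 dB in 1 s.
fs = 1000
h = [math.exp(-6.908 * n / fs) for n in range(3 * fs)]
decay = schroeder_decay_db(h)
edt = decay_time(decay, fs, 0.0, -10.0)
t20 = decay_time(decay, fs, -5.0, -25.0)
t30 = decay_time(decay, fs, -5.0, -35.0)
```

For this idealized linear decay all three indexes agree closely; for a measured, multi-sloped decay they would be expected to differ.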

2.2.2.2 Spaciousness Parameters

Spaciousness parameters relate to the subjective spatial impression and are measured from

the differences between the signals at the two ears, thus requiring two impulse responses per

listening position.

The Inter-Aural Cross Correlation (IACC), suggested by Damaske and Ando in 1972 [24], is one of the primary parameters associated with spaciousness (obtained using a

binaural receiver). IACC provides information on the correlation between the signals

received at the two ears, while it has been shown by its authors to relate to the directional

perception of a source (high IACC value at a single delay time) and the general

perception of field diffuseness (low IACC suggests high diffusivity). The measure could

thus be used for a general sound field quality assessment while also potentially assisting

in source positioning and aiming, particularly for multi source conditions to optimize the

subjective perception of sound quality. Early reflections (for spaciousness) or late

reverberant sound (for diffuseness) or both are accounted for, depending on the

integration limits, and the measure can be generically defined in a normalized form as:

$$ \varphi_{rl}(\tau) = \frac{\int_{0}^{t_0} g_r(t)\, g_l(t + \tau)\, dt}{\left[ \int_{0}^{t_0} [g_r(t)]^2\, dt \int_{0}^{t_0} [g_l(t)]^2\, dt \right]^{1/2}} \qquad (2.7) $$

where gr and gl are the impulse responses corresponding to the right and left ear. φrl

denotes the correlation function, the maximum of whose absolute value within the

delay range of ±1ms is called the IACC [1, 24].
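A discrete sketch of the IACC calculation is given below, assuming that the correlation is evaluated for delays within ±1ms and integrated over the full length of the two responses. The function name `iacc` and the synthetic test response are hypothetical and serve only to illustrate the calculation.

```python
import math


def iacc(g_right, g_left, sample_rate, max_lag_s=0.001):
    """Maximum absolute normalized cross-correlation for lags within +/- max_lag_s."""
    n = min(len(g_right), len(g_left))
    norm = math.sqrt(sum(x * x for x in g_right[:n]) * sum(x * x for x in g_left[:n]))
    max_lag = int(round(max_lag_s * sample_rate))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += g_right[i] * g_left[j]
        best = max(best, abs(acc) / norm)
    return best


# Synthetic decaying response (assumption) and a copy delayed by 10 samples.
fs = 48000
h = [math.exp(-n / 200.0) * math.sin(0.3 * n) for n in range(2000)]
h_delayed = [0.0] * 10 + h[:-10]
iacc_identical = iacc(h, h, fs)
iacc_delayed = iacc(h, h_delayed, fs)
```

Identical signals at the two ears give a value of one, and a pure inter-aural delay within the ±1ms range leaves the value close to one, consistent with a clearly localized source.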

Additional spaciousness parameters would include the lateral energy fraction (LF) that is

a ratio of the energy from early lateral reflections over the total energy arriving at the

receiver within the first 80ms. Similar parameters are the early lateral energy fraction

(LEF) and Lateral Fraction Cosine (LFC).

2.2.2.3 Energy Ratios

Energy ratios are closely related to the acoustical character of a space and speech

intelligibility. Their measurement usually requires omni-directional equipment, though

variable directionality is allowed depending on the measurement purpose (e.g. for speech

intelligibility measurement). Energy ratios are derived from the impulse response, g(t), on

the basis of comparing the useful energy (typically for speech, the direct sound and the

first 50 ms of energy, though the time limit is often open for interpretation) to the total or

late arriving energy (arriving after the time threshold defining early energy).

‘Definition’ (D) or ‘Deutlichkeit’, as proposed by Thiele [25], is one of the earliest

attempts to use energy ratios in quantifying the effect of a space on room acoustics, and

relates to the distinctness of sound. The measure is expressed as a percentage; the

higher the value, the better the ‘definition’ of sound, and a good

correlation has been found between intelligibility and D (cited in Kuttruff [1]). With the direct

sound included in both parts of the ratio, D can be defined as:

$$ D = \frac{\int_{0}^{50\,\text{ms}} [g(t)]^2\, dt}{\int_{0}^{\infty} [g(t)]^2\, dt} \cdot 100\% \qquad (2.8) $$

The ‘Clarity index’ (C) or ‘Klarheitsmass’ was introduced by Reichardt et al. [26] to

characterize the transparency/clarity of music performances. However, the measure has

been found [27] to directly relate to speech intelligibility given a lower time limit, 50ms

for C50 (originally 80ms), despite the fact that it cannot account for background noise or

noise masking effects. A number of alternative limits (e.g. 35ms), based on the variation of

the ear’s integration time with signal level or frequency, have been suggested in the

literature (e.g. see Bradley [28]); nonetheless, it is the 50ms threshold that

prevailed as a widely accepted useful energy time threshold for speech applications. The

‘Clarity index’ is defined as in Eq.2.9 (original) and Eq. 2.10 (for speech):

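$$ C = 10 \log_{10} \frac{\int_{0}^{80\,\text{ms}} [g(t)]^2\, dt}{\int_{80\,\text{ms}}^{\infty} [g(t)]^2\, dt} \ \text{dB} \qquad (2.9) $$

$$ C_{50} = 10 \log_{10} \frac{\int_{0}^{50\,\text{ms}} [g(t)]^2\, dt}{\int_{50\,\text{ms}}^{\infty} [g(t)]^2\, dt} \ \text{dB} \qquad (2.10) $$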
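The definition and clarity ratios can be computed from a sampled impulse response as sketched below. The function names and the three-sample toy response are assumptions for illustration; energy arriving exactly at the time limit is counted here as late, which is a choice of this sketch rather than something prescribed by the definitions.

```python
import math


def early_late_energy(impulse_response, sample_rate, limit_s):
    """Split the energy of g(t) at limit_s and return (early, late)."""
    split = int(round(limit_s * sample_rate))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    return early, late


def definition_percent(impulse_response, sample_rate):
    """D of Eq. 2.8: energy in the first 50 ms as a percentage of the total."""
    early, late = early_late_energy(impulse_response, sample_rate, 0.050)
    return 100.0 * early / (early + late)


def clarity_db(impulse_response, sample_rate, limit_s=0.050):
    """C of Eqs. 2.9 and 2.10: early-to-late ratio in dB (limit_s=0.080 for 80 ms)."""
    early, late = early_late_energy(impulse_response, sample_rate, limit_s)
    return 10.0 * math.log10(early / late)


# Toy response (assumption): equal arrivals at 0 ms, 50 ms and 90 ms.
fs = 1000
g = [0.0] * 100
g[0] = g[50] = g[90] = 1.0
d50 = definition_percent(g, fs)
c50 = clarity_db(g, fs)
c80 = clarity_db(g, fs, 0.080)
```

Since both measures use the same energy split, C50 follows from D as 10 log10(D / (100 - D)) when D is expressed as a percentage.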

The ‘Centre time’ (Ts) or ‘Schwerpunktszeit’ was first introduced by Kurer [29] in 1969

and can be defined as the time of the centre of gravity of the squared impulse response.

Given that a decrease in the value of Ts is proportional to the reflection time delay when

compared to direct sound, a degree of (negative) correlation has been shown to exist with

speech intelligibility (cited in Kuttruff [1]).

$$ T_s = \frac{\int_{0}^{\infty} [g(t)]^2\, t\, dt}{\int_{0}^{\infty} [g(t)]^2\, dt} \qquad (2.11) $$
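Eq. 2.11 amounts to an energy-weighted mean arrival time, which can be sketched as follows. The function name `centre_time` and the two-arrival toy response are assumptions made for illustration.

```python
def centre_time(impulse_response, sample_rate):
    """Centre of gravity, in seconds, of the squared impulse response (Eq. 2.11)."""
    energy = [x * x for x in impulse_response]
    total = sum(energy)
    if total == 0.0:
        raise ValueError("impulse response carries no energy")
    return sum((n / sample_rate) * e for n, e in enumerate(energy)) / total


# Toy response (assumption): direct sound at 0 ms and an equal reflection at 100 ms.
fs = 1000
g = [0.0] * 200
g[0] = 1.0
g[100] = 1.0
ts = centre_time(g, fs)
```

With equal energy at 0ms and 100ms the centre time falls midway, at 50ms; delaying or strengthening the reflection would move it later.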

2.2.2.4 Levels

Strength (G) or ‘Strength factor’ is a measure relating to the overall energy transferred

from the source to the receiver after subtracting the direct field influence. This is

generically defined as:

$$ G = 10 \log_{10} \frac{\int_{0}^{\infty} [g(t)]^2\, dt}{\int_{0}^{\infty} [g_A(t)]^2\, dt} \ \text{dB} \qquad (2.12) $$

where gA(t) is the counterpart impulse response measured in an anechoic room at a

distance of 10m.

The Late Lateral Strength (LG) is an alternative measure of the energy transferred to

the receiver after a given time threshold (usually 80ms) and relating to the sense of

listener envelopment. As such, only octave bands up to 1kHz are considered so as to

account for the fact that higher frequencies do not contribute much in listener

envelopment [1]. LG80 can be defined as:

$$ LG_{80} = 10 \log_{10} \frac{\int_{80\,\text{ms}}^{\infty} [g(t)]^2\, dt}{\int_{0}^{\infty} [g_A(t)]^2\, dt} \ \text{dB} \qquad (2.13) $$

2.2.2.5 Speech Intelligibility Parameters

Several parameters have been adopted in the process of evaluating speech intelligibility.

Subjective (direct) methodologies (see section 2.4.1) are considered the most

accurate, although they have their own advantages and disadvantages, while computer based

(indirect) methods offer significant advantages in terms of practicality and simplicity

of measurement. A separate section is dedicated to the more widely used MTF theory

and its STI derivatives (see section 2.4.3) as they comprise the main approach in speech

applications. The following paragraphs present some of the earlier parameters relating to

intelligibility, complementing at this stage the C and D energy ratios as previously

discussed.

The Articulation Index (AI) was based on the work of French and Steinberg at Bell

Laboratories in 1947 [30] and reconsidered by Kryter in 1962 [31]. Commonly referred to

nowadays as the speech intelligibility index (SII), it comprises one of the first attempts

to measure intelligibility [32, 33]. The basis of AI theory suggests that a speech

communication system can be divided into twenty frequency bands (later reduced to five

bands for simplicity), each carrying a different contribution to intelligibility. With the

total contribution being the sum of the individual bands’ contribution, S/N ratios are

derived for each of the latter and weighted to yield a result. AI assessment varies from 0

to 1 with 0.3 or below considered unsatisfactory, 0.3 to 0.5 satisfactory, 0.5 to 0.7 good

and above 0.7 excellent.

The Percentage Articulation Loss of Consonants (%ALcons) was originally

developed from the early findings of Peutz in 1971 [34]. Having determined that

intelligibility was proportional to RT, room volume and the distance between the talker

and listener, Peutz concluded that it was the loss of consonants that mostly reduced

intelligibility. In this sense, the %ALcons measure refers to the percentage of consonants

that will be lost during the transmission of speech through a system. Optimal values are

highlighted at a 5%ALcons with 10% considered good and more than 15% unacceptable.

The main problem here, however, is that the result is derived only from the 1/3 octave

band centered at 2000Hz. All other frequency bands are simply ignored. Moreover, there

is no relation to parameters other than RT and also the measurement procedure does not

include vowels, something that could cause a systematic error with respect to word tests [35]. For these reasons and despite the fact that the method has gained popularity over

earlier years, it does not generally comprise a reliable assessment of intelligibility for the

majority of potential conditions.

The Speech Interference Level (SIL) is a simple method for predicting or assessing

speech intelligibility in cases of direct communication in noisy environments [36]. In

obtaining a result, the vocal effort of the speaker (accounting for the Lombard effect) and

the distance from the listener are taken into account, though the method is normally not

used where other methods can be applied. SIL is defined as the difference between the

speech level (LS,A,L) and the speech interference level of noise (LSIL), both determined at

the listener’s position for the octave bands in the range of 500Hz to 4kHz. Fair speech

communication is ensured if the difference in levels, SIL= LS,A,L – LSIL, is 10dB. BS

EN ISO 9921: 2003 [36] gives a more detailed description of the method.
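The SIL calculation can be sketched as below. Two assumptions are made here that are not stated in the text above: that LSIL is taken as the arithmetic mean of the noise levels in the four octave bands from 500Hz to 4kHz, and that the 10dB figure acts as a minimum for fair communication. The function names and the example levels are hypothetical.

```python
def speech_interference_level(noise_octave_levels_db):
    """L_SIL, assumed here to be the arithmetic mean of the four octave band levels."""
    if len(noise_octave_levels_db) != 4:
        raise ValueError("expected levels for the 500, 1000, 2000 and 4000 Hz bands")
    return sum(noise_octave_levels_db) / 4.0


def sil(speech_level_db, noise_octave_levels_db):
    """SIL as the speech level minus L_SIL, both at the listener's position."""
    return speech_level_db - speech_interference_level(noise_octave_levels_db)


# Hypothetical levels at the listener: speech at 60 dB, noise per octave band in dB.
noise = [52.0, 50.0, 47.0, 43.0]
margin = sil(60.0, noise)
fair_or_better = margin >= 10.0  # assumption: 10 dB treated as the minimum for 'fair'
```

The vocal effort and distance corrections described in the standard are outside the scope of this sketch.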

2.3 Fundamental attributes of speech

Speech is a continuous waveform having a wide frequency range starting from below

125Hz and extending to above 8kHz. Fundamental frequencies (100Hz - 400Hz) vary

among genders being, on average, about 100Hz for men and 200Hz for women. The

latter are the basis for a series of changing harmonics, called formants, that are created at

integer multiples of the fundamental frequency and are partly the means to determine the

character of an individual’s voice.

Formants create the various vowel sounds and the transitions among them and are

considered as a relatively efficient way to generate sound [37]. Their amplitude can be up

to 27dB higher than consonants which are all impulsive in character. Consonants are

separated in two categories, voiced or unvoiced, the latter being particularly quiet.

Common durations for vowels and consonants are 90ms and 20ms respectively.

Considering that energy in speech is carried mainly by vowels the latter emerge as a key

parameter in potential masking effects over the consonants as intelligibility is subject

mainly to consonant comprehension. It should be noted that the audible spectrum of

speech is not flat, with a common trend suggesting the presence of more energy in lower

frequencies (Figure 2.8).

Sound that is heard in speech is organized into words or smaller elements that form the

latter, phonemes. Common rates (frequencies) of phonemic production are much lower

than the audible frequency range, corresponding to only a few Hz. Audible speech sounds,

organized into packets of phonemic information, can be modeled by amplitude

modulating a wide-band signal. Thus a second spectrum in speech, defining the rate at

which humans utter phonemes, is highlighted as the modulation spectrum. This can be

represented by fourteen frequencies spaced at one-third octave intervals ranging from

0.63Hz to 12.5Hz, with a spectrum that has higher values at the middle frequencies

(Figure 2.8) [13]. Given its impact in the context of speech communication, the modulation

spectrum comprises a key parameter of the MTF theory and resultant STI method.

Figure 2.8. Typical audio and modulation spectra for speech

2.4 Speech intelligibility measurement methodologies

Speech intelligibility is a continuous and dynamic process, the assessment of which can

be based on a multitude of different elements for a wide range of conditions. On this

basis, direct and indirect methods comprise the means to measure this function through a

number of different approaches.

2.4.1 Statistical (Direct) Measures of Speech Intelligibility

Direct measurement methods comprise human-based techniques that make use of the

subjective impression of trained listener subjects. The test procedures involve listening to

sets of words individually, in pairs or within carrier phrases. Stimuli, produced by

trained speakers, comprise specific word lists designed to assess specific aspects of

speech transmission. The outcome of the test is judged on the basis of evaluating speech

comprehension by the listeners. Derivation of any results, however, unavoidably involves

the use of statistical theory.

Both BSI [36] and ANSI [38] describe the use of three methods, though with minor

alterations between the two documents. These methods are summarized and presented in

the following paragraphs:

The Modified Rhyme Test (MRT) [39] is based on a dataset of 50 six-word

(monosyllabic English) lists. The structure of the words is based on a consonant-vowel-

consonant (CVC) sequence although CV or VC structures are also present in some

instances. Words within a list are audibly related primarily in terms of a vowel sound that

is included in every word. Each word is initiated (or terminated) by the same consonant

phoneme or phoneme cluster and terminated (or initiated) using six different phonemic

elements thus, forming a list of six word elements e.g. bat-bad-back-bass-ban-bath.

Listeners are asked to identify the word spoken (out of six), usually within the context of

a carrier phrase. Results are based on the indication of errors in initial and final consonant

sound comprehension and can be analyzed in terms of the words correctly identified,

words not correctly identified and the frequency of repeated errors in specific consonant

comprehension situations.

The Diagnostic Rhyme Test (DRT) (Voiers 1977, cited in Steeneken [40]) is similar to

MRT but uses a list of 192 CVC words (monosyllabic English) arranged in 96 rhyming

pairs. The words themselves audibly differ only in the initial consonant phoneme (e.g.

key-tea) and listeners are asked to identify the word spoken out of a pair, without the use

of carrier phrases.

The Phonetically Balanced (PB) word list test (Egan 1948, cited in Katz [41]) comprises

20 lists having 50 CVC syllables/words each. Here, the initial and final consonant in each

list appear in accordance with their frequency of use in these positions. The words within

a given list are presented in a different (random) order each time the list is used, while

every word appears within the same carrier phrase. The specific method normally

requires more training for listeners and talkers and is particularly sensitive to the S/N,

with small changes in the latter producing larger changes in the intelligibility results.

Subjective methods are by far the most accurate and reliable type of measurement of

speech intelligibility. With the psychoacoustics of speech perception not yet being clearly

defined, the obvious advantage over their objective counterparts can be highlighted.

However, the testing procedure is considered difficult to implement and can result in an

expensive scheme. Consequently, the objective counterpart alternatives are normally

preferred.

2.4.2 Objective (Indirect) Measures of Speech Intelligibility

With computer based measures being an efficient alternative for assessing speech

intelligibility several parameters have been adopted in the process. A number of these

parameters have been discussed in section 2.2.2, most notably the Clarity and

Deutlichkeit indexes, EDT and SII that are considered a valuable indication of a space’s

acoustic characteristics. In the following section however the theoretical background and

the consequent method that is typically employed in a speech intelligibility assessment,

namely the STI method, is presented.

2.4.3 The Modulation Transfer Function (MTF)

Audio signals such as speech or music that propagate within an enclosure will reach a

receiver position at various time delays given their reflected energy in addition to the

direct field. This typically results in a smoothing of their spectrum when compared to the

original signal, while at the same time interfering noise and other effects (e.g. echoes)

will alter the signal in a comparable manner. Comprising essentially a smearing effect (as

described by Houtgast et al. [42, 43]) the amount of reverberation (and unwanted effects)

relates to the extent of smoothness in the original envelope at the input.

A signal that incorporates a simple, well defined envelope can be used to excite the input

thus enabling a subsequent comparison to the envelope of the received signal, or output,

to quantify the smearing effect. However, long-established signals using step function or

pulse envelopes are unable under the MTF approach to produce a result that relates to the

subjective perception of intelligibility for situations where decay curves differ

substantially from strictly exponential. In this sense, an alternative approach for assessing

a room’s objective characteristics was suggested by Houtgast et al. [44], as previously

introduced within the field of optics (see Houtgast et al. [42]); the new approach

reintroduced in this case the use of a sine-wave modulated envelope with the aim to

obtain a closer correlation to subjective impression.

The theory was based on the assumption of an enclosure being a linear system.

Consequently, modulations are defined by the intensity envelope of the signal, given that

on this basis only, reverberation and/or interfering noise will affect solely the depth of

modulation of a sinusoidal signal without changing its shape [13]. With the modulation

being dependent on the modulation frequency applied, see Table 2.1, a transmission path

can be quantified by the decrease of the modulation depth, by way of a comparison of the

test signal’s modulation index, mi, to the modulation index at the receiver, mo as a

function of modulation frequency, Eq 2.14 (Figure 2.9).

$$m(F) = \frac{m_o}{m_i} \tag{2.14}$$

This is defined as the Modulation Transfer Function (MTF) [42] and can be interpreted

in terms of an apparent S/N ratio as in Eq. 2.15, to form the theoretical basis for

derivation of a group of parameters, closely relating to speech intelligibility [42, 43, 13].

$$SNR_{App} = 10 \log_{10}\left(\frac{m(F)}{1 - m(F)}\right)\ \text{dB} \tag{2.15}$$

MTF correctly accounts for the influence of reverberation, distinct echoes and interfering

noise, while effectively forming a low pass filter in its domain.
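As a minimal sketch of Eq. 2.14 and Eq. 2.15, the modulation depth ratio and the resulting apparent S/N ratio can be computed as follows. The function names are illustrative assumptions, and the logarithm is taken as base 10, as is usual for decibel quantities.

```python
import math

def modulation_transfer(m_out, m_in):
    """Eq. 2.14: ratio of the received to the transmitted modulation index."""
    if m_in <= 0.0:
        raise ValueError("input modulation index must be positive")
    return m_out / m_in

def apparent_snr_db(m):
    """Eq. 2.15: apparent S/N ratio in dB, defined for 0 < m < 1."""
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie strictly between 0 and 1")
    return 10.0 * math.log10(m / (1.0 - m))

# A fully modulated test signal (m_in = 1) received with half its depth.
m = modulation_transfer(m_out=0.5, m_in=1.0)
print(apparent_snr_db(m))
```

A modulation reduction to m = 0.5 corresponds to an apparent S/N of 0dB, i.e. the disturbance is treated as if it were noise at the same level as the signal.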

Later publications by the authors of MTF have also described refined versions with

applications to telecommunications systems [45], and an extended theoretical approach for

use of the method as a prediction tool in room acoustics, utilizing information from

design specifications (e.g. Room volume and RT) [46]. Calculation of results in this case

is in line with statistical room acoustics where m(F) can be mathematically derived via

the RT (T) and S/N (comprised of the total reverberant level of speech over the noise

level at the receiver position), as shown in Eq. 2.16.

$$m(F) = \left[1 + \left(\frac{2\pi F T}{13.8}\right)^{2}\right]^{-1/2} \cdot \left[1 + 10^{-(S/N)/10}\right]^{-1} \tag{2.16}$$

where F is the modulation frequency applied. The two terms in the second part of the

equation relate independently to the effect of reverberation and interfering noise.
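The statistical prediction of Eq. 2.16 can be sketched as below, keeping the reverberation and noise terms separate so that their independent contributions are visible. The function name and the example values (T = 0.5 s, S/N = 15dB) are assumptions chosen for illustration only.

```python
import math

def mtf_statistical(F, T, snr_db):
    """Eq. 2.16: m(F) from reverberation time T (s) and S/N (dB)."""
    reverberation_term = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * F * T / 13.8) ** 2)
    noise_term = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverberation_term * noise_term

# Illustrative room: T = 0.5 s and S/N = 15 dB (assumed values).
for F in (0.63, 2.0, 12.5):
    print(F, round(mtf_statistical(F, T=0.5, snr_db=15.0), 3))
```

The noise term is independent of F and scales all modulation frequencies equally, whereas the reverberation term decreases with F, which is the low pass behaviour noted above.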

In close succession, Plomp et al. introduced their approach for the prediction of MTF

using an image source computer model [47], while shortly after Rietschote et al. published

a paper in a comparable context, using ray tracing models [ 48 ] (see section 2.7 for

computer simulation). Schroeder [49] had earlier presented his view on obtaining the MTF

through a Fourier transform of the squared impulse response for linear time-invariant

systems. This formed the basis for use of the MTF in computer modeling and speech

applications in particular, based on its main derivative the STI. This single figure of merit

formed the means to establish the efficiency of the different approaches as described,

when compared to subjective methods of assessment; commonly PB scores have been

used. The STI method is presented in detail in the following section.
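The Fourier transform route attributed to Schroeder [49] can be sketched as follows: the transform of the squared impulse response at modulation frequency F is normalised by the total energy of the response. This is a noise-free illustration in plain Python, not the implementation used in any particular simulation package; the function name and the synthetic exponential decay are assumptions.

```python
import cmath
import math

def mtf_from_impulse_response(h, fs, F):
    """Noise-free m(F) from a sampled pressure impulse response h (rate fs in Hz)."""
    energy = 0.0
    transform = 0j
    for n, sample in enumerate(h):
        e = sample * sample  # squared impulse response
        energy += e
        transform += e * cmath.exp(-2j * math.pi * F * n / fs)
    if energy == 0.0:
        raise ValueError("impulse response carries no energy")
    return abs(transform) / energy

# Synthetic check: ideal exponential decay with T = 1 s (h^2 falls by 60 dB in T).
fs, T = 1000, 1.0
h = [math.exp(-6.9 * n / (fs * T)) for n in range(3 * fs)]
for F in (0.63, 2.0, 12.5):
    print(F, round(mtf_from_impulse_response(h, fs, F), 3))
```

For a strictly exponential decay the result should approach the reverberation term of Eq. 2.16, which gives a simple consistency check between the two approaches.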

Figure 2.9. Input/Output comparison with respect to modulation depth and resulting MTF spectrum

Octave Band (Hz): 125, 250, 500, 1k, 2k, 4k, 8k

Modulation frequency, f (Hz): 0.63, 0.8, 1, 1.25, 1.6, 2, 2.5, 3.15, 4, 5, 6.3, 8, 10, 12.5

(one value of m for each combination of modulation frequency and octave band)

Table 2.1. Matrix used for the determination of MTF in the 14 modulation frequencies and seven octave bands

2.4.3.1 The Speech Transmission Index (STI)

The STI methodology is based on establishing the MTF values, m(F), for 98 data points.

These are obtained from a combination of 14 modulation frequencies in the range of

0.63Hz to 12.5Hz (Table 2.1), using 1/3 octave intervals, and seven octave bands in the

range of 125Hz to 8kHz, giving a typical range for speech as demonstrated in figure 2.8.

On the basis that the phonemes comprising speech are characterized by their distinctive

frequency spectrum, speech clarity requires that the spectral differences of the phonemes

are preserved [13]. The spectral differences of a signal when compared at the input and

output of a transmission system (considered as a filter) are characterized by the product

of the MTF measurement for the seven octave bands. Unwanted effects such as distortion

of the (speech) signal from reverberation or interfering noise result in a reduction of the

sine wave modulated envelope depth; thus, the effect on the signal is reflected in the

change of these fluctuations within the envelope function. As such, the MTF values

enable derivation of a single parameter directly relating to the subjective impression of

speech intelligibility [42, 50, 51], namely the STI.

Following from Eq. 2.15, calculation of the STI involves the determination of

transmission indices (TI) from the apparent S/N ratio, specific for octave band k and

frequency f as shown in Eq. 2.17. As STI is linearly related to S/N ratios in a 30dB range

from -15dB to 15dB [13], all related S/N ratios are adjusted to fit within the latter range

and thus allow for STI values in the range of 0 to 1.

$$TI_{k,f} = \frac{SNR_{k,f} + 15}{30} \tag{2.17}$$

The 14 transmission indices, obtained for each octave band, are then averaged to give the

modulation transfer index (MTIk), specific for octave band k, Eq. 2.18.

$$MTI_k = \frac{1}{14} \sum_{f=1}^{14} TI_{k,f} \tag{2.18}$$
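The steps of Eq. 2.15, 2.17 and 2.18 can be sketched together: each m value is converted to an apparent S/N ratio, limited to the range of -15dB to 15dB, mapped to a transmission index and averaged over the 14 modulation frequencies of one octave band. The function names are illustrative assumptions.

```python
import math

def transmission_index(m):
    """Eq. 2.15 and 2.17: TI from a single m value, with S/N limited to +/-15 dB."""
    if m <= 0.0:
        return 0.0  # no modulation left: S/N at or below -15 dB
    if m >= 1.0:
        return 1.0  # modulation fully preserved: S/N at or above 15 dB
    snr = 10.0 * math.log10(m / (1.0 - m))
    snr = max(-15.0, min(15.0, snr))
    return (snr + 15.0) / 30.0

def modulation_transfer_index(m_values):
    """Eq. 2.18: mean TI over the 14 modulation frequencies of one octave band."""
    if len(m_values) != 14:
        raise ValueError("expected 14 modulation frequencies per octave band")
    return sum(transmission_index(m) for m in m_values) / 14.0

print(modulation_transfer_index([0.5] * 14))
```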

While the final STI value is obtained through a weighted summation of the modulation

transfer indices for the seven octave bands, the revised form of the latter (STIr) [50]

requires a number of corrections in the calculation process to account for auditory

masking and reception threshold. With the methodology to this stage remaining unaltered,

weighting and redundancy corrections are applied as follows:

Initially, the intensities of octave band specific audio masking effects (Iam,k) are

determined using Eq. 2.19

$$I_{am,k} = I_{k-1} \cdot \mathrm{amf} \tag{2.19}$$

where $I_{k-1}$ is the intensity of the masking signal in band k-1, i.e. an octave lower than k, and

amf is the auditory masking factor, related to the absolute reception threshold by means of the

lower limit of the masking noise level within each octave band ($I_{rs,k}$) and dependent on

the masking signal level (as defined in BS EN ISO 60268-16 [13]).

Applying the corrections for auditory masking and reception threshold gives the

corrected version of m, Eq. 2.20

$$m'_{k,f} = m_{k,f} \cdot \frac{I_k}{I_k + I_{am,k} + I_{rs,k}} \tag{2.20}$$

where $m_{k,f}$ is the modulation index for octave band k and frequency f, $m'_{k,f}$ is the

corrected index and $I_k$ is the signal’s intensity in octave band k.

Subsequently, the effective S/N ratio for octave band k and modulation frequency f takes

a revised form to represent the corrections made, as shown in Eq. 2.21.

$$SNR_{k,f} = 10 \log_{10}\left(\frac{m'_{k,f}}{1 - m'_{k,f}}\right)\ \text{dB} \tag{2.21}$$
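A minimal sketch of the corrections in Eq. 2.19 to 2.21 is given below. The intensities are linear quantities, and the numerical values used for the auditory masking factor and the reception threshold are placeholders only; the actual values are tabulated in the standard [13] and are not reproduced here.

```python
import math

def corrected_modulation_index(m, i_k, i_prev, amf, i_rs):
    """Eq. 2.19 and 2.20: apply auditory masking and reception threshold to m."""
    i_am = i_prev * amf                    # Eq. 2.19: masking from the band below
    return m * i_k / (i_k + i_am + i_rs)   # Eq. 2.20

def effective_snr_db(m_corrected):
    """Eq. 2.21: effective S/N ratio in dB from the corrected index."""
    return 10.0 * math.log10(m_corrected / (1.0 - m_corrected))

# Placeholder values (assumptions, not taken from the standard).
m_prime = corrected_modulation_index(m=0.8, i_k=1.0, i_prev=10.0, amf=0.05, i_rs=0.5)
print(round(m_prime, 3), round(effective_snr_db(m_prime), 2))
```

With these placeholder values the masking and threshold intensities together equal the signal intensity, so the modulation index is halved before the transmission indices are recalculated.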

Using the corrected S/N ratios, updated TIk,f and subsequently updated MTIk values can

be derived. Ultimately, the revised speech transmission index, STIr, is obtained through a

weighted summation of the modulation transfer indices for the seven octave bands and

the corresponding redundancy corrections (see BS EN ISO 60268-16 [13]), as shown in Eq.

2.22

$$STI_r = \sum_{k=1}^{7} \alpha_k \cdot MTI_k - \sum_{k=1}^{6} \beta_k \cdot \sqrt{MTI_k \cdot MTI_{k+1}} \tag{2.22}$$

where $\alpha_k$ represents the octave weighting factor and $\beta_k$ represents the redundancy

correction factor related to the contribution of adjacent frequency bands. As the particular

corrections are (speaker) gender dependent, optimal gender related weighting and

redundancy factors along with threshold corrections are given in BS EN ISO 60268-16 [13].

STIr has been found to give marginally better results (4-6% in terms of CVC word scores) [52] when compared to the 1980 version. Results are currently rated as shown in the STI

scale, relating STI scores to the subjective perception of intelligibility (Table 2.2).

Table 2.2. STI scale and equivalent subjective perception of speech intelligibility (current)

STI                                        0 - 0.3   0.31 - 0.45   0.46 - 0.6   0.61 - 0.75   0.76 - 1

Subjective perception of intelligibility   Bad       Poor          Fair         Good          Excellent
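The scale of Table 2.2 can be expressed as a simple lookup, sketched below. The table is given to two decimal places, so the treatment of values falling between categories (e.g. 0.305) is an assumption: each upper limit is taken as inclusive and anything above it falls into the next category.

```python
# Upper limits of the categories in Table 2.2 (inclusive by assumption).
STI_SCALE = [
    (0.30, "Bad"),
    (0.45, "Poor"),
    (0.60, "Fair"),
    (0.75, "Good"),
    (1.00, "Excellent"),
]

def sti_rating(sti):
    """Return the subjective category of Table 2.2 for an STI value."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    for upper_limit, label in STI_SCALE:
        if sti <= upper_limit:
            return label

print(sti_rating(0.60))
```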

What is apparent is that the scale could prove to be insufficient to cover the variety of

conditions in which its use may be constructive and is rather linked to a more general

approach on the topic. As has been argued by a number of researchers [53 , 54 , 55 , 56 ],

alternative or perhaps additional scales might be more appropriate for different cases,

including assessments based on non-native or hearing impaired listeners and cases where

the application scope is rather different. In a typical example, environments concerned

about safety generally require an STI of 0.50 to comply with the safety code for

emergency announcements. In such cases a potential result of 0.60 would clearly qualify

the transmission system for its intended use and render it fundamentally successful,

however, considering the current scale the outcome would simply be rated as ‘fair’.

Moreover, a condition that might be labeled as ‘good’ may in fact be only ‘fair’ for a

hearing impaired or non-native listener (see van Wijngaarden et al. [53, 54]). Considering

analogous examples, alternative rating systems emerge as necessary for a better

interpretation of results in an appropriate context for individual cases.

Taking another approach, STI could also be described as a metric assessing a worst case

scenario (i.e. potentially underestimating intelligibility scores in some cases) on the basis

of the monaural character of measurements that disregard any advantages of binaural

hearing in speech comprehension. This concern is particularly relevant for situations

where speech and interfering noise originate from different directions. To address this

gap a binaural model has been proposed [57, 58] based on a simplified approach of the

binaural hearing mechanisms (see section 2.5). The study confirms the validity of a

‘better ear’ approach from binaural measurement set ups, using the better result (per

octave band) from each ear. The specific method was shown to be comparable to the

binaural model of STI and thus constructive in the interpretation of results for a number

of conditions.
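The ‘better ear’ selection described above can be sketched as a per-band maximum over the results obtained at the two ears. The use of octave band MTI values as the quantity being compared, the function name and the example figures are assumptions for illustration; the cited studies [57, 58] should be consulted for the exact procedure.

```python
def better_ear(left, right):
    """Keep, for each octave band, the better of the left and right ear results."""
    if len(left) != len(right):
        raise ValueError("both ears need the same number of octave bands")
    return [max(l, r) for l, r in zip(left, right)]

# Hypothetical per-band results for seven octave bands (125 Hz to 8 kHz).
left_ear = [0.52, 0.55, 0.61, 0.58, 0.49, 0.47, 0.45]
right_ear = [0.50, 0.57, 0.59, 0.63, 0.55, 0.44, 0.46]
print(better_ear(left_ear, right_ear))
```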

In concluding this section, a number of limitations apply to the STI method due to the

form of analysis (inherent limitations of the MTF theory) and typical test signals that are

used. The method should not be used to assess transmission channels that introduce

frequency shifts or frequency multiplication, or channels that include vocoders [13]. A

speech-based signal is recommended in the literature [59] in special cases, given advances

in developing such stimuli for efficient use, see van Wijngaarden et al. [59] and

Drullman et al. [60].

2.4.3.2 The STI Public Address (STIPA) method

The STIPA method is effectively a simplification of the STI, aimed at the assessment of

Public Address systems. STIPA applies 12 modulation frequencies, two to each of the

octave bands considered, see table 2.3. For male speech the octaves centered at 125Hz

and 250Hz are combined, although this is not always the case for test signals as

compiled by different manufacturers [18], while the former band is simply ignored for

female speech. Frequency weighting relating to talker gender is adopted, see table 2.4.

Each frequency band is modulated by the two corresponding modulation frequencies

simultaneously, thus increasing the negative effect of fluctuating or impulsive noise [13].

However, the method does not require the measurement stimulus to be synchronized

with the receiver (due to the use of spectrally shaped, pseudorandom noise that

incorporates the signal modulations), therefore simplifying the implementation technique [61].

Octave Band (Hz)                   125-250   500    1k    2k     4k    8k

First modulation frequency (Hz)    1         0.63   2     1.25   0.8   2.5

Second modulation frequency (Hz)   5         3.15   10    6.25   4     12.5

Table 2.3. STIPA modulation frequencies

Octave Band (Hz)   125-250   500     1k      2k      4k      8k

Male     α         0.127     0.23    0.233   0.309   0.224   0.173

         β         0.078     0.065   0.011   0.047   0.095   -

Female   α         0.117     0.223   0.216   0.328   0.25    0.194

         β         0.099     0.066   0.062   0.025   0.076   -

Table 2.4. Weighting factors adopted by STIPA
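As a sketch of how the factors in Table 2.4 might enter the weighted summation, the form of Eq. 2.22 is applied below to the six STIPA bands (with 125Hz and 250Hz combined) and the five corresponding redundancy factors. Applying Eq. 2.22 in this reduced form is an assumption made for illustration; the names are hypothetical.

```python
import math

# Weighting (alpha) and redundancy (beta) factors copied from Table 2.4.
STIPA_WEIGHTS = {
    "male": {
        "alpha": [0.127, 0.23, 0.233, 0.309, 0.224, 0.173],
        "beta": [0.078, 0.065, 0.011, 0.047, 0.095],
    },
    "female": {
        "alpha": [0.117, 0.223, 0.216, 0.328, 0.25, 0.194],
        "beta": [0.099, 0.066, 0.062, 0.025, 0.076],
    },
}

def stipa_from_mti(mti, gender="male"):
    """Weighted sum of band MTI values minus redundancy terms of adjacent bands."""
    weights = STIPA_WEIGHTS[gender]
    if len(mti) != len(weights["alpha"]):
        raise ValueError("expected one MTI value per STIPA band (six in total)")
    weighted = sum(a * m for a, m in zip(weights["alpha"], mti))
    redundancy = sum(
        b * math.sqrt(mti[k] * mti[k + 1]) for k, b in enumerate(weights["beta"])
    )
    return weighted - redundancy

print(round(stipa_from_mti([0.5] * 6, "male"), 3))
```

For each gender the α factors sum to the β factors plus one, so a transmission channel with MTI equal to 1 in every band yields an index of 1.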

Limitations of STIPA, as described in BS EN ISO 60268-16 [13], dictate that the method

should not be used for public address systems that introduce vocoders, frequency shifts or

frequency multiplication (though it has been shown by Mapp [17] that measurements using

a STIPA signal are affected less, or not at all, by frequency shifters). Also, its use should be

avoided with systems that show strong nonlinear distortion or when impulsive

background noise is present. It should be noted that STIPA (and STI) cannot correctly

account for poor P.A. frequency response.

2.5 Implications of binaural hearing

The advantages introduced by binaural hearing as a consequence of the human head

shape and distance between the ears are based on a group of functions, including cross

correlation processing of the signal at the two ears (see IACC in section 2.2.2.2), among

others. Two effects in this sense, the Interaural Time Difference (ITD) and Interaural

Level Difference (ILD), are mainly responsible for the widely acknowledged, see

Colburn [ 62 ], localization abilities of a binaural listening system, primarily for the

horizontal plane on this basis.

ITD is a function of time delay, as measured for sound arriving at the two ears, due to the

relative position of the source with regard to the listener (Figure 2.10 (I)). It is based on

the phase shift caused by the interaural time delay, thus the function is employed by the

human perceptual model at lower frequencies. An upper frequency limit relates to the

directional angle of the source, e.g. 743Hz for a 90˚ angle. Above this frequency the ear

simply cannot resolve the phase differences, while the limit increases with more acute

source-listening angles [ 63 ]. ILD is based on the level difference at the ears due to

shadowing effects caused by the head (Figure 2.10 (II)). Given that objects whose size is

small compared to the wavelength will appear acoustically invisible, the function is

effectively introduced into human perception over a threshold frequency value, relating to

the size of the listener’s head.

Figure 2.10. Effects of binaural hearing, I) ITD, based on phase difference (for lower frequencies), II) ILD, based on level difference (for higher frequencies)

Bronkhorst and Plomp [ 64 , 65 ] studied the advantages of binaural hearing in noisy

environments (in terms of speech intelligibility) considering ITD and ILD, for normal

and hearing impaired listeners. Their results confirmed the advantage of a binaural

system (over monaural) for both categories of listeners, showing however a reduced

effect for hearing impaired listeners. It was also shown that the advantage of normal

hearing subjects over the hearing impaired was mainly based on ILD, the ITD benefit on

intelligibility being equal for both categories. Hawley et al. [66] support the advantage of

binaural hearing for speech intelligibility, estimating it at a value of up to 7dB in

terms of speech reception threshold level for two or three interferers (speech or time-

reversed speech). An advantage of up to 4dB was found when using a single interferer of

a number of different types.

It is evident that the contribution of a binaural system on speech intelligibility should be

considered, due to a potentially significant advantage over a monaural basis. This

realization reflects recent advances in the measurement of STI (i.e. binaural STI),

comprising an attempt to better model a binaural perception and thus obtain a better

prediction of intelligibility in terms of STI (see section 2.4.3.1). In recent studies [62]

moreover, Colburn argued that an additional parallel mechanism in binaural hearing

exists based on the combination of binaural information to extract ‘monaural’

information such as the overall level per frequency band, among others. While the

function of sound perception is generally not yet clearly defined for the wide range of

listening environments, Colburn’s suggestions are supported to an extent by the studies

on the binaural STI model, being potentially an approximation of the parallel mechanism

as described; notably, the better ear approach in this case as described by Wijngaarden [58],

essentially comprising a utilization of ‘monaural’ information from a binaural dataset.

2.6 Sound fields in enclosed spaces for speech intelligibility

Intelligibility can vary significantly even for spaces that are closely related in terms of

architectural distinctiveness. Several factors that have an effect have been identified

through a detailed assessment of the involved space’s architecture (see section 2.6.1).

Conditions, however, become more complex with the introduction of a sound

reinforcement system (see section 2.6.2) as the transmission channel between the source

and receiver is matched with additional components. Subsequently, the parameters

relevant to the resolution of a problem can be separated in two groups, the first one

relating entirely to architectural design and the second addressing the implications

introduced by a public address (P.A.) system. The following sections discuss the

objective reasoning, based on these categories, behind the resultant sound field for natural

acoustics and sound system assisted conditions.

2.6.1 Sound fields in natural acoustics

Several parameters, involving a space’s architecture, need the attention of the designer

during the commissioning process. Considering intelligibility issues in particular, the

intrusion of extraneous noise that inevitably interferes with speech signals commonly

comprises one of the main problems faced. High background noise level (BGNL) can

generate masking effects in the sound field. The negative contribution has been shown to

be dependent on the direction, with respect to the listener, of both the speech signal and

interfering noise and subsequently their angle separation, see Bronkhorst and Plomp [64,

65]. Low values of the resultant S/N ratio are mainly responsible for impaired acoustic

performance [67].

The significance of BGNL in terms of S/N has been extensively reported in the literature,

particularly for educational spaces that to a large extent share common characteristics in

their design, under the classroom acoustics category (see section 2.6.3). S/N ratios

however are subject to additional parameters since reverberation, high level echoes and

very early or late reflections can also be considered as unwanted elements. It can be

pointed out that reverberant sound usually carries content of lower frequency when

compared to the original signal. As low frequency noise is (also) considered a more

effective masker at high levels or when noise is louder than speech, reverberation time

and energy ratios generally appear as key parameters in describing a space. Nonetheless,

in terms of acoustically small spaces, such as typical classrooms at low frequencies,

conditions are somewhat altered given that wave effects dominate the sound field.

Acoustic resonances are generally separated and reflect more of the room’s individual

acoustics. Sound decay is more profoundly frequency dependent; thus, reverberation time

alone does not comprise in such a case an adequate descriptor of the acoustic conditions [ 68 ] (i.e. additional measures are needed to characterize the room and obtain a better

description of the acoustical environment).

Furthermore, very early reflections can result in blurring speech consequently impairing

subjective perception. Echoes, also responsible for a negative effect on intelligibility, are

attributed to analogous mechanisms. Energy from previously uttered syllables arriving at a

later time when compared to direct sound can mask or obscure the sound of subsequent

syllables; the time delay and level of an echo being key factors in the process. Haas in

earlier studies using a single echo [69] showed that for a time delay of 1-30ms reflections

will pleasantly add up to the sound source and essentially not be perceived as an echo,

while longer delays will disturb the subjective perception of speech. However, it was

highlighted that the particular function is a dynamic one, depending on the RT, where a

longer RT will increase the immunity of the system to longer time delays. It is worth

mentioning that typically a 50ms time limit is used in speech applications to define useful

energy, while it has been shown that although not necessary, an 80ms limit can also on

occasions give acceptable results [28, 70] i.e. correlate with subjective intelligibility testing.
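The role of the 50ms and 80ms limits can be illustrated with a short sketch of an early-to-late energy ratio computed from a sampled impulse response. It assumes that the direct sound arrives at the first sample and uses a synthetic exponential decay with an assumed RT of 0.5 s; the function name is hypothetical.

```python
import math

def early_to_late_ratio_db(h, fs, limit_ms=50.0):
    """Ratio (dB) of energy arriving before and after limit_ms in impulse response h."""
    split = int(round(fs * limit_ms / 1000.0))
    early = sum(s * s for s in h[:split])
    late = sum(s * s for s in h[split:])
    if early == 0.0 or late == 0.0:
        raise ValueError("impulse response too short or silent for this time limit")
    return 10.0 * math.log10(early / late)

# Synthetic exponential decay, T = 0.5 s, sampled at 2 kHz for 2 s (assumed values).
fs, T = 2000, 0.5
h = [math.exp(-6.9 * n / (fs * T)) for n in range(2 * fs)]
print(round(early_to_late_ratio_db(h, fs, 50.0), 2))
print(round(early_to_late_ratio_db(h, fs, 80.0), 2))
```

Moving the limit from 50ms to 80ms reassigns part of the decay from the late to the early term, so the ratio increases for the same room.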

A number of additional parameters beyond the scope of the research should be

considered including focusing effects from concave planes and audible room modes,

functions that can potentially distort speech signals. Parameters, moreover, relating

purely to the subjective side of the various testing schemes would extend to the listener’s

acuity and talker’s gender, enunciation and rate of speech [71].

2.6.2 Sound system assisted sound fields

A number of additional factors exist, addressing specifically the technical characteristics

of a system and directly affecting intelligibility. The following paragraphs discuss the

topic, while effectively constituting a complement to section 2.6.1.

In assessing a transmission channel, the various components of a sound system such as

microphones, amplifiers, loudspeakers and their related frequency responses form

accordingly the overall system response. This comprises a characteristic that is central in

any sound system design. In psychoacoustic terms, an adequately flat frequency response

is essential [72] as the loudness quality of a signal that is directly affecting intelligibility is

dependent mainly on the technical characteristics of the reproduction system; this assumes

an initially good quality signal. A wide response would also be preferred; however,

speech doesn’t need to sound natural in order to be intelligible. In the case of a high

quality system the ideal response would be in the range of 80Hz to 10kHz for male voices

and accurate consonant reproduction [72]. Lower frequency response, below 80Hz, can be

cut off as it falls out of the speech range, while having by definition the potential to result

in a particularly destructive masking at high levels [72].

The directivity factor of microphones and especially loudspeakers has an important role

in the sense of minimizing meaningless reception (for microphones) and dispersion (for

loudspeakers) of sound thus, allowing more effective control of reverberation. In the case

of loudspeakers, S/N can be optimized by adjusting the radiation angle/direction,

therefore more effectively controlling the direct and near fields. Taking another approach,

the directivity and number of the loudspeakers will directly affect the coverage in terms

of SPL of a given system. A uniform coverage and consequently comparable

performance for a number of positions within a space is an essential characteristic in a

well performing space.

Considering loudspeaker interaction, a phenomenon that can be naturally generated in

rooms is comb filtering. The effect is a common filtering technique used by sound

engineers to attenuate (or emphasize) the multiples of a given fundamental frequency

through phase cancellation, in effect, a delay process controlling the gain at the feedback

path. Its significance lies in the fact that interacting loudspeakers can inadvertently affect

the character of the produced sound field, however the effect is generally unsought for

speech applications [72].
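Comb filtering between two interacting loudspeakers can be sketched by summing a signal with a delayed copy of itself. The sketch assumes two arrivals of equal level at the listener and a free field (no room reflections); the function names and the 1 ms delay, roughly a 0.34 m path difference, are illustrative assumptions.

```python
import cmath
import math

def comb_gain_db(f, delay_s, rel_gain=1.0):
    """Level (dB re one arrival) of a signal summed with its delayed copy."""
    response = 1.0 + rel_gain * cmath.exp(-2j * math.pi * f * delay_s)
    magnitude = abs(response)
    return -math.inf if magnitude == 0.0 else 20.0 * math.log10(magnitude)

def notch_frequencies(delay_s, f_max):
    """Cancellation frequencies (odd multiples of 1 / (2 * delay)) up to f_max."""
    notches = []
    n = 0
    while True:
        f = (2 * n + 1) / (2.0 * delay_s)
        if f > f_max:
            break
        notches.append(f)
        n += 1
    return notches

print(notch_frequencies(0.001, 3000.0))
print(round(comb_gain_db(1000.0, 0.001), 2))
```

Between the notches the two arrivals add in phase, giving peaks of about 6dB for equal levels, which is the uneven response that is generally undesirable for speech.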

Distortion is an additional function affecting signals passing through a transmission

channel as every electrical component has certain restraints and limitations in its domain.

Given that application extends beyond threshold conditions, the signal passing through

the system becomes subject to distortion effects. In the case of audio equipment, restraints

commonly occur for high signal levels where amplifier, microphone or loudspeaker

clipping will result in distorting a signal. The implications of the effect are most obvious

in this case, as additional noise is introduced to the system (i.e. additional masking

potential), therefore compromising speech intelligibility.

Intentional clipping and gain compensation in terms of signal processing i.e. compression,

forms a common technique in sound applications. However, its use on speech signals, as

suggested in the literature [73], is limited to situations with unfavorable S/N ratios (below

0dB) and low fidelity systems.

2.6.3 (University) classroom acoustics

Classroom acoustics is a complex topic with a particular focus on speech intelligibility

issues. Given the objective reasoning behind unintelligible speech in rooms, a significant

amount of research has been undertaken on the topic. Following from previous sections,

Houtgast established in his 1980 study for typical classrooms [67] that a S/N of a

minimum 15dB, with regard to the (speech) signal, is needed for good speech

communication conditions. Bradley [70] and Sato [74] supported the assumption that the

control of BGNL (consequently the resultant S/N) is of more importance than the effects

of room acoustics on signal propagation. The latter study also highlighted the need, given

the Lombard effect, for having very low noise levels in unoccupied conditions so as to

subsequently achieve acceptable levels in occupied conditions, while Hodgson [ 75 ],

though mainly based on RT products, emphasized the need to account for occupancy

effects at the design stage. Bradley et al. [76] in their studies on elementary school

classrooms confirmed the need for 15dB S/N for near ideal conditions while however

also pointing out that an acoustical environment appearing acceptable for adults may in

fact be less adequate for younger listeners. A range of increasing S/N, with decreasing

age of the listener, was found to be necessary to achieve comparable adult listening

results. Statistical analysis of the same data for a range of classrooms revealed a

maximum acceptable BGNL of 34dB, being very close to current British [ 77 ] and

American [ 78 ] recommendations for classroom design. Previous studies on a smaller

sample of classrooms resulted in a 30dB BGNL value, although accounting for hearing

impaired students and all age groups. Bistafa et al. [79] had previously suggested a BGNL

of 25dB and 20dB below speech levels at 1m in front of the talker, respectively for ideal

and acceptable conditions. Based on their statistical analysis from a range of classrooms

the recommended BGNL in combination with their optimum suggested RT (0.4-0.5sec)

would be expected to result in excess of 15dB S/N within typical classrooms.

Hodgson developed in 1999 [80] a long term BGNL (and speech level) prediction method

using an empirical approach. Different methods can potentially be employed in

measuring absolute values. Shield et al. [81], having established in their study of primary

schools that the main source of noise in a classroom is the students themselves, measured

noise levels using LAeq as it was found to best represent classroom activity noise. The

average of multiple LAeq, 2min was believed to give a good indication of the fluctuating

noise in classrooms during the day, thus held as a highly efficient methodology. Mapp [82]

on another approach suggested an octave band L10 for the measurement of BGNL given

that the spectrum of the noise would have a significant effect in assessing the octave

contributions of the speech signal to intelligibility. It follows that new legislation would

need to account in such case for a finer approach in terms of providing more detailed

recommendations for optimum BGNL i.e. recommend values for individual octave bands

rather than as a general level.

In terms of RT, Bradley in 1986 [88] performed a comparative analysis of intelligibility

metrics in classrooms and concluded a 0.4-0.5 second RT as an optimum value. It is

worth noting that additional parameters such as EDT, Ts and early-to-late ratios could be

predicted from the RT within ±1dB for typical classrooms i.e. comparatively small sized

rooms. The same RT range was later supported by Bistafa [79], again for typical

classrooms and in combination with a recommended BGNL, while Sato [74] who

measured occupied rooms found that reverberation values were about 10% greater when

compared to the counterpart unoccupied conditions. Other researchers however [83, 84]

showed that defining an optimum value of RT is a dynamic process highly dependent on

the speaker-listener-noise source relationship. On this basis it was established that an RT

of zero or near zero is optimum when the noise source is further than the speaker and non

zero when the noise source is between the speaker and listener for both normal and

hearing impaired subjects i.e. some reverberation is needed to enhance the speech signal

in adverse conditions.

Bradley in his 2003 study [85] highlighted the importance of early reflections for efficient

speech communication for both normal and hearing impaired listeners, while also

suggesting the Early Reflection Benefit (ERB) measure for the assessment of rooms in

this context. An increase of up to 9dB was found in the effective S/N, an effect to be

considered in any design. A later study by Yang [84] supported the results and suggested

that an enhancement of early reflections within a sound field rather than a stricter control

on RT would be more beneficial for speech intelligibility. On another approach, Sato [74]

suggested that a design should also aim for a reduction of late-arriving reflection levels

for more distant listeners, e.g. at the back of a room, to levels well below the direct sound

and early reflections. This was believed to be a more direct/efficient methodology,

compared to RT control, in ensuring adequate speech communication conditions.

2.6.4 Relations between measures of speech intelligibility

The interrelations among measures for speech intelligibility and their individual relation

to the latter have been studied in the literature on the basis of quantifying their

mathematical relation, to enable inter-comparison and/or ‘translation’ of results, and to

establish the potential correlation to intelligibility i.e. efficiency/quality of measure.

Bradley [27, 28] and Bistafa [86] have studied these interrelations for a range of conditions

and established a mathematical connection in a number of cases. Most notably, the C50

ratio has been found to be linearly related to STI for simulated controlled natural

acoustics conditions, incorporating negligible background noise. An important side

outcome from this work is the derivation of a just noticeable difference (JND) relating to

the subjective perception of level changes to a sound field, subsequently enabling parallel

data processing methodologies that allow for an interpretation and further processing of

results using either of the measures. It should be noted nonetheless that in the C50-STI

relation individual errors of up to 0.1 STI within a given dataset were found to occur for

sound system assisted conditions, as previously established by Mapp [87]. This was

deduced from experiments covering a wide range of acoustic environments.

For small rooms, such as classrooms, using either a 50ms or 80ms time limit for defining

early energy can be expected to result in similar trends when compared to STI [28] thus,

implying that for speech applications an efficient assessment can be made using either

limit. It was established by Bradley [88] that EDT and measures using the 50ms limit (as

opposed to 80ms) could be reasonably accurately (within ±1dB) predicted from the

reverberation times of a small room. The outcome concerning EDT is supported by Mapp [87] under low RT in sound system assisted conditions. The impact of the latter in the

context of the current study is that either RT or EDT could be used to describe a (small)

room; thus, giving a significant advantage and, as such, further confidence in the

computer simulation of an enclosed space, see sections 5.3.1 and 5.5.2.
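As a rough illustration of why early-to-late energy ratios in a small room can follow the reverberation time, the sketch below assumes an ideal exponential (diffuse-field) energy decay with no excess direct sound. Under that assumption the clarity index for any early time limit follows directly from RT. This is a textbook idealization and not the regression reported in [88]; the function name `clarity_from_rt` is hypothetical.

```python
import math

def clarity_from_rt(rt_s, early_limit_s):
    """Clarity index (dB) for an ideal exponential energy decay.

    Assumes the squared pressure decays as exp(-k*t) with k = 6*ln(10)/RT,
    i.e. a 60 dB drop over one reverberation time.
    """
    k = 6.0 * math.log(10.0) / rt_s
    early = 1.0 - math.exp(-k * early_limit_s)  # energy from 0 to the limit
    late = math.exp(-k * early_limit_s)         # energy from the limit onwards
    return 10.0 * math.log10(early / late)

# Illustrative value only: RT = 0.5 s with the 50 ms and 80 ms limits
c50 = clarity_from_rt(0.5, 0.050)
c80 = clarity_from_rt(0.5, 0.080)
print(f"RT = 0.5 s: C50 = {c50:.1f} dB, C80 = {c80:.1f} dB")
```

Measured values would be expected to deviate from this estimate wherever the direct sound or strong early reflections dominate, which is one reason the position of the receiver matters.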

The most widely used metric for speech intelligibility assessment, STI, has been

subjected to strong criticism on a number of occasions and on different grounds [89, 18, 90].

Nonetheless, more commonly the measure has been found to give a high correlation to

subjective testing results; an outcome that has been extensively reported in the literature,

see section 2.4.3.1. Thus, STI is generally acknowledged as an efficient methodology.

As a final point and given the multitude of available intelligibility measures, a direct

comparison of the outcomes from different methods would enable an assessment on

common grounds where necessary. In 1998 the International Electrotechnical Commission

(IEC) published the Common Intelligibility Scale (CIS) [91], relating a number of

measures on a single scale (Figure 2.11) while also including common subjective

methods (see section 2.4.1 for more details on subjective assessment methods).

The result is given as a CIS value, ranging from 0 (unintelligible) to 1 (excellent

intelligibility) and can comprise either a direct statement or form the basis for a later

translation of the outcome of a given methodology (original result), to describe the

assessment product.

Figure 2.11. The Common Intelligibility Scale as determined by the IEC [91]

2.7 Sound field simulation

Sound propagation within enclosed spaces is the main focus in sound field simulation, as

used in this study. Widely used approaches based on the geometrical acoustics

framework are discussed in this section in the context of related simulation models,

applicable for the prediction of room acoustics and subsequent auralization techniques.

2.7.1 Geometrical acoustics

Primary functions in room acoustics such as reverberation or clarity are defined as time

domain effects. As such, the concept of geometrical acoustics appears as a suitable

framework for describing and simulating a sound field. The approach further enables a

relatively simple calculation of sound pressure levels and reverberation times [68], also

appealing to a straightforward visualization of sound propagation.

In geometrical acoustics sound is interpreted as rays or sound particles carrying sound

energy. Travelling in straight lines, rays or particles are reflected off the room boundaries

losing energy from each reflection depending on the acoustic properties of the boundary

hit. The intensity of the ray also decreases by 1/r² for the distance travelled, given that

each ray corresponds to a small portion of a spherical wave (where r is the distance from

the centre of origin). A high number of emission angles must be used to efficiently

simulate a source, the emission pattern relating to the source directivity. The number of

propagation paths hitting a receiver position can be used to determine the energy received

at that point in the timeline, thus facilitating an estimation of the impulse response.

In geometrical acoustics the description of the sound field is reduced to energy, transition

time and direction of rays [92] thus the theory is generally valid if broadband signals are

used and the wavelengths of the frequencies considered are much shorter than the room

dimensions; i.e. rooms are acoustically large. While this prerequisite is normally satisfied

in room acoustics, a more stringent approach would involve the use of a cut off

frequency. The latter, also known as the Schroeder limit [93], can be defined in the metric

system as:

$$ f_s = c\sqrt{\frac{6}{A}} \approx 2000\sqrt{\frac{T}{V}} \quad \text{(Hz)} \qquad (2.23) $$

where c is the speed of sound, A is the absorption units, T is reverberation time and V is

the room volume.
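A minimal numerical sketch of the Schroeder limit is given below, using the commonly quoted form f_s ≈ 2000·sqrt(T/V) and the equivalent absorption form f_s = c·sqrt(6/A). The function names, the speed of sound of 343 m/s, and the use of Sabine's relation A = 0.161·V/T to link the two forms are assumptions made for illustration; the room values are examples only.

```python
import math

def schroeder_frequency(rt_s, volume_m3):
    """Approximate Schroeder limit in Hz: f_s ~ 2000 * sqrt(T / V)."""
    return 2000.0 * math.sqrt(rt_s / volume_m3)

def schroeder_frequency_from_absorption(absorption_m2, c=343.0):
    """Equivalent form f_s = c * sqrt(6 / A), with A in m^2 of absorption."""
    return c * math.sqrt(6.0 / absorption_m2)

# Illustrative values only: a 138 m^3 room with T = 0.41 s
fs = schroeder_frequency(0.41, 138.0)
# Sabine's relation A = 0.161 * V / T links the two forms (assumed here)
fs_alt = schroeder_frequency_from_absorption(0.161 * 138.0 / 0.41)
print(f"f_s = {fs:.0f} Hz (T/V form), {fs_alt:.0f} Hz (absorption form)")
```

The two forms agree to within a few percent, the difference arising from the rounding of the constant to 2000. For a small, fairly dry room of this kind the limit falls near the 125Hz octave band, so results in the lowest bands should be read with the modal behaviour of the room in mind.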

The Schroeder limit relates to the density of eigenfrequencies in a room and enables a

subdivision of the room behaviour into different regions/frequency ranges. At frequencies

exceeding fs the density of eigenvalues is high enough to assume a strongly overlapping

modal distribution. As such, it is a convenient way to characterize a room as

(acoustically) small or large in terms of its prospective frequency response, thus

indirectly evaluating the applicability of geometrical acoustics theory.

Geometrical acoustics based simulation methods can be divided into two categories:

stochastic and deterministic methods, from which the two main room acoustics

simulation algorithms i.e. ray tracing and image sources respectively, originate.

2.7.1.1 Ray tracing (stochastic)

Digital simulation methods of room acoustics were first reported in the 1960s by

Schroeder et al [94]. The first publication detailing a ray tracing technique was published

in 1968 by Krokstad et al [95], followed by a considerable increase in general interest and

consequently a rapid development of the field in later years.

In ray tracing sound is radiated as clusters of energy particles forming rays, in effect

simulating for the purposes of post processing, temporal delta functions. A sound source

(point source) is defined in a three dimensional space emitting an adequately large

number of rays in any direction, depending on the source directivity pattern. The rays,

carrying a certain amount of energy will propagate in the room, bouncing off its

boundaries when hit. On impact, rays are reflected to a different direction depending on

the type of reflection occurring since wave scattering is considered, while each impact

with a boundary reduces the particle energy according to the boundary’s absorption

characteristics. Propagation continues until a ray’s energy is eliminated by surface and air

absorption or until a predefined ray truncation time limit is reached (figure 2.12).

A receiver is defined as a spherical volume in the space for the simple reason that the

probability of a ray intersecting a point receiver, thus triggering detection, is zero (a

surface receiver is also possible). Receiving positions sense particle movement through

the volume while the energy (intensity) carried and time elapsed from emission from the

source are also considered. When ray tracing is complete the impulse response at a

receiver position can be derived by counting of events i.e. rays having intersected the

detection volume, in combination with the timing and energy history information.
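The detection principle described above can be sketched for the simplest possible case: a rectangular room, purely specular reflections, an omnidirectional point source and one absorption coefficient shared by all walls. These simplifications, the function name `trace_rays` and all parameter values are assumptions made for illustration; scattering and air absorption are omitted, and a ray is counted only if its closest approach to the receiver centre lies on the current segment.

```python
import math
import random

def trace_rays(room, src, rcv, n_rays=2000, alpha=0.3, rcv_radius=0.5,
               c=343.0, t_max=0.5, seed=1):
    """Return sorted (arrival_time_s, energy) detections at a spherical receiver.

    room is (Lx, Ly, Lz) of a shoebox with one corner at the origin.
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n_rays):
        # Uniformly distributed emission direction (omnidirectional source)
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        s = math.sqrt(1.0 - z * z)
        d = [s * math.cos(phi), s * math.sin(phi), z]
        p = list(src)
        energy = 1.0 / n_rays      # each ray carries an equal share
        travelled = 0.0
        # Truncate on elapsed time or on negligible remaining energy
        while travelled < c * t_max and energy > 1e-9:
            # Distance along d to the nearest wall
            step = min((room[k] - p[k]) / d[k] if d[k] > 0.0 else -p[k] / d[k]
                       for k in range(3) if d[k] != 0.0)
            # Closest approach of this segment to the receiver centre
            to_rcv = [rcv[k] - p[k] for k in range(3)]
            along = sum(to_rcv[k] * d[k] for k in range(3))
            if 0.0 <= along <= step:
                miss_sq = sum(v * v for v in to_rcv) - along * along
                if miss_sq <= rcv_radius ** 2:
                    hits.append(((travelled + along) / c, energy))
            # Move to the wall, reflect specularly, apply surface absorption
            p = [p[k] + step * d[k] for k in range(3)]
            travelled += step
            for k in range(3):
                if d[k] > 0.0 and p[k] >= room[k] - 1e-9:
                    p[k], d[k] = room[k], -d[k]
                elif d[k] < 0.0 and p[k] <= 1e-9:
                    p[k], d[k] = 0.0, -d[k]
            energy *= 1.0 - alpha
    return sorted(hits)

room = (8.0, 6.0, 3.0)
src = (2.0, 3.0, 1.7)
rcv = (6.0, 3.0, 1.2)
detections = trace_rays(room, src, rcv)
print(len(detections), "detections; first arrival at",
      round(detections[0][0] * 1000.0, 1), "ms")
```

In this counting scheme spherical spreading is not applied to each ray explicitly; it is represented statistically, since fewer rays intersect a detector of fixed size as the path length grows. Summing the detected energies in time bins gives an energy histogram from which an approximate impulse response can be derived.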

The density of ray detection events (carrying timing and angle of incidence information)

is directly related to the temporal resolution of the simulation. On this basis, ray tracing

techniques cannot efficiently meet the requirement for an acceptable sampling rate for

audio signal processing and auralization [92]. The number of rays used is, furthermore, a

factor closely connected with the resulting simulation accuracy and run time required. A

disadvantage of ray tracing thus emerges as the increased computation time, resultant

predominantly from the increased number of rays required for an accurate simulation,

particularly of large spaces. Considering a prediction of SPL only, earlier research has

shown that the source-receiver distance is of primary importance in determining the

number of rays required for an accurate prediction [96]. For multiple sources and evenly

distributed receiver positions in the areas of interest it was furthermore demonstrated that

a significantly reduced number of rays could be used, reducing the required run time by

the same factor. The number of rays required has been shown [96, 97] to be more important

for sound distribution parameters than for SPL while in addition being frequency

dependent, requiring more rays for higher frequencies [97]. Accordingly, integral

parameters such as clarity, strength or definition could be quickly estimated if necessary,

but only if assuming that reference to the absolute sound field characteristics is not

essential in the assessment [92].

Figure 2.12. Ray tracing principle and example sound propagation paths

Given that an acceptable level of simulation accuracy is reached, an increased

computation time by means of additional rays is not justified except when given a target

to improve on the stochastic element in the prediction outcome. Additional computing

effort cannot compensate for the inherent limitations imposed by the simulation

algorithms, but can somewhat assist in a process of averaging out of potential errors.

Algorithmic limitations occur due to the absence of models for wave effects such as

diffraction and interference, which limits the efficiency of a method irrespective

of the computation load. While it has been shown that the resultant sound

field approximation is adequate for the purpose of assessing room acoustic conditions via

simulations [92], additional limitations would relate to the model input data that are only

approximately defined. As such, room geometry and boundary acoustic characteristics

can have a significant impact on the derived results.

2.7.1.2 Image source method (deterministic)

The image source method has been efficiently used to generate synthetic reverberation in

small rooms by Allen and Berkley [98] in 1979. Nonetheless, their method was generally

restricted to simple room geometries i.e. rectangular rooms, given the intention to

establish a practical and easy to use method. Borish [99] in 1984 published his work based

on the Allen-Berkley model, extending the image source model to arbitrary polyhedra.

Image sources are created by mirroring the source sound ray at the reflecting plane, as

illustrated in figure 2.13. The mirror image creates a new virtual sound source (S΄)

emitting/reflecting the original ray specularly i.e. at a direction matching the angle of

incidence. The process is repeated for all new sources creating higher order reflections

until a predefined order of image sources is reached, see figure 2.14. The ray intensity is

reduced upon reflection at a wall by a factor of 1-α (relative to the incident intensity)

to account for the energy lost due to surface absorption. A deterministic energy spreading

by distance is further considered.

Figure 2.13. Image source principle
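The mirroring step can be sketched for a rectangular room, where every first-order image is visible from any receiver position and no audibility test is needed. The function names, the single absorption coefficient shared by all walls and the numerical values are assumptions made for illustration; energies are relative (1/r² spreading, scaled by 1-α per reflection).

```python
import math

def mirror(point, axis, wall_coord):
    """Mirror a point across the plane x[axis] = wall_coord."""
    img = list(point)
    img[axis] = 2.0 * wall_coord - point[axis]
    return tuple(img)

def first_order_arrivals(room, src, rcv, alpha, c=343.0):
    """Return sorted (arrival_time_s, relative_energy) for the direct sound
    and the six first-order images of a shoebox room (Lx, Ly, Lz)."""
    r0 = math.dist(src, rcv)
    arrivals = [(r0 / c, 1.0 / r0 ** 2)]            # direct sound
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = mirror(src, axis, wall)           # virtual source S'
            r = math.dist(img, rcv)                 # image-to-receiver distance
            arrivals.append((r / c, (1.0 - alpha) / r ** 2))
    return sorted(arrivals)

room = (8.0, 6.0, 3.0)
src = (2.0, 3.0, 1.7)
rcv = (6.0, 3.0, 1.2)
arrivals = first_order_arrivals(room, src, rcv, alpha=0.3)
for t, e in arrivals:
    print(f"{t * 1000.0:6.2f} ms  relative energy {e:.4f}")
```

Higher orders follow by mirroring each image again at the remaining walls, which is where the rapid growth in the number of images arises.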

The reflection order chosen can be seen as an analogy to the maximum truncation time

defined in ray tracing. Nonetheless, the specific aspect (indirectly defining a truncation

limit) is more critical in the image source method since the number of new virtual sources

increases exponentially [92]; each of the higher order images is mirrored by all walls in the

enclosure, however not all created images are valid. Accordingly, the predefined

reflection order limit significantly affects the computation time for the simulation, thus

imposing a considerable efficiency limitation on the method. The expression $N(N-1)^{i-1}$

gives the number of images for reflection order i (for i ≥ 1) and N room surfaces.

Allowing for a reflection order limit of i0, the total number of images is given by adding

these expressions up to the highest reflection order [1], see Eq. 2.24:

$$ N(i_0) = N\,\frac{(N-1)^{i_0} - 1}{N-2} \qquad (2.24) $$
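To give a sense of the growth involved, the short sketch below sums the per-order count N(N-1)^(i-1) over all reflection orders up to i0 and compares the result with the closed form N[(N-1)^i0 - 1]/(N-2). The function names are hypothetical, and the counts are upper bounds, since not all images created are valid.

```python
def images_at_order(n_surfaces, order):
    """Number of image sources created at one reflection order (order >= 1)."""
    return n_surfaces * (n_surfaces - 1) ** (order - 1)

def total_images(n_surfaces, max_order):
    """Closed-form total up to max_order; requires n_surfaces > 2."""
    return n_surfaces * ((n_surfaces - 1) ** max_order - 1) // (n_surfaces - 2)

# Six surfaces, as in a rectangular room
for i0 in (1, 2, 3, 5, 10):
    summed = sum(images_at_order(6, i) for i in range(1, i0 + 1))
    print(i0, summed, total_images(6, i0))
```

For six surfaces the total is 186 images at order 3 but over 14 million at order 10, which illustrates why the reflection order limit dominates the computation time.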

The image source method assumes purely specular reflections, thus the source can be

traced back from the receiver i.e. S-R positions are interchangeable. After image sources

are created an audibility test has to be performed in order to determine the relevance of

every image in relation to a specific receiving point; the S-R reciprocity characteristic is

utilized on this basis. In effect, the route of the image sources, specific for each S-R

combination, is traced back from the receiving point to the source to ensure that the

sequence of surfaces impacted is the reverse of the original path [100] and thus images are

valid, i.e. images will be able to reach the receiver. Pre-processing techniques are used to

exclude images that are not relevant on this basis, see Vorländer [92], and thus reduce the

computation load.

An impulse response can be constructed for audible images at a receiving point

considering the time delay of each contribution and associated amplitude. Given a

superposition of all significant contributions at the point of interest, the distance of the

image (virtual) sources to the point receiver, see figure 2.14, will determine the time

structure of the impulse response and partially the event amplitude; the latter considers in

addition the absorption coefficient of the surfaces crossed by the straight line connecting

the virtual source to the receiver. Under the assumption that all sources (original and

virtual) emit the same signal simultaneously, the estimation of the impulse response is

performed by summing the energy contributions of all rays impacting the defined point

receiver, including direct sound from the original source [1]. A considerable advantage of

the method over ray tracing can be highlighted at this stage, as there are no limitations

relating to the temporal resolution of the system thus imposing no restrictions on the

choice of sampling rate. Consequently, audio signal processing and auralization can be

competently facilitated.

Figure 2.14. Image source and resultant sound ray examples in a room (S: point source, R: point receiver)

The image source method is exact only for the case of perfectly reflecting surfaces.

However, given a small angle θ₀ of ray incidence, i.e. one differing considerably from

grazing incidence, it comprises a good approximation for the more common case of

absorptive room boundaries [92].

2.7.1.3 Hybrid models (deterministic-stochastic)

Ray tracing and image sources have been presented in the previous sections. Focusing on

the advantages and disadvantages of the two algorithms and as such the range of

conditions applicable, a number of models have been developed as a combination of

both, see Vorländer [101], Dalenbäck (CATT) [102], Maercke and Martin [103], Naylor

(ODEON) [104]; hybrid methods appear as a logical advance for simulation tasks in room

acoustics.

Image source methods alone are considered insufficient for typical acoustic conditions as

wave scattering is not accounted for. Scattered/diffused energy has been found to

dominate the sound field after a few reflections, see Vorländer [92], thus comprising an

important aspect of room acoustic conditions. Hodgson [105] also presented evidence of

diffuse reflections in rooms, portraying effectively diffuse sound as a critical element in

computer simulations. Results are supported in the same context by Howarth and Lam [106]

and three international round-robin projects on room simulation, see Vorländer [107]

and Bork [108, 109, 110], while Torres et al. [111] have further shown that the specific

characteristic is audible in the end result i.e. auralization. Nonetheless, using image

sources a high temporal resolution that facilitates good quality sampling rates for audio

processing can be achieved.

In contrast, ray tracing techniques offer effectively the opposite attributes, as scattering

can be included in the simulation, while however the temporal resolution of the

simulation system is limited.

Hybrid models were subsequently developed to facilitate a high temporal resolution and

inclusion of scattering effects. Based on the idea that a specularly reflecting ray tracing

procedure can be used to efficiently find audible image sources, i.e. a faster forward

(from the source to the receiver) audibility check, a faster simulation time compared to

classic image source methods is achieved. Different ray models include ‘cone’ and

‘pyramid’ structures, thus geometrically expanding during propagation. In this way it is

ensured that the contribution of image sources is counted only once, while the latter is

estimated in a deterministic manner considering the distance from the source.

Hybrid models can be complemented by a stochastic element for the late part of the

sound decay (>100ms) to further improve the simulation run time. This is possible as the

level of detail involved in the early response part is not needed in estimating late

reverberation i.e. a high temporal resolution is unnecessary. The approximation provided

by a stochastic approach such as ray tracing is sufficient in this case, allowing for a

significant improvement in the simulation run time.

2.7.2 Auralization

An auralization can be considered as an end result since its generation is based on the

predicted, in this case, room impulse responses (RIRs). The term is used to describe the

process and product of achieving an audible example of the acoustics of a space by

means other than directly experiencing the actual space or listening to a recording.

Different methodologies can be used to generate an auralization; nonetheless, hybrid

computer models are a practical and efficient approach in this sense.

The use of auralization comprises a wide range of applications such as for training and

education purposes [112], most notably however as an efficient and direct way to

demonstrate the prospective result of an acoustics project to both acoustics experts and

non-experts.

An auralization is generated for a specific receiver model, commonly a human listener.

The predicted RIR has to be suitably processed to obtain a Binaural RIR (BRIR) at the

receiver position of interest, consequently enabling a convolution with anechoic material.

Nonetheless, a number of limitations apply in the auralization technique mainly relating

to the RIR prediction methodology, see section 2.7.1. Notably, the lack in some cases of

an accurate sound source model is considered an important setback [113], though the

problem is lessened for sources approximating a point source, such as a human talker.

Similarly, the directivity and frequency response input required for loudspeakers is a key

factor in the process. A near field source response, as used in this study, is normally

required so as to better approximate sound propagation in small rooms, see section

5.2.1.2.
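The convolution step mentioned above, in which anechoic material is combined with a binaural room impulse response, can be sketched as follows. This is a direct time-domain convolution written for clarity rather than speed (practical tools would normally use FFT-based methods), and the function names and the toy signals are hypothetical.

```python
def convolve(dry, ir):
    """Direct linear convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(dry) + len(ir) - 1)
    for n, x in enumerate(dry):
        if x == 0.0:
            continue                      # skip silent samples
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out

def auralize(dry, brir_left, brir_right):
    """Return (left, right) ear signals for one receiver position."""
    return convolve(dry, brir_left), convolve(dry, brir_right)

# Toy data: a unit impulse followed by silence, and two short responses
dry = [1.0, 0.0, 0.0]
left, right = auralize(dry, [1.0, 0.5], [0.8, 0.0, 0.3])
print(left)
print(right)
```

With a unit impulse as input, each output channel simply reproduces the corresponding impulse response, which is a convenient sanity check before using real anechoic recordings.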

It was previously concluded that some differences between auralizations might be due to

the limitations imposed on the procedure, however it was assumed that some acoustic

properties, e.g. speech intelligibility, are well represented [113]. Overall, up-to-date

auralization techniques appear to be capable of producing material that closely

approximates the actual conditions, also in terms of realism. This assumption is supported

by research undertaken by the author [114].

2.8 Summary and conclusions

This chapter reviewed key literature relating to the different aspects of this study.

In summary, impulse response (and related parameters’) measurement

methodologies, speech intelligibility, classroom acoustics and computer prediction

methods have been discussed.

Hesitation in the selection of the type of test signal that is most appropriate for a given

measurement session is a common condition. From the multitude of signals that have

been reviewed here it is evident that an exponential sine sweep and MLS are most

appropriate for the purposes of the study. However, a potential advantage in the current

context of MLS over sine sweeps (i.e. providing the option for level calibration) is not

required here. The current study is based on an approach considering the general potential

of a space thus, absolute speech levels are typically not used and are rather compensated

for during post processing, where necessary. In contrast, advantages of an exponential

sine sweep, such as the possibility to omit potential distortion artifacts from the

measurement system, are utilized here in the attempt to more efficiently control the

measurement conditions and provide a consistent basis for comparison of room

performance. The current methodology, as discussed in more detail in later sections,

indicates that the measurement system used is capable of measuring the room’s actual

potential, without being limited by distortion artefacts of this type.

The parameters used to describe a space are another factor of ambiguity since not all

measures are relevant in a given context. RT alone is not an adequate descriptor of

conditions, particularly for small rooms, while speech intelligibility specific descriptors

are by definition liable to lead to misinterpretation of results. A number of parameters

could also be directly related to RT (for small rooms) thus reducing the value of

information in these terms for a room’s performance. Parts of this study aim to clarify the

interrelations of these parameters in connection to the influence of the acoustic

environment. A combination of parameters is typically used (as later discussed in more

detail), mainly RT, EDT, C, D, Ts and STI, to analyze the case studies and draw

conclusions on the topic.

For the STI case, a binaural model has been discussed in combination with a brief review

on the advantages of binaural hearing systems in the context of speech intelligibility.

A discussion / review on different studies considering classroom acoustics revealed that

the specific room category comprises a special group of spaces, given also the absolute

need for good acoustic performance, based largely on the ‘acoustically small’

characteristic i.e. longest wavelength considered exceeds the room dimensions. Given

this attribute, the different interrelations of the acoustic parameters have been discussed,

forming a basis for this study to further expand on university lecture rooms in particular.

Computer modelling and room auralizations are used in this study as a post evaluation

tool. Emphasis is given on model accuracy in terms of both direct numerical output and

auralization. For the purposes of the current study, a ray tracing model combined with the

image source theory for the initial parts of the prediction (hybrid deterministic-stochastic

model) emerges as a consistent/reliable methodology. Assuming an accurate auralization,

a particularly efficient means of demonstrating results to both Acoustics experts and non-

experts can be highlighted. Validation, however, of the auralization is an ongoing

research topic closely related to model quality and performance. As such, the study

expands further on the topic with the aim to derive an efficient methodology in these

terms.

Chapter 3 – Room acoustics measurements

CHAPTER 3

Room Acoustics Measurements

3.1 Introduction

Previous research [27, 28, 86, 87, 88, 115] investigated a large number of rooms within the

context of classroom acoustics and university lecture rooms for comparative purposes.

Objectives ranged from defining optimum RTs within classrooms to prospective

interrelations of acoustic parameters that are typically used in describing acoustic

conditions and speech intelligibility in particular. In the context of the current study, it

has been reported that EDT and energy ratio (using a 50ms limit for early energy) values

for small spaces / classrooms are closely related to T30 and thus could be predicted

consistently [88]. However, it is not clear if the result is valid for fitted spaces while little

information relating to the measuring positions is given, a factor that could have an effect

on the measurement output. Another approach would include using a multi source sound

system while the effect of such a configuration on general acoustic parameters e.g. RT

has not been addressed in comparison to standard acoustical measurements using a single

source.

This chapter is primarily based on the room acoustics measurements for a combination of

ten university lecture theatres/classrooms that are considered typical within university

premises. The measurement methodology is described in detail for the number of

different conditions examined and results are analyzed in terms of the type of source used

to determine the level of usability for a consistent assessment via either configuration.

Acoustic performance is further discussed in the context of measure interrelations.

Given the increased practicality of a simplified measurement procedure for in situ

conditions, the chapter concludes with an investigation into closed and open loop (see

section 3.4) measurement methodologies. Open loop systems have previously been

successfully utilized in large underground spaces, such as ticket halls and train platforms

[116, 117], consequently significantly simplifying the measurement procedure. In the

attempt to establish the efficiency of this type of simpler implementation (open loop) the

latter is discussed in terms of feasibility of use within the context of classroom acoustics.

3.2 Acoustic conditions for the different source configurations

The following sections discuss the assessment in the ten case studies, considered under a

range of source configurations. The measurement methodology and rationale are initially

presented, followed by a comparison of the acoustic conditions in terms of related

parameters.

3.2.1 Measurement methodology

Measurements undertaken for the current part of the study aimed at a general assessment

of the spaces considered, enabling nonetheless further post processing where necessary

e.g. accounting for absolute speech and background noise levels. Natural acoustics were

assessed with the use of a single dodecahedron (omni directional) loudspeaker for two

positions, while the sound system assisted conditions were based on a portable sound

system set up (SS) common in all rooms, to eliminate inconsistencies due to different

system characteristics. Both source types approximated a flat frequency response while

the directivity pattern and aiming of the distributed system was not considered at this

stage. Two conditions of the portable sound system consisted of a combination of two or

four monitor loudspeakers, positioned respectively at the two front or main four corners

of the audience area within a room. Overall, a total of four source configurations were

considered (Figure 3.1): two omni directional source positions, two sound system

formations (2 x loudspeaker, 4 x loudspeaker).

Figure 3.1. Sound source configurations (I-IV)

Room acoustics measurements were based on the WinMLS 2004 i [6] platform and a

combination of a sound source (single omni directional or distributed configuration) with

an omni directional receiver. The receiver positions and omni directional source positions

were set at a height of 1.1m-1.2m and 1.7m-1.8m, respectively, the former aiming at the

stage (see Appendix 1). Positioning of the distributed system in terms of height depended

largely on the distance from the ceiling in each case, given an angled audience area for a

number of rooms; a height setting in the range of 1.8m-2.35m was thus used, as

appropriate.

An exponential sine sweep test signal was preferred since potential distortion artefacts

originating at the sound system’s transfer function would be excluded from the outcomes

and thus not influence the measurement consistency. The matching system characteristics

for all rooms in the same sense could potentially eliminate prospective inconsistencies

related to differentiated system performance and to the system itself. A 10 second swept

sine was used in all cases and multiple impulse response measurements in line with BS

ISO 3382-2: 2008 [118] were taken. Typical divergence from the standard procedure

involved the use of a number of receiver positions close to reflective surfaces i.e. desks.

This variation aimed at assessing a more realistic environment; however, alternative

positioning was chosen where possible. Samples of BGNL were recorded throughout the

sessions (typically 5 samples in every room for a 4 hour session) with a class 1 sound

level meter at the centre of the room or at the low level measurement position, as shown

in the detailed room information in Appendix 1. BGNL was estimated in terms of Leq, 1min

in unoccupied conditions.

i WinMLS is a sound card based software platform for audio, acoustics and vibration measurements using a PC.

All room acoustics measurements were performed in fitted rooms and unoccupied

conditions at typical working hours during the day.

3.2.2 Equipment list

Norsonic 140 sound level meter (SLM)

B&K calibrator Type 4230

Dell Latitude PC D610

Digigram VXpocket v2 sound card

WinMLS 2004 professional measurement software for Windows environment

Audio SR707 power amplifier

Audio splitter (1 input, 4 outputs)

Dodecahedron omni directional loudspeaker

Yamaha HS50M studio monitor loudspeaker on tripod stand (x4)

Earthworks M30BX omni directional measurement microphone

3.2.3 Test rooms

The range of test rooms that was considered in this study consisted of fitted lecture

theatres and classrooms, typically found within university premises (see figure 3.2). The

majority of the rooms can be described as small to medium sized spaces, while two larger

rooms were also included in the analysis. Overall, ten rooms were examined, generically

categorized as seen in Table 3.1. Details on the construction type, fittings and

architectural characteristics of the rooms, along with different views and source/receiver

positioning details can be found in Appendix 1.

Figure 3.2. Examples of classroom population used in the study

Room                        Capacity (seating)   Volume (m³)   Size category
Room 1 (B336)               30                   138           Small
Room 2 (B337)               30                   156           Small
Room 3 (T214)               50                   260           Small
Room 4 (K604)               30                   148           Small
Room 5 (B361)               40                   218           Small
Room 6 (B302)               80                   242           Medium
Room 7 (B307)               140                  250           Medium
Room 8 (Lecture room A)     110                  267           Medium
Room 9 (B360)               62                   356           Large
Room 10 (Event theatre)     240                  753           Large

Table 3.1. Room list for room acoustics measurements

3.2.4 Results

Table 3.2 presents the average BGNL in terms of Leq, 1min measured over the ten test rooms. The overall linear and A-weighted levels varied from 39.7dB-57.8dB and 34.1dBA-48.4dBA, respectively, with a number of rooms at times showing increased exposure to low frequency noise. T30, 1kHz, EDT1kHz, C50, 1kHz, C80, 1kHz, Ts, 1kHz, MTI1kHz and STI values, together with a statistical summary of the results over all rooms, are given in Table 3.3 to establish the general character of the rooms considered. At 1kHz, T30 varied from 0.41s to 0.86s while EDT was measured within the range of 0.36s to 0.76s. It has been previously suggested [88] that for the specific size category EDT closely approximates T30 (with an exception at lower frequencies), so that optimum RT recommendations could be given solely in terms of T30. The current measurement results partially support these conclusions, since for lower frequency bands T30 was typically found to be considerably different from EDT. The differences observed, based on average values over all receiver positions and source configurations, approached 31% and 16% for the 125Hz and 250Hz octave bands, respectively. Larger differences could be expected for individual source setups, i.e. for data processed prior to averaging over all source configurations, and more so for individual receiver positions.

Average BGNL over ten rooms

Frequency (Hz)     125    250    500    1000   2000   4000   8000   Overall level
Linear (dB)        37.3   34.8   34.1   33.9   32.6   27.8   24.1   42.1dB
A-weighted (dBA)   21.2   26.2   30.9   33.9   33.8   28.8   23.0   38.8dBA

Table 3.2. Average BGNL over ten rooms (Leq, 1min)
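The overall levels in Table 3.2 can be cross-checked from the octave band values. The following is a minimal sketch rather than the procedure used in the study: it assumes that the overall level is the energetic sum of the seven octave band levels and that the standard octave band A-weighting corrections (rounded to 0.1dB) apply. The function and variable names are hypothetical.

```python
import math

# Average linear BGNL per octave band, copied from Table 3.2 (dB)
BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]
LINEAR_DB = [37.3, 34.8, 34.1, 33.9, 32.6, 27.8, 24.1]

# Assumption: standard A-weighting corrections at the octave band centres (dB)
A_WEIGHTING_DB = [-16.1, -8.6, -3.2, 0.0, 1.2, 1.0, -1.1]


def energetic_sum(levels_db):
    """Combine band levels into a single overall level by summing energies."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


# Apply the band corrections, then sum each spectrum energetically
a_weighted_db = [lin + corr for lin, corr in zip(LINEAR_DB, A_WEIGHTING_DB)]
overall_linear = energetic_sum(LINEAR_DB)
overall_a = energetic_sum(a_weighted_db)

print(f"Overall linear level: {overall_linear:.1f} dB")
print(f"Overall A-weighted level: {overall_a:.1f} dBA")
```

Under these assumptions the sketch returns the overall levels listed in Table 3.2 (42.1dB and 38.8dBA).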

Table 3.3. Acoustic parameters measured in ten test rooms and statistical summary

Room             1      2      3      4      5      6      7      8      9      10     Min    Average  Max    STD
EDT, 1kHz (s)    0.39   0.41   0.36   0.76   0.54   0.46   0.5    0.46   0.59   0.46   0.36   0.49     0.76   0.12
T30, 1kHz (s)    0.41   0.45   0.45   0.86   0.62   0.5    0.53   0.53   0.82   0.52   0.41   0.57     0.86   0.15
Ts, 1kHz (ms)    26     27.4   32.8   54.3   36.5   29.6   42     37.1   43.8   30.8   26     36.0     54.3   8.73
C50, 1kHz (dB)   7.6    7.3    6.9    2.1    5.2    6.4    4.3    5.2    4.2    6.4    2.1    5.6      7.6    1.70
C80, 1kHz (dB)   13     11.9   12     5.5    8.7    10.3   8.6    9.9    7.7    10.3   5.5    9.8      13     2.25
D50, 1kHz (%)    84.8   84     82.3   65     76.3   81.7   71.5   74.7   71.8   80.8   65     77.3     84.8   6.504
MTI, 1kHz        0.79   0.78   0.81   0.65   0.74   0.77   0.76   0.76   0.72   0.76   0.65   0.75     0.81   0.044
STI              0.77   0.75   0.79   0.66   0.70   0.74   0.74   0.76   0.70   0.78   0.66   0.70     0.79   0.041
BGNL (dBA)       39.2   39.4   36.7   39.9   48.4   41.8   37.4   34.1   40     39.6   34.1   39.7     48.4   3.7
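The statistical summary columns of Table 3.3 can be recomputed from the per-room values. The sketch below is illustrative only and assumes that the STD column is the sample (n-1) standard deviation, which is not stated in the text; the helper name `summarize` is hypothetical.

```python
import statistics

# Per-room values at 1kHz, copied from Table 3.3 (Rooms 1-10)
edt_s = [0.39, 0.41, 0.36, 0.76, 0.54, 0.46, 0.5, 0.46, 0.59, 0.46]
t30_s = [0.41, 0.45, 0.45, 0.86, 0.62, 0.5, 0.53, 0.53, 0.82, 0.52]


def summarize(values):
    """Return (min, mean, max, sample standard deviation) of a list of values."""
    return (
        min(values),
        statistics.mean(values),
        max(values),
        statistics.stdev(values),  # assumption: n-1 in the denominator
    )


for name, values in (("EDT", edt_s), ("T30", t30_s)):
    lowest, average, highest, std = summarize(values)
    print(f"{name}: min={lowest:.2f} avg={average:.2f} max={highest:.2f} std={std:.2f}")
```

With the sample standard deviation, the rounded results for the EDT and T30 rows agree with the tabulated summary values.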

Previous results are only partially confirmed in the current investigation, since various degrees of differentiation between T30 and EDT were also found for higher frequency bands. It is of interest that in a number of cases the two parameters closely approximate each other only for individual source setups, an effect that can differ between receiver positions within the same room, see Appendix 2. Taking into account the experimental output reported by Bradley in terms of the T30 - EDT similarity for small enclosures [88], the current outcomes suggest that fittings within the spaces considered have a significant effect on the measurement results, predominantly for the EDT parameter. A correlation analysis between the two measures revealed a wide range of results, as would be expected given the partly unpredictable EDT results.

A comparison of results was further performed in terms of reverberation times to

establish the effect of the source type on the room response and assess the feasibility of

substituting source types with alternatives when necessary, or simply when more

practical to do so. Considering the data as shown in Appendix 2 it can be deduced that the

majority of the test rooms produced a primarily diffuse sound field for the higher

frequency range, with the partial exception of Room 10. Accordingly, T30 output suggests

that the source type does not significantly influence the measurement results on this basis.

Tables 3.4-3.13 show the standard deviation for the T30 and EDT variation among the

four source configurations as an average over all receiver positions in a room. For

comparison purposes, the standard deviations are also interpreted as an equivalent

percentage (%) relating in each case to the mean value at the particular data point.
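The quantities reported in Tables 3.4-3.13 can be illustrated with a short sketch. It assumes a sample (n-1) standard deviation over the four source configurations, which the text does not specify, and uses hypothetical T30 values for a single octave band; the function name is likewise hypothetical.

```python
import statistics


def spread_among_sources(values):
    """Return (sigma, sigma as a percentage of the mean) for one octave band."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)  # assumption: sample (n-1) standard deviation
    return sigma, 100.0 * sigma / mean


# Hypothetical T30 values (s) in one octave band, averaged over all receivers,
# for S1, S2, SS4loudspeakers and SS2loudspeakers
t30_by_source = [0.40, 0.41, 0.40, 0.39]

sigma, percent = spread_among_sources(t30_by_source)
print(f"sigma = {sigma:.3f} s ({percent:.1f}% of the mean)")
```

Expressing the deviation relative to the mean allows a direct comparison with relative difference limens, for example the 5% JND for reverberation times.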

T30 and EDT σ among four source types (over all receivers)

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.069   0.052   0.022   0.019   0.019   0.011   0.021
T30 σ         0.055   0.02    0.02    0.006   0.006   0.006   0.041
% EDT         11.7    9.3     5.6     4.8     4.2     2.6     6.2
% T30         7.7     3.5     4.7     1.5     1.3     1.3     10.3

Table 3.4. Standard deviation for T30 and EDT among the four source types in Room 1

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.087   0.041   0.027   0.029   0.013   0.022   0.045
T30 σ         0.071   0.044   0.014   0.002   0.003   0.008   0.047
% EDT         12.8    7.0     6.3     7.2     2.8     4.8     11.8
% T30         8.7     7.1     3.0     0.4     0.6     1.5     9.9

Table 3.5. Standard deviation for T30 and EDT among the four source types in Room 2

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.046   0.024   0.051   0.027   0.018   0.022   0.021
T30 σ         0.068   0.007   0.009   0.023   0.032   0.038   0.008
% EDT         7.8     4.8     13.1    7.4     4.1     5.7     7.0
% T30         11.5    1.3     2.2     5.1     5.6     7.6     2.2

Table 3.6. Standard deviation for T30 and EDT among the four source types in Room 3

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.258   0.053   0.042   0.038   0.019   0.008   0.062
T30 σ         0.293   0.157   0.028   0.011   0.009   0.014   0.029
% EDT         26.6    4.6     4.2     5.0     2.9     1.2     11.9
% T30         33.8    14.0    2.8     1.3     1.2     1.9     4.9

Table 3.7. Standard deviation for T30 and EDT among the four source types in Room 4

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.179   0.081   0.039   0.016   0.056   0.042   0.059
T30 σ         0.107   0.048   0.019   0.006   0.008   0.01    0.049
% EDT         12.2    9.0     6.5     3.0     10.3    7.7     12.2
% T30         5.9     4.5     2.9     1.0     1.2     1.4     7.1

Table 3.8. Standard deviation for T30 and EDT among the four source types in Room 5

T30 and EDT σ among four source types (over all receivers)

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.052   0.061   0.017   0.026   0.012   0.052   0.025
T30 σ         0.028   0.072   0.013   0.01    0.02    0.065   0.055
% EDT         7.7     9.5     3.3     5.7     2.2     10.1    6.3
% T30         4.8     11.1    2.5     2.0     3.3     10.8    11.3

Table 3.9. Standard deviation for T30 and EDT among the four source types in Room 6

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.093   0.044   0.053   0.03    0.033   0.048   0.047
T30 σ         0.067   0.021   0.007   0.01    0.007   0.012   0.026
% EDT         14.8    7.3     9.9     6.0     6.2     8.6     10.7
% T30         7.3     3.3     1.3     1.9     1.2     2.0     5.2

Table 3.10. Standard deviation for T30 and EDT among the four source types in Room 7

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.109   0.083   0.038   0.042   0.042   0.046   0.06
T30 σ         0.089   0.023   0.003   0.009   0.02    0.007   0.035
% EDT         17.0    15.4    7.6     9.1     8.6     9.4     16.5
% T30         10.1    3.6     0.6     1.7     3.4     1.3     8.2

Table 3.11. Standard deviation for T30 and EDT among the four source types in Room 8

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.073   0.031   0.044   0.04    0.108   0.233   0.089
T30 σ         0.035   0.014   0.018   0.012   0.007   0.007   0.05
% EDT         7.2     3.7     7.4     6.8     15.3    30.6    19.1
% T30         3.1     1.5     2.8     1.5     0.7     0.7     5.9

Table 3.12. Standard deviation for T30 and EDT among the four source types in Room 9

Octave band   125     250     500     1000    2000    4000    8000
EDT σ         0.056   0.036   0.019   0.024   0.045   0.08    0.082
T30 σ         0.031   0.032   0.001   0.003   0.037   0.098   0.107
% EDT         11.5    6.4     3.6     5.2     10.4    20.6    25.6
% T30         4.9     5.3     0.2     0.6     6.5     16.7    22.9

Table 3.13. Standard deviation for T30 and EDT among the four source types in Room 10

Overall, low standard deviations were established for T30 over all rooms, with values well below the 5% error (1 JND, ISO 3382-1 after Vorländer [92]) for the majority of the experimental data. A notable exception can be seen at higher frequencies in Room 10, where errors approached 23% at 8kHz. Room 10 was, however, the largest room in the investigation and produced only a quasi-diffuse sound field, which can account for the discrepancies observed in the 4kHz and 8kHz octave bands. In the EDT case, larger differences were observed given the different source configurations and related positioning within the room. Averaged results in tables 3.4-3.13 nevertheless suggest that a reasonably accurate assessment can be made with an error margin below 10% (2 JND). Apart from the exceptions in the larger rooms (Rooms 9-10) at higher frequencies, the smaller rooms in particular gave confidence in the consistency of the measurements.

Consequently, a room assessment in terms of T30 and EDT can be performed using any of the source types to characterize a room on general performance.

Figures 3.3-3.12 show the measured values in terms of C50 for four source configurations

as an average over the measuring positions. C50 values depended largely on the relative

positioning of the receiver with respect to the source(s), while optimization of source

aiming or positioning was not considered.
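For reference, C50 is the ratio in dB of the energy arriving within the first 50ms of the impulse response to the energy arriving afterwards. The sketch below illustrates this definition on a synthetic, ideally exponential decay; it is not the WinMLS processing used for the measurements. The function names, the sample rate and the 0.5s reverberation time are assumptions chosen for illustration, and the impulse response is assumed to start at the arrival of the direct sound.

```python
import math


def clarity_db(impulse_response, fs, split_ms=50.0):
    """Early-to-late energy ratio in dB (C50 for split_ms=50, C80 for split_ms=80)."""
    split = int(round(fs * split_ms / 1000.0))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    return 10.0 * math.log10(early / late)


def synthetic_decay(rt60_s, fs, length_s):
    """Amplitude envelope whose energy decays by 60dB in rt60_s seconds."""
    k = 3.0 * math.log(10.0) / rt60_s  # amplitude decay rate (1/s)
    n = int(fs * length_s)
    return [math.exp(-k * i / fs) for i in range(n)]


# Hypothetical room: reverberation time 0.5s, sampled at 8kHz for 3s
ir = synthetic_decay(rt60_s=0.5, fs=8000, length_s=3.0)
print(f"C50 = {clarity_db(ir, 8000):.1f} dB")
print(f"C80 = {clarity_db(ir, 8000, split_ms=80.0):.1f} dB")
```

A measured impulse response would normally be octave band filtered before the energy ratio is evaluated; that step is omitted here for brevity.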

Figures 3.3-3.12 (one plot per room): average C50 over all receivers for four source conditions (S1, S2, SS4Loudspeakers, SS2Loudspeakers); x-axis: octave band (125-8000), y-axis: C50 difference (dB), -4 to 12.

Figure 3.3. Source efficiency in terms of C50 - Room 1

Figure 3.4. Source efficiency in terms of C50 - Room 2

Figure 3.5. Source efficiency in terms of C50 - Room 3

Figure 3.6. Source efficiency in terms of C50 - Room 4

Figure 3.7. Source efficiency in terms of C50 - Room 5

Figure 3.8. Source efficiency in terms of C50 - Room 6

Figure 3.9. Source efficiency in terms of C50 - Room 7

Figure 3.10. Source efficiency in terms of C50 - Room 8

Figure 3.11. Source efficiency in terms of C50 - Room 9

Figure 3.12. Source efficiency in terms of C50 - Room 10

Overall, results revealed that for a number of rooms (e.g. Rooms 1-4) source configurations involving the SS produced a more consistent sound field with higher clarity (C50) values. In smaller flat rooms, nonetheless, only small differences were found among the source types. In many cases SS based configurations did not perform better than the omni source, which was attributed to potential comb filtering effects in the room. Directivity and aiming also had a significant impact on the results, as no optimization was performed in these terms; this would predominantly lead to an underestimation of the effectiveness of the SS in either formation.

Conversely, in larger rooms where an increased distance of the SS to the receivers was regularly the case, omni directional source configurations produced a better result (see Rooms 7-9), particularly when positioned in the middle of the room and in close proximity to all receiver positions (see Room 6).

Overall, C50 results were subject to the relative source-receiver distance; however, the differences observed among source types did not exceed 2-3dB in most cases. Although this is above the C50 JND of 1.1dB, it does not exceed the practically detectable limit of 3dB [27].

The STI results are discussed in the following section in comparison to measurements

post processed to account for BGNL and speech level within the rooms.

3.3 Measurement output accounting for typical speech and BGNL

The incorporation of typical speech level and actual or expected BGNL enabled an additional assessment of the measurements on a more realistic basis. Current results were post processed to account for the two variables using typical speech levels [119] and the mean value of background noise, as measured in the ten test rooms. It is worth noting that the measured BGNL closely approximated the findings reported by Hodgson [120] in relation to typical noise levels in university classrooms.

Figures 3.13-3.22 show the derived STI in ten rooms for the different source configurations. The average difference among source types was in the JND range (≤0.02 STI) with no optimization in terms of sound system positioning or aiming. Incorporating speech and BGNL in the measurements produced significantly reduced STI values, see figures 3.13-3.22. It should be noted that the post processed results effectively constituted a consistent conversion of the initial output, with nearly identical interrelations among receiver positions, since the same speech and noise levels were used in every case. The STI reduction was mainly attributed to the higher frequency bands, which were more profoundly affected by background noise (LAeq 55dB for a standard spectrum [6] was used for speech). Overall, a consistent STI assessment can be made using any of the source configurations to generically describe a room, the average differences being in the JND range; a realistic model of speech and BGNL for the individual receiver positions is not considered in this case.
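The effect of the post processing on a single modulation transfer value can be sketched using the standard relations underlying the STI: steady noise reduces the modulation depth by a factor that depends on the S/N, and each modulation value is converted to a transmission index through an apparent S/N clipped to ±15dB. This is a simplified illustration under stated assumptions and not the WinMLS implementation; auditory masking, the reception threshold and the octave band weighting are omitted, and the numerical values are hypothetical.

```python
import math


def noise_factor(snr_db):
    """Modulation reduction caused by steady background noise at a given S/N (dB)."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))


def transmission_index(m):
    """Convert a modulation transfer value m (0 < m < 1) to a transmission index."""
    snr_apparent = 10.0 * math.log10(m / (1.0 - m))
    snr_apparent = max(-15.0, min(15.0, snr_apparent))  # clip to +/-15dB
    return (snr_apparent + 15.0) / 30.0


m_room = 0.80   # hypothetical modulation transfer value without noise
snr_db = 15.0   # hypothetical speech-to-noise ratio in one octave band

m_noisy = m_room * noise_factor(snr_db)
print(round(transmission_index(m_room), 3), round(transmission_index(m_noisy), 3))
```

Because the same speech and noise spectra are applied to every receiver, the correction factor in each octave band is identical for all positions, which is consistent with the nearly unchanged ranking of receiver positions noted above.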

Figure 3.13. STI for four source configurations in Room 1, I) Primary - II) Post processed

Figure 3.14. STI for four source configurations in Room 2, I) Primary - II) Post processed

Figure 3.15. STI for four source configurations in Room 3, I) Primary - II) Post processed

Figure 3.16. STI for three source configurations in Room 4, I) Primary - II) Post processed

Figure 3.17. STI for four source configurations in Room 5, I) Primary - II) Post processed

Figure 3.18. STI for four source configurations in Room 6, I) Primary - II) Post processed

Figure 3.19. STI for four source configurations in Room 7, I) Primary - II) Post processed

Figure 3.20. STI for four source configurations in Room 8, I) Primary - II) Post processed

Figure 3.21. STI for four source configurations in Room 9, I) Primary - II) Post processed

Figure 3.22. STI for four source configurations in Room 10, I) Primary - II) Post processed

The methodology used for measuring general room performance is satisfactory for room

classification. Compared to level calibrated measurements moreover, the sessions were

not limited to a single measurement task as results originated from general impulse

response data. An assessment, nonetheless, on the efficiency of the source type in this

respect would require the process to account for the effect of the source-receiver distance

on the resultant speech level, assuming the source types considered have been optimized

for their intended use. Practical limitations in handling the portable sound system did not

allow for an in depth assessment on this basis.

Detailed data relating to the room acoustics measurements can be found in Appendix 2.

3.4 Parameter interrelations

In the following sections, key acoustic parameters are analyzed to establish any association between them and to determine the extent of that association and its related limitations. Considering speech intelligibility parameters in particular, different elements of the acoustic conditions are used to obtain a result. For example, clarity energy ratios make use of the room effect on acoustic behaviour while ignoring background noise, and effectively the S/N. Other parameters such as the STI comprise a more elaborate approach in an attempt to account for all the variables that affect acoustic performance. The conditions present in a space during a measuring session will thus unavoidably affect the output in different ways for different measures. As such, care needs to be taken when comparing dissimilar parameters or making an assumptive assessment based on a particular methodology.

3.4.1 Clarity (C) energy ratios versus STI

STI is a measure describing speech intelligibility using a single number for seven octave bands, and therefore corresponds to more than a single C value. To enable a comparison at octave band level, the modulation transfer index (MTI) is considered as the equivalent octave band ‘STI’; the benefit of octave band weighting and redundancy corrections is, however, not considered in this way, and the results could therefore underestimate the potential relationship. In assessing the relation between the C energy ratio and STI it should be noted that the former does not account for the influence of BGNL. Thus, the particular interrelation is subject to change in every environment, depending on the noise conditions present.

Figure 3.23 shows the relation of C50 and C80 to MTI for two conditions, with and without background noise. For noiseless conditions the relation of the two measures was approximately linear, coinciding with earlier results by Bradley [27], while C80 appeared to be better related to MTI. The related correlation coefficients were nonetheless comparable, with values of 0.91 and 0.96 for the pairs C50-MTI and C80-MTI, respectively, see figure 3.23 (I-II). In the conditions accounting for BGNL these associations break down, as the measures compared are effectively modified into two fundamentally different measures. Considering that the relation, see figure 3.23 (III-IV), could be altered even within the same room under different noise conditions, a comparison of C to STI when accounting for BGNL would appear to be of minor significance unless some level of consistency can be expected in terms of the background noise character.
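As an illustration of how such a correlation coefficient is obtained, the sketch below computes the Pearson coefficient between C50 and MTI using the ten room-average values at 1kHz from Table 3.3. This is not the dataset behind figure 3.23, so the result is not expected to reproduce the coefficients quoted above; the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Room-average values at 1kHz, copied from Table 3.3 (Rooms 1-10)
c50_db = [7.6, 7.3, 6.9, 2.1, 5.2, 6.4, 4.3, 5.2, 4.2, 6.4]
mti = [0.79, 0.78, 0.81, 0.65, 0.74, 0.77, 0.76, 0.76, 0.72, 0.76]

r = pearson_r(c50_db, mti)
print(f"r(C50, MTI) = {r:.2f}")
```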

When the S/N is high enough to render the effect of BGNL negligible in a practical application, it would be possible to predict the speech intelligibility in terms of STI from the C50 and C80 datasets with a high level of accuracy. Therefore, for a high signal level condition C might also be used as a direct descriptor of speech intelligibility. When considering marginal conditions, nonetheless, the relationship would be invalidated to a large extent, as seen in the example in figure 3.23 (III-IV). Consequently, such a contaminated relationship, if established, could be used to identify BGNL as a significant factor in the acoustical conditions.

Figure 3.23. Relation of Clarity to MTI in ten test rooms, I) C50 to MTI without background noise, II) C80 to MTI without background noise, III) C50 to MTI with background noise, IV) C80 to MTI with background noise

3.4.2 Room reverberance (EDT, T30) versus STI

The relation of room reverberance to STI followed a similar trend. Considering EDT and

T30, an evident relation of reverberance to the MTI was found for noiseless (or adequate

S/N) conditions, see figure 3.24. EDT was more closely related to MTI as expected with

a correlation coefficient of 0.98 (0.85 for T30) having a near linear interaction, while a

similar degree of agreement was further found for all four source configurations.

Figure 3.24. MTI relation to space reverberance in ten test rooms (no noise)

The close relationship between the measures became less evident with the incorporation

of background noise, resulting in a correlation coefficient of 0.67 for both reverberation

indices. The resulting relationship would again be subject to the character of noise, being

the only altered variable between the two conditions, subsequently having an effect on

the value of a potential comparison in these terms similar to the C case.

3.4.3 EDT versus T30

Section 3.2.1 has considered the interrelations of T30 and EDT values, with any

conclusions partially supporting earlier studies [88] suggesting similarity of the two

measures for small rooms. Figure 3.25 shows the relation of the two measures for four

source configurations, showing a partial similarity of the two measures. A direct effect

would be that T30 would not be suitable for use as a baseline to predict EDT and C50,

among other measures, within fitted university classrooms and lecture theatres. However,

considering alternate source configurations appeared to influence the T30-EDT

relationship. Figure 3.25 illustrates the closer connection between the reverberation

indices when the portable sound system is used as a source, particularly in the

SS2loudspeakers case. While all four configurations produced a relatively small deviation in

terms of correlation between parameters, it should be noted that excluding the two larger

rooms of the study (Rooms 9-10) from the statistical analysis resulted in an enhanced

interrelation for all conditions, see figure 3.26. Consequently, this produced a better

similitude between the different conditions giving further confidence in assessing smaller

sized rooms, as discussed in section 3.2.1.

Figure 3.25. Relation of EDT to T30 for four source configurations in ten test rooms (S1, S2, SS4loudspeakers, SS2loudspeakers)

Figure 3.26. Relation of EDT to T30 for four source configurations after excluding Rooms 9-10 (S1, S2, SS4loudspeakers, SS2loudspeakers)

3.4.4 EDT versus C50

Examining the relation between EDT and C50 resulted in a clear trend, as expected, of increasing clarity with decreasing values of EDT, see figure 3.27. In effect, an indication of clarity can be obtained from this trend. The results hold for any of the four source configurations, while a closer relationship was established for the smaller rooms in the study. It should be noted, nonetheless, that earlier studies [87] have established a significantly reduced correlation for longer EDT, having considered sound system measurements and a wider range of conditions not typical within classrooms.

Figure 3.27. Relation of C50 to EDT in ten test rooms (R2=0.874)
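A trend line and coefficient of determination of the kind shown in figure 3.27 can be obtained with an ordinary least squares fit. The sketch below applies such a fit to the ten room-average EDT and C50 values at 1kHz from Table 3.3; since figure 3.27 is based on a different dataset, the fitted values are illustrative only and the function name is hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept, returning R squared too."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Room-average values at 1kHz, copied from Table 3.3 (Rooms 1-10)
edt_s = [0.39, 0.41, 0.36, 0.76, 0.54, 0.46, 0.5, 0.46, 0.59, 0.46]
c50_db = [7.6, 7.3, 6.9, 2.1, 5.2, 6.4, 4.3, 5.2, 4.2, 6.4]

slope, intercept, r_squared = linear_fit(edt_s, c50_db)
print(f"C50 = {slope:.1f} * EDT + {intercept:.1f} dB, R2 = {r_squared:.2f}")
```

The negative slope reflects the trend described above, i.e. clarity increases as EDT decreases.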

3.5 Comparison of measurements for closed/open loop

In this section, data relating to the core measurement system (closed loop, see figure 3.28 (I)) as described in section 3.2.1 is compared to the equivalent data resulting from an open loop system configuration (figure 3.28 (II)). An open loop measurement methodology has previously been applied successfully in large underground spaces [116, 117]; however, the method has not been assessed within smaller rooms. If necessary, measurements can be taken using a portable receiver with recording capabilities (therefore reducing cabling), suggesting that an open loop measurement system can be a more practical methodology under more restrictive circumstances.

An assessment of closed and open loop measurement systems is performed to determine the efficiency of an open loop methodology in the context of classroom acoustics.

3.5.1 Open loop measurement methodology

An open loop system (figure 3.28 (II)) typically consists of a processing unit, a sound source and a receiver with recording capabilities. For the purposes of this investigation, the software platform of the closed loop system was initially replaced by a multi-track recorder for the first stage of the session. A 10 second exponential sine sweep was arranged in a track, with visual time markers at the beginning and end of the test signal. A secondary track was set to simultaneously record the system’s microphone input during reproduction of the sine track, in turn from each of the four source configurations described in section 3.2.1. The time markers were later used within an audio editor to accurately identify and extract the precise time limits in the recorded sample, incorporating the test signal with the room response. The extracted samples were subsequently converted into time domain raw data, i.e. the WinMLS binary format. Three measurements were processed per receiver position (per source configuration) to ensure consistency among tests, while post processing of selected data allowed the derivation of the acoustic parameters of interest for the 125Hz - 8kHz octave band range.
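The test signal described above can be sketched as follows. The code generates a 10 second exponential sine sweep whose instantaneous frequency rises from a start to an end frequency; only the sweep length is taken from the text, while the frequency limits, the sample rate and the function name are assumptions made for illustration. The extraction of the impulse response from the recording is not shown.

```python
import math


def exponential_sweep(f_start, f_end, duration_s, fs):
    """Exponential sine sweep from f_start to f_end (Hz), returned as a list of samples."""
    rate = math.log(f_end / f_start)
    scale = 2.0 * math.pi * f_start * duration_s / rate
    n = int(round(duration_s * fs))
    # Phase grows exponentially so that the frequency is f_start at t=0
    # and f_end at t=duration_s
    return [
        math.sin(scale * (math.exp(rate * i / (fs * duration_s)) - 1.0))
        for i in range(n)
    ]


# 10s sweep as in the text; band limits and sample rate are assumed values
sweep = exponential_sweep(f_start=50.0, f_end=16000.0, duration_s=10.0, fs=48000)
print(len(sweep), "samples")
```

In a sweep of this type each octave occupies the same length of time, so the lower octave bands receive proportionally more energy than with a linear sweep.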

Figure 3.28. Measurement system, I) Closed loop configuration, II) Open loop configuration

3.5.2 Supplementary equipment list for open loop measurements

Sony Vegas Pro 8, professional multi-track audio editing software

Sony Sound Forge 9, professional digital audio production suite

3.5.3 Two system data comparison (Closed-Open loop data)

In the following paragraphs, data derived from the open loop measurement configuration

in ten test rooms is compared to its closed loop equivalent to validate the approach. T30,

EDT, C50 and STI are considered, while for the purposes of the current assessment the

closed loop system is assumed as most consistent and therefore used as the reference for

performance.

Figures 3.29-3.38 show a comparison of T30-EDT results for the two systems as an

average over the measuring positions per source configuration. Overall, different degrees

of error were produced for the two measures, EDT typically being more accurate.

Discrepancies of up to 19% EDT were observed in the octave bands of interest, excluding

lower frequencies, while considerable mismatches were found for a number of distinct

data points among the test rooms. In terms of T30, differences approached 26% in a

similar manner with further distinct erroneous data points within the results. For the

majority of the T30-EDT experimental data, accurate matches to the corresponding

reference values were produced with a maximum error of up to 10% and normally less

than 5%.

Similarly, C50 results (see Appendix 3) were typically within ±2dB for all octave bands in

either source configuration. A number of exceptions, most notably in Room 10, were

observed with the average error in this worst case approaching 3dB. Notably, a JND of

1.1dB has been previously established for C50 however a 3dB value was further suggested

as more suitable for everyday situations [27].

STI results were within the JND for all receiver positions and for all the source configurations, with the exception of a small number of erroneous data points. For the latter, a maximum error of 0.06 STI was found, i.e. double the JND. The average over all receiver positions in the test rooms gave an STI per source configuration within the JND for either arrangement.
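The comparison against the closed loop reference can be expressed in units of JND, as sketched below. The JND values are those quoted in the text (5% for T30 and EDT, 1.1dB for C50, and 0.03 for STI as implied by 0.06 STI being double the JND); the closed/open loop value pairs and the function name are hypothetical.

```python
# JND values as quoted in the text; T30 and EDT are relative to the reference value
JND = {"T30": 0.05, "EDT": 0.05, "C50": 1.1, "STI": 0.03}
RELATIVE = {"T30", "EDT"}


def error_in_jnd(parameter, closed_loop, open_loop):
    """Open loop deviation from the closed loop reference, in multiples of the JND."""
    diff = abs(open_loop - closed_loop)
    if parameter in RELATIVE:
        diff /= abs(closed_loop)  # convert to a fraction of the reference
    return diff / JND[parameter]


# Hypothetical closed/open loop pairs for one receiver position
print(f"T30: {error_in_jnd('T30', 0.50, 0.52):.1f} JND")
print(f"C50: {error_in_jnd('C50', 5.0, 7.0):.1f} JND")
print(f"STI: {error_in_jnd('STI', 0.70, 0.76):.1f} JND")
```

Values at or below 1 indicate a difference that would not be expected to be perceptible according to the quoted limens.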

80

Page 93: Final_1]2_ phd

Chapter 3 – Room acoustics measurements

Comparison of average T30 and EDT for Closed Open loop using S1

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

S1 Closed loop T30

S1 Open loop T30

S1 Closed loop EDT

S1 Open loop EDT

Comparison of average T30 and EDT for Closed Open loop using S2

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

S2 Closed loop T30

S2 Open loop T30

S2 Closed loop EDT

S2 Open loop EDT

Comparison of average T30 and EDT for Closed Open loop using SS4 loudspeakers

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

SS4 Closed loop T30

SS4 Open loop T30

SS4 Closed loop EDT

SS4 Open loop EDT

Comparison of average T30 and EDT for Closed Open loop using SS2 loudspeakers

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

SS2 Closed loop T30

SS2 Open loop T30

SS2 Closed loop EDT

SS2 Open loop EDT

Figure 3.29. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all

measuring positions in Room 1 (Four source configurations)

81

Page 94: Final_1]2_ phd

Chapter 3 – Room acoustics measurements

Comparison of average T30 and EDT for Closed Open loop using S1

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

S1 Closed loop T30

S1 Open loop T30

S1 Closed loop EDT

S1 Open loop EDT

Comparison of average T30 and EDT for Closed Open loop using S2

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

S2 Closed loop T30

S2 Open loop T30

S2 Closed loop EDT

S2 Open loop EDT

Comparison of average T30 and EDT for Closed Open loop using SS4 loudspeakers

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

SS4 Closed loop T30

SS4 Open loop T30

SS4 Closed loop EDT

SS4 Open loop EDT

Comparison of average T30 and EDT for Closed Open loop using SS2 loudspeakers

0.00

0.20

0.40

0.60

0.80

1.00

125 250 500 1000 2000 4000 8000

Octave band

T(s

ec)

SS2 Closed loop T30

SS2 Open loop T30

SS2 Closed loop EDT

SS2 Open loop EDT

Figure 3.30. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all

measuring positions in Room 2 (Four source configurations)

82

[Figure 3.31: four panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using S2", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 1.00; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.31. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 3 (Four source configurations)

[Figure 3.32: three panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 2.00; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.32. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 4 (Three source configurations)

[Figure 3.33: four panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using S2", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 2.50; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.33. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 5 (Four source configurations)

[Figure 3.34: four panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using S2", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 1.00; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.34. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 6 (Four source configurations)

[Figure 3.35: four panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using S2", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 1.20; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.35. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 7 (Four source configurations)

[Figure 3.36: one panel titled "Comparison of average T30 and EDT for Closed Open loop using S2"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 1.40; series: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.36. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 8 (One source configuration)

[Figure 3.37: four panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using S2", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 1.40; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.37. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 9 (Four source configurations)

[Figure 3.38: four panels titled "Comparison of average T30 and EDT for Closed Open loop using S1", "... using S2", "... using SS4 loudspeakers" and "... using SS2 loudspeakers"; x-axis: octave band (125 Hz to 8 kHz); y-axis: T (sec), 0.00 to 1.00; series per panel: closed loop T30, open loop T30, closed loop EDT, open loop EDT.]

Figure 3.38. Comparison of T30 and EDT results for Closed-Open loop systems as an average over all measuring positions in Room 10 (Four source configurations)

3.5.4 Comments on Open loop measurement sessions

Measurements can generally be performed using a suitable receiver e.g. a sound level

meter with recording capabilities. However, time synchronization of the recorded test

signal should be considered since a misalignment could result in significant errors. In the

current methodology this issue was resolved by using time markers at the recording stage

to accurately identify the position of the test signal and recorded response within the

audio track.
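The alignment step can be illustrated with a minimal sketch. It assumes the time marker is a known waveform and locates it in the recording by cross-correlation; this is a substituted, commonly used technique, not necessarily the procedure followed in the thesis. The function name `locate_marker` and all signal values are hypothetical.

```python
import numpy as np

def locate_marker(recording, marker):
    """Return the sample offset at which `marker` best matches `recording`."""
    # Sliding dot product of the marker against the recording ("valid" lags
    # only, so index k means the marker starts at recording[k]).
    corr = np.correlate(recording, marker, mode="valid")
    return int(np.argmax(corr))

# Hypothetical data: a noise-like marker embedded in a quiet recording.
rng = np.random.default_rng(0)
marker = rng.standard_normal(256)
recording = 0.01 * rng.standard_normal(8000)
true_offset = 3000
recording[true_offset:true_offset + marker.size] += marker

offset = locate_marker(recording, marker)
```

Once the offset is known, the recorded response can be trimmed so that it starts at the same sample as the test signal before post processing.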

The open loop methodology requires the settings relating to the test signal setup within the measurement software to remain unchanged, i.e. the signal used to record the room responses in open loop mode must match the internal software measurement settings used at the post processing stage, when the recordings are analysed. Even minor differences, e.g. in the length of the silence segment following the test signal or in the system's sampling rate and bit depth, were found to result in additional errors.

A similarly important source of error arises at the post processing stage if the frequency range originally used to produce the test signal is altered, since variations of even a few Hertz can produce significant error. A signal propagating in an enclosure will unavoidably have its spectral characteristics altered; however, this process is not accounted for in system calibration. An additional uncertainty factor is thus introduced at the input signal, and an open loop system is potentially more notably affected. Ultimately, compared to a closed loop system, a significantly larger number of aspects within a measurement session could invalidate the results.

Overall, it was established that an open loop methodology could be used as an alternative

to a closed loop system for the case where circumstances restrict the measurement

session in terms of practicality. Individual errors, inappropriately high on occasions, did

occur within the data sets from the ten test rooms suggesting that the method would

conceivably not be suitable for a detailed assessment. Considering an averaged output

over the measuring positions facilitated a reasonably accurate room characterization.


Notably for a speech intelligibility assessment, the open loop methodology produced an

acceptable level of confidence.

For a multi-faceted assessment, such as the use of room acoustic data in computer modelling, the potential accumulation of errors and subsequent expansion of error margins would ultimately need to be accounted for to retain confidence in the assessment.

3.6 Conclusions

The current analysis has considered ten test rooms covering a typical range of acoustic conditions found within university classrooms. Previous research on T30-EDT measurements for small rooms could be partially confirmed. However, the relation between T30 and EDT showed different degrees of variation and was highly dependent on the source configuration in use. The effect was attributed to the fittings present within the rooms; the EDT could thus be expected to deviate from T30 in different cases. This assumption was supported by a correlation analysis, in which a wide range of results was observed for the relation between T30 and EDT.

Any of the four source configurations could be used for a consistent room assessment in terms of T30. Larger differences were observed between source types for EDT, depending on the relative source-receiver positions. However, average values over the measuring positions suggested the feasibility of a reasonably accurate assessment. Only the smaller rooms in the investigation (Rooms 1-8) gave confidence for an acceptable error margin.

C50 values depended on the relative source-receiver distance. Nonetheless, differences

among source types did not exceed 2-3dB in most cases, being marginally over the JND;

a realistic JND is defined as 3dB. A high correlation coefficient was established for the

relation between C50 and EDT, thus an indication of clarity could be obtained in small

rooms from the particular trend. The measured differences among source types for STI

were in the JND range (≤0.02) in all but two cases. However, an assessment aiming at a

direct STI evaluation would require the utilization of the source type and setup normally


used in a space, followed by post processing of data to account for a realistic S/N at

individual receiver positions.
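As an illustration of such post processing, the sketch below applies the standard noise term of the modulation transfer function, m' = m / (1 + 10^(-S/N / 10)), to a value measured under effectively noiseless conditions. The function name and the example values are hypothetical, and the text does not state that this exact step was used in the thesis.

```python
def apply_noise_to_mtf(m, snr_db):
    """Reduce a noise-free modulation transfer value for a given S/N in dB."""
    # Interfering noise lowers the modulation depth by 1 / (1 + 10^(-S/N / 10)).
    return m / (1.0 + 10.0 ** (-snr_db / 10.0))

# Hypothetical example: a modulation transfer value of 0.8 at three S/N values.
m_noise_free = 0.8
m_at_30_db = apply_noise_to_mtf(m_noise_free, 30.0)  # almost unchanged
m_at_10_db = apply_noise_to_mtf(m_noise_free, 10.0)  # noticeably reduced
m_at_0_db = apply_noise_to_mtf(m_noise_free, 0.0)    # halved
```

The correction is applied per octave band and modulation frequency before the modulation transfer values are combined into an STI figure.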

Good correlation was established between Clarity and STI (0.91 and 0.96 for C50 and C80

respectively) for noiseless or adequate S/N measurement conditions. Speech intelligibility

in terms of STI can thus be predicted from clarity datasets with confidence. For high S/N

in an actual case, clarity ratios can be used as a direct descriptor of intelligibility. Given

that the interrelation breaks when background noise is considerable, BGNL can be

identified as a significant factor for the acoustical conditions of a space in such a case. A

similar influence of background noise was found for the relation between room

reverberance (EDT, T30) and STI, particularly for EDT (correlation=0.98).

In terms of a more practical measurement approach, it was established that an open loop

measurement methodology could potentially be used as an alternative to closed loop

systems. Individual errors within the derived datasets suggested however that the method

would not be suitable for a direct assessment for individual receiver positions and is

rather aimed at a general room characterization. The increased number of aspects that

could invalidate a measurement was highlighted; nonetheless, an acceptable level of

confidence for a speech intelligibility assessment in a room was established.


Chapter 4 – Low level measurements

CHAPTER 4

Low Level Measurements

4.1 Introduction

Two reasons are primarily responsible for a low resultant S/N during a measurement session: excessive background noise levels, or an intentional reduction in the signal level to minimize the annoyance caused to the public by a high test signal level. A significant error may consequently be introduced in the resulting data, since a measurement methodology requires, among other parameters, a minimum S/N for an accurate outcome. Threshold efficient values of S/N are highly affected by additional factors, including the general acoustic characteristics of the space considered, the size of the space and relative positioning of the source-receiver configuration, the test signal in use and the measurement duration. Speech intelligibility measurements require a high degree of accuracy since output variations can result in unacceptable errors. In noise sensitive spaces in particular, conditions might be only marginally acceptable, so small errors will have a large impact on the space assessment procedure and subsequent action.

With the aim of establishing a point of reference, the following sections consider the parameter interrelations in terms of T30, EDT and STI when approaching marginal accuracy conditions. A single acoustic parameter measured by any means may be used for further evaluation of an acoustical environment by utilizing results in the validation of a computer simulation [114, 121], see section 5.2.2. Assuming acceptable precision of the reference parameter, e.g. T30 or EDT, an acceptable level of model prediction accuracy can be further achieved. Choosing a suitable reference parameter is vital for the subsequent computer model prediction methodology, particularly when related values originate from measurements performed at marginal conditions. The applicability of the low level measurement session outcomes is thus brought into focus to assess the potential gain in assessment efficiency in these terms.

4.2 Measurement methodology

In the following sections the measurement methodology for undertaking low level measurements and the accuracy verification procedure are presented.

4.2.1 Threshold efficient signal to noise ratio (S/N) measurements

Impulse response measurements within ten test rooms were performed using a WinMLS based platform combined with the source-receiver configurations described in previous sections (also see Appendix 1). Multiple measurements using a 10 second exponential swept sine test signal were performed in descending 1dB steps, to the point where an outcome became marginal (threshold efficient S/N). The reference used to establish accurate performance consisted of a high S/N measurement obtained before commencing the process, while the original RT (T30) curve was used to monitor the output during a session. The low level resultant data from the series of measurements was later compared to the reference T30 curve to reflect the effect of a reduced S/N on mainly intelligibility related measures. Depending on the session, two different source configurations were used: a dodecahedron omni directional source and a four loudspeaker portable sound system formation, as previously described in section 3.2.1.
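The descending level series can be sketched as follows. This is an illustrative reconstruction only: the measurements themselves were made with WinMLS, and the sweep limits (20 Hz to 20 kHz), the sampling rate (48 kHz) and the function names are assumptions not given in the text. Only the 10 second duration and the 1dB step come from the description above.

```python
import numpy as np

def exponential_sweep(f_start, f_end, duration, fs):
    """Exponential (logarithmic) sine sweep from f_start to f_end in Hz."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f_end / f_start)
    # Instantaneous frequency rises exponentially from f_start to f_end.
    phase = 2.0 * np.pi * f_start * duration / rate * (np.exp(t * rate / duration) - 1.0)
    return np.sin(phase)

def level_series(signal, n_steps, step_db=1.0):
    """Copies of `signal`, each attenuated by a further `step_db` decibels."""
    gains = 10.0 ** (-step_db * np.arange(n_steps) / 20.0)
    return [gain * signal for gain in gains]

# Assumed settings: 20 Hz to 20 kHz at 48 kHz sampling rate; 10 s as in the text.
sweep = exponential_sweep(20.0, 20000.0, 10.0, 48000)
series = level_series(sweep, n_steps=30)
```

Each element of `series` would be played back in turn, with the first (unattenuated) measurement serving as the high S/N reference.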

The measurement methodology in the initial screening sessions considered a

dodecahedron sound source in a single test room (reverberation chamber, N11) for two

RT conditions, referred to as ‘high’ and ‘low’.

4.2.2 Noise source incorporation

A pink noise source was used to eliminate the risk of the test signal level not being high enough to excite the room. Moreover, existing and potentially varying BGNL was expected to limit consistency in the results, given that the investigation concerned marginal conditions, i.e. close to the noise floor. Pink noise was thus used in an attempt to achieve a steady state BGNL. The background noise was incorporated into the system using an audio mixer: a secondary portable computer was used as a noise generator and mixed into the system output. The different test configurations consisted of the test signal being reproduced by either of the two source types, alone or mixed with simulated noise. To examine the risk of measurement inconsistencies due to reproduction of signal and noise from the same loudspeaker, a further two combinations were later considered, with the signal and noise reproduced by individual sources (see section 4.4.3). Thus, six system configurations (Figure 4.1 (I-VI)) were used overall, referenced as:

Signal from Omni directional (no simulated noise)

Omni (signal), Omni (background noise)

Omni (signal), SS4 loudspeakers (background noise)

Signal from Sound system (no simulated noise)

SS4 loudspeakers (signal), SS4 loudspeakers (background noise)

SS4 loudspeakers (signal), Omni (background noise)

[Figure 4.1: six schematic panels, each labelled "Test room".]

Figure 4.1. Six system configurations (I-VI)

4.3 Initial investigation and screening sessions

Early investigations by the author [122] considered a sine sweep test signal for two RT conditions within a single test room (reverberation chamber) with variable absorption characteristics, see Figure 4.2. It was found that the threshold S/N is not a static parameter and is highly affected by several variables, primarily RT. A correlation analysis performed on the measurement results revealed correlation coefficients of up to 0.7 between threshold efficient S/N and T30. The threshold S/N was observed to increase with increasing RT values, see table 4.1.

The results, however, could not be assumed to be straightforwardly repeatable, since correlation coefficients of up to 0.8 were observed under the same conditions when the measurements were repeated. The particular values approached perfect correlation when the 125Hz and 250Hz octave bands were excluded; these bands generally appeared to be problematic for the test room.

Reverberation chamber    High RT              Low RT
F(Hz)                    RT(s)    S/N (dB)    RT(s)    S/N (dB)
125                      2.42     15          1.31     12.1
250                      2.35     18          1.32     11
500                      2.35     14          1.27     12.3
1k                       2.59     16          1.41     16.3
2k                       2.51     9.7         1.37     14
4k                       2.21     9.8         1.3      4.1
8k                       1.54     4.9         1        5.5

Table 4.1. Sample threshold efficient S/N ratios for Sine sweep in a test room

[Figure 4.2: schematic of the reverberation chamber (V=105m3) showing the source and receiver positions and 10m2 of removable absorptive material.]

Figure 4.2. Test room (reverberation chamber) schematic with source-receiver positioning
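A correlation analysis of this kind can be sketched with the sample values of Table 4.1. The text does not state how its coefficients were computed, so the Pearson coefficient across the seven octave bands used here is an assumption, and the variable names are illustrative.

```python
import numpy as np

# Sample values from Table 4.1 (octave bands 125 Hz to 8 kHz).
rt_high = np.array([2.42, 2.35, 2.35, 2.59, 2.51, 2.21, 1.54])
snr_high = np.array([15.0, 18.0, 14.0, 16.0, 9.7, 9.8, 4.9])
rt_low = np.array([1.31, 1.32, 1.27, 1.41, 1.37, 1.30, 1.00])
snr_low = np.array([12.1, 11.0, 12.3, 16.3, 14.0, 4.1, 5.5])

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(x, y)[0, 1])

r_high = pearson(rt_high, snr_high)  # threshold S/N against RT, high RT condition
r_low = pearson(rt_low, snr_low)     # threshold S/N against RT, low RT condition
```

Both coefficients come out positive, in line with the observed tendency for the threshold S/N to increase with RT.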

The positioning of the source/receiver pair within the test room was considered a primary reason for the differing results. Receiver positions close to the single absorbing surface and, to a lesser extent, close to reflective surfaces were found to be responsible for unpredictable system behavior (some random results) and low correlation coefficients. It was found that absorptive material should preferably be distributed uniformly across the room surfaces for increased confidence in the measurement, given marginal conditions. Overall, conditions enhancing diffusivity in the sound field were found to be most suitable. Similarly, source positions promoting a diffuse field were found to be essential.

The accuracy level, in terms of STI values derived from low S/N measurements, was verified by comparison to the high S/N reference values, see table 4.2. The modulation transfer index (MTI) was central to the investigation, as the values among individual octave bands do not necessarily relate to a single measurement. Good agreement was found for all conditions, demonstrating the efficiency of the test signal in the current test room configurations and for the levels used. It should be noted, however, that good agreement was also found in the screening session for lower S/N ratios, derived from impulse response measurements that were less accurate in terms of T30 (inadequate signal strength to calculate T30). This effect was repeated throughout the sessions and is later discussed in more detail, see section 4.4.1. Later experimentation followed a finer approach in which marginal MTI values were considered individually, referencing the S/N required to achieve an accurate MTI (per octave) rather than an accurate T30.

Modulation transfer index (MTI)

F(Hz)                            125    250    500    1k     2k     4k     8k     Average (unweighted STI)
Reference (High RT)              0.46   0.43   0.4    0.43   0.39   0.42   0.52   0.44
Marginal conditions (High RT)    0.46   0.43   0.4    0.42   0.39   0.43   0.55   0.44
Reference (Low RT)               0.58   0.61   0.56   0.56   0.52   0.55   0.62   0.57
Marginal conditions (Low RT)     0.57   0.62   0.57   0.57   0.52   0.57   0.64   0.58

Table 4.2. Comparison of STI for reference and experimental (marginal) conditions
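The final column of Table 4.2 can be reproduced as the plain arithmetic mean of the seven octave band MTI values. The sketch below makes that assumption about the meaning of "unweighted"; the dictionary layout and function name are illustrative.

```python
# MTI values per octave band (125 Hz to 8 kHz), taken from Table 4.2.
mti = {
    "Reference (High RT)": [0.46, 0.43, 0.40, 0.43, 0.39, 0.42, 0.52],
    "Marginal conditions (High RT)": [0.46, 0.43, 0.40, 0.42, 0.39, 0.43, 0.55],
    "Reference (Low RT)": [0.58, 0.61, 0.56, 0.56, 0.52, 0.55, 0.62],
    "Marginal conditions (Low RT)": [0.57, 0.62, 0.57, 0.57, 0.52, 0.57, 0.64],
}

def unweighted_sti(values):
    """Arithmetic mean of the octave band MTI values (no band weighting)."""
    return sum(values) / len(values)

averages = {name: round(unweighted_sti(values), 2) for name, values in mti.items()}
```

The rounded means match the tabulated averages, which differ between reference and marginal conditions by at most 0.01.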

The screening session overall suggested that BGNL and its fluctuating character were key parameters, affecting the measurements for the duration of the process. These characteristics prevented accuracy for particular measurements, given also that results were analyzed in 1dB steps.

It should be noted that the calculation of the related S/N was based on a BGNL estimation using the last 10% of the measured impulse response; the particular data sets thus cannot be considered decisive. Moreover, the late part of the sound decay within a room was assumed to be linear; this prerequisite, however, was often not met, consequently affecting the estimation of the threshold efficient S/N. The significance and practical application of the particular datasets in terms of absolute values is thus limited. However, the results constitute an indication of the functions involved when approaching threshold efficient S/N. In the following sections a more detailed approach to the topic is presented to verify and complement the findings and assumptions of the initial investigation.
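A background noise estimate of this kind can be sketched as below. The text specifies only that the last 10% of the impulse response was used; expressing S/N as the peak level relative to the RMS level of that tail is an assumption made here for illustration, and the function name and test data are hypothetical.

```python
import numpy as np

def snr_from_tail(ir, tail_fraction=0.1):
    """Estimate S/N in dB from an impulse response.

    The noise level is taken as the RMS of the last `tail_fraction` of the
    response; the signal level is taken as the peak of the response.
    """
    n_tail = max(1, int(len(ir) * tail_fraction))
    noise_rms = np.sqrt(np.mean(ir[-n_tail:] ** 2))
    peak = np.max(np.abs(ir))
    return 20.0 * np.log10(peak / noise_rms)

# Hypothetical response: unit peak followed by a constant-magnitude noise tail.
ir = np.zeros(1000)
ir[0] = 1.0
ir[-100:] = 0.01 * np.tile([1.0, -1.0], 50)
snr_db = snr_from_tail(ir)
```

If the decay has not reached the noise floor within the recorded length, the tail still contains reverberant energy and the S/N is underestimated, which is one reason such values are indicative only.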

4.4 Low level measurements in ten test rooms

The low level measurement methodology has been applied in ten test rooms of varying

size category, following the room acoustics measurements in Chapter 3. The majority of

the measurements were based on a system configuration using the portable sound system

as a source for both the signal and simulated background noise.

The initial scope of the investigation at its current state concerned the interrelation

between T30 and EDT in relation to a reducing signal level in the series of measurements

i.e. reducing S/N. Finally, reference to MTI and subsequent resultant STI is made for a

finer assessment of the connection between the measures involved.

4.4.1 Measure interrelations and performance

The accuracy performance of T30 and EDT for a series of measurements at reducing signal level (-1dB per measurement) was initially examined against reference values originating from high S/N measurements for a broader view of the performance reduction rate, see figures 4.3-4.12. Without particular emphasis at this stage on the associated S/N, the graphs for each test room relate individually to the seven octave bands considered. Ideal conditions, which are not attainable in practice, would imply continuous accuracy within the series, i.e. a straight line within the graphs starting at the reference value. Values that deviate (>2JND) from the trend set by the reference measurement can thus be identified as erroneous.
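The 2 JND criterion can be expressed as a simple check. The sketch assumes a relative JND of 5% for reverberation measures, the value commonly quoted from ISO 3382-1; this section does not state the JND adopted in the thesis, and the series values are hypothetical.

```python
def flag_erroneous(values, reference, jnd_fraction=0.05, n_jnd=2.0):
    """Mark values deviating from `reference` by more than `n_jnd` JNDs."""
    # One JND is assumed to be `jnd_fraction` of the reference value.
    limit = n_jnd * jnd_fraction * reference
    return [abs(value - reference) > limit for value in values]

# Hypothetical EDT series in seconds for one octave band, reference 0.60 s.
edt_series = [0.60, 0.61, 0.63, 0.70, 0.45]
flags = flag_erroneous(edt_series, reference=0.60)
```

The first flagged measurement in a series marks the point at which the outcome becomes marginal for that octave band.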

Considering the relation in each case of the two measures to the reference state, it was

evident that EDT gave more consistent results in the series of measurements i.e. reduced

value fluctuations when compared to T30, see figures 4.3-4.12, thus supporting previous

results [123].
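For orientation, both measures are conventionally derived from the backward integrated (Schroeder) decay curve: T30 from a line fitted between -5dB and -35dB, and EDT from a line fitted over the first 10dB of decay, each extrapolated to 60dB. The sketch below follows that convention on a synthetic exponential decay; it does not reproduce the WinMLS processing, and the function names and test signal are assumptions.

```python
import numpy as np

def schroeder_db(ir):
    """Backward integrated energy decay curve in dB, normalised to 0 dB."""
    energy = np.cumsum(ir[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy[0])

def decay_time(curve_db, fs, start_db, end_db):
    """Time for a 60 dB decay from a line fitted between two curve levels."""
    t = np.arange(curve_db.size) / fs
    mask = (curve_db <= start_db) & (curve_db >= end_db)
    slope, _ = np.polyfit(t[mask], curve_db[mask], 1)  # slope in dB per second
    return -60.0 / slope

# Synthetic impulse response envelope with a reverberation time of 0.5 s.
fs = 8000
t = np.arange(3 * fs) / fs
ir = 10.0 ** (-3.0 * t / 0.5)

curve = schroeder_db(ir)
t30 = decay_time(curve, fs, -5.0, -35.0)
edt = decay_time(curve, fs, 0.0, -10.0)
```

Because the EDT fit uses only the top 10dB of the curve, it needs far less decay range above the noise floor than the 35dB required by T30.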

In Rooms 1 and 2, which are similar in terms of shape and construction materials, differences were observed for the lower frequency bands, where T30 in Room 1 gave significantly inconsistent results as opposed to EDT. The effect was attributed to the single construction difference between the rooms, a small coupled space in Room 1. The aperture of the latter (2m2) was partly blocked during the measurement session by a flat surface positioned against the host wall, thus forming a smaller aperture. However, considering wave diffraction effects, the results would be consistent with the fluctuating character of T30 values (at lower frequencies) that is typical for coupled spaces. It is of interest here that EDT was not affected by the specific factor, resulting in a more stable outcome.

In Room 4 (Figure 4.6) where measurements extended further beyond threshold

conditions, the performance reduction rate among octaves showed a clearer relation to the

BGNL in the room. Having a characteristic background noise spectrum with higher level

at lower frequency octaves and assuming a flat frequency response for the measurement

system, the associated low level plots demonstrated the correlation between the two

functions.

[Figure 4.3: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 13.2, 6.2, 7.6, 10.1, -0.8, 1.6 dB and one questionable outcome; EDT: 5.1, 2.2, 5.1, -0.8, 1.6, 4.2, 6.3 dB.]

Figure 4.3. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 1, Signal from Sound system (no simulated noise)

[Figure 4.4: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 6.4, 8.5, 9.8, 9, 12.1, 10.4, 12.2 dB; EDT: 6.4, 3.8, 9.2, 3.8, 4.7, 10.4, 12.2 dB.]

Figure 4.4. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 2, Signal from Sound system (no simulated noise)

[Figure 4.5: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 6.2, 8, 4.4, 3.3, 2.3, 18.6, 4.9 dB; EDT: -2.2, -2.3, 2.7, 1.5, 2.5, 3.3, 0.9 dB.]

Figure 4.5. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 3, SS (signal), SS (noise)

[Figure 4.6: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: -2.2, 3.5, 3.9, 6, 7.9, 8.6 dB and one questionable outcome; EDT: -2.2, 3.5, 0, 6, 2.3, 8.6 dB and one questionable outcome.]

Figure 4.6. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 4, SS (signal), SS (noise)

[Figure 4.7: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 17.8, 8.76, 6.5, 4.4, 14, 19.1, 26.4 dB; EDT: -1.4, 0.4, -0.6, 2.3, 12.6, -4.4, -3.2 dB.]

Figure 4.7. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 5, SS (signal), SS (noise)

[Figure 4.8: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 8.4, 7, 6.5, 5, 6.8, 7.4, 4.7 dB; EDT: -0.1, -2.2, 1.9, 0.1, 5.1, 7.4, 0.3 dB.]

Figure 4.8. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 6, SS (signal), SS (noise)

[Figure 4.9: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 2, 1.9, 2.4, 6.7, 6.7, 21.2, 5.1 dB; EDT: 9.7, 0.3, 1.1, 2.1, 0.5, 6.7, 0.6 dB.]

Figure 4.9. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 7, SS (signal), SS (noise)

[Figure 4.10: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 5.9, 7.7, -2.6, 7.1, 20.1, 15.6, 6.9 dB; EDT: 0.8, -2.9, 1.1, -4.3, 2.1, 6.5, 3.8 dB.]

Figure 4.10. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 8, SS (signal), SS (noise)

[Figure 4.11: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 16.7, 20.6, 23.9, 7.3, 20.5, 16, 17 dB; EDT: 2.8, 10.2, 9.5, 2.5, 2.9, 9.1, 4.9 dB.]

Figure 4.11. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 9, SS (signal), SS (noise)

[Figure 4.12: T30 and EDT plotted per octave band over the measurement series, with "EDT limit" and "T30 limit" markers. Annotated threshold S/N values, in order of appearance (octave band assignment not recoverable): T30: 8.4, 12.4, 7.4, 4.6, 7.4, 7.9, 29.6 dB; EDT: 4.9, -1, 3.3, 4.4, 7.5, 4.2, 6 dB.]

Figure 4.12. T30 and EDT accuracy/performance decay in octave bands for a series of measurements (-1dB in signal level per measurement), Room 10, SS (signal), SS (noise)

A secondary outcome of the Room 10 results was that, for both T30 and EDT, accurate measurements were not obtained at lower frequencies when simulated background noise was reproduced by the omni directional source. In this particular setup the omni source was positioned between the receiver and the nearest loudspeaker of the portable sound system, and was therefore closer to the receiver. Accordingly, noise was of a higher level from that particular direction at the receiver, as opposed to noise simulation from the portable sound system, where the overall level measured at the receiver position was a cumulative result arriving approximately from all directions. A higher signal level would be needed from the sound system to overcome noise in such configurations; however, the same level would possibly be excessive at different receiver positions. As such, the reproduction of background noise from multiple loudspeakers around the room, rather than from a single omni directional source, appeared to be more efficient for low level measurements. Better coverage was achieved, implying more consistent outcomes for the different system configurations at marginal conditions.

Overall, EDT values emerged as the more consistent over the series of measurements performed. EDT showed smaller value fluctuations and appeared to be significantly more accurate for particular octave bands among the datasets; the fact that EDT is generally shorter than T30 would certainly influence performance in favour of the EDT measure, see section 4.4.2. The EDT relies on the initial level drop rather than the full decay range used by T60 (or an x dB decay for Tx); for a non linear sound decay in particular, the EDT will therefore have an advantage in producing a representative value under threshold conditions, as consistently evidenced in figures 4.3-4.12. Under strongly non linear decay characteristics the effect would further depend on the degree of non linearity in relation to the signal level. In an example within Room 3 (figure 4.5), unstable T30 measurement behaviour was observed at the 1kHz octave band, where the difference between T30 and EDT values increased over a limited signal level range lying in between ranges of consistent T30 measurements. For a linear decay, conditions would primarily depend on the signal level and on whether or not the decay range necessary for the estimation of the measures considered is attained.
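The difference between the two evaluation ranges can be illustrated with a short numerical sketch. It is not the measurement chain used in this study: it assumes a synthetic, perfectly exponential energy decay (0.8 s), a stationary noise floor 35 dB below the initial level, plain backward (Schroeder) integration without truncation or noise compensation, and the conventional evaluation ranges of 0 to -10 dB for EDT and -5 to -35 dB for T30. All names and values are illustrative.

```python
import numpy as np

def schroeder_db(energy):
    """Backward-integrated decay curve in dB, normalised to its initial value."""
    edc = np.cumsum(energy[::-1])[::-1]
    return 10.0 * np.log10(edc / edc[0])

def decay_time(edc_db, fs, start_db, end_db):
    """60 dB decay time extrapolated from a line fitted between two levels."""
    idx = np.where((edc_db <= start_db) & (edc_db >= end_db))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)  # slope in dB per second
    return -60.0 / slope

fs = 8000                                  # sample rate in Hz (assumed)
rt_true = 0.8                              # decay time of the synthetic room in s (assumed)
t = np.arange(int(1.5 * fs)) / fs
decay = 10.0 ** (-6.0 * t / rt_true)       # squared impulse response envelope
noise = 10.0 ** (-35.0 / 10.0)             # noise power, 35 dB below the initial level

results = {}
for label, energy in (("clean", decay), ("noisy", decay + noise)):
    edc = schroeder_db(energy)
    results[label] = {
        "EDT": decay_time(edc, fs, 0.0, -10.0),   # initial 10 dB of decay
        "T30": decay_time(edc, fs, -5.0, -35.0),  # -5 dB to -35 dB
    }

for label, values in results.items():
    print(label, {name: round(value, 2) for name, value in values.items()})
```

Under these assumptions both measures return the true decay time for the clean decay. With the noise floor added, the EDT fit stays close to the true value because it uses only the first 10 dB of the curve, whereas the uncompensated T30 fit is dominated by the noise tail and is heavily overestimated.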


The examination of the functions relating to the main intelligibility measure, STI, was

based on the analysis of MTI, effectively an octave band specific ‘STI’, in relation to the

T30 and EDT values; for screening purposes, MTI values were used to derive an

unweighted version of STI for a given measurement, so as to enable a comparison

between reference and experimental conditions.

The MTI values corresponding to the series of low level measurements were assessed against the T30 and EDT values involved, see figure 4.3. In an example for the 250Hz octave band in Room 1, MTI (figure 4.13) appeared to be more closely related to EDT, given that the initial large fluctuations in T30 did not significantly affect the resultant MTI at the corresponding measurements. Comparable behaviour was observed for all the rooms considered, see figures 4.13-4.22, where EDT entirely outweighed T30 in terms of significance in an MTI assessment.

The extent of the influence of a single erroneous octave band in terms of MTI appeared to be moderated by the averaging procedure in the final STI estimation, see figures 4.13-4.22 for MTI and figures 4.23-4.32 for STI. For example, in Room 1 (figure 4.23) the 250Hz octave band alone did not appear to notably affect the final STI; accurate STI values were derived even when the MTI(250Hz) variation exceeded the JND. The effect is more obvious when considering additional octaves, given that accurate STI values in the series of measurements extend beyond the first erroneous MTI values occurring at particular octave bands. Depending on the frequency considered, the octave band weighting that is normally applied when estimating the final STI value could enhance or reduce the effect.

Figure 4.13. MTI data in Room 1

Figure 4.14. MTI data for Room 2

Figure 4.15. MTI data for Room 3

Figure 4.16. MTI data for Room 4

Figure 4.17. MTI data for Room 5

Figure 4.18. MTI data for Room 6

Figure 4.19. MTI data for Room 7

Figure 4.20. MTI data for Room 8

Figure 4.21. MTI data for Room 9

Figure 4.22. MTI data for Room 10

Figure 4.23. STI in Room 1 (no frequency weighting)

Figure 4.24. STI in Room 2 (no frequency weighting)

Figure 4.25. STI in Room 3 (no frequency weighting)

Figure 4.26. STI in Room 4 (no frequency weighting)

Figure 4.27. STI in Room 5 (no frequency weighting)

Figure 4.28. STI in Room 6 (no frequency weighting)

Figure 4.29. STI in Room 7 (no frequency weighting)

Figure 4.30. STI in Room 8 (no frequency weighting)

Figure 4.31. STI in Room 9 (no frequency weighting)

Figure 4.32. STI in Room 10 (no frequency weighting)

Since the derivation of STI involves averaging at the MTI (and mF) level, a somewhat reduced sensitivity of the measure to T30 and EDT fluctuations is to be expected.
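The averaging argument can be sketched numerically. The example below assumes the usual conversion of modulation transfer values to transmission indices through an apparent S/N limited to ±15 dB, followed by a plain mean per octave band (MTI) and a plain mean over bands (the unweighted STI used here for screening). The band layout of seven octave bands by fourteen modulation frequencies and the modulation values are illustrative, not measured data.

```python
import numpy as np

def transmission_index(m):
    """Map modulation transfer values (0..1) to transmission indices (0..1)."""
    m = np.clip(np.asarray(m, dtype=float), 1e-6, 1.0 - 1e-6)
    snr_apparent = 10.0 * np.log10(m / (1.0 - m))            # apparent S/N in dB
    return (np.clip(snr_apparent, -15.0, 15.0) + 15.0) / 30.0

def mti_per_band(m_matrix):
    """MTI per octave band: mean transmission index over modulation frequencies."""
    return transmission_index(m_matrix).mean(axis=1)

def unweighted_sti(mti):
    """Unweighted STI: plain mean of the octave band MTI values."""
    return float(np.mean(mti))

bands = ["125Hz", "250Hz", "500Hz", "1kHz", "2kHz", "4kHz", "8kHz"]
m_ref = np.full((7, 14), 0.5)             # hypothetical reference condition
m_err = m_ref.copy()
m_err[bands.index("250Hz"), :] = 0.2      # hypothetical error confined to one band

sti_ref = unweighted_sti(mti_per_band(m_ref))
sti_err = unweighted_sti(mti_per_band(m_err))
mti_shift = mti_per_band(m_ref)[1] - mti_per_band(m_err)[1]
sti_shift = sti_ref - sti_err

print(f"MTI(250Hz) shift: {mti_shift:.3f}, unweighted STI shift: {sti_shift:.3f}")
```

With a plain mean over seven bands, an error confined to one octave band reaches the unweighted STI at one seventh of its size, which is consistent with accurate STI values persisting beyond the first erroneous MTI value.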

4.4.2 Correlation of T30 (and EDT) with threshold efficient S/N

The relation between threshold efficient S/N and T30 – EDT values was examined for the

ten test rooms in order to verify a prospective relation.

Assumptions made in the initial investigations in relation to T30 could be confirmed, with a number of exceptions as in Room 3 and Room 9, see Appendix 4.1. The potential effect of the measurement system in this respect was later examined by a series of measurements aimed at verifying consistency as well as repeatability for a given configuration, see section 4.4.3. The outcome relating to Room 3 was attributed to the room’s ceiling design, which effectively formed a coupled space given an opening of 0.5m around the perimeter of the suspended ceiling. The size of Room 9, in combination with transient noise events exceeding the simulated noise level, was mainly responsible for the differing results in that room. Nonetheless, correlation coefficients ranging from 0.70 to 0.92 were established for the greater part of the experimental conditions, with a mean of approximately 0.80; see Appendix 4.1 for the full set of results.
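For reference, a correlation coefficient of this kind can be computed as sketched below. The text does not state which coefficient was used, so a Pearson product-moment coefficient is assumed; the octave band values are hypothetical and are not taken from Appendix 4.1.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Hypothetical octave band data for one room (not measured values)
t30_s = [0.4, 0.5, 0.6, 0.7, 0.9, 1.1, 1.3]
threshold_snr_db = [3.6, 6.4, 4.4, 6.9, 8.8, 9.6, 12.1]

r = pearson_r(t30_s, threshold_snr_db)
print(f"r = {r:.2f}")
```

A value close to 1 indicates that the octave bands with longer T30 are also those requiring a higher threshold efficient S/N.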

Comparable performance was further observed in terms of EDT, see Appendix 4.1, the

only exceptions being Room 1 and Room 3 due potentially to the shape characteristics

(coupled space) in Room 1 and the ceiling design in Room 3, as previously described.

The overall results, as shown in figures 4.33-4.34, demonstrate an evident trend in terms

of the relation of threshold efficient S/N to T30 and EDT. For increasing values of the

latter a higher S/N is generally required for an accurate measurement, while for the

smaller sized test rooms forming the main focus of the current study the effect was more

evident. Data points relating to smaller rooms resulted in a better defined relation in the

T30 case, see figure 4.33, than larger or noisier rooms (when considering BGNL and

transient noise events, see Appendix 5). A smaller spread of results was found for data


relating to EDT accuracy, see figure 4.34, while the advantage of the measure over T30

for marginal conditions could be highlighted, given the lower S/N appearing within the

data sets.

[Plot: "Threshold efficient S/N for ascending T30". Horizontal axis: S/N (dB), -5 to 30. Vertical axis: 0 to 2. Data series: Room 1 to Room 10.]

Figure 4.33. Threshold efficient S/N trend in relation to T30, 125Hz-8kHz octave band data in ten test rooms

[Plot: "Threshold efficient S/N for ascending EDT". Horizontal axis: S/N (dB), -5 to 30. Vertical axis: 0 to 2. Data series: Room 1 to Room 10.]

Figure 4.34. Threshold efficient S/N trend in relation to EDT, 125Hz-8kHz octave band data in ten test rooms

Higher signal levels are needed to obtain adequate S/N at lower frequencies in particular, due to the effect of BGNL. Given that higher frequencies are more intrusive in terms of annoyance, a significantly more tolerable level can be achieved by suitable signal equalization that avoids unnecessarily high levels at particular octave bands, i.e. equalization according to the BGNL so as to produce a constant S/N across the frequency spectrum, facilitating measurement accuracy. The use of equalization on this basis assumes that speech intelligibility parameters are derived via post processing of the measurements, to suitably account for the influence of speech level and BGNL.
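A minimal sketch of equalization according to the BGNL follows. The octave band background levels and the 20dB target are hypothetical; the comparison case is a spectrally flat signal set high enough to meet the same S/N in the noisiest band.

```python
import math

def total_level_db(band_levels_db):
    """Energetic sum of octave band levels in dB."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in band_levels_db))

# Hypothetical background noise levels per octave band, in dB
bgnl = {"125Hz": 45.0, "250Hz": 40.0, "500Hz": 35.0, "1kHz": 30.0,
        "2kHz": 27.0, "4kHz": 25.0, "8kHz": 22.0}
target_snr = 20.0  # assumed constant S/N target in dB

# Equalised signal: every band sits exactly target_snr above its own BGNL
equalised = {band: level + target_snr for band, level in bgnl.items()}

# Flat signal: every band at the level needed by the noisiest band
flat_level = max(bgnl.values()) + target_snr
flat = {band: flat_level for band in bgnl}

print(f"flat signal total:      {total_level_db(flat.values()):.1f} dB")
print(f"equalised signal total: {total_level_db(equalised.values()):.1f} dB")
```

For a background spectrum that falls with frequency, the equalised signal meets the same S/N in every band at a lower overall level, with the reduction concentrated in the higher octave bands.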

4.4.3 Repeatability of results with/without simulated noise floor

In an attempt to improve control over the variable of fluctuating BGNL, a pink noise source was incorporated in the reproduction system during measurements for the majority of the sessions. The efficiency of the approach was examined within two test rooms (a small sized reverberation chamber (N11) and Room 10, see figure 4.35), using six system configurations as described in section 4.2.2, so as to verify the consistency of the measurements in this respect. Repeatability of results suffered from occasional transient noise events exceeding the simulated noise level and from partly non linear decays within the rooms; even so, the incorporation of pink noise was shown to provide, on occasion, a limited advantage in terms of measurement repeatability.

Tables 4.3-4.4 compare the outcome for the six different system configurations. Considering the results in two groups within each table, depending on the source type used for the signal reproduction (i.e. SS or Omni), a close relationship was found among the datasets. A number of exceptions were observed, relating mainly to the two configurations not using simulated noise. Accordingly, the incorporation of pink noise appeared overall to increase confidence in the results. Taking into account the S/N derivation methodology (i.e. using the last 10% of the impulse response), however, it should be noted that a deterministic outcome cannot be assumed, given also the variable of decay linearity that could not be controlled.

Figure 4.35. Test rooms used in the assessment of result repeatability, I) N11, II) Room 10

Threshold efficient S/N (dB)

Measurement configuration            125Hz  250Hz  500Hz   1kHz   2kHz   4kHz   8kHz
SS (signal), SS (noise)               14.3   21.3   13.7    8.2    9.8    4.3   16.5
SS (signal), Omni (noise)             14.6   21.7   13.2    9.3   18.2    5.1   15.3
SS (signal), No simulated noise       20     19.4   14.4   10     10.7    6.3   16.8
Omni (signal), Omni (noise)           11.1   11.5    8.2   17.3    5.8   12.4    6.6
Omni (signal), SS (noise)              9.9   10.6   15.7   13.9    7.3   13.3    5.6
Omni (signal), No simulated noise     17.3   10.6   10      7.5   10.2   10.9    1.6

Table 4.3. Threshold efficient S/N for the six system configurations derived from marginal T30 data in N11 (reverberation chamber)

Threshold efficient S/N (dB)

Measurement configuration            125Hz  250Hz  500Hz   1kHz   2kHz   4kHz   8kHz
SS (signal), SS (noise)               11.8    8     -0.1    6.4    3.6    6.9    6.4
SS (signal), Omni (noise)             10.5    6.1   -1      6.9    3.3    5.4    7.9
SS (signal), No simulated noise       12.1    0.8    2.4    5      3.8    6.7   -5.7
Omni (signal), Omni (noise)           10.9   13.6    6.4    6.6    5.7    3.6    3.9
Omni (signal), SS (noise)              9.6   12.1    4.5    8.8    4.4    4.9    5.7
Omni (signal), No simulated noise     12.3   13.4    5.3    8.4    4.7    4.6    3.6

Table 4.4. Threshold efficient S/N for the six system configurations derived from marginal T30 data in Room 10
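The grouping by signal source can be checked directly on the Room 10 values of Table 4.4. The sketch below is only one possible way of summarising the table and is not the analysis performed in this study: for each signal source it takes the largest octave band difference between the configuration using the same source for noise and each of the other two configurations.

```python
# Threshold efficient S/N (dB), Room 10, 125Hz to 8kHz, copied from Table 4.4.
# Keys are (signal source, noise source); None means no simulated noise.
room10 = {
    ("SS", "SS"):     [11.8, 8.0, -0.1, 6.4, 3.6, 6.9, 6.4],
    ("SS", "Omni"):   [10.5, 6.1, -1.0, 6.9, 3.3, 5.4, 7.9],
    ("SS", None):     [12.1, 0.8, 2.4, 5.0, 3.8, 6.7, -5.7],
    ("Omni", "Omni"): [10.9, 13.6, 6.4, 6.6, 5.7, 3.6, 3.9],
    ("Omni", "SS"):   [9.6, 12.1, 4.5, 8.8, 4.4, 4.9, 5.7],
    ("Omni", None):   [12.3, 13.4, 5.3, 8.4, 4.7, 4.6, 3.6],
}

def max_abs_difference(a, b):
    """Largest octave band difference between two configurations, in dB."""
    return max(abs(x - y) for x, y in zip(a, b))

def summarise(signal_source):
    """Compare the same-source noise configuration with the other two."""
    reference = room10[(signal_source, signal_source)]
    other = "Omni" if signal_source == "SS" else "SS"
    return {
        "other noise source": max_abs_difference(reference, room10[(signal_source, other)]),
        "no simulated noise": max_abs_difference(reference, room10[(signal_source, None)]),
    }

for source in ("SS", "Omni"):
    print(source, {k: round(v, 1) for k, v in summarise(source).items()})
```

In the SS group the two configurations with simulated noise differ by at most 1.9dB, while the configuration without simulated noise deviates by up to 12.1dB (at 8kHz). In the Omni group all three configurations stay within roughly 2dB of each other.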

In a correlation analysis for the two datasets, see Appendix 4.2 for an overview, results within N11 demonstrated a high correlation of EDT with the threshold efficient S/N, approaching 0.91 with a mean of 0.81 and remaining consistently above 0.77. The equivalent results within Room 10 showed a similar degree of consistency, excluding however the system configurations using the portable sound system for reproduction of the test signal. The latter resulted in a considerably weaker correlation between EDT and threshold efficient S/N due to the size of the space, in connection with the additional number of sources. It should be remembered that, just as the receiver positioning within a space highly influences the EDT, so does the source positioning. For marginal conditions, an increased number of sources would thus potentially increase measurement uncertainty in terms of EDT.

Comparable results were obtained in terms of T30 for Room 10, where using a single

source demonstrated a more evident relation of threshold efficient S/N to T30. The N11

results produced the same trend in terms of the sound source used, however, with

significantly lower correlation values overall. The reverberation chamber appeared to

give less consistent results due to fluctuating noise levels and a relatively long RT in a

small room. A similar effect was also found for the portable sound system in both rooms,

given the size of Room 10 and the long RT in N11.

It was established that using simulated background noise in any of the four possible configurations did not introduce any setbacks, with comparable results for all four datasets in terms of threshold efficient S/N derived from marginal T30 data. Comparison with the output from the two configurations not using simulated background noise showed that the incorporation of pink noise did provide a limited advantage in terms of measurement repeatability, while the equivalent results for S/N derived from marginal EDT data produced a less evident differentiation in performance.

A correlation analysis showed only small deviations when considering the EDT data for both rooms, with the exception of the ‘SS’ case in Room 10 (large room). Here, using the multiple loudspeaker arrangement (SS) for signal reproduction, the use of simulated background noise resulted in a less correlated relation of threshold efficient S/N to both T30 and EDT. In terms of T30, system performance for the three remaining datasets gave no clear differentiation. However, the use of a single (omni directional) source rather than a multi source arrangement (SS) for signal reproduction appeared to give a more consistent outcome in terms of the relation of T30 to threshold efficient S/N, for both Room 10 and N11.

4.5 Conclusions

In this chapter low level measurements within ten test rooms have been analyzed in terms of a reducing S/N and its effect on measured parameters, most notably T30, EDT and STI. Overall, EDT emerged as the more consistent parameter when compared to T30, showing smaller value fluctuations over the series of measurements performed and appearing to be significantly more accurate under marginal conditions. The outcomes as presented demonstrated a trend in the relation of the threshold efficient S/N to T30 and EDT: for increasing values of room reverberance, a higher S/N is generally required for an accurate measurement of the respective parameter. Overall, S/N values of 20dB and 12dB were necessary in this study for an accurate estimation of T30 and EDT respectively, over all frequency bands with no signal averaging. An advantage of EDT could thus be highlighted.

STI has been found to be more closely related to EDT than to T30. The different

averaging processes in deriving STI were highlighted, suggesting that errors at the MTI

(or mF) level can potentially be averaged out.

In the following chapter the acoustical conditions of the test rooms are analyzed on the

basis of computer modelling, where the outcomes relating to EDT efficiency are put into

a different context to highlight obtainable advantages in the new terms.


Chapter 5 – Computer modelling of test spaces

CHAPTER 5

Computer Modelling of Test Spaces

5.1 Introduction

In Chapter 2 core modelling approaches for the prediction of acoustic conditions in

enclosed spaces have been presented. Hybrid models have emerged as the most

appropriate solution for room acoustic investigations [101]. Clear guidelines for the

preparation of models however do not exist. The geometry detail level needed for an

accurate prediction is one of the key factors that is open for interpretation. Additional

parameters central to the process include absorption and scattering functions relating to

the room surfaces. As such, absorption and scattering coefficients need to be realistically

defined considering the actual room configuration for an acceptably accurate prediction.

Often these coefficients are approximated, leading to potentially significant prediction

errors particularly when combined with additional variants e.g. a simplified definition of

room geometry or erroneous source directivity.

An approximation of the actual room geometry is commonly used depending on the size

of the room and its features, initiating from rules of thumb that arguably concern large

spaces, see section 2.7.1. Consequently, they do not apply in smaller classroom type

spaces as will be shown in this chapter.

A large selection of absorption coefficient data is widely available, normally relating to

random incidence of sound. This approximation is considered suitable for simulation

purposes, though random incidence angles present an uncertainty factor that is

nonetheless acceptable [92]. Possible exceptions in the usability of such data are cases

requiring angle dependent data, e.g. flat rooms [92]. Accordingly, classroom type spaces


can comprise a special case for which the estimation of suitable absorption coefficients

becomes more complex. The use of typical values in this case will invalidate the

prediction.

Similarly, a set of scattering coefficients needs to be defined to describe the amount of

energy that will not follow a specular reflection from a surface. Sound diffusion and

scattering are used to characterize diffuse reflection and comprise a more ambiguous term

as their measurement has not been standardized until recent years; data for different

surfaces is thus limited. In 2001, the AES Working Group SC-04-02 published a

document [124] standardizing the measurement procedure to quantify diffusion

coefficients. The latter however was not intended as input for computer models and is

thus generally incompatible with the type of input required by simulation algorithms. In

contrast, the scattering coefficient is compatible with geometric room acoustics models

(Cox, cited in Nironen [125]). The related ISO document [126] describes the measurement

of random incidence scattering coefficients in a reverberation chamber, based on a

procedure suggested by Vorländer and Mommertz [127] in 2000. Edge diffraction leading

to scattering is not accounted for in the calculation although it can be considered as less

important given acoustically large room surfaces. It is worth noting that although

scattering coefficient data is gradually becoming available for different surface types,

prediction input values will still have to be estimated in most cases, see Dalenbäck [128].

This observation is a result of the uncertainties involved in the calculation of scattering,

relating mainly to the given surface size and potential geometry simplifications that need

to consider accordingly an altered acoustic behaviour in these terms. Although not always

the case, including finer geometric details can simplify the scattering considerations in

this respect, predominantly for small rooms.

Three international round robin projects [107, 108, 109, 110] have been undertaken to test room

acoustics prediction software and assess different aspects of the uncertainties involved in

combination with the resulting prediction accuracy/consistency. Results indicated that a

reasonably accurate outcome in the range of 1-2 JND for the different measures or within

the inherent experiment uncertainties can be expected. User input however remains a


critical variant that can be reduced in the design of the prediction model and in its

preparation i.e. choosing suitable input data in terms of the absorption and scattering

coefficients. In the same context, different authors have presented their approach in

estimating the particular prediction input (see Hodgson et al. [129], Zeng et al. [130], Saher

et al. [131] etc.). The applicability of the methods however is not universal, thus

questionable in many cases.

In the following sections, the design and preparation of computer models is discussed in

more detail to establish a suitable approach for the types of space used in this study.

Validation of the prediction is similarly a crucial stage, thus the proposed methods are

assessed in terms of their efficiency and resulting accuracy for a number of related

parameters. Ten test rooms are modelled and later examined in view of the experimental

results in section 5.3.

5.2 Preparation methodology

5.2.1 Model design

In large spaces wave effects can be efficiently approximated using statistical theory, and computer modelling can thus more competently approximate the actual conditions for such enclosures. Conversely, modelling small spaces is a challenge, as wave effects are most prominent in such rooms.

A general rule of thumb in designing a computer model is to use a maximum resolution of 0.5m for the interior of the room [92]. However, it is argued that such rules of thumb do not apply in classroom type enclosures. In small rooms the detail resolution limit can potentially be extended to marginally finer resolutions, for a better interpretation of conditions (e.g. for receiver positions in close proximity to structures) and a more efficient account of scattering by the automated algorithms. CAD software can assist in the design process by offering an improved platform, with subsequent effects on a number of aspects of the efficiency of the process.


A methodology that enables the actual conditions to be better approximated is the main

objective of this chapter. An efficient approach could thus become simpler to define.

5.2.1.1 Model detail resolution

Representing room geometry in a computer model requires an assessment of the room

and its architectural details, so as to evaluate the design resolution necessary for an

adequate computer representation. With the dimensions and shape of the room as a

starting point, additional parameters can be accounted for such as the room’s fittings;

related dimensions, location and density information needs to be appropriately assessed,

considering the case of a simplified version of the specific geometry characteristics, to

determine the model input in these terms. The time needed to construct the actual model

is also an important consideration central to the process. Third party CAD software can

be applied to speed up the process of model construction, influencing the resulting detail

and accuracy of the space representation. Automated processes such as point to point

connections can allow for a reduced number of setbacks in the construction process, e.g.

an ‘open model’, misrepresented room symmetry and erroneous object positioning

among others. Considering that CAD platforms are typically architecture oriented, a

reduced construction time can be expected. Geometry details can be significantly

enhanced, see figure 5.1, however this can affect the crucial balance between an accurate

representation and prediction efficiency. The processing time needed for the simulation

can be adversely affected as a direct result of the altered overall process.

Common outcomes of current methodologies are either overly simplified room geometries, which are inadequate for a detailed assessment, or inefficiently complex geometries resulting from the simpler construction process that CAD software allows. An experimental session is presented in this context based on typical lecture rooms, i.e. small rooms, see section 5.3.2. This will establish the potential advantage of a finer design resolution in the construction of a model, as reflected in the prediction output for the space type considered.

Figure 5.1. Example room geometry, I) Simple representation II) Enhanced detail

5.2.1.2 Source directivity

The directional characteristics of the sources used in computer simulations have been

shown to affect, to various degrees, both the numerical output as well as the subjective

perception of auralized material, see Dalenbäck [132], Wang [133]. For critical conditions in

particular, source directivity can improve or significantly impair speech intelligibility.

A white paper by Dalenbäck [134] suggests that prediction accuracy does not necessarily

rely solely on the angular resolution of directivity data or frequency resolution used in

computer predictions. An accurate representation of the near field of the source could

also be a critical point. It can thus be deduced that for smaller rooms it is vital to utilize

directivity data measured in the near field of the source considered.

Two types of source were used within the rooms as described in Chapter 3. While a default omni directional model was used for the omni source, near field response data at an acceptable angular and frequency resolution needed to be established for the studio monitors. The latter were measured at a distance of 1m from the source in free field conditions at a 10˚ resolution (1/1 octave bands), according to the measurement procedure described in BS EN 60268-5:2003 [135]. The obtained results, as used in the simulations, are shown in figures 5.2-5.3.

Figure 5.2. Measured directivity response for Yamaha HS50M monitors at 1m in free field conditions (balloon)

Figure 5.3. Measured directivity response for Yamaha HS50M monitors at 1m in free field conditions (polar)

5.2.1.3 Definition of absorption and scattering coefficients

Accurate absorption and scattering coefficients are fundamental in achieving acceptable

prediction accuracy. A number of aspects are involved in the process as described in

section 5.1; surface absorption is primarily considered.

Rules of thumb for defining scattering functions can be followed i.e. ‘a minimum of 20%

default diffusion for average-size smooth flat surfaces or 10% for big flat smooth

surfaces’, ‘a high value (80%) for rough surfaces where the roughness scale is of the

order of the wavelength’ and edge diffusion [128]. At the same time it is clear that

overestimating rather than underestimating scattering coefficients is preferred. These

simple steps, in combination with an adequately detailed room geometry, can serve to

efficiently approximate scattering effects, though minor tuning might be needed post

prediction to rebalance the influence of absorption. It is worth noting that the sensitivity

of a room to scattering coefficients in the context of prediction output may depend largely

on the amount of absorption present; i.e. larger output variations for lower overall

absorption when altering scattering input values [136].
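The rules of thumb quoted above can be written down as a small lookup for preparing starting values. This is a sketch only: the surface class names are hypothetical labels introduced for illustration, edge diffusion is not covered, and treating the quoted values as lower bounds for a user estimate is an assumption that follows the stated preference for overestimating rather than underestimating.

```python
# Starting scattering coefficients taken from the quoted rules of thumb [128].
# The class names are illustrative labels, not terms used by any software.
DEFAULT_SCATTERING = {
    "large_flat_smooth": 0.10,       # big flat smooth surfaces
    "average_flat_smooth": 0.20,     # average-size smooth flat surfaces
    "rough_wavelength_scale": 0.80,  # roughness of the order of the wavelength
}

def starting_scattering(surface_class, estimate=None):
    """Return a starting scattering coefficient for a surface class.

    If the user supplies an estimate, the larger of the estimate and the
    rule-of-thumb value is used (assumption: err towards overestimation).
    """
    default = DEFAULT_SCATTERING[surface_class]
    if estimate is None:
        return default
    return min(1.0, max(default, estimate))

print(starting_scattering("average_flat_smooth"))
print(starting_scattering("large_flat_smooth", estimate=0.25))
```

Values obtained in this way are starting points only; as noted above, minor tuning might still be needed after the prediction to rebalance the influence of absorption.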


Absorption coefficients can be deduced from measurements if the computer model is to be used as a post evaluation tool. However, this is not the case for the design condition. In the latter case a calibration procedure to match the prediction outcome to measured values (see section 5.2.2) cannot be performed, since room acoustics measurements are unavailable. Predefined values of absorption coefficients thus need to be used and the user is expected to assess the suitability of any data. A large selection of related databases is available; however, text book values are to some extent not valid for normal enclosed spaces due to the unrealistic conditions under which they were measured, e.g. free field. Acceptable statistical approximations in the context of absorption coefficients are described in section 5.1; however, for small rooms in particular a given dataset will in many cases not match the actual conditions if individual cases, e.g. a representation of an audience area, are not suitably evaluated and their acoustical characteristics readjusted when necessary. An alternative approach is an empirical estimation of absorption, as described by Hodgson and Scherebnyj [129]. Using acoustical measurements in actual classrooms, the absorption coefficients for a number of materials were empirically calculated, in some cases differing significantly from text book values. However, the applicability of such data is conceivably limited to nearly identical rooms since the definition of absorption is an absolute case. A combination of carefully selected text book values with empirical data would potentially be the most efficient approach.

In contrast, given that a computer simulation is utilized as a post evaluation tool, the

potential availability of actual room acoustics data enables the option of an objective

prediction accuracy assessment and fine tuning of results, i.e. validation/calibration.

Considering the limitations of empirical procedures, an estimation of absorption

coefficients by comparison to actual room acoustics measurements would certainly be the

most appropriate solution.

The process of confirming the prediction consistency by comparison to measured

parameters is referred to as model validation. Similarly, fine tuning a prediction to match

the results to measured parameters is referred to as model calibration. A

validation/calibration procedure in terms of T30 has been found to be an efficient

methodology [114, 121]. Absorption (and scattering) coefficient values are initially

approximated allowing the fine tuning of the simulation by comparing actual and

predicted acoustic parameters, normally T30, although alternative parameters could be

used [114]. The validation/calibration procedures are described in the following section.

5.2.2 Model validation/calibration methodology

The validation/calibration procedures assume the use of computer models as a post

evaluation tool. As such, approximate scattering and absorption coefficients act as a

starting point. Scattering coefficients must be generically set if no specific data is

available, according to theoretical assumptions i.e. depending on the surface area and

material roughness of related surfaces, see section 5.2.1.3. Fine tuning can be later

performed, if necessary, after establishing a balance in the definition of surface

absorption coefficient values; however further action is rarely required.

A simple calibration procedure can be used to generically adapt the model to the actual

conditions in terms of the absorption coefficients. Reverberation time (T30) values

comprise the determinants of model performance and thus, a step by step evaluation is

enabled using multiple receiver positions, typically six. Performance optimization is

initially undertaken for one receiver, preceding sequential reference to data for the

additional receiver positions. Reaching a balance among the prediction setups will enable

the determination of a set of acoustic properties for the room surfaces, which can be

assumed as correct and thus, allow for reliable predictions under the conditions necessary

for different scenarios/experimental setups in the same room.
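
The calibration loop described above can be illustrated with a minimal sketch. It is not the procedure implemented in the prediction software: Sabine's formula stands in for the simulation, all surfaces are scaled by a common factor, and the mean over the receiver positions replaces the receiver-by-receiver balancing. The function names, room dimensions, starting coefficients and the 5 % tolerance are assumptions made purely for illustration.

```python
SABINE_CONSTANT = 0.161  # s/m, Sabine's formula in SI units


def sabine_t30(volume_m3, areas_m2, alphas):
    """Stand-in for the simulation: Sabine reverberation time in seconds."""
    absorption_area = sum(s * a for s, a in zip(areas_m2, alphas))
    return SABINE_CONSTANT * volume_m3 / absorption_area


def calibrate_band(volume_m3, areas_m2, alphas, measured_t30,
                   tolerance=0.05, max_iterations=50):
    """Scale the absorption coefficients of one octave band until the
    predicted T30 is within `tolerance` (relative) of the mean measured T30.

    Hypothetical helper: a real calibration would adjust surfaces
    individually and re-run the room acoustics prediction each time.
    """
    target = sum(measured_t30) / len(measured_t30)
    alphas = list(alphas)
    for _ in range(max_iterations):
        predicted = sabine_t30(volume_m3, areas_m2, alphas)
        if abs(predicted - target) / target <= tolerance:
            break
        # Reverberation time is inversely proportional to absorption,
        # so a prediction that is too long calls for more absorption.
        scale = predicted / target
        alphas = [min(0.99, max(0.01, a * scale)) for a in alphas]
    return alphas, sabine_t30(volume_m3, areas_m2, alphas)


if __name__ == "__main__":
    # Assumed 8 m x 6 m x 3 m room: floor, ceiling and walls.
    areas = [48.0, 48.0, 84.0]
    start = [0.05, 0.30, 0.10]
    measured = [0.65, 0.63, 0.64, 0.66, 0.62, 0.64]  # six receivers, assumed
    tuned, t30 = calibrate_band(144.0, areas, start, measured)
    print([round(a, 3) for a in tuned], round(t30, 2))
```

In practice the starting coefficients would come from textbook or empirical data, and the stopping criterion would be chosen with reference to the JND of the parameter used for calibration.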

The calibration procedure can be similarly applied with the use of alternative acoustic

parameters e.g. EDT rather than T30. While T30 has been shown to largely incorporate

most of the room acoustical characteristics [1], the potential advantages of using EDT in

particular are later examined and discussed in more detail, see sections 5.4 and 5.5.

5.3 Experimental results

In the context of small enclosed spaces, this section presents experimental results relating

to the validation/calibration procedures. The focus is on the acoustic parameter to be used

in the process, so that the prediction output and room acoustics measurements can be

compared. Results are further complemented by an assessment of the level of detail

resolution that is required by a model for an accurate prediction in the terms used.

5.3.1 Basis for model validation/calibration

The key procedures were examined in practice in a pilot study with the use of three test

rooms. The latter were objectively assessed onsite and their geometries modelled so as to

enable a comparison of results, thus allowing for an evaluation of the methods used in the

approach. Four source configurations utilizing an omni directional speaker and a portable

sound system were used in each room, as described in Chapter 3. This provided the basis

for the prediction and measurement sessions. In the following sections, the test

methodology is described and key points are discussed in view of the results obtained.

5.3.1.1 Test methodology - Room acoustics measurements

Room acoustics measurements were performed in the three test rooms using a WinMLS

2004 [6] based measurement system combined with a pair of omni directional sound

source and receiver. A swept sine test signal was utilized to excite the space and multiple

measurements, based on BS ISO 3382 [118], were taken for six receiver and two source

positions.

5.3.1.2 Test methodology – T30 calibration

A T30 based approach was used as described in section 5.2.2 to match the prediction

output to room acoustics measurements; the procedure considered only one source

configuration.

5.3.1.3 Test methodology – Results via output comparison

The calibration procedure precedes a direct comparison of prediction output to actual

measured values. Consequently, the simulation quality can be assessed in terms of any

number of parameters that are included in the process.

The calibrated models for the three test rooms (figure 5.4) achieved a good result in terms

of the measured T30, i.e. prediction and measurement output was comparable. To validate

the simulation’s response to parameters other than the reference, used for calibration,

additional measures were examined, with the C50 and STI measures being of most interest

due to their high correlation to speech intelligibility. Table 5.1 shows a comparison

between actual and predicted values.

For the single source case the resulting values were found to be comparable, with the

STI (and C50) differences marginally over the JND. In some instances a somewhat altered

character was observed for the sound system assisted conditions; any differences,

however, were attributed to the input in terms of the source’s characteristics and the

simple room geometry.
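
As an illustration of how the comparison in Table 5.1 can be read against the JND, the sketch below expresses the prediction error as a multiple of the JND. The data are the Table 5.1 values; the JND values (0.03 for STI, 1 dB for C50) and the function names are assumptions, since the JND is not quantified in this section.

```python
# Assumed just noticeable differences; not taken from this section.
JND_STI = 0.03
JND_C50_DB = 1.0

BANDS_HZ = [125, 250, 500, 1000, 2000, 4000]

# Table 5.1 (single omni directional source): predicted and measured means.
C50_PREDICTED_DB = [4.2, 4.4, 3.6, 3.0, 2.2, 2.6]
C50_MEASURED_DB = [2.8, 3.5, 5.8, 5.3, 4.2, 4.5]
STI_PREDICTED, STI_MEASURED = 0.68, 0.73


def error_in_jnd(predicted, measured, jnd):
    """Absolute prediction error expressed as a multiple of the JND."""
    return abs(predicted - measured) / jnd


def band_errors_in_jnd(predicted, measured, jnd):
    """Per-band errors in JND units for two equally long band lists."""
    return [error_in_jnd(p, m, jnd) for p, m in zip(predicted, measured)]


if __name__ == "__main__":
    sti_error = error_in_jnd(STI_PREDICTED, STI_MEASURED, JND_STI)
    print(f"STI error: {sti_error:.2f} JND")
    c50_errors = band_errors_in_jnd(C50_PREDICTED_DB, C50_MEASURED_DB, JND_C50_DB)
    for band, err in zip(BANDS_HZ, c50_errors):
        print(f"C50 at {band} Hz: {err:.1f} JND")
```

Passing the JND as a parameter keeps the helper usable for other parameters and for whichever JND values are adopted elsewhere in the study.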

The incorporation of a sound system generally requires accurate performance

characteristics at hand to ensure a realistic comparison. However, given a consistent

model, experimentation for different scenarios is enabled in any case since predictions

can reveal to a large extent the room potential and/or limitations on a relative basis. The

calibration procedure provided confidence that the models would perform to an extent

consistently under different configurations, such as for alternative source types.

Figure 5.4. Top view of test rooms for the validation of T30 calibration methodology

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
EDT[s]   0.58   0.61   0.56   0.60   0.61   0.47   0.67   0.53   0.73   0.57   0.71   0.54
T15[s]   0.58   0.61   0.55   0.55   0.61   0.57   0.7    0.64   0.76   0.68   0.72   0.67
T30[s]   0.60   0.65   0.58   0.58   0.64   0.63   0.72   0.77   0.78   0.80   0.73   0.73
C50[dB]  4.2    2.8    4.4    3.5    3.6    5.8    3      5.3    2.2    4.2    2.6    4.5
SPL[dB]  76.8   N/A    76.6   N/A    77.3   N/A    78.1   N/A    78.6   N/A    78.2   N/A
STI      0.68   0.73   Rating: Good  Good
STIrMal  N/A    0.73   Rating: N/A   Good
STIrFem  N/A    0.74   Rating: N/A   Good

Table 5.1. Example mean values for single omni directional source (prediction against measurement)

5.3.1.4 Session conclusions

The validation procedure appears to be a reliable way to adapt the models to the actual

conditions. The process, performed in terms of the RT (T30) values at the receiver

positions, confirmed, to some extent, that the specific parameter largely incorporates the

general room characteristics for computer modelling purposes.

Smaller differences, found for T15 and EDT in the comparison, suggested that

optimization relying solely on T30 is principally a simplified though efficient version of

the process. In a more detailed approach, further references, e.g. EDT and SPL,

might help the simulation to perform at a higher accuracy level.

5.3.2 Model resolution

In the following sections, the influence of the detail level incorporated in a computer

model is examined in terms of the resulting prediction accuracy and overall efficiency.

5.3.2.1 Assessment preparation and the impact of detail resolution

The computer models used in this assessment were constructed using two different

approaches to examine the efficiency of a particular design in terms of level of detail. For

reference purposes, the first approach used a coordinate system in text file format that is

typical within modelling software packages. Considering the time needed to obtain a final

working version, see figure 5.5 (I), basic architectural detail was incorporated with a

construction time ranging from ~24-96 hours.

In the second approach, third party CAD software i [137] was used to construct and later

export [138] the models in a format compatible with the prediction software. Given a faster

working routine a greater amount of room detail could be incorporated in the models, see

figure 5.5 (II), within a significantly reduced construction time. Subsequent effects on

simulation run time and quality of results were recorded and are later referenced in more

detail. Model construction required ~4-8 hours for a final working version of the rooms.

Figure 5.5. Example of model detail resolution in Room 8: I) via coordinate system, II) via CAD software

Computer simulations were performed in CATT Acoustics v8.0f ii [128] where all models

were debugged prior to use. A calibration procedure based on T30 values was used to

generically adapt the simulations to actual conditions, resulting in the derivation of two

data sets for the test conditions. A direct comparison of the predictions to actual values

was thus possible.

5.3.2.2 Assessment result for a single omni directional source

The output from the prediction process was compared to the equivalent output from

actual measurements to determine the model efficiency in the terms used (T30, EDT, C50

and STI). The models analyzed, abbreviated to ‘simple’ and ‘CAD’, achieved a diversity

of results, with the accuracy level overall being greater for the CAD model. Exemplar

data presented in this section for Room 8 (tables 5.2-5.7, figures 5.6-5.9) demonstrate the

differences for the two models that, at various instances, both provided a satisfactory

result.

i CAD software platform was selected by considering processing features, simplicity of use and availability. Selected software was Google SketchUp Pro v.6 incorporating the Rahe-Kraft exporting plug-in.

ii CATT Acoustics v8.0f is a leading simulation software platform, selected on account of its hybrid nature, processing features and availability.

Given that validation was performed in terms of T30 values both models achieved a good

result on this basis (tables 5.2 and 5.5). For additional parameters and/or altered

conditions, e.g. a different source type and position (see also section 5.3.2.3), the

prediction accuracy of the ‘simple’ model was nonetheless found to be adversely

affected.

Table 5.2. T30 for actual and predicted conditions (simple) in Room 8

Table 5.3. EDT for actual and predicted conditions (simple) in Room 8

Figure 5.6. Mean EDT for actual and predicted conditions (simple) in Room 8

Table 5.4. C50 for actual and predicted conditions (simple) in Room 8

Figure 5.7. STI comparison for actual and predicted conditions (simple) in Room 8

Table 5.5. T30 for actual and predicted conditions (CAD) in Room 8

Table 5.6. EDT for actual and predicted conditions (CAD) in Room 8

Figure 5.8. Mean EDT for actual and predicted conditions (CAD) in Room 8

Table 5.7. C50 for actual and predicted conditions (CAD) in Room 8

Figure 5.9. STI comparison for actual and predicted conditions (CAD) in Room 8

In the EDT comparison high accuracy was observed for the ‘CAD’ model, while the

‘simple’ model appeared to lose consistency in these terms (tables 5.3 and 5.6). The

EDT prediction had a direct effect on the derivation of STI, as seen in figures 5.7-5.9.

With the STI (and C50) measure being of most interest, the comparison to measured

values showed STI prediction errors up to twice the JND for the ‘simple’ model, with

discrepancies over 0.04 at all receiver positions. For the ‘CAD’ model, in contrast, values

were within the specified limits at all but one receiver position. C50 results appeared

more satisfactory for both models, although with noticeably higher precision for the

‘CAD’ model, see tables 5.4 and 5.7. Receiver position 6 resulted in a larger prediction

error than the trend for both models; this, however, was attributed to a limited effect of

localized BGNL on the consistency of room acoustics measurements.

5.3.2.3 Use of Alternative Source Configurations

Experimentation using source configurations other than the original omni directional

arrangement (S1) aimed at establishing the simulation consistency for additional

configurations. Given that validation normally takes place for a single condition i.e. one

source and a set of receiving positions, the combination of validation comprehensiveness

and model detail can be used as an indication of the prospective prediction consistency

when using alternative configurations.

Given the current methodology, the ‘simple’ and ‘CAD’ models were examined to

establish differences in performance on this basis. Simulating a sound system installation

of four directional loudspeakers, a new set of predictions was performed without

additional fine tuning and compared to the measured values (tables 5.8 and 5.9, see also

section 5.4). The variation in each case when compared to measurements showed that the

‘simple’ model retained reasonable accuracy but was notably influenced by the

altered environment. Discrepancies in terms of Ts, C50 and C80 were marginally

acceptable; however, an error of 0.04 was found in the predicted STI. A more elaborate

initial validation could potentially be used in this case to enhance performance. Smaller

magnitude differences were found for the ‘CAD’ model, where most notably the STI

discrepancy was 0.01.

F[Hz]    125    250    500    1000   2000   4000
Ts[ms]   17.2   5.8    10.7   13.6   11.2   12.1
C50(dB)  -0.9   -0.4   -2.6   -4.4   -3.5   -3.2
C80(dB)  -0.9   0.2    -3.0   -4.4   -3.7   -3.6
STI      -0.04

Table 5.8. Example of average error for ‘Simple’ model using an alternative source configuration in Room 8 (sound system, SS4)

F[Hz]    125    250    500    1000   2000   4000
Ts[ms]   7.2    1.2    4.8    7.6    4.0    5.5
C50(dB)  0.8    0.7    -0.7   -1.9   -0.8   -1.2
C80(dB)  0.9    1.1    -1.3   -2.2   -1.3   -1.9
STI      -0.01

Table 5.9. Example of average error for ‘CAD’ model using an alternative source configuration in Room 8 (sound system, SS4)
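
The average errors listed for the two models can be condensed into a single figure per parameter. The sketch below computes the mean absolute error over the six octave bands from the tabulated values; the dictionary layout and function names are assumptions introduced for illustration, and the mean absolute error is only one of several possible summary measures.

```python
from statistics import mean

BANDS_HZ = [125, 250, 500, 1000, 2000, 4000]

# Average errors per octave band for Room 8 with the sound system (SS4),
# transcribed from the two tables for the 'Simple' and 'CAD' models.
AVERAGE_ERRORS = {
    "simple": {
        "Ts_ms": [17.2, 5.8, 10.7, 13.6, 11.2, 12.1],
        "C50_dB": [-0.9, -0.4, -2.6, -4.4, -3.5, -3.2],
        "C80_dB": [-0.9, 0.2, -3.0, -4.4, -3.7, -3.6],
    },
    "CAD": {
        "Ts_ms": [7.2, 1.2, 4.8, 7.6, 4.0, 5.5],
        "C50_dB": [0.8, 0.7, -0.7, -1.9, -0.8, -1.2],
        "C80_dB": [0.9, 1.1, -1.3, -2.2, -1.3, -1.9],
    },
}


def mean_absolute_error(errors):
    """Mean of the absolute band errors, ignoring their sign."""
    return mean(abs(e) for e in errors)


def summarise(model):
    """Mean absolute error per parameter for one model ('simple' or 'CAD')."""
    return {name: mean_absolute_error(values)
            for name, values in AVERAGE_ERRORS[model].items()}


if __name__ == "__main__":
    for model in AVERAGE_ERRORS:
        summary = summarise(model)
        print(model, {name: round(value, 2) for name, value in summary.items()})
```

Using absolute values prevents positive and negative band errors from cancelling, which would otherwise understate the discrepancy.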

A more detailed data examination revealed the extent of the discrepancies for individual

receiver positions, see figures 5.10 and 5.11. Significantly larger predicted STI errors up

to 0.09 (typically ≥ JND for all receiver positions) were found in the ‘simple’ model,

while the ‘CAD’ model predictions were more consistent with the measurements. In the

latter case, discrepancies of up to 0.05 (typically ≤ JND for the majority of receiver

positions) were found. Thus, given an assessment via ‘simple’ type models, the user

would need to account for an increased error margin that is introduced due to the reduced

detail resolution.

Figure 5.10. Example STI error in multi source conditions (S4) for ‘Simple’ model in Room 8

Figure 5.11. Example STI error in multi source conditions (S4) for ‘CAD’ model in Room 8

Overall the session gave further confidence in using a more detailed model for alternative

experimental conditions.

5.3.2.4 Discussion

In the following paragraphs, a number of simulation efficiency aspects are discussed with

reference to the model conditions that enhance usability in the current context.

Reference for Model Performance (T30)

The validation procedure using T30 appeared to be a reliable way to adapt the models to the

actual conditions. Given the detail resolution in each case the process could calibrate the

simulation, nonetheless being limited to an extent by the accuracy potential of a ‘simple’

model. The resultant prediction accuracy could be described as adequate; however, for

finer model detail the calibration process enabled an efficient simulation in more complex

cases when examining alternative experimental conditions.

Earlier investigations on the topic [114] suggested that a more detailed approach in the

validation procedure e.g. using EDT and SPL as additional references in model

calibration, could allow a simple model to perform closer to the required accuracy for a

complicated task. This study demonstrated that for a more detailed model the accuracy

level is simultaneously increased for additional parameters (e.g. EDT, C50) without

further effort, thus a more elaborate approach would be unnecessary at this stage.

Model Optimization for Improving Run Time

Simulation efficiency is a complex topic; nonetheless, it often resolves into balancing run

time against prediction accuracy and model development time. Typical run times ranged

from 3-5 minutes for the ‘simple’ version and 8-20+ minutes for the ‘CAD’ version (on a

Pentium M 2.0 GHz computer). Concentrating on the latter case, it was established that the

increased and to some extent unnecessary detail resulted in an increased number of

surfaces within the model. This was a key factor for the additional time required to

complete the prediction. To address the problem, simple steps were taken via model

redesign to reduce the number of surfaces in use, see figure 5.12.

Figure 5.12. Model detail example: I) full, II) optimized

The main characteristic in the current example is an improved construction allowing the

model to retain the original design e.g. including individual desk and seating models,

having nonetheless a significantly reduced number of surfaces. The end result in a

number of cases was a reduced run time, being on average ~3 times faster than the draft

CAD version (depending on the number of surfaces being excluded) and thus comparable

to the ‘simple’ model.

5.3.2.6 Session conclusions

The experimental procedure suggested that using either ‘simple’ or ‘CAD’ models can

result in an acceptably accurate prediction. Nonetheless, enhanced performance was

observed for the more detailed geometry representation in this study.

The validation/calibration procedures using T30 values proved to be an efficient

methodology for the purpose of adapting the simulation to an actual environment without

the need for additional references in the process, i.e. a more elaborate calibration. While

the latter option, although notably impractical to implement, could potentially enhance

the prediction accuracy of a simple model for parameters other than the reference RT,

it appeared unnecessary for more detailed models, e.g. ‘CAD’, where a high degree of

consistency was demonstrated in this respect. The more detailed geometry also enabled a

more accurate prediction of acoustical environments for conditions differing from

the original model state that was used for validation.

Overall, it was established that only details that are essential for the prediction output

should be incorporated in a computer model, as the complexity of the latter has a direct

effect on the resulting efficiency. Among other benefits, a significantly reduced run time

can also be expected for models having a well balanced detail level.

5.4 Prediction results

In the following sections a comparison of predictions and measurements is presented for

the ten primary test rooms in the study. The simulation sessions were based on models

that were designed (see figures 5.13-5.22) in line with the experimental outcomes as

described in section 5.3. Results shown (tables 5.10-5.29) are the product of the

validation/calibration process i, performed primarily in terms of T30. EDT was also used

in a number of cases, see section 5.5.2, to support the viability of the alternative approach.

The optimization of the prediction process and outcomes are later discussed, further

referencing the use of EDT for prediction calibration purposes.

i For comparison purposes and to confirm the suitability of the methodology, see section 5.2.1.3, Appendix 6.1 presents exemplar prediction data for a generic/initial version of the models, i.e. based on a generic definition of scattering coefficients, and textbook derived absorption coefficient input.
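
The qualitative ratings that accompany the STI values in the following tables can be reproduced with a simple lookup, sketched below. The band edges follow the commonly quoted qualitative STI scale and are an assumption here, as they are not restated in this section; the treatment of a value of exactly 0.75 as ‘Good’ is inferred from the tabulated ratings (e.g. Room 2 and Room 6). The function name is hypothetical.

```python
def sti_rating(sti):
    """Map an STI value (0 to 1) to a qualitative intelligibility rating.

    Band edges are assumed from the commonly quoted qualitative scale;
    0.75 itself is rated 'Good' to match the ratings in the tables.
    """
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    if sti > 0.75:
        return "Excellent"
    if sti >= 0.60:
        return "Good"
    if sti >= 0.45:
        return "Fair"
    if sti >= 0.30:
        return "Poor"
    return "Bad"


if __name__ == "__main__":
    # Predicted and measured STI for Room 2, omni source 1.
    for label, value in (("predicted", 0.79), ("measured", 0.75)):
        print(label, value, sti_rating(value))
```

A prediction and a measurement that differ by less than the JND can still fall on opposite sides of a band edge, so the rating alone should not be used to judge prediction accuracy.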

5.4.1 Room 1 data

Measurement and prediction data in Room 1 for Omni source (1 – 2)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
Omni source 1
EDT[s]   0.63   0.55   0.52   0.60   0.39   0.4    0.36   0.41   0.39   0.46   0.40   0.41
T30[s]   0.78   0.70   0.61   0.58   0.44   0.43   0.44   0.41   0.56   0.47   0.51   0.47
Ts[ms]   41.3   48.3   33.5   47.8   21.6   29.5   18.9   27.8   21.2   29.3   20.6   26.8
C50[dB]  3.9    4.4    5.4    3      8.4    7.1    9.1    7      8.5    6.3    8.5    7.2
C80[dB]  7.5    8.2    9.3    7.4    12.8   12.7   13.7   12.2   12.9   11     12.8   11.6
STI      0.79   0.77   Rating: Excellent  Excellent
Omni source 2
EDT[s]   0.63   0.57   0.52   0.57   0.38   0.38   0.36   0.41   0.37   0.44   0.39   0.42
T30[s]   0.76   0.69   0.68   0.58   0.48   0.45   0.49   0.42   0.5    0.48   0.54   0.47
Ts[ms]   38.9   47.5   32.3   40     20.5   27.8   18.8   25     19.4   27.5   20.5   26.8
C50[dB]  4.3    4.4    5.6    4.9    8.6    7.5    9.4    7.5    8.9    6.5    8.5    7
C80[dB]  7.8    8.7    9.5    9.2    13.1   12.4   13.5   13.2   13     11.4   13     11.7
STI      0.79   0.77   Rating: Excellent  Excellent

Table 5.10. Comparison of prediction data to room acoustics measurements in Room 1 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 1 for Sound system (2 - 4 speakers)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
2 speakers
Ts[ms]   43.2   54.8   36.6   41.3   24.1   28.8   23.1   26.8   26.7   29     24.7   27.3
C50[dB]  3.6    3.4    4.9    4      8      7.8    8.3    7.8    7.3    6.2    7.8    6.8
C80[dB]  7      5.9    8.6    8      12.2   12.5   12.6   13     11.7   11.6   12.2   11.6
STI      0.77   0.77   Rating: Excellent  Excellent
4 speakers
Ts[ms]   40.4   46.3   35.1   39     24.1   28.5   19.6   24.5   24.9   29     23.9   26
C50[dB]  4.1    4.8    5      5.2    8      6.8    9.2    8.2    7.7    5.7    8      6.8
C80[dB]  7.4    8.6    8.8    9.4    12.3   11.8   13.6   13.6   11.6   11.3   12     11.8
STI      0.77   0.77   Rating: Excellent  Excellent

Table 5.11. Comparison of prediction data to room acoustics measurements in Room 1 (sound system), averaged over all receiver positions

Figure 5.13. Geometry representation of Room 1 in simulation software

Legend: A0, A1: omni directional source positions; B1, B2, B3, B3: sound system (directional) sources; 01, 02, 03, 04, (05, 06): receiver positions

5.4.2 Room 2 data

Measurement and prediction data in Room 2 for Omni source (1 – 2)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
Omni source 1
EDT[s]   0.64   0.62   0.53   0.59   0.39   0.42   0.37   0.40   0.40   0.47   0.39   0.45
T30[s]   0.76   0.89   0.64   0.61   0.46   0.47   0.46   0.45   0.52   0.54   0.53   0.55
Ts[ms]   42.2   54.0   34.8   46.5   21.4   32.8   18.9   29.5   20.1   34.0   19.8   32.3
C50[dB]  3.7    3.5    5.1    4.2    8.6    6.8    9.2    7.0    8.7    5.7    8.8    6.0
C80[dB]  7.3    7.1    9.1    7.6    13.1   10.6   14.0   11.5   13.1   9.9    13.1   10.2
STI      0.79   0.75   Rating: Excellent  Good
Omni source 2
EDT[s]   0.63   0.70   0.55   0.63   0.39   0.43   0.39   0.44   0.38   0.46   0.38   0.44
T30[s]   0.79   0.80   0.67   0.68   0.49   0.50   0.44   0.45   0.56   0.54   0.52   0.55
Ts[ms]   38.3   55.3   33.1   46.8   19.3   30.5   18.1   28.0   19.0   30.8   18.2   27.0
C50[dB]  4.3    2.5    5.4    3.6    8.9    7.0    9.2    6.8    9.2    6.1    9.3    7.1
C80[dB]  7.9    6.5    9.1    7.6    13.4   11.4   13.6   11.4   13.3   10.7   13.2   11.2
STI      0.79   0.76   Rating: Excellent  Excellent

Table 5.12. Comparison of prediction data to room acoustics measurements in Room 2 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 2 for Sound system (2 - 4 speakers)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
2 speakers
Ts[ms]   41.8   67.0   36.7   42.0   24.2   33.0   23.0   28.3   25.5   34.0   25.6   29.5
C50[dB]  3.8    1.6    4.7    4.3    8.0    6.0    8.4    7.6    7.5    5.2    7.4    6.0
C80[dB]  7.1    3.3    8.4    7.7    11.9   9.4    12.2   12.3   11.4   9.8    10.9   10.2
STI      0.77   0.75   Rating: Excellent  Excellent
4 speakers
Ts[ms]   39.8   51.5   35.2   41.0   23.1   27.0   19.6   23.8   23.1   26.8   23.9   28.3
C50[dB]  4.1    4.3    4.9    4.1    8.0    7.5    8.9    7.9    7.8    7.0    7.7    6.6
C80[dB]  7.5    7.0    8.6    8.9    12.1   12.2   12.9   12.4   12.0   11.2   11.4   10.9
STI      0.77   0.77   Rating: Excellent  Excellent

Table 5.13. Comparison of prediction data to room acoustics measurements in Room 2 (sound system), averaged over all receiver positions

Figure 5.14. Geometry representation of Room 2 in simulation software

5.4.3 Room 3 data

Measurement and prediction data in Room 3 for Omni source (1 – 2)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
Omni source 1
EDT[s]   0.45   0.56   0.39   0.35   0.51   0.45   0.37   0.38   0.42   0.44   0.41   0.37
T30[s]   0.56   0.54   0.50   0.53   0.71   0.42   0.44   0.42   0.49   0.53   0.44   0.45
Ts[ms]   27.4   47.5   42.8   24.6   21.0   37.5   22.5   33.8   27.6   36.8   25.8   33.3
C50[dB]  6.8    4.4    4.6    7.7    9.0    5.1    8.5    6.2    6.8    5.2    7.1    6.4
C80[dB]  11.2   8.0    8.9    12.2   13.9   9.6    13.5   11.6   11.5   10.1   12.1   11.8
STI      0.78   0.78   Rating: Excellent  Excellent
Omni source 2
EDT[s]   0.45   0.65   0.42   0.47   0.37   0.37   0.35   0.38   0.42   0.42   0.41   0.37
T30[s]   0.54   0.53   0.48   0.52   0.47   0.43   0.47   0.48   0.56   0.60   0.49   0.49
Ts[ms]   25.4   56.0   23.2   39.5   19.7   31.8   19.6   31.0   25.1   34.0   23.6   30.8
C50[dB]  7.1    1.2    7.7    5.2    9.2    6.7    9.5    7.3    7.5    6.1    7.8    7.2
C80[dB]  11.4   6.8    12.3   10.4   13.8   12.8   13.8   11.7   11.7   10.5   12.4   12.2
STI      0.79   0.79   Rating: Excellent  Excellent

Table 5.14. Comparison of prediction data to room acoustics measurements in Room 3 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 3 for Sound system (2 - 4 speakers)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
2 speakers
Ts[ms]   30.2   66.3   25.9   52.5   21.9   40.5   21.1   37.5   23.9   41.5   23.5   38.8
C50[dB]  6.3    1.1    7.1    3.2    8.0    5.4    8.8    6.1    7.5    4.9    7.6    5.4
C80[dB]  10.4   6.8    11.5   8.0    12.9   11.2   13.2   11.6   12.0   9.6    12.2   10.3
STI      0.79   0.79   Rating: Excellent  Excellent
4 speakers
Ts[ms]   29.0   62.3   28.3   48.3   22.2   29.8   22.1   29.0   25.1   34.0   25.4   31.5
C50[dB]  6.5    1.8    6.7    2.7    8.0    7.7    8.3    8.1    7.2    5.6    7.2    6.6
C80[dB]  10.6   7.6    11.0   8.5    12.7   13.1   13.0   12.9   11.6   10.1   11.9   11.1
STI      0.78   0.79   Rating: Excellent  Excellent

Table 5.15. Comparison of prediction data to room acoustics measurements in Room 3 (sound system), averaged over all receiver positions

Figure 5.15. Geometry representation of Room 3 in simulation software

5.4.4 Room 4 data

Measurement and prediction data in Room 4 for Omni source (1)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
Omni source 1
EDT[s]   1.41   1.26   1.23   1.11   0.93   0.95   0.71   0.75   0.64   0.67   0.60   0.65
T30[s]   1.57   1.18   1.33   1.18   1.11   1.01   0.80   0.85   0.77   0.76   0.71   0.72
Ts[ms]   98.9   105.5  86.1   87.5   63.6   68.5   48.3   54.8   42.3   46.0   39.5   45.0
C50[dB]  -1.6   -2.8   -0.8   -1.7   1.2    0.2    2.7    2.1    3.7    2.9    4.1    2.8
C80[dB]  1.1    0.4    2.0    2.2    4.2    3.9    6.2    5.3    7.2    6.4    7.8    6.1
STI      0.65   0.64   Rating: Good  Good

Table 5.16. Comparison of prediction data to room acoustics measurements in Room 4 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 4 for Sound system (2 - 4 speakers)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
2 speakers
Ts[ms]   98.1   67.5   85.7   82.3   63.8   66.3   45.6   50.5   39.2   41.3   38.8   43.3
C50[dB]  -1.5   2.4    -0.8   -0.1   1.1    0.8    3.2    2.7    4.1    3.6    4.1    2.9
C80[dB]  1.2    6.4    2.0    3.3    4.0    4.0    6.3    6.1    7.4    7.5    7.7    6.7
STI      0.65   0.68   Rating: Good  Good
4 speakers
Ts[ms]   97.7   60.5   85.5   91.8   64.8   73.0   47.1   57.5   40.9   46.0   40.7   46.0
C50[dB]  -1.4   2.4    -0.7   -0.7   1.0    -0.1   2.9    1.6    3.9    3.1    3.9    2.8
C80[dB]  1.2    8.2    2.0    2.6    3.9    3.4    6.2    5.1    7.2    6.7    7.6    6.9
STI      0.65   0.67   Rating: Good  Good

Table 5.17. Comparison of prediction data to room acoustics measurements in Room 4 (sound system), averaged over all receiver positions

Figure 5.16. Geometry representation of Room 4 in simulation software

5.4.5 Room 5 data

Measurement and prediction data in Room 5 for Omni source (1 – 2)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
Omni source 1
EDT[s]   1.21   1.27   0.91   0.92   0.63   0.58   0.52   0.54   0.51   0.47   0.47   0.50
T30[s]   1.46   1.70   1.36   1.13   1.07   0.67   0.75   0.62   0.72   0.66   0.68   0.72
Ts[ms]   83.7   96.2   60.2   64.8   35.8   42.2   28.3   35.3   27.3   33.0   27.2   32.8
C50[dB]  -0.5   -0.6   1.7    1.5    5.0    4.1    6.3    5.4    6.6    6.1    6.7    5.8
C80[dB]  2.3    2.0    4.7    4.0    8.2    7.8    9.9    8.8    10.2   9.7    10.7   9.7
STI      0.71   0.70   Rating: Good  Good
Omni source 2
EDT[s]   1.20   1.38   0.92   1.00   0.65   0.65   0.53   0.53   0.51   0.53   0.48   0.52
T30[s]   1.44   1.77   1.36   1.08   1.10   0.66   0.73   0.62   0.67   0.66   0.67   0.72
Ts[ms]   82.6   103.7  61.5   69.7   36.1   39.8   29.4   35.5   27.9   35.5   26.7   34.2
C50[dB]  -0.6   -1.2   1.4    1.1    4.7    4.5    5.9    5.4    6.2    5.4    6.8    5.6
C80[dB]  2.3    1.5    4.5    3.5    8.2    7.4    9.8    8.9    10.1   9.1    10.7   9.3
STI      0.71   0.70   Rating: Good  Good

Table 5.18. Comparison of prediction data to room acoustics measurements in Room 5 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 5 for Sound system (2 - 4 speakers)

F[Hz]    125           250           500           1000          2000          4000
         Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.
2 speakers
Ts[ms]   88.0   114.2  66.3   66.8   44.8   39.7   34.1   34.0   29.1   38.2   28.8   38.2
C50[dB]  -1.0   -0.6   1.0    0.8    3.8    4.6    5.4    5.3    6.4    4.8    6.1    4.5
C80[dB]  1.8    1.6    3.9    4.8    6.6    8.1    8.3    9.0    9.3    8.0    9.3    8.1
STI      0.69   0.69   Rating: Good  Good
4 speakers
Ts[ms]   84.7   125.3  63.8   69.7   40.9   47.2   31.4   41.0   25.8   44.0   26.9   45.5
C50[dB]  -0.6   -2.4   1.3    0.5    4.2    3.5    5.8    4.6    6.9    3.8    6.7    3.3
C80[dB]  2.1    0.9    4.1    5.0    7.2    7.5    8.9    8.2    10.0   7.5    9.9    7.4
STI      0.70   0.70   Rating: Good  Good

Table 5.19. Comparison of prediction data to room acoustics measurements in Room 5 (sound system), averaged over all receiver positions

Figure 5.17. Geometry representation of Room 5 in simulation software

5.4.6 Room 6 data

Measurement and prediction data in Room 6 for Omni source (1 – 2)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

Omni source 1
EDT[s]     0.74   0.62   0.67   0.61   0.49   0.53   0.34   0.46   0.46   0.52   0.43   0.47
T30[s]     0.80   0.54   0.70   0.67   0.53   0.53   0.49   0.49   0.61   0.62   0.62   0.62
Ts[ms]     42.4   46.8   39.3   40.5   26.6   32.0   16.4   28.3   24.1   37.0   22.7   35.5
C50[dB]    3.8    4.8    4.1    4.4    6.6    5.9    9.8    6.6    7.4    4.8    7.9    5.0
C80[dB]    6.8    7.5    7.4    8.6    10.6   9.5    14.6   10.5   11.4   8.8    11.9   9.5
STI        0.76   0.75   Rating: Excellent (Pred.), Good (Meas.)

Omni source 2
EDT[s]     0.81   0.64   0.70   0.57   0.51   0.53   0.35   0.43   0.48   0.53   0.45   0.48
T30[s]     0.80   0.58   0.70   0.54   0.53   0.51   0.52   0.51   0.61   0.62   0.62   0.64
Ts[ms]     57.3   59.0   49.2   45.0   35.0   37.3   22.6   31.0   32.8   39.0   30.5   36.0
C50[dB]    1.3    1.3    2.3    3.9    4.9    4.8    8.8    6.1    5.7    4.3    6.2    4.7
C80[dB]    4.7    6.8    6.0    8.7    9.2    8.8    13.3   10.4   9.9    8.2    10.6   8.3
STI        0.74   0.73   Rating: Good (Pred.), Good (Meas.)

Table 5. 14. Comparison of prediction data to room acoustics measurements in Room 6 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 6 for Sound system (2 - 4 speakers)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

2 speakers
Ts[ms]     54.3   63.8   46.8   52.8   32.0   36.0   20.6   30.3   27.1   34.8   28.5   32.3
C50[dB]    1.9    1.5    2.8    2.5    5.5    5.2    8.7    6.4    6.7    5.1    6.3    5.7
C80[dB]    5.0    4.5    6.3    5.7    9.7    9.9    13.6   10.5   10.7   8.8    10.4   10.4
STI        0.74   0.75   Rating: Good (Pred.), Good (Meas.)

4 speakers
Ts[ms]     53.0   63.3   46.4   51.5   31.0   38.0   20.5   28.8   27.4   33.5   29.8   39.0
C50[dB]    2.2    1.5    3.0    2.7    5.8    4.7    8.9    6.5    6.6    5.2    6.2    4.0
C80[dB]    5.3    5.5    6.4    5.9    9.8    9.2    13.2   9.9    10.4   9.2    10.1   7.8
STI        0.74   0.73   Rating: Good (Pred.), Good (Meas.)

Table 5. 15. Comparison of prediction data to room acoustics measurements in Room 6 (sound system), averaged over all receiver positions

Figure 5. 18. Geometry representation of Room 6 in simulation software

5.4.7 Room 7 data

Measurement and prediction data in Room 7 for Omni source (1 – 2)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

Omni source 1
EDT[s]     0.84   0.57   0.69   0.60   0.48   0.58   0.42   0.53   0.43   0.58   0.39   0.58
T30[s]     0.88   0.86   0.69   0.66   0.52   0.57   0.54   0.54   0.61   0.60   0.55   0.59
Ts[ms]     60.0   58.8   48.7   58.0   32.0   50.0   24.6   47.5   26.0   51.5   23.8   50.8
C50[dB]    1.1    3.7    2.5    1.5    5.7    2.8    7.8    3.1    7.4    2.5    8.2    2.3
C80[dB]    4.4    7.4    6.1    6.8    9.9    7.0    11.7   7.6    11.3   6.9    12.1   6.8
STI        0.75   0.74   Rating: Good (Pred.), Good (Meas.)

Omni source 2
EDT[s]     0.80   0.55   0.64   0.54   0.46   0.46   0.39   0.46   0.40   0.50   0.37   0.50
T30[s]     0.87   0.85   0.69   0.62   0.54   0.56   0.57   0.54   0.60   0.60   0.52   0.58
Ts[ms]     45.2   47.5   37.3   36.5   25.5   32.2   18.4   29.3   19.9   30.3   17.8   30.3
C50[dB]    3.5    5.2    4.5    5.9    7.1    6.7    9.1    6.9    8.7    6.3    9.6    6.3
C80[dB]    6.2    8.6    7.8    9.3    11.1   10.0   13.1   10.3   12.5   10.0   13.9   9.9
STI        0.78   0.75   Rating: Excellent (Pred.), Good (Meas.)

Table 5. 16. Comparison of prediction data to room acoustics measurements in Room 7 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 7 for Sound system (2 - 4 speakers)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

2 speakers
Ts[ms]     59.2   77.7   46.3   63.0   31.7   54.5   24.1   47.8   23.4   49.7   23.3   53.8
C50[dB]    1.2    -0.3   2.8    0.4    5.7    2.0    7.5    3.5    7.8    3.1    7.7    2.0
C80[dB]    4.4    5.1    6.4    5.1    9.6    7.0    11.5   7.7    11.5   7.1    11.6   6.0
STI        0.75   0.74   Rating: Good (Pred.), Good (Meas.)

4 speakers
Ts[ms]     57.3   69.0   46.3   53.8   31.9   50.0   25.0   43.3   24.7   42.5   24.4   46.5
C50[dB]    1.6    0.7    3.0    2.1    5.6    3.1    7.3    3.9    7.4    4.3    7.3    3.1
C80[dB]    4.7    6.4    6.5    6.4    9.7    7.6    11.3   8.6    11.1   8.1    11.5   7.4
STI        0.75   0.74   Rating: Good (Pred.), Good (Meas.)

Table 5. 17. Comparison of prediction data to room acoustics measurements in Room 7 (sound system), averaged over all receiver positions

Figure 5. 19. Geometry representation of Room 7 in simulation software

5.4.8 Room 8 data

Measurement and prediction data in Room 8 for Omni source (1 – 2)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

Omni source 1
EDT[s]     0.65   0.68   0.56   0.59   0.43   0.50   0.38   0.52   0.45   0.50   0.41   0.49
T30[s]     0.80   0.79   0.66   0.66   0.58   0.54   0.55   0.53   0.56   0.58   0.54   0.54
Ts[ms]     41.5   55.5   35.2   47.2   26.7   38.8   22.3   38.0   28.1   39.0   26.2   35.8
C50[dB]    4.2    3.3    5.1    3.8    7.3    4.6    8.3    4.7    6.9    4.6    7.3    5.3
C80[dB]    7.3    6.9    8.6    8.0    11.2   9.7    12.4   9.1    10.8   9.2    11.6   9.5
STI        0.77   0.74   Rating: Excellent (Pred.), Good (Meas.)

Omni source 2
EDT[s]     0.76   0.50   0.56   0.42   0.43   0.46   0.38   0.42   0.47   0.43   0.44   0.43
T30[s]     0.82   0.82   0.63   0.61   0.55   0.54   0.58   0.54   0.59   0.62   0.55   0.56
Ts[ms]     53.6   47.7   36.8   43.7   27.0   38.5   23.4   38.2   31.6   38.7   27.6   35.5
C50[dB]    1.7    4.7    4.6    5.7    6.9    5.4    8.3    5.2    5.8    5.4    6.8    6.0
C80[dB]    5.2    9.5    8.5    9.3    11.4   9.6    12.7   9.9    10.1   10.1   11.2   10.2
STI        0.76   0.78   Rating: Excellent (Pred.), Excellent (Meas.)

Table 5. 18. Comparison of prediction data to room acoustics measurements in Room 8 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 8 for Sound system (2 - 4 speakers)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

2 speakers
Ts[ms]     43.4   71.2   36.5   59.3   24.9   46.8   17.7   42.2   20.6   40.7   21.3   41.2
C50[dB]    3.7    1.2    4.7    1.0    7.4    3.7    9.4    4.3    7.7    4.4    8.0    4.3
C80[dB]    7.0    4.8    8.4    6.1    11.6   7.9    13.7   9.7    12.1   9.0    12.3   8.7
STI        0.77   0.76   Rating: Excellent (Pred.), Excellent (Meas.)

4 speakers
Ts[ms]     48.8   56.0   42.1   43.3   28.2   33.0   22.6   30.2   26.1   30.0   25.8   31.3
C50[dB]    2.9    3.7    3.7    4.4    6.7    6.0    8.4    6.5    6.8    6.0    7.0    5.9
C80[dB]    6.4    7.2    7.7    8.8    10.9   9.6    13.0   10.9   11.3   10.0   11.6   9.7
STI        0.77   0.76   Rating: Excellent (Pred.), Excellent (Meas.)

Table 5. 19. Comparison of prediction data to room acoustics measurements in Room 8 (sound system), averaged over all receiver positions

Figure 5. 20. Geometry representation of Room 8 in simulation software

5.4.9 Room 9 data

Measurement and prediction data in Room 9 for Omni source (1 – 2)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

Omni source 1
EDT[s]     1.07   1.11   0.84   0.88   0.48   0.61   0.51   0.57   0.67   0.63   0.59   0.57
T30[s]     1.10   1.10   0.87   0.93   0.86   0.64   0.76   0.81   1.11   1.06   0.96   1.05
Ts[ms]     68.5   77.7   53.0   55.2   25.1   39.0   26.1   36.8   40.0   38.2   35.7   33.8
C50[dB]    0.8    0.5    2.2    2.9    7.4    5.0    7.1    5.7    4.4    5.6    5.2    6.3
C80[dB]    3.5    3.3    5.3    5.4    11.2   8.0    10.6   8.6    7.6    8.4    8.6    9.1
STI        0.71   0.71   Rating: Good (Pred.), Good (Meas.)

Omni source 2
EDT[s]     1.07   1.01   0.82   0.82   0.44   0.64   0.52   0.56   0.65   0.60   0.57   0.55
T30[s]     1.11   1.08   0.88   0.95   0.56   0.66   0.87   0.84   1.01   1.07   0.87   1.06
Ts[ms]     70.4   86.0   53.9   62.3   24.0   42.7   29.4   40.3   39.3   41.7   34.6   37.3
C50[dB]    0.6    -0.8   2.1    1.0    7.6    4.3    6.6    4.9    4.5    4.9    5.4    5.9
C80[dB]    3.3    1.8    5.2    4.7    11.6   7.1    9.8    8.2    7.8    8.4    8.7    9.0
STI        0.71   0.70   Rating: Good (Pred.), Good (Meas.)

Table 5. 20. Comparison of prediction data to room acoustics measurements in Room 9 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 9 for Sound system (2 - 4 speakers)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

2 speakers
Ts[ms]     72.0   80.3   53.6   66.5   26.2   43.5   28.8   49.0   35.6   51.0   35.4   58.0
C50[dB]    0.3    0.2    2.2    1.3    7.3    4.3    6.7    3.0    5.4    3.7    5.4    2.9
C80[dB]    3.1    3.7    5.1    4.4    10.8   7.8    9.8    6.9    8.1    6.4    8.1    5.3
STI        0.70   0.70   Rating: Good (Pred.), Good (Meas.)

4 speakers
Ts[ms]     74.2   93.0   56.2   69.2   28.0   47.2   30.3   48.8   38.7   54.7   39.0   64.7
C50[dB]    0.1    -2.4   1.8    0.7    6.9    3.8    6.4    3.3    4.8    3.0    4.7    2.0
C80[dB]    2.9    1.9    4.9    4.3    10.3   7.8    9.5    7.1    7.7    6.0    7.6    4.7
STI        0.70   0.69   Rating: Good (Pred.), Good (Meas.)

Table 5. 21. Comparison of prediction data to room acoustics measurements in Room 9 (sound system), averaged over all receiver positions

Figure 5. 21. Geometry representation of Room 9 in simulation software

5.4.10 Room 10 data

Measurement and prediction data in Room 10 for Omni source (1 – 2)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

Omni source 1
EDT[s]     0.52   0.43   0.52   0.57   0.53   0.55   0.45   0.45   0.36   0.37   0.26   0.31
T30[s]     0.67   0.62   0.79   0.64   0.77   0.54   0.66   0.53   0.66   0.54   0.54   0.50
Ts[ms]     31.8   37.5   30.7   34.3   29.5   32.0   25.6   27.3   22.4   22.8   17.6   18.0
C50[dB]    6.2    7.4    6.5    5.8    6.6    6.2    7.7    7.3    8.7    8.8    11.1   10.5
C80[dB]    9.4    11.1   9.5    9.8    9.5    9.5    10.6   10.5   12.3   12.1   14.8   14.5
STI        0.80   0.80   Rating: Excellent (Pred.), Excellent (Meas.)

Omni source 2
EDT[s]     0.58   0.56   0.59   0.51   0.58   0.54   0.54   0.49   0.50   0.46   0.45   0.44
T30[s]     0.74   0.60   1.29   0.60   0.93   0.54   0.86   0.52   0.92   0.61   0.70   0.67
Ts[ms]     34.9   51.0   37.8   43.0   35.9   37.7   32.1   36.3   30.1   35.2   25.8   31.8
C50[dB]    4.8    4.0    4.7    5.0    5.1    4.5    5.5    5.0    6.2    5.4    7.2    6.1
C80[dB]    8.7    8.2    8.7    9.3    8.9    9.5    9.5    9.7    10.5   10.4   12.1   11.5
STI        0.75   0.75   Rating: Good (Pred.), Good (Meas.)

Table 5. 22. Comparison of prediction data to room acoustics measurements in Room 10 (omni source), averaged over all receiver positions

Measurement and prediction data in Room 10 for Sound system (2 - 4 speakers)

F[Hz]      125           250           500           1000          2000          4000
           Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.  Pred.  Meas.

2 speakers
Ts[ms]     34.9   45.7   34.4   35.8   29.7   32.5   28.0   28.3   23.9   22.5   18.0   18.2
C50[dB]    5.1    6.8    5.2    5.9    6.2    5.9    6.4    7.0    7.7    8.4    10.0   10.3
C80[dB]    8.7    10.0   8.7    8.8    9.7    9.5    10.2   10.4   11.3   12.2   13.5   14.5
STI        0.79   0.79   Rating: Excellent (Pred.), Excellent (Meas.)

4 speakers
Ts[ms]     36.4   51.8   40.0   46.5   35.2   37.7   30.2   31.0   26.9   31.5   22.6   31.8
C50[dB]    4.3    4.9    4.0    3.9    4.8    4.8    5.6    6.3    6.5    5.9    7.8    6.3
C80[dB]    8.4    8.6    8.0    7.2    8.7    9.3    9.7    10.8   10.6   10.8   12.1   10.7
STI        0.75   0.77   Rating: Good (Pred.), Excellent (Meas.)

Table 5. 23. Comparison of prediction data to room acoustics measurements in Room 10 (sound system), averaged over all receiver positions

Figure 5. 22. Geometry representation of Room 10 in simulation software

Detailed prediction data can be found in Appendix 6.2.

5.5 Discussion

In the following sections different simulation aspects are discussed in view of the

numerical output for the ten test rooms. The use of EDT in the calibration process is

further assessed in terms of the resulting procedure’s capacity, while a model design and

preparation guideline is defined. Simple steps to enhance the efficiency of the overall

prediction process are also considered.

5.5.1 Prediction results

The numerical output of the simulations, as presented in section 5.4, showed a high degree of accuracy for the majority of experimental conditions over the parameters considered, i.e. EDT, T30, Ts, C50, C80 and STI. An error analysis against measured values revealed differences normally in the range of a few ms for EDT and T30. After excluding odd values, the maximum error for EDT and T30 reached 26%, while differences were typically below 10%.
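As an illustration of this kind of error analysis, the following minimal sketch computes the percentage difference between predicted and measured EDT for omni source 1 in Room 5, using the values tabulated in section 5.4. The error definition used in the study is not stated in this section; the sketch assumes the difference is expressed relative to the measured value, and the function name is hypothetical.

```python
# Octave-band centre frequencies and EDT values (s) for Room 5, omni source 1,
# copied from the Room 5 table in section 5.4.
FREQS_HZ = [125, 250, 500, 1000, 2000, 4000]
EDT_PRED = [1.21, 0.91, 0.63, 0.52, 0.51, 0.47]
EDT_MEAS = [1.27, 0.92, 0.58, 0.54, 0.47, 0.50]


def percent_errors(predicted, measured):
    """Absolute difference as a percentage of the measured value.

    Assumption: the study may define its percentage error differently.
    """
    return [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]


errors = percent_errors(EDT_PRED, EDT_MEAS)
for freq, err in zip(FREQS_HZ, errors):
    print(f"{freq:>5} Hz: {err:4.1f} %")
print(f"max: {max(errors):.1f} %")
```

Under this assumed definition, all six octave bands for this source fall below the 10% level quoted as typical.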

With overall results deemed satisfactory, the partly problematic T30 output for omni source 2 in Room 10 was attributed to the lack of a dedicated seating model for the room. The source in this case was positioned in between seating rows; consequently, the geometry simplification for the audience area did not accurately simulate source emission into the large volume of the room. A more complex geometry would significantly increase the computation time and would, by definition, have the capacity to invalidate the prediction; the effect could instead potentially be avoided by placing the source further away from overly simplified sections. This particular source should certainly not be used for calibration; however, it is worth noting that similar configurations did not appear problematic within the smaller rooms in the study.

In terms of clarity (C ratios), good agreement was found over all configurations for the majority of rooms, with differences of less than 3 dB for both C indices. A partial exception was Room 7, where differences of up to 6 dB were found for omni source 1 and for the sound system with 2 loudspeakers at higher frequencies. The prediction in this case resulted in higher clarity values; however, this was potentially due to the influence of the BGNL on the room acoustics measurements, which reduces confidence in the results at these particular data points. The effect did not significantly stand out in terms of complementing parameters, except centre time, with values nonetheless within 10 ms in most cases.
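The Room 7 exception can be reproduced from the tabulated data. The sketch below, given purely as an illustration, takes the C50 values for omni source 1 in Room 7 from section 5.4 and lists the octave bands where prediction and measurement differ by 3 dB or more; the variable names and the use of 3 dB as a flagging threshold are assumptions made for this example.

```python
# C50 values (dB) for Room 7, omni source 1, copied from section 5.4.
FREQS_HZ = [125, 250, 500, 1000, 2000, 4000]
C50_PRED = [1.1, 2.5, 5.7, 7.8, 7.4, 8.2]
C50_MEAS = [3.7, 1.5, 2.8, 3.1, 2.5, 2.3]
LIMIT_DB = 3.0  # assumed flagging threshold, taken from the "less than 3 dB" remark

# Signed difference (prediction minus measurement), rounded to table precision.
differences = [round(p - m, 1) for p, m in zip(C50_PRED, C50_MEAS)]

# Bands where the magnitude of the difference reaches the threshold.
flagged = [f for f, d in zip(FREQS_HZ, differences) if abs(d) >= LIMIT_DB]

for freq, diff in zip(FREQS_HZ, differences):
    print(f"{freq:>5} Hz: {diff:+.1f} dB")
print("bands at or above the limit:", flagged)
```

The flagged bands are the higher octave bands, with the prediction above the measurement in each of them.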

Differences in terms of STI were at most 0.04, in one room for one source configuration, and normally below the JND or zero for the remaining source configurations in all rooms. As such, the simulations appeared exceedingly efficient in the context of intelligibility prediction (STI and, in part, C50 and Ts): any errors in the complementing parameters, though limited, were in effect not reflected in STI.
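A minimal sketch of the STI comparison follows, using the omni source 1 results for Rooms 5 to 10 from section 5.4. Two assumptions are made and should be checked against the definitions used elsewhere in the study: a JND of 0.03 for STI, and the conventional five-band qualitative scale with the upper band edge treated as inclusive (so that 0.75 rates as Good, as in the tables). The helper names are hypothetical.

```python
STI_JND = 0.03  # assumed just noticeable difference for STI

# (predicted, measured) STI for omni source 1, Rooms 5-10, from section 5.4.
STI_PAIRS = {5: (0.71, 0.70), 6: (0.76, 0.75), 7: (0.75, 0.74),
             8: (0.77, 0.74), 9: (0.71, 0.71), 10: (0.80, 0.80)}


def sti_rating(sti):
    """Qualitative rating on the conventional STI scale (band edges assumed)."""
    if sti <= 0.30:
        return "Bad"
    if sti <= 0.45:
        return "Poor"
    if sti <= 0.60:
        return "Fair"
    if sti <= 0.75:
        return "Good"
    return "Excellent"


def within_jnd(predicted, measured, jnd=STI_JND):
    """True if the difference does not exceed the JND (rounded to 2 decimals)."""
    return round(abs(predicted - measured), 2) <= jnd


for room, (pred, meas) in STI_PAIRS.items():
    print(f"Room {room}: {pred:.2f} ({sti_rating(pred)}) vs "
          f"{meas:.2f} ({sti_rating(meas)}), within JND: {within_jnd(pred, meas)}")
```

The example also shows why a rating can change between prediction and measurement even when the STI difference is small: values of 0.76 and 0.75 sit on either side of a band edge.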

Overall, the prediction sessions achieved a high degree of accuracy based on the T30 (or

EDT) calibration as performed for one source condition, for both the original and

alternative configurations (i.e. omni source 2 and two sound system configurations)

without alterations or further tuning post calibration.

5.5.2 Simulation calibration using reference EDT

Section 5.3.2.3 discussed the prospective use of additional parameters, i.e. other than T30, in the calibration process so as to achieve a higher level of prediction accuracy. It was established that while a more elaborate procedure could potentially enhance the accuracy level for a ‘simple’ model, it is not essential for a model built with a finer detail resolution. Chapter 4, however, discussed the increased robustness of EDT over T30 at marginal conditions, i.e. low S/N during room acoustics measurements. In the context of post evaluation, considering that the accuracy of a simulation relies on accurate room acoustics measurements, it is evident that using EDT values in the calibration procedure will enhance confidence in a prediction when the consistency of the measurement output is questionable.

Reference EDT values were used for a number of randomly selected simulations in

section 5.4 (see Rooms 4, 5 and 10) to establish the potential effect on simulation

efficiency and accuracy of results.

Assuming that the model geometry is consistent, the predicted relation between T30 and EDT is going to be somewhat fixed. Consequently, although EDT is not as broad a descriptor of room acoustics as T30, its use in the calibration process instead of T30 was not expected to result in significant accuracy differences in the prediction process. This assumption was supported by the prediction outcomes, see sections 5.4 and 5.5.1. Given a ‘fixed’ relation of EDT with T30 based on the geometry influence, a further advantage of using EDT is that it can be monitored more closely, which is relevant given its influence on the calculation of STI in particular.

During the calibration process, the prediction outcome is continuously compared to measured values to determine the simulation precision at a given stage. Disagreement between the results leads to alteration of the acoustic characteristics of particular surfaces so as to improve the prediction quality. Thus an attempt to control the early part of the sound decay in this context would be more efficient. The calibration process for the test rooms using reference EDTs suggested that the alternative process facilitates a simpler identification of the surfaces and acoustic characteristics that need alteration/tuning, particularly when EDT is short. As a result the calibration procedure is expedited.
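The compare-and-adjust loop described above can be sketched in a few lines. This is only a toy illustration: the simulation software is replaced by the Sabine formula (which predicts a diffuse-field decay time rather than EDT as such), a single surface is tuned by bisection, and the room volume, surface areas, starting coefficients and tolerance are all hypothetical values chosen for the example.

```python
SABINE_K = 0.161  # s/m, Sabine constant


def sabine_rt(volume_m3, surfaces):
    """Sabine decay time; surfaces is a list of (area_m2, absorption_coefficient)."""
    absorption = sum(area * alpha for area, alpha in surfaces)
    return SABINE_K * volume_m3 / absorption


def calibrate_alpha(volume_m3, fixed_surfaces, tuned_area_m2, target_s,
                    tol_s=0.005, max_iter=60):
    """Bisect the absorption coefficient of one surface until the predicted
    decay time is within tol_s of the reference (measured) value."""
    lo, hi = 0.01, 0.99
    alpha, predicted = lo, float("inf")
    for _ in range(max_iter):
        alpha = 0.5 * (lo + hi)
        predicted = sabine_rt(volume_m3, fixed_surfaces + [(tuned_area_m2, alpha)])
        if abs(predicted - target_s) <= tol_s:
            break
        if predicted > target_s:
            lo = alpha  # decay too long: more absorption needed
        else:
            hi = alpha  # decay too short: less absorption needed
    return alpha, predicted


# Hypothetical room: 200 m3, two fixed surface groups and a 60 m2 tuned surface.
TARGET_S = 0.58  # reference decay time in seconds (illustrative)
alpha, predicted = calibrate_alpha(200.0, [(150.0, 0.05), (60.0, 0.10)], 60.0, TARGET_S)
print(f"tuned absorption coefficient: {alpha:.3f}, predicted decay: {predicted:.3f} s")
```

In an actual calibration the predictor would be the room acoustics simulation itself and several surfaces and octave bands would be tuned, but the structure of the loop (predict, compare with the reference value, adjust surface properties) is the same.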

5.5.3 Model design and preparation

Methodologies for the design and preparation of computer models are often open to interpretation by the user; as a result, prediction accuracy can suffer due to invalid model input in terms of e.g. the defined geometry or sound source characteristics. Sections 5.2 and 5.3 addressed different aspects of the process to facilitate a better defined approach for the prediction of acoustic conditions in rooms. Accordingly, a synopsis of the outcomes was compiled, effectively formulating a modelling guideline, see section 5.5.3.2.

5.5.3.1 Detail resolution requirement

The level of detail incorporated in a model is directly related to the frequency range considered. However, small entities that individually can be considered acoustically invisible, e.g. desks and chairs, can influence the resulting sound field when positioned in close proximity to each other, e.g. stacked or in contiguous arrangements. This is not directly accounted for in a simulation, though the combined influence of such smaller surfaces can be incorporated by approximation, depending on the surface size, via scattering effects. The simulation in this case is not entirely realistic; nonetheless, the particular approach appears advantageous for small (or flat) rooms, as opposed to large rooms, given the short propagation path, see section 5.3.2. Enhanced detail resolution is accordingly preferred in particular cases, see section 5.5.3.2.

5.5.3.2 Room acoustics modelling guideline

The following process chain (steps 1-7) summarizes the recommended approach for

model design and preparation in the context of the present study:

1. Room appraisal to determine the significance of architectural details in designing a model. A detail resolution of ~0.5m should be used as a starting point [92]; however, see step 2.

2. Finer detail resolution should be included for smaller objects that are stacked or positioned in close proximity, e.g. chairs and desks, in line with section 5.3.2.4, for an efficient representation of different entities (for small or flat rooms).

3. Generic definition of scattering coefficients, see section 5.2.1.3.

4. Generic definition of absorption coefficients, see section 5.2.1.3.

5. Source characterization in terms of near field directivity and frequency response

(small rooms), see section 5.2.1.2.

6. Validation/calibration procedure for one source configuration using reference

EDT (recommended) or T30, see section 5.2.2.

7. Prediction of acoustic conditions for any source-receiver configuration, also see

table 7.1 for related uncertainty factors.

5.5.4 Comments on modelling sessions

A significant drawback in the process can be the lack of prediction repeatability. For instance, two differing sets of data from the same unaltered model can be misleading when investigated in detail, particularly when the effect is not detected at an early stage. For more complex models, prediction times can be in the range of hours, and thus a significant error can be introduced simply by relying on a single prediction using the final model version.

A pilot approach established a close association between the number of materials used to represent room surfaces and prediction repeatability issues, with the prediction output suggesting that an increased number of materials adds to the uncertainty. Problematic models appear to become more tolerant and error free in this respect when using a limited number of different materials. Ultimately, model design should account for differentiated surface materials only when a simplification in this respect removes essential model detail, see section 5.3.2 and section 5.5.3. An increased number of rays traced will allow for enhanced consistency in the prediction results when necessary, at the cost of increased computation time.
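The link between the number of rays and run-to-run consistency can be illustrated with a toy Monte Carlo model. The sketch below does not model ray tracing or any particular simulation package: each "prediction" is simply the mean of a number of random samples, which is an assumption made to show how the spread between repeated runs shrinks as more samples (rays) are used. Function names, the sample distribution and the run counts are hypothetical.

```python
import random
import statistics


def toy_prediction(n_rays, rng):
    """Stand-in for one stochastic prediction run: the mean of n_rays random
    samples drawn from an exponential distribution with mean 1.0."""
    return sum(rng.expovariate(1.0) for _ in range(n_rays)) / n_rays


def run_to_run_spread(n_rays, n_runs=30, seed=0):
    """Standard deviation of the result over repeated runs of the same model."""
    rng = random.Random(seed)
    results = [toy_prediction(n_rays, rng) for _ in range(n_runs)]
    return statistics.stdev(results)


spread_few = run_to_run_spread(100)
spread_many = run_to_run_spread(10000)
print(f"spread with    100 rays: {spread_few:.4f}")
print(f"spread with 10,000 rays: {spread_many:.4f}")
```

For a simple averaging estimator of this kind the spread falls roughly with the square root of the number of samples, while the run time grows in proportion to it, which mirrors the trade-off noted above.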

Considering the time needed for model calibration, it was established that starting from surface acoustic characteristics defined for rooms similar to a given case study can lead to reasonably accurate prediction results. Given the differences between rooms, some tuning is normally necessary; however, the time needed to complete the process is somewhat reduced.

5.6 Conclusions

This chapter has considered different aspects of computer simulation efficiency; the

outcomes were utilized in computer simulations for the primary test rooms in the study.

The definition of suitable scattering and absorption coefficient data was highlighted as a critical element in reducing simulation uncertainties, thus enhancing confidence in the result. Different approaches have been found in the literature; however, their applicability is not universal and thus questionable in many cases. At the prediction input, scattering functions can initially be defined generically. The estimation of absorption coefficient data can be accomplished using model calibration, which appears most suitable for predictions that are used in post evaluation applications. Source directivity (and consequently aiming) is an additional parameter central to the process, requiring a near field response for simulating acoustic conditions in small rooms.

Experimental results revealed that a more detailed room geometry is preferred for

modelling smaller rooms, as opposed to large spaces. In this sense, CAD platforms can

greatly assist given the specialist nature of the software for this purpose by simplifying i.e.

speeding up model generation. This has a direct effect on the resulting model, particularly

in terms of the processing time required for a prediction. The latter is normally longer for

more complex models but could be optimized to approximate ‘simple’ model run times.

A balance between overly simplified rooms and complex inefficient geometries is

necessary i.e. only essential room characteristics should be included in the design.

Generally, simple steps can be taken to improve the simulation efficiency, such as

reducing the number of surfaces in the model while retaining the required detail

resolution.

The calibration of predictions was done in terms of a single parameter, typically T30. The use of additional reference parameters appeared unnecessary for models with more detailed geometry representations. The use of EDT was also examined in this context and found to be a viable approach, resulting in similar accuracy levels. Considering the conclusions from Chapter 4 relating to the increased robustness of EDT (when compared to T30), an EDT based validation/calibration appeared to be more efficient in reducing the related uncertainty factor in the prediction.

Pilot approaches have established prediction results marginally over the JND of STI for ‘simple’ models. The same parameter was within the JND for models with enhanced detail resolution. Results in terms of C50, T30 and EDT could be described as acceptable for both approaches, however with noticeably enhanced accuracy for the detailed models.

Predictions for alternative experimental conditions resulted in a similar accuracy level for the two modelling approaches, without the need for any alterations (or further tuning) of the simulation post calibration, i.e. based on the original T30 (or EDT) calibration for one source configuration.

A design and preparation guideline was defined for simulating university classroom type rooms and similar spaces.

Simulations for ten test rooms were accurate in most cases in terms of the parameters

considered i.e. EDT, T30, Ts, C50, C80 and STI. The models emerged as particularly

efficient in terms of a speech intelligibility assessment.

Chapter 6 – Auralization

CHAPTER 6

Auralization

6.1 Introduction

Auralization is typically the final stage in a predictive assessment. The end product enables a number of different approaches, such as use for training or educational purposes [112], and most commonly the demonstration of a space’s prospective performance to both experts and non-experts in acoustics. Auralizations are considered a highly efficient methodology for this purpose as they comprise a direct link to prospective acoustical conditions; thus no specialized knowledge is required in assessing a given case.

The quality of auralized material, in this study based on hybrid algorithms, is dependent on the computer simulation being able to accurately represent the acoustic conditions in a room. Thus an auralization verification issue arises, as some means of confirming consistency with the prediction’s numerical output, and consequently with the measured acoustic parameters, is necessary. It is reasonable to assume that a simulation is consistent with acoustic measurements if a validation/calibration process has previously been used. Based on this assumption, an auralization needs to be verified against the prediction output only, while accounting for related error margins at the prediction stage.

Verification is typically based on a subjective assessment where recorded room responses are compared to auralized material via listening tests to assess realism. A suitable HRTF and a frequency response filter to compensate for the source characteristics are normally needed for consistency, given a typical binaural audio presentation approach through headphones. This process provides a direct way of assessing audio realism, but no absolute information on the acoustic characteristics of a space can be extracted. For speech intelligibility in particular, a plain estimation is unfeasible unless a listening test involving human subjects (see section 2.4.1) is undertaken. This process, though viable [139], is very time consuming and not cost effective. The prospect of an objective validation thus appears more suitable for an efficient verification process. In this context, Christensen [140] has presented a method based on post processing analysis of the predicted impulse response. The extent to which an actual auralization can reproduce the room acoustics characteristics as defined by the predicted impulse response filter is, however, not clear.

Taking a different approach, a simulation can result in a reasonably accurate estimation of acoustic parameters, even without producing an analogous result in terms of audio realism. In the context of this study, the detail resolution of a model could be an important factor in obtaining not only an objectively accurate auralization but a realistic sounding auralization as well. Previous research using models that incorporated basic architectural detail demonstrated that numerical output and realism in the auralization part are not necessarily related [114]. In another publication by the author, the advantage of using a more detailed model for small enclosed spaces was shown, thus suggesting that the realism of the auralization is similarly influenced [121].

This chapter discusses primarily the objective validation of auralized material using an

improved hybrid approach for a monaural setup. The relation of auralization quality, i.e.

realism, to the model detail resolution is examined using subjective listening tests in

comparison to binaural recordings for different cases. In the following sections, the

methodologies used and results obtained are presented.

6.2 Objective validation of auralized responses

The product of an auralization process is based on the predicted (or otherwise obtained) impulse response of a system. The validation methodology as used by Christensen [140] was based on post processing of the actual impulse response, i.e. a direct deconvolution of an impulse response with a Dirac signal. The process thus enabled the derivation of acoustic parameters via an open loop measurement system (Dirac v3.0 [141]) to objectively quantify the quality of the auralization filter. This assumed that the same level of accuracy would be conveyed in the final auralization. On this basis the impulse response filter is nonetheless an intermediate product.

An altered method is suggested as an improvement to enable the assessment of the end product, that is, the auralization. By examining the auralization process at its final stage, the overall accuracy level of a simulation system can be objectively quantified/confirmed. At the same time it is clear that, depending on the deconvolution software used, it might be more practical to use one method over the other, since the two methods explicitly dictate the use of either a reference Dirac pulse or a sine sweep at the deconvolution stage. However, alternative signals could potentially be used by the proposed method. In the following section, the altered methodology is presented.

6.2.1 Objective validation of auralized responses using a swept sine

The new hybrid method utilizes a swept sine test signal within the deconvolution process

to replicate an assessment of the end product of an auralization.

An anechoic sample of the swept sine signal was used as the audio material, convolved with the impulse response filter that was predicted by the preceding computer simulation for the different source-receiver configurations in the test rooms. The duration of the signal was dependent on the RT, while the source characteristics and the type of receiver within the modelling software were matched to the prediction settings so as to enable a consistent comparison of the auralization to the numerical output. The characteristics of the impulse response filter could thus be expressed as an end product, i.e. an auralization. The results, being in effect raw room responses as would be measured by typical acoustical measurement practices, were assigned to the input of an open loop measurement system (see also section 3.5.1 for open loop procedure details) and post processed to derive a set of room acoustics parameters. Figure 6.1 shows a schematic for the process.

[Schematic blocks: Computer prediction, Auralization, Open loop processing]

Figure 6. 1. Objective validation of auralization schematic
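The processing chain of section 6.2.1 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented in the study: the predicted impulse response is replaced by synthetic exponentially decaying noise, the open loop measurement software is replaced by a regularised spectral division followed by Schroeder backward integration, and the sample rate, sweep range and sweep duration are arbitrary choices. All function names are hypothetical.

```python
import numpy as np

FS = 8000  # sample rate in Hz (assumed for the sketch)


def swept_sine(duration_s, f0=50.0, f1=3500.0, fs=FS):
    """Exponential sine sweep from f0 to f1."""
    t = np.arange(int(duration_s * fs)) / fs
    rate = np.log(f1 / f0)
    return np.sin(2 * np.pi * f0 * duration_s / rate
                  * (np.exp(t * rate / duration_s) - 1.0))


def synthetic_ir(rt60_s, length_s=1.0, fs=FS, seed=0):
    """Stand-in for a predicted impulse response: noise decaying 60 dB in rt60_s."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(length_s * fs)) / fs
    return rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60_s)


def convolve(signal, ir):
    """Linear convolution via FFT: the 'auralization' of the test signal."""
    n = signal.size + ir.size - 1
    return np.fft.irfft(np.fft.rfft(signal, n) * np.fft.rfft(ir, n), n)


def deconvolve(recorded, reference, ir_length):
    """Recover the impulse response by regularised spectral division."""
    n = recorded.size
    rec_f = np.fft.rfft(recorded, n)
    ref_f = np.fft.rfft(reference, n)
    eps = 1e-6 * np.max(np.abs(ref_f)) ** 2  # avoids division by near-zero bins
    ir = np.fft.irfft(rec_f * np.conj(ref_f) / (np.abs(ref_f) ** 2 + eps), n)
    return ir[:ir_length]


def t30(ir, fs=FS):
    """T30 from the Schroeder decay curve, line fitted between -5 and -35 dB."""
    edc = np.cumsum(ir[::-1] ** 2)[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0])
    idx = np.where((edc_db <= -5.0) & (edc_db >= -35.0))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)
    return -60.0 / slope


sweep = swept_sine(3.0)
ir_pred = synthetic_ir(rt60_s=0.6)
auralization = convolve(sweep, ir_pred)            # end product
ir_rec = deconvolve(auralization, sweep, ir_pred.size)

t_pred = t30(ir_pred)
t_aur = t30(ir_rec)
print(f"T30 from the impulse response filter: {t_pred:.2f} s")
print(f"T30 recovered from the auralization:  {t_aur:.2f} s")
```

In this idealised setting the two values agree closely because the convolution and deconvolution are performed in the same numerical domain; the comparison becomes informative when the auralization is produced by the modelling software itself, as in the proposed method.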

6.2.2 Evaluation of results

The accuracy of the predicted auralizations for ten test rooms was assessed using the current and proposed methods of objective validation. Ultimately, provided that the datasets match the predicted and/or actual data, an auralization could be described as accurate in the terms used for the assessment, e.g. STI, T30, EDT etc. A monaural filter was used in the experimentation as an indication of the auralization process efficiency.

The scope of the preliminary assessment included a comparison of the two validation

methods to determine the level of accuracy that is conveyed from the auralization filter

(assessed by the current method) to the end product i.e. the actual auralization (assessed

by the proposed method). Primarily, the proposed method was used for the general

verification of auralization accuracy over all of the different source/receiver

configurations in the ten test rooms. Data for the assessment objectives are shown in

section 6.4.2.

Given that data for both methods are derived using an open loop measurement system, it

is worth noting that the validity of the particular measurement method has been assessed

in earlier work by the author [116, 117], see also section 3.5.3. It was found that an open loop

methodology is reasonably consistent when compared to a closed loop system, while the

accuracy of the method is further improved given the controlled conditions in the current

experiment (processing is performed internally i.e. physical computer I/O is not used).


6.3 Subjective validation of auralized responses

The subjective validation of auralizations involved the recording of audio samples in

actual conditions, see figures 6.2-6.3. Using speech and music stimuli, room responses

were obtained within the ten test rooms to enable a direct comparison to the predicted

auralization via listening tests. The recording procedure is described in the following

sections.

Figure 6.2. Binaural recording setup

Figure 6.3. Head and torso simulator in measurement position

6.3.1 Recording room responses

For the purposes of the study, a binaural feature was necessary to account for the binaural

character of human hearing, thus enabling a more efficient comparison to subjective

impression. BS ISO 3382: 2000 [11] prescribes the use of a head and torso simulator

conforming to ITU recommendation P.58 [142] for the measurement of binaural

parameters e.g. IACC. Accordingly, binaural recordings were made using the procedure

described in the particular BS document. The session specific characteristics included the

use of a portable PC as a means to play back and record audio. A multi track recorder

with simultaneous I/O was used as the software platform. Four source configurations, as

described in Chapter 3 (2x omni directional loudspeaker, 2x sound system), were used in

turn in the ten test rooms and binaural recordings were obtained at each receiver position.

6.3.2 Equipment list

Norsonic 140 sound level meter


B&K Calibrator Type 4230

B&K Head & Torso simulator Type 4100, with NEXUS pre-amp

Dell Latitude PC D610

Digigram VXpocket v2 sound card

Dodecahedron omni directional loudspeaker

Yamaha HS50M studio monitor loudspeaker on tripod stand (x4)

Audio SR707 power amplifier

Sony Vegas Pro 8, professional multi-track audio editing software

6.3.3 Comparison of recordings to predicted auralization

A listening test with a group of 10 listener subjects (6 Acoustics professionals, 4

untrained listeners) was undertaken to assess the quality of the predicted auralizations.

The latter were presented via headphones in negligible BGNL conditions, followed by

the equivalent binaural recording. For consistency, samples of the existing background

noise at the time of recording were extracted from ‘silent’ sections, i.e. no signal playing,

of the binaural recordings within the test rooms and mixed with the associated

auralization prior to presentation. All audio samples were level calibrated using a pink

noise test signal preceding the test stimuli (speech, music). The test comprised

primarily a basic realism assessment requiring the subject’s input as to the level of

similarity between the presented auralization / recording pairs; source localization was

also assessed. Ultimately, the auralizations were categorized according to the realism

incorporated, as perceived by the group of listeners.

The assessment outcomes are presented in section 6.4.3.
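The preparation of the presented samples (level calibration followed by mixing with background noise) can be sketched as below. This is one plausible reading of the procedure, not the workflow actually used in the multi-track editor: the calibration segments are assumed to be available as separate arrays, white noise stands in for every signal (including the pink noise calibration signal), and all names and amplitudes are hypothetical.

```python
import numpy as np

def rms(x):
    """Root-mean-square level of a signal (linear units)."""
    return float(np.sqrt(np.mean(np.square(x))))

def calibration_gain(cal_reference, cal_sample):
    """Gain that brings the sample's calibration segment to the reference level."""
    return rms(cal_reference) / rms(cal_sample)

def mix_with_background(signal, noise):
    """Add recorded background noise, looping it if it is shorter than the signal."""
    reps = int(np.ceil(len(signal) / len(noise)))
    return signal + np.tile(noise, reps)[: len(signal)]

rng = np.random.default_rng(1)
cal_recording = 0.20 * rng.standard_normal(4000)      # calibration noise in the recording
cal_auralization = 0.05 * rng.standard_normal(4000)   # same segment in the auralization
auralization = 0.05 * rng.standard_normal(16000)      # placeholder for auralized speech
room_noise = 0.002 * rng.standard_normal(3000)        # 'silent' section of the recording

gain = calibration_gain(cal_recording, cal_auralization)
presented = mix_with_background(gain * auralization, room_noise)
```

Matching on the calibration segment rather than on the programme material keeps the comparison independent of the speech or music content of each pair.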

6.4 Auralization study in ten test rooms

In the following sections, the current and proposed auralization validation methods are

compared to establish the advantage of using one method over the other. Auralizations

from ten test rooms are further assessed in both objective and subjective terms to

determine primarily the level of consistency to the related predictions in terms of


numerical output. The auralization quality is subjectively assessed considering the

resultant realism as compared to actual room recordings.

6.4.1 Comparison of objective validation methods

The current and proposed methods for an objective validation of auralization were

simultaneously used in the ten test rooms to derive a set of acoustic parameters i.e. EDT,

T30, Ts, C80 and STI, for the same conditions in the rooms. This enabled a comparison of

results so as to determine the consistency of the impulse response filter with the

auralization, based on the value differences observed.
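For reference, the parameters named above (with the exception of STI) can be derived from an impulse response along the following lines. The sketch uses the standard definitions (backward integration of the squared response, line fits over 0 to -10 dB for EDT and -5 to -35 dB for T30, early-to-late energy ratio for clarity, first moment of energy for Ts). It is broadband only, whereas the results in this chapter are per octave band, and the function names and test signal are assumptions rather than the processing used in the study.

```python
import numpy as np

def schroeder_db(ir):
    """Backward-integrated energy decay curve in dB, 0 dB at t = 0."""
    energy = np.cumsum(ir[::-1] ** 2)[::-1]
    return 10 * np.log10(energy / energy[0])

def decay_time(edc_db, fs, upper, lower):
    """60 dB decay time from a line fitted between two decay-curve levels."""
    idx = np.where((edc_db <= upper) & (edc_db >= lower))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)   # dB per second, negative
    return -60.0 / slope

def clarity(ir, fs, limit_ms):
    """Early-to-late energy ratio in dB (C50 for 50 ms, C80 for 80 ms)."""
    n = int(round(limit_ms * 1e-3 * fs))
    e = ir ** 2
    return 10 * np.log10(e[:n].sum() / e[n:].sum())

def centre_time_ms(ir, fs):
    """Centre time Ts: first moment of the squared response, in ms."""
    e = ir ** 2
    t = np.arange(len(ir)) / fs
    return 1000 * (t * e).sum() / e.sum()

fs = 8000
t = np.arange(2 * fs) / fs
ir = 10 ** (-3 * t / 0.5)        # idealised decay reaching -60 dB after 0.5 s

edc = schroeder_db(ir)
edt = decay_time(edc, fs, 0.0, -10.0)
t30 = decay_time(edc, fs, -5.0, -35.0)
c80 = clarity(ir, fs, 80)
ts = centre_time_ms(ir, fs)
```

For this idealised decay EDT and T30 coincide at 0.5 s; in a real response they differ, which is why both are reported.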

The examination of room data for two source configurations (see Appendix 7) showed

overall fair agreement between the values derived from the two methods. Tables 6.1 and

6.2 show an example of the output variation for data averaged over all rooms. Output

consistency in this case suggested that for the majority of conditions the auralization filter

is capable of conveying its characteristics to the final auralization. Accordingly, if

individual data points are not the main focus, uncertainties in terms of the auralization

accuracy as related to the impulse response filter would not require additional

consideration.

Variation of results between two validation methods, S1

F [Hz]        125    250    500   1000   2000   4000
EDT (%)      -0.9    1.4    0.9    0.8    0.2    0.2
T30 (%)      -0.1   -1.6   -1.0   -0.8   -0.4   -0.3
Ts (ms)       0.1    0.0   -0.2   -0.2   -0.1   -0.2
C80 (dB)      0.0    0.0    0.1    0.0    0.0    0.0
STI          0.01

Table 6.1. Result variation example between the two auralization validation methods (averaged over ten test rooms for omni source 1)

Variation of results between two validation methods, S2

F [Hz]        125    250    500   1000   2000   4000
EDT (%)       5.2    3.6    2.9    2.1    0.2   -1.9
T30 (%)      -0.3    0.9   -0.9   -0.9   -0.6   -0.5
Ts (ms)       0.1    0.0   -0.3   -0.2   -0.2   -0.1
C80 (dB)      0.0    0.0    0.1    0.1    0.0    0.0
STI          0.00

Table 6.2. Result variation example between the two auralization validation methods (averaged over ten test rooms for omni source 2)


Taking a more detailed approach, a number of discrepancies were observed in the

comparison. For the EDT measure in particular, value variations relating mainly to the

125Hz and 250Hz octave bands up to 24% and 7% respectively were found; similarly,

C80 variations reached 7dB for lower frequency bands, as deduced from data averaged

over all the receiver positions, see Appendix 7. In these atypical conditions, the existing

validation method cannot account for the accuracy level that is lost in the process of

applying the auralization filter to the anechoic material. Considering the possibility of a

noticeable error margin at lower frequencies, as shown in the current results, the

proposed methodology is advantageous as it is able to detect any discrepancies in this

respect. The new method thus can improve on the assessment uncertainty relating to the

discrepancies at individual data points.

Considering the similarity of the rooms examined in this study, a clear understanding of

the reasons underlying the observed discrepancies could not be fully determined. The

outcomes of the proposed method nevertheless suggest an improvement on the

uncertainty factor.

6.4.2 Objective assessment of auralizations

The assessment in the ten test rooms for four source configurations was performed using

the proposed validation method to determine the consistency of auralization with the

prediction. Results from this comparison gave an indication of the level of accuracy that

could be expected from the auralization of a space, given the preceding simulation as

interpreted in terms of its numerical output.

The previous section has demonstrated differences between two validation methods for

two source configurations. The observed discrepancies related primarily to the output of

the existing validation method, derived by post processing the impulse response, being

misleadingly comparable to the prediction. The effect was mainly reflected in limited

EDT errors at lower frequency octave bands, mainly 125Hz, suggesting in the context of


the assessment that low frequency characteristics are not always consistently conveyed in

an auralization using one source. In a comparison between auralizations and prediction

output, absolute differences up to 70% EDT and 9dB Clarity were found for the 125Hz

octave band in some instances, see Appendix 8. However, for remaining octave bands

and additional parameters no significant discrepancies were observed overall, with results

closely approximating the prediction. For clarity and STI in particular the absolute

differences when compared to the prediction output were typically below 2dB for clarity

and below 0.03 (=JND) for STI, see table 6.3. Thus, auralizations simulating one source

largely incorporated the room acoustics characteristics as suggested by the prediction’s

numerical output, with the partial exception of the 125Hz octave band.

Prediction and Auralization data comparison for Omni source 1

F [Hz]          125           250           500          1000          2000          4000
            Pred. Aural.  Pred. Aural.  Pred. Aural.  Pred. Aural.  Pred. Aural.  Pred. Aural.
EDT [s]      0.64    0.4   0.53   0.55   0.39   0.44   0.37   0.42    0.4   0.42   0.39   0.48
T30 [s]      0.76    0.6   0.64   0.65   0.46   0.52   0.46   0.48   0.52   0.51   0.53    0.5
Ts (ms)      42.2   30.3   34.8   40.5   21.4   31.5   18.9   27.3   20.1     27   19.8     30
C50 [dB]      3.7    8.1    5.1      5    8.6    6.7    9.2    7.5    8.7    7.9    8.8    6.8
C80 [dB]      7.3     12    9.1      9   13.1   10.7     14   12.5   13.1   11.6   13.1   11.7

STI: Prediction 0.79 (rating: Excellent), Auralization 0.79 (rating: Excellent)

Table 6.3. Prediction and auralization data comparison example (Room 2, S1)
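As a worked check on the Room 2 example, the C80 values of table 6.3 can be compared band by band against the 2dB margin quoted in the text. The sketch assumes that the values in the table alternate prediction and auralization within each octave band; the function name is hypothetical.

```python
BANDS_HZ = [125, 250, 500, 1000, 2000, 4000]

# C80 values in dB for Room 2, S1, read from table 6.3 (assumed Pred./Aural. pairing)
C80_PRED = [7.3, 9.1, 13.1, 14.0, 13.1, 13.1]
C80_AURAL = [12.0, 9.0, 10.7, 12.5, 11.6, 11.7]

def bands_outside_margin(pred, aural, margin_db):
    """Octave bands where |auralization - prediction| exceeds the margin."""
    return [band for band, p, a in zip(BANDS_HZ, pred, aural)
            if abs(a - p) > margin_db]

flagged = bands_outside_margin(C80_PRED, C80_AURAL, margin_db=2.0)
print(flagged)  # [125, 500]
```

For this room the 125Hz band stands out clearly (4.7dB), with a smaller excursion at 500Hz (2.4dB); the remaining bands lie within the margin.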

For multi source conditions the analyzed auralizations showed similar levels of accuracy

as discrepancies at the 125Hz octave band for the clarity values were similarly repeated,

see table 6.4. A somewhat altered behavior was however introduced with a number of

clearly invalid clarity values at, arguably, random octave bands, see Appendix 8. The

altered behavior was further reflected in the STI where discrepancies up to 0.09 STI were

found. In such cases, the subjective difference of speech intelligibility in the auralization

would be clearly audible. However, with an overall STI divergence typically below the

JND a screening of the source data as related to individual receiver positions could enable

the exclusion of flawed data from the analysis. Consequently, auralizations that might not

accurately represent a particular position can be excluded and the overall accuracy of an

assessment enhanced.
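The screening step described above amounts to a per-position comparison against the JND. A minimal sketch follows; the receiver labels and STI values are illustrative only (they are not data from the study), and the function name is hypothetical.

```python
STI_JND = 0.03  # just noticeable difference for STI, as quoted in the text

def screen_positions(predicted, auralized, jnd=STI_JND):
    """Split receiver positions into kept and excluded by STI deviation."""
    kept, excluded = [], []
    for position, sti_pred in predicted.items():
        deviation = abs(auralized[position] - sti_pred)
        (excluded if deviation > jnd else kept).append(position)
    return kept, excluded

# Illustrative values only: one position deviates by 0.09 STI
predicted = {"R1": 0.77, "R2": 0.74, "R3": 0.70}
auralized = {"R1": 0.78, "R2": 0.65, "R3": 0.72}

kept, excluded = screen_positions(predicted, auralized)
print(kept, excluded)  # ['R1', 'R3'] ['R2']
```

Only the auralizations for the kept positions would then be carried forward to a presentation.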


Prediction and Auralization data comparison for SS4

F [Hz]          125           250           500          1000          2000          4000
            Pred. Aural.  Pred. Aural.  Pred. Aural.  Pred. Aural.  Pred. Aural.  Pred. Aural.
Ts (ms)      39.8   28.5   35.2   37.8   23.1   30.0   19.6   25.0   23.1   25.5   23.9   28.0
C50 [dB]      4.1    9.8    4.9    6.3      8    5.3    8.9    6.6    7.8    7.7    7.7    7.0
C80 [dB]      7.5   13.5    8.6    8.8   12.1   10.0   12.9   12.2   12.0   11.4   11.4   11.0

STI: Prediction 0.77 (rating: Excellent), Auralization 0.78 (rating: Excellent)

Table 6.4. Prediction and auralization data comparison example (Room 2, SS4)

The comparison of the prediction output to data derived from the auralizations using the

proposed validation method showed that the majority of auralizations closely

approximated the measured conditions in terms of the acoustic parameters considered.

For speech intelligibility in particular, the limited number of discrepancies observed did

not have a significant impact in the auralizations simulating single source conditions. A

somewhat more critical effect was found for multi source auralizations. The possibility of

a misleading assessment due to individual receiver position data and their subsequent

effect on auralization could be easily detected if the latter are objectively validated. The

limited auralizations that do not correspond to the prediction’s numerical output become

easy to identify, thus enabling exclusion to maintain consistency in the assessment.

Nonetheless, given that the number of significant discrepancies was limited, an

assessment based on presenting a group of auralizations i.e. not a single one, would

potentially give a fair indication of the potential acoustic performance of a space.

A subjective perception assessment of the auralization accuracy in comparison to room

binaural recordings is shown in the next section for the different source conditions.

6.4.3 Subjective assessment of auralizations

In a pilot listening test concerning the auralizations for ten test rooms, the audio examples

were scored by listeners (0-100%, equivalent to ‘not accurate-accurate’) at 76% and 75%

accuracy on average (STD=1.6% and 5%), for single source and multi source conditions

respectively (average STD of different listener results among test rooms was 11.9% and

15.3%). The outcomes thus indicate that a satisfactorily realistic result was produced

overall, with no significant differentiation between the source configurations.


Considering the additional element of source localization, the average score was 77% for

both source conditions (STD=6.9% and 4.7%), suggesting that a reasonably accurate

positioning of the sound source was incorporated in all auralizations (average STD of

different listener results among test rooms was 15.6% and 16.8%).
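The summary statistics quoted in this section (a mean score, its STD across rooms, and the average STD of listener scores within rooms) can be computed as sketched below. The score matrix is hypothetical and much smaller than the actual test, and the use of the sample standard deviation (ddof=1) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

# Hypothetical realism scores in %, rows = rooms, columns = listeners
scores = np.array([
    [80.0, 70.0, 75.0],
    [60.0, 90.0, 75.0],
    [85.0, 65.0, 78.0],
])

room_means = scores.mean(axis=1)                       # one mean score per room
overall_mean = room_means.mean()                       # headline average score
std_between_rooms = room_means.std(ddof=1)             # spread of room means
mean_listener_std = scores.std(axis=1, ddof=1).mean()  # average listener spread per room
```

A small spread between rooms combined with a larger spread between listeners, as reported above, indicates that disagreement among subjects dominates over differences among rooms.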

6.5 Auralization accuracy and relation to model detail

The accuracy of the auralizations predicted from ‘simple’ and ‘CAD’ models was

assessed to establish the efficiency of the auralization process given a somewhat different

generation approach. Considering a more accurate prediction outcome in terms of

numerical output for the ‘CAD’ model, as shown in section 5.3.2, an objective validation

was initially carried out to establish potential differences in the quality of convolution of

the impulse response filter characteristics with the anechoic material. The level of

accuracy conveyed at the end product was thus again the reference for performance. The

second part of the assessment involved a listening test to determine the level of

naturalness in the predicted auralizations in relation to the detail level involved in the

model construction.

6.5.1 Objective assessment of convolution quality from ‘simple’ and ‘CAD’ models

A close examination by objective means demonstrated that to a large extent the ‘simple’

and ‘CAD’ models conveyed the predicted parameters to the auralization in a similar

manner. For a single source condition (S1), minor discrepancies were observed for the

125Hz octave band, having however an insignificant effect in terms of speech intelligibility

parameters, see tables 6.5 and 6.6. C50 values resulted in an error margin close to the JND,

while STI also produced discrepancies below the JND with good agreement for both

models (tables 6.5 and 6.6). Ts discrepancies were typically below 5ms. An example

comparison of the T30 and EDT parameters in Room 8 is shown in figures 6.4 - 6.5 and

6.6 - 6.7 for the ‘simple’ and ‘CAD’ models respectively, validating the initial

assumption of a consistent approach for either detail level in the model construction.


Figure 6.4. Comparison of T30 from prediction output and auralization validation (simple) in Room 8

Figure 6.5. Comparison of EDT from prediction output and auralization validation (simple) in Room 8

Figure 6.6. Comparison of T30 from prediction output and auralization validation (CAD) in Room 8

Figure 6.7. Comparison of EDT from prediction output and auralization validation (CAD) in Room 8

Considering a multi source configuration (SS4), the comparison outcome approximated

single source conditions, see tables 6.7 and 6.8. Again with minor exceptions in the

125Hz octave band, clarity values produced differences that were typically within the

JND range, while STI variation was also within the JND. Ts produced differences within

5ms in most cases.

Prediction-Auralization difference (Simple, S1)

F [Hz]        125    250    500   1000   2000   4000
Ts (ms)      -0.2   -3.3   -7.8   -4.6    2.6    3.0
C50 (dB)     -0.7   -0.3    1.0    0.8   -1.9   -1.9
C80 (dB)     -1.8   -0.5    1.0    1.2   -1.7   -1.7
STI         -0.01

Table 6.5. Example of prediction and auralization based acoustic parameter differences for ‘Simple’ model, Single source (S1) in Room 8

Prediction-Auralization difference (CAD, S1)

F [Hz]        125    250    500   1000   2000   4000
Ts (ms)       5.2   -1.3   -6.8   -3.0    4.3    4.4
C50 (dB)     -2.3   -1.6    0.3    0.9   -1.6   -1.7
C80 (dB)     -2.8   -0.6    0.3    0.8   -1.6   -1.9
STI         -0.02

Table 6.6. Example of prediction and auralization based acoustic parameter differences for ‘CAD’ model, Single source (S1) in Room 8


Prediction-Auralization difference (Simple, SS4)

F [Hz]        125    250    500   1000   2000   4000
Ts (ms)      -0.2   -5.6   -4.6    0.4   -0.2   -3.4
C50 (dB)     -2.1    0.5    0.8    0.6    0.2    0.9
C80 (dB)     -3.0   -0.5    0.4    0.5   -0.1    0.0
STI          0.01

Table 6.7. Example of prediction and auralization based acoustic parameter differences for ‘Simple’ model, Multi source (SS4) in Room 8

Prediction-Auralization difference (CAD, SS4)

F [Hz]        125    250    500   1000   2000   4000
Ts (ms)      13.0    4.0   -5.2   -1.1    2.1   -6.5
C50 (dB)     -3.9   -2.1    0.1    0.5   -0.7    1.0
C80 (dB)     -3.4   -1.5    0.8    1.0   -0.1    0.9
STI         -0.01

Table 6.8. Example of prediction and auralization based acoustic parameter differences for ‘CAD’ model, Multi source (SS4) in Room 8

Overall, the data comparison based on the convolution process for ‘simple’ and ‘CAD’

models did not produce significant differences between the two modelling approaches.

The influence of a model’s detail resolution was minimal, thus not central in the context

of auralization validation process consistency. Consequently, the enhanced accuracy of

‘CAD’ models in terms of numerical output, as demonstrated in section 5.3.2, would

further suggest a better quality auralization.

6.5.2 Assessment of auralization realism by subjective means

The subjective assessment was based on the comparison of binaural recordings to the

predicted auralization for each case, while models were judged in terms of realism to

indicate differences in performance.

Results from the listening tests suggested that ‘simple’ and ‘CAD’ models both produced

a sufficiently realistic auralization. In the quality assessment scale used by the listeners

(0-100% range) ‘simple’ based auralizations scored on average 71% for single source

conditions, compared to 75% for ‘CAD’ based auralizations (STD=4.9% and 2.2%

respectively, average STD of different listener results among test rooms was 12.4% and

10.9% in the same order). The equivalent results for multi source conditions were 73%


and 78% for the two modelling approaches (STD=4.4% and 4.9% respectively, average

STD of different listener results among test rooms was 17.2% and 13.2%). The listening

tests therefore indicated that ‘CAD’ auralizations were of marginally better quality than

‘simple’ i.e. more realistic for both single and multi source conditions in the rooms

considered. Considering the quality differences in terms of numerical output as described

in section 5.3.2, ‘CAD’ auralizations support the advantage of ‘CAD’ models for small

rooms.

The scores awarded for ‘simple’ auralizations nonetheless support to an extent earlier

findings [114] suggesting that in terms of an auralization quality assessment, audio realism

and speech intelligibility performance can comprise two individual tasks, not necessarily

related with each other.

6.6 Conclusions

This chapter has presented a new methodology for an objective validation of auralization,

aimed as an improvement on the only existing method. The new method makes use of a

swept sine test signal as the anechoic material for the convolution process that produces

the auralization. Using a validated simulation and an open loop measurement system as a

basis, the method is able to objectively assess the end result of an auralization process i.e.

the auralized material. By comparison to the existing method, test simulations revealed

that in a limited number of cases the level of accuracy that is conveyed from the

simulation to the impulse response filter (measured by the existing method) can be

somewhat different from the qualities conveyed from the impulse response filter to the

auralization (measured by the proposed method). The observed variations related

primarily to the EDT and C80 for low frequencies, while lesser differences were found for

the additional measures considered i.e. T30, Ts and STI in all octave bands. Overall, the

new method allows for the latter discrepancies to be identified and taken into account.

For consistency purposes, related auralizations can thus be excluded prior to the

presentation of a given set.


At the same time it is clear that the proposed methodology increases the flexibility of an

objective auralization assessment by enabling the use of a swept sine at the deconvolution

stage; accordingly, a broader choice of software platforms for the open loop processing is

available.

Using ‘simple’ and ‘CAD’ models, the accuracy of the convolution process in obtaining

an auralization was examined in relation to the model detail. Trivial differences were

found between the two modelling approaches, thus suggesting that detail resolution is not

a critical factor in this respect. Consequently, considering that ‘CAD’ models have an

advantage in terms of the consistency of numerical output with measurements, shown in

section 5.3.2, an analogous outcome can be expected in terms of auralization accuracy. A

comparative subjective assessment of auralizations derived from a combination of

‘simple’ and ‘CAD’ models showed that both approaches produced satisfactory results,

however with ‘CAD’ auralizations being of marginally better quality i.e. more realistic,

for both single and multi source conditions.

Auralizations for ten test rooms were further assessed in objective terms to determine the

accuracy of the auralizations for the primary test rooms. For single source conditions,

comparison to the predictions’ numerical output revealed occasional discrepancies for

low frequencies in terms of EDT and clarity. For multi source conditions, a similar level

of accuracy was found in terms of clarity with some additional discrepancies at random

octave bands, suggesting that screening of results is essential.

Given an uncertainty factor in the level of auralization accuracy for either source

configuration, an assessment should generally allow for an error margin equal to the JND

for STI and 2dB for clarity, compared to the prediction’s numerical output; if an

appropriate error margin is not considered, the resultant uncertainty could become

unacceptable for projects involving marginal intelligibility conditions.

A subjective assessment of auralization was undertaken for a broader view of the results.

The assessment considered realism and source localization, as compared to room


recordings, indicating that a satisfactorily realistic result was produced overall, with no

significant differentiation between the source configurations. The additional element of

source localization assessment suggested that a reasonably accurate positioning of the

sound source was incorporated in all auralizations.


Chapter 7 – Summary and Conclusions

CHAPTER 7

Summary and Conclusions

7.1 Overview

The aim of this PhD study was to identify an efficient approach, from room acoustics

measurements to computer simulations, for the prediction and assessment of speech

intelligibility and related acoustic parameters in enclosed spaces.

The work undertaken consisted of four main parts:

Defining a measurement methodology that enables a consistent measurement of

the acoustic environment in rooms under different conditions.

Examining a series of low level measurement data to facilitate the acquisition of

usable data from measurements performed under marginal conditions.

The development of suitable computer models incorporating an appropriate

validation/calibration procedure for an accurate prediction of room acoustics

parameters within lecture rooms under different conditions.

The development of a new hybrid methodology for an objective validation of

auralization accuracy and subsequent auralization assessment in relation to the

detail resolution of the associated models.

The conclusions of the study are summarized in the following sections.

7.2 Room acoustics measurement methodology

Room acoustics measurements in ten test rooms have been analyzed for four different

source configurations (closed loop system) to determine the consistency of the output


under the different conditions. The interrelationships between room acoustic parameters

were addressed for a better understanding of the acoustic conditions in lecture rooms,

while open loop resultant data were evaluated to determine the efficiency of the

methodology as a reference for acoustic performance.

The examination of measurement data established that any of the four source

configurations could be used in conjunction with an exponential sine sweep measurement

methodology for a consistent room assessment in terms of T30 (deviations ≤5%, 1 JND).

Given the measured differences among the source configurations for EDT, data

averaged over all the receiver positions are required for a reasonably consistent

assessment (deviations ≤10%, 2 JND). The smaller rooms in the study (rooms 1-8) gave

enhanced confidence in the consistency of measurements. Clarity variation appeared

dependent on the source-receiver arrangement; however, differences did not exceed 2-3dB

in most cases. STI variation was typically within the JND range (≤0.02).

The C50 and EDT values measured in the test rooms were found to be highly correlated in

small rooms. Good correlation was also established between Clarity and STI (0.91 and

0.96 for C50 and C80 respectively) for measurements under noiseless or adequate S/N

conditions. STI can thus be predicted using clarity values with a high precision. For high

S/N in an actual case i.e. low BGNL, clarity can be used as a direct descriptor of

intelligibility.
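The kind of relationship referred to above can be expressed as a correlation coefficient and a least-squares line. The sketch below uses made-up C50/STI pairs purely to show the calculation; it does not reproduce the study data or the reported coefficients, and a linear fit is only meaningful over the range of values it was derived from, since STI is bounded at 1.

```python
import numpy as np

# Illustrative pairs only (not measured data): C50 in dB and the matching STI
c50_db = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
sti = np.array([0.58, 0.64, 0.69, 0.76, 0.80])

r = np.corrcoef(c50_db, sti)[0, 1]                # Pearson correlation coefficient
slope, intercept = np.polyfit(c50_db, sti, 1)     # least-squares line STI = a*C50 + b

def sti_from_c50(c50):
    """Estimate STI from C50 with the fitted line (valid within the fitted range)."""
    return slope * c50 + intercept

estimate = sti_from_c50(6.0)
```

Such an estimate presupposes adequate S/N, in line with the condition stated in the text.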

The feasibility of using data resultant from an open loop measurement methodology as

reference values was examined by a series of open loop measurements in ten test rooms.

It was found that open loop methodologies could potentially be used as an alternative to

closed loop systems; however, the method would not be suitable for a comprehensive

assessment of room acoustics.


7.3 Low level measurements

A series of room acoustics measurements at a reducing S/N have been examined using an

experimental low level measurement methodology to establish the effect on the

measurement output i.e. T30, EDT and STI under marginal conditions. The relation of

room reverberance to S/N was also examined.

Measurement data for ten test rooms revealed that EDT is a more consistent parameter

when compared to T30, producing reduced fluctuations in the derived values for the series

of measurements at a continuously reducing S/N. Accordingly, at marginal conditions

EDT can be significantly more accurate than T30, while its use for further post processing

e.g. for computer modelling purposes as the reference parameter of choice would enhance

confidence in the consistency of later analysis outcomes. Overall, a 20dB and 12dB S/N

was necessary in this study for an accurate estimation of T30 and EDT respectively, over

all frequency bands with no signal averaging.

The measured acoustic parameters revealed a trend in the relation of the

threshold S/N to T30 and EDT: for increasing values of reverberance, a

higher S/N is generally required for an accurate measurement.

Higher signal levels are used to obtain adequate S/N, particularly at lower frequencies,

due to the effect of BGNL. Given that higher frequencies are more intrusive in terms of

annoyance, a significantly more tolerable level can be achieved by suitable signal

equalization to produce a constant S/N across the frequency spectrum while facilitating

measurement accuracy. For an absolute speech intelligibility assessment, this is only

valid if post processing of the measurements is used to suitably account for the influence

of speech level and BGNL.
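The equalization idea reduces to simple band-by-band arithmetic: the signal level in each octave band is set to the background noise level plus the target S/N. In the sketch below the background noise levels are hypothetical, the 20dB target is the T30 requirement reported in this section, and the function name is an assumption.

```python
BANDS_HZ = [125, 250, 500, 1000, 2000, 4000]

def equalized_levels(bgnl_db, target_snr_db):
    """Signal level per band (dB) giving the same S/N in every band."""
    return [round(noise + target_snr_db, 1) for noise in bgnl_db]

# Hypothetical background noise levels per octave band, in dB
bgnl = [45.0, 40.0, 35.0, 30.0, 27.0, 25.0]

levels = equalized_levels(bgnl, target_snr_db=20.0)
flat_level = max(levels)                        # flat spectrum meeting the same minimum S/N
reduction = [flat_level - lvl for lvl in levels]
print(levels)     # [65.0, 60.0, 55.0, 50.0, 47.0, 45.0]
print(reduction)  # [0.0, 5.0, 10.0, 15.0, 18.0, 20.0]
```

With noise dominated by low frequencies, the equalized signal is considerably quieter at high frequencies than a flat signal meeting the same minimum S/N, which is the source of the improved tolerability.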


7.4 Development of an optimized methodology for improved computer

models

Computer simulations that can accurately predict the acoustic environment have been

developed. The simulations have been used to investigate the prediction efficiency within

university lecture rooms in terms of speech intelligibility parameters under different

conditions. The focus was primarily on two aspects of the simulation process i.e. the

validation/calibration methodology and the detail resolution that is required for a

consistent prediction outcome under different conditions. The optimization of processing

speed was also addressed in parallel with the resultant simulation precision. A near field

response was confirmed as essential in terms of source directivity for simulating acoustic

conditions for small rooms.

The proposed prediction validation/calibration process utilized a single acoustic

parameter in comparison to acoustic measurements. Given that an accurate definition of

scattering and absorption coefficients is a critical element in reducing simulation

uncertainties, the estimation principally of absorption coefficients via simulation

calibration appeared to be most suitable for post evaluation applications. The

methodology enabled an accurate prediction i.e. prediction results that are consistent with

measurements under different conditions.
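The principle of calibrating absorption against a single measured parameter can be illustrated with Sabine's formula in place of the geometrical acoustics simulation. This is a deliberate simplification and not the procedure applied to the models in this study: all coefficients are scaled by one common factor until the predicted reverberation time equals the measured T30. Room volume, surface areas, coefficients and function names are hypothetical.

```python
def sabine_rt(volume_m3, surfaces):
    """Sabine reverberation time; surfaces is a list of (area_m2, alpha)."""
    absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption

def calibrate(volume_m3, surfaces, measured_t30):
    """Scale all absorption coefficients so the prediction matches measured T30."""
    factor = sabine_rt(volume_m3, surfaces) / measured_t30
    # Coefficients are capped at 1.0; if the cap is reached the match is only approximate
    return [(area, min(alpha * factor, 1.0)) for area, alpha in surfaces]

volume = 150.0                                           # m3, hypothetical room
surfaces = [(60.0, 0.10), (60.0, 0.30), (100.0, 0.05)]   # floor, ceiling, walls
measured_t30 = 0.60                                      # s, hypothetical measurement

initial = sabine_rt(volume, surfaces)                    # prediction with initial guesses
calibrated = calibrate(volume, surfaces, measured_t30)
final = sabine_rt(volume, calibrated)                    # prediction after calibration
```

In a ray-based model the relation between coefficients and T30 is not exactly inverse, so the adjustment would normally be repeated until the prediction converges.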

Enhanced detail in the representation of room geometry was found to be preferable for

smaller rooms. CAD platforms provide a useful tool for this purpose by speeding up

model generation, however typically resulting in an increase of the processing time

required for a prediction. Comparative pilot approaches for models of varying detail

resolution have established prediction results marginally over a JND STI for ‘simple’

models, while the same parameter was within the JND for models with enhanced detail

resolution. Results in terms of C50, T30 and EDT were found to be acceptable for both

approaches, however with noticeably enhanced accuracy for the detailed models. In the

same context, the use of alternative experimental conditions e.g. different source

configurations, resulted in similar levels of accuracy for the two modelling approaches

without the need for any alterations (or further tuning) of the simulation post calibration,


i.e. based on the original T30 (or EDT) calibration for one source configuration. In small

rooms thus, enhanced detail resolution has improved the prediction accuracy.

Simultaneous use of additional parameters during the validation/calibration process could

enable a more accurate prediction for ‘simple’ models. However, this more elaborate

procedure appeared unnecessary for more detailed geometry representations. Considering

the potential advantages of using EDT as the reference parameter due to increased

robustness at marginal conditions, experimentation confirmed the approach as viable,

producing similar accuracy levels to a T30 based procedure. The use of EDT as a

reference parameter is thus recommended as a means to increase confidence in the

predictions.

Considering the necessary balance between overly simplified rooms and complex

inefficient geometries, the processing time of ‘CAD’ models was optimized to

approximate ‘simple’ model run times. Simple steps have been described to improve the

simulation efficiency by model redesign, while retaining the required detail resolution. A

model design/preparation guideline was defined to optimize the simulation methodology.

The analysis of prediction outcomes for ten test rooms and four source configurations

demonstrated high accuracy in terms of the parameters considered, i.e. EDT, T30, Ts, C50,

C80 and STI, with the models emerging as particularly efficient for a speech intelligibility assessment.
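
The energy-ratio parameters listed above follow directly from an impulse response. The sketch below computes C50, C80 and Ts from their standard definitions (early-to-late energy ratio and first moment of the squared response). It assumes the impulse response starts at the arrival of the direct sound. The function names and the synthetic decay are hypothetical and are not taken from the measurement or simulation software used in this work.

```python
import math


def clarity_db(ir, fs, limit_ms):
    """Early-to-late energy ratio in dB, split at limit_ms (50 ms -> C50, 80 ms -> C80)."""
    split = int(round(limit_ms * 1e-3 * fs))
    early = sum(x * x for x in ir[:split])
    late = sum(x * x for x in ir[split:])
    return 10.0 * math.log10(early / late)


def centre_time_s(ir, fs):
    """Centre time Ts: first moment of the squared impulse response, in seconds."""
    energy = [x * x for x in ir]
    return sum(i / fs * e for i, e in enumerate(energy)) / sum(energy)


def synthetic_ir(rt60, fs, length_s):
    """Hypothetical test signal: exponential envelope falling by 60 dB in rt60 seconds."""
    return [math.exp(-math.log(1000.0) * i / fs / rt60) for i in range(int(length_s * fs))]


fs = 8000                        # assumed sample rate, Hz
ir = synthetic_ir(0.6, fs, 2.0)  # assumed 0.6 s decay
c50 = clarity_db(ir, fs, 50.0)
c80 = clarity_db(ir, fs, 80.0)
ts = centre_time_s(ir, fs)
print(f"C50 = {c50:.2f} dB, C80 = {c80:.2f} dB, Ts = {ts * 1000:.1f} ms")
```

STI is not included here because it requires the modulation transfer function and the band weightings of the relevant standard, which are beyond a short sketch.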

7.5 Auralization accuracy assessment

A new hybrid methodology has been proposed for an objective validation of auralization

accuracy. The method uses a swept sine test signal at the convolution stage and an open

loop measurement system to assess the end product of an auralization process.

Comparisons to the only existing objective auralization validation method revealed

variations in the assessment results, predominantly for low frequencies. The proposed

method allows for the discrepancies to be identified and taken into account in an

objective auralization validation. Thus, for consistency, the affected auralizations can

be excluded prior to the presentation of a given set.

An additional improvement over the existing method is the increased flexibility of an

objective auralization assessment. By enabling the use of a swept sine at the

deconvolution stage, a broader choice of software platforms for the open loop processing

is available, since the process is not restricted to Dirac pulses.
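
The flexibility argument rests on the fact that a swept sine passed through a room response can be deconvolved back into an impulse response. The sketch below illustrates that step with an exponential sweep and a time-reversed, amplitude-compensated inverse filter. It is a toy example under stated assumptions and not the open loop system or software used in this work: the sample rate, sweep range, two-tap 'room' response and all function names are hypothetical. Direct convolution is used only to keep the example free of dependencies; a practical tool would normally use FFT-based processing.

```python
import math


def exp_sweep(f1, f2, n, fs):
    """Exponential sine sweep from f1 to f2 Hz, n samples long."""
    duration = n / fs
    rate = math.log(f2 / f1)
    k = 2.0 * math.pi * f1 * duration / rate
    return [math.sin(k * (math.exp(i / n * rate) - 1.0)) for i in range(n)]


def inverse_filter(sweep, f1, f2):
    """Time-reversed sweep with a decaying envelope that compensates
    the sweep's falling (pink) energy spectrum."""
    n = len(sweep)
    rate = math.log(f2 / f1)
    return [sweep[n - 1 - i] * math.exp(-i / n * rate) for i in range(n)]


def convolve(a, b):
    """Plain direct-form linear convolution."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


fs, n = 8000, 1024       # assumed sample rate and sweep length
f1, f2 = 100.0, 3000.0   # assumed sweep range, Hz
sweep = exp_sweep(f1, f2, n, fs)

# Hypothetical room response: direct sound after 10 samples,
# one reflection 40 samples later at half the amplitude.
room_ir = [0.0] * 51
room_ir[10] = 1.0
room_ir[50] = 0.5

recorded = convolve(room_ir, sweep)  # what the measurement system would capture
recovered = convolve(recorded, inverse_filter(sweep, f1, f2))

# The sweep and its inverse filter compress to a pulse at index n - 1,
# so the recovered response is read relative to that offset.
peak = max(range(len(recovered)), key=lambda i: abs(recovered[i]))
delay = peak - (n - 1)
reflection_ratio = abs(recovered[peak + 40]) / abs(recovered[peak])
print(delay, round(reflection_ratio, 2))
```

The recovered response is band limited to the sweep range, so the reflection amplitude is only approximately reproduced.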

Auralizations for ten test rooms were further assessed using the proposed method in

comparison to direct numerical output of the predictions to determine the accuracy of the

auralizations for the primary test rooms. Typically, minor variations were found for the

acoustic parameters considered while clarity and STI variations in particular were below

2dB and 0.03 respectively, for all source configurations. However, individual data points

exceeding the typical error suggested that for multi source conditions in particular a

screening of results is essential.

Given an uncertainty factor in the level of auralization accuracy for either source

configuration, an assessment should generally allow for an error margin equal to the JND

for STI and 2dB for clarity, compared to the prediction’s numerical output, or the

resultant uncertainty could become unacceptable for projects involving marginal

intelligibility conditions.
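
The screening described above amounts to comparing each auralization-derived value with the numerical output of the prediction and flagging receivers that exceed the stated margins. The sketch below shows one way this could be arranged. The 0.03 figure used for the STI JND is an assumption consistent with the variations quoted above, and the receiver labels and values are invented for illustration and are not results from this study.

```python
C50_MARGIN_DB = 2.0   # clarity margin stated in the text
STI_MARGIN = 0.03     # assumed value for one JND in STI


def screen_auralizations(predicted, auralized):
    """Return the receivers whose auralization deviates from the prediction's
    numerical output by more than the allowed margin in C50 or STI.

    Both arguments map a receiver label to a (C50 in dB, STI) tuple.
    """
    flagged = []
    for receiver, (c50_pred, sti_pred) in predicted.items():
        c50_aur, sti_aur = auralized[receiver]
        c50_err = abs(c50_aur - c50_pred)
        sti_err = abs(sti_aur - sti_pred)
        if c50_err > C50_MARGIN_DB or sti_err > STI_MARGIN:
            flagged.append(receiver)
    return sorted(flagged)


# Invented example values (not measured data).
predicted = {"R1": (3.1, 0.58), "R2": (1.5, 0.52), "R3": (2.2, 0.61)}
auralized = {"R1": (3.9, 0.59), "R2": (4.0, 0.53), "R3": (2.0, 0.66)}

flagged = screen_auralizations(predicted, auralized)
print(flagged)
```

Flagged receivers would then be inspected or excluded before the auralizations are presented, in line with the screening recommended for multi source conditions.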

A subjective assessment of auralization was undertaken considering realism and source

localization, compared to room recordings. The assessment indicated a satisfactory result

overall for both auralization aspects, with no significant differentiation between the

source configurations.

Using ‘simple’ and ‘CAD’ models, the quality of the convolution process that produces

the auralization was examined in relation to the model detail, showing no significant

differences between the modelling approaches. As the ‘CAD’ model predictions were

more consistent with measurements, an analogous outcome can be expected for the

auralization products. A comparative subjective assessment of auralizations for ‘simple’

and ‘CAD’ models showed that both approaches produced satisfactory results. However,

the ‘CAD’ auralizations were of marginally higher quality.

7.6 Further work

The research undertaken could be furthered in a number of aspects:

- The low level measurement methodology needs to be further investigated for the different source configurations in different environments to better understand the measurement conditions and extract further propositions to improve the measurement efficiency.

- Binaural STI measurements can be compared to standard single omnidirectional measurements to establish prospective differences. Model behaviour can be analyzed in relation to binaural data to verify the prediction quality for different conditions.

- A detailed model was preferred for computer simulations of smaller fitted rooms. This is not desirable for large rooms, however, and further work is needed to identify the limits beyond which an enhanced level of detail is no longer beneficial.

- The models could be improved by further considering prediction repeatability, processing speed and overall efficiency.

- The models should be used to give suggestions for optimal acoustic conditions in classrooms, particularly when sound systems are utilized, e.g. controlling loudspeaker time delay, positioning, aiming, directivity pattern etc.

7.7 Overall conclusions

The different stages of a general room acoustics/speech intelligibility assessment in

university lecture rooms have been examined and analyzed. Computer models were

developed to accurately predict acoustic parameters while reducing the uncertainties

involved, facilitating a confident assessment in the rooms considered.

In Chapter 3, the room acoustics measurements in ten test rooms have been analyzed for

four source configurations and two measurement methodologies to identify data

consistency between different approaches. Measurement uncertainty was further analyzed

in terms of S/N at marginal conditions in Chapter 4, highlighting the advantages of using

EDT in terms of later use of data for a validation/calibration methodology. A

validation/calibration methodology and the preferred approach in terms of model detail

resolution have been examined in Chapter 5, producing simulations that enable accurate

prediction outcomes based on EDT. Finally, a flexible objective auralization assessment

methodology has been developed and validated in Chapter 6. The associated uncertainties

for each stage of the assessment were specified (see Table 7.1) to account for the potential

error margins, thereby enhancing confidence in the assessment outcome for analogous

research.

Table 7.1. Assessment uncertainty via error margins, calculated by considering the data origin at the assessment stages

Error margins based on the origin of related data

Reference process                                                           T30     EDT     C50      STI
Four source configurations' data consistency (σ)                            ≤ 5%    ≤ 10%   ≤ 3dB    ≤ 0.02 *
Open loop data (either source configuration) compared to closed loop data   ≤ 10%   ≤ 10%   ≤ 2dB    ≤ 0.03 *
Prediction (simple, single source) compared to measurements                 ≤ 25%   ≤ 50%   ≤ 7dB    ≤ 0.06 *
Prediction (CAD, single source) compared to measurements                    ≤ 10%   ≤ 25%   ≤ 3dB    ≤ 0.03 *
Prediction (simple, multi source) compared to measurements                  N/A     N/A     ≤ 5dB    ≤ 0.06 *
Prediction (CAD, multi source) compared to measurements                     N/A     N/A     ≤ 3dB    ≤ 0.03 *
Prediction (numerical output) compared to auralization (single source)      ≤ 5%    ≤ 5%    ≤ 2dB    ≤ 0.03
Prediction (numerical output) compared to auralization (multi source)       ≤ 5%    ≤ 5%    ≤ 2dB*   ≤ 0.03*

* screening needed for extreme values
* averaged over receiver positions

This work has thus developed a suitable methodology for the assessment of primary

acoustic parameters and speech intelligibility using computer simulations and

auralization in university lecture rooms. Confidence in the assessment has been achieved

by specifying the potential error margins at the different stages of the assessment.

The outcomes can be used by acoustics researchers and consultants to study and improve

the acoustic conditions in university lecture rooms and other similar spaces.

