Online Submission ID: 0
SynCoPation: Synthesis Coupled Sound Propagation
Figure 1: Our coupled sound synthesis-propagation technique has been integrated in the Unity™ game engine. We plan to demonstrate the sound effects generated by our system on a variety of scenarios: (a) Cathedral, (b) Tuscany, and (c) Game scene. In the leftmost case, the bowl sounds are synthesized and propagated in the cathedral; in the middle scene, the bell sounds are synthesized and propagated in an outdoor scene; in the last scene, the sounds of a barrel hitting the ground are synthesized and propagated.
Abstract

Sounds can augment the sense of presence and immersion of users and improve their experience in virtual environments. Recent research in sound simulation and rendering has focused either on sound synthesis or on sound propagation, and many standalone algorithms have been developed for each domain. We present a novel technique for automatically generating aural content for virtual environments based on an integrated scheme that can perform sound synthesis as well as sound propagation. Our coupled approach can generate sounds from rigid bodies based on the audio modes and radiation coefficients, and interactively propagate them through the environment to generate acoustic effects. Our integrated system allows a high degree of dynamism: it can simultaneously support dynamic sources, dynamic listeners, and dynamic directivity. Furthermore, our approach can be combined with wave-based and geometric sound propagation algorithms to compute environmental effects. We have integrated our system with the Unity game engine and show the effectiveness of fully-automatic audio content creation in complex indoor and outdoor scenes.
1 Introduction

Sound simulation algorithms predict the behavior of sound waves, including the generation of sound waves from vibrations and the propagation of sound waves in the environment. Realistic sound simulation is important in computer games to increase the level of immersion and realism. Sound augments the visual sense of the player, provides spatial cues about the environment, and can improve the overall gaming experience. At a broad level, prior research in sound simulation can be classified into two parts: synthesis and propagation. The problem of sound synthesis deals with simulating the physical processes (e.g., the vibration of a sound source) involved in the generation of sound. Sound propagation, on the other hand, deals with the behavior of sound waves as they are emitted by the source, interact with the environment, and reach the listener.
State-of-the-art techniques for sound simulation deal with the problems of sound synthesis and sound propagation independently. Sound synthesis techniques model the generation of sound resulting from vibration analysis of the structure of the object [Zheng and James 2009; Chadwick et al. 2009; Moss et al. 2010; Zheng and James 2010]. However, these techniques model only sound propagation in free space (empty space), and the acoustic effects generated by the environment are mostly ignored. Sound propagation techniques [Krokstad et al. 1968; Allen and Berkley 1979; Funkhouser et al. 1998; Raghuvanshi et al. 2010; Mehra et al. 2013; Yeh et al. 2013] model the interaction of sound waves in indoor and outdoor spaces, but assume pre-recorded or pre-synthesized audio clips as input. These assumptions can result in missing sound effects and generate inaccurate (or non-plausible) approximations of the underlying physical reality produced by the process of sound simulation. For example, consider the case of a kitchen bowl falling from a countertop: the change in the directivity of the bowl with the hit position, and the effect of this time-varying directivity on the propagated sound field in the kitchen, is mostly ignored by current simulation techniques. Similarly, for a barrel rolling down an alley, the sound consists of multiple frequencies, each with different radiation and propagation characteristics, which are mostly ignored by current sound simulation systems. Due to these limitations, artists and game audio designers have to manually design sound effects corresponding to these different kinds of scenarios, which can be very tedious and time-consuming.
In this paper, we present the first coupled synthesis and propagation system that models the entire process of sound simulation, starting from the surface vibration of objects, through the radiation of sound waves from these surface vibrations, to the interaction of the resulting sound waves with the environment. Our technique models the surface vibration characteristics of an object by performing modal analysis using the finite element method. These surface vibrations are used as boundary conditions for a Helmholtz equation solver (based on the boundary element method) to generate outward-radiating sound fields. These radiating sound fields are expressed in a compact basis using the single-point multipole expansion [Ochmann 1999]. Mathematically, this single-point multipole expansion corresponds to a single sound source placed inside the object. The sound propagation due to this source is computed using a numerical sound simulation technique (at low frequencies) and ray tracing (at high frequencies). We also describe techniques to accelerate ray-tracing algorithms based on path clustering and binning. Our approach performs end-to-end sound simulation from first principles and enables automatic sound effect generation for interactive applications, thereby reducing the manual effort and time spent by artists and game-audio designers.
The main contributions of our work on coupled sound synthesis-propagation include:
1. An integrated technique for accurately simulating the effect of time-varying directivity.

2. High accuracy achieved by correct phase computation and per-frequency modeling of sound vibration, radiation, and propagation.

3. An interactive runtime that handles a high degree of dynamism, e.g., dynamic surface vibrations, dynamic sound radiation, and sound propagation for dynamic sources and listeners.
We plan to integrate our technique with the Unity™ game engine and demonstrate the effect of coupled sound synthesis-propagation on a variety of indoor and outdoor scenarios, as shown in Fig. 1.
2 Related Work

In this section, we review some of the most closely related work on sound synthesis, radiation, and propagation techniques.
2.1 Sound Synthesis

CORDIS-ANIMA was perhaps the first system to model surface vibration with damped springs and masses in order to synthesize physically-based sounds [Florens and Cadoz 1991]. Numerical integration using a finite element formulation was later presented as a more accurate technique for modeling vibrations [Chaigne and Doutaut 1997; O'Brien et al. 2001]. Instead of using numerical integration, [van den Doel and Pai 1996; van den Doel and Pai 1998] proposed computing the analytical vibration modes of an object, leading to considerable speedups and enabling real-time sound synthesis.
[van den Doel et al. 2001] introduced the first method to determine the vibration modes and their dependence on the point of impact for a given shape, based on physical measurements. Later, [O'Brien et al. 2002] presented a general algorithm to determine the vibration modes of arbitrarily-shaped 3D objects by discretizing them into tetrahedral volume elements. They showed that the corresponding finite element equations can be solved analytically after suitable approximations. Consequently, they were able to model arbitrarily-shaped objects and simulate realistic sounds for a few objects at interactive rates [O'Brien et al. 2002]. This approach requires an expensive precomputation called modal analysis. [Raghuvanshi and Lin 2006a] used a simpler spring-mass system along with perceptually motivated acceleration techniques to recreate realistic sound effects for hundreds of objects in real time. In this paper, we use an FEM-based method to precompute the vibration modes, similar to [O'Brien et al. 2002].
[Ren et al. 2012] presented an interactive virtual percussion instrument system that used modal synthesis as well as numerical sound propagation for modeling a small instrument cavity. Despite some apparent similarity, this work is quite different from our coupled approach: their combination of synthesis and propagation is not tightly coupled or integrated, and the volume of the underlying acoustic spaces is small in comparison to typical game scenes (e.g., the benchmarks shown in Fig. 1).
2.2 Sound Radiation

The Helmholtz equation is the standard way to model sound radiating from vibrating rigid bodies. The boundary element method is widely used for acoustic radiation problems [Ciskowski and Brebbia 1991; von Estorff 2000], but has a major drawback in terms of memory requirements, i.e., O(N²) memory for N boundary elements. An efficient technique known as the equivalent source method (ESM) [Fairweather 2003; Kropp and Svensson 1995; Ochmann 1999; Pavic 2006] exploits the uniqueness of solutions to the acoustic boundary value problem. ESM expresses the solution field as a linear combination of simple radiating point sources of various orders (monopoles, dipoles, etc.) by placing these simple sources at variable locations inside the object and matching the total generated field with the boundary conditions on the object's surface, guaranteeing the correctness of the solution. [James et al. 2006] use the equivalent source method to compute the radiated field generated by a vibrating object.
2.3 Sound Propagation

Sound is a pressure wave described by the Helmholtz equation for a domain Ω:

∇²p + (ω²/c²) p = 0,  x ∈ Ω,   (1)

where p(x) is the complex-valued pressure field, ω is the angular frequency, c is the constant speed of sound in a homogeneous medium, and ∇² is the Laplacian operator. Boundary conditions are specified on the boundary of the domain ∂Ω using either the Dirichlet boundary condition, which specifies the pressure on the boundary, p = f(x) on ∂Ω; the Neumann boundary condition, which specifies the velocity of the medium, ∂p(x)/∂n = f(x) on ∂Ω; or a mixed boundary condition, which specifies Z ∈ ℂ such that Z ∂p(x)/∂n = f(x) on ∂Ω. We also need to specify the behavior of p at infinity, which is usually done using the Sommerfeld radiation condition [Pierce et al. 1981]:

lim_{r→∞} [∂p/∂r + i (ω/c) p] = 0,   (2)

where r = ||x|| is the distance of point x from the origin.
Different methods exist to solve this equation under different formulations. Numerical methods solve for p numerically by discretizing either the entire domain or its boundary. Geometric techniques model p as a set of rays and propagate these rays through the environment.
2.3.1 Wave-based Sound Propagation

Wave-based or numerical sound propagation solves the acoustic wave equation using numerical wave solvers. These methods capture the exact behavior of a propagating sound wave in a domain. Numerical wave solvers discretize space and time to solve the wave equation. Typical techniques include the finite difference time domain (FDTD) method [Yee 1966; Taflove and Hagness 2005; Sakamoto et al. 2006], the finite element method [Thompson 2006], the boundary element method [Cheng and Cheng 2005], the pseudo-spectral time domain method [Liu 1997], and domain decomposition [Raghuvanshi et al. 2009]. Wave-based methods have high accuracy and can simulate wave effects such as diffraction accurately at low frequencies. However, their memory and compute requirements grow as the third or fourth power of the frequency, making them impractical for interactive applications.
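As a concrete illustration of the wave-based family, the update rule of a 1D FDTD leapfrog solver for the scalar wave equation can be sketched as below. This is a minimal toy sketch, not one of the solvers cited above; the grid sizes and CFL number are illustrative choices.

```python
# Minimal 1D FDTD sketch: leapfrog update of p_tt = c^2 p_xx on a
# uniform grid with Dirichlet (p = 0) boundaries at both ends.
import numpy as np

def fdtd_1d(n=200, steps=300, c=343.0, dx=0.1, cfl=0.9):
    dt = cfl * dx / c                      # CFL-stable time step
    lam2 = (c * dt / dx) ** 2
    p_prev = np.zeros(n)
    p = np.zeros(n)
    p[n // 2] = 1.0                        # impulsive point excitation
    for _ in range(steps):
        p_next = np.zeros(n)
        # second-order central differences in time and space
        p_next[1:-1] = (2 * p[1:-1] - p_prev[1:-1]
                        + lam2 * (p[2:] - 2 * p[1:-1] + p[:-2]))
        p_prev, p = p, p_next
    return p
```

The cubic-to-quartic cost growth mentioned above comes from refining such a grid in three spatial dimensions plus time as the frequency increases.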
2.3.2 Geometric Sound Propagation

Geometric sound propagation techniques use the simplifying assumption that the wavelength of sound is much smaller than the features in the scene. As a result, these methods are most accurate at high frequencies and must model low-frequency effects like diffraction and scattering as separate phenomena. Commonly used
techniques are based on image source methods [Allen and Berkley 1979; Borish 1984] and ray tracing [Krokstad et al. 1968; Vorlander 1989]. Recently, there has been a focus on computing realistic acoustics in real time using algorithms designed for fast simulation. These include beam tracing [Funkhouser et al. 1998], frustum tracing [Chandak et al. 2008], and ray-based algorithms [Lentz et al. 2007; Taylor et al. 2012] that compute low-order reflections. In addition, frame-to-frame coherence of the sound field can be utilized to achieve a significant speedup [Schissler et al. 2014].
Edge diffraction effects can be approximated within GA frameworks using methods based on the uniform theory of diffraction (UTD) [Kouyoumjian and Pathak 1974] or the Biot-Tolstoy-Medwin (BTM) model [Svensson et al. 1999]. These approaches have been applied to static scenes and low-order diffraction [Tsingos et al. 2001; Antani et al. 2012a], as well as to dynamic scenes with first-order [Taylor et al. 2012] and higher-order diffraction [Schissler et al. 2014]. Diffuse reflection effects caused by surface scattering have previously been modeled using the acoustic rendering equation [Siltanen et al. 2007; Antani et al. 2012b] and radiosity-based methods [Franzoni et al. 2001]. Another commonly-used technique for ray tracing, called vector-based scattering, uses scattering coefficients to model diffusion [Christensen and Koutsouris 2013].
3 Our Algorithm

In this section, we give a brief background on various concepts used in the paper and present our coupled synthesis-propagation algorithm.
3.1 Background

Modal Analysis: Sound is produced by small vibrations of objects. These vibrations, although invisible to the naked eye, are audible if the frequency of vibration lies in the range of human hearing (20 Hz - 20 kHz). Modal analysis is a well-known technique for modeling such sounds in rigid bodies. The small vibrations can be modeled using a coupled linear system of ODEs:

Kd + Cḋ + Md̈ = f,   (3)

where K, C, and M are the stiffness, damping, and mass matrices, respectively, and f represents the (external) force vector. For small damping it is possible to approximate C as a combination of the mass and stiffness matrices: C = αM + βK. This facilitates the diagonalization of the above equation, which is represented as a generalized eigenvalue problem:

KU = ΛMU,   (4)

where Λ is the diagonal eigenvalue matrix and U contains the eigenvectors of K. Solving this eigenvalue problem enables us to write Eq. 3 as a system of decoupled oscillators:

q̈ + (αI + βΛ)q̇ + Λq = Uᵀf,   (5)

where U projects d into the modal subspace q, with d = Uq.
Acoustic transfer: The pressure p(x) at any point, obtained by solving Eq. (1), is called the acoustic transfer function. The acoustic transfer function gives the relation between the surface normal displacements at a surface node and the sound pressure at a given field point. A common method in acoustics for evaluating these transfer functions is the boundary element method (BEM) discussed above.
Since we are solving Eq. (1) in the frequency domain, we have to solve the exterior scattering problem for each mode separately. This can be achieved using a fast BEM solver and specifying the Neumann boundary condition:

∂p/∂n = −iωρv on S,   (6)

where S = ∂Ω (the boundary of the object), ρ is the fluid density, and v is the surface's normal velocity, given by v = iω(n · u), where n · u is the modal displacement in the normal direction. This boundary condition links the modal displacements with the pressure at a point. Unfortunately, BEM is not fast enough for an interactive runtime, necessitating the use of fast, approximate acoustic transfer functions [James et al. 2006].
To approximate the acoustic transfer, we use a source simulation technique called the equivalent source method. We represent a sound source using a collection of point sources (called equivalent sources) and match the pressure values on the boundary of the object ∂Ω with the pressure on ∂Ω calculated using BEM. The main idea is that if we can set the strengths of the equivalent sources to match the boundary pressure, we can evaluate the pressure at any point in Ω using these equivalent sources.
Equivalent sources: The uniqueness of the acoustic boundary value problem guarantees that the solution of the free-space Helmholtz equation along with the specified boundary conditions is unique inside Ω. The unique solution p(x) can be found by expressing it as a linear combination of fundamental solutions. One choice of fundamental solutions is based on equivalent sources. An equivalent source q(x, yᵢ) of the Helmholtz equation subject to the Sommerfeld radiation condition (x ≠ yᵢ) is the solution field induced at any point x due to a point source located at yᵢ, and can be expressed as:

q(x, yᵢ) = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{ilm} φ_{ilm}(x) = Σ_{k=1}^{L²} d_{ik} φ_{ik}(x),   (7)

where k is a generalized index for (l, m) and c_{ilm} is its strength. These fundamental solutions φ_{ik} are chosen to correspond to the field due to spherical multipole sources of order L (L = 1 being a monopole, L = 2 a dipole, and so on) located at yᵢ. Spherical multipoles are given as a product of two functions:

φ_{ilm}(x) = Γ_{lm} h_l⁽²⁾(k_i r_i) ψ_{lm}(θᵢ, φᵢ),   (8)

where (rᵢ, θᵢ, φᵢ) is the vector (x − yᵢ) expressed in spherical coordinates, h_l⁽²⁾(k_i r_i) is the spherical Hankel function of the second kind, k_i is the wavenumber, given by ω_i/c, ψ_{lm}(θᵢ, φᵢ) are the complex-valued spherical harmonic functions, and Γ_{lm} is the normalizing factor for the spherical harmonics. The pressure at any point in Ω due to M equivalent sources located at {yᵢ}_{i=1}^{M} can be expressed as a linear combination:

p(y) = Σ_{i=1}^{M} Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{ilm} φ_{ilm}(y).   (9)

We have to determine the L² complex coefficients c_{ilm} for each of the M multipoles. This compact representation of the pressure p(y) makes it possible to evaluate the pressure at any point of the domain in an efficient manner.
3.2 Coupled Algorithm

We now discuss our coupled synthesis-propagation algorithm. As shown in Fig. 2, we start with the modal analysis of the sounding object, which gives the modal displacements, modal frequencies,
Figure 2: Overview of our coupled synthesis-propagation pipeline. The bowl is used as an example of a modal object. The first stage comprises the modal analysis; the figures in red show the first two sounding modes of the bowl. We then form an offset surface around the bowl, calculate the pressure on this offset surface, place a single multipole at the center of the object, and approximate the BEM-evaluated pressure. In the runtime part of the pipeline, we use the multipole to couple with a propagation system and generate the final sound at the listener.
and modal amplitudes. We use these mode shapes as boundary conditions for BEM to compute the pressure on an offset surface. Then we place a single equivalent source at the center of the object and approximate the pressure calculated using BEM. This gives us a vector of (complex) coefficients of the multipole strengths. At this stage (the SPME stage in the pipeline), we have computed the representation of an acoustic radiator, which serves as the source for propagation in the runtime stage of the pipeline, using either a geometric or a numeric sound propagator. Our method is agnostic to the type of sound propagator but, owing to the high modal frequencies generated in our benchmarks, we use a geometric sound propagation system to obtain interactive performance. The final stage of the pipeline takes the impulse response for each mode, convolves it with that mode's amplitude, and sums the results to give the final signal. We describe each stage of our pipeline below.
3.2.1 Sound synthesis

Given an object, we solve the displacement equation (Eq. 5) to get a discrete set of mode shapes d̂ᵢ, their modal frequencies ωᵢ, and the amplitudes qᵢ(t). The vibration's displacement vector is given by:

d(t) = Uq(t) ≡ [d̂₁, ..., d̂_M] q(t),   (10)

where M is the total number of modes and q(t) ∈ ℝᴹ is the vector of modal amplitude coefficients, expressed as a bank of sinusoids:

qᵢ(t) = aᵢ e^{−dᵢt} sin(2πfᵢt + θᵢ),   (11)

where fᵢ is the modal frequency (in Hz), dᵢ is the damping coefficient, aᵢ is the amplitude, and θᵢ is the initial phase.
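Eq. (11) amounts to a bank of damped sinusoids, one per mode. A minimal sketch (the frequencies, damping coefficients, and amplitudes below are made-up values, not modal-analysis output):

```python
# Sketch of the modal amplitude bank (Eq. 11): each mode is a damped
# sinusoid q_i(t) = a_i * exp(-d_i * t) * sin(2*pi*f_i*t + theta_i).
import numpy as np

def modal_bank(t, freqs, amps, damps, phases):
    """Return an (M, len(t)) array: one damped sinusoid per mode."""
    t = np.asarray(t)
    return np.array([a * np.exp(-d * t) * np.sin(2 * np.pi * f * t + th)
                     for f, a, d, th in zip(freqs, amps, damps, phases)])

sr = 48000
t = np.arange(sr) / sr                     # one second of samples
q = modal_bank(t, freqs=[440.0, 1230.0], amps=[1.0, 0.5],
               damps=[6.0, 14.0], phases=[0.0, 0.0])
mix = q.sum(axis=0)                        # raw (unpropagated) signal
```

In the full pipeline these per-mode signals are not summed directly; each is first convolved with that mode's propagated impulse response.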
3.2.2 Sound radiation

Once we have the mode shapes and modal frequencies of an object, we compute the approximate acoustic transfer for the object similarly to [James et al. 2006], but use a different equivalent source representation. We first compute a manifold, closed offset surface around the object. This defines a clear interior in which to place the multipole source, and also serves as the boundary ∂Ω on which BEM solves the Helmholtz equation to obtain the pressure p at the N vertices of the surface.

We then use a single-point multipole expansion to match the pressure values p on ∂Ω. This is performed by fixing the position of the multipole and iteratively increasing the order of the multipole until the error is below a certain threshold ε. This step has to be repeated for each modal frequency, with the order generally increasing with the modal frequency.
Since we are using a geometric sound propagator in the runtime stage of our pipeline, using a single-point multipole (per mode) makes it possible to use just one geometric propagation source for all modes. In theory, each multipole should be represented as a different geometric propagation source; but since all the multipoles were kept at the same position during BEM pressure evaluation on the offset surface, we can use just one geometric propagation source and use the modal frequency ωᵢ as the filter to scale the pressure at a point. This makes interactive runtime performance possible.
For a single-point multipole, Eq. (9) simplifies to:

p(y) = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{lm} φ_{lm}(y).   (12)
Since no optimal strategy exists for the placement of the multipole source [Ochmann 1995], we chose the center of our modal object as the source location. This is in stark contrast to [James et al. 2006; Mehra et al. 2013], who used a hierarchical source placement algorithm to minimize the residual error. We maintain the same error thresholds as they do, but simplify the problem by increasing the order L iteratively and checking the pressure
residual ||r||₂ < ε, where r = p − Ac, A is an N-by-L² multipole basis matrix, and c ∈ ℂ^{L²} is the complex coefficient vector. Once we match the BEM pressure on the offset surface for each mode, we place one spherical sound source for the geometric propagation of all the modes at the same position as our multipoles.
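The order-increase loop and least-squares fit described above can be sketched as follows. This is a low-order toy sketch: the function names are ours, the harmonics are hard-coded up to dipole order to keep it self-contained, and the coordinate conventions are our own assumptions rather than the paper's.

```python
# Sketch of the single-point multipole fit: build the N-by-L^2 basis
# matrix A of radiating multipole fields at offset-surface points,
# solve A c ~= p by least squares, and grow the order L until the
# relative residual ||p - A c|| / ||p|| drops below eps.
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def h2(l, z):
    """Spherical Hankel function of the second kind."""
    return spherical_jn(l, z) - 1j * spherical_yn(l, z)

def sph_harm_lm(l, m, theta, phi):
    """Complex spherical harmonics, hard-coded for l <= 1."""
    if l == 0:
        return np.full_like(theta, 0.5 * np.sqrt(1 / np.pi), dtype=complex)
    if l == 1 and m == -1:
        return 0.5 * np.sqrt(1.5 / np.pi) * np.sin(theta) * np.exp(-1j * phi)
    if l == 1 and m == 0:
        return 0.5 * np.sqrt(3 / np.pi) * np.cos(theta) * np.ones_like(phi, dtype=complex)
    if l == 1 and m == 1:
        return -0.5 * np.sqrt(1.5 / np.pi) * np.sin(theta) * np.exp(1j * phi)
    raise NotImplementedError("sketch covers monopole and dipole only")

def multipole_basis(points, k, L):
    """Columns are h2_l(k r) * Y_lm(theta, phi); source at the origin."""
    r = np.linalg.norm(points, axis=1)
    theta = np.arccos(points[:, 2] / r)              # polar angle
    phi = np.arctan2(points[:, 1], points[:, 0])     # azimuth
    return np.column_stack([h2(l, k * r) * sph_harm_lm(l, m, theta, phi)
                            for l in range(L) for m in range(-l, l + 1)])

def fit_multipole(points, p, k, eps=1e-6, L_max=2):
    for L in range(1, L_max + 1):
        A = multipole_basis(points, k, L)            # N-by-L^2 matrix
        c, *_ = np.linalg.lstsq(A, p, rcond=None)
        if np.linalg.norm(p - A @ c) < eps * np.linalg.norm(p):
            break                                    # residual small enough
    return c, L
```

In the actual pipeline the target pressure p comes from BEM on the offset surface; here any sampled complex pressure field can be fitted.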
3.2.3 Sound propagation

Given a single-point multipole source, we can use either a wave-based or a geometric sound propagation scheme to propagate the source's radiation pattern into the environment. We describe existing techniques that can be used with our system.
Wave-based Propagation: Frequency-domain numerical methods like [Mehra et al. 2013] use the equivalent source method to compute the pressure field on a domain. They decompose the scene into well-separated objects and compute per-object and inter-object transfer functions. The per-object transfer function, which maps the incoming sound field incident on an object A to the outgoing field, is defined as:

f(Φ_A^in) = T_A Φ_A^out,   (13)

where Φ_A^in is the vector of multipoles representing the incident field on an object, Φ_A^out is the vector of outgoing multipoles representing the scattered field, and T_A is the scattering matrix containing the (complex) coefficients of the outgoing multipole sources. Similarly, the inter-object transfer function for a pair of objects A and B is defined as:

g_A^B(Φ_A^out) = G_A^B Φ_B^in,   (14)

where G_A^B is the interaction matrix and contains the (complex) coefficients for mapping an outgoing field from object A to object B. In general, G_A^B ≠ G_B^A. For more details on this technique, refer to [Mehra et al. 2013].
The single-point multipole source can be used to represent the incident field on an object for each modal frequency, which can then be approximated using the incoming multipoles Φ_A^in and used in Eq. 7 to get the per-object and inter-object transfer functions.
Geometric Propagation: These methods assume that the wavelength of sound is much smaller than the size of the features in the scene, and then treat sound as rays, frustums, or beams. Wave effects like diffraction are modeled separately using geometric approximations. We make use of the ray-based sound propagation system of [Schissler et al. 2014] to compute the paths along which sound can travel through the scene. This system combines path tracing with a cache of diffuse sound paths to reduce the number of rays required for an interactive simulation. The approach begins by tracing a small number (e.g., 500) of rays uniformly in all directions from each sound source. These rays strike surfaces and are reflected recursively up to a specified maximum reflection depth (e.g., 50). The reflected rays are computed using vector-based scattering [Christensen and Koutsouris 2013], where the resulting rays are a linear combination of the specularly reflected rays and random Lambertian-distributed rays. The listener is modeled as a sphere the same size as a human head. At each ray-triangle intersection, the visibility of the listener sphere is sampled by tracing a few additional rays towards the listener. If some fraction of the rays are not occluded, a path to the listener is produced. A path contains the following output data: the total distance the ray traveled, r, along with the attenuation factor α due to reflection and diffraction interactions. Diffracted sound is computed separately using the UTD diffraction model [Tsingos et al. 2001]. Given the output of the geometric propagation system, we can evaluate the sound pressure as:

p(x) = Σ_{r∈R} p_r(x),   (15)
where p_r is the contribution from a ray r in a set of rays R. We model a multipole Ψᵢ using the rays R as:

Σ_{i=1}^{M} Ψᵢ(x) = p(x) = Σ_{r∈R} p_r(x),  x ∈ Ω,   (16)

where Ψᵢ = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{ilm} φ_{ilm}(y) for i = 1, ..., M. This coupling lets us calculate the pressure for a set of ray directions sampling a sphere uniformly: for a ray direction (θ, φ) traveling a distance r, its pressure is scaled by ψ(θ, φ), h_l⁽²⁾(kr), and α(ωᵢ), where α(ωᵢ) is the energy of a ray at modal frequency ωᵢ. We use a geometric ray-tracing-based system to get the paths and their respective energies.
Path Clustering: Although using a single geometric source reduces the number of rays considerably, we still need a considerable number of rays (≥ 15,000) to capture the acoustic phenomena correctly, which makes the computation too slow for a modal sound source even with a few sounding modes (M ≥ 20). We solve this problem by clustering the rays based on the angle between them and their respective time delays. We bin the IR (impulse response) according to a user-specified bin size t in seconds (Fig. 3). Then, for each bin, we cluster the rays based on the binning angle ϑ. The binning algorithm is shown in Algorithm 1.
Figure 3: Path Clustering. Top: sound intensity vs. delay time for the original paths. Bottom: sound intensity vs. delay time after binning into intervals of width δt (0, δt, 2δt, 3δt, ...).
Algorithm 1 PathBinning(t, ϑ)

1: maxNumberOfBins ← ceil(IR.length() / t)
2: bins.setSize(maxNumberOfBins)
3: for each ray r ∈ R do                        ▷ ray directions are normalized
4:     Sr ← r.direction()
5:     binIndex ← floor(r.delay() / t)
6:     bin ← bins[binIndex]
7:     added ← false
8:     for each cluster ∈ bin do                ▷ cluster directions are normalized
9:         Sc ← cluster.direction()
10:        ▷ check if the angle between the two vectors is less than the cluster angle ϑ
11:        if Sc · Sr > cos(ϑ) then
12:            cluster.add(r); added ← true; break
13:        end if
14:    end for
15:    ▷ if the ray was not compatible with any of the clusters, create a new one and add the path to it
16:    if not added then
17:        newCluster ← Cluster(Sr)
18:        newCluster.add(r)
19:        bin.add(newCluster)
20:    end if
21: end for
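A compact Python rendering of the same binning-and-clustering idea (the ray representation as (direction, delay) tuples and the greedy first-fit clustering are illustrative choices):

```python
# Sketch of Algorithm 1: bin propagation paths by arrival time, then
# cluster each bin's rays by direction against a cluster angle.
import math

def path_binning(rays, bin_width, cluster_angle):
    """rays: list of (direction, delay); direction is a unit 3-vector tuple."""
    cos_thr = math.cos(cluster_angle)
    bins = {}                                   # bin index -> list of clusters
    for direction, delay in rays:
        clusters = bins.setdefault(int(delay // bin_width), [])
        for cluster in clusters:                # greedy first-fit by angle
            cx, cy, cz = cluster["dir"]
            dot = cx * direction[0] + cy * direction[1] + cz * direction[2]
            if dot > cos_thr:                   # within the cluster angle
                cluster["rays"].append((direction, delay))
                break
        else:                                   # no compatible cluster found
            clusters.append({"dir": direction,
                             "rays": [(direction, delay)]})
    return bins
```

Each cluster can then be treated as a single path when evaluating the per-ray pressure terms, which is where the savings come from.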
Auralization: The last stage of the pipeline is computing the listener response for all the modes. We compute this response by
convolving the time-domain impulse response of each mode with that mode's amplitude. The final signal O(t) is:

O(t) = Σ_{i=1}^{M} qᵢ(t) ∗ IRᵢ(t),   (17)

where IRᵢ is the impulse response of the i-th mode, qᵢ(t) is the amplitude of the i-th mode, and ∗ is the convolution operator.
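Eq. (17) can be sketched directly with plain convolution (the paper's runtime uses streaming partitioned convolution instead; `auralize` is our own illustrative name):

```python
# Sketch of the auralization step (Eq. 17): the output is the sum over
# modes of each mode's amplitude signal convolved with that mode's IR.
import numpy as np

def auralize(q, irs):
    """q: (M, T) modal amplitude signals; irs: list of M impulse responses."""
    out_len = q.shape[1] + max(len(ir) for ir in irs) - 1
    out = np.zeros(out_len)
    for qi, ir in zip(q, irs):
        y = np.convolve(qi, ir)                 # per-mode full convolution
        out[:len(y)] += y                       # sum into the final signal
    return out
```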
4 Implementation

In this section, we describe the implementation details of our system. All the runtime code was written in C++ and timed on a 16-core Intel Xeon E5-2687W @ 3.1 GHz desktop with 64 GB of RAM running Windows 7 64-bit. In the preprocessing stage, the offset surface generation and eigendecomposition code was written in C++, while the single-point multipole expansion was written in MATLAB.
Preprocessing: We use a finite element technique to compute the stiffness matrix K; it takes the tetrahedralized model, Young's modulus, and Poisson's ratio of the sounding object and computes the stiffness matrix for the object. We then perform the eigenvalue decomposition of the system using Intel's MKL library (DSYEV) and calculate the modal displacements, frequencies, and amplitudes in C++. The code to find the multipole strengths was written in MATLAB; the pressure on the offset surface was calculated with a fast BEM solver (FastBEM) using FMM-BEM (the fast multipole method).
Sound Propagation: We use a fast, state-of-the-art geometric ray tracer [Schissler et al. 2014] to get the paths for our pressure computation. This technique is capable of handling very high orders of diffuse and specular reflections (e.g., 10 orders of specular reflections and 50 orders of diffuse reflections) while maintaining interactive performance. As mentioned in the previous section, we cluster the rays in order to reduce the number of rays in the scene; but even with that, the pressure computation for each ray (i.e., the spherical harmonics and the Hankel functions) has to be heavily optimized to meet the interactive performance requirements.
Spherical Harmonic computation: The number of spherical harmonics computed per ray varies as O(L²), making naive evaluation too slow for an interactive runtime. We used a modified version of publicly available fast spherical harmonic code [Sloan 2013] to compute the pressure contribution of each ray. The available code computes only the real spherical harmonics, making extensive use of SSE (Streaming SIMD Extensions). We find the complex spherical harmonics from the real ones following a simple observation:

Y_l^m = (1/√2) (Re(Y_l^m) + i Re(Y_l^{−m})),  m > 0,   (18)
Y_l^m = (1/√2) (Re(Y_l^m) − i Re(Y_l^{−m})) (−1)^m,  m < 0.   (19)

Using this optimized code gives us a 2-3 orders of magnitude speedup compared to existing spherical harmonic implementations, e.g., Boost.
Distance Clustering: Even after the significant speedup in spherical harmonic evaluation, Hankel functions must be computed for each ray, with cost growing linearly in the order of the multipole. We address this by clustering the paths, as in the previous section, based on the distance they travel through the environment. Given a user-defined bin size δd and an impulse response of length t seconds, we cluster the ray distances into N_Hankel = t/δd bins, requiring an order of magnitude fewer computations. The Hankel functions are evaluated using BOOST.
Parallel computation of mode pressure: Since each mode is independent of the others, the pressure computation for each can be done in parallel. Lower modes generally take less time to evaluate than higher ones, so we use a simple, scene-dependent load-balancing scheme to divide the work equally among all 16 cores. We use OpenMP for parallelization on a multi-core system.
Real-Time Auralization: The final audio for the simulations is rendered using a streaming, partitioned convolution technique. All audio rendering is performed at a sampling rate of 48 kHz. We first construct an impulse response (IR) for each mode from the computed pressures of the paths returned by the propagation system, which incorporate the effects of the single-point multipole expansion. The IR is initialized to zero, and the pressure for each path is added to the IR at the sample index corresponding to that path's delay. Once constructed, the IRs for all modes are passed to the convolution system for auralization, where they are converted to the frequency domain. During audio rendering, the time-domain input audio for each mode is converted to the frequency domain and multiplied with the corresponding IR partition coefficients. The inverse FFT of the result is computed and accumulated using overlap-add into a circular output buffer. The audio device reads from the circular buffer at the current position and plays back the rendered sound.
5 Conclusion
We present the first coupled sound synthesis-propagation algorithm that can generate realistic sound effects for computer games and virtual reality. We describe an approach that integrates prior methods for modal sound synthesis, sound radiation, and sound propagation. The radiating sound fields are represented in a compact basis using a single-point multipole expansion. We perform sound propagation from this source basis using fast ray tracing to compute the impulse responses, and convolve them with the modes to generate the final sound at the listener. The resulting system has been integrated with the Unity game engine, and we highlight its performance in many indoor and outdoor scenes. Overall, this is the first system that successfully combines these methods and can handle a high degree of dynamism in terms of source radiation and propagation in complex scenes.
Our approach has some limitations. It is limited to rigid objects and modal sounds. Moreover, the time complexity tends to increase with the mode frequency, and our single-point multipole expansion approach can result in very high orders of multipoles. The geometric sound propagation algorithm may not compute low-frequency effects (e.g., diffraction) accurately in all environments. The wave-based sound propagation algorithm, in turn, involves a high precomputation overhead and is limited to static scenes. Currently, we do not perform any mode compression, so many closely spaced modes are generated. We could use compression algorithms [Raghuvanshi and Lin 2006; Langlois et al. 2014] to reduce the number of modes and thus the overhead of pressure computation. Our preprocessing stage takes a long time, most of which is spent in the eigendecomposition of the stiffness matrix.
There are many avenues for future work. In addition to overcoming these limitations, we would like to use our system in more complex indoor and outdoor environments and to generate other sound effects for complex objects in large environments (e.g., a bell ringing over a large, outdoor valley). We would like to explore approximate solutions to accelerate the free-space acoustic transfer computation. It would also be useful to include directional sources and to accelerate the computations using iterative algorithms such as Arnoldi iteration [ARPACK].
References
ALLEN, J. B., AND BERKLEY, D. A. 1979. Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America 65, 4 (April), 943–950.
ANTANI, L., CHANDAK, A., TAYLOR, M., AND MANOCHA, D. 2012. Efficient finite-edge diffraction using conservative from-region visibility. Applied Acoustics 73, 218–233.
ANTANI, L., CHANDAK, A., SAVIOJA, L., AND MANOCHA, D. 2012. Interactive sound propagation using compact acoustic transfer operators. ACM Trans. Graph. 31, 1 (Feb.), 7:1–7:12.
ARPACK. http://www.caam.rice.edu/software/ARPACK/.
BORISH, J. 1984. Extension to the image model to arbitrary polyhedra. The Journal of the Acoustical Society of America 75, 6 (June), 1827–1836.
CHADWICK, J. N., AN, S. S., AND JAMES, D. L. 2009. Harmonic shells: a practical nonlinear sound model for near-rigid thin shells. In ACM SIGGRAPH Asia 2009 papers, ACM, New York, NY, USA, SIGGRAPH Asia '09, 119:1–119:10.
CHAIGNE, A., AND DOUTAUT, V. 1997. Numerical simulations of xylophones. I. Time domain modeling of the vibrating bars. J. Acoust. Soc. Am. 101, 1, 539–557.
CHANDAK, A., LAUTERBACH, C., TAYLOR, M., REN, Z., AND MANOCHA, D. 2008. AD-Frustum: Adaptive frustum tracing for interactive sound propagation. IEEE Trans. Visualization and Computer Graphics 14, 6, 1707–1722.
CHENG, A., AND CHENG, D. 2005. Heritage and early history of the boundary element method. Engineering Analysis with Boundary Elements 29, 3 (Mar.), 268–302.
CHRISTENSEN, C., AND KOUTSOURIS, G. 2013. Odeon manual, chapter 6.
CISKOWSKI, R. D., AND BREBBIA, C. A. 1991. Boundary element methods in acoustics. Computational Mechanics Publications, Southampton, Boston.
FAIRWEATHER, G. 2003. The method of fundamental solutions for scattering and radiation problems. Engineering Analysis with Boundary Elements 27, 7 (July), 759–769.
FASTBEM Acoustics. http://www.fastbem.com/fastbemacoustics.html.
FLORENS, J. L., AND CADOZ, C. 1991. The physical model: modeling and simulating the instrumental universe. In Representations of Musical Signals, G. D. Poli, A. Piccialli, and C. Roads, Eds. MIT Press, Cambridge, MA, USA, 227–268.
FRANZONI, L. P., BLISS, D. B., AND ROUSE, J. W. 2001. An acoustic boundary element method based on energy and intensity variables for prediction of high-frequency broadband sound fields. The Journal of the Acoustical Society of America 110, 3071.
FUNKHOUSER, T., CARLBOM, I., ELKO, G., PINGALI, G., SONDHI, M., AND WEST, J. 1998. A beam tracing approach to acoustic modeling for interactive virtual environments. In Proc. of ACM SIGGRAPH, 21–32.
JAMES, D. L., BARBIC, J., AND PAI, D. K. 2006. Precomputed acoustic transfer: output-sensitive, accurate sound generation for geometrically complex vibration sources. In ACM SIGGRAPH 2006 Papers, ACM, New York, NY, USA, SIGGRAPH '06, 987–995.
KOUYOUMJIAN, R. G., AND PATHAK, P. H. 1974. A uniform geometrical theory of diffraction for an edge in a perfectly conducting surface. Proceedings of the IEEE 62, 11, 1448–1461.
KROKSTAD, A., STROM, S., AND SORSDAL, S. 1968. Calculating the acoustical room response by the use of a ray tracing technique. Journal of Sound and Vibration 8, 1 (July), 118–125.
KROPP, W., AND SVENSSON, P. U. 1995. Application of the time domain formulation of the method of equivalent sources to radiation and scattering problems. Acta Acustica united with Acustica 81, 6, 528–543.
LANGLOIS, T. R., AN, S. S., JIN, K. K., AND JAMES, D. L. 2014. Eigenmode compression for modal sound models. ACM Transactions on Graphics (TOG) 33, 4, 40.
LENTZ, T., SCHRODER, D., VORLANDER, M., AND ASSENMACHER, I. 2007. Virtual reality system with integrated sound field simulation and reproduction. EURASIP Journal on Advances in Signal Processing 2007 (January), 187–187.
LIU, Q. H. 1997. The PSTD algorithm: A time-domain method combining the pseudospectral technique and perfectly matched layers. The Journal of the Acoustical Society of America 101, 5, 3182.
MEHRA, R., RAGHUVANSHI, N., ANTANI, L., CHANDAK, A., CURTIS, S., AND MANOCHA, D. 2013. Wave-based sound propagation in large open scenes using an equivalent source formulation. ACM Trans. Graph. (Apr.).
MOSS, W., YEH, H., HONG, J.-M., LIN, M. C., AND MANOCHA, D. 2010. Sounding liquids: Automatic sound synthesis from fluid simulation. ACM Trans. Graph. 29, 3, 1–13.
O'BRIEN, J. F., COOK, P. R., AND ESSL, G. 2001. Synthesizing sounds from physically based motion. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 529–536.
O'BRIEN, J. F., SHEN, C., AND GATCHALIAN, C. M. 2002. Synthesizing sounds from rigid-body simulations. In The ACM SIGGRAPH 2002 Symposium on Computer Animation, ACM Press, 175–181.
OCHMANN, M. 1995. The source simulation technique for acoustic radiation problems. Acustica 81, 512–527.
OCHMANN, M. 1999. The full-field equations for acoustic radiation and scattering. The Journal of the Acoustical Society of America 105, 5, 2574–2584.
PAVIC, G. 2006. A technique for the computation of sound radiation by vibrating bodies using multipole substitute sources. Acta Acustica united with Acustica 92, 112–126(15).
PIERCE, A. D., ET AL. 1981. Acoustics: an introduction to its physical principles and applications. McGraw-Hill, New York.
RAGHUVANSHI, N., AND LIN, M. C. 2006. Interactive sound synthesis for large scale environments. In SI3D '06: Proceedings of the 2006 symposium on Interactive 3D graphics and games, ACM Press, New York, NY, USA, 101–108.
RAGHUVANSHI, N., NARAIN, R., AND LIN, M. C. 2009. Efficient and accurate sound propagation using adaptive rectangular decomposition. Visualization and Computer Graphics, IEEE Transactions on 15, 5, 789–801.
RAGHUVANSHI, N., SNYDER, J., MEHRA, R., LIN, M. C., AND GOVINDARAJU, N. K. 2010. Precomputed wave simulation for real-time sound propagation of dynamic sources in complex scenes. ACM Trans. Graph. 29, 3 (July).
REN, Z., MEHRA, R., COPOSKY, J., AND LIN, M. C. 2012. Tabletop Ensemble: touch-enabled virtual percussion instruments. In Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, ACM, 7–14.
SAKAMOTO, S., USHIYAMA, A., AND NAGATOMO, H. 2006. Numerical analysis of sound propagation in rooms using the finite difference time domain method. The Journal of the Acoustical Society of America 120, 5, 3008–3008.
SCHISSLER, C., MEHRA, R., AND MANOCHA, D. 2014. High-order diffraction and diffuse reflections for interactive sound propagation in large environments. In Proc. of ACM SIGGRAPH.
SILTANEN, S., LOKKI, T., KIMINKI, S., AND SAVIOJA, L. 2007. The room acoustic rendering equation. The Journal of the Acoustical Society of America 122, 3 (September), 1624–1635.
SLOAN, P.-P. 2013. Efficient spherical harmonic evaluation. Journal of Computer Graphics Techniques (JCGT) 2, 2 (September), 84–90.
SVENSSON, U. P., FRED, R. I., AND VANDERKOOY, J. 1999. An analytic secondary source model of edge diffraction impulse responses. Acoustical Society of America Journal 106 (Nov.), 2331–2344.
TAFLOVE, A., AND HAGNESS, S. C. 2005. Computational Electrodynamics: The Finite-Difference Time-Domain Method, Third Edition, 3 ed. Artech House Publishers, June.
TAYLOR, M., CHANDAK, A., MO, Q., LAUTERBACH, C., SCHISSLER, C., AND MANOCHA, D. 2012. Guided multiview ray tracing for fast auralization. IEEE Transactions on Visualization and Computer Graphics 18, 1797–1810.
THOMPSON, L. L. 2006. A review of finite-element methods for time-harmonic acoustics. The Journal of the Acoustical Society of America 119, 3, 1315–1330.
TSINGOS, N., FUNKHOUSER, T., NGAN, A., AND CARLBOM, I. 2001. Modeling acoustics in virtual environments using the uniform theory of diffraction. In Proc. of ACM SIGGRAPH, 545–552.
VAN DEN DOEL, K., AND PAI, D. K. 1996. Synthesis of shape dependent sounds with physical modeling. In Proceedings of the International Conference on Auditory Displays.
VAN DEN DOEL, K., AND PAI, D. K. 1998. The sounds of physical shapes. Presence 7, 4, 382–395.
VAN DEN DOEL, K., KRY, P. G., AND PAI, D. K. 2001. FoleyAutomatic: physically-based sound effects for interactive simulation and animation. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 537–544.
VON ESTORFF, O. 2000. Boundary elements in acoustics: advances and applications, vol. 9. WIT Press/Computational Mechanics.
VORLANDER, M. 1989. Simulation of the transient and steady-state sound propagation in rooms using a new combined ray-tracing/image-source algorithm. The Journal of the Acoustical Society of America 86, 1, 172–178.
YEE, K. 1966. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media. IEEE Transactions on Antennas and Propagation 14, 3 (May), 302–307.
YEH, H., MEHRA, R., REN, Z., ANTANI, L., MANOCHA, D., AND LIN, M. 2013. Wave-ray coupling for interactive sound propagation in large complex scenes. ACM Trans. Graph. 32, 6, 165:1–165:11.
ZHENG, C., AND JAMES, D. L. 2009. Harmonic fluids. ACM Trans. Graph. 28, 3, 1–12.
ZHENG, C., AND JAMES, D. L. 2010. Rigid-body fracture sound with precomputed soundbanks. In SIGGRAPH '10: ACM SIGGRAPH 2010 papers, ACM, New York, NY, USA, 1–13.