Online Submission ID: 0
SynCoPation: Synthesis Coupled Sound Propagation
Figure 1: Our coupled sound synthesis-propagation technique has been integrated in the Unity™ game engine. We plan to demonstrate the sound effects generated by our system on a variety of scenarios: (a) Cathedral, (b) Tuscany, and (c) Game scene. In the leftmost case, the bowl sounds are synthesized and propagated in the cathedral; in the middle scene, the bell sounds are synthesized and propagated in an outdoor scene; in the last scene, the sounds of a barrel hitting the ground are synthesized and propagated.
Abstract

Sounds can augment the sense of presence and immersion of users and improve their experience in virtual environments. Recent research in sound simulation and rendering has focused either on sound synthesis or on sound propagation, and many standalone algorithms have been developed for each domain. We present a novel technique for automatically generating aural content for virtual environments based on an integrated scheme that can perform sound synthesis as well as sound propagation. Our coupled approach can generate sounds from rigid bodies based on the audio modes and radiation coefficients, and interactively propagate them through the environment to generate acoustic effects. Our integrated system allows a high degree of dynamism: it can simultaneously support dynamic sources, dynamic listeners, and dynamic directivity. Furthermore, our approach can be combined with wave-based and geometric sound propagation algorithms to compute environmental effects. We have integrated our system with the Unity game engine and show the effectiveness of fully-automatic audio content creation in complex indoor and outdoor scenes.
1 Introduction

Sound simulation algorithms predict the behavior of sound waves, including the generation of sound waves from vibrations and the propagation of sound waves in the environment. Realistic sound simulation is important in computer games to increase the level of immersion and realism. Sound augments the visual sense of the player, provides spatial cues about the environment, and can improve the overall gaming experience. At a broad level, prior research in sound simulation can be classified into two parts: synthesis and propagation. The problem of sound synthesis deals with simulating the physical processes (e.g., the vibration of a sound source) involved in the generation of sound. Sound propagation, on the other hand, deals with the behavior of sound waves as they are emitted by the source, interact with the environment, and reach the listener.
State-of-the-art techniques for sound simulation deal with the problems of sound synthesis and sound propagation independently. Sound synthesis techniques model the generation of sound resulting from vibration analysis of the structure of the object [Zheng and James 2009; Chadwick et al. 2009; Moss et al. 2010; Zheng and James 2010]. However, these techniques model only sound propagation in free space (empty space), and the acoustic effects generated by the environment are mostly ignored. Sound propagation techniques [Krokstad et al. 1968; Allen and Berkley 1979; Funkhouser et al. 1998; Raghuvanshi et al. 2010; Mehra et al. 2013; Yeh et al. 2013] model the interaction of sound waves in indoor and outdoor spaces, but assume pre-recorded or pre-synthesized audio clips as input. These assumptions can result in missing sound effects and generate inaccurate (or non-plausible) approximations of the underlying physical reality produced by the process of sound simulation. For example, consider the case of a kitchen bowl falling from a countertop: the change in the directivity of the bowl with the hit position, and the effect of this time-varying directivity on the propagated sound field in the kitchen, is mostly ignored by current simulation techniques. Similarly, for a barrel rolling down an alley, the sound consists of multiple frequencies, each with different radiation and propagation characteristics, which are mostly ignored by current sound simulation systems. Due to these limitations, artists and game audio designers have to manually design sound effects corresponding to these different kinds of scenarios, which can be very tedious and time-consuming.
In this paper, we present the first coupled synthesis and propagation system that models the entire process of sound simulation, starting from the surface vibration of objects, through the radiation of sound waves from these surface vibrations, to the interaction of the resulting sound waves with the environment. Our technique models the surface vibration characteristics of an object by performing modal analysis using the finite element method. These surface vibrations are used as boundary conditions for a Helmholtz equation solver (based on the boundary element method) to generate outward-radiating sound fields. These radiating sound fields are expressed in a compact basis using the single-point multipole expansion [Ochmann 1999]. Mathematically, this single-point multipole expansion corresponds to a single sound source placed inside the object. The sound propagation due to this source is computed using a numerical sound simulation technique (at low frequencies) and ray tracing (at high frequencies). We also describe techniques to accelerate ray-tracing algorithms based on path clustering and binning. Our approach performs end-to-end sound simulation from first principles and enables automatic sound effect generation for interactive applications, thereby reducing the manual effort and time spent by artists and game-audio designers.
The main contributions of our work on coupled sound synthesis-propagation include:
1. An integrated technique for accurately simulating the effect of time-varying directivity.

2. High accuracy achieved by correct phase computation and per-frequency modeling of sound vibration, radiation, and propagation.

3. An interactive runtime that handles a high degree of dynamism, e.g., dynamic surface vibrations, dynamic sound radiation, and sound propagation for dynamic sources and listeners.
We plan to integrate our technique with the Unity™ game engine and demonstrate the effect of coupled sound synthesis-propagation on a variety of indoor and outdoor scenarios, as shown in Fig. 1.
2 Related Work

In this section, we review some of the most closely related work on sound synthesis, radiation, and propagation techniques.
2.1 Sound Synthesis

CORDIS-ANIMA was perhaps the first system to model surface vibration with damped springs and masses in order to synthesize physically-based sounds [Florens and Cadoz 1991]. Numerical integration using a finite element formulation was later presented as a more accurate technique for modeling vibrations [Chaigne and Doutaut 1997; O'Brien et al. 2001]. Instead of using numerical integration, [van den Doel and Pai 1996; van den Doel and Pai 1998] proposed computing the analytical vibration modes of an object, leading to considerable speedups and enabling real-time sound synthesis.
[van den Doel et al. 2001] introduced the first method to determine the vibration modes and their dependence on the point of impact for a given shape, based on physical measurements. Later, [O'Brien et al. 2002] presented a general algorithm to determine the vibration modes of arbitrarily-shaped 3D objects by discretizing them into tetrahedral volume elements. They showed that the corresponding finite element equations can be solved analytically after suitable approximations. Consequently, they were able to model arbitrarily-shaped objects and simulate realistic sounds for a few objects at interactive rates [O'Brien et al. 2002]. This approach requires an expensive precomputation called modal analysis. [Raghuvanshi and Lin 2006a] used a simpler spring-mass system along with perceptually motivated acceleration techniques to recreate realistic sound effects for hundreds of objects in real time. In this paper, we use an FEM-based method to precompute the vibration modes, similar to [O'Brien et al. 2002].
[Ren et al. 2012] presented an interactive virtual percussion instrument system that used modal synthesis as well as numerical sound propagation for modeling a small instrument cavity. Despite some apparent similarity, this work is quite different from our coupled approach: their combination of synthesis and propagation is not tightly coupled or integrated, and the volume of the underlying acoustic spaces is small in comparison to typical game scenes (e.g., the benchmarks shown in Fig. 1).
2.2 Sound Radiation

The Helmholtz equation is the standard way to model sound radiating from vibrating rigid bodies. The boundary element method is widely used for acoustic radiation problems [Ciskowski and Brebbia 1991; von Estorff 2000], but has a major drawback in terms of memory requirements, i.e., O(N²) memory for N boundary elements. An efficient technique known as the equivalent source method (ESM) [Fairweather 2003; Kropp and Svensson 1995; Ochmann 1999; Pavic 2006] exploits the uniqueness of solutions to the acoustic boundary value problem. ESM expresses the solution field as a linear combination of simple radiating point sources of various orders (monopoles, dipoles, etc.) by placing these simple sources at variable locations inside the object and matching the total generated field with the boundary conditions on the object's surface, guaranteeing the correctness of the solution. [James et al. 2006] use the equivalent source method to compute the radiated field generated by a vibrating object.
2.3 Sound Propagation

Sound is a pressure wave described by the Helmholtz equation for a domain Ω:

∇²p + (ω²/c²) p = 0,  x ∈ Ω,   (1)

where p(x) is the complex-valued pressure field, ω is the angular frequency, c is the constant speed of sound in a homogeneous medium, and ∇² is the Laplacian operator. Boundary conditions are specified on the boundary of the domain ∂Ω using either the Dirichlet boundary condition, which specifies the pressure on the boundary, p = f(x) on ∂Ω; the Neumann boundary condition, which specifies the velocity of the medium, ∂p(x)/∂n = f(x) on ∂Ω; or a mixed boundary condition, which specifies Z ∈ ℂ such that Z ∂p(x)/∂n = f(x) on ∂Ω. We also need to specify the behavior of p at infinity, which is usually done using the Sommerfeld radiation condition [Pierce et al. 1981]:

lim_{r→∞} [∂p/∂r + i (ω/c) p] = 0,   (2)

where r = ||x|| is the distance of point x from the origin.
Different methods exist to solve this equation under different formulations. Numerical methods solve for p numerically by discretizing either the entire domain or its boundary. Geometric techniques model p as a set of rays and propagate these rays through the environment.
2.3.1 Wave-based Sound Propagation

Wave-based or numerical sound propagation solves the acoustic wave equation using numerical wave solvers. These methods capture the exact behavior of a propagating sound wave in a domain. Numerical wave solvers discretize space and time to solve the wave equation. Typical techniques include the finite difference time domain (FDTD) method [Yee 1966; Taflove and Hagness 2005; Sakamoto et al. 2006], the finite element method [Thompson 2006], the boundary element method [Cheng and Cheng 2005], the pseudo-spectral time domain method [Liu 1997], and domain decomposition [Raghuvanshi et al. 2009]. Wave-based methods have high accuracy and can simulate wave effects such as diffraction accurately at low frequencies. However, their memory and compute requirements grow as the third or fourth power of the frequency, making them impractical for interactive applications.
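As a concrete illustration of the wave-based family, the update rule of a 1D FDTD leapfrog solver for the scalar wave equation can be sketched as below. This is a minimal toy sketch, not one of the solvers cited above; the grid sizes and CFL number are illustrative choices.

```python
# Minimal 1D FDTD sketch: leapfrog update of p_tt = c^2 p_xx on a
# uniform grid with Dirichlet (p = 0) boundaries at both ends.
import numpy as np

def fdtd_1d(n=200, steps=300, c=343.0, dx=0.1, cfl=0.9):
    dt = cfl * dx / c                      # CFL-stable time step
    lam2 = (c * dt / dx) ** 2
    p_prev = np.zeros(n)
    p = np.zeros(n)
    p[n // 2] = 1.0                        # impulsive point excitation
    for _ in range(steps):
        p_next = np.zeros(n)
        # second-order central differences in time and space
        p_next[1:-1] = (2 * p[1:-1] - p_prev[1:-1]
                        + lam2 * (p[2:] - 2 * p[1:-1] + p[:-2]))
        p_prev, p = p, p_next
    return p
```

The cubic-to-quartic cost growth mentioned above comes from refining such a grid in three spatial dimensions plus time as the frequency increases.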
2.3.2 Geometric Sound Propagation

Geometric sound propagation techniques use the simplifying assumption that the wavelength of sound is much smaller than the features in the scene. As a result, these methods are most accurate at high frequencies and must model low-frequency effects like diffraction and scattering as separate phenomena. Commonly used
techniques are based on image source methods [Allen and Berkley 1979; Borish 1984] and ray tracing [Krokstad et al. 1968; Vorlander 1989]. Recently, there has been a focus on computing realistic acoustics in real time using algorithms designed for fast simulation. These include beam tracing [Funkhouser et al. 1998], frustum tracing [Chandak et al. 2008], and ray-based algorithms [Lentz et al. 2007; Taylor et al. 2012] that compute low-order reflections. In addition, frame-to-frame coherence of the sound field can be utilized to achieve a significant speedup [Schissler et al. 2014].
Edge diffraction effects can be approximated within GA frameworks using methods based on the uniform theory of diffraction (UTD) [Kouyoumjian and Pathak 1974] or the Biot-Tolstoy-Medwin (BTM) model [Svensson et al. 1999]. These approaches have been applied to static scenes and low-order diffraction [Tsingos et al. 2001; Antani et al. 2012a], as well as to dynamic scenes with first-order [Taylor et al. 2012] and higher-order diffraction [Schissler et al. 2014]. Diffuse reflection effects caused by surface scattering have previously been modeled using the acoustic rendering equation [Siltanen et al. 2007; Antani et al. 2012b] and radiosity-based methods [Franzoni et al. 2001]. Another commonly-used technique for ray tracing, called vector-based scattering, uses scattering coefficients to model diffusion [Christensen and Koutsouris 2013].
3 Our Algorithm

In this section, we give a brief background on various concepts used in the paper and present our coupled synthesis-propagation algorithm.
3.1 Background

Modal Analysis: Sound is produced by small vibrations of objects. These vibrations, although invisible to the naked eye, are audible if the frequency of vibration lies in the range of human hearing (20 Hz - 20 kHz). Modal analysis is a well-known technique for modeling such sounds in rigid bodies. The small vibrations can be modeled using a coupled linear system of ODEs:

Kd + Cḋ + Md̈ = f,   (3)

where K, C, and M are the stiffness, damping, and mass matrices, respectively, and f represents the (external) force vector. For small damping it is possible to approximate C as a combination of the mass and stiffness matrices: C = αM + βK. This facilitates the diagonalization of the above equation, which is represented as a generalized eigenvalue problem:

KU = ΛMU,   (4)

where Λ is the diagonal eigenvalue matrix and U contains the eigenvectors of K. Solving this eigenvalue problem enables us to write Eq. 3 as a system of decoupled oscillators:

q̈ + (αI + βΛ)q̇ + Λq = Uᵀf,   (5)

where U projects d into the modal subspace q, with d = Uq.
Acoustic transfer: The pressure p(x) at any point, obtained by solving Eq. (1), is called the acoustic transfer function. The acoustic transfer function gives the relation between the surface normal displacements at a surface node and the sound pressure at a given field point. A common method in acoustics for evaluating these transfer functions is the boundary element method (BEM) discussed above.
Since we are solving Eq. (1) in the frequency domain, we have to solve the exterior scattering problem for each mode separately. This can be achieved using a fast BEM solver and specifying the Neumann boundary condition:

∂p/∂n = −iωρv on S,   (6)

where S = ∂Ω (the boundary of the object), ρ is the fluid density, and v is the surface's normal velocity, given by v = iω(n · u), where n · u is the modal displacement in the normal direction. This boundary condition links the modal displacements with the pressure at a point. Unfortunately, BEM is not fast enough for an interactive runtime, necessitating the use of fast, approximate acoustic transfer functions [James et al. 2006].
To approximate the acoustic transfer, we use a source simulation technique called the equivalent source method. We represent a sound source using a collection of point sources (called equivalent sources) and match the pressure values on the boundary of the object ∂Ω with the pressure on ∂Ω calculated using BEM. The main idea is that if we can set the strengths of the equivalent sources to match the boundary pressure, we can evaluate the pressure at any point in Ω using these equivalent sources.
Equivalent sources: The uniqueness of the acoustic boundary value problem guarantees that the solution of the free-space Helmholtz equation along with the specified boundary conditions is unique inside Ω. The unique solution p(x) can be found by expressing it as a linear combination of fundamental solutions. One choice of fundamental solutions is based on equivalent sources. An equivalent source q(x, yᵢ) of the Helmholtz equation subject to the Sommerfeld radiation condition (x ≠ yᵢ) is the solution field induced at any point x due to a point source located at yᵢ, and can be expressed as:

q(x, yᵢ) = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{ilm} φ_{ilm}(x) = Σ_{k=1}^{L²} d_{ik} φ_{ik}(x),   (7)

where k is a generalized index for (l, m) and c_{ilm} is its strength. These fundamental solutions φ_{ik} are chosen to correspond to the field due to spherical multipole sources of order L (L = 1 being a monopole, L = 2 a dipole, and so on) located at yᵢ. Spherical multipoles are given as a product of two functions:

φ_{ilm}(x) = Γ_{lm} h_l⁽²⁾(k_i r_i) ψ_{lm}(θᵢ, φᵢ),   (8)

where (rᵢ, θᵢ, φᵢ) is the vector (x − yᵢ) expressed in spherical coordinates, h_l⁽²⁾(k_i r_i) is the spherical Hankel function of the second kind, k_i is the wavenumber, given by ω_i/c, ψ_{lm}(θᵢ, φᵢ) are the complex-valued spherical harmonic functions, and Γ_{lm} is the normalizing factor for the spherical harmonics. The pressure at any point in Ω due to M equivalent sources located at {yᵢ}_{i=1}^{M} can be expressed as a linear combination:

p(y) = Σ_{i=1}^{M} Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{ilm} φ_{ilm}(y).   (9)

We have to determine the L² complex coefficients c_{ilm} for each of the M multipoles. This compact representation of the pressure p(y) makes it possible to evaluate the pressure at any point of the domain in an efficient manner.
3.2 Coupled Algorithm

We now discuss our coupled synthesis-propagation algorithm. As shown in Fig. 2, we start with the modal analysis of the sounding object, which gives the modal displacements, modal frequencies,
Figure 2: Overview of our coupled synthesis-propagation pipeline. The bowl is used as an example of a modal object. The first stage comprises the modal analysis; the figures in red show the first two sounding modes of the bowl. We then form an offset surface around the bowl, calculate the pressure on this offset surface, place a single multipole at the center of the object, and approximate the BEM-evaluated pressure. In the runtime part of the pipeline, we use the multipole to couple with a propagation system and generate the final sound at the listener.
and modal amplitudes. We use these mode shapes as boundary conditions for BEM to compute the pressure on an offset surface. Then we place a single equivalent source at the center of the object and approximate the pressure calculated using BEM. This gives us a vector of (complex) coefficients of the multipole strengths. At this stage (the SPME stage in the pipeline), we have computed the representation of an acoustic radiator, which serves as the source for propagation in the runtime stage of the pipeline, using either a geometric or a numeric sound propagator. Our method is agnostic to the type of sound propagator but, owing to the high modal frequencies generated in our benchmarks, we use a geometric sound propagation system to obtain interactive performance. The final stage of the pipeline takes the impulse response for each mode, convolves it with that mode's amplitude, and sums the results to give the final signal. We describe each stage of our pipeline below.
3.2.1 Sound synthesis

Given an object, we solve the displacement equation (Eq. 5) to get a discrete set of mode shapes d̂ᵢ, their modal frequencies ωᵢ, and the amplitudes qᵢ(t). The vibration's displacement vector is given by:

d(t) = Uq(t) ≡ [d̂₁, ..., d̂_M] q(t),   (10)

where M is the total number of modes and q(t) ∈ ℝᴹ is the vector of modal amplitude coefficients, expressed as a bank of sinusoids:

qᵢ(t) = aᵢ e^{−dᵢt} sin(2πfᵢt + θᵢ),   (11)

where fᵢ is the modal frequency (in Hz), dᵢ is the damping coefficient, aᵢ is the amplitude, and θᵢ is the initial phase.
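Eq. (11) amounts to a bank of damped sinusoids, one per mode. A minimal sketch (the frequencies, damping coefficients, and amplitudes below are made-up values, not modal-analysis output):

```python
# Sketch of the modal amplitude bank (Eq. 11): each mode is a damped
# sinusoid q_i(t) = a_i * exp(-d_i * t) * sin(2*pi*f_i*t + theta_i).
import numpy as np

def modal_bank(t, freqs, amps, damps, phases):
    """Return an (M, len(t)) array: one damped sinusoid per mode."""
    t = np.asarray(t)
    return np.array([a * np.exp(-d * t) * np.sin(2 * np.pi * f * t + th)
                     for f, a, d, th in zip(freqs, amps, damps, phases)])

sr = 48000
t = np.arange(sr) / sr                     # one second of samples
q = modal_bank(t, freqs=[440.0, 1230.0], amps=[1.0, 0.5],
               damps=[6.0, 14.0], phases=[0.0, 0.0])
mix = q.sum(axis=0)                        # raw (unpropagated) signal
```

In the full pipeline these per-mode signals are not summed directly; each is first convolved with that mode's propagated impulse response.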
3.2.2 Sound radiation

Once we have the mode shapes and modal frequencies of an object, we compute the approximate acoustic transfer for the object similarly to [James et al. 2006], but use a different equivalent source representation. We first compute a manifold, closed offset surface around the object. This defines a clear interior in which to place the multipole source, and also serves as the boundary ∂Ω on which BEM solves the Helmholtz equation to obtain the pressure p at the N vertices of the surface.

We then use a single-point multipole expansion to match the pressure values p on ∂Ω. This is performed by fixing the position of the multipole and iteratively increasing the order of the multipole until the error is below a certain threshold ε. This step has to be repeated for each modal frequency, with the order generally increasing with the modal frequency.
Since we are using a geometric sound propagator in the runtime stage of our pipeline, using a single-point multipole (per mode) makes it possible to use just one geometric propagation source for all modes. In theory, each multipole should be represented as a different geometric propagation source; but since all the multipoles were kept at the same position during BEM pressure evaluation on the offset surface, we can use just one geometric propagation source and use the modal frequency ωᵢ as the filter to scale the pressure at a point. This makes interactive runtime performance possible.
For a single-point multipole, Eq. (9) simplifies to:

p(y) = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{lm} φ_{lm}(y).   (12)
Since no optimal strategy exists for the placement of the multipole source [Ochmann 1995], we chose the center of our modal object as the source location. This is in stark contrast to [James et al. 2006; Mehra et al. 2013], who used a hierarchical source placement algorithm to minimize the residual error. We maintain the same error thresholds as they do, but simplify the problem by increasing the order L iteratively and checking the pressure
residual ||r||₂ < ε, where r = p − Ac, A is an N-by-L² multipole basis matrix, and c ∈ ℂ^{L²} is the complex coefficient vector. Once we match the BEM pressure on the offset surface for each mode, we place one spherical sound source for the geometric propagation of all the modes at the same position as our multipoles.
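The order-increase loop and least-squares fit described above can be sketched as follows. This is a low-order toy sketch: the function names are ours, the harmonics are hard-coded up to dipole order to keep it self-contained, and the coordinate conventions are our own assumptions rather than the paper's.

```python
# Sketch of the single-point multipole fit: build the N-by-L^2 basis
# matrix A of radiating multipole fields at offset-surface points,
# solve A c ~= p by least squares, and grow the order L until the
# relative residual ||p - A c|| / ||p|| drops below eps.
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def h2(l, z):
    """Spherical Hankel function of the second kind."""
    return spherical_jn(l, z) - 1j * spherical_yn(l, z)

def sph_harm_lm(l, m, theta, phi):
    """Complex spherical harmonics, hard-coded for l <= 1."""
    if l == 0:
        return np.full_like(theta, 0.5 * np.sqrt(1 / np.pi), dtype=complex)
    if l == 1 and m == -1:
        return 0.5 * np.sqrt(1.5 / np.pi) * np.sin(theta) * np.exp(-1j * phi)
    if l == 1 and m == 0:
        return 0.5 * np.sqrt(3 / np.pi) * np.cos(theta) * np.ones_like(phi, dtype=complex)
    if l == 1 and m == 1:
        return -0.5 * np.sqrt(1.5 / np.pi) * np.sin(theta) * np.exp(1j * phi)
    raise NotImplementedError("sketch covers monopole and dipole only")

def multipole_basis(points, k, L):
    """Columns are h2_l(k r) * Y_lm(theta, phi); source at the origin."""
    r = np.linalg.norm(points, axis=1)
    theta = np.arccos(points[:, 2] / r)              # polar angle
    phi = np.arctan2(points[:, 1], points[:, 0])     # azimuth
    return np.column_stack([h2(l, k * r) * sph_harm_lm(l, m, theta, phi)
                            for l in range(L) for m in range(-l, l + 1)])

def fit_multipole(points, p, k, eps=1e-6, L_max=2):
    for L in range(1, L_max + 1):
        A = multipole_basis(points, k, L)            # N-by-L^2 matrix
        c, *_ = np.linalg.lstsq(A, p, rcond=None)
        if np.linalg.norm(p - A @ c) < eps * np.linalg.norm(p):
            break                                    # residual small enough
    return c, L
```

In the actual pipeline the target pressure p comes from BEM on the offset surface; here any sampled complex pressure field can be fitted.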
3.2.3 Sound propagation

Given a single-point multipole source, we can use either a wave-based or a geometric sound propagation scheme to propagate the source's radiation pattern into the environment. We describe existing techniques that can be used with our system.
Wave-based Propagation: Frequency-domain numerical methods like [Mehra et al. 2013] use the equivalent source method to compute the pressure field on a domain. They decompose the scene into well-separated objects and compute per-object and inter-object transfer functions. The per-object transfer function, which maps the incoming sound field incident on an object A to the outgoing field, is defined as:

f(Φ_A^in) = T_A Φ_A^out,   (13)

where Φ_A^in is the vector of multipoles representing the incident field on an object, Φ_A^out is the vector of outgoing multipoles representing the scattered field, and T_A is the scattering matrix containing the (complex) coefficients of the outgoing multipole sources. Similarly, the inter-object transfer function for a pair of objects A and B is defined as:

g_A^B(Φ_A^out) = G_A^B Φ_B^in,   (14)

where G_A^B is the interaction matrix and contains the (complex) coefficients for mapping an outgoing field from object A to object B. In general, G_A^B ≠ G_B^A. For more details on this technique, refer to [Mehra et al. 2013].
The single-point multipole source can be used to represent the incident field on an object for each modal frequency, which can then be approximated using the incoming multipoles Φ_A^in and used in Eq. 7 to get the per-object and inter-object transfer functions.
Geometric Propagation: These methods assume that the wavelength of sound is much smaller than the size of the features in the scene, and then treat sound as rays, frustums, or beams. Wave effects like diffraction are modeled separately using geometric approximations. We make use of the ray-based sound propagation system of [Schissler et al. 2014] to compute the paths along which sound can travel through the scene. This system combines path tracing with a cache of diffuse sound paths to reduce the number of rays required for an interactive simulation. The approach begins by tracing a small number (e.g., 500) of rays uniformly in all directions from each sound source. These rays strike surfaces and are reflected recursively up to a specified maximum reflection depth (e.g., 50). The reflected rays are computed using vector-based scattering [Christensen and Koutsouris 2013], where the resulting rays are a linear combination of the specularly reflected rays and random Lambertian-distributed rays. The listener is modeled as a sphere the same size as a human head. At each ray-triangle intersection, the visibility of the listener sphere is sampled by tracing a few additional rays towards the listener. If some fraction of the rays are not occluded, a path to the listener is produced. A path contains the following output data: the total distance the ray traveled, r, along with the attenuation factor α due to reflection and diffraction interactions. Diffracted sound is computed separately using the UTD diffraction model [Tsingos et al. 2001]. Given the output of the geometric propagation system, we can evaluate the sound pressure as:

p(x) = Σ_{r∈R} p_r(x),   (15)
where p_r is the contribution from a ray r in a set of rays R. We model a multipole Ψᵢ using the rays R as:

Σ_{i=1}^{M} Ψᵢ(x) = p(x) = Σ_{r∈R} p_r(x),  x ∈ Ω,   (16)

where Ψᵢ = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_{ilm} φ_{ilm}(y) for i = 1, ..., M. This coupling lets us calculate the pressure for a set of ray directions sampling a sphere uniformly: for a ray direction (θ, φ) traveling a distance r, its pressure is scaled by ψ(θ, φ), h_l⁽²⁾(kr), and α(ωᵢ), where α(ωᵢ) is the energy of a ray at modal frequency ωᵢ. We use a geometric ray-tracing-based system to get the paths and their respective energies.
Path Clustering: Although using a single geometric source reduces the number of rays considerably, we still need a considerable number of rays (≥ 15,000) to capture the acoustic phenomena correctly, which makes the computation too slow for a modal sound source even with a few sounding modes (M ≥ 20). We solve this problem by clustering the rays based on the angle between them and their respective time delays. We bin the IR (impulse response) according to a user-specified bin size t in seconds (Fig. 3). Then, for each bin, we cluster the rays based on the binning angle ϑ. The binning algorithm is shown in Algorithm 1.
Figure 3: Path Clustering. Top: sound intensity vs. delay time for the original paths. Bottom: sound intensity vs. delay time after binning into intervals of width δt (0, δt, 2δt, 3δt, ...).
Algorithm 1 PathBinning(t, ϑ)

1: maxNumberOfBins ← ceil(IR.length() / t)
2: bins.setSize(maxNumberOfBins)
3: for each ray r ∈ R do                        ▷ ray directions are normalized
4:     Sr ← r.direction()
5:     binIndex ← floor(r.delay() / t)
6:     bin ← bins[binIndex]
7:     added ← false
8:     for each cluster ∈ bin do                ▷ cluster directions are normalized
9:         Sc ← cluster.direction()
10:        ▷ check if the angle between the two vectors is less than the cluster angle ϑ
11:        if Sc · Sr > cos(ϑ) then
12:            cluster.add(r); added ← true; break
13:        end if
14:    end for
15:    ▷ if the ray was not compatible with any of the clusters, create a new one and add the path to it
16:    if not added then
17:        newCluster ← Cluster(Sr)
18:        newCluster.add(r)
19:        bin.add(newCluster)
20:    end if
21: end for
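A compact Python rendering of the same binning-and-clustering idea (the ray representation as (direction, delay) tuples and the greedy first-fit clustering are illustrative choices):

```python
# Sketch of Algorithm 1: bin propagation paths by arrival time, then
# cluster each bin's rays by direction against a cluster angle.
import math

def path_binning(rays, bin_width, cluster_angle):
    """rays: list of (direction, delay); direction is a unit 3-vector tuple."""
    cos_thr = math.cos(cluster_angle)
    bins = {}                                   # bin index -> list of clusters
    for direction, delay in rays:
        clusters = bins.setdefault(int(delay // bin_width), [])
        for cluster in clusters:                # greedy first-fit by angle
            cx, cy, cz = cluster["dir"]
            dot = cx * direction[0] + cy * direction[1] + cz * direction[2]
            if dot > cos_thr:                   # within the cluster angle
                cluster["rays"].append((direction, delay))
                break
        else:                                   # no compatible cluster found
            clusters.append({"dir": direction,
                             "rays": [(direction, delay)]})
    return bins
```

Each cluster can then be treated as a single path when evaluating the per-ray pressure terms, which is where the savings come from.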
Auralization: The last stage of the pipeline is computing the listener response for all the modes. We compute this response by
convolving the time-domain impulse response of each mode with that mode's amplitude. The final signal O(t) is:

O(t) = Σ_{i=1}^{M} qᵢ(t) ∗ IRᵢ(t),   (17)

where IRᵢ is the impulse response of the i-th mode, qᵢ(t) is the amplitude of the i-th mode, and ∗ is the convolution operator.
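Eq. (17) can be sketched directly with plain convolution (the paper's runtime uses streaming partitioned convolution instead; `auralize` is our own illustrative name):

```python
# Sketch of the auralization step (Eq. 17): the output is the sum over
# modes of each mode's amplitude signal convolved with that mode's IR.
import numpy as np

def auralize(q, irs):
    """q: (M, T) modal amplitude signals; irs: list of M impulse responses."""
    out_len = q.shape[1] + max(len(ir) for ir in irs) - 1
    out = np.zeros(out_len)
    for qi, ir in zip(q, irs):
        y = np.convolve(qi, ir)                 # per-mode full convolution
        out[:len(y)] += y                       # sum into the final signal
    return out
```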
4 Implementation

In this section, we describe the implementation details of our system. All the runtime code was written in C++ and timed on a 16-core Intel Xeon E5-2687W @ 3.1 GHz desktop with 64 GB of RAM running Windows 7 64-bit. In the preprocessing stage, the offset surface generation and eigendecomposition code was written in C++, while the single-point multipole expansion was written in MATLAB.
Preprocessing: We use a finite element technique to compute the stiffness matrix K; it takes the tetrahedralized model, Young's modulus, and Poisson's ratio of the sounding object and computes the stiffness matrix for the object. We then perform the eigenvalue decomposition of the system using Intel's MKL library (DSYEV) and calculate the modal displacements, frequencies, and amplitudes in C++. The code to find the multipole strengths was written in MATLAB; the pressure on the offset surface was calculated with a fast BEM solver (FastBEM) using FMM-BEM (the fast multipole method).
Sound Propagation: We use a fast, state-of-the-art geometric ray tracer [Schissler et al. 2014] to get the paths for our pressure computation. This technique is capable of handling very high orders of diffuse and specular reflections (e.g., 10 orders of specular reflections and 50 orders of diffuse reflections) while maintaining interactive performance. As mentioned in the previous section, we cluster the rays in order to reduce the number of rays in the scene; but even with that, the pressure computation for each ray (i.e., the spherical harmonics and the Hankel functions) has to be heavily optimized to meet the interactive performance requirements.
Spherical Harmonic computation: The number of spherical harmonics computed per ray varies as O(L²), making naive evaluation too slow for an interactive runtime. We used a modified version of publicly available fast spherical harmonic code [Sloan 2013] to compute the pressure contribution of each ray. The available code computes only the real spherical harmonics, making extensive use of SSE (Streaming SIMD Extensions). We find the complex spherical harmonics from the real ones following a simple observation:

Y_l^m = (1/√2) (Re(Y_l^m) + i Re(Y_l^{−m})),  m > 0,   (18)
Y_l^m = (1/√2) (Re(Y_l^m) − i Re(Y_l^{−m})) (−1)^m,  m < 0.   (19)

Using this optimized code gives us a 2-3 orders of magnitude speedup compared to existing spherical harmonic implementations, e.g., Boost.
Distance Clustering: Even after the significant speedup in spherical harmonic evaluation, Hankel functions must be computed for each ray, with cost growing linearly in the order of the multipole. We address this by clustering the paths, as in the previous section, based on the distance they travel through the environment. Given a user-defined bin size δd and an impulse response of length t seconds, we cluster the ray distances into N_Hankel = t/δd bins, requiring an order of magnitude fewer computations. The Hankel functions are evaluated using BOOST.
Parallel computation of mode pressure: Since each mode is independent of the others, the pressure computation for each can be done in parallel. Lower modes generally take less time to evaluate than higher ones, so we use a simple, scene-dependent load-balancing scheme to divide the work equally among all 16 cores. We use OpenMP for parallelization on a multi-core system.
Real-Time Auralization: The final audio for the simulations is rendered using a streaming, partitioned convolution technique. All audio rendering is performed at a sampling rate of 48 kHz. We first construct an impulse response (IR) for each mode from the computed pressures of the paths returned by the propagation system, which incorporate the effects of the single-point multipole expansion. The IR is initialized to zero, and the pressure for each path is added to the IR at the sample index corresponding to that path's delay. Once constructed, the IRs for all modes are passed to the convolution system for auralization, where they are converted to the frequency domain. During audio rendering, the time-domain input audio for each mode is converted to the frequency domain and multiplied with the corresponding IR partition coefficients. The inverse FFT of the result is computed and accumulated using overlap-add into a circular output buffer. The audio device reads from the circular buffer at the current position and plays back the rendered sound.
5 Conclusion
We present the first coupled sound synthesis-propagation algorithm that can generate realistic sound effects for computer games and virtual reality. We describe an approach that integrates prior methods for modal sound synthesis, sound radiation, and sound propagation. The radiating sound fields are represented in a compact basis using a single-point multipole expansion. We perform sound propagation from this source basis using fast ray tracing to compute the impulse responses, and convolve them with the modes to generate the final sound at the listener. The resulting system has been integrated with the Unity game engine, and we highlight its performance in many indoor and outdoor scenes. Overall, this is the first system that successfully combines these methods and can handle a high degree of dynamism in terms of source radiation and propagation in complex scenes.
Our approach has some limitations. It is limited to rigid objects and modal sounds. Moreover, the time complexity tends to increase with the mode frequency, and our single-point multipole expansion approach can result in very high orders of multipoles. The geometric sound propagation algorithm may not compute low-frequency effects (e.g., diffraction) accurately in all environments. The wave-based sound propagation algorithm, in turn, involves a high precomputation overhead and is limited to static scenes. Currently, we do not perform any mode compression, so many closely spaced modes are generated. We could use compression algorithms [Raghuvanshi and Lin 2006; Langlois et al. 2014] to reduce the number of modes and thus the overhead of pressure computation. Our preprocessing stage takes a long time, most of which is spent in the eigendecomposition of the stiffness matrix.
There are many avenues for future work. In addition to overcoming these limitations, we would like to use our system in more complex indoor and outdoor environments and to generate other sound effects for complex objects in large environments (e.g., a bell ringing over a large, outdoor valley). We would like to explore approximate solutions to accelerate the free-space acoustic transfer computation. It would also be useful to include directional sources and to accelerate the computations using iterative algorithms such as Arnoldi iteration [ARPACK].
References
ALLEN, J. B., AND BERKLEY, D. A. 1979. Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America 65, 4 (April), 943–950.
ANTANI, L., CHANDAK, A., TAYLOR, M., AND MANOCHA, D. 2012. Efficient finite-edge diffraction using conservative from-region visibility. Applied Acoustics 73, 218–233.
ANTANI, L., CHANDAK, A., SAVIOJA, L., AND MANOCHA, D. 2012. Interactive sound propagation using compact acoustic transfer operators. ACM Trans. Graph. 31, 1 (Feb.), 7:1–7:12.
ARPACK. http://www.caam.rice.edu/software/ARPACK/.
BORISH, J. 1984. Extension to the image model to arbitrary polyhedra. The Journal of the Acoustical Society of America 75, 6 (June), 1827–1836.
CHADWICK, J. N., AN, S. S., AND JAMES, D. L. 2009. Harmonic shells: a practical nonlinear sound model for near-rigid thin shells. In ACM SIGGRAPH Asia 2009 papers, ACM, New York, NY, USA, SIGGRAPH Asia '09, 119:1–119:10.
CHAIGNE, A., AND DOUTAUT, V. 1997. Numerical simulations of xylophones. I. Time domain modeling of the vibrating bars. J. Acoust. Soc. Am. 101, 1, 539–557.
CHANDAK, A., LAUTERBACH, C., TAYLOR, M., REN, Z., AND MANOCHA, D. 2008. AD-Frustum: Adaptive frustum tracing for interactive sound propagation. IEEE Trans. Visualization and Computer Graphics 14, 6, 1707–1722.
CHENG, A., AND CHENG, D. 2005. Heritage and early history of the boundary element method. Engineering Analysis with Boundary Elements 29, 3 (Mar.), 268–302.
CHRISTENSEN, C., AND KOUTSOURIS, G. 2013. Odeon manual, chapter 6.
CISKOWSKI, R. D., AND BREBBIA, C. A. 1991. Boundary element methods in acoustics. Computational Mechanics Publications, Southampton, Boston.
FAIRWEATHER, G. 2003. The method of fundamental solutions for scattering and radiation problems. Engineering Analysis with Boundary Elements 27, 7 (July), 759–769.
FASTBEM Acoustics. http://www.fastbem.com/fastbemacoustics.html.
FLORENS, J. L., AND CADOZ, C. 1991. The physical model: modeling and simulating the instrumental universe. In Representations of Musical Signals, G. D. Poli, A. Piccialli, and C. Roads, Eds. MIT Press, Cambridge, MA, USA, 227–268.
FRANZONI, L. P., BLISS, D. B., AND ROUSE, J. W. 2001. An acoustic boundary element method based on energy and intensity variables for prediction of high-frequency broadband sound fields. The Journal of the Acoustical Society of America 110, 3071.
FUNKHOUSER, T., CARLBOM, I., ELKO, G., PINGALI, G., SONDHI, M., AND WEST, J. 1998. A beam tracing approach to acoustic modeling for interactive virtual environments. In Proc. of ACM SIGGRAPH, 21–32.
JAMES, D. L., BARBIC, J., AND PAI, D. K. 2006. Precomputed acoustic transfer: output-sensitive, accurate sound generation for geometrically complex vibration sources. In ACM SIGGRAPH 2006 Papers, ACM, New York, NY, USA, SIGGRAPH '06, 987–995.
KOUYOUMJIAN, R. G., AND PATHAK, P. H. 1974. A uniform geometrical theory of diffraction for an edge in a perfectly conducting surface. Proceedings of the IEEE 62, 11, 1448–1461.
KROKSTAD, A., STROM, S., AND SORSDAL, S. 1968. Calculating the acoustical room response by the use of a ray tracing technique. Journal of Sound and Vibration 8, 1 (July), 118–125.
KROPP, W., AND SVENSSON, P. U. 1995. Application of the time domain formulation of the method of equivalent sources to radiation and scattering problems. Acta Acustica united with Acustica 81, 6, 528–543.
LANGLOIS, T. R., AN, S. S., JIN, K. K., AND JAMES, D. L. 2014. Eigenmode compression for modal sound models. ACM Transactions on Graphics (TOG) 33, 4, 40.
LENTZ, T., SCHRODER, D., VORLANDER, M., AND ASSENMACHER, I. 2007. Virtual reality system with integrated sound field simulation and reproduction. EURASIP Journal on Advances in Signal Processing 2007 (January), 187–187.
LIU, Q. H. 1997. The PSTD algorithm: A time-domain method combining the pseudospectral technique and perfectly matched layers. The Journal of the Acoustical Society of America 101, 5, 3182.
MEHRA, R., RAGHUVANSHI, N., ANTANI, L., CHANDAK, A., CURTIS, S., AND MANOCHA, D. 2013. Wave-based sound propagation in large open scenes using an equivalent source formulation. ACM Trans. Graph. (Apr.).
MOSS, W., YEH, H., HONG, J.-M., LIN, M. C., AND MANOCHA, D. 2010. Sounding liquids: Automatic sound synthesis from fluid simulation. ACM Trans. Graph. 29, 3, 1–13.
O'BRIEN, J. F., COOK, P. R., AND ESSL, G. 2001. Synthesizing sounds from physically based motion. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 529–536.
O'BRIEN, J. F., SHEN, C., AND GATCHALIAN, C. M. 2002. Synthesizing sounds from rigid-body simulations. In The ACM SIGGRAPH 2002 Symposium on Computer Animation, ACM Press, 175–181.
OCHMANN, M. 1995. The source simulation technique for acoustic radiation problems. Acustica 81, 512–527.
OCHMANN, M. 1999. The full-field equations for acoustic radiation and scattering. The Journal of the Acoustical Society of America 105, 5, 2574–2584.
PAVIC, G. 2006. A technique for the computation of sound radiation by vibrating bodies using multipole substitute sources. Acta Acustica united with Acustica 92, 112–126(15).
PIERCE, A. D., ET AL. 1981. Acoustics: an introduction to its physical principles and applications. McGraw-Hill, New York.
RAGHUVANSHI, N., AND LIN, M. C. 2006. Interactive sound synthesis for large scale environments. In SI3D '06: Proceedings of the 2006 symposium on Interactive 3D graphics and games, ACM Press, New York, NY, USA, 101–108.
RAGHUVANSHI, N., NARAIN, R., AND LIN, M. C. 2009. Efficient and accurate sound propagation using adaptive rectangular decomposition. Visualization and Computer Graphics, IEEE Transactions on 15, 5, 789–801.
RAGHUVANSHI, N., SNYDER, J., MEHRA, R., LIN, M. C., AND GOVINDARAJU, N. K. 2010. Precomputed wave simulation for real-time sound propagation of dynamic sources in complex scenes. ACM Trans. Graph. 29, 3 (July).
REN, Z., MEHRA, R., COPOSKY, J., AND LIN, M. C. 2012. Tabletop Ensemble: touch-enabled virtual percussion instruments. In Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, ACM, 7–14.
SAKAMOTO, S., USHIYAMA, A., AND NAGATOMO, H. 2006. Numerical analysis of sound propagation in rooms using the finite difference time domain method. The Journal of the Acoustical Society of America 120, 5, 3008–3008.
SCHISSLER, C., MEHRA, R., AND MANOCHA, D. 2014. High-order diffraction and diffuse reflections for interactive sound propagation in large environments. In Proc. of ACM SIGGRAPH.
SILTANEN, S., LOKKI, T., KIMINKI, S., AND SAVIOJA, L. 2007. The room acoustic rendering equation. The Journal of the Acoustical Society of America 122, 3 (September), 1624–1635.
SLOAN, P.-P. 2013. Efficient spherical harmonic evaluation. Journal of Computer Graphics Techniques (JCGT) 2, 2 (September), 84–90.
SVENSSON, U. P., FRED, R. I., AND VANDERKOOY, J. 1999. An analytic secondary source model of edge diffraction impulse responses. Acoustical Society of America Journal 106 (Nov.), 2331–2344.
TAFLOVE, A., AND HAGNESS, S. C. 2005. Computational Electrodynamics: The Finite-Difference Time-Domain Method, Third Edition, 3 ed. Artech House Publishers, June.
TAYLOR, M., CHANDAK, A., MO, Q., LAUTERBACH, C., SCHISSLER, C., AND MANOCHA, D. 2012. Guided multiview ray tracing for fast auralization. IEEE Transactions on Visualization and Computer Graphics 18, 1797–1810.
THOMPSON, L. L. 2006. A review of finite-element methods for time-harmonic acoustics. The Journal of the Acoustical Society of America 119, 3, 1315–1330.
TSINGOS, N., FUNKHOUSER, T., NGAN, A., AND CARLBOM, I. 2001. Modeling acoustics in virtual environments using the uniform theory of diffraction. In Proc. of ACM SIGGRAPH, 545–552.
VAN DEN DOEL, K., AND PAI, D. K. 1996. Synthesis of shape dependent sounds with physical modeling. In Proceedings of the International Conference on Auditory Displays.
VAN DEN DOEL, K., AND PAI, D. K. 1998. The sounds of physical shapes. Presence 7, 4, 382–395.
VAN DEN DOEL, K., KRY, P. G., AND PAI, D. K. 2001. FoleyAutomatic: physically-based sound effects for interactive simulation and animation. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 537–544.
VON ESTORFF, O. 2000. Boundary elements in acoustics: advances and applications, vol. 9. WIT Press/Computational Mechanics.
VORLANDER, M. 1989. Simulation of the transient and steady-state sound propagation in rooms using a new combined ray-tracing/image-source algorithm. The Journal of the Acoustical Society of America 86, 1, 172–178.
YEE, K. 1966. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media. IEEE Transactions on Antennas and Propagation 14, 3 (May), 302–307.
YEH, H., MEHRA, R., REN, Z., ANTANI, L., MANOCHA, D., AND LIN, M. 2013. Wave-ray coupling for interactive sound propagation in large complex scenes. ACM Trans. Graph. 32, 6, 165:1–165:11.
ZHENG, C., AND JAMES, D. L. 2009. Harmonic fluids. ACM Trans. Graph. 28, 3, 1–12.
ZHENG, C., AND JAMES, D. L. 2010. Rigid-body fracture sound with precomputed soundbanks. In SIGGRAPH '10: ACM SIGGRAPH 2010 papers, ACM, New York, NY, USA, 1–13.