Virtual Reality Sound Design


Photo: Oculus

This article examines how Virtual Reality (VR) sound designers use digital audio processing to re-create four related elements of sound and hearing from the real world to produce an immersive auditory experience in a virtual world. These key elements are:

  • the sound itself
  • our position relative to the sound
  • the physical space in which the sound occurs
  • the way our heads and ears filter sound

VR: Mirroring Sensory Perceptions

VR relies on mirroring the sensory perceptions we experience in the real world, essentially tricking our brains into believing that we’re in an environment that exists outside of our everyday reality. To achieve this, not only must our brain believe what our eyes see, it also must believe what our ears hear.

Sound Basics

Sound arrives at our ears dynamically, reflecting off the range of materials that surround us.

A basic understanding of how sound works in the real world provides a launching point for exploring what VR sound designers seek to model in virtual environments. Sound is a disturbance in the molecules that make up a medium such as air: a vibrating source alternately compresses and spreads out the surrounding air molecules, and that pressure disturbance travels outward as sound waves.

Our ears detect the disturbance, and our brains perceive it as a sound, assigning it a source in a certain direction and at a certain volume in a spatial process known as localization. We can tell if the sound is in front of us, behind us, to our left or right, or above or below us.

Our capacity to localize sound and place it in space is tied to our brain’s ability to decipher differences in when, and how loudly, the sound arrives at each of our two ears. Those differences are determined by our environment and our head’s position relative to the sound source.

Perceiving Sound

How we perceive sound depends on the environment in which the sound occurs. Sound reflects to varying degrees off every surface it contacts, so what reaches us is a mix of direct sound and reflected sound. The reflected sound tells us the size of the environment and provides clues to the materials it’s bouncing off: a sound in a room made of wood will have a different quality than the same sound in a metal room. Direct sound and reflected sound combine to create our real world’s complex soundscape.
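As a rough illustration of direct versus reflected sound, the Python sketch below compares the two paths for a single wall, using a simple "mirror image" of the source to get the length of the bounced path. The room layout, the function name `arrival` and the `surface_gain` value are all assumptions made up for this example; real rooms produce many overlapping reflections, and real materials reflect different frequencies differently.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 °C


def arrival(path_length_m, surface_gain=1.0):
    """Return (delay in seconds, relative amplitude) for one sound path."""
    delay = path_length_m / SPEED_OF_SOUND
    # Amplitude falls off roughly as 1/distance; surface_gain scales a bounce.
    amplitude = surface_gain / max(path_length_m, 1e-6)
    return delay, amplitude


# Hypothetical layout: source and listener 4 m apart, both 3 m from a wall at y = 0.
source = (0.0, 3.0)
listener = (4.0, 3.0)
image_source = (source[0], -source[1])  # the source mirrored across the wall

direct = arrival(math.dist(source, listener))
# surface_gain=0.7 is an invented value standing in for the wall material.
reflected = arrival(math.dist(image_source, listener), surface_gain=0.7)

print(f"direct:    {direct[0] * 1000:.1f} ms, amplitude {direct[1]:.3f}")
print(f"reflected: {reflected[0] * 1000:.1f} ms, amplitude {reflected[1]:.3f}")
```

The reflected copy arrives later and quieter than the direct sound, and it is this kind of gap that gives a listener a sense of how far away the walls are.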

The following video demonstrates the physics of real-world sound.


How We Hear

How does our brain make sense of, and keep us oriented to, this soundscape? The answer lies in our binaural hearing and in the acoustic filtering described by the Head-Related Transfer Function (HRTF). Binaural refers to having two ears. Think of our hearing as two separate receivers of sound that transfer information independently to our brains. Binaural hearing is closely tied to HRTF, which describes how our anatomy shapes a sound on its way into each ear, before our brains process it.

Although we have only two ears, our brains can locate sound in three dimensions by comparing two separate, but closely linked, audio cues:

  • the difference in volume level between what each ear gathers (called the interaural level difference, or ILD)
  • the difference in arrival time between our ears (the interaural time difference, or ITD)

Additionally, our head and ears filter sound independently, contingent on which direction the sound comes from and the reflections it undergoes before reaching our auditory system. Our brain, in turn, uses ILD, ITD and our anatomical filtering information to spatially place the sound in the world around us.
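To put a number on the time cue, here is a minimal Python sketch of the interaural time difference using Woodworth's spherical-head approximation, ITD = (r / c)(θ + sin θ). This is a textbook simplification brought in for illustration, not a method the article describes: it treats the head as a sphere and the source as far away. The head radius is an assumed average, and the level difference (ILD) is left out because it varies strongly with frequency.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second
HEAD_RADIUS = 0.0875    # metres; an assumed average, real heads vary


def itd_seconds(azimuth_deg, head_radius=HEAD_RADIUS):
    """Estimate the interaural time difference for a distant source.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The approximation is only meant for angles from -90 to 90 degrees.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between -90 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius / SPEED_OF_SOUND) * (theta + math.sin(theta))


for angle in (0, 30, 60, 90):
    print(f"{angle:>2} degrees: {itd_seconds(angle) * 1e6:.0f} microseconds")
```

Even for a sound directly to one side, the estimated difference is well under a millisecond, which gives a sense of how fine a timing cue the brain is working with.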

Of course, all these audio cues change depending on how we move our heads relative to the sound, which has a significant impact on how we perceive sound. These soundscape dynamics must be calculated and simulated through digital signal processing in a virtual world, where sound locations must seem to continuously change around us as we interact with the environment, turning toward or away from, or moving past, sound sources.
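The head-movement part of that calculation can be sketched in a few lines of Python. The example assumes a flat, two-dimensional world and an angle convention chosen purely for illustration: yaw is measured counterclockwise from the +x axis, and a positive result means the source is to the listener's left. The function name `relative_azimuth_deg` is hypothetical and not taken from any VR toolkit.

```python
import math


def relative_azimuth_deg(listener_xy, head_yaw_deg, source_xy):
    """Angle of the source relative to where the head is facing.

    Returns a value in the range [-180, 180): 0 is straight ahead,
    positive is to the listener's left, negative is to the right.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    world_angle = math.degrees(math.atan2(dy, dx))
    # Subtract the head's yaw, then wrap the result into [-180, 180).
    return (world_angle - head_yaw_deg + 180.0) % 360.0 - 180.0


listener = (0.0, 0.0)
source = (1.0, 0.0)

# The same fixed source as the listener turns their head in 45-degree steps.
for yaw in (0, 45, 90, 135, 180):
    print(yaw, round(relative_azimuth_deg(listener, yaw, source), 1))
```

A renderer would repeat a calculation like this every time the headset reports a new orientation, then refresh the direction-dependent cues for each sound source.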

Creating Acoustic Models in VR

Sound designers use digital audio workstations to emulate real-world audio dynamics. (Photo: RoadToVR.)

Re-creating the real world’s complex layers of 3D sound is the task of VR sound engineers and designers. They develop audio engines and work with software that allows them to digitally shape the spatial array of sound entering our ears from an immersive environment and influence our brain’s auditory perception.

From a breathy whisper to a concussive explosion, the result is a hearing-is-believing convergence of the art of sound design with the science of auditory cognition in a world governed by 360° audio presentation techniques that seek to add sound sources anywhere in a 3D sphere. Up, down. Left, right. Near, far.

Modeling real-world hearing is achieved by digitally simulating the influence of our anatomy’s natural sound-filtering through HRTF, replicated in VR sound design through a complex array of algorithm-based digital filtering that accounts for sound direction, volume and frequency.
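One common way to apply this kind of filtering is to convolve a mono sound with a pair of head-related impulse responses (HRIRs), one per ear, for the direction the sound should come from. The Python sketch below shows only that step. The two impulse responses are toy values invented for the example (a delayed, quieter copy in the far ear), not measured data, and the helper names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Plain time-domain convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal into a (left, right) pair for headphones."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (NOT measured): the source is on the listener's left,
# so the right ear hears the sound two samples later and at half the level.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]

mono = [1.0, 0.5, 0.25]
left, right = render_binaural(mono, hrir_left, hrir_right)
print("left: ", left)
print("right:", right)
```

Measured HRIRs are much longer and also reshape the frequency content of the sound, which is where the filtering by the head and outer ear shows up. A production engine would typically use a faster convolution method and blend between impulse responses as the source moves.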

Our brains have registered such learned sound dynamics through our interaction with the real world, and VR sound design taps into these cognitive imprints, so our virtual hearing experience mirrors how we engage a soundscape in the real world and how our hearing functions in that real world.

Video demonstration of the spatial sound-design in Google’s Software Development Kit. (NOTE: For best experience, wear headphones.)


Movement-Tracking Headsets

VR headsets that can track head movement allow sound designers to dynamically instruct the complex layers of virtual sounds how to behave and interact with our ears relative to our head’s position. Audio rendering delivered by these movement-tracking headsets tricks our brains into believing what we’re hearing is real because the seemingly real-life sound behavior taps into deep-seated memories of how we hear in the real world.

The convergence of real-world soundscape emulation, binaural rendering and motion-tracking headsets creates an immersive audio experience. Supplemented with visual immersion, auditory immersion is crucial to producing and sustaining a sense of visceral presence.

Digital audio design practitioners continue to push the envelope to discover what’s possible. For example, a spherical audio-capture and reproduction technique known as ambisonics, which traces its roots back to the 1970s, is making a comeback, with a promising capacity to further advance the boundaries of spatial audio production and rendering.
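For a sense of what ambisonics stores, the Python sketch below encodes a single mono sample into the four channels of first-order B-format (W, X, Y, Z) using the traditional convention, in which W carries the sound at a reduced level and X, Y and Z carry its front-back, left-right and up-down components. Other channel orderings and scalings exist, so treat the convention and the function name `encode_first_order` as assumptions made for this example.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into traditional first-order B-format.

    azimuth_deg: 0 is straight ahead, positive is counterclockwise (left).
    elevation_deg: 0 is ear level, positive is upward.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back
    y = sample * math.sin(az) * math.cos(el)  # left-right
    z = sample * math.sin(el)                 # up-down
    return w, x, y, z


# The same sample placed straight ahead, hard left, and directly overhead.
for az, el in ((0, 0), (90, 0), (0, 90)):
    channels = encode_first_order(1.0, az, el)
    print((az, el), [round(c, 3) for c in channels])
```

Because the direction lives in the balance between channels rather than in any one speaker feed, the whole sound field can be rotated to follow the listener's head before it is decoded for headphones.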

Ambisonic Sound: Hear New York City in 3D Audio. (NOTE: For best experience, wear headphones.)

Headphone tracking, too, is improving, and excitement abounds that sound design will continue to help drive the evolution of VR through further discoveries of how to more accurately model real-world acoustics and hearing and deliver a visceral experience in virtual worlds.

As VR audio-design capabilities continue to advance, the science of sound engineering will continue to push the art of sound design, opening new immersive frontiers in what many hope will be one of the next “big things” in computer technology. If you listen closely, you can virtually hear it coming.

Video demonstration of dearVR’s 3D sound design for headphones. (NOTE: For best experience, wear headphones.)


Additional Resources…

A discussion that introduces VR audio. Presented by Oculus. (NOTE: Not overly technical.)


Panel discussion of spatial audio and immersion at the Google I/O Conference 2016.