Modeling a room’s acoustics
Understanding how sound travels through a particular space, and how it bounces off the surfaces there before reaching the ear, is another powerful tool for making virtual sounds replicate real ones. Just as visual AR uses simultaneous localization and mapping (SLAM) to get the geometry and lighting right for virtual objects, on the sound front we need to understand the room’s acoustic properties to seamlessly place a virtual sound source into the real space.
For my personal masterclass in room acoustics, the team invites me to play a game: I try to determine which sounds are coming from a series of physical speakers set up in the room around me and which are coming from a set of open-ear headphones I’m wearing. I can move around the space and hear the sounds respond accordingly. I consider myself a bit of an audiophile, but my attempts at distinguishing which sounds are real and which are virtual top out at about 50-50. Even though the virtual sounds are coming from headphones, the spatialized audio and simulated acoustics are so realistic that my brain is fully convinced everything I’m hearing is coming from the speakers in the room. I even have to pull the headphones off to confirm where the sounds are really coming from.
“Imagine if you were on a phone call and you forgot that you were separated by distance,” says Robinson. “That’s the promise of the technology we’re developing.”
To get a sense of what’s at stake here, the team shows me a demo that illustrates telepresence, the ability to feel present in a location other than your own, in real time. I sit in a room wearing a modified Oculus Rift headset and a pair of headphones, but it feels like I’m someplace else, sitting around a table with a number of researchers and colleagues. I can see the meeting room via my headset. An array of 32 microphones captures the sounds in the meeting room and delivers spatialized audio directly to my headphones so that each person’s voice sounds like it’s coming from their specific location around the table. I find myself naturally turning to face the direction of each person. This helps me follow and participate in the conversation and feel like I’m in the room itself — even though I’m actually not.
This could be a game changer for video calls with friends, family, or coworkers at a distance. With a phone call today, the other person’s voice sounds like it’s coming from the phone itself (or from the center of your head, if you’re wearing earbuds), so your brain rejects the idea that the other person might be in the same location as you. Spatial audio mimics both the directions that sounds come from in real life and the acoustics of the environment, so you more fully experience social presence.
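To make the idea of a sound “coming from a direction” concrete, here is a minimal Python sketch of one of the cues involved: the interaural time difference, the tiny delay between a sound reaching the nearer ear and the farther one. It uses Woodworth’s textbook spherical-head approximation, with an assumed head radius of 8.75 cm, and it is not the team’s method. Real spatial audio also models level differences, the filtering effect of the outer ear, and the room itself, none of which appear here. The function names are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate for air at room temperature
HEAD_RADIUS = 0.0875    # m, an assumed average head radius


def interaural_time_difference(azimuth_deg):
    """Woodworth's spherical-head estimate of the ear-to-ear delay, in seconds.

    azimuth_deg: 0 is straight ahead, 90 is directly to the right.
    The approximation is intended for angles between -90 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta))


def spatialize(mono, azimuth_deg, sample_rate):
    """Return (left, right) sample lists with the far ear delayed by the ITD."""
    delay = round(abs(interaural_time_difference(azimuth_deg)) * sample_rate)
    near = list(mono) + [0.0] * delay   # reaches the closer ear first
    far = [0.0] * delay + list(mono)    # same sound, arriving slightly later
    # A positive azimuth puts the source on the right, so the right ear is nearer.
    return (far, near) if azimuth_deg > 0 else (near, far)


# A single click placed directly to the listener's right, at 48 kHz.
left, right = spatialize([1.0, 0.0, 0.0], 90, 48000)
print(f"ITD at 90 degrees: {interaural_time_difference(90) * 1e6:.0f} microseconds")
print(f"Left ear hears the click {left.index(1.0)} samples after the right ear")
```

Even at its largest, the delay is well under a millisecond, which gives a sense of how precisely the audio has to be rendered for a voice to seem anchored to a seat at the table.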
Spatial audio, when combined with Codec Avatars (ultra-realistic representations of people that can be animated in real time), hyper-realistic 3D reconstruction, full body tracking, shared virtual spaces, and more, will allow us to crack true social presence. By letting you spend time with the important people in your life in meaningful places, we can radically transform how you live, work, and play.
“I take to heart the overall Facebook mission, which is really about connecting people,” says Robinson. “The only reason we need for virtual sound to be made real is so that I can put a virtual person in front of me and have a social interaction with them that’s as if they were really there. And remote or in person, if we can improve communication even a little bit, it would really enable deeper and more impactful social connections.”
As mind-blowing as truly spatialized audio and realistic room acoustics can be, they only touch upon the first part of the FRL Research audio team’s mission. “As we started doing this research in VR and as that morphed into AR, we realized that all of the technologies that we’re building here can serve a higher purpose, which is to improve human hearing,” explains Mehra.
AR glasses and perceptual superpowers
The second part of the FRL Research audio team’s mission — to redefine human hearing — is an ambitious goal, to be sure. But it’s also directly connected to Facebook’s work to deliver AR glasses.
“Human hearing is an amazing sense that allows us to connect through spoken language and musical expression,” explains Tony Miller, who leads hardware research for the team. “At FRL Research, we are exploring new technologies that can extend, protect, and enhance your hearing ability — giving you the ability to increase concentration and focus, while allowing you to seamlessly interact with the people and information you care about. At the heart of this work is a focus on building hardware that is deeply rooted in auditory perception and augmented by the latest developments in signal processing and artificial intelligence.”
Imagine being able to hold a conversation in a crowded restaurant or bar without having to raise your voice to be heard or strain to understand what others are saying. By using multiple microphones on your glasses, we can capture the sounds around you. Then, by using the pattern of your head and eye movements, we can figure out which of these sounds you’re most interested in hearing, without requiring you to stare robotically at their source. This lets us enhance the right sounds for you and dim the others, making sure that what you really want to hear is clear, even in loud background noise.
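The article does not say how the glasses would enhance one sound over the others. One classic way to do it with several microphones is delay-and-sum beamforming, sketched below in Python as an illustration rather than as the team’s approach. The four-microphone layout, 4 cm spacing, 16 kHz sample rate, and test tone are all assumptions, and the look direction is simply passed in where a real system would infer it from head and eye movements.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate for air at room temperature


def arrival_delays(mic_x, angle_deg, sample_rate):
    """Relative arrival delay, in whole samples, of a distant source at each mic.

    mic_x: microphone positions in meters along a line (a hypothetical frame).
    angle_deg: direction measured from straight ahead, positive toward +x.
    """
    sin_theta = math.sin(math.radians(angle_deg))
    # A mic farther along +x is closer to a source at a positive angle,
    # so the sound reaches it earlier.
    raw = [-x * sin_theta / SPEED_OF_SOUND * sample_rate for x in mic_x]
    earliest = min(raw)
    return [round(d - earliest) for d in raw]


def simulate_capture(signal, delays):
    """Build one channel per mic by delaying the source signal (zero padded)."""
    length = len(signal) + max(delays)
    return [
        [signal[n - d] if 0 <= n - d < len(signal) else 0.0 for n in range(length)]
        for d in delays
    ]


def delay_and_sum(channels, delays):
    """Undo the delays expected for a look direction, then average the channels."""
    length = len(channels[0]) - max(delays)
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def energy(samples):
    return sum(s * s for s in samples)


SAMPLE_RATE = 16000
MIC_X = [0.0, 0.04, 0.08, 0.12]  # assumed 4 cm spacing
tone = [math.sin(2 * math.pi * 2000 * n / SAMPLE_RATE) for n in range(160)]

# A talker stands 30 degrees to the right of the wearer.
channels = simulate_capture(tone, arrival_delays(MIC_X, 30, SAMPLE_RATE))

toward_talker = delay_and_sum(channels, arrival_delays(MIC_X, 30, SAMPLE_RATE))
away_from_talker = delay_and_sum(channels, arrival_delays(MIC_X, -30, SAMPLE_RATE))

print(f"energy when listening toward the talker: {energy(toward_talker):.2f}")
print(f"energy when listening the other way:     {energy(away_from_talker):.2f}")
```

When the array listens toward the talker, the four copies line up and add; when it listens the other way, they are misaligned and largely cancel. The cancellation is unusually clean here because of the particular tone and spacing chosen, and real speech in a noisy bar would be attenuated far less tidily.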