One of the biggest potential benefits of realistic retinal blur is more comfortable VR experiences. “This is about all-day immersion,” says Douglas Lanman, FRL’s Director of Display Systems Research. “Whether you're playing a video game for hours or looking at a boring spreadsheet, eye strain, visual fatigue and just having a beautiful image you’re willing to spend your day with, all of that matters.”
Lanman recognized the need for rendered blur back in 2015, in the early stages of the Half Dome project (which he also leads). Even just a few months into the project, early prototypes were showing promising results for creating sharp focus within VR. Software-based defocusing, however, was proving to be a major obstacle. Our process couldn’t draw from existing techniques for rendering real-time blur in non-VR games, which have more to do with cinematography than realism: they generate eye-catching cinematic effects, such as a pleasantly defocused background, tuned specifically for flatscreen monitors and TVs. These fast but inaccurate methods of creating “game blur” ran counter to Half Dome’s mission, which is to faithfully reproduce the way light falls on the human retina.
After months of exploring traditional techniques for optimizing computational displays, the team still couldn’t produce real-time blur that accurately matched physical reality. Those early efforts exposed the dual challenge of rendering truly realistic blur in VR: combining incredibly high render speeds with the image quality demanded by advanced head-mounted displays. And rendered blur isn’t a one-off process applied to a scene while it’s being developed or when the viewer first encounters it. Gaze-contingent blur has to deliver near-instant defocusing for essentially every eye movement, with a level of fidelity that can’t be achieved by simply dropping the resolution of objects the wearer is no longer focusing on.
Lanman had already learned that throwing more processing power at the problem wasn’t feasible. A 2016 demo of Half Dome achieved real-time blur through a process called accumulation buffer rendering, in which the scene was rendered 32 times per eye and the results averaged into a single blurred image. But this approach worked only because the scene was simple; it wouldn’t scale to a wider range of VR experiences, particularly given Lanman’s focus on making any software solution accessible to the entire VR community. “I wanted something that could work with every single game immediately, so we wouldn’t have to ask developers to alter their titles—they’d just work out of the box with Half Dome,” says Lanman.
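To make the cost of that approach concrete, here is a minimal sketch of accumulation-style defocus rendering. It assumes a hypothetical render_scene callback that produces one pinhole render per sampled pupil position; it illustrates the general technique (and why it multiplies rendering cost by the sample count), not the 2016 demo’s actual code.

```python
import numpy as np

def accumulation_buffer_defocus(render_scene, focal_distance, aperture_radius,
                                num_samples=32, rng=None):
    """Approximate retinal defocus by averaging many pinhole renders.

    `render_scene(eye_offset, focal_distance)` is a hypothetical callback that
    returns an HxWx3 float image rendered from a camera displaced by
    `eye_offset` within the pupil, with rays converging at `focal_distance`.
    Averaging the jittered renders approximates the blur produced by a finite
    aperture, at `num_samples` times the rendering cost of a single frame.
    """
    rng = np.random.default_rng() if rng is None else rng
    accumulator = None
    for _ in range(num_samples):
        # Sample a point uniformly on the circular pupil.
        r = aperture_radius * np.sqrt(rng.random())
        theta = 2.0 * np.pi * rng.random()
        offset = (r * np.cos(theta), r * np.sin(theta))
        frame = render_scene(offset, focal_distance)
        accumulator = frame if accumulator is None else accumulator + frame
    return accumulator / num_samples
```

With 32 samples per eye, every displayed frame costs 64 full renders, which is why the method only held up for simple scenes.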
Bringing deep learning to VR
Instead of waiting for future processors to meet our requirements or asking customers to foot the bill for more processing power, Lanman decided to develop software powered by AI. Specifically, he wanted to explore the use of deep learning, an approach in which AI systems learn to carry out a given task by training on large sets of relevant data. Deep learning algorithms are often used to analyze or even generate images. And while chipmakers have been moving in this direction, boosting the upper limits of image quality by adding dedicated AI cores to their latest video cards, deep learning remains rare in VR systems. “We decided to leverage those same AI tools that are driving industry trends,” says Lanman, “to go beyond just generating the pixels and actually give you more realism than you’ve seen before.”
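As a rough illustration of what a learned approach to defocus might look like (this is an assumption for clarity, not the team’s actual architecture, which isn’t described here), the toy network below maps an RGB-D frame plus the eye’s current focal distance to a defocused image. In practice, such a model would be trained against ground-truth blurred frames, for example ones produced offline by accumulation-buffer rendering.

```python
import torch
import torch.nn as nn

class DefocusNet(nn.Module):
    """Toy convolutional network: RGB-D frame + focus distance -> blurred RGB.

    Purely illustrative. Input is a 5-channel tensor: RGB, depth, and a
    constant plane encoding the eye's current focal distance.
    """
    def __init__(self, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, width, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(width, 3, kernel_size=3, padding=1),
        )

    def forward(self, rgb, depth, focal_distance):
        # Broadcast the scalar focal distance into a full-resolution plane.
        focus_plane = torch.full_like(depth, focal_distance)
        x = torch.cat([rgb, depth, focus_plane], dim=1)
        return self.net(x)

# Example forward pass on a random frame (batch of 1, 128x128 pixels).
model = DefocusNet()
rgb = torch.rand(1, 3, 128, 128)
depth = torch.rand(1, 1, 128, 128)
blurred = model(rgb, depth, focal_distance=1.5)
```

The appeal of this kind of approach is that the per-frame cost is a single network evaluation rather than dozens of extra renders, which is what makes gaze-contingent blur plausible in real time.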
Lanman’s deep learning strategy began in earnest when he hired Lei Xiao, an AI researcher who was fresh out of graduate school, where his PhD studies included numerical optimization and machine learning for computational photography. “I believe it was Lei’s first day in the lab when I told him, ‘I want to make computational displays like Half Dome run in real time, for the first time,’” says Lanman. “And that solution has to work for every single title in the Oculus Store, without asking developers to recompile.”