Using artificial intelligence to generate 3D holograms in real-time
A new method called tensor holography could enable the creation of holograms for virtual reality, 3D printing, medical imaging, and more — and it can run on a smartphone.
Despite years of hype, virtual reality headsets have yet to topple TV or computer screens as the go-to devices for video viewing. One reason: VR can make users feel sick. Nausea and eye strain can result because VR creates an illusion of 3D viewing although the user is in fact staring at a fixed-distance 2D display. The solution for better 3D visualization could lie in a 60-year-old technology remade for the digital world: holograms.
Holograms deliver an exceptional representation of 3D world around us. Plus, they’re beautiful. (Go ahead — check out the holographic dove on your Visa card.) Holograms offer a shifting perspective based on the viewer’s position, and they allow the eye to adjust focal depth to alternately focus on foreground and background.
Researchers have long sought to make computer-generated holograms, but the process has traditionally required a supercomputer to churn through physics simulations, which is time-consuming and can yield less-than-photorealistic results. Now, MIT researchers have developed a new way to produce holograms almost instantly — and the deep learning-based method is so efficient that it can run on a laptop in the blink of an eye, the researchers say.
“People previously thought that with existing consumer-grade hardware, it was impossible to do real-time 3D holography computations,” says Liang Shi, the study’s lead author and a PhD student in MIT’s Department of Electrical Engineering and Computer Science (EECS). “It’s often been said that commercially available holographic displays will be around in 10 years, yet this statement has been around for decades.”
Shi believes the new approach, which the team calls “tensor holography,” will finally bring that elusive 10-year goal within reach. The advance could fuel a spillover of holography into fields like VR and 3D printing.
Shi worked on the study, published today in Nature, with his advisor and co-author Wojciech Matusik. Other co-authors include Beichen Li of EECS and the Computer Science and Artificial Intelligence Laboratory at MIT, as well as former MIT researchers Changil Kim (now at Facebook) and Petr Kellnhofer (now at Stanford University).
The quest for better 3D
A typical lens-based photograph encodes the brightness of each light wave — a photo can faithfully reproduce a scene’s colors, but it ultimately yields a flat image.
In contrast, a hologram encodes both the brightness and phase of each light wave. That combination delivers a truer depiction of a scene’s parallax and depth. So, while a photograph of Monet’s “Water Lilies” can highlight the paintings’ color palate, a hologram can bring the work to life, rendering the unique 3D texture of each brush stroke. But despite their realism, holograms are a challenge to make and share.
First developed in the mid-1900s, early holograms were recorded optically. That required splitting a laser beam, with half the beam used to illuminate the subject and the other half used as a reference for the light waves’ phase. This reference generates a hologram’s unique sense of depth. The resulting images were static, so they couldn’t capture motion. And they were hard copy only, making them difficult to reproduce and share.
Computer-generated holography sidesteps these challenges by simulating the optical setup. But the process can be a computational slog. “Because each point in the scene has a different depth, you can’t apply the same operations for all of them,” says Shi. “That increases the complexity significantly.” Directing a clustered supercomputer to run these physics-based simulations could take seconds or minutes for a single holographic image. Plus, existing algorithms don’t model occlusion with photorealistic precision. So Shi’s team took a different approach: letting the computer teach physics to itself.
They used deep learning to accelerate computer-generated holography, allowing for real-time hologram generation. The team designed a convolutional neural network — a processing technique that uses a chain of trainable tensors to roughly mimic how humans process visual information. Training a neural network typically requires a large, high-quality dataset, which didn’t previously exist for 3D holograms.
The team built a custom database of 4,000 pairs of computer-generated images. Each pair matched a picture — including color and depth information for each pixel — with its corresponding hologram. To create the holograms in the new database, the researchers used scenes with complex and variable shapes and colors, with the depth of pixels distributed evenly from the background to the foreground, and with a new set of physics-based calculations to handle occlusion. That approach resulted in photorealistic training data. Next, the algorithm got to work.
By learning from each image pair, the tensor network tweaked the parameters of its own calculations, successively enhancing its ability to create holograms. The fully optimized network operated orders of magnitude faster than physics-based calculations. That efficiency surprised the team themselves.
“We are amazed at how well it performs,” says Matusik. In mere milliseconds, tensor holography can craft holograms from images with depth information — which is provided by typical computer-generated images and can be calculated from a multicamera setup or LiDAR sensor (both are standard on some new smartphones). This advance paves the way for real-time 3D holography. What’s more, the compact tensor network requires less than 1 MB of memory. “It’s negligible, considering the tens and hundreds of gigabytes available on the latest cell phone,” he says.
The research “shows that true 3D holographic displays are practical with only moderate computational requirements,” says Joel Kollin, a principal optical architect at Microsoft who was not involved with the research. He adds that “this paper shows marked improvement in image quality over previous work,” which will “add realism and comfort for the viewer.” Kollin also hints at the possibility that holographic displays like this could even be customized to a viewer’s ophthalmic prescription. “Holographic displays can correct for aberrations in the eye. This makes it possible for a display image sharper than what the user could see with contacts or glasses, which only correct for low order aberrations like focus and astigmatism.”
“A considerable leap”
Real-time 3D holography would enhance a slew of systems, from VR to 3D printing. The team says the new system could help immerse VR viewers in more realistic scenery, while eliminating eye strain and other side effects of long-term VR use. The technology could be easily deployed on displays that modulate the phase of light waves. Currently, most affordable consumer-grade displays modulate only brightness, though the cost of phase-modulating displays would fall if widely adopted.
Three-dimensional holography could also boost the development of volumetric 3D printing, the researchers say. This technology could prove faster and more precise than traditional layer-by-layer 3D printing, since volumetric 3D printing allows for the simultaneous projection of the entire 3D pattern. Other applications include microscopy, visualization of medical data, and the design of surfaces with unique optical properties.
“It’s a considerable leap that could completely change people’s attitudes toward holography,” says Matusik. “We feel like neural networks were born for this task.”
The work was supported, in part, by Sony.