Create an Immersive VR Video Experience--No Headsets Required!
Scott Janus
Intel has developed a new volumetric video compression technique that lets you watch videos with motion parallax without having to wear a VR headset.
Project status: Under Development
Groups
SIGGRAPH 2020
Intel Technologies
Intel Integrated Graphics
Overview / Usage
One of the major challenges of delivering volumetric video is compressing it in such a way that it can be streamed at a reasonable bitrate and decoded and rendered in real time. Intel has developed a new way to compress Multi-Plane Images (MPI) that can be played back in real time on mainstream PCs and provide users with a motion parallax experience without a headset.
Methodology / Approach
In recent years we have seen the emergence of a new technique for representing volumetric video: Multi-Plane Images (MPI). The basic concept of MPI is to use machine learning to build a volumetric representation of a scene as a stack of semi-transparent planes, each containing textures derived from the original camera view. When these planes are composited on top of each other, the original camera view is recovered. These stacks can also be used to render the scene from novel viewpoints with a visual quality that surpasses many traditional viewpoint interpolation techniques.
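As a rough illustration of the idea, the sketch below composites a stack of RGBA planes back to front with the standard "over" operator to recover the camera view. Rendering a novel viewpoint would additionally warp each plane (for example, by a per-plane homography) before compositing; the shapes and plane count here are illustrative assumptions, not the project's exact pipeline.

```python
import numpy as np

def composite_mpi(planes):
    """Composite a back-to-front stack of RGBA planes with the 'over' operator.

    planes: array of shape (num_planes, H, W, 4), RGBA in [0, 1],
            ordered from the farthest plane to the nearest.
    Returns an (H, W, 3) RGB image approximating the original camera view.
    """
    out = np.zeros(planes.shape[1:3] + (3,), dtype=np.float32)
    for plane in planes:                      # farthest first, nearest last
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)
    return out

# Example: a 32-plane stack at 1920x1080 (random data stands in for real planes)
stack = np.random.rand(32, 1080, 1920, 4).astype(np.float32)
image = composite_mpi(stack)
```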
Although much of the research to date has focused on still images, it is natural to extend the MPI technique to video sequences. Here is where a drawback of MPI becomes immediately apparent: a typical MPI stack for a single image consists of 32 planes wherein each element has an 8-bit red, green, blue, and alpha (RGBA) sample. A 32-plane 1920x1080 resolution MPI stack therefore requires 2.123 Gb. A 30 FPS video sequence would thus require a bandwidth of 63.7 Gb/s for a single camera source, and correspondingly higher bandwidths if multiple camera inputs are used.
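For reference, a quick back-of-the-envelope check of those figures (assuming 8 bits per RGBA sample, as above):

```python
# Uncompressed MPI video bandwidth, per the figures above.
planes, width, height = 32, 1920, 1080
channels, bits_per_sample = 4, 8           # 8-bit R, G, B, A
fps = 30

bits_per_stack = planes * width * height * channels * bits_per_sample
print(bits_per_stack / 1e9)        # ~2.123 Gb per MPI stack
print(bits_per_stack * fps / 1e9)  # ~63.7 Gb/s for a 30 FPS single-camera stream
```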
Obviously, compression must be used in order to make MPI video a viable technology for broad deployment. A custom codec tailored to MPI would likely achieve excellent results, but we have focused on re-using the extensive deployment of traditional video codecs. We focus on HEVC, but our results can be applied to other video codecs such as AV1. The compression techniques we describe here do not rely on any knowledge of the algorithm used to generate the MPI planes and should be generally applicable. We also focus heavily on reducing the pixel rate to make real-time decoding practical on available consumer systems.
We developed a technique to consolidate the 32 RGBA textures of an MPI stack into a single YUVA texture. We can then compress this single texture using a standard HEVC encoder. We achieve real-time playback by decoding on an Intel GPU, expanding the single texture back into a set of 32 textures, and rendering using standard MPI techniques.
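The exact packing layout and color conversion are not detailed here, but a minimal sketch of the general idea might look like the following. The 4x8 atlas arrangement and the BT.709 RGB-to-YUV matrix are illustrative assumptions; the resulting Y, U, V, and A planes would then be passed to a standard HEVC encoder, with the inverse steps (unpacking and YUV-to-RGB conversion) applied after decode to restore the 32 textures for rendering.

```python
import numpy as np

def pack_mpi_atlas(planes, grid=(4, 8)):
    """Tile an MPI stack into one large RGBA atlas texture.

    planes: (num_planes, H, W, 4) RGBA stack; grid rows * cols must equal num_planes.
    The 4x8 layout is an assumption for illustration only.
    """
    rows, cols = grid
    n, h, w, _ = planes.shape
    assert rows * cols == n
    atlas = (planes.reshape(rows, cols, h, w, 4)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(rows * h, cols * w, 4))
    return atlas

def rgba_to_yuva(atlas):
    """Convert the packed RGBA atlas into Y, U, V, A planes for HEVC encoding.

    Uses the BT.709 color matrix as an illustrative choice; the actual
    conversion used by the project is not specified here.
    """
    r, g, b, a = (atlas[..., i] for i in range(4))
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b   # BT.709 luma
    u = (b - y) / 1.8556
    v = (r - y) / 1.5748
    return y, u, v, a

# Example: pack and convert a 32-plane 1920x1080 stack (random stand-in data)
stack = np.random.rand(32, 1080, 1920, 4).astype(np.float32)
y, u, v, a = rgba_to_yuva(pack_mpi_atlas(stack))
```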