Immersive Media Formats

Immersive media comes in several formats. Here’s an overview of media types to help understand which is best for your project.

Photo courtesy of Insta360, the Insta360 Pro 2 camera

Many immersive media experiences are immersive videos, which are both created and played back in a very wide field of view (typically 180 degrees or more), often in a VR headset. The most popular immersive video formats are defined by their respective fields of view (FOVs): 360 video (all the way around) and 180 video (halfway around). Both 360 and 180 videos can be 2D (monoscopic) or 3D (stereoscopic). Because 360 videos are fully spherical, viewers can look around in all directions when viewing in a VR headset. 180 videos are hemispherical, and while they are still convincingly immersive, the back half of the world is black. Immersive videos provide 3 degrees of freedom (3DOF) when viewed in VR: viewers can turn their heads to look around, but they cannot move through the scene.

A good way to think about immersive video formats is that you must choose 1) a FOV, either 360 or 180, and 2) the number of “eyes” to target, either one (monoscopic) or two (stereoscopic).
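To make the two choices concrete, here is a minimal Python sketch that enumerates the four resulting combinations. The class name, field names, and label format are hypothetical and chosen only for illustration; they do not correspond to any particular tool or API.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ImmersiveFormat:
    """Hypothetical description of an immersive video format."""
    fov_degrees: int  # 360 (full sphere) or 180 (hemisphere)
    eyes: int         # 1 = monoscopic (2D), 2 = stereoscopic (3D)

    @property
    def label(self) -> str:
        # Build a short name such as "3D-180" from the two choices.
        dimension = "3D" if self.eyes == 2 else "2D"
        return f"{dimension}-{self.fov_degrees}"


# Every pairing of FOV and eye count: two choices times two choices.
ALL_FORMATS = [ImmersiveFormat(fov, eyes) for fov, eyes in product((360, 180), (1, 2))]

for fmt in ALL_FORMATS:
    print(fmt.label)
```

The sketch simply shows that the format decision is two independent choices, which gives four formats in total.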

As a quick starting point, check out this Immersive Video Formats Video in your Meta Quest 2 headset. It gives examples of the formats that are used in immersive video.

Interactive immersive media experiences are typically created as bespoke VR apps, although WebXR is also a promising (and completely democratized) emerging ecosystem. If you are interested in developing VR apps, head over to the Developer Center.

180° vs 360°

180 VR cameras record half of the world via a single fisheye lens or pair of fisheye lenses (for 3D). Because only half the world is captured, 180 video is relatively easy to produce. Camera operators and a production team can stand behind the camera and are not captured in the scene.

360 VR cameras record everything around the camera by using multiple lenses and sensors. The smallest 360 cameras use two back-to-back fisheye lenses, each of which captures a bit more than 180 degrees. Because there is overlap, the two images can be stitched together into a single 360 scene. Cameras like this record in 2D (monoscopic), as there is only one perspective for any given subject in the frame.

Capturing 3D-360 is more complicated and is typically accomplished by using camera rigs that contain six or more fisheye lenses in a cylindrical arrangement. This allows each subject in the world to be captured from multiple perspectives, and stitching software uses this data during post production to produce a stereoscopic image.

2D vs 3D

For video, 2D is the common name for “monoscopic” (one eye), and 3D is the common name for “stereoscopic” (two eyes). You’re probably already familiar with 3D movies, in which two images are projected simultaneously and 3D glasses are used to present the left and right eyes with different images. 3D immersive videos are similar in that each eye is presented with its own video, but the projection type and field of view differ: immersive videos are spherical and have an ultra-wide field of view.

There is a wide range of 360 VR cameras on the market, from consumer to professional, including 360 action cameras from companies like GoPro and Insta360. In many scenarios, 2D 360 video can provide a sense of immersion, but 3D-360 is much more immersive, as viewers can both look around and see the world in 3D – with a stereoscopic sense of depth. However, capturing video in 3D-360 requires specialized cameras and it is much more complex to produce.

3D-180 has emerged as a popular format due to the availability of high-quality cameras and lenses combined with a relatively simple post production process (as compared to 3D-360).

Equirectangular Projection

Immersive video formats are all spherical in nature, with the most popular formats covering either half of a sphere or all of it. However, spherical captures need to be represented in compatible video formats, which all use rectangular image frames. Equirectangular projection is a spherical projection that fills a rectangular frame by mapping longitude to the horizontal axis and latitude to the vertical axis. It is the standard projection used in immersive video, and all stitching and post production video editing tools support it. If you’ve looked at a map of the Earth, you are already familiar with equirectangular projection: it is very commonly used in maps.
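As a rough illustration of how a viewing direction lands in an equirectangular frame, here is a minimal Python sketch. The function name, the parameter names, and the angle conventions (yaw 0 is straight ahead and increases to the right, pitch 0 is the horizon and increases upward, pixel origin at the top-left) are assumptions made for this example; real tools may use different conventions.

```python
def direction_to_equirect(yaw_deg, pitch_deg, width, height, h_fov_deg=360.0):
    """Map a viewing direction to (x, y) pixel coordinates in an equirectangular frame.

    Assumed conventions (illustrative only):
      yaw_deg   : horizontal angle, 0 = straight ahead, positive = to the right
      pitch_deg : vertical angle, 0 = horizon, +90 = straight up, -90 = straight down
      h_fov_deg : horizontal coverage of the frame, 360 for full sphere, 180 for half
    """
    # Horizontal position is linear in yaw across the covered range.
    u = (yaw_deg + h_fov_deg / 2.0) / h_fov_deg
    # Vertical position is linear in pitch; the top row is straight up.
    v = (90.0 - pitch_deg) / 180.0
    return u * width, v * height


# Looking straight ahead at the horizon lands in the center of the frame.
print(direction_to_equirect(0, 0, 4096, 2048))
```

Because both axes are linear in angle, a full 360 frame is usually twice as wide as it is tall (360 degrees across, 180 degrees down). Passing `h_fov_deg=180.0` sketches the hemispherical case, where the same vertical range is paired with half the horizontal range.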

360 monoscopic equirectangular example

