Researchers develop framework for creating sharp 4D reconstructions from blurry videos

Researchers have developed a new framework that can create sharp neural radiance fields from blurry monocular videos, captured from everyday handheld devices.

Neural Radiance Fields (NeRF) is a machine learning technique that creates 3D reconstructions of a scene from 2D images captured from multiple angles and can render the scene from entirely new perspectives.

While well established for static images, existing methods struggle when using monocular videos as input due to motion blur. Now, researchers have developed MoBluRF: a two-stage framework that enables creation of accurate, sharp 4D (dynamic 3D) NeRFs from blurry videos, captured from everyday handheld devices.

Neural Radiance Fields (NeRF) is a technique that creates three-dimensional (3D) representations of a scene from a set of two-dimensional (2D) images, captured from different angles. It works by training a deep neural network to predict the colour and density at any point in 3D space.

To do this, it casts imaginary light rays from the camera through each pixel in all input images, sampling points along those rays with their 3D coordinates and viewing direction. Using this information, NeRF reconstructs the scene in 3D and can render it from entirely new perspectives, a process known as novel view synthesis (NVS).
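The ray casting and sampling described above can be illustrated with a minimal NumPy sketch. It is not the MoBluRF code: the pinhole camera model, the axis convention (camera looking down the negative z axis), evenly spaced samples, and the function names pixel_rays, sample_points and composite are all assumptions made for illustration. The final step uses the standard NeRF-style compositing of per-sample colour and density into one pixel colour, with the trained network replaced by arrays supplied by the caller.

```python
import numpy as np

def pixel_rays(height, width, focal, cam_to_world):
    """Return ray origins and unit directions for every pixel of a pinhole camera."""
    j, i = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    # Assumed convention: x points right, y points up, the camera looks down -z.
    dirs_cam = np.stack([(i - 0.5 * width) / focal,
                         -(j - 0.5 * height) / focal,
                         -np.ones_like(i, dtype=float)], axis=-1)
    rot, origin = cam_to_world[:3, :3], cam_to_world[:3, 3]
    dirs_world = dirs_cam @ rot.T                      # rotate into world space
    dirs_world /= np.linalg.norm(dirs_world, axis=-1, keepdims=True)
    origins = np.broadcast_to(origin, dirs_world.shape)
    return origins, dirs_world

def sample_points(origins, dirs, near, far, n_samples):
    """Sample n_samples evenly spaced 3D points along each ray."""
    t = np.linspace(near, far, n_samples)
    points = origins[..., None, :] + t[:, None] * dirs[..., None, :]
    return points, t

def composite(colours, densities, t):
    """Blend per-sample colours (..., N, 3) and densities (..., N) into pixel colours."""
    deltas = np.diff(t)
    deltas = np.append(deltas, deltas[-1])             # reuse last spacing for final sample
    alpha = 1.0 - np.exp(-densities * deltas)          # opacity of each sample
    # Transmittance: fraction of light that reaches each sample unblocked.
    ones = np.ones_like(alpha[..., :1])
    trans = np.cumprod(np.concatenate([ones, 1.0 - alpha[..., :-1]], axis=-1), axis=-1)
    weights = alpha * trans
    return (weights[..., None] * colours).sum(axis=-2)
```

In a real NeRF, the colours and densities passed to composite would come from the neural network, queried at each sampled point together with the ray's viewing direction.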

Beyond still images, a video can also be used, with each frame of the video treated as a static image. However, existing methods are highly sensitive to video quality. Videos captured with a single camera, such as those from a phone or drone, inevitably suffer from motion blur caused by fast object motion or camera shake, which makes sharp, dynamic NVS difficult. Most existing deblurring-based NVS methods are designed for static multi-view images and fail to account for global camera motion and local object motion. In addition, blurry videos often lead to inaccurate camera pose estimates and a loss of geometric precision.

To address these issues, a research team jointly led by Assistant Professor Jihyong Oh from the Graduate School of Advanced Imaging Science (GSIAM) at Chung-Ang University (CAU) in Korea and Professor Munchurl Kim from Korea Advanced Institute of Science and Technology (KAIST), Korea, along with Mr. Minh-Quan Viet Bui and Mr. Jongmin Park, developed MoBluRF, a two-stage motion deblurring method for NeRFs.

“Our framework is capable of reconstructing sharp 4D scenes and enabling NVS from blurry monocular videos using motion decomposition, while avoiding mask supervision, significantly advancing the NeRF field,” explains Dr. Oh.

Their study was made available online on May 28, 2025, and was published in Volume 47, Issue 9 of the IEEE Transactions on Pattern Analysis and Machine Intelligence on September 1, 2025.

MoBluRF consists of two main stages: Base Ray Initialisation (BRI) and Motion Decomposition-based Deblurring (MDD). Existing deblurring-based NVS methods attempt to predict hidden sharp light rays in blurry images, called latent sharp rays, by transforming a ray called the base ray. However, directly using input rays in blurry images as base rays can lead to inaccurate prediction. BRI addresses this issue by roughly reconstructing dynamic 3D scenes from blurry videos and refining the initialisation of “base rays” from imprecise camera rays.

Next, these base rays are used in the MDD stage to accurately predict latent sharp rays through an Incremental Latent Sharp-rays Prediction (ILSP) method. ILSP incrementally decomposes motion blur into global camera motion and local object motion components, greatly improving the deblurring accuracy. MoBluRF also introduces two novel loss functions, one that separates static and dynamic regions without motion masks, and another that improves geometric accuracy of dynamic objects, two areas where previous methods struggled.
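The idea of turning one base ray into several latent sharp rays can be sketched in a few lines of Python. This is a toy illustration and not the authors' ILSP formulation, which the article does not spell out. It assumes that each latent sharp ray is obtained by applying a rigid global camera motion to the base ray, followed by a rigid local object motion for rays that hit dynamic regions, and that the observed blurry colour is the average of the sharp colours along those rays. The functions transform_ray, latent_sharp_rays and blurry_colour, and the render_sharp callback, are hypothetical names.

```python
import numpy as np

def rotation_z(angle):
    """Rotation matrix about the z axis, used here as an example camera rotation."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def transform_ray(origin, direction, rot, trans):
    """Apply a rigid motion to a ray: rotate origin and direction, translate the origin."""
    return rot @ origin + trans, rot @ direction

def latent_sharp_rays(base_origin, base_dir, global_motions, local_motions, is_dynamic):
    """Derive one latent ray per (global, local) motion pair from a single base ray.

    Assumption: global camera motion is applied first, and local object motion
    is applied on top of it only for rays that belong to a dynamic region.
    """
    rays = []
    for (rot_g, trans_g), (rot_l, trans_l) in zip(global_motions, local_motions):
        o, d = transform_ray(base_origin, base_dir, rot_g, trans_g)
        if is_dynamic:
            o, d = transform_ray(o, d, rot_l, trans_l)
        rays.append((o, d))
    return rays

def blurry_colour(rays, render_sharp):
    """Model the blurry pixel as the mean of sharp colours rendered along latent rays."""
    return np.mean([render_sharp(o, d) for o, d in rays], axis=0)
```

Under this assumed model, training would compare blurry_colour with the pixel colour observed in the blurry frame, so that the individual latent rays are pushed to explain the sharp scene. Splitting the motion into a shared camera component and a per-object component keeps static regions from being distorted by object motion.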

Owing to this design, MoBluRF outperforms state-of-the-art methods by significant margins across various datasets, both quantitatively and qualitatively. It is also robust to varying degrees of blur.

“By enabling deblurring and 3D reconstruction from casual handheld captures, our framework enables smartphones and other consumer devices to produce sharper and more immersive content,” remarks Dr. Oh. “It could also help create crisp 3D models of shaky footage from museums, improve scene understanding and safety for robots and drones, and reduce the need for specialised capture setups in virtual and augmented reality.”

MoBluRF marks a new direction for NeRFs, enabling high-quality 3D reconstructions from ordinary blurry videos recorded with everyday devices.

Chung-Ang University is a leading private research university in Seoul, South Korea, dedicated to shaping global leaders for an evolving world. Founded in 1916 and achieving university status in 1953, it combines academic tradition with a strong commitment to innovation.
