LighthouseGS: Indoor Structure-aware 3D Gaussian Splatting for Panorama-Style Mobile Captures

Anonymous authors

LighthouseGS synthesizes photorealistic novel views from panorama-style captures taken with a single mobile device, providing a practical framework for indoor scenes.

Abstract

Recent advances in 3D Gaussian Splatting (3DGS) have enabled real-time novel view synthesis (NVS) with impressive quality in indoor scenes. However, achieving high-fidelity rendering requires meticulously captured images covering the entire scene, limiting accessibility for general users. In this work, we propose LighthouseGS, a practical 3DGS-based NVS framework that uses simple panorama-style motion with a handheld camera (e.g., a mobile device). While convenient, this rotation-dominant motion and its narrow baseline make accurate camera pose and 3D point estimation challenging, especially in textureless indoor scenes. To address these challenges, LighthouseGS leverages rough geometric priors, such as mobile device camera poses and monocular depth estimation, and exploits indoor planar structures. We present a new initialization method called plane scaffold assembly to generate consistent 3D points on these structures, followed by a stable pruning strategy to enhance geometry and optimization stability. Additionally, we introduce geometric and photometric corrections to resolve inconsistencies caused by motion drift and auto-exposure in mobile devices. Tested on real and synthetic indoor scenes, LighthouseGS delivers photorealistic rendering, outperforming state-of-the-art methods and enabling applications such as panoramic view synthesis and object placement.

Method

Overview of LighthouseGS. Given consecutive images captured with panorama-style motion and the corresponding rough geometric priors, we construct a plane scaffold that ensures global and local consistency. We then initialize 3D Gaussians aligned with the scene geometry and optimize LighthouseGS with plane-aware stable optimization. To address the motion drift and auto-exposure variation introduced by mobile devices, we additionally correct camera poses and view-dependent colors.
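
To make the idea of placing initial 3D points on planar structures more concrete, the following is a minimal illustrative sketch, not the authors' plane scaffold assembly. It assumes a pinhole intrinsic matrix `K`, a 4x4 camera-to-world pose, and a depth map that is already metric; the function names `backproject`, `fit_plane`, and `snap_to_plane` are hypothetical. It lifts a depth map to world-space points, fits one plane by least squares (via SVD), and projects the points onto that plane.

```python
import numpy as np

def backproject(depth, K, cam_to_world):
    """Lift a depth map (H, W) to world-space points (H*W, 3).

    Assumes a pinhole camera with intrinsics K (3x3) and a 4x4
    camera-to-world pose; both are illustrative inputs.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T              # camera-space rays with z = 1
    pts_cam = rays * depth.reshape(-1, 1)        # scale each ray by its depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]

def fit_plane(points):
    """Least-squares plane n.x + d = 0 through a point set, via SVD."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                              # direction of least variance
    return normal, -float(normal @ centroid)

def snap_to_plane(points, normal, d):
    """Orthogonally project points onto the plane n.x + d = 0."""
    signed_dist = points @ normal + d
    return points - signed_dist[:, None] * normal

# Usage: a noisy fronto-parallel wall at depth 2 (synthetic data).
K = np.array([[100.0, 0.0, 8.0], [0.0, 100.0, 6.0], [0.0, 0.0, 1.0]])
wall = backproject(np.full((12, 16), 2.0), K, np.eye(4))
noisy = wall + np.random.default_rng(0).normal(scale=0.01, size=wall.shape)
normal, d = fit_plane(noisy)
flat = snap_to_plane(noisy, normal, d)           # points now lie on one plane
```

A real pipeline would first have to segment which pixels belong to which plane and reconcile the scale of monocular depth across views; this sketch leaves both steps out.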

Comparisons (Real world)

Comparisons (Synthetic)