Aalok Patwardhan* Callum Rhodes* Gwangbin Bae Andrew J. Davison

*Equal Contribution

Dyson Robotics Lab, Imperial College London

Paper arXiv Code

Summary

  • U-ARE-ME provides globally consistent rotation estimates in Manhattan environments across sequences of RGB images, without the need for camera intrinsics.
  • This is done by finding the rotation matrix that aligns the predicted per-pixel surface normals with the principal directions of the scene (a minimal sketch follows this list).
  • Even in non-Manhattan scenes, it can still reliably estimate the global up-direction (i.e. the camera's pitch and roll).
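To make the per-frame alignment step concrete, here is a minimal, hedged sketch (not the paper's exact cost or solver): it assumes `normals` is an (N, 3) array of unit surface normals predicted for a single image, and optimises a rotation over SO(3) so that each normal lines up with one of the three Manhattan axes.

```python
# Minimal sketch: align predicted surface normals to Manhattan axes over SO(3).
# Assumptions (not from the paper): `normals` is an (N, 3) array of unit
# normals; the cost and solver below are illustrative stand-ins.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation


def manhattan_cost(rotvec, normals):
    """Penalise normals that do not align with any rotated Manhattan axis."""
    R = Rotation.from_rotvec(rotvec).as_matrix()
    m = normals @ R  # each row is R^T n_i, the normal expressed in the Manhattan frame
    # A normal aligned with +-x, +-y or +-z has one component of magnitude 1,
    # so its residual below is 0; misaligned normals are penalised.
    return np.sum(1.0 - np.max(np.abs(m), axis=1))


def estimate_rotation(normals, init_rotvec=np.zeros(3)):
    """Rotation (3x3 matrix) that best aligns the normals with the scene axes."""
    result = minimize(manhattan_cost, init_rotvec, args=(normals,), method="Nelder-Mead")
    return Rotation.from_rotvec(result.x).as_matrix()
```

In the actual method, the per-frame optimisation also produces an uncertainty estimate, which is what enables the multi-frame fusion described in the abstract below.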

Demo

Abstract

Camera rotation estimation from a single image is a challenging task, often requiring depth data and/or camera intrinsics, which are generally not available for in-the-wild videos. Although external sensors such as inertial measurement units (IMUs) can help, they often suffer from drift and are not applicable in non-inertial reference frames. We present U-ARE-ME, an algorithm that estimates camera rotation along with uncertainty from uncalibrated RGB images. Using a Manhattan World assumption, our method leverages the per-pixel geometric priors encoded in single-image surface normal predictions and performs optimisation over the SO(3) manifold. Given a sequence of images, we can use the per-frame rotation estimates and their uncertainty to perform multi-frame optimisation, achieving robustness and temporal consistency. Our experiments demonstrate that U-ARE-ME performs comparably to RGB-D methods and is more robust than sparse feature-based SLAM methods.
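As a rough illustration of the multi-frame step, the sketch below fuses per-frame rotation estimates using their uncertainties as weights. The scalar uncertainties and the weighted chordal mean are simplifying assumptions made for illustration, not the paper's actual formulation.

```python
# Minimal sketch: uncertainty-weighted fusion of per-frame rotations.
# Assumption (not the paper's formulation): each frame provides a 3x3 rotation
# and a scalar uncertainty; the fused estimate is the weighted chordal mean,
# projected back onto SO(3) with an SVD.
import numpy as np


def project_to_so3(M):
    """Closest rotation matrix to M in the Frobenius norm."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    return U @ D @ Vt


def fuse_rotations(rotations, uncertainties):
    """Fuse per-frame rotations, weighting each by its inverse uncertainty."""
    weights = 1.0 / np.asarray(uncertainties, dtype=float)
    M = sum(w * R for w, R in zip(weights, rotations))
    return project_to_so3(M)
```

Weighting by inverse uncertainty means frames with confident normal predictions dominate the fused estimate, which gives the temporal consistency described above.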

Acknowledgement

This research has been supported by the EPSRC Prosperity Partnership Award with Dyson Technology Ltd.