CSE5519 Advances in Computer Vision (Topic E: 2022: Deep Learning for Geometric Computer Vision)

Map-free Visual Relocalization: Metric Pose Relative to a Single Image

This paper proposes a map-free visual relocalization method that estimates the metric pose of a query image relative to a single reference image, so only two images are used in total.

Novelty in Map-free Visual Relocalization

Use a single image as the reference view and estimate the metric pose of a query image, that is, with translation in absolute scale rather than only up to an unknown scale factor.
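To see why scale is the hard part with only two images, here is a minimal sketch of the scale ambiguity in two-view geometry. It is not code from the paper: the pinhole intrinsics `K`, the random points, and the helper `project` are made-up values and names for illustration. Scaling the scene and the camera translation by the same factor leaves every pixel unchanged, so correspondences alone cannot fix the metric scale.

```python
import numpy as np

def project(points_world, R, t, K):
    """Project Nx3 world points using a world-to-camera pose (R, t) and intrinsics K."""
    cam = points_world @ R.T + t      # points in camera coordinates
    pix = cam @ K.T                   # homogeneous pixel coordinates
    return pix[:, :2] / pix[:, 2:3]   # perspective division

# Hypothetical pinhole intrinsics and scene points (assumptions for this demo).
rng = np.random.default_rng(0)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
points = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))

# Query camera: small rotation about the y axis plus a translation.
angle = np.deg2rad(10)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.5, 0.0, 0.1])

# Scale the whole scene and the translation by the same factor s.
s = 3.0
pix_original = project(points, R, t, K)
pix_scaled = project(s * points, R, s * t, K)

# Both configurations produce identical images.
scale_is_ambiguous = np.allclose(pix_original, pix_scaled)
print(scale_is_ambiguous)
```

Recovering a metric pose therefore needs information beyond the matched pixels, such as a learned prior about the size of the scene.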

Use relative pose regression (RPR) to estimate the pose of the query image relative to the reference view.
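As a rough illustration of the quantity a relative pose regressor is trained to predict, the sketch below computes the relative pose from two absolute camera poses and scores an estimate with the usual rotation and translation errors. It assumes world-to-camera poses (`x_cam = R @ x_world + t`); the helper names and the example numbers are hypothetical, and this is not the paper's network or its exact evaluation protocol.

```python
import numpy as np

def rot_y(deg):
    """Rotation matrix for a rotation of `deg` degrees about the y axis."""
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

def relative_pose(R_ref, t_ref, R_qry, t_qry):
    """Pose that maps reference-camera coordinates to query-camera coordinates."""
    R_rel = R_qry @ R_ref.T
    t_rel = t_qry - R_rel @ t_ref
    return R_rel, t_rel

def rotation_error_deg(R_est, R_gt):
    """Angle of the residual rotation between estimate and ground truth."""
    cos = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def translation_error(t_est, t_gt):
    """Euclidean distance between translations, in the units of t (e.g. metres)."""
    return float(np.linalg.norm(t_est - t_gt))

# Made-up absolute poses for the reference and query cameras.
R_ref, t_ref = rot_y(30), np.array([0.2, 0.0, 1.0])
R_qry, t_qry = rot_y(50), np.array([1.0, 0.0, 0.5])
R_rel, t_rel = relative_pose(R_ref, t_ref, R_qry, t_qry)

# Sanity check: a world point seen in both cameras is linked by the relative pose.
x_world = np.array([0.3, -0.2, 5.0])
x_ref = R_ref @ x_world + t_ref
x_qry = R_qry @ x_world + t_qry

# A hypothetical imperfect estimate, scored against the true relative pose.
R_est, t_est = rot_y(25), t_rel + np.array([0.0, 0.0, 0.3])
rot_err = rotation_error_deg(R_est, R_rel)
trans_err = translation_error(t_est, t_rel)
print(rot_err, trans_err)
```

Because the translation error is measured in absolute units, an estimate that is only correct up to scale would still be penalized, which is what makes the metric setting stricter.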

Tip

This paper reminds me of the Disparity net, which estimates disparity from a left image and a right image. They also feed the data through several ResNet layers.

After reading this paper, I’m impressed by the custom dataset of over 600 places of interest around the world. The authors consider the variation (lighting conditions, seasonal changes, etc.) between frames in traditional datasets to be too low compared with their custom dataset, whose photos were taken by different people with different equipment. It provides a good benchmark for evaluating the performance of the model.

However, I didn’t see how the model performs on traditional datasets. How well does it adapt to them? Can map-free visual relocalization improve performance on traditional datasets, or on tasks like structure from motion or pose estimation? I also wonder how long the model takes to train, and how efficient it is in memory usage and inference time compared with traditional models.