CSE5519 Advances in Computer Vision (Topic G: 2025: Correspondence Estimation and Structure from Motion)

MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos

vanilla DROID-SLAM
mono-depth initialization
object movement map prediction
two-stage training scheme

Tip

How does the two-stage training scheme improve the robustness of the model? To me, this paper seems to be just an integration of GeoNet (separate pose and depth networks) with full regression.

Last updated on March 9, 2026
