Hi folks,

I've read your GTSfM paper - nice work, and thanks for pushing it to arXiv. I enjoyed reading it and appreciate the huge effort that went into building it. I am very surprised by the conclusion that SuperPoint+SuperGlue/LightGlue is not as good as SIFT - we've consistently observed the exact opposite with incremental SfM (COLMAP) on a range of easy and difficult datasets (ETH3D, IMC 2020/21/22/23). I went through the code but didn't find anything obvious.
The point clouds of SP+SG/LG look pretty sparse on several datasets, as do the matches in Fig. 3.
> the shorter image side is resized to at most 760 pixels in length
So that'd give a 1351×760 px image for a 1920×1080 input - this seems fine.
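For concreteness, here is a minimal sketch of that resizing rule as I read it (`resized_shape` is a hypothetical helper, not GTSfM's actual code):

```python
# Minimal sketch of the resizing rule quoted above: the shorter image side is
# capped at 760 px, keeping the aspect ratio.
def resized_shape(width: int, height: int, max_short_side: int = 760) -> tuple[int, int]:
    scale = min(1.0, max_short_side / min(width, height))
    return round(width * scale), round(height * scale)

print(resized_shape(1920, 1080))  # -> (1351, 760)
```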
> A maximum of 5000 keypoints are used for each of the following front-ends
Do you know how many points are actually extracted by SuperPoint per image? How often is the 5k limit hit, compared to SIFT?
Do I understand correctly that you use the default settings (`gtsfm/gtsfm/frontend/detector_descriptor/superpoint.py`, lines 45 to 46 at `1b55b76`)? Did you try to tweak them? With the defaults, SuperPoint cannot return 5k keypoints on images of this size, unlike SIFT. I recommend the following:
- decrease the detection threshold: `keypoint_threshold=0.001`
- decrease the NMS radius: `nms_radius=3`
- if images are smaller than the limit (760 px on the shorter side), upsample them
This should make SuperPoint competitive with SIFT in terms of keypoint detection.
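Concretely, the suggested settings would look something like this, following the config-dict convention of the reference SuperPoint implementation (GTSfM's wrapper may expose them differently):

```python
# Suggested SuperPoint settings (keys follow the reference implementation in
# magicleap/SuperGluePretrainedNetwork; GTSfM's wrapper may name them differently).
superpoint_config = {
    "keypoint_threshold": 0.001,  # default is 0.005; lower -> more detections
    "nms_radius": 3,              # default is 4; smaller radius -> denser keypoints
    "max_keypoints": 5000,        # match the keypoint budget given to SIFT
}
```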
We do know that these deep matchers are more easily tricked by symmetries, as you point out in Fig. 3. Table 3 seems to confirm this: compared to SIFT, the mean front-end errors are much higher than the medians, and there are many more VG outliers, especially on South Building and Crane.
Did you try tuning the filtering thresholds (minimum number of inliers, cycle consistency) for each front-end? 15 inliers and 7° seem pretty loose for front-ends with high recall.
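For reference, a sketch of the rotation cycle-consistency test I have in mind (my reading, not necessarily how GTSfM implements it):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

# Compose the relative rotations around a triplet (i, j, k) and measure how
# far the loop deviates from the identity, in degrees.
def cycle_error_deg(R_ij: np.ndarray, R_jk: np.ndarray, R_ki: np.ndarray) -> float:
    loop = R_ki @ R_jk @ R_ij  # identity for perfectly consistent edges
    return np.degrees(R.from_matrix(loop).magnitude())

# Edges in triplets with cycle_error_deg(...) above the threshold (7 deg here)
# would be filtered out; tightening it trades recall for precision.
```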
Did you try running the averaging+BA on edges that are inliers according to the GT poses?
It seems that the motion averaging does not have any robustness built in. Zhang et al. (ICCV 2023) show that using a robust cost function is critical (their Table 5) and that weighting by inlier count or two-view covariance often helps. Did you try this? That paper also shows that SuperPoint+SuperGlue can work perfectly fine for global SfM.
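To illustrate the kind of robustness meant here, a rough sketch of an IRLS-style edge weight with a normalized Geman-McClure kernel, optionally scaled by the two-view inlier count (an illustration of the idea in Zhang et al., not their algorithm or GTSfM's code):

```python
import numpy as np

# IRLS-style weight for a relative-rotation residual (angular error in
# degrees): the normalized Geman-McClure kernel downweights gross outliers,
# and the per-edge weight can additionally be scaled by the inlier count.
def edge_weight(residual_deg: float, sigma: float = 5.0, num_inliers: int = 1) -> float:
    gm = (sigma**2 / (sigma**2 + residual_deg**2)) ** 2  # normalized GM weight
    return gm * np.sqrt(num_inliers)

print(edge_weight(1.0), edge_weight(30.0))  # ~0.92 vs ~0.0007: outliers vanish
```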
Thanks!

cc @Phil26AT @ducha-aiki

Thanks for your comments! We'll certainly discuss and try some of the things you suggest. We were not rooting for SIFT in any way :-) We appreciate the advice and hope to simply get the best possible performance.