3D-Image-Matching-with-LoFTR

Research Paper

LoFTR: Detector-Free Local Feature Matching with Transformers

Jiaming Sun*, Zehong Shen*, Yu'ang Wang*, Hujun Bao, Xiaowei Zhou. CVPR 2021.

Introduction

LoFTR stands for Detector-Free Local Feature Matching with Transformers. It can extract high-quality semi-dense matches even in indistinctive regions with low texture, motion blur, or repetitive patterns. Instead of performing image feature detection, description, and matching sequentially, the paper proposes to first establish pixel-wise dense matches at a coarse level and then refine the good matches at a fine level. In contrast to dense methods that use a cost volume to search for correspondences, LoFTR uses self-attention and cross-attention layers in a Transformer to obtain feature descriptors that are conditioned on both images. The global receptive field provided by the Transformer enables LoFTR to produce dense matches in low-texture areas, where feature detectors usually struggle to produce repeatable interest points. The paper's experiments on indoor and outdoor datasets show LoFTR outperforming the state-of-the-art methods it was compared against by a large margin. The paper also reports that LoFTR ranked first among published methods on two public visual localization benchmarks.
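To make the idea of descriptors "conditioned on both images" concrete, here is a minimal NumPy sketch of one self-attention step followed by one cross-attention step. It is an illustration under simplifying assumptions, not the LoFTR implementation: it uses plain single-head scaled dot-product attention with no learned projections, positional encoding, or normalization layers, and the names `attention`, `feat_a`, and `feat_b` are hypothetical.

```python
import numpy as np

def attention(query, key, value):
    """Single-head scaled dot-product attention without learned projections."""
    scores = query @ key.T / np.sqrt(query.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ value

rng = np.random.default_rng(0)
feat_a = rng.normal(size=(6, 8))  # 6 flattened coarse locations of image A, 8 channels
feat_b = rng.normal(size=(5, 8))  # 5 flattened coarse locations of image B, 8 channels

# Self-attention: every location attends to all locations of its own image.
feat_a_self = feat_a + attention(feat_a, feat_a, feat_a)
feat_b_self = feat_b + attention(feat_b, feat_b, feat_b)

# Cross-attention: queries come from one image, keys and values from the other,
# so each output descriptor depends on the content of both images.
feat_a_cross = feat_a_self + attention(feat_a_self, feat_b_self, feat_b_self)
feat_b_cross = feat_b_self + attention(feat_b_self, feat_a_self, feat_a_self)
```

Because every query attends to every key, each output location can draw on the whole image, which is the global receptive field the paragraph above refers to.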

LoFTR has four components:

A local feature CNN extracts the coarse-level feature maps F̃(A) and F̃(B), together with the fine-level feature maps F̂(A) and F̂(B), from the image pair I(A) and I(B).
The coarse feature maps are flattened to 1-D vectors and summed with a positional encoding. The resulting features are then processed by the Local Feature TRansformer (LoFTR) module, which interleaves self-attention and cross-attention layers N(c) times.
A differentiable matching layer matches the transformed features and produces a confidence matrix P(c). Entries of P(c) that pass the confidence threshold and satisfy the mutual-nearest-neighbor criterion are selected, yielding the coarse-level match prediction M(c).
For every selected coarse prediction (i, j) ∈ M(c), a local window of size w × w is cropped from the fine-level feature maps. Each coarse match is refined within this window to sub-pixel accuracy, giving the final match prediction M(f).
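The last two components can be sketched in a few lines of NumPy. This is a simplified illustration, not the repository's code. It assumes a dual-softmax matching layer (one of the differentiable matching options described in the paper) and an expectation over a softmax heatmap for the refinement. The function names, the `temperature` and `threshold` values, and the toy features are all hypothetical.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def coarse_matches(feat_a, feat_b, temperature=0.1, threshold=0.2):
    """Return the confidence matrix P(c) and the selected index pairs M(c)."""
    sim = feat_a @ feat_b.T / temperature
    # Dual-softmax: softmax over both dimensions of the score matrix, multiplied.
    conf = softmax(sim, axis=0) * softmax(sim, axis=1)
    # Mutual nearest neighbors: the entry is the maximum of its row and its column.
    row_best = conf == conf.max(axis=1, keepdims=True)
    col_best = conf == conf.max(axis=0, keepdims=True)
    mask = (conf > threshold) & row_best & col_best
    i, j = np.nonzero(mask)
    return conf, list(zip(i.tolist(), j.tolist()))

def refine_in_window(center_feat, window_feats):
    """Sub-pixel offset (dx, dy) from the window center, in fine-level pixels.

    center_feat: (c,) fine feature at the match location in image A.
    window_feats: (w, w, c) crop of the fine feature map of image B.
    """
    w = window_feats.shape[0]
    heat = softmax((window_feats @ center_feat).ravel(), axis=0).reshape(w, w)
    ys, xs = np.mgrid[0:w, 0:w]
    center = (w - 1) / 2
    # The expected position under the heatmap gives a sub-pixel estimate.
    return float((heat * xs).sum() - center), float((heat * ys).sum() - center)

# Toy coarse features: image B holds the same three descriptors in another order.
feat_a = np.eye(3)
feat_b = np.eye(3)[[2, 0, 1]]
conf, matches = coarse_matches(feat_a, feat_b)
print(matches)  # [(0, 1), (1, 2), (2, 0)]

# Toy fine window: a single strong response one pixel right of and above center.
window = np.zeros((5, 5, 2))
window[1, 3] = [50.0, 0.0]
dx, dy = refine_in_window(np.array([1.0, 0.0]), window)
```

In the toy window the heatmap concentrates on the single strong response, so the offset comes out at roughly one pixel to the right (`dx` ≈ 1) and one pixel up (`dy` ≈ -1).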

Weights

To download the model weights, click here!

Results

Copyright

This work is affiliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.
