VisIRNet

VisIRNet is a deep learning model for aligning visible and infrared image pairs captured by UAVs, built as a two-branched CNN pipeline followed by a registration block. This repository contains the code and resources for the paper "VisIRNet: Deep Image Alignment for UAV-Taken Visible and Infrared Image Pairs", published in IEEE Transactions on Geoscience and Remote Sensing.

Table of Contents

  • Getting Started
  • Data
  • Usage
  • Model
  • Experiments and Results
  • Citations
  • License

Getting Started

Steps

  1. Clone the repository:
    git clone https://github.com/ozerlabs-proxy/VisIrNet.git
    cd VisIrNet
  2. Create a virtual environment and activate it:
    conda create -n VisIrNet python=3.10
    conda activate VisIrNet
  3. Install the required packages:
    pip install -r requirements.txt

Data

  1. Create a data directory under VisIrNet/:

    mkdir data
  2. Link the datasets into data (run from the repository root; a sketch of the linking script follows this list):

    python ./scripts/link_datasets_to_data.py

    or manually:

    cd data 
    ln -s ~/Datasets/GoogleEarth .
    ln -s ~/Datasets/MSCOCO .
    ln -s ~/Datasets/SkyData .
    ln -s ~/Datasets/VEDAI .
    ln -s ~/Datasets/GoogleMap .
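
For reference, here is a minimal sketch of what a linking script such as scripts/link_datasets_to_data.py might do, assuming the datasets live under ~/Datasets (the actual script in the repository may differ):

#!/usr/bin/env python3
"""Symlink external dataset folders into ./data (illustrative sketch)."""
from pathlib import Path

# Hypothetical dataset location; adjust to where your datasets live.
DATASET_ROOT = Path.home() / "Datasets"
DATASETS = ["GoogleEarth", "MSCOCO", "SkyData", "VEDAI", "GoogleMap"]

def main() -> None:
    data_dir = Path("data")
    data_dir.mkdir(exist_ok=True)
    for name in DATASETS:
        source = DATASET_ROOT / name
        target = data_dir / name
        if target.exists() or target.is_symlink():
            print(f"skipping {target}: already exists")
            continue
        target.symlink_to(source, target_is_directory=True)
        print(f"linked {target} -> {source}")

if __name__ == "__main__":
    main()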

Usage

Training

We ran our experiments on a GPU cluster managed with Slurm. Scripts for both training and inference are provided. The configs folder contains configuration files for models, datasets, loss functions, etc. Pass your chosen configuration file to the training script (feel free to adjust the settings).

# train locally
conda activate VisIrNet
python Train.py --config-file skydata_default_config.json

OR

# train with slurm
sbatch slurm-training.sh
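
The JSON config schema is defined by the repository's configs folder; purely as an illustration of how Train.py might consume the --config-file flag, here is a hedged sketch (the field names below are hypothetical):

import argparse
import json

# Illustrative sketch; the real Train.py may parse its config differently.
parser = argparse.ArgumentParser(description="VisIrNet training (sketch)")
parser.add_argument("--config-file", required=True,
                    help="config name in configs/, e.g. skydata_default_config.json")
args = parser.parse_args()

with open(f"configs/{args.config_file}") as f:
    config = json.load(f)

# Hypothetical fields, shown only to convey the idea of a config file.
print("dataset:", config.get("dataset", "SkyData"))
print("batch size:", config.get("batch_size", 16))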

Inference

#1. inference locally
conda activate VisIrNet
python Test.py --config-file skydata_default_config.json --r_loss_function l2_corners_loss --b_loss_function ssim_pixel

OR

#2. inference with slurm
sbatch slurm-inference.sh

Visualize plots

Visualize the training logs with TensorBoard:

# make sure conda env is activated
conda activate VisIrNet
tensorboard --logdir logs/tensorboard

Model

The VisIRNet architecture pairs a two-branched CNN feature pipeline with a registration block to handle the challenges of aligning visible and infrared images. Refer to the paper for detailed information about the model architecture and design choices.
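
Purely as an orientation aid, the two-branch-plus-registration-block idea can be pictured with a minimal Keras sketch; every layer choice below is an illustrative assumption, not the published architecture:

from tensorflow.keras import layers, Model

def conv_branch(name: str, channels: int) -> Model:
    """One CNN feature branch (depths and widths are illustrative)."""
    inp = layers.Input(shape=(128, 128, channels))
    x = inp
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    return Model(inp, x, name=name)

vis_in = layers.Input(shape=(128, 128, 3), name="visible")
ir_in = layers.Input(shape=(128, 128, 1), name="infrared")

# Two independent branches, one per modality.
vis_feat = conv_branch("visible_branch", 3)(vis_in)
ir_feat = conv_branch("infrared_branch", 1)(ir_in)

# Stand-in for the registration block: fuse features and regress the
# 8 values of four (x, y) corner offsets.
x = layers.Concatenate()([vis_feat, ir_feat])
x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
corners = layers.Dense(8, name="corner_offsets")(x)

model = Model([vis_in, ir_in], corners, name="two_branch_sketch")
model.summary()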

Experiments and Results

The model was trained and tested on the SkyData, VEDAI, Google Earth, Google Maps, and MSCOCO datasets. The following sections summarize the loss configurations used and the quantitative results reported in the paper.

Backbone loss choices

  • ✓ mean_squared_error (mse_pixel) "l2"
  • ✓ mean_absolute_error (mae_pixel) "l1"
  • ✓ sum_squared_error (sse_pixel)
  • ✓ structural_similarity (ssim_pixel)
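
For instance, the SSIM-based pixel loss could be written as follows (this uses tf.image.ssim; whether the repository computes it exactly this way is an assumption):

import tensorflow as tf

def ssim_pixel_loss(y_true: tf.Tensor, y_pred: tf.Tensor) -> tf.Tensor:
    """1 - SSIM as a pixel-level loss (illustrative sketch).

    Both tensors have shape (batch, H, W, C) with values in [0, 1].
    """
    ssim = tf.image.ssim(y_true, y_pred, max_val=1.0)
    return 1.0 - tf.reduce_mean(ssim)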

Registration loss choices

  • ✓ l1_homography_loss
  • ✓ l2_homography_loss
  • ✓ l1_corners_loss
  • ✓ l2_corners_loss
Backbones with different losses and datasets

For SkyData and VEDAI, a backbone was trained with each of the four pixel losses (mse_pixel, mae_pixel, sse_pixel, ssim_pixel). For each dataset, a registration head was then trained on the different backbones; the combinations shown use l2_corners_loss as the regression loss.
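
As a concrete illustration of the corner-based registration loss, here is a NumPy sketch of an L2 corners loss (mean Euclidean distance between predicted and ground-truth corner positions); the exact formula used in the repository may differ:

import numpy as np

def l2_corners_loss(pred_corners: np.ndarray, gt_corners: np.ndarray) -> float:
    """Mean L2 distance between predicted and ground-truth corners.

    Both arrays have shape (batch, 4, 2): four (x, y) corner
    coordinates per image pair. Illustrative sketch only.
    """
    # Euclidean distance per corner, averaged over corners and batch.
    dists = np.linalg.norm(pred_corners - gt_corners, axis=-1)
    return float(dists.mean())

# Tiny usage example with one image pair.
gt = np.array([[[0, 0], [128, 0], [128, 128], [0, 128]]], dtype=np.float32)
pred = gt + 2.0  # every corner off by (2, 2)
print(l2_corners_loss(pred, gt))  # ~2.83 (= sqrt(8))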

Quantitative Results

Per-dataset result tables for SkyData, VEDAI, Google Earth, Google Maps, and MSCOCO are reported in the paper.

Qualitative Results

Qualitative alignment examples are likewise shown in the paper.

Important Notebooks

Citations

If you use this code in your research, please cite our paper:

@article{ozer2024visirnet,
  title={VisIRNet: Deep Image Alignment for UAV-Taken Visible and Infrared Image Pairs},
  author={{\"O}zer, Sedat and Ndigande, Alain P},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  volume={62},
  pages={1--11},
  year={2024},
  publisher={IEEE}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

