Official PyTorch implementation of the paper "CLIP-Driven Fine-grained Text-Image Person Re-identification".
We use a single RTX 3090 (24 GB) GPU for training and evaluation.
- Python 3.6.9
- PyTorch 1.7.0
- torchvision 0.8.1
- SciPy 1.2.1
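Before training, it can help to confirm the pinned packages above are importable. The sketch below is a minimal, hypothetical environment check (not part of this repository); the package names come from the requirements list, and the helper name `missing_packages` is our own.

```python
import importlib.util

# Packages pinned in the requirements list above.
REQUIRED = ["torch", "torchvision", "scipy"]

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    absent = missing_packages(REQUIRED)
    if absent:
        print("Missing:", ", ".join(absent))
    else:
        print("Environment OK")
```

Note this only checks presence, not the exact versions listed above.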
Download the CUHK-PEDES dataset from here, the ICFG-PEDES dataset from here, and the RSTPReid dataset from here.
Organize them in your dataset root directory as follows:
|-- your dataset root dir/
|   |-- <CUHK-PEDES>/
|   |   |-- imgs/
|   |   |   |-- cam_a/
|   |   |   |-- cam_b/
|   |   |   |-- ...
|   |   |-- reid_raw.json
|   |
|   |-- <ICFG-PEDES>/
|   |   |-- imgs/
|   |   |   |-- test/
|   |   |   |-- train/
|   |   |-- ICFG_PEDES.json
|   |
|   |-- <RSTPReid>/
|   |   |-- imgs/
|   |   |-- data_captions.json
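A quick way to catch layout mistakes is to verify the expected paths before running the data preparation script. This is a hedged sketch: the directory and file names are taken from the tree above, and `check_layout` is a hypothetical helper, not part of this codebase.

```python
import os

# Expected per-dataset entries, as described in the directory tree above.
EXPECTED = {
    "CUHK-PEDES": ["imgs", "reid_raw.json"],
    "ICFG-PEDES": ["imgs", "ICFG_PEDES.json"],
    "RSTPReid": ["imgs", "data_captions.json"],
}

def check_layout(root):
    """Return a list of expected paths missing under the dataset root."""
    missing = []
    for dataset, entries in EXPECTED.items():
        for entry in entries:
            path = os.path.join(root, dataset, entry)
            if not os.path.exists(path):
                missing.append(path)
    return missing

if __name__ == "__main__":
    print(check_layout("your dataset root dir"))
```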
- Run data.sh (or download the processed files from here)
- Copy files test_reid.json, train_reid.json and val_reid.json to project_directory/cuhkpedes/processed_data/
python train.py
python test.py
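Evaluation for text-to-image person re-identification is typically reported as CMC Rank-k accuracy. The snippet below is only an illustration of that standard metric under our own naming; `test.py` computes the repository's actual metrics.

```python
import numpy as np

def rank_k_accuracy(similarity, query_ids, gallery_ids, k=1):
    """Fraction of queries whose top-k gallery matches contain the true identity.

    similarity: (num_queries, num_gallery) score matrix, higher is better.
    """
    # Gallery indices sorted by descending similarity, truncated to top-k.
    order = np.argsort(-similarity, axis=1)[:, :k]
    # A query is a hit if any of its top-k gallery items share its identity.
    matches = gallery_ids[order] == query_ids[:, None]
    return matches.any(axis=1).mean()
```

For example, with two queries each matching a distinct gallery identity, a similarity matrix that ranks the correct item first for both queries yields a Rank-1 accuracy of 1.0.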
Our code is extended from the following repositories. We sincerely appreciate their contributions.
If you find this code useful for your research, please cite our paper.
@article{CFine,
  title={CLIP-Driven Fine-grained Text-Image Person Re-identification},
  author={Shuanglin Yan and Neng Dong and Liyan Zhang and Jinhui Tang},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  volume={},
  number={},
  pages={1-14},
  doi={10.1109/TIP.2023.3327924}
}
If you have any questions, please feel free to contact us. E-mail: [email protected].