# Bridging the Gap: Multi-level Cross-modality Joint Alignment for Visible-infrared Person Re-identification

By [Tengfei Liang](https://scholar.google.com/citations?user=YE6fPvgAAAAJ&hl=en), [Yi Jin](https://scholar.google.com/citations?user=NQAenU0AAAAJ&hl=en), [Wu Liu](https://scholar.google.com/citations?user=rQpizr0AAAAJ&hl=en), [Tao Wang](https://scholar.google.com/citations?user=F3C5oAcAAAAJ&hl=en&oi=sra), [Songhe Feng](https://scholar.google.com/citations?user=K5lqMYgAAAAJ&hl=en), [Yidong Li](https://scholar.google.com/citations?hl=en&user=3PagRQEAAAAJ).

This repository is an official implementation of the paper [Bridging the Gap: Multi-level Cross-modality Joint Alignment for Visible-infrared Person Re-identification](https://ieeexplore.ieee.org/abstract/document/10472470). [`IEEEXplore`](https://ieeexplore.ieee.org/abstract/document/10472470)

*Notes:*

This repository offers the complete code of the method, featuring a well-organized directory structure and detailed comments that facilitate training and testing of the model.
We hope it can serve as a new baseline for cross-modality visible-infrared person re-identification.


## Abstract

Visible-Infrared person Re-IDentification (VI-ReID) is a challenging cross-modality image retrieval task that aims to match pedestrian images across visible and infrared cameras.
To bridge the modality gap, existing mainstream methods adopt a learning paradigm that converts the image retrieval task into an image classification task with a cross-entropy loss and auxiliary metric learning losses.
These losses follow the strategy of adjusting the distribution of extracted embeddings to reduce the intra-class distance and increase the inter-class distance.
However, such objectives do not precisely correspond to the final test setting of the retrieval task, resulting in a new gap at the optimization level.
By rethinking these key issues of VI-ReID, we propose a simple and effective method, Multi-level Cross-modality Joint Alignment (MCJA), which bridges both the modality gap and the objective-level gap.
For the former, we design the Visible-Infrared Modality Coordinator in the image space and propose the Modality Distribution Adapter in the feature space, effectively reducing the modality discrepancy in the feature extraction process.
For the latter, we introduce a new Cross-Modality Retrieval loss, the first in VI-ReID to impose constraints from the perspective of the ranking list, thus aligning training with the goal of the testing stage.
Moreover, to strengthen robustness and cross-modality retrieval ability, we further introduce a Multi-Spectral Enhanced Ranking strategy for the testing phase.
Based on the global feature only, our method outperforms existing methods by a large margin, achieving a remarkable rank-1 accuracy of 89.51% and mAP of 87.58% under the most challenging single-shot setting and all-search mode of the SYSU-MM01 dataset.
(For more details, please refer to [the original paper](https://ieeexplore.ieee.org/abstract/document/10472470).)

<br/>
<div align="center">
  <img src="./figs/mcja_overall_structure.png" width="90%"/>

  Fig. 1: Overall architecture of the proposed MCJA model.
</div>


## Requirements

The code in this repository is designed to run on a single GPU.
The [requirements.txt](./requirements.txt) file lists the Python packages and the versions used in our experiments:

- Python 3.8
- apex==0.1
- numpy==1.21.5
- Pillow==8.4.0
- pytorch_ignite==0.2.1
- scipy==1.7.3
- torch==1.8.1+cu111
- torchsort==0.1.9
- torchvision==0.9.1+cu111
- yacs==0.1.8

*Notes:*
Higher or lower versions of these packages may also work.
When using a different version of PyTorch, please be mindful of its compatibility with pytorch_ignite, torchsort, and the other dependencies.
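
As a minimal environment setup sketch (assuming a CUDA 11.1 machine; the conda environment and the apex note below are our assumptions, not steps documented by this repository):

```bash
# Create an isolated environment and install the pinned dependencies.
conda create -n mcja python=3.8 -y
conda activate mcja
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
# Note: apex here refers to NVIDIA apex (https://github.com/NVIDIA/apex),
# which is typically built from source; the PyPI package named "apex" is unrelated.
```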


## Dataset & Preparation

In our experiments, we evaluate the proposed method on the publicly available SYSU-MM01 and RegDB datasets, which are commonly used for comparison in VI-ReID.
Please download the corresponding datasets and set the `data_root` path in [configs/default/dataset.py](./configs/default/dataset.py) to their locations.
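
For illustration, the change has roughly this shape (a hypothetical sketch; the actual option names and structure are defined by the repository's config file):

```python
# configs/default/dataset.py -- illustrative sketch only; the real file
# defines its own option names. Point the paths at the extracted datasets.
sysu_data_root = '/path/to/SYSU-MM01'
regdb_data_root = '/path/to/RegDB'
```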


## Experiments

Our [main.py](./main.py) supports both training with evaluation and testing only.

### Train

To train and evaluate the MCJA model on the SYSU-MM01 or RegDB dataset, run the corresponding command:

```bash
python main.py --cfg configs/SYSU_MCJA.yml --gpu 0 --seed 8 --desc MCJA
```

```bash
python main.py --cfg configs/RegDB_MCJA.yml --gpu 0 --seed 8 --desc MCJA
```

### Test

To run testing only, set `test_only` to true in the corresponding `XXXX.yml` configuration file and specify the checkpoint to load in the `resume` setting.
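
For illustration, the relevant part of the configuration might look like this (a sketch; only `test_only` and `resume` are options named by this README, and the checkpoint path is a placeholder):

```yaml
# Sketch of the test-only options in configs/SYSU_MCJA.yml (illustrative).
test_only: true                              # skip training and run evaluation only
resume: path/to/trained_mcja_checkpoint.pth  # placeholder path to a saved model
```
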
Then, execute the same command as mentioned above to complete the testing and evaluation:

```bash
python main.py --cfg configs/SYSU_MCJA.yml --gpu 0 --desc MCJA_test_only
```

```bash
python main.py --cfg configs/RegDB_MCJA.yml --gpu 0 --desc MCJA_test_only
```

*Notes:*
The `--seed` and `--desc` arguments of [main.py](./main.py) are optional.
The former controls the random seed of the experiment, while the latter adds a suffix description to the current run.


## Citation

If you find MCJA useful in your research, please kindly cite this paper in your publications:

```bibtex
@article{TCSVT24_MCJA,
  author  = {Liang, Tengfei and Jin, Yi and Liu, Wu and Wang, Tao and Feng, Songhe and Li, Yidong},
  title   = {Bridging the Gap: Multi-level Cross-modality Joint Alignment for Visible-infrared Person Re-identification},
  journal = {IEEE Transactions on Circuits and Systems for Video Technology},
  pages   = {1-1},
  year    = {2024},
  doi     = {10.1109/TCSVT.2024.3377252}
}
```


## Related Repos

Our repository builds upon the work of others, and we extend our gratitude for their contributions.
Below is a list of some of these works:

- AGW - https://github.com/mangye16/Cross-Modal-Re-ID-baseline
- MPANet - https://github.com/DoubtedSteam/MPANet


## License

This repository is released under the MIT license. Please see the [LICENSE](./LICENSE) file for more information.