- [2025-12] Code and dataset released
- [2025-11] Paper uploaded to arXiv
- Hardware: 4 × NVIDIA RTX 3090 GPUs (24 GB VRAM recommended)
- Software: Python 3.8+ (recommended), CUDA 11.x+
Clone the repository and install dependencies:
git clone https://github.com/Flame-Chasers/TAG-PR.git
cd TAG-PR
pip install -r requirements.txt
We provide the processed dataset via Quark Cloud Drive.
- Download: Click here to download (Access Code: 8pE6)
- Organize: Extract and arrange the files as follows:
dataset/
├── anno_dir/
│   ├── train_reid.json
│   └── test_reid.json
└── images/
    ├── 0001.jpg
    ├── 0002.jpg
    └── ...
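Once the archive is extracted, the layout above can be checked with a short script before touching the configuration. The following is a minimal sketch, not part of the repository: the helper name `check_dataset_layout` is hypothetical, the file names are taken from the tree above, and the annotation files are only tested for being parseable JSON because their schema is not described here.

```python
import json
from pathlib import Path

# File names taken from the directory tree above. The helper itself is a
# hypothetical convenience and is not shipped with the repository.
EXPECTED_ANNOTATIONS = ("train_reid.json", "test_reid.json")


def check_dataset_layout(root):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    problems = []

    anno_dir = root / "anno_dir"
    for name in EXPECTED_ANNOTATIONS:
        path = anno_dir / name
        if not path.is_file():
            problems.append(f"missing annotation file: {path}")
            continue
        try:
            with path.open(encoding="utf-8") as fh:
                json.load(fh)  # only checks that the file parses as JSON
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON in {path}: {exc}")

    image_dir = root / "images"
    if not image_dir.is_dir():
        problems.append(f"missing image directory: {image_dir}")
    elif not any(image_dir.glob("*.jpg")):
        problems.append(f"no .jpg files found in {image_dir}")

    return problems
```

Calling `check_dataset_layout("dataset")` returns an empty list when both annotation files and at least one `.jpg` image are in place; otherwise each entry names the missing or unreadable item.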
Modify the configuration file located at config/s.config.yaml.
# Data Paths
anno_dir: "/path/to/dataset/anno_dir" # ⚠️ Absolute path to annotation JSONs
image_dir: "/path/to/dataset/images" # ⚠️ Absolute path to image directory
# Model Settings
model:
  checkpoint: "/path/to/clip/ViT-B-16.pt" # Path to pre-trained CLIP weights
  # ... other model params
We support multi-GPU training and shell script execution.
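Before launching a run, it can help to confirm that the paths set in the configuration actually resolve, since a wrong path would otherwise only surface after the worker processes have started. The sketch below is a hypothetical pre-flight check, not part of the repository. It assumes the configuration has already been parsed into a mapping, that `anno_dir` and `image_dir` are top-level keys, and that `checkpoint` is nested under `model`, as the excerpt above suggests.

```python
import os


def check_config_paths(cfg):
    """Return problems with the path entries of an already-parsed config mapping."""
    model_cfg = cfg.get("model") or {}
    # (key label, value, existence check, must be absolute?)
    # Only the two data paths are documented as absolute in the excerpt above.
    entries = [
        ("anno_dir", cfg.get("anno_dir"), os.path.isdir, True),
        ("image_dir", cfg.get("image_dir"), os.path.isdir, True),
        ("model.checkpoint", model_cfg.get("checkpoint"), os.path.isfile, False),
    ]

    problems = []
    for key, value, exists, must_be_absolute in entries:
        if not value:
            problems.append(f"{key}: not set")
        elif must_be_absolute and not os.path.isabs(value):
            problems.append(f"{key}: expected an absolute path, got {value!r}")
        elif not exists(value):
            problems.append(f"{key}: path does not exist: {value}")
    return problems
```

If PyYAML is installed, the mapping can be obtained with `yaml.safe_load` on the opened configuration file and passed straight to `check_config_paths`; an empty result means all three paths were found.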
Use torchrun for distributed data parallel (DDP) training:
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun \
  --rdzv_id=12345 \
  --rdzv_backend=c10d \
  --rdzv_endpoint=localhost:0 \
  --nnodes=1 \
  --nproc_per_node=4 \
  main.py
You can also use the provided shell script wrapper:
bash shell/train.sh
If you find this project useful for your research, please consider citing our paper:
@article{zhou2025text,
  title={Text-based Aerial-Ground Person Retrieval},
  author={Zhou, Xinyu and Wu, Yu and Ma, Jiayao and Wang, Wenhao and Cao, Min and Ye, Mang},
  journal={arXiv preprint arXiv:2511.08369},
  year={2025}
}
This project is released under the MIT License.