This is the official PyTorch implementation of our paper "Generative Retrieval for Unsupervised Text-Based Person Search". The paper link will be released soon.
We propose GTR+ for unsupervised text-based person search (TBPS), removing the need for expensive human-annotated descriptions. GTR+ combines:
- a three-tier description generation framework for producing fine-grained and diverse pseudo texts;
- an adaptive confidence-weighted retrieval learning framework to alleviate noisy supervision;
- LargeFine-Person, a large-scale benchmark for unsupervised TBPS pre-training.
- [2026-3-20] Initial release of code.
- ...
Our experiments are mainly conducted on NVIDIA L40 GPUs. The code should also run on other GPUs with sufficient memory.
More dependency details are provided in requirements.txt.
git clone ...
cd ...
conda create -n blip -y python=3.10
conda activate blip
pip install -r requirements.txt

The following scripts provide an example for training and evaluation. Please modify the dataset paths and checkpoint paths in the scripts before running.
# Training
bash shell/train.sh
# Evaluation
bash shell/eval.sh

Download the CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets, and organize the data as follows:
dataset_root/
├── CUHK-PEDES/
│ ├── imgs/
│ │ ├── cam_a/
│ │ ├── cam_b/
│ │ └── ...
│ └── reid_raw.json
├── ICFG-PEDES/
│ ├── imgs/
│ │ ├── test/
│ │ └── train/
│ └── ICFG_PEDES.json
├── RSTPReid/
│ ├── imgs/
│ └── data_captions.json
└── LargeFine-Person/
  ├── imgs/
  ├── LargeFine_Person_qa.json
  ├── LargeFine_Person_com.json
  └── LargeFine_Person_sty.json
Download our pre-training dataset LargeFine-Person
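
Before launching training, it can help to confirm that the data is laid out as in the tree above. The snippet below is a minimal sketch and not part of the released code: the `EXPECTED_LAYOUT` table and the `find_missing` helper are hypothetical names, and the entries are simply copied from the directory tree in this README.

```python
from pathlib import Path

# Entries copied from the directory tree in this README (relative to dataset_root).
# EXPECTED_LAYOUT and find_missing are illustrative names, not part of the repository.
EXPECTED_LAYOUT = {
    "CUHK-PEDES": ["imgs/cam_a", "imgs/cam_b", "reid_raw.json"],
    "ICFG-PEDES": ["imgs/test", "imgs/train", "ICFG_PEDES.json"],
    "RSTPReid": ["imgs", "data_captions.json"],
    "LargeFine-Person": [
        "imgs",
        "LargeFine_Person_qa.json",
        "LargeFine_Person_com.json",
        "LargeFine_Person_sty.json",
    ],
}


def find_missing(dataset_root):
    """Return the expected paths that do not exist under dataset_root."""
    root = Path(dataset_root)
    missing = []
    for dataset, entries in EXPECTED_LAYOUT.items():
        for entry in entries:
            if not (root / dataset / entry).exists():
                # Report paths relative to dataset_root, with forward slashes.
                missing.append((Path(dataset) / entry).as_posix())
    return missing


if __name__ == "__main__":
    import sys

    # Pass the dataset root as the first argument; defaults to ./dataset_root.
    root = sys.argv[1] if len(sys.argv) > 1 else "dataset_root"
    missing = find_missing(root)
    if missing:
        print("Missing entries under", root)
        for path in missing:
            print("  ", path)
    else:
        print("Dataset layout looks complete.")
```

The check only tests that each path exists; it does not validate the contents of the annotation files or the images.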
Unsupervised TBPS Results with BLIP as Baseline
CUHK-PEDES
| Method | Baseline | Fine-tuning | R@1 | R@5 | R@10 | mAP | Checkpoint |
|---|---|---|---|---|---|---|---|
| GTR | BLIP | | 47.53 | 68.23 | 75.91 | 42.91 | / |
| GAAP | BLIP | | 47.64 | 67.79 | 76.08 | 41.28 | / |
| MUMA | BLIP | | 59.52 | 77.79 | 84.65 | 52.75 | / |
| GTR+ | BLIP | | 61.35 | 79.35 | 85.75 | 55.75 | Download |
| GTR+ (Pre-trained) | BLIP | ✗ | 62.65 | 78.80 | 84.76 | 55.27 | Download |
| GTR+ (Pre-trained) | BLIP | ✓ | 64.65 | 80.72 | 86.78 | 58.67 | Download |
ICFG-PEDES
| Method | Baseline | Fine-tuning | R@1 | R@5 | R@10 | mAP | Checkpoint |
|---|---|---|---|---|---|---|---|
| GTR | BLIP | | 28.25 | 45.21 | 53.51 | 13.82 | / |
| GAAP | BLIP | | 27.12 | 44.91 | 53.56 | 11.43 | / |
| MUMA | BLIP | | 38.11 | 56.01 | 63.96 | 19.02 | / |
| GTR+ | BLIP | | 47.81 | 64.97 | 71.94 | 28.75 | Download |
| GTR+ (Pre-trained) | BLIP | ✗ | 47.53 | 64.32 | 71.39 | 25.38 | Download |
| GTR+ (Pre-trained) | BLIP | ✓ | 52.78 | 67.94 | 73.91 | 33.99 | Download |
RSTPReid
| Method | Baseline | Fine-tuning | R@1 | R@5 | R@10 | mAP | Checkpoint |
|---|---|---|---|---|---|---|---|
| GTR | BLIP | | 45.60 | 70.35 | 79.95 | 33.30 | / |
| GAAP | BLIP | | 44.45 | 65.15 | 75.30 | 31.21 | / |
| MUMA | BLIP | | 54.35 | 76.05 | 83.65 | 40.50 | / |
| GTR+ | BLIP | | 54.75 | 75.15 | 83.50 | 43.79 | Download |
| GTR+ (Pre-trained) | BLIP | ✗ | 52.00 | 74.05 | 82.35 | 38.72 | Download |
| GTR+ (Pre-trained) | BLIP | ✓ | 55.70 | 76.55 | 84.25 | 43.86 | Download |
Supervised TBPS Results with IRRA as Baseline
CUHK-PEDES
| Method | Baseline | Fine-tuning | R@1 | R@5 | R@10 | mAP | Checkpoint |
|---|---|---|---|---|---|---|---|
| GTR+ | IRRA | | 59.44 | 78.54 | 85.22 | 54.11 | Download |
| GTR+ | IRRA | ✓ | 77.13 | 90.82 | 94.49 | 68.37 | Download |
ICFG-PEDES
| Method | Baseline | Fine-tuning | R@1 | R@5 | R@10 | mAP | Checkpoint |
|---|---|---|---|---|---|---|---|
| GTR+ | IRRA | | 43.77 | 60.77 | 68.05 | 22.30 | Download |
| GTR+ | IRRA | ✓ | 67.80 | 82.81 | 87.66 | 41.00 | Download |
RSTPReid
| Method | Baseline | Fine-tuning | R@1 | R@5 | R@10 | mAP | Checkpoint |
|---|---|---|---|---|---|---|---|
| GTR+ | IRRA | | 50.45 | 73.45 | 82.35 | 37.68 | Download |
| GTR+ | IRRA | ✓ | 69.05 | 86.90 | 92.25 | 54.19 | Download |
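
For readers unfamiliar with the metrics in the tables above, the following is a minimal sketch of how R@K and mAP are commonly computed for text-to-image retrieval. It is an illustration under stated assumptions, not the repository's evaluation code: the function name `rank_k_and_map` and its arguments are hypothetical, and the exact protocol used by `shell/eval.sh` may differ in details such as tie handling.

```python
def rank_k_and_map(similarity, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Compute R@K and mAP (both in percent) for text-to-image retrieval.

    similarity[i][j] is the score between text query i and gallery image j.
    query_ids[i] and gallery_ids[j] are person identity labels.
    Illustrative helper; not the evaluation code shipped with this repository.
    """
    hits = {k: 0 for k in ks}
    ap_sum = 0.0
    for scores, qid in zip(similarity, query_ids):
        # Gallery indices sorted by descending similarity to this query.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        matches = [gallery_ids[j] == qid for j in order]

        # R@K: the query counts as a hit if any of the top-K images is correct.
        for k in ks:
            if any(matches[:k]):
                hits[k] += 1

        # Average precision: mean of the precision at each correct position.
        num_correct = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                num_correct += 1
                precisions.append(num_correct / rank)
        if precisions:  # a query with no correct image contributes AP = 0
            ap_sum += sum(precisions) / len(precisions)

    n = len(query_ids)
    recalls = {k: 100.0 * hits[k] / n for k in ks}
    return recalls, 100.0 * ap_sum / n


# Toy example: two text queries against a gallery of three images.
sim = [[0.9, 0.1, 0.5],
       [0.9, 0.3, 0.8]]
recalls, mean_ap = rank_k_and_map(sim, query_ids=[1, 2], gallery_ids=[1, 2, 2], ks=(1, 2))
print(recalls, round(mean_ap, 2))
```

In the toy example the first query retrieves its match at rank 1, while the second finds its matches only at ranks 2 and 3, so R@1 is 50% and R@2 is 100%.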
More qualitative examples of generated descriptions and retrieval results are shown below.
If you find this code useful for your research, please cite our paper.
coming soon

