Flame-Chasers/MNEMO

MNEMO: Interactive Person Retrieval via Multi-Turn Multimodal Conversation

Official repository for "Interactive Person Retrieval via Multi-Turn Multimodal Conversation", accepted at ICML 2026.

Code and dataset are coming soon.
We are currently preparing the camera-ready version and organizing the release of the dataset, training code, evaluation scripts, and model weights.

📖Overview

This project studies multimodal interactive person retrieval, where users refine retrieval results through multi-turn conversations with visual feedback on candidate images. We build MInterPEDES, a multimodal conversational dataset, and propose MNEMO, which encodes each dialogue turn as an atomic multimodal unit and aggregates dialogue memory to capture fine-grained cross-turn dependencies.
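Since the official code has not been released yet, the following is only an illustrative sketch of the general idea described above, not MNEMO's implementation. It assumes that each turn carries a text embedding and an embedding of the candidate image the user commented on, fuses them by a simple average, and aggregates the dialogue memory by mean pooling. The names `DialogueTurn`, `encode_turn`, `aggregate_memory`, and `rank_gallery`, as well as the fusion and pooling choices, are hypothetical placeholders.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class DialogueTurn:
    """One dialogue turn kept as a single unit (hypothetical structure)."""
    text_embedding: List[float]   # embedding of the user's utterance
    image_embedding: List[float]  # embedding of the candidate image given as feedback


def encode_turn(turn: DialogueTurn) -> List[float]:
    """Fuse text and image of one turn. Assumption: element-wise average."""
    if len(turn.text_embedding) != len(turn.image_embedding):
        raise ValueError("text and image embeddings must have the same size")
    return [(t + v) / 2.0 for t, v in zip(turn.text_embedding, turn.image_embedding)]


def aggregate_memory(turn_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Combine all turn vectors into one query. Assumption: mean pooling."""
    if not turn_vectors:
        raise ValueError("at least one dialogue turn is required")
    n = len(turn_vectors)
    return [sum(column) / n for column in zip(*turn_vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank_gallery(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> List[str]:
    """Return gallery image ids sorted from most to least similar to the query."""
    return sorted(gallery, key=lambda image_id: cosine(query, gallery[image_id]), reverse=True)


# Toy two-turn conversation with 2-dimensional embeddings.
turns = [
    DialogueTurn(text_embedding=[1.0, 0.0], image_embedding=[1.0, 0.0]),
    DialogueTurn(text_embedding=[0.0, 1.0], image_embedding=[0.0, 1.0]),
]
memory = aggregate_memory([encode_turn(t) for t in turns])
ranking = rank_gallery(memory, {"a": [1.0, 1.0], "b": [1.0, 0.0], "c": [-1.0, -1.0]})
```

In this toy setup the query is recomputed from all turns after every new turn, so the ranking can be refreshed as the conversation proceeds. Mean pooling ignores turn order and cross-turn dependencies, which is exactly the kind of limitation a learned memory aggregation would aim to address.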

📌TODO

  • Release the MInterPEDES dataset
  • Release training and evaluation code
  • Release model weights

We will update this checklist as each component becomes available.

✨Citation

If you find this project useful for your research, please consider citing our paper:

@inproceedings{bai2026interactive,
  title={Interactive person retrieval via multi-turn multimodal conversation},
  author={Bai, Yang and Wang, Tingfeng and Yang, Bin and Cao, Min and Wang, Jinqiao and Ye, Mang},
  booktitle={Forty-third International Conference on Machine Learning},
  year={2026}
}

License

This code is distributed under the MIT License.
