CuriseJia / FreeStyleRet

Precision Search through Multi-Style Inputs

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

📰 News

  • [2023.11.29] Code is available now! Welcome to watch 👀 this repository for the latest updates.

😮 Highlights

💡 High performance, plug-and-play, and lightweight

FreestyleRet is the first multi-style retrieval model and focuses on the precision search field. You can transfer our gram-based style block to any other pre-trained model with only 28M trainable parameters.
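
The README does not describe the internals of the style block. As a rough illustration of what "gram-based" may refer to, the sketch below computes a Gram matrix over feature channels, the style statistic commonly used in neural style transfer. The function name `gram_matrix`, the list-of-lists layout, and the normalization by the number of positions are assumptions made for illustration, not FreestyleRet's actual implementation. It uses plain Python so it runs without PyTorch.

```python
def gram_matrix(features):
    """Return the C x C Gram matrix of a feature map.

    `features` is a list of C channels, each a flat list of N activations
    (hypothetical layout chosen for this sketch). Entry (i, j) is the
    average product of channel i and channel j over all N positions, so it
    captures which channels fire together regardless of where they fire.
    """
    if not features or not features[0]:
        raise ValueError("features must contain at least one non-empty channel")
    n = len(features[0])
    if any(len(channel) != n for channel in features):
        raise ValueError("all channels must have the same length")
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
        for ci in features
    ]


if __name__ == "__main__":
    # Two channels, two spatial positions each.
    example = [[1.0, 2.0], [3.0, 4.0]]
    print(gram_matrix(example))
```

Because the Gram matrix discards spatial positions, two images with the same content but different styles (say, a photo and a sketch) tend to produce different matrices, which is what makes it a plausible style descriptor.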

โšก๏ธ A multi-style, fully aligned and gained dataset

We propose the precision search task and its fisrt corresponding dataset. Following figure shows our proposed Diverse-Style Retrieval Dataset(DSR), which includes five styles: origin, sketch, art, mosaic, and text.

🚀 Main Results

FreestyleRet achieves state-of-the-art (SOTA) performance on the DSR dataset and the ImageNet-X dataset. * denotes results obtained with prompt tuning.

🤗 Visualization

Each sample has three images to compare the retrieval performance between our FreestyleRet and the BLIP baseline on the DSR dataset. The left images are the queries randomly selected from different styles. The middle and the right images are the retrieval results of our FreestyleRet-BLIP model and the original BLIP model, respectively.
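
To make the query-to-result comparison concrete, here is a minimal sketch of ranking a gallery by cosine similarity and scoring it with Recall@K. Both the similarity measure and the metric are assumptions here (they are common choices in retrieval, but this README does not state which ones FreestyleRet uses), and the embeddings are toy vectors rather than model outputs. The names `cosine`, `retrieve_top_k`, and `recall_at_k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_top_k(query, gallery, k=1):
    """Return indices of the k gallery embeddings most similar to the query."""
    scored = [(cosine(query, emb), idx) for idx, emb in enumerate(gallery)]
    # Highest similarity first; ties are broken by the lower gallery index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


def recall_at_k(queries, gallery, targets, k=1):
    """Fraction of queries whose target index appears among the top-k results."""
    hits = sum(
        1 for q, t in zip(queries, targets) if t in retrieve_top_k(q, gallery, k)
    )
    return hits / len(queries)


if __name__ == "__main__":
    gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy image embeddings
    queries = [[0.9, 0.1], [0.1, 0.9]]               # toy query embeddings
    targets = [0, 2]                                 # ground-truth gallery index per query
    print(recall_at_k(queries, gallery, targets, k=1))
    print(recall_at_k(queries, gallery, targets, k=2))
```

In a side-by-side comparison like the one described above, the same queries would be ranked once with each model's embeddings and the top results shown next to each other.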

๐Ÿ› ๏ธ Requirements and Installation

  • Python >= 3.9
  • PyTorch >= 1.9.0
  • CUDA Version >= 11.3
  • Install required packages:
git clone https://github.com/YanhaoJia/FreeStyleRet
cd FreeStyleRet
pip install -r requirements.txt
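
Before installing, it can help to confirm that the local environment meets the minimum versions listed above. The script below is a small sketch for that purpose and is not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it keeps only the leading numeric components). It reads `torch.__version__` and `torch.version.cuda`, and reports PyTorch as missing if it cannot be imported.

```python
import re
import sys


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1).

    Anything after '+' and any non-numeric suffix is ignored.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def _pad(version, width):
    """Right-pad a version tuple with zeros so tuples compare fairly."""
    return version + (0,) * (width - len(version))


def meets_minimum(found, required):
    """True if version string `found` is at least `required`."""
    a, b = parse_version(found), parse_version(required)
    width = max(len(a), len(b))
    return _pad(a, width) >= _pad(b, width)


def check_environment():
    """Compare the local setup against the minimums listed in this README."""
    report = {"python": sys.version_info >= (3, 9)}
    try:
        import torch
    except ImportError:
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, "1.9.0")
    cuda = torch.version.cuda  # None for builds without CUDA support
    report["cuda"] = cuda is not None and meets_minimum(cuda, "11.3")
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'does not meet the README minimum'}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the driver or toolkit version installed on the machine.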

💥 DSR dataset & FreestyleRet Checkpoints

The dataset and checkpoints are coming soon.

๐Ÿ—๏ธ Training & Validating

The training & validating instruction is in train.py and test.py.

๐Ÿ‘ Acknowledgement

  • OpenCLIP An open source pretraining framework.
  • LanguageBind Bind five modalities through Language.
  • ImageBind Bind five modalities through Image.
  • FSCOCO An open source Sketch-Text retrieval dataset.

🔒 License

  • The majority of this project is released under the MIT license as found in the LICENSE file.
  • The dataset of this project is released under the CC-BY-NC 4.0 license as found in the DATASET_LICENSE file.

โœ๏ธ Citation

If you find our paper and code useful in your research, please consider giving a star โญ and citation ๐Ÿ“.

@misc{li2023freestyleret,
      title={FreestyleRet: Retrieving Images from Style-Diversified Queries}, 
      author={Hao Li and Curise Jia and Peng Jin and Zesen Cheng and Kehan Li and Jialu Sui and Chang Liu and Li Yuan},
      year={2023},
      eprint={2312.02428},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
