hssip / FashionSAP

CVPR2023 paper

FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training

This paper was accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR 2023).

This repository contains the PyTorch implementation of FashionSAP.

We will introduce more about our project ...

Requirements:

  • Install the dependencies listed in requirements.txt.

Prepare:

  • FashionGen

    1. Download the raw file and extract it into the path data_root.
    2. Set data_root and split in prepare_dataset.py, then run it to generate the assistance file.
  • Fashioniq

    1. Download the raw file and extract it into the path data_root.
    2. Put the captions and images directories from the raw file into data_root. In addition, we merge all the train caption files into a single cap.train.json file in captions, and do the same for val.
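
The merge step for Fashioniq captions can be sketched as follows. This is a minimal sketch, not the script used by the authors: the function name merge_caption_files is hypothetical, and it assumes that the raw caption files follow the naming pattern cap.<category>.<split>.json (for example cap.dress.train.json) and that each file holds a JSON list. The merged val file name cap.val.json is likewise an assumption by analogy with cap.train.json.

```python
import json
from pathlib import Path


def merge_caption_files(captions_dir, split):
    """Merge all per-category caption files of one split into a single file.

    Hypothetical helper. Assumes files are named cap.<category>.<split>.json
    and that each one contains a JSON list of caption entries.
    """
    captions_dir = Path(captions_dir)
    merged_path = captions_dir / f"cap.{split}.json"

    merged = []
    # The pattern needs a category segment, so the merged output file
    # (cap.<split>.json) is not picked up again on a rerun.
    for path in sorted(captions_dir.glob(f"cap.*.{split}.json")):
        with path.open(encoding="utf-8") as f:
            merged.extend(json.load(f))

    with merged_path.open("w", encoding="utf-8") as f:
        json.dump(merged, f)
    return merged_path
```

Under these assumptions, calling merge_caption_files("data_root/captions", "train") and then merge_caption_files("data_root/captions", "val") would produce the two merged files.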

Run

  1. We define three downstream task names, referred to as downstream_name:

    • retrieval: covers two downstream tasks, text-to-image retrieval and image-to-text retrieval.
    • catereg: fashion-domain category recognition and subcategory recognition.
    • tgir: text-guided image retrieval, also called text-modified image retrieval.
  2. Run bash run_pretrain.sh to start the pre-training stage.

  3. Run bash run_{downstream_name}.sh to train and evaluate the corresponding downstream task.
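
For scripted experiments, the mapping from a stage name to its shell script can be sketched in Python. This is only an illustrative wrapper: script_for and run_stage are hypothetical names that do not exist in the repository, and the sketch assumes the run_*.sh scripts sit in the repository root and take no extra arguments.

```python
import subprocess

# The three downstream names listed above.
DOWNSTREAM_NAMES = ("retrieval", "catereg", "tgir")


def script_for(stage):
    """Return the shell script name for "pretrain" or a downstream name."""
    if stage == "pretrain":
        return "run_pretrain.sh"
    if stage in DOWNSTREAM_NAMES:
        return f"run_{stage}.sh"
    raise ValueError(f"unknown stage: {stage!r}")


def run_stage(stage, repo_root="."):
    """Run the script for one stage with bash; raises if it exits non-zero."""
    # Assumption: the script lives directly under repo_root.
    subprocess.run(["bash", script_for(stage)], cwd=repo_root, check=True)
```

For example, run_stage("pretrain") followed by run_stage("tgir") would run the pre-training stage and then the text-guided image retrieval task, equivalent to invoking the two bash commands by hand.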

Models

  1. Our pre-trained model can be downloaded from Google Drive.

Citations

If you find this code useful for your research, please cite:

@inproceedings{FashionSAP,
      title={FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training}, 
      author={Han, Yunpeng and Zhang, Lisai and Chen, Qingcai and Chen, Zhijian and Li, Zhonghua and Yang, Jianxin and Cao, Zhao},
      year={2023},
      booktitle={CVPR}
}

Some utility code is adapted from the ALBEF project.

License: BSD 3-Clause "New" or "Revised" License

