sterzhang / image-textualization

Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

image

Plan

  • Main code for IT framework.
  • Data cleaning is ongoing. We expect to open-source the 170K dataset before 6/17.
  • Code for evaluation.
  • Release usage instructions for the IT framework.

🔥 The IT-170K dataset is now available on 🤗 Hugging Face. Link to our paper: arXiv.

Install

See detailed instructions in install.md.

Datasets

Images

  • COCO: download the train2017 images.
  • SAM: download the archives sa_000000.tar through sa_000024.tar.
  • VG: download the VG images.
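The SAM download spans 25 archives, so it is easy to miss one. The following is a minimal sketch, not part of the repository, that lists the expected archive names and reports which are absent from a download folder. The function names and the `./downloads/sam` path are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from pathlib import Path


def expected_sam_archives(first: int = 0, last: int = 24) -> list[str]:
    """Return archive names sa_000000.tar through sa_000024.tar (inclusive)."""
    return [f"sa_{i:06d}.tar" for i in range(first, last + 1)]


def missing_sam_archives(download_dir: str) -> list[str]:
    """Return the expected archive names that are not files in download_dir."""
    root = Path(download_dir)
    return [name for name in expected_sam_archives() if not (root / name).is_file()]


if __name__ == "__main__":
    # "./downloads/sam" is an assumed location; point it at your own folder.
    for name in missing_sam_archives("./downloads/sam"):
        print(f"missing: {name}")
```

An empty result means every archive in the listed range is present; it does not verify that the archives are complete or uncorrupted.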

After downloading, organize the image datasets as follows in ./dataset/:

├── coco
│   └── train2017
├── sam
│   └── images
└── vg
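Before running the framework, it can help to confirm that the folders match the layout above. This is a small sketch under the assumption that the three directories shown are all that is required; `missing_dataset_dirs` is a hypothetical helper, not a function from the repository.

```python
from __future__ import annotations

from pathlib import Path

# Sub-directories taken from the layout above, relative to ./dataset/.
EXPECTED_DIRS = ("coco/train2017", "sam/images", "vg")


def missing_dataset_dirs(root: str = "./dataset") -> list[str]:
    """Return the expected sub-directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

The check only looks at directory names; it does not inspect the images inside them.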

Use

After installing all the requirements, follow use.md to generate descriptions for your own datasets.

Visualization

image

Acknowledgement

If you find our work useful for your research or applications, please cite using this BibTeX:

@misc{pi2024image,
      title={Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions}, 
      author={Renjie Pi and Jianshu Zhang and Jipeng Zhang and Rui Pan and Zhekai Chen and Tong Zhang},
      year={2024},
      eprint={2406.07502},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Languages

Python 99.2%, Shell 0.8%