P9D (Product 9 Domains)

Download instructions for P9D (Product 9 Domains), a dataset for Vision-Language Continual Pretraining.

🎨 Introduction

P9D (Product 9 Domains) is a multimodal dataset that supports the study of Vision-Language Continual Pretraining (VLCP). It contains more than 1 million image-text pairs of real products. Based on the industry category of each product, P9D is divided into 9 tasks for sequential training: Household, Furnishings, Food, Beauty, Clothing, Auto, Parenting, Outdoor, and Electronics. These product domains carry rich and distinct knowledge to support VLCP.

Unlike traditional Class-Incremental Learning (CIL), which splits tasks by narrow category concepts, we split tasks by product domain, each covering many categories, to simulate the shift of knowledge domains in continual pretraining.

P9D contains over 3,800 categories, and the number of samples per category follows a realistic long-tail distribution.

In addition, P9D has 1,014,599 image-text pairs for training and 2,846 pairs as the test set for cross-modal retrieval. For multi-modal retrieval, 4,615 pairs form the query set and 46,855 pairs form the gallery set.

📂 Download Dataset

Step 1️⃣: Download JSON Files.

These files contain the image link ("image_link" or "oss_url") and the captions of each product.
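As a rough illustration of how the JSON files might be read, the sketch below loads one file and picks the image URL from whichever of the two keys is present. Only the key names "image_link" and "oss_url" come from the description above. The helper names `load_records` and `image_url`, and the assumption that each file is a JSON list of per-product dictionaries, are hypothetical, so adjust them to the actual file layout.

```python
import json


def image_url(record):
    """Return the image URL of one record, or None if neither key is set.

    The key names come from the README; everything else is an assumption.
    """
    for key in ("image_link", "oss_url"):
        url = record.get(key)
        if url:
            return url
    return None


def load_records(json_path):
    """Load one P9D JSON file, ASSUMING it holds a list of per-product dicts."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of records; adapt to the real layout")
    return data
```

With these helpers, `[image_url(r) for r in load_records(path)]` would give one URL (or None) per product.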

The download links of JSON files:

Source	Train / Test / Query / Gallery set
Google Drive	Here
Baidu Netdisk	Here

Step 2️⃣: Download Images.

There are two methods to download all images of P9D:

Method 1: Download each image online.

By changing only the storage path, you can use the scripts in this repository to download all images to a directory of your choice.

P9D
├── download_images_train.py # Downloads the images of the train set.
├── download_images_test.py # Downloads the images of the test/query/gallery set.
├── download_images_check.py # Re-downloads images that failed during the first download.

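The download-then-check workflow above can be sketched as follows. This is not the code of the repository scripts, only a minimal standard-library illustration of the idea: try every URL once, then list the files that are still missing so they can be retried. The function names and the choice to name each file after the last segment of its URL are assumptions.

```python
import http.client
import os
import urllib.request
from urllib.parse import urlparse


def local_path(url, image_dir):
    """Map a URL to a path in image_dir (ASSUMED naming: last URL path segment)."""
    name = os.path.basename(urlparse(url).path)
    return os.path.join(image_dir, name)


def find_missing(urls, image_dir):
    """Return the URLs whose target file is absent or empty on disk."""
    missing = []
    for url in urls:
        path = local_path(url, image_dir)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(url)
    return missing


def download_all(urls, image_dir, timeout=30):
    """Try each URL once and return the URLs that failed."""
    os.makedirs(image_dir, exist_ok=True)
    failed = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                data = resp.read()
            with open(local_path(url, image_dir), "wb") as out:
                out.write(data)
        except (OSError, ValueError, http.client.HTTPException):
            # Network errors, malformed URLs, and broken responses are all
            # recorded so that a later pass can retry them.
            failed.append(url)
    return failed
```

A second pass would then be `download_all(find_missing(urls, image_dir), image_dir)`, repeated until the list of missing files stops shrinking.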
Method 2: Unzip the zip files.

The zip files can be downloaded from this Baidu Netdisk link.

📌 Tips: Both methods can be slow because of the large number of small files to download or unzip. Please also make sure that at least 250 GB of free storage is available. You may need more if you unzip the zip files.
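Before starting a long download, it may help to check the free space on the target drive. The sketch below uses `shutil.disk_usage` from the Python standard library. The helper names are hypothetical, and reading the 250G figure as decimal gigabytes (10^9 bytes) is an assumption.

```python
import shutil


def free_gigabytes(path):
    """Free space on the filesystem holding `path`, in decimal GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path, required_gb=250):
    """True if at least `required_gb` GB are free (default taken from the tip above)."""
    return free_gigabytes(path) >= required_gb
```

For example, `has_enough_space("/data/P9D")` could be called before Method 1 or Method 2, where `/data/P9D` is a placeholder for your own storage path. When unzipping, pass a larger `required_gb`, since the archives and the extracted images both occupy space.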

📝 Citation

If this codebase is useful to you, please cite our work:

@article{zhu2023ctp,
  title={CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation},
  author={Hongguang Zhu and Yunchao Wei and Xiaodan Liang and Chunjie Zhang and Yao Zhao},
  journal={Proceedings of the IEEE International Conference on Computer Vision},
  year={2023},
}

🐼 Contacts

If you have any questions, please contact me at zhuhongguang1103@gmail.com or hongguang@bjtu.edu.cn.
