Solution for CCKS2022 Track2

🌟 Introduction

This is the third-place solution for Task 2 of the CCKS-2022 Digital Business Knowledge Map Assessment Competition.

📃 Paper: "Multi-Modal Representation Learning with Self-Adaptive Thresholds for Commodity Verification"

About Training Data

Training uses only the official training set. Neither external training data nor test data is used.
When splitting off the validation set, we remove the items that also appear in the training set, so that the training set and the validation set do not overlap. The final training-to-validation ratio is about 5.6:1.
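
The overlap removal described above can be sketched as follows. This is only an illustration, not the repository's actual split code: the `(item_a, item_b, label)` pair layout and the function name `drop_overlapping_pairs` are assumptions made for the example.

```python
from typing import List, Set, Tuple

# Hypothetical record layout: two item ids and a match label.
Pair = Tuple[str, str, int]


def drop_overlapping_pairs(train: List[Pair], val: List[Pair]) -> List[Pair]:
    """Keep only validation pairs whose items never occur in the training set."""
    train_items: Set[str] = set()
    for item_a, item_b, _ in train:
        train_items.update((item_a, item_b))
    return [p for p in val if p[0] not in train_items and p[1] not in train_items]


train_pairs = [("a1", "b1", 1), ("a2", "b2", 0)]
val_pairs = [("a1", "c1", 1), ("d1", "e1", 0)]

# The first validation pair shares item "a1" with the training set and is dropped.
clean_val = drop_overlapping_pairs(train_pairs, val_pairs)
print(clean_val)  # [('d1', 'e1', 0)]
```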

About Data Preprocessing

We resize all images to 384 x 384.
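For text, in addition to the title, we keep the 10 most frequent pvs and sku keys: ["颜色分类", "货号", "型号", "品牌", "尺寸", "口味", "品名", "批准文号", "系列", "尺码"] (color category, item number, model, brand, dimensions, flavor, product name, approval number, series, size).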
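
As a rough sketch of this text preprocessing, the snippet below counts attribute keys and then joins the title with the kept attributes. It assumes pvs and sku have already been parsed into dictionaries; the function names, the `key:value` output format, and the rule that pvs takes precedence over sku are all hypothetical and may differ from the repository's scripts.

```python
from collections import Counter
from typing import Dict, List

# The 10 keys listed above, copied verbatim from the README.
KEPT_KEYS = ["颜色分类", "货号", "型号", "品牌", "尺寸", "口味", "品名", "批准文号", "系列", "尺码"]


def most_frequent_keys(records: List[Dict[str, str]], k: int = 10) -> List[str]:
    """Return the k attribute keys that occur in the most records."""
    counts = Counter(key for record in records for key in record)
    return [key for key, _ in counts.most_common(k)]


def build_text(title: str, pvs: Dict[str, str], sku: Dict[str, str]) -> str:
    """Join the title with the kept attributes; all other keys are dropped."""
    parts = [title]
    for key in KEPT_KEYS:
        # Assumed precedence: use the pvs value if present, else the sku value.
        value = pvs.get(key) or sku.get(key)
        if value:
            parts.append(f"{key}:{value}")
    return " ".join(parts)


# Made-up item: the title means "sneakers"; "产地" (origin) is not a kept key.
example = build_text(
    "运动鞋",
    pvs={"品牌": "ExampleBrand", "产地": "中国"},
    sku={"尺码": "42"},
)
print(example)  # 运动鞋 品牌:ExampleBrand 尺码:42
```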

About Pre-trained Models

For image, we use Swin Transformer Large pre-trained on ImageNet-22k.
For text, we use RoBERTa Base pre-trained on EXT data.
Both pre-trained models are from Hugging Face.

About Model Ensemble

We do not ensemble models and all results are from a single model.

About Runtime Environment

GPU	NVIDIA A100-SXM4-80GB * 2
Python	3.8.8
PyTorch	1.8.1
CUDA	11.1
cuDNN	8

About Training Time and GPU Memory

Stage	Training time	GPU memory
Train	Full steps: 100k iters, ~23 hours; peak performance: 64k iters, ~15 hours	~42GB
Inference	~7 minutes	~16GB
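
The two training figures in the table are consistent with each other if the per-iteration cost is roughly constant, which is an assumption here rather than something the logs are quoted as showing. A quick back-of-the-envelope check:

```python
# Figures from the table above (both are approximate).
full_iters, full_hours = 100_000, 23.0
peak_iters = 64_000

# Assumed: every iteration takes about the same wall-clock time.
sec_per_iter = full_hours * 3600 / full_iters
peak_hours = peak_iters * sec_per_iter / 3600

print(f"{sec_per_iter:.3f} s/iter")               # 0.828 s/iter
print(f"{peak_hours:.2f} h for {peak_iters} iters")  # 14.72 h for 64000 iters
```

The estimate of about 14.7 hours matches the ~15 hours reported for the 64k-iteration checkpoint.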

Train with FP16: FP16-version

🐾 Updates

Add emojis

🚧 TODO

🏪 Model Zoo

Model	Threshold	Val F1 / P / R	Test A F1 / P / R	Test B F1 / P / R	Training Log	YAML
63_grad_clip_norm_0.5_net_64000.pth	0	0.8834 0.8909 0.8761	0.8888 0.8762 0.9017	0.8909 0.8790 0.9031	log	yaml
	1.65	-	-	0.8936 0.8970 0.8902
64_grad_clip_norm_0.1_net_60000.pth	0	0.8753 0.9002 0.8517	0.8910 0.8901 0.8919	0.8933 0.8933 0.8933	log	yaml
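
The F1 values in the table are the harmonic mean of the listed precision and recall, which can be checked directly. The sketch below also shows how a decision threshold trades precision against recall. The scores and labels are made up, and the rule "predict a match when score > threshold" is an assumption for illustration, not necessarily how `predict.sh` applies the threshold.

```python
from typing import Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_at(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when predicting a match for score > threshold (assumed rule)."""
    preds = [s > threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(preds, labels))
    fp = sum(p and y == 0 for p, y in zip(preds, labels))
    fn = sum((not p) and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Reproduce two F1 entries from the table (Val at threshold 0, Test B at 1.65).
print(round(f1_score(0.8909, 0.8761), 4))  # 0.8834
print(round(f1_score(0.8970, 0.8902), 4))  # 0.8936

# Hypothetical scores: a higher threshold raises precision and lowers recall.
scores = [2.1, 0.4, -0.3, 1.2]
labels = [1, 0, 0, 1]
print(precision_recall_at(scores, labels, 1.65))  # (1.0, 0.5)
```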

🪄 Environment Setup

Docker

We recommend using our prebuilt Docker image ccks-2022, which also includes our preprocessed data.

Pip

First install PyTorch to match the versions listed in About Runtime Environment.
Then install the remaining dependencies with pip.

pip install -r requirements.txt

🗺 Dataset Preparation

Docker

Our Docker image ccks-2022 includes our preprocessed data, which is smaller than the raw data and easier to download.

Download and Preprocess manually

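# Remember the repo root so its scripts can be called from /data.
export REPO_DIR=$PWD

# Download the raw data from inside /data and reassemble the split image archive.
mkdir -p /data
cd /data
bash "$REPO_DIR/scripts/download_data.sh"
cat item_train_images.zip.part* > item_train_images.zip

# Back in the repo: resize the images, then prepare the data.
cd "$REPO_DIR"
bash scripts/resize_img.sh
bash scripts/prepare_data.sh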

🚄 Train

bash train.sh

📋 Test

Because of the file size limit of GitHub Releases, the checkpoint is split into two parts. Please download 63_grad_clip_norm_0.5_net_64000.pth.partaa and 63_grad_clip_norm_0.5_net_64000.pth.partab into this repo and run:

cat 63_grad_clip_norm_0.5_net_64000.pth.part* > 63_grad_clip_norm_0.5_net_64000.pth
bash predict.sh
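
Where `cat` is not available (for example on Windows), the parts can be joined with a few lines of Python instead. This is a minimal sketch: `join_parts` is a hypothetical helper that is not part of this repo, and the demo runs on small dummy files rather than the real checkpoint.

```python
import shutil
import tempfile
from pathlib import Path


def join_parts(target: Path) -> Path:
    """Concatenate `<target>.part*` files, in sorted name order, into `<target>`."""
    parts = sorted(target.parent.glob(target.name + ".part*"))
    if not parts:
        raise FileNotFoundError(f"no parts found for {target}")
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large files are fine
    return target


# Demo with dummy parts in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.pth"
    (Path(tmp) / "demo.pth.partaa").write_bytes(b"hello ")
    (Path(tmp) / "demo.pth.partab").write_bytes(b"world")
    joined = join_parts(demo).read_bytes()

print(joined)  # b'hello world'
```

For the real checkpoint, the call would be `join_parts(Path("63_grad_clip_norm_0.5_net_64000.pth"))`, run from the repo directory that holds the two downloaded parts.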

📝 Citations

If this work helps your research or work, please consider citing our paper. A BibTeX entry is given below.

  @misc{https://doi.org/10.48550/arxiv.2208.11064,
    doi = {10.48550/ARXIV.2208.11064},
    url = {https://arxiv.org/abs/2208.11064},
    author = {Han Chenchen and Jia Heng},
    keywords = {Machine Learning (cs.LG), FOS: Computer and information sciences},
    title = {Multi-Modal Representation Learning with Self-Adaptive Thresholds for Commodity Verification},
    publisher = {arXiv},
    year = {2022},
    copyright = {arXiv.org perpetual, non-exclusive license}
  }
