zhaoguangyao / CCKS2022-track2-solution

ccsk2022-task9-track2

Solution for CCKS2022 Track2

🌟 Introduction

This is the third-place solution for Task 2 of the CCKS-2022 Digital Business Knowledge Map Assessment Competition.

📃Paper: "Multi-Modal Representation Learning with Self-Adaptive Thresholds for Commodity Verification"

[Figures: model, similarity-pos-neg-sat]

About Training Data

  • Training uses only the official training set. No external training data or test data is used.
  • When building the validation set, we remove items that appear in the training set so that the training and validation sets do not overlap. The final ratio of training set to validation set is about 5.6:1.
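The overlap removal described above can be sketched as follows. This is a minimal illustration and not the authors' script: the function name `split_without_overlap`, the `(item_a, item_b, label)` tuple format, and the `val_fraction` value are assumptions made for the example.

```python
import random


def split_without_overlap(pairs, val_fraction=0.2, seed=0):
    """Split labelled item pairs so that no item id occurs in both sets.

    `pairs` is a list of (item_a, item_b, label) tuples (hypothetical format).
    """
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    val_candidates, train = shuffled[:n_val], shuffled[n_val:]

    # Collect every item id that the training pairs touch.
    train_items = {item for a, b, _ in train for item in (a, b)}

    # Keep a validation pair only if neither item was seen in training.
    val = [
        (a, b, y)
        for a, b, y in val_candidates
        if a not in train_items and b not in train_items
    ]
    return train, val
```

Because overlapping pairs are dropped from the validation side, the final ratio between the two sets depends on the data and will generally differ from `val_fraction`.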

About Data Preprocessing

  • We resize all images to 384 x 384.
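  • For text, in addition to the title, we keep the 10 most frequent pvs and sku keys: ["颜色分类", "货号", "型号", "品牌", "尺寸", "口味", "品名", "批准文号", "系列", "尺码"] (color category, item number, model, brand, dimensions, flavor, product name, approval number, series, size).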
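As a rough sketch of the text preprocessing step, the snippet below counts attribute keys and builds one text string per item. The dictionary layout (`title`, `pvs`, `sku` as plain dicts), the function names, and the `key:value` joining format are assumptions for illustration; the repository's actual data format and concatenation scheme may differ.

```python
from collections import Counter


def top_attribute_keys(items, k=10):
    """Return the k attribute keys that occur in the most items."""
    counts = Counter()
    for item in items:
        # Count each key once per item, across both pvs and sku.
        counts.update(set(item["pvs"]) | set(item["sku"]))
    return [key for key, _ in counts.most_common(k)]


def build_text(item, keys, sep=" "):
    """Concatenate the title with the values of the selected keys."""
    parts = [item["title"]]
    for key in keys:
        for field in ("pvs", "sku"):
            value = item[field].get(key)
            if value:
                parts.append(f"{key}:{value}")
    return sep.join(parts)
```

For example, `build_text(item, ["品牌", "颜色分类"])` keeps only the brand and color category values next to the title and ignores every other attribute.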

About Pre-trained Models

About Model Ensemble

  • We do not ensemble models and all results are from a single model.

About Runtime Environment

GPU NVIDIA A100-SXM4-80GB * 2
Python 3.8.8
PyTorch 1.8.1
CUDA 11.1
cuDNN 8

About Training Time and GPU Memory

| Stage | Training time | GPU memory |
| --- | --- | --- |
| Train | Full steps, 100k iters, ~23 hours; peak performance, 64k iters, ~15 hours | ~42GB |
| Inference | ~7 minutes | ~16GB |
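As a quick sanity check on the training times, the reported figures imply a per-iteration cost of a bit under one second. The hours in the table are approximate, so the result below is only an estimate; the helper name is hypothetical.

```python
def seconds_per_iteration(hours, iterations):
    """Average wall-clock seconds spent on one training iteration."""
    return hours * 3600 / iterations


full = seconds_per_iteration(23, 100_000)  # full schedule
peak = seconds_per_iteration(15, 64_000)   # checkpoint with peak performance
print(f"{full:.2f} s/iter, {peak:.2f} s/iter")  # prints "0.83 s/iter, 0.84 s/iter"
```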

Train with FP16: FP16-version

🐾 Updates

  • Add emojis

🚧 TODO

  • Docker image
  • Pre-trained models
  • Logs
  • Results
  • Figure
  • FP16
  • Emoji

🏪 Model Zoo

| Model | Threshold | Val F1 / P / R | Test A F1 / P / R | Test B F1 / P / R | Training Log | YAML |
| --- | --- | --- | --- | --- | --- | --- |
| 63_grad_clip_norm_0.5_net_64000.pth | 0 | 0.8834 / 0.8909 / 0.8761 | 0.8888 / 0.8762 / 0.9017 | 0.8909 / 0.8790 / 0.9031 | log | yaml |
| | 1.65 | - | - | 0.8936 / 0.8970 / 0.8902 | | |
| 64_grad_clip_norm_0.1_net_60000.pth | 0 | 0.8753 / 0.9002 / 0.8517 | 0.8910 / 0.8901 / 0.8919 | 0.8933 / 0.8933 / 0.8933 | log | yaml |
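The F1 / P / R columns are related by the usual harmonic mean, which can be checked with a short sketch. The decision rule used here (predict "same commodity" when the score exceeds the threshold) is an assumption for illustration; this README does not state how the threshold is applied to the model output, and both function names are hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(scores, labels, threshold=0.0):
    """Assumed rule: predict positive when score > threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score > threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)
```

For instance, `round(f1_from(0.8909, 0.8761), 4)` gives `0.8834`, matching the validation F1 reported for the first checkpoint. Under the assumed rule, raising the threshold trades recall for precision.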

🪄 Environment Setup

Docker

  • We recommend using our prebuilt Docker image ccks-2022, which also includes our preprocessed data.

Pip

  1. Please install PyTorch according to About Runtime Environment first.
  2. Then install other dependencies by pip.
pip install -r requirements.txt

🗺 Dataset Preparation

Docker

  • Our Docker image ccks-2022 includes our preprocessed data, which is smaller than the raw data and easier to download.

Download and Preprocess manually

export REPO_DIR=$PWD

mkdir -p /data
cd /data
bash "$REPO_DIR/scripts/download_data.sh"
cat item_train_images.zip.part* > item_train_images.zip

cd $REPO_DIR
bash scripts/resize_img.sh
bash scripts/prepare_data.sh

🚄 Train

bash train.sh

📋 Test

Because of the file size limit of GitHub Release, the checkpoint is split into two parts. Please download 63_grad_clip_norm_0.5_net_64000.pth.partaa and 63_grad_clip_norm_0.5_net_64000.pth.partab into this repository and run:

cat 63_grad_clip_norm_0.5_net_64000.pth.part* > 63_grad_clip_norm_0.5_net_64000.pth
bash predict.sh
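On systems without `cat`, the same reassembly can be done in Python. This is a small sketch rather than part of the repository; the helper name `join_parts` is hypothetical. It concatenates the parts in sorted order, which mirrors how the shell expands `part*`.

```python
import glob
import shutil


def join_parts(prefix, output_path):
    """Concatenate all files matching `prefix*`, in sorted order."""
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    if not parts:
        raise FileNotFoundError(f"no files match {prefix}*")
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part so large checkpoints are not held in memory.
                shutil.copyfileobj(src, out)
    return parts


# Usage with the checkpoint named above:
# join_parts("63_grad_clip_norm_0.5_net_64000.pth.part",
#            "63_grad_clip_norm_0.5_net_64000.pth")
```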

📝Citations

If this work helps your research or work, please consider citing our paper. The BibTeX entry is:

  @misc{https://doi.org/10.48550/arxiv.2208.11064,
    doi = {10.48550/ARXIV.2208.11064},
    url = {https://arxiv.org/abs/2208.11064},
    author = {Han Chenchen and Jia Heng},
    keywords = {Machine Learning (cs.LG), FOS: Computer and information sciences},
    title = {Multi-Modal Representation Learning with Self-Adaptive Thresholds for Commodity Verification},
    publisher = {arXiv},
    year = {2022},
    copyright = {arXiv.org perpetual, non-exclusive license}
  }
