seekerhuang / ROAD-R-CHALLENGE

[NeurIPS 2023] ROAD-R-CHALLENGE TRACK 1 & 2 RK2 🥈


ROAD-R Challenge

The code is built on top of 3D-RetinaNet for ROAD.

The first task requires developing models for scenarios where only a small amount of annotated data is available at training time. More precisely, only 3 of the 15 videos (from the training partition train_1 of the ROAD-R dataset) are used to train the models for this task.

The video ids are: 2014-07-14-14-49-50_stereo_centre_01, 2015-02-03-19-43-11_stereo_centre_04, and 2015-02-24-12-32-19_stereo_centre_04.
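For illustration, restricting a data loader to these three videos amounts to a simple filter. The helper below is hypothetical (only the video ids come from the challenge specification; the repository's actual loader may differ):

```python
# The three videos allowed for Task-1 training (train_1 partition of ROAD-R).
TASK1_TRAIN_VIDEOS = {
    "2014-07-14-14-49-50_stereo_centre_01",
    "2015-02-03-19-43-11_stereo_centre_04",
    "2015-02-24-12-32-19_stereo_centre_04",
}

def filter_task1_videos(video_ids):
    """Keep only the videos permitted for Task-1 training."""
    return [v for v in video_ids if v in TASK1_TRAIN_VIDEOS]
```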

Using only the three specified videos, without any data augmentation, complex post-processing, or test-time augmentation (TTA), our TBSD model achieves a frame mAP@50 of 0.262 on Task 1. When ensembled with the only-dinov2 branch (only_dinov2_branch), the final frame mAP@50 reaches around 0.27 on Task 1.
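The actual ensembling code lives in the ensemble folder; as a rough illustration only, a score-level ensemble of the two branches can be viewed as a weighted average of per-class scores over matched detections. The function name and the equal default weighting below are assumptions, not the repository's exact method:

```python
def ensemble_scores(scores_tbsd, scores_dinov2, w=0.5):
    """Weighted average of per-class scores for matched detections.

    scores_tbsd and scores_dinov2 are lists of per-class score rows for the
    same set of boxes; w is the weight given to the TBSD branch.
    """
    return [
        [w * a + (1.0 - w) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(scores_tbsd, scores_dinov2)
    ]
```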

Table of Contents

  • Dependencies and Data Preparation
  • Pretrained Model
  • Training
  • Testing
  • Appendix
  • Acknowledgments

Dependencies and Data Preparation

Please refer to the "environment" folder in the repository, where you can choose a .yml file to build the environment:

conda env create -f environment.yml
conda activate base

or:

pip install -r requirements.txt

The road directory should look like this:

   road/
        - road_trainval_v1.0.json
        - videos/
            - 2014-06-25-16-45-34_stereo_centre_02/
            - 2014-06-26-09-53-12_stereo_centre_02/
            - ........
        - rgb-images/
            - 2014-06-25-16-45-34_stereo_centre_02/
                - 00001.jpg
                - 00002.jpg
                - ........*.jpg
            - 2014-06-26-09-53-12_stereo_centre_02/
                - 00001.jpg
                - 00002.jpg
                - ........*.jpg
            - ......../
                - ........*.jpg

Place the dataset directory at the parent level of this repository's directory, i.e., the parent of the directory containing this README.md. Please refer to road-dataset for the exact format.
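As a quick sanity check before training, a small helper like the one below can confirm that the required entries exist under road/. The function and constant names are assumptions for illustration; the expected entries are taken from the layout above:

```python
from pathlib import Path

# Entries expected directly under <DATA_ROOT>/road (per the layout above).
REQUIRED_ENTRIES = ["road_trainval_v1.0.json", "videos", "rgb-images"]

def missing_road_entries(data_root):
    """Return the required entries missing under <data_root>/road."""
    road = Path(data_root) / "road"
    return [name for name in REQUIRED_ENTRIES if not (road / name).exists()]
```

An empty return value means the layout matches; any listed names must be added before running the training command.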

Pretrained Model

Please place the pre-trained models in the /pretrainmodel folder. You can obtain the pre-trained models from the link provided below.

| Model | Link |
| --- | --- |
| swin_base_patch244_window1677_sthv2.pth (optional) | swin-base-ssv2 |
| swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth | swin-large-k700 |
| yolox_l.pth | yolox-l |
| vit-giant-p14_dinov2-pre_3rdparty_20230426-2934a630.pth | dinov2-giant |
| vit-large-p14_dinov2-pre_3rdparty_20230426-f3302d9e.pth | dinov2-large |
| pretrained weight for head | pretrained weight for head |

Note: You may need to run get_kinetics_weights.sh (included in the ROAD-R Challenge repository) to obtain the file named resnet50RCGRU.pth; otherwise, you may encounter an error.

Training

To train the model, provide the following positional arguments:

  • DATA_ROOT: path to a directory in which road can be found, containing road_test_v1.0.json, road_trainval_v1.0.json, and directories rgb-images and videos.
  • SAVE_ROOT: path to a directory in which the experiments (e.g. checkpoints, training logs) will be saved.
  • MODEL_PATH: path to the directory containing the weights for the chosen backbone (e.g. resnet50RCGRU.pth).

The remaining experimental details and logs can be found in actual_task1_logs_TBSD and actual_task1_logs_only_dinov2. The all_history_logs folder in the main directory contains all experimental information for Tasks 1 and 2.

Example train command (to be run from the root of this repository):

python main.py --TASK=1 --DATA_ROOT="yourpath/road-dataset-master/" --pretrained_model_path="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth" --pretrained_model_path2="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/pretrained_weights_task1.pth" --MODEL_PATH="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/kinetics-pt/" --SAVE_ROOT="yourpath/road-dataset-master/SAVE/" --MODE="train" --LOGIC="Lukasiewicz" --VAL_STEP=1 --LR=6e-5 --MAX_EPOCHS=25

Testing

Below is an example command to test a model.

CUDA_VISIBLE_DEVICES=1 python main.py --RESUME=20 --TASK=1 --LOGIC="Lukasiewicz" --EXPDIR="yourpath/road-dataset-master/experiments/" --DATA_ROOT="yourpath/road-dataset-master/" --pretrained_model_path="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth" --pretrained_model_path2="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/pretrained_weights_task1.pth" --MODEL_PATH="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/kinetics-pt/" --SAVE_ROOT="yourpath/road-dataset-master/SAVE/" --MODE="gen_dets" --TEST_SUBSETS=test --EVAL_EPOCHS=20 --EXP_NAME="yourpath/road-dataset-master/SAVE/road/logic-ssl_cache_Lukasiewicz_8.0/resnet50RCGRU512-Pkinetics-b8s12x1x1-roadt1-h3x3x3-10-23-09-28-54x/"

Appendix

There are README instructions in the environment and ensemble folders that you may need to read in order to run the project more effectively.

Acknowledgments

[1] road-dataset

[2] ROAD-R-2023-Challenge

[3] 3D-RetinaNet for ROAD

[4] Video-Swin-Transformer

[5] dinov2

[6] YOLOX

[7] mmpretrain

[8] mmaction2


License: Apache License 2.0

