Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining (ECCV 2022)

Webpage | Code | Paper | YouTube Driving Dataset | Pretrained ResNet34

Installation

Our codebase is based on MoCo. A standard PyTorch environment is sufficient (see MoCo's installation instructions).

Dataset

We collect driving videos from YouTube. Here we provide the 🔗 video list we used. You can also download the frames directly via 🔗 this OneDrive link and run:

cat sega* > frames.zip

to reassemble the zip file. To train ACO, you should also download label.pt and meta.txt and place them under {aco_path}/code and {your_dataset_directory}/, respectively.
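As a quick sanity check after downloading, you can load the label file and peek at its contents. This is a minimal sketch: the file names follow the paths above, but the internal structure of label.pt is not documented here, so inspect it rather than assuming a specific format.

import torch

# Load the action labels shipped with the dataset (placed under {aco_path}/code).
labels = torch.load('label.pt', map_location='cpu')
print(type(labels))                     # inspect the container holding the action labels

# Read the metadata file (placed under {your_dataset_directory}/).
with open('meta.txt') as f:
    meta = f.read().splitlines()
print(len(meta), meta[:3])              # number of entries and a few sample lines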

Training

We provide main_label_moco.py for training. To perform ACO training of a ResNet-34 model on an 8-GPU machine, run:

python main_label_moco.py -a resnet34 --mlp -j 16 --lr 0.003 \
			 --batch-size 256 --moco-k 40960 --dist-url 'tcp://localhost:10001' \
			 --multiprocessing-distributed --world-size 1 --rank 0 {your_dataset_directory} 

Some important arguments:

  • --aug_cf: whether to use cropping and flipping augmentations during pre-training. ACO does not use these two augmentations by default.
  • --thres: the action similarity threshold, which decides which frame pairs count as action-similar (see the conceptual sketch after this list).
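To illustrate what an action similarity threshold can mean in practice, here is a conceptual sketch, not the repo's implementation: the function name action_positive_mask, the use of L2 distance, and the tensor shapes are assumptions for illustration. The idea is that frames whose actions differ by less than the threshold would be treated as positive pairs for the contrastive objective.

import torch

def action_positive_mask(actions: torch.Tensor, thres: float) -> torch.Tensor:
    # actions: (N, D) per-frame action labels (e.g. steering / throttle).
    # Returns an (N, N) boolean mask marking pairs whose actions are closer
    # than `thres`, i.e. the pairs a contrastive loss would treat as positives.
    dist = torch.cdist(actions, actions)   # pairwise L2 distances, shape (N, N)
    mask = dist < thres                    # similar actions -> candidate positives
    mask.fill_diagonal_(False)             # a frame is not its own positive
    return mask

# Example: four frames with 2-D actions; the first two and the last two are similar.
actions = torch.tensor([[0.00, 0.50], [0.02, 0.50], [0.80, 0.10], [0.79, 0.12]])
print(action_positive_mask(actions, thres=0.1))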

Pretrained weights

We also provide a 🔗 pretrained ResNet-34 checkpoint. After downloading it, you can load the checkpoint via:

import torch
from torchvision.models import resnet34

# Build a standard torchvision ResNet-34 and load the ACO weights.
# strict=False ignores keys that do not match, e.g. the classification head.
net = resnet34()
net.load_state_dict(torch.load('ACO_resnet34.ckpt'), strict=False)
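The checkpoint is intended as a visual backbone, so a common next step is to use it as a feature extractor. Below is a minimal sketch, assuming you drop the classification head and feed ImageNet-sized RGB frames; the dummy input and the choice of nn.Identity() are illustrative and not part of the repo.

import torch
from torch import nn
from torchvision.models import resnet34

net = resnet34()
net.load_state_dict(torch.load('ACO_resnet34.ckpt'), strict=False)
net.fc = nn.Identity()                      # keep the 512-d features, drop the classifier
net.eval()

with torch.no_grad():
    frames = torch.randn(2, 3, 224, 224)    # a dummy batch of RGB frames
    features = net(frames)
print(features.shape)                       # torch.Size([2, 512])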

Bibtex

@article{zhang2022learning,
  title={Learning to Drive by Watching YouTube videos: Action-Conditioned Contrastive Policy Pretraining},
  author={Zhang, Qihang and Peng, Zhenghao and Zhou, Bolei},
  journal={European Conference on Computer Vision (ECCV)},
  year={2022}
}
