vobecant / DriveAndSegment


Welcome to the code for Drive&Segment 👋

Project Page · License: MIT · Twitter: @AVobecky · Gradio App

🚙📷 Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation (ECCV'22 oral)

This project hosts the inference code for Drive&Segment, a method for unsupervised semantic segmentation of urban scenes.

The paper was accepted to the ECCV'22 conference for an oral presentation (2.7% of submissions).

Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation
Antonin Vobecky, David Hurych, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, and Josef Sivic

arXiv preprint (arXiv:2203.11160, full paper)


Teaser figure.

💫 Highlights

  • 🚫🔬 Unsupervised semantic segmentation: Drive&Segment learns semantic segmentation of urban scenes without any manual annotation, just from raw, non-curated data collected by cars equipped with 📷 cameras and 💥 LiDAR sensors.
  • 📷💥 Multi-modal training: At training time, our method takes 📷 images and 💥 LiDAR scans as input and learns a semantic segmentation model without using manual annotations.
  • 📷 Image-only inference: At inference time, Drive&Segment takes only images as input.
  • 🏆 State-of-the-art performance: Our best single model, based on the Segmenter architecture, achieves 21.8% mIoU on Cityscapes (without any fine-tuning).
  • 🚀 Gradio application: We provide an interactive Gradio application so that everyone can try our model.

📺 Examples

Pseudo segmentation.

Example of pseudo segmentation.

Cityscapes segmentation.

Two examples of pseudo segmentation mapped to the 19 ground-truth classes of the Cityscapes dataset using the Hungarian algorithm.
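
For intuition, below is a minimal sketch (ours, not code from this repository) of how such a mapping can be obtained: build the pixel-overlap (confusion) matrix between pseudo-classes and ground-truth classes, then solve the assignment with scipy.optimize.linear_sum_assignment. All names are illustrative.

# Illustrative sketch: map pseudo-classes to ground-truth classes via the
# Hungarian algorithm (maximizing pixel overlap). Not code from this repo.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_pseudo_to_gt(pseudo, gt, n_pseudo, n_gt, ignore_index=255):
    # pseudo, gt: integer label maps of the same shape.
    valid = gt.ravel() != ignore_index  # assumes 255 marks ignored pixels (Cityscapes convention)
    p, g = pseudo.ravel()[valid], gt.ravel()[valid]
    # Confusion matrix: overlap[i, j] = #pixels with pseudo-class i and GT class j.
    overlap = np.zeros((n_pseudo, n_gt), dtype=np.int64)
    np.add.at(overlap, (p, g), 1)
    # Hungarian algorithm minimizes cost, so negate to maximize overlap.
    # With a rectangular matrix, only min(n_pseudo, n_gt) pseudo-classes get matched.
    rows, cols = linear_sum_assignment(-overlap)
    return dict(zip(rows.tolist(), cols.tolist()))  # pseudo-class -> GT class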

✨ Running the models

๐Ÿ“ Requirements

Please refer to requirements.txt.

To create and run a virtual environment with the required packages, you can run:

conda create --name DaS --file requirements.txt
conda activate DaS

🚀 Inference

We provide our Segmenter model trained on the nuScenes dataset.

Run

python3 inference.py --input-path [path to image/folder with images] --output-dir [where to save the outputs] --cuda

where:

  • --input-path specifies either a path to a single image or a path to a folder of images,
  • --output-dir (optional) specifies the output directory, and
  • --cuda is a flag that runs the code on the GPU (strongly recommended).

Example: python3 inference.py --input-path sources/img1.jpeg --output-dir ./outputs --cuda
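
Since --input-path already accepts a folder, a wrapper script is only needed when processing several folders. A small sketch that relies solely on the flags documented above (the input folder names are hypothetical):

import subprocess
from pathlib import Path

for folder in ["./day_imgs", "./night_imgs"]:  # hypothetical input folders
    out_dir = Path("./outputs") / Path(folder).name
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["python3", "inference.py",
         "--input-path", folder,
         "--output-dir", str(out_dir),
         "--cuda"],
        check=True,  # stop on the first failing run
    )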

nuScenes model

We provide weights and config files (CPU / GPU) for the Segmenter model trained on the nuScenes dataset.

The weights and config files are downloaded automatically when running inference.py. Should you prefer to download them by hand, please place them in the ./weights folder.
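
If you do download them by hand, a sketch of fetching files into ./weights (the file names and URLs below are placeholders, not the real links; substitute the ones referenced by inference.py):

import urllib.request
from pathlib import Path

weights_dir = Path("./weights")
weights_dir.mkdir(exist_ok=True)
# Placeholder names/URLs -- substitute the actual links used by inference.py.
files = {
    "segmenter_nuscenes.pth": "https://example.com/segmenter_nuscenes.pth",
    "config.yml": "https://example.com/config.yml",
}
for name, url in files.items():
    urllib.request.urlretrieve(url, str(weights_dir / name))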

Waymo Open model

Due to the Waymo Open dataset license terms, we cannot openly share the trained weights. If you are interested in the model trained on the Waymo Open dataset, please register for the Waymo Open dataset and send confirmation of your agreement to the license terms to the authors (antonin (dot) vobecky (at) cvut (dot) cz).

📈 Results

In the table below, we report the results of our best model (Segmenter trained on the Waymo Open dataset) evaluated in the unsupervised setup (i.e., by obtaining a mapping between pseudo- and ground-truth classes via the Hungarian algorithm).

Dataset               mIoU
Cityscapes (19 cls)   21.8
Cityscapes (27 cls)   15.3
Dark Zurich           14.2
Nighttime Driving     18.9
ACDC (night)          13.8
ACDC (fog)            14.5
ACDC (rain)           14.9
ACDC (snow)           14.6
ACDC (mean)           16.7
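
For completeness, a sketch of how mIoU can be computed once predictions are mapped to the ground-truth label space (continuing the illustrative matching sketch above; the ignore index 255 follows the Cityscapes convention):

import numpy as np

def mean_iou(pred, gt, n_classes, ignore_index=255):
    # pred: predictions already mapped to GT classes; gt: ground-truth label map.
    valid = gt != ignore_index
    pred, gt = pred[valid], gt[valid]
    ious = []
    for c in range(n_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:  # skip classes absent from both prediction and GT
            ious.append(inter / union)
    return float(np.mean(ious))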

Current standings can be tracked on the Papers with Code leaderboards.

📖 Citation

Please consider citing our paper in your publications if this project helps your research. The BibTeX reference is as follows:

@inproceedings{vobecky2022drivesegment,
  title     = {Drive\&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation},
  author    = {Antonin Vobecky and David Hurych and Oriane Sim{\'e}oni and Spyros Gidaris and Andrei Bursuc and Patrick P{\'e}rez and Josef Sivic},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month     = {October},
  year      = {2022}
}
