yan-hao-tian / lawin

Code based on MaskFormer.


Lawin Transformer: Improving New-Era Vision Backbones with Multi-Scale Representations for Semantic Segmentation

[Paper] [Poster] [Supplementary] [Code]

🔥🔥🔥 A 4-page abstract version was accepted by the Transformers for Vision workshop @ CVPR 2023.

🔥🔥🔥 A formal version, Multi-Scale Representations by Varying Window Attention for Semantic Segmentation, was accepted by ICLR 2024.

Installation

See installation instructions.

Datasets

See Preparing Datasets for MaskFormer.
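MaskFormer reads datasets through detectron2's dataset registry, which looks for data under the `DETECTRON2_DATASETS` environment variable. A minimal sketch of the setup, assuming ADE20K has already been downloaded and converted per the linked instructions (the root path is a placeholder):

```shell
# Point detectron2 at the root folder holding the prepared datasets.
# Expected layout for ADE20K (per the MaskFormer dataset instructions):
#   $DETECTRON2_DATASETS/
#     ADEChallengeData2016/
#       images/
#       annotations_detectron2/
export DETECTRON2_DATASETS=/path/to/datasets
```

If the variable is unset, detectron2 defaults to looking for a `./datasets` folder relative to the working directory.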

Train

For more usage details, see Getting Started with MaskFormer.

MaskFormer

Swin-Tiny

```shell
python ./train_net.py \
  --resume --num-gpus 2 --dist-url auto \
  --config-file configs/ade20k-150/swin/lawin/lawin_maskformer_swin_tiny_bs16_160k.yaml \
  OUTPUT_DIR path/to/tiny TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Swin-Small

```shell
python ./train_net.py \
  --resume --num-gpus 4 --dist-url auto \
  --config-file configs/ade20k-150/swin/lawin/lawin_maskformer_swin_small_bs16_160k.yaml \
  OUTPUT_DIR path/to/small TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Swin-Base

```shell
python ./train_net.py \
  --resume --num-gpus 8 --dist-url auto \
  --config-file configs/ade20k-150/swin/lawin/lawin_maskformer_swin_base_IN21k_384_bs16_160k_res640.yaml \
  OUTPUT_DIR path/to/base TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Swin-Large

```shell
python ./train_net.py \
  --resume --num-gpus 16 --dist-url auto \
  --config-file configs/ade20k-150/swin/lawin/lawin_maskformer_swin_large_IN21k_384_bs16_160k_res640.yaml \
  OUTPUT_DIR path/to/large TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```
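Every command above sets `MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64`, which pads each input side up to the next multiple of 64 so the feature maps divide evenly through the backbone's strided stages and attention windows. The rounding is plain ceiling division; a quick sketch:

```shell
# Pad a side length up to the next multiple of the divisibility (64 here).
div=64
h=513
padded_h=$(( (h + div - 1) / div * div ))
echo "$padded_h"   # prints 576: 513 rounds up to 9 * 64
```

A side that is already a multiple of 64 (e.g. 512) is left unchanged by the same formula.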

Mask2Former

Swin-Tiny

```shell
cd Mask2Former
python ./train_net.py \
  --resume --num-gpus 2 --dist-url auto \
  --config-file configs/ade20k/semantic-segmentation/swin/lawin/lawin_maskformer2_swin_tiny_bs16_160k.yaml \
  OUTPUT_DIR path/to/tiny TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Swin-Small

```shell
cd Mask2Former
python ./train_net.py \
  --resume --num-gpus 4 --dist-url auto \
  --config-file configs/ade20k/semantic-segmentation/swin/lawin/lawin_maskformer2_swin_small_bs16_160k.yaml \
  OUTPUT_DIR path/to/small TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Swin-Base

```shell
cd Mask2Former
python ./train_net.py \
  --resume --num-gpus 8 --dist-url auto \
  --config-file configs/ade20k/semantic-segmentation/swin/lawin/lawin_maskformer2_swin_base_IN21k_384_bs16_160k_res640.yaml \
  OUTPUT_DIR path/to/base TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Swin-Large

```shell
cd Mask2Former
python ./train_net.py \
  --resume --num-gpus 16 --dist-url auto \
  --config-file configs/ade20k/semantic-segmentation/swin/lawin/lawin_maskformer2_swin_large_IN21k_384_bs16_160k_res640.yaml \
  OUTPUT_DIR path/to/large TEST.EVAL_PERIOD 10000 MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Evaluation

MaskFormer

```shell
python ./train_net.py \
  --eval-only --num-gpus NGPUS --dist-url auto \
  --config-file path/to/config \
  MODEL.WEIGHTS path/to/weight TEST.AUG.ENABLED True MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```

Mask2Former

```shell
cd Mask2Former
python ./train_net.py \
  --eval-only --num-gpus NGPUS --dist-url auto \
  --config-file path/to/config \
  MODEL.WEIGHTS path/to/weight TEST.AUG.ENABLED True MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```
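The evaluation commands above set `TEST.AUG.ENABLED True`, which turns on detectron2's test-time augmentation (multi-scale + flip) and therefore corresponds to the "ms+flip" column in the tables below. To reproduce the single-scale mIoU column instead, a sketch of the same invocation with augmentation disabled (paths remain placeholders):

```shell
# Single-scale evaluation: disable multi-scale + flip test-time augmentation.
python ./train_net.py \
  --eval-only --num-gpus NGPUS --dist-url auto \
  --config-file path/to/config \
  MODEL.WEIGHTS path/to/weight TEST.AUG.ENABLED False MODEL.MASK_FORMER.SIZE_DIVISIBILITY 64
```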

Models

| Name | Backbone | Crop size | LR sched | mIoU | mIoU (ms+flip) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| Lawin-MaskFormer | Swin-T | 512x512 | 160k | 47.4 | 49.0 | model |
| Lawin-MaskFormer | Swin-S | 512x512 | 160k | 50.5 | 52.7 | model |
| Lawin-MaskFormer | Swin-B | 640x640 | 160k | 53.8 | 54.6 | model |
| Lawin-MaskFormer | Swin-L | 640x640 | 160k | 55.3 | 56.5 | model |

| Name | Backbone | Crop size | LR sched | mIoU | mIoU (ms+flip) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| Lawin-Mask2Former | Swin-T | 512x512 | 160k | 48.2 | 50.5 | model |
| Lawin-Mask2Former | Swin-S | 512x512 | 160k | 52.1 | 53.7 | model |
| Lawin-Mask2Former | Swin-B | 640x640 | 160k | 54.6 | 56.0 | model |
| Lawin-Mask2Former | Swin-L | 640x640 | 160k | 56.5 | 57.8 | model |

Citing Lawin Transformer

@article{yan2022lawin,
  title={Lawin transformer: Improving semantic segmentation transformer with multi-scale representations via large window attention},
  author={Yan, Haotian and Zhang, Chuang and Wu, Ming},
  journal={arXiv preprint arXiv:2201.01615},
  year={2022}
}

About


License: Other

