nightowlowl/OV_PARTS

OV-PARTS: Towards Open-Vocabulary Part Segmentation

This codebase contains code for baselines used in the paper "OV-PARTS: Towards Open-Vocabulary Part Segmentation".

Environment

torch==1.13.1
torchvision==0.14.1
detectron2==0.6 # Follow https://detectron2.readthedocs.io/en/latest/tutorials/install.html to install it and the packages it requires
mmcv==1.7.1

Furthermore, install the modified CLIP package:

cd third_party/CLIP
python -m pip install -e .
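To confirm that the installed packages match the pinned versions above, a small check along the following lines can help. This is a minimal sketch, not part of the repository: `PINNED` and `version_mismatches` are hypothetical names, and it assumes each package is installed under the distribution name shown in the list (mmcv, for instance, is sometimes installed as `mmcv-full`, in which case the key needs adjusting).

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the list above.
# Assumption: distribution names equal the names used in that list.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "detectron2": "0.6",
    "mmcv": "1.7.1",
}


def version_mismatches(pinned=PINNED, lookup=version):
    """Map package name -> (expected, found) for every pin that is not met."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        # Drop local build tags such as "+cu117" before comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in version_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pin is satisfied; a `found` value of `None` means the package could not be located under that name.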

Data Preparation

We provide the download links for the two benchmark datasets in OV-PARTS: the refined Pascal-Part-116 and ADE20K-Part-234 datasets.

[Pascal-Part-116] [ADE20K-Part-234]

After downloading the datasets, extract the archives with the following commands and place the extracted folders under the "Datasets" directory.

tar -xzf PascalPart116.tar.gz
tar -xzf ADE20KPart234.tar.gz

The Datasets folder should follow this structure:

Datasets/
├─Pascal-Part-116/
│ ├─train_16shot.json
│ ├─images/
│ │ ├─train/
│ │ └─val/
│ ├─annotations_detectron2_obj/
│ │ ├─train/
│ │ └─val/
│ └─annotations_detectron2_part/
│   ├─train/
│   └─val/
└─ADE20K-Part-234/
  ├─images/
  │ ├─training/
  │ └─validation/
  ├─train_16shot.json
  ├─ade20k_instance_train.json
  ├─ade20k_instance_val.json
  └─annotations_detectron2_part/
    ├─training/
    └─validation/
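Before starting a training run, it can be worth checking that the extracted folders match the tree above. The following is a minimal sketch under the assumption that the script is run from the directory containing `Datasets/`; `EXPECTED` and `missing_paths` are hypothetical names, not part of the repository, and the check only tests that each path exists, not that its contents are valid.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
EXPECTED = {
    "Pascal-Part-116": [
        "train_16shot.json",
        "images/train",
        "images/val",
        "annotations_detectron2_obj/train",
        "annotations_detectron2_obj/val",
        "annotations_detectron2_part/train",
        "annotations_detectron2_part/val",
    ],
    "ADE20K-Part-234": [
        "images/training",
        "images/validation",
        "train_16shot.json",
        "ade20k_instance_train.json",
        "ade20k_instance_val.json",
        "annotations_detectron2_part/training",
        "annotations_detectron2_part/validation",
    ],
}


def missing_paths(root="Datasets"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [
        (Path(dataset) / rel).as_posix()
        for dataset, rels in EXPECTED.items()
        for rel in rels
        if not (root / dataset / rel).exists()
    ]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing entries under Datasets/:")
        for path in missing:
            print("  " + path)
    else:
        print("Dataset layout looks complete.")
```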

Training
- Training the two-stage baseline ZSSeg+.
  
  Please first download the CLIP model fine-tuned with CPTCoOp.
  
  Then run the training command:
```
# For ZSSeg+.
python train_net.py --num-gpus 8 --config-file configs/${SETTING}/zsseg+_R50_coop_${DATASET}.yaml
```
- Training the one-stage baselines CLIPSeg and CATSeg.
  
  Please first download the pre-trained object models of CLIPSeg and CATSeg and place them under the "pretrain_weights" directory.
  
  | Models | Pre-trained checkpoint |
  |---|---|
  | CLIPSeg | download |
  | CATSeg | download |
  
  Then run the training command:
```
# For CATSeg.
python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml

# For CLIPSeg.
python train_net.py --num-gpus 8 --config-file configs/${SETTING}/clipseg_${DATASET}.yaml
```


Evaluation

We provide the trained weights for the three baseline models reported in the paper.

| Models | Setting | Pascal-Part-116 checkpoint | ADE20K-Part-234 checkpoint |
|---|---|---|---|
| ZSSeg+ | Zero-shot | download | download |
| CLIPSeg | Zero-shot | download | download |
| CATSeg | Zero-shot | download | download |
| CLIPSeg | Few-shot | download | download |
| CLIPSeg | Cross-dataset | - | download |

To evaluate a trained model, add --eval-only to the training command and set MODEL.WEIGHTS to the checkpoint path. For example:

  python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml --eval-only MODEL.WEIGHTS ${WEIGHT_PATH}
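The training and evaluation commands in this README differ only in the config file name and the trailing evaluation options, so they can be assembled programmatically. The sketch below is one possible way to do that. `CONFIG_PATTERNS` and `build_command` are hypothetical helpers, not part of the repository, and the setting, dataset, and weight path in the usage line are placeholders: the valid values are whatever directory and file names actually exist under `configs/`.

```python
import shlex

# Config file name patterns, copied from the commands in this README.
CONFIG_PATTERNS = {
    "zsseg+": "zsseg+_R50_coop_{dataset}.yaml",
    "catseg": "catseg_{dataset}.yaml",
    "clipseg": "clipseg_{dataset}.yaml",
}


def build_command(method, setting, dataset, weights=None, num_gpus=8):
    """Return the train_net.py argument list; pass `weights` to evaluate."""
    config_name = CONFIG_PATTERNS[method].format(dataset=dataset)
    config = f"configs/{setting}/{config_name}"
    cmd = ["python", "train_net.py", "--num-gpus", str(num_gpus),
           "--config-file", config]
    if weights is not None:
        # Evaluation: same command plus --eval-only and the checkpoint path.
        cmd += ["--eval-only", "MODEL.WEIGHTS", weights]
    return cmd


if __name__ == "__main__":
    # Placeholder values; substitute real names from the configs/ directory.
    args = build_command("catseg", "my_setting", "my_dataset",
                         weights="path/to/weights.pth")
    print(shlex.join(args))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with the `+` in the ZSSeg+ config name.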

Acknowledgement

We would like to express our gratitude to the open-source projects and their contributors, including ZSSeg, CATSeg and CLIPSeg. Their valuable work has greatly contributed to the development of our codebase.
