ai-forever / CerberusDet

CerberusDet: Unified Multi-Task Object Detection



[Paper]


The code is based on:

Install

Python>=3.8.0 is required.

$ git clone
$ pip install -e .

Docker

Update the local data paths in the volumes section of docker-compose.yml, then start the container:

sudo docker-compose up -d
sudo docker attach cerberusdet_cerber_1

Data

  • Use the voc.py script to download the VOC dataset.

For information about the VOC dataset and its creators, visit the PASCAL VOC dataset website.

  • Use the objects365_animals.py script to download the part of the Objects365 dataset used in the paper, which covers 19 categories:
['Monkey', 'Rabbit', 'Yak', 'Antelope', 'Pig',  'Bear', 'Deer', 'Giraffe', 'Zebra', 'Elephant',
'Lion', 'Donkey', 'Camel', 'Jellyfish', 'Other Fish', 'Dolphin', 'Crab', 'Seal', 'Goldfish']

The Objects365 dataset is available for academic purposes only. For information about the dataset and its creators, visit the Objects365 dataset website.

IMPORTANT: If any patch tar.gz archives remain in the Objects365_part/tmp_images directory, they were not fully downloaded. Restart the script for the missing patches to obtain all subset images.
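The check described above can be scripted. The following is a minimal sketch, not part of the repository: the function name leftover_archives is hypothetical, and it assumes the download script is run from the directory that contains Objects365_part.

```python
from pathlib import Path
from typing import List


def leftover_archives(tmp_dir: str = "Objects365_part/tmp_images") -> List[str]:
    """Return the names of patch archives still present under tmp_dir.

    Hypothetical helper: any *.tar.gz file found here is treated as a
    patch that was not fully downloaded, as the note above describes.
    """
    root = Path(tmp_dir)
    if not root.is_dir():
        # Nothing to report if the temporary directory does not exist.
        return []
    return sorted(p.name for p in root.rglob("*.tar.gz"))


if __name__ == "__main__":
    leftovers = leftover_archives()
    if leftovers:
        print("Re-run the download script for these patches:")
        for name in leftovers:
            print("  " + name)
    else:
        print("No leftover patch archives found.")
```

An empty result only means that no archives remain in that directory. It does not verify the extracted images themselves.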

Train

  • Download YOLOv8 weights pretrained on COCO
  • Run training on a single GPU:
$ python3 cerberusdet/train.py \
--img 640 --batch 32 \
--data data/voc_obj365.yaml \
--weights pretrained/yolov8x_state_dict.pt \
--cfg cerberusdet/models/yolov8x_voc_obj365.yaml \
--hyp data/hyps/hyp.cerber-voc_obj365.yaml \
--name voc_obj365_v8x --device 0
  • Or run training on several GPUs (the batch size is divided among them):
$ CUDA_VISIBLE_DEVICES="0,1,2,3" \
python -m torch.distributed.launch --nproc_per_node 4 cerberusdet/train.py \
--img 640 --batch 128 \
--data data/voc_obj365.yaml \
--weights pretrained/yolov8x_state_dict.pt \
--cfg cerberusdet/models/yolov8x_voc_obj365.yaml \
--hyp data/hyps/hyp.cerber-voc_obj365.yaml \
--name voc_obj365_v8x \
--sync-bn
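
As a quick illustration of how the batch size is divided, the sketch below computes the per-GPU batch for the command above. It assumes the total is split evenly across processes; the function name per_gpu_batch is hypothetical and not taken from the repository.

```python
def per_gpu_batch(total_batch: int, n_gpus: int) -> int:
    """Split a total batch size evenly across GPUs (assumed behaviour)."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    if total_batch % n_gpus != 0:
        raise ValueError("total batch size must be a multiple of n_gpus")
    return total_batch // n_gpus


if __name__ == "__main__":
    # --batch 128 with --nproc_per_node 4, as in the command above
    print(per_gpu_batch(128, 4))
```

Under this assumption, each of the 4 GPUs processes 32 images per step, the same per-device batch as the single-GPU command.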

By default, logging is done with TensorBoard. To use MLflow, set --mlflow-url, e.g. --mlflow-url localhost.

CerberusDet model config details

Example of the model's config for 2 tasks: yolov8x_voc_obj365.yaml

  • The model config is based on YOLO configs, except that the head is divided into two sections (neck and head).
  • The layers of the neck section can be shared between tasks or be unique to a task.
  • The head section defines the head architecture for all tasks, but each task always has its own unique head parameters.
  • The from parameter of the first neck layer must be a positive index that specifies which layer, counted from the start of the entire architecture, supplies the input features.
  • The cerber section is optional and defines which neck layers are shared among tasks. If it is not specified, all neck layers are shared among tasks and only the heads are unique.
  • The CerberusDet configuration is constructed as follows:
    cerber: List[OneBranchConfig], where
      OneBranchConfig = List[cerber_layer_number, SharedTasksConfig], where
          cerber_layer_number - the layer number (counting from the end of the backbone) after which branching should occur
          SharedTasksConfig = List[OneBranchGroupedTasks], where
                OneBranchGroupedTasks = [number_of_task1_head, number_of_task2_head, ...] - the task head numbers (essentially task IDs) that should be in the same branch and share layers thereafter

    The head numbers will correspond to tasks according to the sequence in which they are listed in the data configuration.

    Example for YOLO v8x:
    [[2, [[15], [13, 14]]], [6, [[13], [14]]]] - configuration for 3 tasks. The task with id=15 has its own task-specific layers, starting from the 3rd. The tasks with id=13 and id=14 share layers 3-6; after the 6th layer, each has its own separate branch with all remaining layers.
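
To make the nested structure easier to read, the sketch below walks a cerber list and prints the task groups formed at each branching point. It is an illustration only, not the repository's parser. The function name describe_cerber is hypothetical, and the check that a task id appears in only one group per branching point is an assumption about what a valid configuration looks like.

```python
from typing import List


def describe_cerber(cerber: list) -> List[str]:
    """Summarize each [cerber_layer_number, SharedTasksConfig] entry."""
    lines = []
    for layer_number, shared_tasks in cerber:
        seen = set()
        for group in shared_tasks:
            # Assumed rule: a task head belongs to one branch per split.
            overlap = seen.intersection(group)
            if overlap:
                raise ValueError(
                    f"task ids {sorted(overlap)} repeated after layer {layer_number}"
                )
            seen.update(group)
        branches = " | ".join(str(list(group)) for group in shared_tasks)
        lines.append(f"branch after layer {layer_number}: {branches}")
    return lines


if __name__ == "__main__":
    example = [[2, [[15], [13, 14]]], [6, [[13], [14]]]]
    for line in describe_cerber(example):
        print(line)
```

For the YOLO v8x example, this prints one line per branching point: tasks [15] and [13, 14] split after layer 2, then tasks [13] and [14] split after layer 6.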

Evaluation

Inference

  • Download the CerberusDet checkpoint trained on VOC and part of the Objects365 dataset (see below)
  • Run script bash_scripts/detect.sh

Pretrained Checkpoints

| Model | Train set | Size (pixels) | mAPval 50-95 | mAPval 50 | Speed V100 b32, fp16 (ms) | Params (M) | FLOPs @640 (B) |
|---|---|---|---|---|---|---|---|
| YOLOv8x | VOC | 640 | 0.758 | 0.916 | 5.6 | 68 | 257.5 |
| YOLOv8x | Objects365_animals | 640 | 0.43 | 0.548 | 5.6 | 68 | 257.5 |
| CerberusDet_v8x | VOC, Objects365_animals | 640 | 0.751, 0.432 | 0.918, 0.556 | 7.2 | 105 | 381.3 |
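
The figures in the table allow a rough comparison between one CerberusDet_v8x model and two separate YOLOv8x models. The sketch below uses only the numbers from the table. It assumes that the costs of two separate models simply add up, which is an assumption and not a measurement from the source. The names YOLOV8X, CERBERUS and savings are hypothetical.

```python
# Figures copied from the table above.
YOLOV8X = {"speed_ms": 5.6, "params_m": 68.0, "flops_b": 257.5}
CERBERUS = {"speed_ms": 7.2, "params_m": 105.0, "flops_b": 381.3}


def savings(single: dict, unified: dict, n_models: int = 2) -> dict:
    """Fractional reduction of the unified model vs. n separate models.

    Assumes the cost of n separate models is n times the single-model cost.
    """
    return {k: 1.0 - unified[k] / (n_models * single[k]) for k in single}


if __name__ == "__main__":
    for name, frac in savings(YOLOV8X, CERBERUS).items():
        print(f"{name}: {frac:.1%} lower than two separate models")
```

Under that additive assumption, the unified model needs roughly 36% less inference time, 23% fewer parameters and 26% fewer FLOPs than the two single-task models combined.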

YOLOv8x models were trained with the commit: https://github.com/ultralytics/ultralytics/tree/2bc36d97ce7f0bdc0018a783ba56d3de7f0c0518

Hyperparameter Evolution

See the launch example in bash_scripts/evolve.sh.

Notes
  • To evolve hyperparameters specific to each task, specify the initial parameters separately per task and append --evolve_per_task
  • To evolve a specific set of hyperparameters, pass their names as a comma-separated list via the --params_to_evolve argument, e.g. --params_to_evolve 'box,cls,dfl'
  • Use absolute paths to configs.
  • Specify the search algorithm via --evolver. You can use the search algorithms of the ray library (see available values here: predefined_evolvers.py), or 'yolov5'
License

CerberusDet is released under the GNU AGPL v.3 license.

See the file LICENSE for more details.

Citing

If you use our models, code or dataset, we kindly request that you cite our paper and give the repository a ⭐

@article{cerberusdet,
   Author = {Irina Tolstykh and Michael Chernyshov and Maksim Kuprashevich},
   Title = {CerberusDet: Unified Multi-Task Object Detection},
   Year = {2024},
   Eprint = {arXiv:2407.12632},
}
