open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Home Page: https://mmaction2.readthedocs.io

[Bug] KeyError: 'ActionVisualizer is not in the mmengine::visualizer registry.'

berengueradrian opened this issue

Branch

main branch (1.x version, such as v1.0.0, or dev-1.x branch)

Prerequisite

Environment

I am trying to fine-tune a model on my own dataset, in AVA format, for spatio-temporal action detection. I am working in a Docker environment built from this Dockerfile:

ARG PYTORCH="1.13.1"
ARG CUDA="11.6"
ARG CUDNN="8"

FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel

ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX"
ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"

# fetch the key refer to https://forums.developer.nvidia.com/t/18-04-cuda-docker-image-is-broken/212892/9
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub 32
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
RUN apt-get update && apt-get install -y git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 ffmpeg \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install MMCV
RUN pip install wheel
RUN pip install openmim
RUN mim install mmengine mmcv==2.0.0rc4

Then I installed the following inside the container:

pip install cython --no-cache-dir
pip install --no-cache-dir -e .
pip install pytorchvideo

Describe the bug

This is the complete log of the bug:

05/07 08:32:33 - mmengine - WARNING - Failed to import `None.registry` make sure the registry.py exists in `None` package.
05/07 08:32:33 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "visualizer" registry tree. As a workaround, the current "visualizer" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
Traceback (most recent call last):
  File "/workspace/mmaction2/tools/train.py", line 143, in <module>
    main()
  File "/workspace/mmaction2/tools/train.py", line 132, in main
    runner = Runner.from_cfg(cfg)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/runner/runner.py", line 462, in from_cfg
    runner = cls(
  File "/opt/conda/lib/python3.10/site-packages/mmengine/runner/runner.py", line 416, in __init__
    self.visualizer = self.build_visualizer(visualizer)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/runner/runner.py", line 803, in build_visualizer
    return VISUALIZERS.build(visualizer)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 100, in build_from_cfg
    raise KeyError(
KeyError: 'ActionVisualizer is not in the mmengine::visualizer registry. Please check whether the value of `ActionVisualizer` is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'
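The warning and the KeyError in the log above appear to be related: the lookup is attempted under the "mmdet" scope, that scope is not found in the visualizer registry tree, and the build falls back to the root "mmengine" registry, which does not contain `ActionVisualizer`. The sketch below is a deliberately simplified toy model of that fallback. `ToyRegistry` is a hypothetical class written for illustration only and is not mmengine's implementation.

```python
class ToyRegistry:
    """Simplified stand-in for a scoped registry (NOT mmengine's real class)."""

    def __init__(self, name, scope, parent=None):
        self.name = name
        self.scope = scope
        self.parent = parent
        self.children = {}
        self.modules = {}
        if parent is not None:
            parent.children[scope] = self

    def register(self, cls):
        """Register a class under its own name and return it unchanged."""
        self.modules[cls.__name__] = cls
        return cls

    def root(self):
        """Walk up to the top-level registry."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def build(self, cfg, default_scope=None):
        root = self.root()
        # Use the child registry for the requested scope if it exists;
        # otherwise fall back to the root registry, as the warning describes.
        registry = root.children.get(default_scope, root)
        key = cfg['type']
        if key not in registry.modules:
            raise KeyError(
                f'{key} is not in the {registry.scope}::{registry.name} registry.')
        return registry.modules[key]()


# Root registry plus one child scope that actually owns the visualizer.
root = ToyRegistry('visualizer', scope='mmengine')
mmaction = ToyRegistry('visualizer', scope='mmaction', parent=root)


@mmaction.register
class ActionVisualizer:
    pass


cfg = dict(type='ActionVisualizer')

# Lookup under the scope that registered the class succeeds.
built = root.build(cfg, default_scope='mmaction')

# Lookup under a scope that is not in the tree falls back to root and fails.
error_message = None
try:
    root.build(cfg, default_scope='mmdet')
except KeyError as err:
    error_message = err.args[0]
```

In this toy model the failing call produces the same first sentence as the KeyError in the traceback, which suggests the scope used for the lookup is worth checking first.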

Reproduces the problem - code sample

This is the code I have in my configs/detection/slowfast folder:

_base_ = [
    '../../_base_/models/slowfast_r50.py',
    '../../_base_/default_runtime.py'
]
default_scope = "mmdet"

url = ('https://download.openmmlab.com/mmaction/recognition/slowfast/'
       'slowfast_r50_8x8x1_256e_kinetics400_rgb/'
       'slowfast_r50_8x8x1_256e_kinetics400_rgb_20200716-73547d2b.pth')

model = dict(
    type='FastRCNN',
    _scope_='mmdet',
    init_cfg=dict(type='Pretrained', checkpoint=url),
    backbone=dict(
        type='mmaction.ResNet3dSlowFast',
        resample_rate=4,
        speed_ratio=4,
        channel_ratio=8,
        pretrained=None,
        slow_pathway=dict(
            type='resnet3d',
            depth=50,
            pretrained=None,
            lateral=True,
            conv1_kernel=(1, 7, 7),
            dilations=(1, 1, 1, 1),
            conv1_stride_t=1,
            pool1_stride_t=1,
            inflate=(0, 0, 1, 1),
            spatial_strides=(1, 2, 2, 1),
            fusion_kernel=7),
        fast_pathway=dict(
            type='resnet3d',
            depth=50,
            pretrained=None,
            lateral=False,
            base_channels=8,
            conv1_kernel=(5, 7, 7),
            conv1_stride_t=1,
            pool1_stride_t=1,
            spatial_strides=(1, 2, 2, 1))),
    roi_head=dict(
        type='AVARoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor3D',
            roi_layer_type='RoIAlign',
            output_size=8,
            with_temporal_pool=True),
        bbox_head=dict(
            type='BBoxHeadAVA',
            background_class=True,
            in_channels=2304,
            num_classes=8,
            multilabel=False,
            dropout_ratio=0.5)),
    data_preprocessor=dict(
        type='mmaction.ActionDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        format_shape='NCTHW'),
    train_cfg=dict(
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssignerAVA',
                pos_iou_thr=0.9,
                neg_iou_thr=0.9,
                min_pos_iou=0.9),
            sampler=dict(
                type='RandomSampler',
                num=16,
                pos_fraction=1,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=1.0)),
    test_cfg=dict(rcnn=None))


data_root = './data/data_ava-kinetics/train'
data_root_val = './data/data_ava-kinetics/val'
ann_file_train = './data/data_ava-kinetics/train.txt'
ann_file_val = './data/data_ava-kinetics/val.txt'
ann_file_test = './data/data_ava-kinetics/test.txt'

dataset_type = 'AVADataset'
data_root = './data/data_ava-kinetics/train'
anno_root = './data/data_ava-kinetics'

ann_file_train = f'{anno_root}/train.txt'
ann_file_val = f'{anno_root}/val.txt'

exclude_file_train = None
exclude_file_val = None

label_file = f'{anno_root}/label_file.pbtxt'

proposal_file_train = None
proposal_file_val = None

file_client_args = dict(io_backend='disk')
train_pipeline = [
    dict(type='SampleAVAFrames', clip_len=32, frame_interval=2),
    dict(type='RawFrameDecode', **file_client_args),
    dict(type='RandomRescale', scale_range=(256, 320)),
    dict(type='RandomCrop', size=256),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='FormatShape', input_format='NCTHW', collapse=True),
    dict(type='PackActionInputs')
]

# Testing uses no cropping or flipping
val_pipeline = [
    dict(
        type='SampleAVAFrames', clip_len=32, frame_interval=2, test_mode=True),
    dict(type='RawFrameDecode', **file_client_args),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='FormatShape', input_format='NCTHW', collapse=True),
    dict(type='PackActionInputs')
]

train_dataloader = dict(
    batch_size=8,
    num_workers=8,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        exclude_file=exclude_file_train,
        pipeline=train_pipeline,
        label_file=label_file,
        proposal_file=proposal_file_train,
        data_prefix=dict(img=data_root)))

val_dataloader = dict(
    batch_size=1,
    num_workers=8,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        exclude_file=exclude_file_val,
        pipeline=val_pipeline,
        label_file=label_file,
        proposal_file=proposal_file_val,
        data_prefix=dict(img=data_root),
        test_mode=True))
test_dataloader = val_dataloader

val_evaluator = dict(
    type='AVAMetric',
    ann_file=ann_file_val,
    label_file=label_file,
    exclude_file=exclude_file_val)
test_evaluator = val_evaluator

train_cfg = dict(
    type='EpochBasedTrainLoop', max_epochs=20, val_begin=1, val_interval=1)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')

param_scheduler = [
    dict(type='LinearLR', start_factor=0.1, by_epoch=True, begin=0, end=5),
    dict(
        type='MultiStepLR',
        begin=0,
        end=20,
        by_epoch=True,
        milestones=[10, 15],
        gamma=0.1)
]

optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.00001),
    clip_grad=dict(max_norm=40, norm_type=2))

# Default setting for scaling LR automatically
#   - `enable` means enable scaling LR automatically
#       or not by default.
#   - `base_batch_size` = (8 GPUs) x (8 samples per GPU).
auto_scale_lr = dict(enable=False, base_batch_size=64)
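One detail in the config above that may be connected to the error is `default_scope = "mmdet"`: the warning in the log says the "mmdet" scope could not be found in the visualizer registry tree. A possible adjustment is sketched below. It is an untested assumption rather than a confirmed fix. It relies on the `model` dict already carrying `_scope_='mmdet'`, so detector components would still resolve under mmdet while the visualizer, datasets, and pipelines resolve under mmaction.

```python
# Possible override (an assumption, not a confirmed fix): resolve unprefixed
# types such as 'ActionVisualizer' and 'AVADataset' under the mmaction scope.
default_scope = 'mmaction'

# The model block from the config keeps its own scope for detector parts;
# only the two relevant keys are repeated here.
model = dict(
    type='FastRCNN',
    _scope_='mmdet')
```

If mmdet itself is not installed in the container, the `FastRCNN` lookup could still fail for a separate reason; the Dockerfile in the Environment section only installs mmengine and mmcv.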

Reproduces the problem - command or script

I run the training like this:
python tools/train.py configs/detection/slowfast/slowfast_kinetics400-pretrained-r50_8xb8-8x8x1-20e_chantwin.py

Reproduces the problem - error message

No response

Additional information

  1. I am trying to obtain action detection results on my own dataset, which has different action classes, is not multilabel, and is not annotated by key frames.
  2. My dataset consists of about 170 videos, each between 20 s and 1 min 30 s long. Each video contains one or more bounding boxes, all of the same class, and each bounding box has exactly one activity label per frame.
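For context on the dataset description above, AVA-style annotation files are CSVs with one row per box and label: video id, timestamp in seconds, normalized `x1, y1, x2, y2`, action id, and entity id. The sketch below shows one way per-frame boxes could be turned into such rows. The function name `to_ava_rows`, the input dictionary keys, and the sample values are all hypothetical and are not taken from the dataset in this issue.

```python
import csv
import io


def to_ava_rows(video_id, boxes, width, height):
    """Convert pixel-space boxes into AVA-style CSV rows.

    Each entry in `boxes` is a dict with hypothetical keys:
    'second' (timestamp), 'xyxy' (pixel corners), 'action_id', 'entity_id'.
    """
    rows = []
    for box in boxes:
        x1, y1, x2, y2 = box['xyxy']
        rows.append([
            video_id,
            box['second'],
            # AVA stores box corners normalized to the [0, 1] range.
            f'{x1 / width:.3f}',
            f'{y1 / height:.3f}',
            f'{x2 / width:.3f}',
            f'{y2 / height:.3f}',
            box['action_id'],
            box['entity_id'],
        ])
    return rows


# Made-up sample: one box with a single label, matching the single-label case.
sample_boxes = [
    {'second': 3, 'xyxy': (64, 36, 320, 180), 'action_id': 2, 'entity_id': 0},
]
rows = to_ava_rows('clip_001', sample_boxes, width=640, height=360)

# Serialize to CSV text in memory instead of touching the filesystem.
buffer = io.StringIO()
csv.writer(buffer, lineterminator='\n').writerows(rows)
csv_text = buffer.getvalue()
```

Since the dataset is single-label, each box would contribute exactly one row per annotated timestamp.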