SamsungLabs / imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

How to train with sunrgbd data set

ToTensor opened this issue

What steps should be taken to train on the indoor SUN RGB-D dataset?

Have you tried simply following our README for installation, data preprocessing, and the training command?

When I process the SUN RGB-D dataset by running python tools/create_data.py sunrgbd --root-path ./data/sunrgbd --out-dir ./data/sunrgbd --extra-tag sunrgbd, I get an error:
Traceback (most recent call last):
  File "tools/create_data.py", line 4, in <module>
    from data_converter import indoor_converter as indoor
  File "/home/ly/imvoxelnet-master/tools/data_converter/indoor_converter.py", line 5, in <module>
    from tools.data_converter.scannet_data_utils import ScanNetData
ModuleNotFoundError: No module named 'tools.data_converter'
I tried many ways but couldn't solve it. Could you please help me?

But data_converter is actually present in tools. Maybe something is wrong with your installation. Can you please try PYTHONPATH=./ python tools/create_data.py ...?

Excuse me, where should I add PYTHONPATH=./ python tools/create_data.py ...? I'm falling apart.

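You said you get an error when running python tools/create_data.py ..., so try running PYTHONPATH=./ python tools/create_data.py ... instead. The PYTHONPATH=./ prefix goes on the same command line, right before python, and the command is run from the repository root.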
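For readers wondering why the prefix helps: when a script is started as python tools/create_data.py, Python puts the script's directory (tools/) on sys.path, not the current directory, so an absolute import such as tools.data_converter cannot be resolved from the repository. The sketch below reproduces this with a toy directory layout. The file contents and the helper names build_toy_repo and run_create_data are made up for illustration and are not the repository's real files.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_toy_repo(root: Path) -> Path:
    """Create a tiny stand-in for the repo layout (hypothetical contents)."""
    pkg = root / "tools" / "data_converter"
    pkg.mkdir(parents=True)
    (root / "tools" / "__init__.py").write_text("")
    (pkg / "__init__.py").write_text("")
    (pkg / "scannet_data_utils.py").write_text("class ScanNetData:\n    pass\n")
    # Absolute import: resolves only if the repo root is on sys.path.
    (pkg / "indoor_converter.py").write_text(
        "from tools.data_converter.scannet_data_utils import ScanNetData\n")
    # Sibling import: resolves because the script directory is on sys.path.
    (root / "tools" / "create_data.py").write_text(
        "from data_converter import indoor_converter as indoor\n")
    return root


def run_create_data(repo: Path, with_pythonpath: bool) -> int:
    """Run `python tools/create_data.py` from the repo root; return the exit code."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)
    if with_pythonpath:
        # Equivalent to typing `PYTHONPATH=./ python tools/create_data.py` in a shell.
        env["PYTHONPATH"] = "./"
    proc = subprocess.run(
        [sys.executable, "tools/create_data.py"],
        cwd=repo, env=env, capture_output=True, text=True)
    return proc.returncode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = build_toy_repo(Path(tmp))
        print("without PYTHONPATH, exit code:", run_create_data(repo, False))
        print("with PYTHONPATH=./, exit code:", run_create_data(repo, True))
```

In the toy layout the first run fails with an import error and the second succeeds, which matches the behaviour described in the thread.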

Btw you can try running ImVoxelNet on SUN RGB-D in the original mmdetection3d repo here.

Thanks, I'll try.

Thank you for your reply. The problem with the dataset has been solved. Can this algorithm be deployed on the NVIDIA Jetson Xavier NX edge computing device?

`2023-03-13 16:23:19,100 - mmdet3d - INFO - Environment info:

sys.platform: linux
Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /home/ly/cuda-11.5
NVCC: Cuda compilation tools, release 11.5, V11.5.50
GCC: gcc (Ubuntu 7.5.0-6ubuntu2) 7.5.0
PyTorch: 1.7.1+cu110
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.2+cu110
OpenCV: 4.7.0
MMCV: 1.7.0
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: not available
MMDetection: 2.27.0
MMSegmentation: 0.30.0
MMDetection3D: 1.0.0rc6+
spconv2.0: False

2023-03-13 16:23:19,100 - mmdet3d - INFO - Distributed training: False
2023-03-13 16:23:19,450 - mmdet3d - INFO - Config:
model = dict(
type='ImVoxelNet',
pretrained='torchvision://resnet50',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=False),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=64,
num_outs=4),
neck_3d=dict(
type='ImVoxelNeck',
channels=[64, 128, 256, 512],
out_channels=64,
down_layers=[1, 2, 3, 4],
up_layers=[3, 2, 1],
conditional=False),
bbox_head=dict(
type='SunRgbdImVoxelHead',
n_classes=10,
n_channels=64,
n_convs=0,
n_reg_outs=7),
n_voxels=(80, 80, 32),
voxel_size=(0.08, 0.08, 0.08))
train_cfg = dict()
test_cfg = dict(
nms_pre=1000, nms_thr=0.15, use_rotate_nms=True, score_thr=0.05)
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
dataset_type = 'SunRgbdMultiViewDataset'
data_root = 'data/sunrgbd/'
class_names = ('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
'night_stand', 'bookshelf', 'bathtub')
train_pipeline = [
dict(type='LoadAnnotations3D'),
dict(
type='MultiViewPipeline',
n_images=1,
transforms=[
dict(type='LoadImageFromFile'),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Resize',
img_scale=[(512, 384), (768, 576)],
multiscale_mode='range',
keep_ratio=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32)
]),
dict(type='SunRgbdRandomFlip'),
dict(
type='DefaultFormatBundle3D',
class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
'dresser', 'night_stand', 'bookshelf', 'bathtub')),
dict(type='Collect3D', keys=['img', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type='MultiViewPipeline',
n_images=1,
transforms=[
dict(type='LoadImageFromFile'),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32)
]),
dict(
type='DefaultFormatBundle3D',
class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
'dresser', 'night_stand', 'bookshelf', 'bathtub'),
with_label=False),
dict(type='Collect3D', keys=['img'])
]
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
train=dict(
type='RepeatDataset',
times=2,
dataset=dict(
type='SunRgbdMultiViewDataset',
data_root='data/sunrgbd/',
ann_file='data/sunrgbd/sunrgbd_imvoxelnet_infos_train.pkl',
pipeline=[
dict(type='LoadAnnotations3D'),
dict(
type='MultiViewPipeline',
n_images=1,
transforms=[
dict(type='LoadImageFromFile'),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Resize',
img_scale=[(512, 384), (768, 576)],
multiscale_mode='range',
keep_ratio=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32)
]),
dict(type='SunRgbdRandomFlip'),
dict(
type='DefaultFormatBundle3D',
class_names=('bed', 'table', 'sofa', 'chair', 'toilet',
'desk', 'dresser', 'night_stand', 'bookshelf',
'bathtub')),
dict(
type='Collect3D',
keys=['img', 'gt_bboxes_3d', 'gt_labels_3d'])
],
classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
'dresser', 'night_stand', 'bookshelf', 'bathtub'),
filter_empty_gt=True,
box_type_3d='Depth')),
val=dict(
type='SunRgbdMultiViewDataset',
data_root='data/sunrgbd/',
ann_file='data/sunrgbd/sunrgbd_imvoxelnet_infos_val.pkl',
pipeline=[
dict(
type='MultiViewPipeline',
n_images=1,
transforms=[
dict(type='LoadImageFromFile'),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32)
]),
dict(
type='DefaultFormatBundle3D',
class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
'dresser', 'night_stand', 'bookshelf', 'bathtub'),
with_label=False),
dict(type='Collect3D', keys=['img'])
],
classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
'night_stand', 'bookshelf', 'bathtub'),
test_mode=True,
box_type_3d='Depth'),
test=dict(
type='SunRgbdMultiViewDataset',
data_root='data/sunrgbd/',
ann_file='data/sunrgbd/sunrgbd_imvoxelnet_infos_val.pkl',
pipeline=[
dict(
type='MultiViewPipeline',
n_images=1,
transforms=[
dict(type='LoadImageFromFile'),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32)
]),
dict(
type='DefaultFormatBundle3D',
class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
'dresser', 'night_stand', 'bookshelf', 'bathtub'),
with_label=False),
dict(type='Collect3D', keys=['img'])
],
classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
'night_stand', 'bookshelf', 'bathtub'),
test_mode=True,
box_type_3d='Depth'))
optimizer = dict(
type='AdamW',
lr=0.0001,
weight_decay=0.0001,
paramwise_cfg=dict(
custom_keys=dict(backbone=dict(lr_mult=0.1, decay_mult=1.0))))
optimizer_config = dict(grad_clip=dict(max_norm=35.0, norm_type=2))
lr_config = dict(policy='step', step=[8, 11])
total_epochs = 12
checkpoint_config = dict(interval=1, max_keep_ckpts=1)
log_config = dict(
interval=50,
hooks=[dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')])
evaluation = dict(interval=1)
dist_params = dict(backend='nccl')
find_unused_parameters = True
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = './work_dirs/imvoxelnet_sunrgbd'
gpu_ids = range(0, 1)

2023-03-13 16:23:19,450 - mmdet3d - INFO - Set random seed to 0, deterministic: False
/home/ly/Desktop/mmdetection3d/mmdet3d/models/builder.py:86: UserWarning: train_cfg and test_cfg is deprecated, please specify them in model
'please specify them in model', UserWarning)
Traceback (most recent call last):
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'voxel_size'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 166, in <module>
    main()
  File "tools/train.py", line 139, in main
    cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
  File "/home/ly/Desktop/mmdetection3d/mmdet3d/models/builder.py", line 93, in build_detector
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 237, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: ImVoxelNet: __init__() got an unexpected keyword argument 'voxel_size'`
Is there a problem with my environment configuration? I tried many ways to solve this. Can you help?

It looks like you are running a config from samsunglabs/imvoxelnet, which contains the voxel_size argument, in the openmmlab/mmdetection3d codebase, where ImVoxelNet does not accept this parameter.
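To illustrate the mechanism: the traceback shows the registry calling obj_cls(**args), so every key in the model config other than type is passed to the constructor as a keyword argument, and any key the constructor does not declare raises this TypeError. The sketch below mimics that with a toy registry. ToyImVoxelNet, REGISTRY, build_from_cfg and unexpected_keys are hypothetical stand-ins written for this example, not the mmcv or mmdetection3d code, and the toy constructor signature is an assumption.

```python
import inspect


class ToyImVoxelNet:
    """Hypothetical detector whose constructor does not declare `voxel_size`."""

    def __init__(self, n_voxels, anchor_generator=None):
        self.n_voxels = n_voxels
        self.anchor_generator = anchor_generator


# Toy registry: maps the config's `type` string to a class.
REGISTRY = {"ToyImVoxelNet": ToyImVoxelNet}


def build_from_cfg(cfg, registry):
    """Simplified registry build: pop `type`, pass everything else as kwargs."""
    args = dict(cfg)
    obj_cls = registry[args.pop("type")]
    return obj_cls(**args)


def unexpected_keys(obj_cls, cfg):
    """List config keys that the class constructor would reject."""
    params = inspect.signature(obj_cls.__init__).parameters
    # A constructor with **kwargs accepts any key, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(k for k in cfg if k != "type" and k not in params)


if __name__ == "__main__":
    cfg = dict(type="ToyImVoxelNet", n_voxels=(80, 80, 32),
               voxel_size=(0.08, 0.08, 0.08))
    print("keys the constructor does not accept:",
          unexpected_keys(ToyImVoxelNet, cfg))
    try:
        build_from_cfg(cfg, REGISTRY)
    except TypeError as err:
        print("build failed:", err)
```

A check like unexpected_keys can be run against a real class before training to spot config keys that belong to a different codebase.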

I'm not sure about running the model on NVIDIA Jetson. In principle it should be possible, since all trainable layers come directly from PyTorch, e.g. Conv2d or Conv3d. But you would still need to work out how to port the 2D-to-3D reprojection code and the NMS function.
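As a rough idea of what the reprojection step involves, here is a minimal pinhole-camera sketch in plain Python that maps 3D points in the camera frame to pixel coordinates and flags which ones land inside the image. It is not the repository's implementation. The function name project_to_image, the example intrinsic matrix, and the simplifications (no extrinsics, no skew, no lens distortion) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def project_to_image(
    points: Sequence[Tuple[float, float, float]],
    intrinsic: Sequence[Sequence[float]],
    width: int,
    height: int,
) -> List[Tuple[float, float, bool]]:
    """Project camera-frame points (z pointing forward) to (u, v, inside_image)."""
    fx, fy = intrinsic[0][0], intrinsic[1][1]
    cx, cy = intrinsic[0][2], intrinsic[1][2]
    pixels = []
    for x, y, z in points:
        if z <= 0:
            # Points at or behind the camera plane have no valid projection.
            pixels.append((0.0, 0.0, False))
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        inside = 0 <= u < width and 0 <= v < height
        pixels.append((u, v, inside))
    return pixels


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 image.
    K = [[500.0, 0.0, 320.0],
         [0.0, 500.0, 240.0],
         [0.0, 0.0, 1.0]]
    voxel_centers = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0)]
    for point, pixel in zip(voxel_centers, project_to_image(voxel_centers, K, 640, 480)):
        print(point, "->", pixel)
```

Arithmetic of this kind uses only basic tensor operations, so it is the surrounding code (indexing features by the projected coordinates, and NMS) that would need attention when porting to an edge device.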

Is there a problem with how I compiled mmdet3d? It compiled successfully, though. How can I solve this?

Basically, I think you first need to pick one of these two implementations. If you use mmdetection3d, you don't install imvoxelnet, and vice versa. So, which one are you using now: the master branch of mmdetection3d or imvoxelnet?