Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)

Batch size less than 12 gives an error

aldinorizaldy opened this issue

Hi @Gofinge,

Sorry for bothering you with this stupid question. If I use the S3DIS data with batch_size = 12 in the config file (the default value), it works perfectly. But if I reduce the batch size, it gives me an error.

I also have to set batch_size = 1 when I use another dataset (Vaihingen 3D), which has only one point cloud in the train split. Otherwise I get an error similar to the one in #163 (comment), because the batch size is larger than the number of training samples.

I've searched for this error, but it seems no one else has reported it.

This is the error:

 =========> RUN TASK <=========
/opt/conda/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/__init__.py:36: UserWarning: The environment variable `OMP_NUM_THREADS` not set. MinkowskiEngine will automatically set `OMP_NUM_THREADS=16`. If you want to set `OMP_NUM_THREADS` manually, please export it on the command line before running a python script. e.g. `export OMP_NUM_THREADS=12; python your_program.py`. It is recommended to set it below 24.
  warnings.warn(
Traceback (most recent call last):
  File "exp/vaihingen3d/v3d_semseg-spunet-v1m1-0-base/code/tools/train.py", line 38, in <module>
    main()
  File "exp/vaihingen3d/v3d_semseg-spunet-v1m1-0-base/code/tools/train.py", line 27, in main
    launch(
  File "/home/rizald42/containers/Pointcept/exp/vaihingen3d/v3d_semseg-spunet-v1m1-0-base/code/pointcept/engines/launch.py", line 74, in launch
    mp.spawn(
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 2 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/rizald42/containers/Pointcept/exp/vaihingen3d/v3d_semseg-spunet-v1m1-0-base/code/pointcept/engines/launch.py", line 137, in _distributed_worker
    main_func(*cfg)
  File "/home/rizald42/containers/Pointcept/exp/vaihingen3d/v3d_semseg-spunet-v1m1-0-base/code/tools/train.py", line 18, in main_worker
    cfg = default_setup(cfg)
  File "/home/rizald42/containers/Pointcept/exp/vaihingen3d/v3d_semseg-spunet-v1m1-0-base/code/pointcept/engines/defaults.py", line 136, in default_setup
    assert cfg.batch_size % world_size == 0
AssertionError

Thanks!!

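As the AssertionError indicates, batch_size % world_size must be 0 (e.g. 12 % 4 == 0). Here world_size is the number of distributed training processes (normally one per GPU), so batch_size has to be a multiple of it.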
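To make the constraint concrete, here is a minimal standalone sketch of the same divisibility check. The helper names `per_process_batch_size` and `valid_batch_sizes` are hypothetical and are not part of Pointcept. The world size of 4 is only the example value from the reply above, not necessarily the reporter's actual setup.

```python
from typing import List


def per_process_batch_size(batch_size: int, world_size: int) -> int:
    """Split a global batch evenly across processes, or fail with a clear message.

    Hypothetical helper: it mirrors the `batch_size % world_size == 0`
    assertion shown in the traceback, but is not Pointcept code.
    """
    if world_size <= 0:
        raise ValueError("world_size must be a positive integer")
    if batch_size % world_size != 0:
        raise ValueError(
            f"batch_size={batch_size} is not divisible by world_size={world_size}"
        )
    return batch_size // world_size


def valid_batch_sizes(world_size: int, upper: int) -> List[int]:
    """List every global batch size up to `upper` that passes the check."""
    return list(range(world_size, upper + 1, world_size))


if __name__ == "__main__":
    # Assumed example: 4 processes, as in "12 % 4 == 0".
    print(per_process_batch_size(12, 4))
    print(valid_batch_sizes(4, 12))
```

With a world size greater than 1, batch_size = 1 can never pass this check, so a dataset with a single training sample would need to be trained with a single process.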

Thanks!! I had not realized what world_size meant.