SHI-Labs / Versatile-Diffusion

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023

Home Page: https://arxiv.org/abs/2211.08332


integer division or modulo by zero

forensicmike opened this issue

When I run the command:
(versatile-diffusion) C:\Users\mike\Desktop\Versatile-Diffusion>python inference.py --gpu 0 --app image-variation --image ..\invokeai\inputs\00003.png --seed 8 --save log\test.png --coloradj simple

I get:

(versatile-diffusion) C:\Users\mike\Desktop\Versatile-Diffusion>python inference.py --gpu 0 --app image-variation --image ..\invokeai\inputs\00003.png --seed 8 --save log\test.png --coloradj simple
Traceback (most recent call last):
  File "inference.py", line 565, in <module>
    vd_wrapper = vd_inference(pth=pth, fp16=args.fp16, device=device)
  File "inference.py", line 35, in __init__
    net = get_model()(cfgm)
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\common\get_model.py", line 87, in __call__
    net = self.model[t](**args)
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\vd.py", line 220, in __init__
    super().__init__(*args, **kwargs)
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\sd.py", line 55, in __init__
    highlight_print("Running in {} mode".format(self.parameterization))
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\sd.py", line 21, in highlight_print
    print_log('')
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\log_service.py", line 16, in print_log
    local_rank = sync.get_rank('local')
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\sync.py", line 35, in get_rank
    return global_rank % local_world_size
ZeroDivisionError: integer division or modulo by zero

After reviewing line 35 in sync.py, it appears that it is dividing by torch.cuda.device_count(). I did a little searching and it seems normal/expected for this to return 0 in some setups.

If I add a check:

if local_world_size == 0: return 0

I am able to get past that step.
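The workaround above can be sketched as follows. This is a hypothetical, self-contained illustration of the guard, not the repo's actual get_rank implementation; in the real lib/sync.py, local_world_size comes from torch.cuda.device_count(), which returns 0 when no CUDA device is visible.

```python
def get_rank(global_rank, local_world_size):
    """Sketch of the patched rank lookup (names are assumptions)."""
    # torch.cuda.device_count() yields 0 on a CPU-only / non-CUDA
    # PyTorch build, which makes the modulo below raise
    # ZeroDivisionError: integer division or modulo by zero.
    if local_world_size == 0:
        return 0  # fall back to local rank 0 when no GPU is visible
    return global_rank % local_world_size
```

With the guard in place, a CPU-only process simply reports local rank 0 instead of crashing during logging.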

I think this was due to having a non-CUDA-enabled torch build installed. Once I fixed that, I was able to remove the workaround and get it to work. Still, the error message associated with this failure wasn't ideal, so it might be worth adding some checks?
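A friendlier check along the lines suggested above might look like this. This is a hypothetical sketch, not code from the repository; the function name and message are assumptions.

```python
def require_cuda_devices(device_count):
    """Fail early with an actionable message instead of a
    ZeroDivisionError deep inside the logging code (sketch)."""
    if device_count == 0:
        raise RuntimeError(
            "No CUDA devices found. Make sure a CUDA-enabled PyTorch "
            "build is installed and at least one GPU is visible."
        )
    return device_count
```

Calling this once at startup, before any rank arithmetic, would surface the real problem (a CPU-only torch install) immediately.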

Correct, the reason is that CUDA is unavailable, so no GPU can be found.