Akegarasu / lora-scripts

LoRA & Dreambooth training scripts and GUI, using kohya-ss's trainer, for diffusion models.

Ubuntu + CUDA 11 + torch 2.2.1: training failed

LTtt456c opened this issue

TensorBoard 2.10.1 at http://127.0.0.1:6006/ (Press CTRL+C to quit)
11:40:08-832595 INFO Torch 2.2.0+cu118
11:40:09-061717 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
11:40:09-136322 INFO Torch detected GPU: NVIDIA A100-SXM4-80GB VRAM 40950 Arch (8, 0) Cores 108
11:40:55-039352 INFO Training started with config file / 训练开始,使用配置文件:
/root/lora/lora-scripts/config/autosave/20240303-114055.toml
11:40:55-040973 INFO Task 249b593f-83c8-4281-9ac4-fecced72f486 created
Traceback (most recent call last):
  File "/root/lora/lora-scripts/./sd-scripts/train_network.py", line 30, in <module>
    import library.train_util as train_util
  File "/root/lora/lora-scripts/sd-scripts/library/train_util.py", line 37, in <module>
    from torchvision import transforms
  File "/root/.conda/envs/sd/lib/python3.10/site-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/root/.conda/envs/sd/lib/python3.10/site-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/root/.conda/envs/sd/lib/python3.10/site-packages/torch/library.py", line 440, in inner
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/root/.conda/envs/sd/lib/python3.10/site-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist
11:41:07-339489 ERROR Training failed / 训练失败
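
For anyone hitting the same traceback: `operator torchvision::nms does not exist` is often associated with a torchvision build that does not match the installed torch build, so it may be worth confirming which versions are actually installed in the environment. The following is only a minimal sketch that reads package metadata without importing either library; the helper name `installed_versions` is hypothetical and is not part of lora-scripts or sd-scripts.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "torchvision")):
    """Return {package: version string or None} without importing the packages.

    Hypothetical helper for illustration; not part of lora-scripts.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # package is not installed in this environment
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

The reported versions can then be compared against the compatibility table published by the PyTorch project. This is a diagnostic step only and may not be the cause in every case.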

Solved. It was a model path problem; changing the path to ./sd-models/xxx.safetensors fixed it.
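
Since the fix was a corrected model path, a quick check that the path resolves to an existing file before starting training can save a failed run. This is a minimal sketch under stated assumptions: the helper name `check_model_path` is hypothetical, and it assumes relative paths are resolved against the lora-scripts directory (the traceback suggests `/root/lora/lora-scripts`, but that is not confirmed in the thread).

```python
from pathlib import Path


def check_model_path(path_str, base_dir="."):
    """Resolve path_str against base_dir and report whether it is an existing file.

    Hypothetical helper for illustration; not part of lora-scripts.
    Returns (resolved_path, exists_as_file).
    """
    path = Path(path_str)
    # Assumption: relative paths are interpreted relative to base_dir.
    candidate = path if path.is_absolute() else Path(base_dir) / path
    return candidate.resolve(), candidate.is_file()


if __name__ == "__main__":
    # "xxx" is the placeholder used in the thread; substitute the real file name.
    resolved, ok = check_model_path("./sd-models/xxx.safetensors",
                                    base_dir="/root/lora/lora-scripts")
    print(f"{resolved}: {'found' if ok else 'missing'}")
```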