test.py problems

Question

kingjames1155 opened this issue 2 years ago · comments

Hello All,

I encountered the following problem when running test.py. Has anyone else run into this?
Environment:
PyTorch 1.6.0
Python 3.8 (Ubuntu 18.04)
CUDA 10.1 (RTX 2080 Ti)

Output:
Initialize optimizer
Load pre-processed data
LIDC training dataset
LIDC validation dataset
LIDC testing dataset
LUNGx training dataset
LUNGx testing dataset
Initialize evaluator

LIDC_123vs45

training
Evaluation: 0%| | 0/648 [00:00<?, ?sample/s]

validation
Evaluation: 0%| | 0/163 [00:00<?, ?sample/s]

testing
Evaluation: 0%| | 0/72 [00:00<?, ?sample/s]

LUNGx

training
Evaluation: 0%| | 0/10 [00:00<?, ?sample/s]

validation

testing
Evaluation: 0%| | 0/73 [00:00<?, ?sample/s]
wandb: Waiting for W&B process to finish... (success).
wandb:
wandb: Synced Experiment_1/trial_1/epoch_180: https://wandb.ai/kingjames12138/CIR_test/runs/31ko97oh
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./experiments/MICCAI2022/Experiment_001/trial_1/wandb/run-20220922_145012-31ko97oh/logs

Wookjin Choi · Answer 1 · Thu Sep 22 2022 22:35:11 GMT+0800 (China Standard Time)

We only tested it with pytorch 1.11 and cuda 11.3.
I think DataLoader cannot load the preprocessed data. It requires pytorch3d, but pytorch3d does not support pytorch 1.6.
I suggest you test the model using the same versions of the required packages https://github.com/nadeemlab/CIR#installation or the Docker container https://github.com/nadeemlab/CIR#docker.
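Before comparing against the tested versions, it can help to print what is actually installed in the active environment. The following is a minimal sketch, not part of the CIR repository: `report_versions` is a hypothetical helper name, and it only assumes that `torch` and `pytorch3d` are the package names to check. Import failures are recorded rather than raised, so a missing shared library shows up in the report.

```python
import importlib
import platform


def report_versions(packages=("torch", "pytorch3d")):
    """Return a dict of installed versions; import failures are recorded as text."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:  # e.g. ImportError caused by a missing shared library
            report[name] = f"import failed: {exc}"
    # torch.version.cuda is the CUDA version the installed PyTorch build was compiled with
    try:
        import torch
        report["torch_cuda"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    except Exception:
        report["torch_cuda"] = None
        report["cuda_available"] = False
    return report


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```

If the `pytorch3d` entry reads "import failed", the message after it usually names the library or symbol that could not be loaded.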

kingjames1155 · Answer 2 · Fri Sep 23 2022 10:36:38 GMT+0800 (China Standard Time)

Thank you for your comments. I tested it with PyTorch 1.11 and CUDA 11.3 at first, but I ran into the error below, which reported that libcudart.so.10.1 was missing. That is why I switched to CUDA 10.1.

[3/3] c++ rasterize_cuda.o rasterize_cuda_kernel.cuda.o -shared -L/root/miniconda3/lib/python3.8/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python -L/usr/local/cuda/lib64 -lcudart -o rasterize_cuda.so
Loading extension module rasterize_cuda...
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    from model.voxel2mesh_nodule import Voxel2Mesh as network
  File "/root/cir/model/voxel2mesh_nodule.py", line 6, in <module>
    from pytorch3d.ops import sample_points_from_meshes, SubdivideMeshes
  File "/root/miniconda3/lib/python3.8/site-packages/pytorch3d/ops/__init__.py", line 5, in <module>
    from .graph_conv import GraphConv
  File "/root/miniconda3/lib/python3.8/site-packages/pytorch3d/ops/graph_conv.py", line 8, in <module>
    from pytorch3d import _C
ImportError: libcudart.so.10.1: cannot open shared object file: No such file or directory
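An error like this names the CUDA runtime that the compiled extension was linked against, which can be compared with the CUDA version of the installed PyTorch build (`torch.version.cuda`). Here is a rough sketch of that comparison. The function names are hypothetical, and only the major version is compared, on the assumption that the library suffix does not always match the toolkit's minor version.

```python
import re


def missing_cudart_version(message):
    """Extract the version from a 'libcudart.so.X.Y' loader error, or return None."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", message)
    return match.group(1) if match else None


def diagnose(message, torch_cuda):
    """Compare the runtime named in the error with PyTorch's CUDA version string."""
    needed = missing_cudart_version(message)
    if needed is None:
        return "no libcudart version found in the message"
    if torch_cuda is None:
        return f"extension expects CUDA runtime {needed}, but PyTorch is a CPU-only build"
    # Compare major versions only (assumption: minor suffixes may legitimately differ)
    if needed.split(".")[0] != torch_cuda.split(".")[0]:
        return f"mismatch: extension expects CUDA {needed}, PyTorch was built with CUDA {torch_cuda}"
    return f"major versions agree: extension {needed}, PyTorch {torch_cuda}"


if __name__ == "__main__":
    error = ("ImportError: libcudart.so.10.1: cannot open shared object file: "
             "No such file or directory")
    # "11.3" stands in for the value of torch.version.cuda in the 11.3 environment
    print(diagnose(error, "11.3"))
```

A mismatch here would suggest that the installed pytorch3d binary was built for a different CUDA version than the PyTorch it is running with.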

kingjames1155 · Answer 3 · Fri Sep 23 2022 10:52:52 GMT+0800 (China Standard Time)

I found that the problem was caused by the pytorch3d version I had installed. Thanks again for your comments and help.

Navdeep Dahiya · Answer 4 · Sat Sep 24 2022 06:48:01 GMT+0800 (China Standard Time)

I think there may be a mismatch between your system CUDA version and the CUDA version in your virtual environment. If you install CUDA 11.3 in the virtual environment, then you will have to (I think) install the same versions of CUDA and cuDNN in the Ubuntu OS as well.
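One way to see which CUDA runtime libraries are installed at the system level is to search the usual library directories for `libcudart.so*`. The sketch below assumes the default locations `/usr/local/cuda*/lib64` and `/usr/lib/x86_64-linux-gnu`. Both are guesses about a typical Ubuntu setup and may need adjusting, and `find_cudart_libs` is a hypothetical helper name.

```python
import glob
import os


def find_cudart_libs(patterns=("/usr/local/cuda*/lib64", "/usr/lib/x86_64-linux-gnu")):
    """Return the resolved paths of libcudart shared libraries under the given directories."""
    found = []
    for pattern in patterns:
        # Each pattern may match several directories, e.g. cuda-10.1 and cuda-11.3
        for directory in glob.glob(pattern):
            found.extend(glob.glob(os.path.join(directory, "libcudart.so*")))
    # Resolve symlinks so that libcudart.so and libcudart.so.X.Y collapse to one entry
    return sorted({os.path.realpath(path) for path in found})


if __name__ == "__main__":
    libs = find_cudart_libs()
    if not libs:
        print("no libcudart found in the searched directories")
    for lib in libs:
        print(lib)
```

The versions in the resulting file names can then be compared with `torch.version.cuda` inside the virtual environment.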