[Question] Issues locating dataset using Docker
NMVRodrigues opened this issue
Hi,
I'm experiencing different behaviours when using Docker versus installing from source.
When installing from source, running `nndet_prep 149` prepares my dataset as intended. When using Docker, however, it cannot find the files and fails with this error:

```
ValueError: Expected /opt/data/Task149_lesions/raw_splitted/imagesTr/case_020_0000.nii.gz to be a raw splitted data path but it does not exist.
```
I ran the `nndet_example` command to generate the example dataset, which is stored in the same folder as my Task149 dataset and follows exactly the same structure. Running `nndet_prep` on the 000 example dataset yields the intended results, unlike with my dataset.
I need to use the Docker container for this project, and I have looked through all similar issues but unfortunately could not find a solution.
Here are all the paths/commands I have set, in the hope it sheds some light on whatever I'm doing wrong:
```
sudo docker run --gpus all -v /home/Nuno/datasets/nnDetection_files/data:/opt/data -v /home/Nuno/datasets/nnDetection_files/models:/opt/models -it --shm-size=24gb nndetection:0.1 /bin/bash
root@5addd449df2d:/opt/data# ls
Task000D3_Example  Task149_lesions
root@5addd449df2d:/opt/data# cd Task149_lesions/
root@5addd449df2d:/opt/data/Task149_lesions# ls
dataset.yaml  raw_splitted
root@5addd449df2d:/opt/data/Task149_lesions# cd raw_splitted/
root@5addd449df2d:/opt/data/Task149_lesions/raw_splitted# ls
imagesTr  imagesTs  labelsTr  labelsTs
root@5addd449df2d:/opt/data/Task149_lesions/raw_splitted# cd imagesTr
root@5addd449df2d:/opt/data/Task149_lesions/raw_splitted/imagesTr# ls
case_020_0000.nii.gz  case_063_0000.nii.gz  (...)
root@5addd449df2d:/opt/data/Task149_lesions/raw_splitted# cd labelsTr
root@5addd449df2d:/opt/data/Task149_lesions/raw_splitted/labelsTr# ls
case_020.nii.gz  case_020.json  case_063.nii.gz  case_063.json  (...)
```
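One thing worth noting about listings like the ones above: `ls` prints a directory entry even when that entry is a symlink whose target is unreachable, so a listing can look complete while reads of those files still fail. A self-contained sketch of how to spot such entries with `find -xtype l` (the file name below is just illustrative):

```shell
# `ls` lists a dangling symlink like any other entry; `find -xtype l`
# flags symlinks whose targets do not resolve, and reading through one fails.
demo=$(mktemp -d)
ln -s /nonexistent/target "$demo/case_020_0000.nii.gz"
ls "$demo"                               # the entry is listed normally
find "$demo" -xtype l                    # reports the dangling symlink
cat "$demo/case_020_0000.nii.gz" 2>/dev/null || echo "unreadable"
rm -rf "$demo"
```

Running `find /opt/data -xtype l` inside the container would report any entries that show up in `ls` but cannot actually be read.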
The dataset file:
```yaml
task: Task149_lesions
name: "lesions" # [Optional]
dim: 3 # number of spatial dimensions of the data
# Note: use the integer value of the target class as defined below!
target_class: 0 # [Optional] class of interest for patient-level evaluations
test_labels: True # manually split test set
labels: # classes of the data set; need to start at 0
  "0": "Lesion"
modalities: # modalities of the data set; need to start at 0
  "0": "CT"
```
```
root@99b2bb3976c0:/opt/code/nndet# nndet_env
----- PyTorch Information -----
PyTorch Version: 1.11.0a0+b6df043
PyTorch Debug: False
PyTorch CUDA: 11.5
PyTorch Backend cudnn: 8300
PyTorch CUDA Arch List: ['sm_52', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'compute_86']
PyTorch Current Device Capability: (8, 6)
PyTorch CUDA available: True
----- System Information -----
System NVCC: nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_Sep_13_19:13:29_PDT_2021
Cuda compilation tools, release 11.5, V11.5.50
Build cuda_11.5.r11.5/compiler.30411180_0
System Arch List: 5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX
System OMP_NUM_THREADS: 1
System CUDA_HOME is None: False
System CPU Count: 128
Python Version: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:59:51)
[GCC 9.4.0]
----- nnDetection Information -----
det_num_threads 6
det_data is set True
det_models is set True
```
Best regards,
Nuno
Turns out the problem was that my dataset is composed of symlinks: a Docker bind mount only exposes the mounted subtree, so links whose targets lie outside it do not resolve inside the container. Swapping in the original files solved the problem.
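For anyone hitting the same issue: one workaround is to materialize the dataset before mounting it, copying with `cp -rL` so every symlink is replaced by its target's contents. A self-contained sketch (paths and file names are illustrative):

```shell
# Dereference symlinks so the tree handed to `docker run -v` contains real
# files; cp -rL follows each link and copies the target's bytes instead.
src=$(mktemp -d); dst=$(mktemp -d)
echo "image bytes" > "$src/real_case.nii.gz"
ln -s "$src/real_case.nii.gz" "$src/case_020_0000.nii.gz"
cp -rL "$src/." "$dst/"
# The copy is a regular file, readable from inside any container:
[ -L "$dst/case_020_0000.nii.gz" ] || echo "regular file"
cat "$dst/case_020_0000.nii.gz"
rm -rf "$src" "$dst"
```

An alternative is to also bind-mount the directories the symlinks point to, at the same absolute paths, so the links resolve inside the container without duplicating the data.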