aksg87 / adpkd-segmentation-pytorch

Segmentation of kidneys on MRI in Autosomal Dominant Polycystic Kidney

broken codebase

mlewis1973 opened this issue · comments

The package installs without complaint on macOS Catalina (conda, Python 3.8.5, CPU-only torch), but inference.py dies with a segmentation fault. Walking through the code shows the trigger is 'from catalyst import dl'. There is an unknown incompatibility between catalyst and one or more of the other packages, since 'import catalyst.dl' works fine in a clean environment. The conflict is not with pytorch. tensorboard also has some issue with catalyst (but no segmentation fault). Frankly, do tensorboard, keras, jupyter, etc. all really need to be installed for inference? Detritus from training needs to be removed.
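One way to narrow down which import crashes the interpreter is to run each candidate in a fresh subprocess, so a segmentation fault kills only the child and not the script doing the checking. The following is a minimal sketch, not part of the repository; the helper names `import_exit_code` and `describe` are hypothetical, and the list of candidate imports is only a guess based on the packages mentioned above.

```python
import subprocess
import sys


def import_exit_code(statement: str) -> int:
    """Run one import statement in a fresh interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", statement],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def describe(code: int) -> str:
    """Turn a subprocess return code into a short human-readable status."""
    if code == 0:
        return "ok"
    if code < 0:
        # On POSIX a negative code means the child was killed by that signal
        # (for example -11 for SIGSEGV).
        return f"killed by signal {-code}"
    return f"failed with exit code {code}"


if __name__ == "__main__":
    # Assumed candidates; adjust to whatever the environment actually installs.
    candidates = [
        "import torch",
        "import tensorboard",
        "import catalyst",
        "from catalyst import dl",
    ]
    for statement in candidates:
        print(f"{statement!r}: {describe(import_exit_code(statement))}")
```

Running this once in the full environment and once in a clean one should show whether the crash follows a single import or only appears in combination with other installed packages.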

On two Linux platforms (Ubuntu and CentOS), the package installs without complaint. The current requirements.txt does not have +cuXXX on the torch versions, so you get CPU-only libraries. There is no issue with catalyst.dl, but after the model downloads there is a reproducible pickle error when torch loads the checkpoint:

File "/home/mlewis/miniconda3/envs/adpkd_gpu/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

The torch device seems to be CPU-only even in an environment with GPU support:
File "/home/mlewis/work/Gears/ADPKD-segmentation/adpkd-segmentation-pytorch-GPU/adpkd_segmentation/utils/train_utils.py", line 38, in load_model_data
checkpoint = torch.load(path, map_location=torch.device('cpu'))
(adpkd_gpu) [mlewis@acr-ailab adpkd-segmentation-pytorch-GPU]$ python -c 'import torch;print(torch.cuda.is_available())'
True

Here is the barebones requirements.txt that gets macOS to the same error as Linux:
albumentations==1.0.3
catalyst==20.8.2
nibabel==3.2.1
#nvidia-cublas-cu11==11.10.3.66
#nvidia-cuda-nvrtc-cu11==11.7.99
#nvidia-cuda-runtime-cu11==11.7.99
#nvidia-cudnn-cu11==8.5.0.96
opencv-python-headless==4.5.3.56
pydicom==2.3.0
seaborn==0.11.1
segmentation-models-pytorch~=0.2.0
SimpleITK==2.0.2
torch==1.13.1
torchvision==0.14.1

(adpkd) mlewis@bfei-imac275k adpkd-segmentation-pytorch % python adpkd_segmentation/inference/inference.py
Enter run inference...
Downloading: "https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/tf_efficientnet_b5_ap-9e82fae8.pth" to /Users/mlewis/.cache/torch/hub/checkpoints/tf_efficientnet_b5_ap-9e82fae8.pth
100%|██████████████████████████████████████████████████████████████████████████████████| 117M/117M [00:01<00:00, 115MB/s]
loading checkpoint checkpoints/best_val_checkpoint.pth
Traceback (most recent call last):
File "adpkd_segmentation/inference/inference.py", line 117, in <module>
run_inference(
File "adpkd_segmentation/inference/inference.py", line 47, in run_inference
model_args = load_config(
File "/Users/mlewis/flywheel/Gears/AI/ADPKD-segmentation/adpkd-segmentation-pytorch/adpkd_segmentation/inference/inference_utils.py", line 80, in load_config
load_model_data(saved_checkpoint, model, new_format=checkpoint_format)
File "/Users/mlewis/flywheel/Gears/AI/ADPKD-segmentation/adpkd-segmentation-pytorch/adpkd_segmentation/utils/train_utils.py", line 38, in load_model_data
checkpoint = torch.load(path, map_location=torch.device('cpu'))
File "/Users/mlewis/opt/miniconda3/envs/adpkd/lib/python3.8/site-packages/torch/serialization.py", line 795, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/Users/mlewis/opt/miniconda3/envs/adpkd/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

It's also not clear why it's downloading the TensorFlow EfficientNet as opposed to the torch version.
Here is 'pip list' after installing with my distilled requirements.txt above:
(adpkd) mlewis@bfei-imac275k adpkd-segmentation-pytorch % pip list
Package Version Editable project location


absl-py 2.0.0
adpkd-segmentation 1.0 /Users/mlewis/flywheel/Gears/AI/ADPKD-segmentation/adpkd-segmentation-pytorch
albumentations 1.0.3
appnope 0.1.3
asttokens 2.4.1
backcall 0.2.0
cachetools 5.3.2
catalyst 20.8.2
certifi 2023.7.22
charset-normalizer 3.3.2
contourpy 1.1.1
cycler 0.12.1
decorator 5.1.1
deprecation 2.1.0
efficientnet-pytorch 0.6.3
executing 2.0.1
fonttools 4.43.1
gitdb 4.0.11
GitPython 3.1.40
google-auth 2.23.4
google-auth-oauthlib 1.0.0
grpcio 1.59.2
idna 3.4
imageio 2.31.6
importlib-metadata 6.8.0
importlib-resources 6.1.0
ipython 8.12.3
jedi 0.19.1
joblib 1.3.2
kiwisolver 1.4.5
lazy_loader 0.3
Markdown 3.5.1
MarkupSafe 2.1.3
matplotlib 3.7.3
matplotlib-inline 0.1.6
munch 4.0.0
networkx 3.1
nibabel 3.2.1
numpy 1.24.4
oauthlib 3.2.2
opencv-python-headless 4.5.3.56
packaging 23.2
pandas 2.0.3
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 10.0.1
pip 23.3
plotly 5.18.0
pretrainedmodels 0.7.4
prompt-toolkit 3.0.39
protobuf 4.25.0
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.5.0
pyasn1-modules 0.3.0
pydicom 2.3.0
Pygments 2.16.1
pyparsing 3.1.1
python-dateutil 2.8.2
pytz 2023.3.post1
PyWavelets 1.4.1
PyYAML 6.0.1
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
scikit-image 0.21.0
scikit-learn 1.3.2
scipy 1.10.1
seaborn 0.11.1
segmentation-models-pytorch 0.2.0
setuptools 68.0.0
SimpleITK 2.0.2
six 1.16.0
smmap 5.0.1
stack-data 0.6.3
tenacity 8.2.3
tensorboard 2.14.0
tensorboard-data-server 0.7.2
tensorboardX 2.6.2.2
threadpoolctl 3.2.0
tifffile 2023.7.10
timm 0.4.12
torch 1.13.1
torchvision 0.14.1
tqdm 4.66.1
traitlets 5.13.0
typing_extensions 4.8.0
tzdata 2023.3
urllib3 2.0.7
wcwidth 0.2.9
Werkzeug 3.0.1
wheel 0.41.2
zipp 3.17.0

The missing instruction is that the package does not pull the model from git-lfs automatically; you have to do it manually:
conda install git-lfs
git lfs install
git lfs pull
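The "invalid load key, 'v'" error is most likely torch trying to unpickle a Git LFS pointer file, which is a small text stub beginning with "version https://git-lfs.github.com/spec/v1". A quick check before loading could look like the sketch below. The function name `is_lfs_pointer` is hypothetical, and the checkpoint path is taken from the log output above.

```python
from pathlib import Path

# Git LFS pointer files are small text files that start with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path) -> bool:
    """Return True if the file looks like a Git LFS pointer rather than real data."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    checkpoint = Path("checkpoints/best_val_checkpoint.pth")
    if not checkpoint.exists():
        print(f"{checkpoint} not found")
    elif is_lfs_pointer(checkpoint):
        print(f"{checkpoint} is an LFS pointer; run 'git lfs pull' first")
    else:
        print(f"{checkpoint} looks like real checkpoint data")
```

If the check reports a pointer, the real weights were never downloaded, and torch.load fails on the leading 'v' of "version".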

The following requirements.txt is all you need for inference and containerization:
albumentations==1.0.3
catalyst==20.8.2
nibabel==3.2.1
#nvidia-cublas-cu11==11.10.3.66
#nvidia-cuda-nvrtc-cu11==11.7.99
#nvidia-cuda-runtime-cu11==11.7.99
#nvidia-cudnn-cu11==8.5.0.96
opencv-python-headless==4.5.3.56
pydicom==2.3.0
seaborn==0.11.1
segmentation-models-pytorch~=0.2.0
SimpleITK==2.0.2
torch==1.13.1
torchvision==0.14.1
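To confirm that an environment or container image actually matches these pins, the exact '==' entries can be compared against what is installed. This is a rough sketch using only the standard library; `parse_pins` and `mismatches` are hypothetical helper names, commented-out lines are skipped, and '~=' specifiers are ignored rather than evaluated.

```python
from importlib import metadata


def parse_pins(text: str) -> dict:
    """Map package name -> version for exact '==' pins, skipping comments."""
    pins = {}
    for raw in text.splitlines():
        # Drop anything after '#', so commented-out requirements are ignored.
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def mismatches(pins: dict) -> dict:
    """Return {name: (pinned, installed_or_None)} for every unsatisfied pin."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            report[name] = (pinned, installed)
    return report


if __name__ == "__main__":
    with open("requirements.txt") as handle:
        pins = parse_pins(handle.read())
    for name, (pinned, installed) in mismatches(pins).items():
        print(f"{name}: pinned {pinned}, installed {installed}")
```

An empty report means every exact pin is satisfied; anything listed is either missing or installed at a different version.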