FirasGit / medicaldiffusion

Medical Diffusion: This repository contains the code for our paper Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Synthesis

train own dataset

Zhongrocky opened this issue

Hello, I ran into a "Dataset is not available" error while training on my own dataset. The code can return the training data, but it cannot return the validation data. Do I need to change the dataset structure? And how should I run the code to train the VQ-GAN model? Thank you very much.
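(Note: the traceback in the next comment shows that train/train_vqgan.py is a Hydra application, so the dataset is selected through the cfg.dataset.name config value at launch time. As a hedged illustration only, with the dataset and model names below being placeholders rather than values confirmed from the repository, a run would look roughly like:

python train/train_vqgan.py dataset=liver model=vq_gan_3d

Whether training and validation loaders are returned then depends on that dataset name being registered in get_dataset.)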

I have the same question. It seems that only .nii files are supported, but I failed to get my data working and got the following traceback:
Traceback (most recent call last):
File "train/train_vqgan.py", line 95, in
run()
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/main.py", line 90, in decorated_main
_run_hydra(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
_run_app(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 452, in _run_app
run_and_report(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
raise ex
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
return func()
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 453, in
lambda: hydra.run(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "train/train_vqgan.py", line 21, in run
train_dataset, val_dataset, sampler = get_dataset(cfg)
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/train/get_dataset.py", line 48, in get_dataset
raise ValueError(f'{cfg.dataset.name} Dataset is not available')
ValueError: liver Dataset is not available
What should I do to adjust my dataset?

You need to adjust two lines, as shown in this pull request:
746a588
You may also need to swap in the name of your dataset in the config files for training. Does the error persist after that?
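For reference, here is a minimal sketch of the kind of change meant here, assuming the error is raised at the end of an if/elif chain in train/get_dataset.py (which is what the traceback suggests). The LiverDataset class, its root_dir and split arguments, and the cfg.dataset.root_dir key are placeholders, not code taken from the repository:

# Sketch of train/get_dataset.py: register the custom dataset name so the
# ValueError shown in the traceback above is no longer raised.
def get_dataset(cfg):
    if cfg.dataset.name == 'liver':
        # LiverDataset is a hypothetical loader you implement for your own .nii volumes.
        train_dataset = LiverDataset(root_dir=cfg.dataset.root_dir, split='train')
        val_dataset = LiverDataset(root_dir=cfg.dataset.root_dir, split='val')
        sampler = None
        return train_dataset, val_dataset, sampler
    # ... existing branches for the datasets that ship with the repository ...
    raise ValueError(f'{cfg.dataset.name} Dataset is not available')

The matching Hydra dataset config would then set name: liver together with the path to your data, so that cfg.dataset.name resolves to the new branch.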

@benearnthof After modifying the config file and the code according to your suggestion, there are still errors. I am wondering whether the problem is related to the image size and to preprocessing. In addition, my region of interest is very small compared to the whole image, so the images are extremely imbalanced. Can this algorithm still be used for denoising such images?

@Zhongrocky Without seeing your dataset (image sizes, number of images, the shape of the input tensors you want to model, e.g. the part of the images that is relevant in your case) I really cannot answer this question. In general, this method should work for 3D data that contains information along both the height and width axes (spatially coherent data) and is also sufficiently self-similar along the time/depth axis.
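If it helps to collect that information, here is a small self-contained snippet for inspecting the volumes. It assumes the data are NIfTI files readable with nibabel and that they sit under a placeholder directory:

import glob
import nibabel as nib  # assumes the volumes are stored as .nii / .nii.gz files

# Print shape and voxel spacing for every volume so image sizes, the number of
# images, and the extent of the depth axis can be compared across cases.
for path in sorted(glob.glob('data/liver/**/*.nii*', recursive=True)):
    img = nib.load(path)
    print(path, 'shape:', img.shape, 'spacing:', img.header.get_zooms())

The printed shapes make it easy to check whether the depth axis is large enough and roughly consistent across cases before cropping or resizing for the VQ-GAN.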