FirasGit / medicaldiffusion

Medical Diffusion: This repository contains the code for our paper Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Synthesis

train own dataset

Zhongrocky opened this issue

Hello, I ran into a "Dataset is not available" error while training on my own dataset. The code can return the training data, but it cannot return the validation data. Do I need to change the dataset structure? And how should I run the code to train the VQ-GAN model? Thank you very much.
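(Note: the traceback in the next comment shows that train/train_vqgan.py is a Hydra application, so the dataset is selected through the cfg.dataset.name config value at launch time. As a hedged illustration only, with the dataset and model names below being placeholders rather than values confirmed from the repository, a run would look roughly like:

python train/train_vqgan.py dataset=liver model=vq_gan_3d

Whether training and validation loaders are returned then depends on that dataset name being registered in get_dataset.)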

I have the same question. It seems that only .nii files are supported, but I failed to get my data working and got the following traceback:
Traceback (most recent call last):
File "train/train_vqgan.py", line 95, in
run()
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/main.py", line 90, in decorated_main
_run_hydra(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
_run_app(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 452, in _run_app
run_and_report(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
raise ex
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
return func()
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/utils.py", line 453, in
lambda: hydra.run(
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "train/train_vqgan.py", line 21, in run
train_dataset, val_dataset, sampler = get_dataset(cfg)
File "/home/21zzz/.conda/envs/medicaldiffusion/lib/python3.8/site-packages/train/get_dataset.py", line 48, in get_dataset
raise ValueError(f'{cfg.dataset.name} Dataset is not available')
ValueError: liver Dataset is not available
What should I do to adjust my dataset?

You need to adjust two lines, as shown in this pull request:
746a588
You may also need to swap in the name of your dataset in the config files for training. Does the error persist after that?
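For reference, here is a minimal sketch of the kind of change meant here, assuming the error is raised at the end of an if/elif chain in train/get_dataset.py (which is what the traceback suggests). The LiverDataset class, its root_dir and split arguments, and the cfg.dataset.root_dir key are placeholders, not code taken from the repository:

# Sketch of train/get_dataset.py: register the custom dataset name so the
# ValueError shown in the traceback above is no longer raised.
def get_dataset(cfg):
    if cfg.dataset.name == 'liver':
        # LiverDataset is a hypothetical loader you implement for your own .nii volumes.
        train_dataset = LiverDataset(root_dir=cfg.dataset.root_dir, split='train')
        val_dataset = LiverDataset(root_dir=cfg.dataset.root_dir, split='val')
        sampler = None
        return train_dataset, val_dataset, sampler
    # ... existing branches for the datasets that ship with the repository ...
    raise ValueError(f'{cfg.dataset.name} Dataset is not available')

The matching Hydra dataset config would then set name: liver together with the path to your data, so that cfg.dataset.name resolves to the new branch.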

@benearnthof After modifying the config file and the code according to your suggestion, there are still errors. I am wondering whether the problem is related to the image size and to preprocessing. In addition, my region of interest is very small compared to the whole image, so the images are extremely imbalanced. Can this algorithm still be used for denoising such images?

@Zhongrocky Without seeing your dataset (image sizes, number of images, the shape of the input tensors you want to model, e.g. the part of the images that is relevant in your case) I really cannot answer this question. In general, this method should work for 3D data that contains information along both the height and width axes (spatially coherent data) and is also sufficiently self-similar along the time/depth axis.
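If it helps to collect that information, here is a small self-contained snippet for inspecting the volumes. It assumes the data are NIfTI files readable with nibabel and that they sit under a placeholder directory:

import glob
import nibabel as nib  # assumes the volumes are stored as .nii / .nii.gz files

# Print shape and voxel spacing for every volume so image sizes, the number of
# images, and the extent of the depth axis can be compared across cases.
for path in sorted(glob.glob('data/liver/**/*.nii*', recursive=True)):
    img = nib.load(path)
    print(path, 'shape:', img.shape, 'spacing:', img.header.get_zooms())

The printed shapes make it easy to check whether the depth axis is large enough and roughly consistent across cases before cropping or resizing for the VQ-GAN.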