Deterministic image-to-image translation

Question

Deterministic image-to-image translation

LRpz opened this issue 7 months ago · comments

Hi

I was wondering if Med-DDPM model could also be used for deterministic image-to-image translation tasks in 3D, such as converting brightfield to fluorescence images, where a given input must consistently produce the same output.

Do you think adjustments might be necessary in the conditioning mechanism, the selection of loss functions, and/or hyperparameter tuning to achieve high fidelity in the translation process while ensuring the output's determinism.

Many thanks!

mobaidoctor · Answer 1 · Thu Feb 15 2024 14:33:27 GMT+0800 (China Standard Time)

Hi, thank you for your inquiry. Med-DDPM can indeed be used for any 3D image-to-image translation tasks. We are currently enhancing our method to extend its capabilities further. We have already achieved very good results in MRI to CT translation tasks. To obtain the best results for each different task, you may need to adjust the hyperparameters accordingly.

LRpz · Answer 2 · Thu Feb 15 2024 16:25:33 GMT+0800 (China Standard Time)

Thank you very much for the quick response.
Looking forward to see your latest results!

LRpz · Answer 3 · Mon Feb 19 2024 22:09:39 GMT+0800 (China Standard Time)

Hi,

Although, I could sucessfully train a model to perform image to image translation, I do struggle to obtain a determinsitic output during inference.

To illustrate my point, I attached the following, where an input image has been iteratively translated 7 times. The middle plane of the stack is shown.

I was wondering if you could provide some guidance regarding which parameters could be tuned to allow the model to be purely deterministic?

I apologize in advance if this is a naive question - my background in this field is very limited.

Cheers,

mobaidoctor · Answer 4 · Tue Feb 20 2024 13:32:48 GMT+0800 (China Standard Time)

Thank you for getting in touch about your issue. Could you please share some details of your experiment with us? How many training images did you use? Can you share some samples from your training dataset? What were your hyperparameter settings? Did you modify any hyperparameters from the ones in our repository? How many iterations did your training involve? Did you use the same scheduled learning rate as we did in our training?

LRpz · Answer 5 · Tue Feb 20 2024 19:18:48 GMT+0800 (China Standard Time)

__Could you please share some details of your experiment with us?_
I would like to translate a single channel Z stack (acquired in brightfield) into another single channel Z stack (in fluorescence).

How many training images did you use?
About 2000 pairs of Z-stacks of dimension 128x128x48 pixels

Can you share some samples from your training dataset?
Both the input and target images where scaled between -1 and 1 and saved as .nii.gz in input and target folders, image pairs sharing the same filename. I can send you a link to some images privately.

What were your hyperparameter settings?
Given the input not being a mask, I made the following changes to your 'train.py' script:

Set the number of channels to 2
(I believe the rationale here is n_input channel + 1 to match the mask reshaping logic from label2masks)
in_channels = 2

Set the 'full_channel_mask' to 'False' to bypass the label2masks and resize_img_4d functions

transform = Compose([
    Lambda(lambda t: t[:128, :128, :48]), #My input images are 256x256x50, I added this crop temporarily
    Lambda(lambda t: torch.tensor(t).float()),
    Lambda(lambda t: (t * 2) - 1),
    Lambda(lambda t: t.unsqueeze(0)),
    Lambda(lambda t: t.transpose(3, 1)),
])

dataset = NiftiPairImageGenerator(
        inputfolder,
        targetfolder,
        input_size=input_size,
        depth_size=depth_size,
        input_channel = 1,
        transform=transform,
        target_transform=transform,
        full_channel_mask=False
    )

So the set of input hyperparameters are:

in_channels = 2
out_channels = 1
input_size = 128
depth_size = 48
num_channels = 64
num_res_blocks = 1
num_class_labels = 1
save_and_sample_every = 1000
timesteps = 250
batchsize = 1
epochs = 100000
with_condition = True
resume_weight = ''
train_lr = 1e-5

Did you modify any hyperparameters from the ones in our repository?
No I did not. Other than described in the previous question, I left all parameters to their default values.

How many iterations did your training involve?
100 000 epochs.

Did you use the same scheduled learning rate as we did in our training?
Yes, I did not change the scheduling, nor the betas.

After exeprimenting a bit further, I noticed that during inference with my trained model, most outputs where just noise when using 250 timesteps and approx. 1/7 inference was giving somewhat decent results. But after decreasing the number of timesteps to 100-150, thre results were more consistants although the signal to noise was very low (see the attached image from my previous post).

Could it be that the noise is added to the condition tensor after my modifications?

mobaidoctor · Answer 6 · Wed Feb 21 2024 19:14:29 GMT+0800 (China Standard Time)

Thank you for sharing the details of your experiment. It appears that your model may not have been trained sufficiently, given that you have 2,000 samples in your training dataset and have only completed 100,000 iterations. This equates to your model being trained for merely 50 epochs. We recommend training your model for at least 100 epochs, which, in your case, would amount to 200,000 iterations. Initially, you might need to train your model for 100,000 iterations with a learning rate of 0.0001. Subsequently, select your best-performing model and fine-tune it by reducing the learning rate to 0.00001 for another 100,000 iterations. This approach totals 100 epochs, with the first 50 epochs at a higher learning rate and the remaining 50 epochs at a lower learning rate. We hope that by adopting this scheduled learning rate strategy, you will achieve an acceptable model.

However, if you encounter any issues following this advice, we encourage you to share your code along with samples from your training dataset so that we can investigate the precise nature of your problem. Nonetheless, we are optimistic that you will not face any significant issues if you adhere to our suggestions and ensure your model is trained for a sufficient number of epochs.

LRpz · Answer 7 · Thu Feb 22 2024 00:31:55 GMT+0800 (China Standard Time)

Thanks a lot, I will try that and let you know how it goes.

Asma Seraj · Answer 8 · Sat Feb 24 2024 02:14:03 GMT+0800 (China Standard Time)

Hi. Hope you're doing well.
I've seen that in the code you've mentioned that the model could be class-conditioned. I have a dataset that contains three subfolders (so I have three classes), each with its distinct feature. Can I utilize Med-DDPM to train a class-conditioned model? How can I achieve this and what changes should I implement on the model? I would appreciate it if you could provide some guidance for me.
Thank you so much.

mobaidoctor · Answer 9 · Sat Feb 24 2024 13:58:38 GMT+0800 (China Standard Time)

@asmaseraj #17 (comment)

mobaidoctor · Answer 10 · Thu Mar 14 2024 19:13:28 GMT+0800 (China Standard Time)

@LRpz Hi, if your issue is resolved and you have no further questions, I will proceed to close this issue.

LRpz · Answer 11 · Thu Mar 14 2024 19:27:13 GMT+0800 (China Standard Time)

Hi,

I trained a model for 100 epochs as per your recommendations: 100,000 iterations with a learning rate of 0.0001, selected the best-performing model and fine-tune it for another 100,000 iterations with a learning rate of 0.00001.

Unfortuantely, I do still observe high varaition between each inferences with 250 timesteps.

Given the inference speed of the diffusion model and the size and resolution requirements of my samples (thousands of pixels in XY and hunderds in Z), I decided to adopt a 3D pix2pix approach which returns decent results at very high inference speeds.

Thank you veru much for trying to troubleshoot my issue and getting back to me.

All the best with your experiments!

mobaidoctor · Answer 12 · Fri Mar 15 2024 21:28:57 GMT+0800 (China Standard Time)

Hi @LRpz I'm sorry to hear that you've decided to move on from exploring diffusion models, but I understand your concerns about the inference speed and the specific requirements of your project. It's true that diffusion models can be slower compared to GANs, especially for tasks requiring high resolution and speed. Your feedback is valuable, and it helps us focus on improving in these areas. Best of luck with your experiments, and thank you for your efforts.