Questions about the noise sampling.

Muon2 opened this issue 7 months ago · comments

Great work!
I am wondering:

Why not use EDM noise sampling instead of the strategy from simple diffusion?
Why use a fixed noise strength (0) on the condition image? I think the sampling expression is given in the SVD paper.

Muhammad Muaz · Answer 1 · Mon Jan 15 2024 19:54:36 GMT+0800 (China Standard Time)

@pixeli99 Thanks for your work.
I have a similar question: why did you choose the rand_cosine_interpolated noise scheduler instead of the one described in the EDM (Karras et al.) paper, the one highlighted in the attached image? Correct me if my understanding is wrong!

(Edit: And what do the variables noise_d_low and noise_d_high correspond to?)

Thanks in advance.

Pengxiang Li · Answer 2 · Tue Jan 16 2024 02:03:27 GMT+0800 (China Standard Time)

Thank you very much for raising this question.
This was an oversight on my part: I had originally assumed that sigma in the code followed a simple log-normal distribution, but I uploaded an incorrect version.

Using simple diffusion was merely an immature attempt of mine. My intent was to try mixing videos of different resolutions for training, and I wanted to use the simple diffusion strategy to apply noise drawn from different distributions (in fact, I am not sure my understanding is correct). So I am very eager to hear everyone's understanding of p_train(σ).
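
For reference, in the EDM paper p_train(σ) is a log-normal distribution: ln(σ) is drawn from a normal with mean P_mean and standard deviation P_std. A minimal sketch of that sampling is below. The function name is hypothetical, and the defaults (-1.2, 1.2) are the EDM image-model values, not necessarily what SVD or this repository uses.

```python
import math
import random


def sample_sigma_lognormal(rng, p_mean=-1.2, p_std=1.2):
    """Draw one sigma with ln(sigma) ~ N(p_mean, p_std^2).

    The defaults are the EDM image-model values (an assumption here);
    replace them with the values appropriate for your model.
    """
    return math.exp(rng.gauss(p_mean, p_std))


# Draw a batch of training noise levels with a seeded generator.
rng = random.Random(0)
sigmas = [sample_sigma_lognormal(rng) for _ in range(10_000)]

# Sanity check: the mean of ln(sigma) should be close to p_mean.
mean_log_sigma = sum(math.log(s) for s in sigmas) / len(sigmas)
```

Shifting P_mean upward puts more training weight on high noise levels, which is the usual knob when moving to higher resolutions.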

Pengxiang Li · Answer 3 · Tue Jan 16 2024 02:04:53 GMT+0800 (China Standard Time)

Regarding the second question, that is solely because I have been lazy. I will complete this section of the code.

Pengxiang Li · Answer 4 · Tue Jan 16 2024 02:08:39 GMT+0800 (China Standard Time)

Hello, as @Muon2 mentioned, this is the strategy from simple diffusion. You can look at Section 3.1 of the paper for more details (since I don't fully understand it either, it is probably more reliable to read the original paper directly 😢).
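
A rough sketch of how the shifted and interpolated cosine schedules from simple diffusion can be turned into a sigma sampler is below. It assumes that noise_d_low and noise_d_high are the two reference resolutions whose shifted schedules get interpolated, and that image_d is the training resolution. That reading is a guess based on the parameter names, and all function names here are hypothetical, so please check it against the repository code and the paper. Clipping of the logSNR range is omitted for brevity.

```python
import math
import random


def logsnr_cosine(t):
    """Unshifted cosine schedule: logSNR(t) = -2 * log(tan(pi * t / 2)), t in (0, 1)."""
    return -2.0 * math.log(math.tan(math.pi * t / 2.0))


def logsnr_cosine_shifted(t, image_d, noise_d):
    """Shift a schedule tuned at resolution noise_d to resolution image_d."""
    return logsnr_cosine(t) + 2.0 * math.log(noise_d / image_d)


def logsnr_cosine_interpolated(t, image_d, noise_d_low, noise_d_high):
    """Blend two shifted schedules; the weight moves from low to high as t grows."""
    low = logsnr_cosine_shifted(t, image_d, noise_d_low)
    high = logsnr_cosine_shifted(t, image_d, noise_d_high)
    return (1.0 - t) * low + t * high


def sigma_from_logsnr(logsnr):
    """EDM-style sigma for a given logSNR, assuming unit data scale."""
    return math.exp(-logsnr / 2.0)


def sample_sigma(rng, image_d, noise_d_low, noise_d_high, eps=1e-4):
    """Draw t uniformly in (eps, 1 - eps) and map it to a sigma."""
    t = rng.uniform(eps, 1.0 - eps)
    logsnr = logsnr_cosine_interpolated(t, image_d, noise_d_low, noise_d_high)
    return sigma_from_logsnr(logsnr)


rng = random.Random(0)
sigma = sample_sigma(rng, image_d=256, noise_d_low=32, noise_d_high=256)
```

Under this sketch, a larger image_d relative to the reference resolution lowers the logSNR at every t, so more of the training mass lands on high noise levels.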

Muhammad Muaz · Answer 5 · Tue Jan 16 2024 06:52:54 GMT+0800 (China Standard Time)

I see. I'll look into the Simple Diffusion paper.

BTW, I am curious: have you tried other noise scheduling techniques besides the one from the Simple Diffusion paper?

Muon2 · Answer 6 · Tue Jan 16 2024 14:09:53 GMT+0800 (China Standard Time)

I think you may be wrong about the simple diffusion sigma sampling. If you use simple diffusion, you should probably change the sampling scheduler as well, since the Euler sampler uses the original EDM timestep formula rather than the one from simple diffusion.
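
For context, the EDM timestep formula referred to here spaces the sampling noise levels as σ_i = (σ_max^(1/ρ) + i/(N-1) · (σ_min^(1/ρ) - σ_max^(1/ρ)))^ρ. A minimal sketch follows. The function name is hypothetical, and the defaults are the EDM paper's values (σ_min = 0.002, σ_max = 80, ρ = 7), which may differ from what the SVD sampler is configured with.

```python
import math


def karras_sigmas(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Return n noise levels from sigma_max down to sigma_min (n >= 2).

    Defaults are the EDM paper values and are an assumption here.
    """
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    return [
        (max_inv_rho + i / (n - 1) * (min_inv_rho - max_inv_rho)) ** rho
        for i in range(n)
    ]


# Example: a 25-step sampling schedule.
sigmas = karras_sigmas(25)
```

This schedule only fixes which noise levels the sampler visits at inference time. It is defined separately from the distribution used to draw σ during training.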

Pengxiang Li · Answer 7 · Tue Jan 16 2024 14:31:52 GMT+0800 (China Standard Time)

I understand what you're saying, but I think that different sigma distributions correspond to different diffusion paths. In theory, would it be possible to use the same sampler for sampling? I suspect there is a flaw in my understanding, but I'm not sure where I've gone wrong. When we're training, is our definition of the timestep the same, i.e. 0.25 ln(σ)?
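
For reference, 0.25 ln(σ) is the c_noise term of the EDM preconditioning, which is computed from the value of σ alone. A minimal sketch of the four EDM coefficients is below. The function name is hypothetical, and sigma_data = 0.5 is the EDM default, which is an assumption and may not match the value used for SVD.

```python
import math


def edm_preconditioning(sigma, sigma_data=0.5):
    """Return (c_skip, c_out, c_in, c_noise) as defined in the EDM paper.

    sigma_data=0.5 is the EDM default and is an assumption here.
    """
    denom = sigma ** 2 + sigma_data ** 2
    c_skip = sigma_data ** 2 / denom
    c_out = sigma * sigma_data / math.sqrt(denom)
    c_in = 1.0 / math.sqrt(denom)
    c_noise = 0.25 * math.log(sigma)  # the "timestep" fed to the network
    return c_skip, c_out, c_in, c_noise


c_skip, c_out, c_in, c_noise = edm_preconditioning(1.0)
```

Because every coefficient is a function of σ only, the network sees the same conditioning for a given σ no matter which distribution that σ was drawn from.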

Pengxiang Li · Answer 8 · Tue Jan 16 2024 14:33:31 GMT+0800 (China Standard Time)

@m-muaz I haven't tried it yet, but if I make any progress, I will update here.

Muon2 · Answer 9 · Tue Jan 16 2024 15:01:13 GMT+0800 (China Standard Time)

Sigma distributions would NOT actually affect the diffusion paths. The key change in simple diffusion is to the alpha and beta schedules; just changing the training sigma distribution would not help, in my opinion.
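
One way to read the first claim, in EDM notation (an interpretation, not something stated in this thread): the training objective is

```latex
\mathcal{L}(\theta) =
  \mathbb{E}_{\sigma \sim p_{\text{train}}}\,
  \mathbb{E}_{y \sim p_{\text{data}}}\,
  \mathbb{E}_{n \sim \mathcal{N}(0,\, \sigma^2 I)}
  \left[ \lambda(\sigma)\, \lVert D_\theta(y + n;\, \sigma) - y \rVert_2^2 \right]
```

Here p_train only decides how often each noise level σ is visited during training. The perturbation itself, y + n with n drawn from N(0, σ²I), has the same form for every choice of p_train.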

Pengxiang Li · Answer 10 · Tue Jan 16 2024 15:24:21 GMT+0800 (China Standard Time)

I roughly understand what you mean. I may still need to read carefully to grasp the principle here. Thank you very much for your reply.