archinetai / audio-diffusion-pytorch

audio-diffusion-pytorch/audio_diffusion_pytorch/diffusion.py

Line 85 in eafa972

noise = torch.randn_like(x)

In v-diffusion, sigma_t does not appear to be sampled from [0, 1], which differs from what your thesis describes. Will this cause any problems?

By sampling a random σt ∈ [0,1], we are more likely to pick a value that resembles x0 instead of pure noise ε, meaning that the model will more often see data with smaller amount of noise

That line is the actual noise, not sigma. Sigma is the noise level. Check the code two lines above the noise variable: sigma is sampled from the sigma distribution, which is uniform over the range [0, 1].
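To make the distinction concrete, here is a minimal sketch of one v-diffusion training step. It uses only the Python standard library rather than PyTorch so that it runs without dependencies. The function name `v_diffusion_training_pair` is hypothetical, and the cosine/sine weighting is the usual v-objective formulation, assumed here rather than copied from the repository's code.

```python
import math
import random


def v_diffusion_training_pair(x0, rng):
    """Return (sigma, x_noisy, v_target) for one training example.

    x0 is a list of floats standing in for one clean audio clip.
    Hypothetical helper for illustration, not the library's API.
    """
    # Noise LEVEL: one scalar per example, drawn uniformly from [0, 1].
    sigma = rng.uniform(0.0, 1.0)
    # Actual NOISE: standard Gaussian values, one per sample of x0.
    # These are unbounded and are not expected to lie in [0, 1].
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    # Assumed weighting: map sigma to an angle so that alpha^2 + beta^2 = 1.
    angle = sigma * math.pi / 2
    alpha, beta = math.cos(angle), math.sin(angle)
    # Mix clean signal and noise according to the noise level.
    x_noisy = [alpha * x + beta * e for x, e in zip(x0, noise)]
    # Velocity target that the network would be trained to predict.
    v_target = [alpha * e - beta * x for x, e in zip(x0, noise)]
    return sigma, x_noisy, v_target


rng = random.Random(0)
x0 = [0.5, -0.25, 0.1, 0.0]
sigma, x_noisy, v_target = v_diffusion_training_pair(x0, rng)
```

In this sketch `sigma` plays the role of σt from the thesis quote, while `noise` corresponds to the `torch.randn_like(x)` line referenced above, which is why that line is Gaussian rather than uniform.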
