A crude effort to reimplement DiffusionBERT
Training does not work yet.
D3PM paper: https://arxiv.org/pdf/2107.03006.pdf
DiffusionBERT paper: https://arxiv.org/pdf/2211.15029.pdf
Math derivation blog post: https://beckham.nz/2022/07/11/d3pms.html
Bayes' rule: https://towardsdatascience.com/bayes-rule-with-a-simple-and-practical-example-2bce3d0f4ad0