orobix/fwdgrad Issues
No speed-up in my implementation too (Updated 3)
Implementation quick question (Updated 1)
Use FGD to fine-tune the transformer (Updated 1)
Test function (Closed)
Can't train to convergence (Closed 2)
Implementation of "Gradients without backpropagation" paper (https://arxiv.org/abs/2202.08587) using functorch
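
For readers new to the paper, here is a minimal sketch of the forward-gradient idea. It is not the repository's code. The repository uses functorch, while this sketch swaps in hand-rolled dual numbers from the standard library so that it runs without dependencies. The names `Dual`, `forward_gradient`, `quadratic`, and `fgd` are hypothetical, chosen for illustration only. The estimator samples a random direction v, computes the directional derivative of f along v in a single forward pass, and uses (∇f·v)·v as the gradient estimate.

```python
import random


class Dual:
    """Dual number val + tan*eps; carries a value and its directional derivative."""

    def __init__(self, val, tan=0.0):
        self.val = val
        self.tan = tan

    @staticmethod
    def _lift(other):
        # Promote plain numbers to constants (zero tangent).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = Dual._lift(other)
        return Dual(self.val + o.val, self.tan + o.tan)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual._lift(other)
        return Dual(self.val - o.val, self.tan - o.tan)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        o = Dual._lift(other)
        # Product rule for the tangent part.
        return Dual(self.val * o.val, self.val * o.tan + self.tan * o.val)

    __rmul__ = __mul__


def forward_gradient(f, theta, rng):
    """Return f(theta) and the forward-gradient estimate (grad f . v) * v."""
    # Random perturbation direction with independent standard normal entries.
    v = [rng.gauss(0.0, 1.0) for _ in theta]
    # One forward pass: the tangent of the output is the directional derivative.
    out = f([Dual(t, vi) for t, vi in zip(theta, v)])
    return out.val, [out.tan * vi for vi in v]


def quadratic(x):
    """Toy objective (assumption for this sketch): sum of (x_i - 1)^2."""
    total = Dual(0.0)
    for xi in x:
        d = xi - 1.0
        total = total + d * d
    return total


def fgd(f, theta, lr=0.05, steps=500, seed=0):
    """Plain forward gradient descent: theta <- theta - lr * estimate."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(steps):
        _, g = forward_gradient(f, theta, rng)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


if __name__ == "__main__":
    print(fgd(quadratic, [0.0, 0.0]))
```

The estimate is noisy for any single draw of v and only matches the true gradient in expectation, so the step size in this sketch is kept small. The values of `lr` and `steps` are illustrative assumptions, not settings taken from the paper or the repository.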