lucidrains/CoLT5-attention Issues
Simple ViT error (Updated 1)
Wrong results given by triton bwd (Closed 16)
The position embedding? (Closed 5)
GPT type T5 implementation (Closed 2)
Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch