lucidrains/performer-pytorch Issues
Residual Connection (Closed, 3)
Question about masking (Updated, 2)
Pretrained example (Updated)
Causal linear attention benchmark (Closed, 13)
Using Performer with GNNs (Updated)
Huge model state dict size? (Updated)
Attention map (Closed, 2)
Performer Plain (Updated)
Rotary Position Embedding (Updated)
torch.max(data_dash) bug (Closed, 2)
Recover attention scores (Updated, 3)
Performer Benchmark (Updated)
`to_out` bias (Closed, 3)
Decoder Mask (Updated)
Triangular matrices ? (Closed, 10)
Deterministic layers (Updated, 1)
FixNorm alongside ScaleNorm (Updated, 3)
Applying decoder input mask? (Closed, 2)