When building deep learning models, the transformer architecture is often used, in particular multi-head attention (MHA).
This repo saves me from retyping everything whenever I want to use MHA in a project.
Implemented in PyTorch.
Boilerplate for the transformer architecture.
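
For orientation, here is a minimal sketch of what multi-head self-attention boilerplate in PyTorch can look like. It is an illustration only, not necessarily the code in this repo: the class name `MultiHeadAttention`, the parameters `embed_dim` and `num_heads`, and the fused `qkv_proj` layer are all assumptions made for the example. It also leaves out masking and dropout to stay short.

```python
import math

import torch
from torch import nn


class MultiHeadAttention(nn.Module):
    """Illustrative multi-head self-attention (hypothetical, not this repo's API)."""

    def __init__(self, embed_dim: int, num_heads: int) -> None:
        super().__init__()
        if embed_dim % num_heads != 0:
            raise ValueError("embed_dim must be divisible by num_heads")
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # One fused projection producing queries, keys, and values (assumed design).
        self.qkv_proj = nn.Linear(embed_dim, 3 * embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, embed_dim).
        batch, seq_len, embed_dim = x.shape
        q, k, v = self.qkv_proj(x).chunk(3, dim=-1)

        def split_heads(t: torch.Tensor) -> torch.Tensor:
            # (batch, seq_len, embed_dim) -> (batch, num_heads, seq_len, head_dim)
            return t.reshape(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split_heads(q), split_heads(k), split_heads(v)

        # Scaled dot-product attention, computed independently per head.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        weights = scores.softmax(dim=-1)
        context = weights @ v

        # Merge the heads back into a single embedding dimension.
        context = context.transpose(1, 2).reshape(batch, seq_len, embed_dim)
        return self.out_proj(context)


if __name__ == "__main__":
    mha = MultiHeadAttention(embed_dim=16, num_heads=4)
    x = torch.randn(2, 5, 16)  # (batch, seq_len, embed_dim)
    print(mha(x).shape)
```

The output keeps the input shape, so a block like this can be dropped into a larger model wherever a sequence-to-sequence layer is expected. PyTorch also ships a built-in `torch.nn.MultiheadAttention`, which may be a better fit when custom behaviour is not needed.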