performer
A simple PyTorch implementation of Performer.
Model
Performer approximates the attention kernel with a random feature map. The resulting kernel attention is intended to replace the Transformer's dot-product self-attention.
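As a rough illustration of why a feature map helps: once queries and keys are mapped to non-negative features, the attention output can be computed by aggregating keys and values first, so the cost grows linearly with sequence length. The sketch below is not this repository's code; `linear_attention`, its argument names, and the `eps` value are hypothetical and chosen only for illustration.

```python
import torch


def linear_attention(q_prime, k_prime, v, eps=1e-6):
    """Non-causal kernel attention from already feature-mapped inputs.

    Hypothetical helper, not part of this repository.
    q_prime: (batch, len_q, m) non-negative query features
    k_prime: (batch, len_k, m) non-negative key features
    v:       (batch, len_k, d_v) values
    """
    # Aggregate keys and values first: (batch, m, d_v).
    # This avoids building the (len_q, len_k) attention matrix.
    kv = torch.einsum("blm,bld->bmd", k_prime, v)
    # Row normaliser: q'_i dotted with the sum of all key features.
    normaliser = torch.einsum("blm,bm->bl", q_prime, k_prime.sum(dim=1))
    out = torch.einsum("blm,bmd->bld", q_prime, kv)
    # eps is an assumed small constant guarding against division by zero.
    return out / (normaliser.unsqueeze(-1) + eps)
```

Up to the small `eps` term, the result matches forming the full matrix of feature dot products, normalising each row to sum to one, and multiplying by `v`.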
Softmax Kernel
kernel_transformation=softmax_kernel_transformation
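For intuition, here is a minimal sketch of positive random features for the softmax kernel. It is an assumption about the general technique, not a copy of `softmax_kernel_transformation`: the name `softmax_feature_map` and its signature are hypothetical, and numerical-stability tricks are left out.

```python
import torch


def softmax_feature_map(x, projection):
    """Positive random features whose dot product estimates exp(q . k).

    Hypothetical helper, not part of this repository.
    x:          (..., d) queries or keys
    projection: (m, d) matrix with rows drawn from N(0, I)
    """
    m = projection.shape[0]
    # Random projections w_i . x, shape (..., m).
    wx = x @ projection.T
    # The -||x||^2 / 2 term makes the estimator unbiased for exp(q . k).
    half_sq_norm = (x ** 2).sum(dim=-1, keepdim=True) / 2
    return torch.exp(wx - half_sq_norm) / m ** 0.5
```

With the same `projection` applied to queries and keys, `softmax_feature_map(q, w) @ softmax_feature_map(k, w).T` approximates `torch.exp(q @ k.T)`, and the estimate tightens as `m` grows. To target the usual scaled attention `exp(q . k / sqrt(d))`, scale both `q` and `k` by `d ** -0.25` beforehand.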
Relu Kernel
kernel_transformation=relu_kernel_transformation
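A ReLU feature map is simpler: it does not try to approximate softmax, but defines its own attention kernel from non-negative features. The sketch below is an assumed form, not the repository's `relu_kernel_transformation`; `relu_feature_map`, `kernel_attention_weights`, and the `stabilizer` default are hypothetical.

```python
import torch


def relu_feature_map(x, projection, stabilizer=1e-3):
    """ReLU of scaled random projections, shifted to stay strictly positive.

    Hypothetical helper, not part of this repository.
    x:          (..., d) queries or keys
    projection: (m, d) random projection matrix
    """
    m = projection.shape[0]
    # The small stabilizer (an assumed value) keeps every feature positive,
    # so the attention normaliser can never be zero.
    return torch.relu(x @ projection.T / m ** 0.5) + stabilizer


def kernel_attention_weights(q_prime, k_prime):
    """Explicit (quadratic) attention weights, for inspection only."""
    scores = q_prime @ k_prime.transpose(-1, -2)
    return scores / scores.sum(dim=-1, keepdim=True)
```

Because every feature is positive, each row of `kernel_attention_weights` is a valid probability distribution over the keys.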
Test & Example
Language Model
Pretrain
Pretrain a masked language model.
- Pretrain file: /example/train_mlm.py
- Config file: /example/config.json
Usage
① Prepare the dataset and vocab you want to train on.
② Check the configuration in config.json.
③ Run /example/train_mlm.py.
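For step ②, a quick way to review the settings before training is to load the JSON file and list its top-level entries. This is only a convenience sketch: `load_config` and `describe_config` are hypothetical helpers, and no particular keys are assumed to exist in config.json.

```python
import json
from pathlib import Path


def load_config(path):
    """Read a JSON config file and return its top-level object as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        config = json.load(f)
    if not isinstance(config, dict):
        raise ValueError(f"expected a JSON object at the top level of {path}")
    return config


def describe_config(config):
    """Return one 'key = value' line per top-level setting, sorted by key."""
    return [f"{key} = {config[key]!r}" for key in sorted(config)]


# Example (assumes the script is run from the repository root):
# for line in describe_config(load_config("example/config.json")):
#     print(line)
```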
Finetuning
TODO
- Performer performance test
- Write test example
- Apply to a language model
- Evaluate the language model