ashleve / transformer

Implementing the Transformer architecture from "Attention Is All You Need"

Transformer

Built with PyTorch Lightning and Hydra.

Description

Reproducing the Transformer architecture from "Attention Is All You Need".
Language modelling on the WikiText-2 dataset.
Contains two models (sketched below):

  • Transformer - a transformer written from scratch
  • TransformerPytorch - a transformer built from PyTorch's nn.Transformer modules

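For orientation, here is a minimal sketch of scaled dot-product attention, the core operation the from-scratch model has to implement, followed by a hypothetical skeleton in the spirit of the nn.Transformer-based variant. All names, shapes, and hyperparameters below are illustrative assumptions, not the repo's actual code; positional encodings are omitted for brevity.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # e.g. a causal mask so tokens cannot attend to future positions
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

class TinyTransformerLM(nn.Module):
    # Hypothetical skeleton in the spirit of TransformerPytorch; details may differ.
    # NOTE: positional encodings omitted for brevity.
    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):  # tokens: (batch, seq_len) of token ids
        seq_len = tokens.size(1)
        # additive causal mask: -inf above the diagonal, 0 elsewhere
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        h = self.encoder(self.embed(tokens), mask=mask)
        return self.head(h)  # per-token logits over the vocabulary
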
How to run

Install dependencies

# clone project
git clone https://github.com/ashleve/transformer
cd transformer

# [OPTIONAL] create conda environment
conda env create -f conda_env_gpu.yaml -n myenv
conda activate myenv

# install requirements
pip install -r requirements.txt

Train model with default configuration

# default
python run.py

# train on CPU
python run.py trainer.gpus=0

# train on GPU
python run.py trainer.gpus=1
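
Under the hood, run.py in lightning-hydra-template-style projects is typically just a thin Hydra entry point that composes the config and hands it to a training function. A rough sketch follows; the src.train import is hypothetical and details in this repo may differ:

import hydra
from omegaconf import DictConfig

@hydra.main(config_path="configs", config_name="config")
def main(cfg: DictConfig):
    # cfg now holds the defaults merged with any command-line overrides,
    # e.g. trainer.gpus=0 from the examples above
    from src.train import train  # hypothetical training entry point
    return train(cfg)

if __name__ == "__main__":
    main()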

Train model with a chosen experiment configuration from configs/experiment/

python run.py experiment=experiment_name

You can override any parameter from the command line like this

python run.py trainer.max_epochs=20 datamodule.batch_size=64
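
Because the CLI is powered by Hydra, comma-separated values combined with the multirun flag launch one run per combination (a standard Hydra feature, not specific to this repo):

# sweep over epoch counts and batch sizes with Hydra multirun
python run.py -m trainer.max_epochs=10,20 datamodule.batch_size=32,64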

License

MIT

