Ofir Press's repositories
attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
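The description only names the method, so as a reminder of what ALiBi computes: instead of position embeddings, it adds a static, head-specific linear penalty to the attention scores that grows with query-key distance. A minimal NumPy sketch (the geometric slope schedule follows the paper; the function names are illustrative, not taken from the repo, and the power-of-two head count is an assumption):

```python
import numpy as np

def alibi_slopes(n_heads):
    # Per-head slopes form a geometric sequence: 2^(-8/n), 2^(-16/n), ...
    # (assumes n_heads is a power of two, as in the paper's main setup)
    start = 2 ** (-8.0 / n_heads)
    return np.array([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads, seq_len):
    # Bias added to causal attention logits: 0 on the diagonal,
    # increasingly negative for more distant (earlier) keys.
    pos = np.arange(seq_len)
    dist = pos[None, :] - pos[:, None]          # key_pos - query_pos (<= 0 below diagonal)
    slopes = alibi_slopes(n_heads)
    return slopes[:, None, None] * dist[None, :, :]  # shape (heads, seq, seq)
```

The bias is added to the query-key scores before the softmax (together with the usual causal mask), so no learned position parameters are needed and the model extrapolates to longer sequences.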
YouMayNotNeedAttention
Code for the Eager Translation Model from the paper "You May Not Need Attention"
shortformer
Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.
sandwich_transformer
Code for running the character-level Sandwich Transformers from the ACL 2020 paper "Improving Transformer Models by Reordering their Sublayers".
UsingTheOutputEmbedding
Code for the EACL paper "Using the Output Embedding to Improve Language Models" by Ofir Press and Lior Wolf
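The idea behind that paper (weight tying) can be shown in a few lines: the matrix used to embed input tokens is reused as the output projection, halving the embedding parameters. A minimal NumPy sketch under that reading (names are illustrative, not from the repo):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 100, 16

# A single shared matrix acts as both the input embedding and the output projection.
E = rng.normal(size=(vocab_size, d_model))

def embed(token_ids):
    # Input side: look up rows of the shared matrix.
    return E[token_ids]

def output_logits(hidden):
    # Output side: score every vocabulary item against the same rows.
    return hidden @ E.T  # shape (..., vocab_size)
```

Because both directions use `E`, gradient updates from the output layer also improve the input representations, which the paper shows helps perplexity.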
tstl_t5_bias
This is our implementation of the T5 bias for fairseq.
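For context on what the T5 bias is: a learned scalar per attention head is added to each query-key logit, indexed by a bucket of the relative distance; small distances get exact buckets and larger ones share logarithmically sized buckets. A simplified sketch of the causal bucketing (bucket counts follow the T5 paper's defaults; the exact fairseq port in this repo may differ):

```python
import math

NUM_BUCKETS, MAX_DISTANCE = 32, 128  # defaults from the T5 paper

def t5_bucket(rel_dist):
    # Causal case: rel_dist = query_pos - key_pos >= 0.
    # The first half of the buckets cover exact small distances;
    # the rest grow logarithmically up to MAX_DISTANCE.
    half = NUM_BUCKETS // 2
    if rel_dist < half:
        return rel_dist
    log_ratio = math.log(rel_dist / half) / math.log(MAX_DISTANCE / half)
    return min(NUM_BUCKETS - 1, half + int(log_ratio * half))
```

At attention time, a learned table of shape `(n_heads, NUM_BUCKETS)` is indexed with these buckets and added to the logits before the softmax, replacing absolute position embeddings.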
tensorflow_with_latest_papers
Implementations of recent RNN and Seq2Seq techniques in TensorFlow
BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
composer
A library of algorithms to speed up neural network training
LeViT_ALiBi
LeViT + ALiBi
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
ofirpress.github.io
Build a Jekyll blog in minutes, without touching the command line.
RecurrentHighwayNetworks
Recurrent Highway Networks - Author implementation for Tensorflow and Torch
tensorflow
Computation using data flow graphs for scalable machine learning
the-gan-zoo
A list of all named GANs!