Martin Tutek's starred repositories
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
arxiv-latex-cleaner
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
missing-semester
The Missing Semester of Your CS Education 📚
mplcyberpunk
"Cyberpunk style" for matplotlib plots
Fantasy-Premier-League
Creates a .csv file of all players in the English Premier League with their respective team and total fantasy points
YouTokenToMe
Unsupervised text tokenizer focused on computational efficiency
pytorch_block_sparse
Fast Block Sparse Matrices for PyTorch
sparse_learning
Sparse learning library and sparse momentum resources.
annotated_encoder_decoder
The Annotated Encoder-Decoder with Attention
Better_LSTM_PyTorch
An LSTM in PyTorch with best practices (weight dropout, forget bias, etc.) built-in. Fully compatible with PyTorch LSTM.
eraserbenchmark
A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/
sparselandtools
✨ A Python package for sparse representations and dictionary learning, including matching pursuit, K-SVD, and applications.
latent-treelstm
Cooperative Learning of Disjoint Syntax and Semantics
VirtualTeaching
DIY setup for virtual teaching on Ubuntu
Textual-Entailment-New-Protocols
Data release accompanying and documenting the paper "Collecting Entailment Data for Pretraining: New Protocols and Negative Results" by Samuel R. Bowman, Jennimaria Palomaki, Livio Baldini Soares, and Emily Pitler (https://arxiv.org/abs/2004.11997)