Jonathan Tow's repositories
accelerate
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed-precision support
jon-tow.github.io
My personal website
cc_net
Tools to download and clean up Common Crawl data
contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
CPCargo
A simple package to upload DL checkpoints to remote storage
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
english-wordnet
The Open English WordNet
flash-attention
Fast and memory-efficient exact attention
goodreads
Code samples for the Goodreads datasets
helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Megatron-LLM
Distributed trainer for LLMs
ml-engineering
Machine Learning Engineering Guides and Tools
rerope
Rectified Rotary Position Embeddings
ring-flash-attention
Ring attention implementation with flash attention
scattermoe
Triton-based implementation of Sparse Mixture of Experts.
text-dedup
All-in-one text de-duplication
transformers
🤗 Transformers: State-of-the-art machine learning for PyTorch, TensorFlow, and JAX.
triton
Development repository for the Triton language and compiler
zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism