Jianbin Chang's repositories
c4-dataset-script
Inspired by Google's C4 (Colossal Clean Crawled Corpus), a series of data-cleaning scripts focused on Common Crawl processing, including the Chinese data processing and cleaning methods described in MassiveText.
blueprint-trainer
Scaffolding for sequence model training research.
apex
A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch.
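A minimal mixed-precision sketch using apex's `amp` API (the O1 recipe from the apex docs); the toy linear model, optimizer, and data below are illustrative placeholders, and apex with CUDA is assumed to be installed:

```python
# Illustrative sketch of apex.amp mixed-precision training (opt_level "O1").
# The model, optimizer, and data are placeholders, not from any repo above.
import torch
from apex import amp

model = torch.nn.Linear(16, 4).cuda()  # placeholder model; apex.amp expects CUDA
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Wrap model and optimizer; "O1" patches selected ops to run in FP16.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

inputs = torch.randn(8, 16, device="cuda")
targets = torch.randint(0, 4, (8,), device="cuda")
loss = torch.nn.functional.cross_entropy(model(inputs), targets)

# Scale the loss to avoid FP16 gradient underflow, then step as usual.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```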
bagua-core
Core communication lib for Bagua.
BLOOM-COT
Ongoing research training transformer language models at scale, including: BERT & GPT-2
ColossalAI-Examples
Examples of training models with hybrid parallelism using ColossalAI
GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
GPU-math
🤯 GPU math & benchmarks, branched from mli/transformers-benchmarks
hyena-jax
JAX/Flax implementation of the Hyena Hierarchy
juicefs
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
NeMo
NeMo: a toolkit for conversational AI
OptimalShardedDataParallel
An automated parallel training system that combines the advantages of both data and model parallelism. If you are interested, please visit/star/fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel
RWKV-LM
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.
safari
Convolutions for Sequence Modeling
TimeChamber
A Massively Parallel Large Scale Self-Play Framework
tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
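A minimal autograd sketch mirroring the example from tinygrad's README (the `Tensor` import path has varied between tinygrad versions, so treat it as an assumption):

```python
# Tiny forward/backward pass in tinygrad, adapted from its README example.
from tinygrad.tensor import Tensor  # newer versions: from tinygrad import Tensor

x = Tensor.eye(3, requires_grad=True)
y = Tensor([[2.0, 0, -2.0]], requires_grad=True)
z = y.matmul(x).sum()   # scalar output
z.backward()            # populate .grad on x and y

print(x.grad.numpy())   # dz/dx
print(y.grad.numpy())   # dz/dy
```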
Titans
A collection of models built with ColossalAI
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
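A minimal sketch of the library's high-level `pipeline` API; the task string selects a default pretrained model, which is downloaded on first use:

```python
# Quick inference with the transformers pipeline API.
from transformers import pipeline

# "sentiment-analysis" loads a default pretrained classifier on first call.
classifier = pipeline("sentiment-analysis")
print(classifier("tinygrad and transformers are both fun to read."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```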