Jianbin Chang (shjwudp)

Company: @NVIDIA

Location: Beijing, China

Organizations
BaguaSys

Jianbin Chang's repositories

shu

A collection and curation of Chinese books.

Language: Python | License: MIT | Stargazers: 146 | Issues: 5 | Issues: 3

c4-dataset-script

Inspired by Google's C4, a set of colossal clean data-cleaning scripts focused on CommonCrawl processing, including the Chinese data processing and cleaning methods from MassiveText.

Language: Python | License: MIT | Stargazers: 106 | Issues: 5 | Issues: 0
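
As a rough illustration of the kind of C4-style heuristics such a pipeline applies (the rules and thresholds below follow the C4 paper's description, not necessarily this repository's code, and the function name is hypothetical):

    TERMINAL_PUNCT = (".", "!", "?", '"', "。", "！", "？")  # C4 keeps lines ending in terminal punctuation

    def clean_page(text, min_words_per_line=5, min_lines=3):
        """Apply a few C4-style heuristics to one CommonCrawl page; return cleaned text or None to drop the page."""
        kept = []
        for line in text.splitlines():
            line = line.strip()
            if len(line.split()) < min_words_per_line:
                continue                      # drop very short lines (likely menus/boilerplate)
            if not line.endswith(TERMINAL_PUNCT):
                continue                      # drop lines that do not end in terminal punctuation
            if "{" in line or "lorem ipsum" in line.lower():
                continue                      # drop code-like lines and placeholder text
            kept.append(line)
        return "\n".join(kept) if len(kept) >= min_lines else None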

megabyte

A PyTorch implementation of MEGABYTE, a multi-scale transformer architecture that is tokenization-free and has sub-quadratic attention. Paper: https://arxiv.org/abs/2305.07185

Language: Python | License: MIT | Stargazers: 2 | Issues: 2 | Issues: 0
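
A very rough sketch of the multi-scale idea (a large global transformer over patch embeddings plus a small local transformer over the bytes inside each patch); module names and sizes are illustrative, not this repository's API, and causal masking plus the real model's input offsets are omitted for brevity:

    import torch
    import torch.nn as nn

    class TinyMegabyte(nn.Module):
        """Toy two-scale byte model: global attention across patches, local attention within each patch."""
        def __init__(self, d_global=512, d_local=256, patch_size=8, vocab=256):
            super().__init__()
            self.patch_size = patch_size
            self.byte_embed = nn.Embedding(vocab, d_local)
            self.patch_proj = nn.Linear(patch_size * d_local, d_global)   # patch embedder
            self.global_model = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_global, nhead=8, batch_first=True), num_layers=4)
            self.local_in = nn.Linear(d_global, d_local)
            self.local_model = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_local, nhead=4, batch_first=True), num_layers=2)
            self.to_logits = nn.Linear(d_local, vocab)

        def forward(self, bytes_in):                       # bytes_in: (batch, seq), seq divisible by patch_size
            b, t = bytes_in.shape
            p = t // self.patch_size
            x = self.byte_embed(bytes_in)                  # (b, t, d_local)
            patches = self.patch_proj(x.view(b, p, -1))    # (b, p, d_global)
            ctx = self.global_model(patches)               # global attention across patches
            local = x.view(b * p, self.patch_size, -1) + self.local_in(ctx).view(b * p, 1, -1)
            out = self.local_model(local)                  # local attention within each patch
            return self.to_logits(out).view(b, t, -1)      # next-byte logits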

blueprint-trainer

Scaffolding for sequence model training research.

Language: Python | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0

apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0
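
A minimal sketch of the (now legacy) apex.amp mixed-precision flow, assuming apex is installed and a toy model/optimizer stand in for a real training setup:

    import torch
    from apex import amp

    model = torch.nn.Linear(1024, 1024).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    # Patch model and optimizer for mixed precision ("O1" = conservative mixed precision).
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    x = torch.randn(32, 1024, device="cuda")
    loss = model(x).float().pow(2).mean()

    optimizer.zero_grad()
    with amp.scale_loss(loss, optimizer) as scaled_loss:   # loss scaling to avoid FP16 underflow
        scaled_loss.backward()
    optimizer.step()

Note that newer PyTorch versions ship a native torch.cuda.amp that covers the same use case.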

bagua-core

Core communication lib for Bagua.

Language: Rust | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

BLOOM-COT

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

ColossalAI-Examples

Examples of training models with hybrid parallelism using ColossalAI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

GLM-130B

GLM-130B: An Open Bilingual Pre-Trained Model

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

GPU-math

🤯 GPU math & benchmarks, branched from mli/transformers-benchmarks

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

hyena-jax

JAX/Flax implementation of the Hyena Hierarchy

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.

Language: Go | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

MEGABYTE-pytorch

Implementation of MEGABYTE, "Predicting Million-byte Sequences with Multiscale Transformers", in PyTorch

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Megatron-LM

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

NeMo

NeMo: a toolkit for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OptimalShardedDataParallel

An automated parallel training system that combines the advantages of both data and model parallelism. If interested, please visit/star/fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0
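
For context, sharded data parallelism in stock PyTorch looks roughly like the following sketch; it uses torch.distributed.fsdp, not this project's API, and assumes the script is launched with torchrun so rank/world-size environment variables are set:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = torch.nn.Sequential(torch.nn.Linear(1024, 4096), torch.nn.Linear(4096, 1024)).cuda()
    model = FSDP(model)            # parameters, gradients and optimizer state are sharded across ranks
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(8, 1024, device="cuda")
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()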

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
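
The "RNN mode" that makes inference fast boils down to a per-channel recurrence; below is a numerically naive sketch of the WKV update (the real implementation carries an extra state for numerical stability, and the function name here is hypothetical):

    import torch

    def wkv_step(k_t, v_t, state, w, u):
        """One recurrent WKV step, applied element-wise per channel.
        k_t, v_t: current key/value vectors; w: positive per-channel decay; u: bonus for the current token.
        state = (a, b): running sums of e^k * v and e^k over past tokens, decayed by e^{-w} each step."""
        a, b = state
        ek_u = torch.exp(u + k_t)
        out = (a + ek_u * v_t) / (b + ek_u)   # attention-like weighted average over past plus current token
        decay = torch.exp(-w)
        a = decay * a + torch.exp(k_t) * v_t  # fold the current token into the state
        b = decay * b + torch.exp(k_t)
        return out, (a, b)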

safari

Convolutions for Sequence Modeling

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

TimeChamber

A Massively Parallel Large Scale Self-Play Framework

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Titans

A collection of models built with ColossalAI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0