Xiang Jiang's starred repositories
llm-foundry
LLM training code for Databricks foundation models
alpaca-lora
Instruct-tune LLaMA on consumer hardware
Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
deep-learning-containers
AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet.
ColossalAI
Making large AI models cheaper, faster and more accessible
cdx_toolkit
A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine
huggingface_hub
The official Python client for the Hugging Face Hub.
Megatron-LM
Ongoing research training transformer models at scale
LaMDA-rlhf-pytorch
Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
cc-crawl-statistics
Statistics of Common Crawl monthly archives mined from URL index files
news-please
An integrated web crawler and information extractor for news that just works