Han Yang's starred repositories
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including BERT & GPT-2
Megatron-LM
Ongoing research training transformer models at scale
pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local knowledge-base question-answering app built on Langchain and LLMs such as ChatGLM
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
facebook-hive-udfs
Facebook's Hive UDFs
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
LLMsPracticalGuide
A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
the-algorithm-ml
Source code for Twitter's Recommendation Algorithm
the-algorithm
Source code for Twitter's Recommendation Algorithm
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
python_backend
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
Algorithm-Practice-in-Industry
A collection of industry practice articles on search, recommendation, advertising, and user growth (sources: Zhihu, DataFunTalk, tech WeChat accounts)
AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
alphaFM_softmax
Multi-threaded implementation of Factorization Machines with FTRL for multi-class classification problems, using softmax as the hypothesis.
Diffusion-LM
Diffusion-LM
LexiconAugmentedNER
Rejecting complicated operations for incorporating lexicons into Chinese NER.