Han Yang's starred repositories
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including BERT & GPT-2
Megatron-LM
Ongoing research training transformer models at scale
pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local knowledge-base question-answering app built on Langchain and LLMs such as ChatGLM
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
facebook-hive-udfs
Facebook's Hive UDFs
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
LLMsPracticalGuide
A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
the-algorithm-ml
Source code for Twitter's Recommendation Algorithm
the-algorithm
Source code for Twitter's Recommendation Algorithm
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
python_backend
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
Algorithm-Practice-in-Industry
A collection of industry practice articles on search, recommendation, advertising, and user growth (sources: Zhihu, DataFunTalk, tech WeChat accounts)
AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
alphaFM_softmax
Multi-threaded implementation of Factorization Machines with FTRL for multi-class classification problems, using softmax as the hypothesis.
Diffusion-LM
Diffusion-LM
LexiconAugmentedNER
Rejecting complicated operations for incorporating lexicons into Chinese NER.