ldwang's repositories
Megatron-LM
Ongoing research training transformer models at scale
Aurora
Aurora is a Chinese-language MoE model. It is a further work based on Mixtral-8x7B that activates the model's chat capability in the Chinese open domain.
bagel
A bagel, with everything.
causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
LLaMA2-Accessory
An Open-source Toolkit for LLM Development
LLM-Shearing
Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
malaya
Natural Language Toolkit for the Malaysian language, https://malaya.readthedocs.io/
mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
mamba.py
An efficient Mamba implementation in PyTorch and MLX.
MiniCPM
MiniCPM-2.4B: An end-side LLM that outperforms Llama2-13B.
MixtralKit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
OLMo
Modeling, training, eval, and inference code for OLMo
open-interpreter
A natural language interface for computers
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
stable-weight-decay-regularization
[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.
TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.