ldwang's repositories
Megatron-LM
Ongoing research training transformer models at scale
Aurora
Aurora is a Chinese-language MoE model. It is a further work based on Mixtral-8x7B that activates the model's chat capability in the Chinese open domain.
bagel
A bagel, with everything.
causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
LLaMA2-Accessory
An Open-source Toolkit for LLM Development
LLM-Shearing
Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
malaya
Natural Language Toolkit for the Malaysian language, https://malaya.readthedocs.io/
mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
mamba.py
An efficient Mamba implementation in PyTorch and MLX.
MiniCPM
MiniCPM-2.4B: An end-side LLM that outperforms Llama2-13B.
MixtralKit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
OLMo
Modeling, training, eval, and inference code for OLMo
open-interpreter
A natural language interface for computers
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
stable-weight-decay-regularization
[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.
TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.