ldwang's repositories


Megatron-LM

Ongoing research training transformer models at scale

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0

Aurora

Aurora is a Chinese-language MoE model. It is a follow-up work built on Mixtral-8x7B that activates the model's Chinese open-domain chat capability.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

bagel

A bagel, with everything.

Language: Python · Stargazers: 0 · Issues: 0

causal-conv1d

Causal depthwise conv1d in CUDA, with a PyTorch interface

Language: Cuda · License: BSD-3-Clause · Stargazers: 0 · Issues: 0
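For reference, the operation this kernel accelerates is just a depthwise 1-D convolution with left-only padding. The sketch below is a plain-PyTorch stand-in (function name and shapes are illustrative), not the repo's CUDA kernel or its Python interface.

```python
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x, weight):
    """Reference causal depthwise conv1d in plain PyTorch (illustrative, not the CUDA kernel).

    x:      (batch, channels, seqlen)
    weight: (channels, kernel_size), one filter per channel
    """
    channels, kernel_size = weight.shape
    # Left-pad so each output position sees only current and past inputs (causality).
    x = F.pad(x, (kernel_size - 1, 0))
    # groups=channels makes the convolution depthwise: each channel gets its own filter.
    return F.conv1d(x, weight.unsqueeze(1), groups=channels)

x = torch.randn(2, 8, 16)
w = torch.randn(8, 4)
print(causal_depthwise_conv1d(x, w).shape)  # torch.Size([2, 8, 16])
```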

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

doremi

PyTorch implementation of DoReMi, a method for optimizing data mixture weights in language modeling datasets

Language: HTML · License: MIT · Stargazers: 0 · Issues: 0
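For context, DoReMi reweights training domains with a multiplicative-weights update driven by the excess loss of a small proxy model over a reference model. The sketch below is a schematic of that update as the paper describes it; the function and hyperparameter names (`eta`, `smoothing`) are illustrative, not this repo's API.

```python
import torch

def doremi_weight_update(domain_weights, proxy_loss, ref_loss, eta=1.0, smoothing=1e-3):
    """One schematic DoReMi-style domain-weight update (illustrative).

    domain_weights: (k,) current mixture weights over k domains (sums to 1)
    proxy_loss:     (k,) per-domain loss of the small proxy model
    ref_loss:       (k,) per-domain loss of the pretrained reference model
    """
    # Excess loss: how much worse the proxy is than the reference on each domain.
    excess = torch.clamp(proxy_loss - ref_loss, min=0.0)
    # Exponentiated-gradient (multiplicative-weights) step, then renormalize.
    w = domain_weights * torch.exp(eta * excess)
    w = w / w.sum()
    # Mix with uniform so no domain's weight collapses to zero.
    return (1 - smoothing) * w + smoothing / w.numel()
```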

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0
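As a refresher on what a Mixture-of-Experts feed-forward block computes, here is a simplified top-k routed MoE layer in PyTorch. It is a generic sketch (class and parameter names are illustrative, and it omits the load-balancing loss), not code taken from LLaMA-MoE.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Simplified top-k routed Mixture-of-Experts feed-forward block (illustrative)."""

    def __init__(self, d_model, d_ff, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.router(x)                 # (tokens, n_experts)
        top_w, top_idx = logits.topk(self.k, dim=-1)
        top_w = F.softmax(top_w, dim=-1)        # renormalize over the k chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e    # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```

Only k of the n_experts experts run for each token, which is how MoE models keep per-token compute close to that of a much smaller dense model.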

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0

LLM-Shearing

Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

malaya

Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Issues: 0

mamba-chat

Mamba-Chat: A chat LLM based on the state-space model architecture 🐍

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0
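The core of such an implementation is the discretized state-space recurrence h_t = exp(Δ_t A) h_{t-1} + Δ_t B_t u_t with readout y_t = C_t h_t. Below is a naive sequential sketch of that scan with simplified shapes; it is illustrative, not the repo's exact code.

```python
import torch

def selective_scan_ref(u, delta, A, B, C):
    """Naive sequential SSM scan (illustrative shapes, no hardware-aware tricks).

    u:     (batch, seqlen, d)   input sequence
    delta: (batch, seqlen, d)   per-step discretization step sizes
    A:     (d, n)               state matrix (diagonal entries per channel)
    B, C:  (batch, seqlen, n)   input-dependent projections, as in Mamba
    """
    batch, seqlen, d = u.shape
    h = torch.zeros(batch, d, A.shape[1])                      # hidden state
    ys = []
    for t in range(seqlen):
        dA = torch.exp(delta[:, t, :, None] * A)               # discretized A: (batch, d, n)
        dBu = delta[:, t, :, None] * B[:, t, None, :] * u[:, t, :, None]
        h = dA * h + dBu                                       # state update
        ys.append((h * C[:, t, None, :]).sum(-1))              # readout: (batch, d)
    return torch.stack(ys, dim=1)                              # (batch, seqlen, d)
```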

mamba.py

An efficient Mamba implementation in PyTorch and MLX.

Language: Python · Stargazers: 0 · Issues: 0

MiniCPM

MiniCPM-2.4B: An end-side (on-device) LLM that outperforms Llama2-13B.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

MixtralKit

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

open-interpreter

A natural language interface for computers

Language: Python · License: AGPL-3.0 · Stargazers: 0 · Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C · License: MIT · Stargazers: 0 · Issues: 0

stable-weight-decay-regularization

[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0