ShenDezhou

Tsinghua University

Beijing

http://www.tsinghuaboy.com

ShenDezhou's starred repositories

llm-playground

Experiments with open source LLMs

Language: Python | License: MIT | 5300

litellm

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

Language: Python | License: NOASSERTION | 939900

ollama

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

Language: Go | License: MIT | 7057300

LoftQ

Language: Python | License: MIT | 16800

LDDL

Distributed preprocessing and data loading for language datasets

Language: Python | License: NOASSERTION | 3600

character-bert-pretraining

Code for pre-training CharacterBERT models (as well as BERT models).

Language: Python | License: Apache-2.0 | 3400

ColossalAI-Examples

Examples of training models with hybrid parallelism using ColossalAI

Language: Python | License: Apache-2.0 | 33300

LLM-Workshop

LLM Workshop by Sourab Mangrulkar

Language: Jupyter Notebook | License: Apache-2.0 | 28300

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python | License: Apache-2.0 | 716500

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | 3811200

Llama2-chinese

Llama2 Chinese fine-tuning

Language: Python | License: MIT | 3700

llama2-lora-fine-tuning

Llama2 fine-tuning with DeepSpeed and LoRA

Language: Python | License: MIT | 15300

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | 994400

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python | License: Apache-2.0 | 1038100

localpilot

Language: Python | License: MIT | 331600

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | 929800

Awesome-LLM-Inference

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 | 160800

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 699100

vscode-extension-samples

Sample code illustrating the VS Code extension API.

Language: TypeScript | License: MIT | 824400

MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Language: Python | License: MIT | 4042800

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | 5937700

zero_shot_cot

Prod Env

Language: Python | 34900

natural-instructions

Expanding natural instructions

Language: Python | License: Apache-2.0 | 91100

nanoT5

Fast & Simple repository for pre-training and fine-tuning T5-style models

Language: Python | License: Apache-2.0 | 93100

bert_distill

BERT distillation (distillation experiments based on BERT)

Language: Python | 30400

beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Language: Python | License: Apache-2.0 | 142500

duckdb-pgq

DuckDB is an in-process SQL OLAP Database Management System

Language: C++ | License: MIT | 3400

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | 1062600

duckdb

DuckDB is an in-process SQL OLAP Database Management System

Language: C++ | License: MIT | 1772300

arrow-tools

A collection of handy CLI tools to convert CSV and JSON to Apache Arrow and Parquet

Language: Rust | License: Apache-2.0 | 12600