Beast code in Giters

Yanqi Zhang's starred repositories

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Language: Python | MIT | 167222 1554 2692

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | MIT | 93044 682 7669

llama.cpp

LLM inference in C/C++

Language: C++ | MIT | 65772 549 3816

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | Apache-2.0 | 34972 342 2747

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | Apache-2.0 | 29388 339 268

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | Apache-2.0 | 27795 228 4680

llama3

The official Meta Llama 3 GitHub site

Language: Python | NOASSERTION | 26458 219 245

MemGPT

Letta (fka MemGPT) is a framework for creating stateful LLM services.

Language: Python | Apache-2.0 | 11877 115 750

StreamDiffusion

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Language: Python | Apache-2.0 | 9515 79 117

FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

Language: Python | Apache-2.0 | 9124 111 81

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | NOASSERTION | 8435 74 530

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | Apache-2.0 | 8324 89 1829

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | Apache-2.0 | 8233 72 409

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Language: Python | BSD-3-Clause | 8133 139 3736

llama-cpp-python

Python bindings for llama.cpp

Language: Python | MIT | 7833 71 1103

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Language: Python | Apache-2.0 | 6855 123 434

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | MIT | 6595 37 1093

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ | Apache-2.0 | 5807 62 625

Awesome-Incremental-Learning

Awesome Incremental Learning

3714 132 46

fairscale

PyTorch extensions for high performance and large scale training.

Language: Python | NOASSERTION | 3165 46 359

Ax

Adaptive Experimentation Platform

Language: Python | MIT | 2353 69 736

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | Apache-2.0 | 1213 16 106

torchgpipe

A GPipe implementation in PyTorch

Language: Python | BSD-3-Clause | 807 33 33

rouge

A full Python Implementation of the ROUGE Metric (not a wrapper)

Language: Python | Apache-2.0 | 666 8 49

Freeflow

High performance container overlay networks on Linux. Enabling RDMA (on both InfiniBand and RoCE) and accelerating TCP to bare metal performance. Freeflow requires zero modification on application code/binary.

Language: C | MIT | 600 34 22

aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services

Language: Python | NOASSERTION | 444 67 808

Awesome-LLM-Eval

Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for the evaluation of foundation LLMs, aiming to explore the technical boundaries of generative AI.

MIT | 405 9 1

pipedream

Language: Python | MIT | 375 18 73

k8s-dra-driver

Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes

Language: Go | Apache-2.0 | 237 15 41

ML-Murphy

Complete solutions for exercises and MATLAB example codes for "Machine Learning: A Probabilistic Perspective" 1/e by K. Murphy

Language: C++ | 235 5 3