Yanqi Zhang (zyqCSL)

Company: Cornell

Yanqi Zhang's starred repositories

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Language: Python | License: MIT | Stargazers: 167222 | Issues: 1554 | Issues: 2692

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | License: MIT | Stargazers: 93044 | Issues: 682 | Issues: 7669

llama.cpp

LLM inference in C/C++

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 34972 | Issues: 342 | Issues: 2747

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 29388 | Issues: 339 | Issues: 268

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 27795 | Issues: 228 | Issues: 4680

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 26458 | Issues: 219 | Issues: 245

MemGPT

Letta (fka MemGPT) is a framework for creating stateful LLM services.

Language: Python | License: Apache-2.0 | Stargazers: 11877 | Issues: 115 | Issues: 750

StreamDiffusion

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Language: Python | License: Apache-2.0 | Stargazers: 9515 | Issues: 79 | Issues: 117

FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

Language: Python | License: Apache-2.0 | Stargazers: 9124 | Issues: 111 | Issues: 81

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | Stargazers: 8435 | Issues: 74 | Issues: 530

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 8324 | Issues: 89 | Issues: 1829

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | License: Apache-2.0 | Stargazers: 8233 | Issues: 72 | Issues: 409

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Language: Python | License: BSD-3-Clause | Stargazers: 8133 | Issues: 139 | Issues: 3736

llama-cpp-python

Python bindings for llama.cpp

Language: Python | License: MIT | Stargazers: 7833 | Issues: 71 | Issues: 1103

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Language: Python | License: Apache-2.0 | Stargazers: 6855 | Issues: 123 | Issues: 434

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 6595 | Issues: 37 | Issues: 1093

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ | License: Apache-2.0 | Stargazers: 5807 | Issues: 62 | Issues: 625

fairscale

PyTorch extensions for high performance and large scale training.

Language: Python | License: NOASSERTION | Stargazers: 3165 | Issues: 46 | Issues: 359

Ax

Adaptive Experimentation Platform

Language: Python | License: MIT | Stargazers: 2353 | Issues: 69 | Issues: 736

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 1213 | Issues: 16 | Issues: 106

torchgpipe

A GPipe implementation in PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 807 | Issues: 33 | Issues: 33

rouge

A full Python Implementation of the ROUGE Metric (not a wrapper)

Language: Python | License: Apache-2.0 | Stargazers: 666 | Issues: 8 | Issues: 49

Freeflow

High performance container overlay networks on Linux. Enabling RDMA (on both InfiniBand and RoCE) and accelerating TCP to bare metal performance. Freeflow requires zero modification on application code/binary.

aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services

Language: Python | License: NOASSERTION | Stargazers: 444 | Issues: 67 | Issues: 808

Awesome-LLM-Eval

Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs, and models, mainly for evaluating foundation LLMs and aiming to explore the technical boundaries of generative AI.

k8s-dra-driver

Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes

Language: Go | License: Apache-2.0 | Stargazers: 237 | Issues: 15 | Issues: 41

ML-Murphy

Complete solutions for exercises and MATLAB example codes for "Machine Learning: A Probabilistic Perspective" 1/e by K. Murphy