Peter Yeh's repositories
Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
FasterTransformer
Transformer-related optimizations, including BERT and GPT
flash-attention
Fast and memory-efficient exact attention
triton
Development repository for the Triton language and compiler
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
ChatDev
Create customized software from a natural-language idea (through LLM-powered multi-agent collaboration)
composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
gpt-researcher
A GPT-based autonomous agent that performs comprehensive online research on any given topic
human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
llama
Inference code for LLaMA models
llama-recipes
Examples and recipes for the Llama 2 model
llama.cpp
Port of Facebook's LLaMA model in C/C++
llm.c
LLM training in simple, raw C/CUDA
llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
stable-diffusion
A latent text-to-image diffusion model
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
torchtune
A native-PyTorch library for LLM fine-tuning
transformers
🤗 Transformers: State-of-the-art machine learning for PyTorch, TensorFlow, and JAX.