Beast code in Giters

Alex's starred repositories

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python, License: Apache-2.0, 194500

web-llm

High-performance In-browser LLM Inference Engine

Language: TypeScript, License: Apache-2.0, 1334300

ml-engineering

Machine Learning Engineering Open Book

Language: Python, License: CC-BY-SA-4.0, 1142500

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

Language: Python, License: Apache-2.0, 87200

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook, License: Apache-2.0, 963900

moe_attention

Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"

Language: Python, License: MIT, 8900

oneflow

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

Language: C++, License: Apache-2.0, 588400

dilated-attention-pytorch

(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (https://arxiv.org/abs/2307.02486)

Language: Python, License: MIT, 5000

dilated-self-attention

Implementation of the dilated self attention as described in "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

Language: Python, License: MIT, 1300

TorchIntegral

Integral Neural Networks in PyTorch

Language: Python, License: Apache-2.0, 12100

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.

Language: Jupyter Notebook, License: AGPL-3.0, 281900

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Language: Python, License: NOASSERTION, 16760700

tomesd

Speed up Stable Diffusion with this one simple trick!

Language: Python, License: MIT, 127400

web-stable-diffusion

Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.

Language: Jupyter Notebook, License: Apache-2.0, 356700

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

License: NOASSERTION, 2689400

nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Language: Python, License: MIT, 1402100

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++, License: Apache-2.0, 582800

MegCC

MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability.

Language: C++, License: Apache-2.0, 47300

dmls-book

Summaries and resources for Designing Machine Learning Systems book (Chip Huyen, O'Reilly 2022)

225500

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python, License: Apache-2.0, 188300

GenNAS

Generic Neural Architecture Search via Regression (NeurIPS'21 Spotlight)

Language: Python, 3600

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python, License: NOASSERTION, 848900

FLASHQuad_pytorch

Language: Python, License: MIT, 6600

diffusion_distiller

🚀 PyTorch Implementation of "Progressive Distillation for Fast Sampling of Diffusion Models" (v-diffusion)

Language: Python, License: MIT, 21600

FLASH-pytorch

Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"

Language: Python, License: MIT, 34500

TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Language: Python, License: BSD-3-Clause, 256500

guided-inpainting

Towards Unified Keyframe Propagation Models

Language: Python, License: MIT, 23300

einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Language: Python, License: MIT, 844400

flash-attention

Fast and memory-efficient exact attention

Language: Python, License: BSD-3-Clause, 1380400

CanyonWind