Beast code in Giters

Yiming Liu's repositories

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda, License: MIT

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python, License: MIT

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python, License: Apache-2.0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language: Python, License: Apache-2.0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++, License: Apache-2.0

triton

Development repository for the Triton language and compiler

Language: C++, License: MIT

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python, License: Apache-2.0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

License: Apache-2.0

CogVLM

A state-of-the-art open visual language model | multimodal pretrained model

License: Apache-2.0

ComfyUI

A powerful and modular stable diffusion GUI with a graph/nodes interface.

Language: Python, License: GPL-3.0

ComfyUI-AnimateDiff-Evolved

Improved AnimateDiff for ComfyUI

Language: Python, License: Apache-2.0

CUDALibrarySamples

CUDA Library Samples

License: NOASSERTION

cutlass

CUDA Templates for Linear Algebra Subroutines

License: NOASSERTION

EAGLE

Official Implementation of EAGLE-1 and EAGLE-2

License: Apache-2.0

flash-attention

Fast and memory-efficient exact attention

Language: Python, License: BSD-3-Clause

hivemind

Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

License: MIT

lilianweng.github.io

My personal page

llama_index

LlamaIndex (formerly GPT Index) is a data framework for your LLM applications

Language: Python, License: MIT

mamba

Language: Python, License: Apache-2.0

Megatron-LM

Ongoing research training transformer models at scale

License: NOASSERTION

mem0

The memory layer for Personalized AI

mergekit

Tools for merging pretrained large language models.

Language: Python, License: LGPL-3.0

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

License: MIT

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.

Language: Python, License: MIT

pykan

Kolmogorov Arnold Networks

License: MIT

ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch

Language: Python, License: MIT

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

License: Apache-2.0

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++, License: Apache-2.0

TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.

License: NOASSERTION

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python, License: Apache-2.0