Yiming Liu (Yiming992)

Yiming992

Geek Repo

Company:NVIDIA

Location:Beijing

Github PK Tool:Github PK Tool

Yiming Liu's repositories

llm.c

LLM training in simple, raw C/CUDA

Language:CudaLicense:MITStargazers:0Issues:0Issues:0

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language:C++License:Apache-2.0Stargazers:0Issues:0Issues:0

triton

Development repository for the Triton language and compiler

Language:C++License:MITStargazers:0Issues:0Issues:0

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

License:Apache-2.0Stargazers:0Issues:0Issues:0

CogVLM

a state-of-the-art-level open visual language model | 多模态预训练模型

License:Apache-2.0Stargazers:0Issues:0Issues:0

ComfyUI

A powerful and modular stable diffusion GUI with a graph/nodes interface.

Language:PythonLicense:GPL-3.0Stargazers:0Issues:0Issues:0

ComfyUI-AnimateDiff-Evolved

Improved AnimateDiff for ComfyUI

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

CUDALibrarySamples

CUDA Library Samples

License:NOASSERTIONStargazers:0Issues:0Issues:0

cutlass

CUDA Templates for Linear Algebra Subroutines

License:NOASSERTIONStargazers:0Issues:0Issues:0

EAGLE

Official Implementation of EAGLE-1 and EAGLE-2

License:Apache-2.0Stargazers:0Issues:0Issues:0

flash-attention

Fast and memory-efficient exact attention

Language:PythonLicense:BSD-3-ClauseStargazers:0Issues:0Issues:0

hivemind

Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

License:MITStargazers:0Issues:0Issues:0

lilianweng.github.io

My personal page

Stargazers:0Issues:0Issues:0

llama_index

LlamaIndex (formerly GPT Index) is a data framework for your LLM applications

Language:PythonLicense:MITStargazers:0Issues:0Issues:0
Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

Megatron-LM

Ongoing research training transformer models at scale

License:NOASSERTIONStargazers:0Issues:0Issues:0

mem0

The memory layer for Personalized AI

Stargazers:0Issues:0Issues:0

mergekit

Tools for merging pretrained large language models.

Language:PythonLicense:LGPL-3.0Stargazers:0Issues:0Issues:0

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

License:MITStargazers:0Issues:0Issues:0

Open-Sora-Plan

This project aim to reproducing Sora (Open AI T2V model), but we only have limited resource. We deeply wish the all open source community can contribute to this project.

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

pykan

Kolmogorov Arnold Networks

License:MITStargazers:0Issues:0Issues:0

ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

License:Apache-2.0Stargazers:0Issues:0Issues:0

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language:C++License:Apache-2.0Stargazers:0Issues:0Issues:0

TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.

License:NOASSERTIONStargazers:0Issues:0Issues:0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0