Yan Yucheng (EzioZz)

Company: University of Chinese Academy of Sciences

Location: Beijing

Yan Yucheng's starred repositories

triton

Development repository for the Triton language and compiler

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | Stargazers: 10617 | Issues: 157 | Issues: 3720

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 8340 | Issues: 90 | Issues: 1834

ROCm

AMD ROCm™ Software - GitHub Home

Language: Shell | License: MIT | Stargazers: 4542 | Issues: 215 | Issues: 2365

ucasthesis

LaTeX Thesis Template for the University of Chinese Academy of Sciences

lagent

A lightweight framework for building LLM-based agents

Language: Python | License: Apache-2.0 | Stargazers: 1775 | Issues: 17 | Issues: 63

starcoder2

Home of StarCoder2!

Language: Python | License: Apache-2.0 | Stargazers: 1741 | Issues: 16 | Issues: 18

torch-mlir

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

Language: C++ | License: NOASSERTION | Stargazers: 1320 | Issues: 248 | Issues: 693

PatrickStar

PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.

Language: Python | License: BSD-3-Clause | Stargazers: 747 | Issues: 16 | Issues: 56

Wrapper_VideoStation

Synology VideoStation and DLNA FFmpeg Wrapper with AAC, DTS, EAC3 and TrueHD support via pipes (now with GStreamer support). It enables full hardware transcoding from Synology's FFmpeg for video and transcoding DTS, EAC3, TrueHD and AAC from the SynoCommunity's FFmpeg only when necessary.

buddy-mlir

An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).

Language: C++ | License: Apache-2.0 | Stargazers: 495 | Issues: 13 | Issues: 52

BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Language: Python | License: MIT | Stargazers: 363 | Issues: 15 | Issues: 64

rocBLAS

Next generation BLAS implementation for ROCm platform

Language: C++ | License: NOASSERTION | Stargazers: 339 | Issues: 58 | Issues: 156

composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

Language: C++ | License: NOASSERTION | Stargazers: 298 | Issues: 24 | Issues: 216

FlagGems

FlagGems is an operator library for large language models implemented in Triton Language.

Language: Python | License: Apache-2.0 | Stargazers: 282 | Issues: 15 | Issues: 27

xdsl

A Python Compiler Design Toolkit

Language: Python | License: NOASSERTION | Stargazers: 256 | Issues: 19 | Issues: 428
Language: Python | License: Apache-2.0 | Stargazers: 194 | Issues: 28 | Issues: 115

LightSeq

Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Language: C++ | License: Apache-2.0 | Stargazers: 155 | Issues: 4 | Issues: 44

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 131 | Issues: 10 | Issues: 36

ppcg

Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)

Language: C | License: MIT | Stargazers: 117 | Issues: 13 | Issues: 2

rocSPARSE

Next generation SPARSE implementation for ROCm platform

Language: C++ | License: MIT | Stargazers: 115 | Issues: 36 | Issues: 39

isl

Integer Set Library (source repository: http://repo.or.cz/w/isl.git)

Language: C | License: MIT | Stargazers: 63 | Issues: 9 | Issues: 4

MISA

Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)

Language: Python | License: MIT | Stargazers: 34 | Issues: 25 | Issues: 15

Triton-Compiler

Triton Compiler related materials.

License: MIT | Stargazers: 27 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 21 | Issues: 1 | Issues: 0

tvm-gdb-commands

Small set of gdb commands for useful tasks in tvm

Language: Python | License: MIT | Stargazers: 17 | Issues: 1 | Issues: 0

swDNN

A highly efficient library for deep neural networks based on the Sunway TaihuLight supercomputer.

Language: Roff | Stargazers: 14 | Issues: 3 | Issues: 0

swGEMM

A highly efficient library for GEMM operations on Sunway TaihuLight

antlr-4

Learn ANTLR 4 (with C++ examples and CMake)

Language: C++ | Stargazers: 5 | Issues: 2 | Issues: 0