yuguo (yuguo-Jack)

Company: Sugon

Location: ZhengZhou

Home Page: yuguo960516@outlook.com

yuguo's starred repositories

llm-inference-benchmark

LLM Inference benchmark

Language: Python | License: MIT | Stargazers: 305 | Issues: 0

HIP-Performance-Optmization-on-VEGA64

14 basic topics for VEGA64 performance optimization

Language: C++ | Stargazers: 47 | Issues: 0

dlsys_solution

Homework solutions for CMU 10-414/714 – Deep Learning Systems: Algorithms and Implementation

Language: Python | Stargazers: 39 | Issues: 0

Language: C++ | License: MIT | Stargazers: 196 | Issues: 0

hipBLASLt

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library

Language: Assembly | License: MIT | Stargazers: 46 | Issues: 0

amd-lab-notes

AMD lab notes with code examples to demonstrate use of AMD GPUs

Language: C++ | License: MIT | Stargazers: 87 | Issues: 0

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1777 | Issues: 0

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

Language: Python | License: NOASSERTION | Stargazers: 15664 | Issues: 0

Language: C++ | License: Apache-2.0 | Stargazers: 5 | Issues: 0

ChatGLM-Efficient-Tuning

Fine-tuning ChatGLM-6B with PEFT

Language: Python | License: Apache-2.0 | Stargazers: 3639 | Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 35997 | Issues: 0

how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Language: Cuda | Stargazers: 1291 | Issues: 0

FlagAI

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.

Language: Python | License: Apache-2.0 | Stargazers: 3814 | Issues: 0

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 38437 | Issues: 0

AISystem

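AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks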

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 9898 | Issues: 0

oneflow

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

Language: C++ | License: Apache-2.0 | Stargazers: 5819 | Issues: 0

Paddle

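PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)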

Language: C++ | License: Apache-2.0 | Stargazers: 21950 | Issues: 0

DeepBench

Benchmarking Deep Learning operations on different hardware

Language: C++ | License: Apache-2.0 | Stargazers: 1062 | Issues: 0

Language: Jupyter Notebook | Stargazers: 76 | Issues: 0

rocBLAS

Next generation BLAS implementation for ROCm platform

Language: C++ | License: NOASSERTION | Stargazers: 335 | Issues: 0

Tensile

Stretching GPU performance for GEMMs and tensor contractions.

Language: Python | License: MIT | Stargazers: 204 | Issues: 0

VkFFT

Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library

Language: C++ | License: MIT | Stargazers: 1492 | Issues: 0

gearshifft

Benchmark Suite for Heterogeneous FFT Implementations

Language: C++ | License: Apache-2.0 | Stargazers: 34 | Issues: 0

rocFFT

Next generation FFT implementation for ROCm

Language: C++ | License: NOASSERTION | Stargazers: 155 | Issues: 0

cuda-samples

Samples for CUDA Developers which demonstrate features in CUDA Toolkit

Language: C | License: NOASSERTION | Stargazers: 5889 | Issues: 0

ROCm

AMD ROCm™ Software - GitHub Home

Language: Shell | License: MIT | Stargazers: 4404 | Issues: 0

Language: Cuda | Stargazers: 2035 | Issues: 0

pyadi-iio

Python interfaces for ADI hardware with IIO drivers (aka peyote)

Language: Python | License: NOASSERTION | Stargazers: 136 | Issues: 0

heterocl

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing

Language: Python | License: Apache-2.0 | Stargazers: 321 | Issues: 0

DeepLearningC

Simple program to learn CNN (LeNet-5) in pure C

Language: C++ | Stargazers: 266 | Issues: 0