Anthony Chang (rosenrodt)

Anthony Chang's repositories

compute

A C++ GPU Computing Library for OpenCL

Language: C++ | License: BSL-1.0 | Stargazers: 1 | Issues: 0 | Issues: 0

AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

bug_opencl_boost_compute

Minimal example for reproducing a segfault issue with Boost.Compute

Language: CMake | Stargazers: 0 | Issues: 0 | Issues: 0

CMakeExamples

Examples for understanding how to leverage CMake effectively

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0
Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Assembly | Stargazers: 0 | Issues: 1 | Issues: 0

HIP-Performance-Optmization-on-VEGA64

14 basic topics for VEGA64 performance optimization

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

HIPIFY

HIPIFY: Convert CUDA to Portable C++ Code

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

rocBLAS

Next-generation BLAS implementation for the ROCm platform

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

rocFFT

Next-generation FFT implementation for ROCm

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

SGEMM_on_VEGA

An alternative SGEMM implementation for the AMD Vega series

Language: Assembly | Stargazers: 0 | Issues: 0 | Issues: 0

Tensile

Stretching GPU performance for GEMMs and tensor contractions.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0