carlushuang's repositories
cpu_gemm_opt
How to design a CPU GEMM on x86 with 256-bit AVX that can beat OpenBLAS.
FFT_implement
FFT/IFFT, R2C/C2R, 2D R2C/2D C2R, convolution, correlation, tiled FFT, SRFFT, PFA, radix-2/3/5
deepcore_source_code
Partial source code of deepcore v0.7
amdgpu-jit
Test project for AMDGPU codegen
binutils-gdb
Unofficial mirror of sourceware binutils-gdb repository. Updated daily.
CWBVH
An implementation of NVIDIA's paper "Efficient Incoherent Ray Traversal on GPUs Through Compressed Wide BVHs"
D3D12nBodyGravity_clang
D3D12nBodyGravity example with clang build
HIP-Examples
Examples for HIP
Mandelbrot-Set
Mandelbrot set
miopen-benchmark
Benchmarking MIOpen
mlir
"Multi-Level Intermediate Representation" Compiler Infrastructure
rocm-recipes
Recipes for ROCm
xbyak
A JIT assembler for x86 (IA-32) / x64 (AMD64, x86-64) supporting MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX/AVX2/AVX-512, provided as a C++ header