alphaRGB's starred repositories
DL_Compiler
Study Group of Deep Learning Compiler
mlir-tutorial
MLIR For Beginners tutorial
mlir-tutorial
Hands-On Practical MLIR Tutorial
tvm_mlir_learn
A collection of compiler learning resources.
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
onnxsim_large_model
Simplify large (>2GB) ONNX models
llama.onnx
LLaMa/RWKV ONNX models, quantization, and test cases
export_llama_to_onnx
Export LLaMA to ONNX
SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
ShallowSpeed
Small-scale distributed training of sequential deep learning models, built on NumPy and MPI.
TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
compiler-explorer
Run compilers interactively from your web browser and interact with the assembly
daily-accounting
A small website built with Django to record daily income and expenses and show charts and statistics
wmma_extension
An extension library of WMMA API (Tensor Core API)
Pytorch-Memory-Utils
PyTorch memory tracking code
scale-sim-v2
Repository to host and maintain scale-sim-v2 code
llm-cost-estimator
Estimating hardware and cloud costs of LLMs and transformer projects
MyCudaCode
Some CUDA practice code