alphaRGB's starred repositories
compiler-explorer
Run compilers interactively from your web browser and interact with the assembly
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
tvm_mlir_learn
A collection of compiler learning resources.
Pytorch-Memory-Utils
Code for tracking PyTorch memory usage
mlir-tutorial
MLIR For Beginners tutorial
llama.onnx
LLaMA/RWKV ONNX models, quantization, and test cases
SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
scale-sim-v2
Repository to host and maintain scale-sim-v2 code
mlir-tutorial
Hands-On Practical MLIR Tutorial
DL_Compiler
Study Group of Deep Learning Compiler
TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
wmma_extension
An extension library of WMMA API (Tensor Core API)
export_llama_to_onnx
Export LLaMA to ONNX
ShallowSpeed
Small-scale distributed training of sequential deep learning models, built on NumPy and MPI.
daily-accounting
A small website built with Django to record daily income and expenses and show charts and statistics
onnxsim_large_model
Simplify large (>2 GB) ONNX models
llm-cost-estimator
Estimating hardware and cloud costs of LLMs and transformer projects
MyCudaCode
Some CUDA code written for practice