alphaRGB's starred repositories
scale-sim-v2
Repository to host and maintain scale-sim-v2 code
llm-cost-estimator
Estimating hardware and cloud costs of LLMs and transformer projects
MyCudaCode
Some CUDA code written for practice
how-to-optimize-gemm
row-major matmul optimization
YHs_Sample
Yinghan's Code Sample
parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
How_to_optimize_in_GPU
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
QvPlugin-Trojan
Use Trojan in Qv2ray, thanks to Trojan-Qt5 0.x
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
profiler-workshop
Example code for profiler workshop