Difer's starred repositories
KuiperLLama
Build a large-model inference framework by hand
CUDA-Learn-Notes
🎉CUDA/C++ notes / hand-written CUDA for large models / tech blog, updated occasionally: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Megatron-LM
Ongoing research training transformer models at scale
NeMo-Framework-Launcher
Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.
HuggingFace-Download-Accelerator
Uses HuggingFace's official download tool to download at high speed from a mirror site.
How_to_optimize_in_GPU
A series on GPU optimization that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
CUDATutorial
A self-learning tutorial for CUDA High Performance Programming.
Time-Series-Library
A Library for Advanced Deep Time Series Models.
coder-kung-fu
Building up a developer's core fundamentals
Cpp_Primer_Practice
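Master C++ :punch:. A study repository for the Chinese edition of C++ Primer (5th edition), including notes and answers to the end-of-chapter exercises.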
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
PaddleCustomDevice
PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle)
Python-Interview-Customs-Collection
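A guide to passing Python interviews. For anyone in autumn or spring campus recruiting ✿✿ヽ(°▽°)ノ✿ who is interviewing for Python development roles, this one repo is all you need 😘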