Xiaoyu Zhang's repositories
tvm_mlir_learn
A collection of compiler learning resources.
how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
how-to-learn-deep-learning-framework
How to learn PyTorch and OneFlow.
giantpandacv.com
www.giantpandacv.com
mlc-llm-code-analysis
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
cpp_related_tips
📚 A summary of basic knowledge for C/C++ technical interviews, covering the language, program libraries, data structures, algorithms, systems, networking, linking and loading, as well as interview experience, recruitment, and referral information.
opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (LLaMA, LLaMA2, ChatGLM2, ChatGPT, Claude, etc.) over 50+ datasets.
How_to_optimize_in_GPU
This is a series of GPU optimization topics that introduces in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm; the performance of these kernels is at or near the theoretical limit.
tokenizers-cpp
Universal cross-platform tokenizer bindings to HuggingFace tokenizers and SentencePiece.
FasterTransformer
Transformer-related optimizations, including BERT and GPT.
LLaMA-Factory
An easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM).
stb_image_example
stb-based image encoder/decoder example (C++).
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
tvm_gpu_gemm
Playing with GEMM in TVM.