wong_hs's repositories
AI_compiler_development_guide
Free resource for the book AI Compiler Development Guide
book
Backup of some books
brpc
brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertising, and recommendation. "brpc" means "better RPC".
chipyard
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
Coyote
Framework providing operating system abstractions and a range of shared networking (RDMA, TCP/IP) and memory services to common modern heterogeneous platforms.
CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
DeepLearningSystem
An introduction to the core principles of deep learning systems.
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
esp
Embedded Scalable Platforms: Heterogeneous SoC architecture and IP integration made easy
FasterTransformer
Transformer-related optimization, including BERT and GPT
gloo
Collective communications library with various primitives for multi-machine training.
iDMA
A modular, parametrizable, and highly flexible Data Movement Accelerator (DMA)
iob-cache
Verilog Configurable Cache
lbt
Develop an LLVM-based toolchain for the Cpu0 processor
LLVM-Study-Notes
Study notes about LLVM. Licensed under CC BY-NC-SA 4.0
MegCC
MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port
ML-Accelerators
Topics in Machine Learning Accelerator Design
NOCulator
NOCulator is a network-on-chip simulator providing cycle-accurate performance models for a wide variety of networks (mesh, torus, ring, hierarchical ring, flattened butterfly) and routers (buffered, bufferless, Adaptive Flow Control, minBD, HiRD).
oneDNN
oneAPI Deep Neural Network Library (oneDNN)
openmlsys-zh
Machine Learning Systems: Design and Implementation (Chinese version)
opentitan
OpenTitan: Open source silicon root of trust
Ripes
A graphical processor simulator and assembly editor for the RISC-V ISA
ROCm
AMD ROCm™ Software - GitHub Home
tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.
triton
Development repository for the Triton language and compiler