Hongwei Chen's repositories
neural_network_quantum_state
Neural Network Quantum State
ising-model-gpu
Accelerating Monte Carlo simulations of the 2D Ising model on NVIDIA GPUs
Optimize_DGEMM_on_Intel_CPU
Implementations of the DGEMM algorithm using different tricks to optimize performance.
Lanczos_Neural_Network_Quantum_State
Supporting code for "Systematic improvement of neural network quantum states using Lanczos" (NeurIPS 2022)
Awesome-System-for-Machine-Learning
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
CPlusPlusThings
Things About C++
Optimize_SGEMM_on_Nvidia_GPU
Implementations of the SGEMM algorithm on NVIDIA GPUs using different tricks to optimize performance.
resnet_food101_cifar10_pytorch
ResNet50 implementation for Food101 and ResNet9 model for CIFAR10 in PyTorch
CUDATeaching
CUDA-based GPU Programming
DeepLearningExamples
Deep Learning Examples
flash-attention
Fast and memory-efficient exact attention
how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
Linear-Algebra-and-Learning-from-Data
Solutions to the problems in the book: Linear Algebra and Learning from Data by Gilbert Strang, MIT
MatmulTutorial
An Easy-to-understand TensorOp Matmul Tutorial
multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
ncclOperationPlus
Uses ncclSend and ncclRecv to implement ncclSendrecv, ncclGather, ncclScatter, and ncclAlltoall
numpy-ml
Machine learning, in numpy
TheArtofHPC_pdfs
All pdfs of Victor Eijkhout's Art of HPC books and courses