YUKE WANG's repositories
GNNAdvisor_OSDI21
Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
TC-GNN_ATC23
Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
MGG_OSDI23
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
QGTC_PPoPP22
Artifact for PPoPP'22: QGTC: Accelerating Quantized GNN via GPU Tensor Core.
DSXplore_IPDPS21
Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.
CNN-TensorRT
Benchmarking TensorRT on CNN models
AlCOP_MLSys23
Artifact for MLSys'23: ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
APNN-TC_SC21
Artifact for SC'21: APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores.
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
CUDALibrarySamples
CUDA Library Samples
dgl_pydirect_internal
dgl_pydirect for multi-GPU full-graph computation
docker-pytorch
A Docker image for PyTorch
EL-Rec_SC22
Artifact for SC'22: EL-Rec: Efficient Large-scale Recommendation Model Training via Tensor-Train Embedding Table.
Faith_ATC22
Artifact for Faith: An Efficient Framework for Transformer Verification on GPUs.
fast-dpsgd
Code for fast DP-SGD implementations in JAX/TF
llvm-build
Dockerfile for building LLVM LibTooling
openshmem-examples
Some miscellaneous OpenSHMEM examples
tutorials-1
Training material for IPU users: tutorials, feature examples, simple applications
YukeWang96.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics