Oswald (Zifan) He's starred repositories
SET-ISCA2023
The framework for the paper "Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators", published at ISCA 2023.
HMT-pytorch
Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"
mlirPyoclExec
Enabling OpenCL in MLIR via Python
unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
llama2.cpp
Llama 2 inference in one file of pure C++
flash-attention
Fast and memory-efficient exact attention
LightningSim
A fast, accurate trace-based simulator for High-Level Synthesis.
YuenyeungSpTRSV
A Thread-Level and Warp-Level Fusion Synchronization-Free Sparse Triangular Solve on GPUs
Callipepla
Large-scale sparse Conjugate Gradient (CG) solvers on High Bandwidth Memory (HBM) FPGAs