Oswald(Zifan) He (OswaldHe)

Location: Los Angeles, CA

Home Page: oswaldhe.github.io


Organizations
MyKitchenManager
ucladevx
WeBuyers

Oswald(Zifan) He's starred repositories

qserve

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Language: Python, License: Apache-2.0, Stargazers: 355, Issues: 0

SET-ISCA2023

The framework for the paper "Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators" in ISCA 2023.

Language: C++, Stargazers: 40, Issues: 0
Language: Ada, License: MIT, Stargazers: 7, Issues: 0

HMT-pytorch

Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"

Language: Python, License: Apache-2.0, Stargazers: 53, Issues: 0

mlirPyoclExec

Enabling OpenCL in MLIR via Python

Stargazers: 3, Issues: 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook, License: MIT, Stargazers: 13923, Issues: 0

brevitas

Brevitas: neural network quantization in PyTorch

Language: Python, License: NOASSERTION, Stargazers: 1135, Issues: 0

recut

Large-scale medical image processing and reconstruction toolbox

Language: C++, License: MIT, Stargazers: 18, Issues: 0

allo

Allo: A Programming Model for Composable Accelerator Design

Language: Python, License: Apache-2.0, Stargazers: 110, Issues: 0

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python, License: Apache-2.0, Stargazers: 947, Issues: 0

unlimiformer

Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"

Language: Python, License: MIT, Stargazers: 1045, Issues: 0

LevelST

[FPGA 2024] Source code and bitstream for LevelST: Stream-based Accelerator for Sparse Triangular Solver

Language: Tcl, License: MIT, Stargazers: 8, Issues: 0

SSR

SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24)

Language: C, Stargazers: 21, Issues: 0

CHARM

CHARM: Composing Heterogeneous Accelerators on Versal ACAP Architecture

Language: C++, License: MIT, Stargazers: 115, Issues: 0

LM-RMT

Recurrent Memory Transformer

Language: Python, License: Apache-2.0, Stargazers: 143, Issues: 0

SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Language: Python, License: MIT, Stargazers: 608, Issues: 0

sparsegpt

Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

Language: Python, License: Apache-2.0, Stargazers: 667, Issues: 0

llama2.cpp

Inference Llama 2 in one file of pure C++

Language: Python, License: MIT, Stargazers: 72, Issues: 0
Language: C++, License: BSD-3-Clause, Stargazers: 63, Issues: 0

llama

Inference code for Llama models

Language: Python, License: NOASSERTION, Stargazers: 54682, Issues: 0

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python, License: MIT, Stargazers: 2176, Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python, License: BSD-3-Clause, Stargazers: 12630, Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python, License: Apache-2.0, Stargazers: 23761, Issues: 0
Language: C++, License: NOASSERTION, Stargazers: 12, Issues: 0

pasta

[FCCM 2023] PASTA: Programming and Automation Support for Scalable Task-Parallel HLS Programs on Modern Multi-Die FPGAs

Language: C, Stargazers: 9, Issues: 0

LightningSim

A fast, accurate trace-based simulator for High-Level Synthesis.

License: AGPL-3.0, Stargazers: 31, Issues: 0

YuenyeungSpTRSV

A Thread-Level and Warp-Level Fusion Synchronization-Free Sparse Triangular Solve on GPUs

Language: C, License: MIT, Stargazers: 6, Issues: 0

Callipepla

Large-scale sparse Conjugate Gradient (CG) solvers on High Bandwidth Memory (HBM) FPGAs

Language: C++, License: MIT, Stargazers: 7, Issues: 0

Serpens

Serpens is an HBM FPGA accelerator for SpMV

Language: Tcl, License: MIT, Stargazers: 11, Issues: 0

tapa

TAPA is a dataflow HLS framework that features fast compilation and an expressive programming model, and generates high-frequency FPGA accelerators.

Language: C++, License: MIT, Stargazers: 144, Issues: 0