hongga16's repositories


DeepRecSys

http://vlsiarch.eecs.harvard.edu/research/recommendation/

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

dlrm

An implementation of a deep learning recommendation model (DLRM)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

EAGLE

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

I-BERT

[ICML'21 Oral] I-BERT: Integer-only BERT Quantization

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

PIM-DL-ASPLOS

PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

QuIP

Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"

Stargazers: 0 | Issues: 0
License: GPL-3.0 | Stargazers: 0 | Issues: 0

speculative-decoding

Explorations into some recent techniques surrounding speculative decoding

License: MIT | Stargazers: 0 | Issues: 0