hongga16's repositories
DeepRecSys
http://vlsiarch.eecs.harvard.edu/research/recommendation/
Language: Python, License: MIT
dlrm
An implementation of a deep learning recommendation model (DLRM)
Language: Python, License: MIT
EAGLE
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Language: Python, License: Apache-2.0
FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Language: C++, License: Apache-2.0
I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Language: Python, License: MIT
Megatron-LM
Ongoing research training transformer models at scale
Language: Python, License: NOASSERTION
Language: Python, License: MIT
PIM-DL-ASPLOS
PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
Language: C++, License: MIT
Language: C
QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
License: GPL-3.0
speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
License: MIT