This is the implementation of the paper:
Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Yipeng Ma, Tao Wang, and Ji-Rong Wen. Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression.

Updates:
- [May 21] We updated the README.
Code is coming soon!