singhranjodh / dpq_embedding_compression

Differentiable Product Quantization for End-to-End Embedding Compression.

Differentiable Product Quantization for Embedding Compression

This is the code for our paper on compressing the embedding table with end-to-end learned KD codes via differentiable product quantization (DPQ).
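To make the idea of KD codes concrete, here is a minimal NumPy sketch of how an embedding can be rebuilt from D discrete codes, each taking one of K values. It is an illustration only, not the implementation in this repository: the sizes are arbitrary, the names `codes`, `codebooks`, and `lookup` are hypothetical, and random values stand in for the codes and codebooks that DPQ would learn during training.

```python
import numpy as np

# Illustrative sizes only (assumptions, not values from the paper or repo).
vocab_size, emb_dim = 10000, 64   # number of tokens, embedding dimension
K, D = 32, 16                     # K choices per code, D codes per token
sub_dim = emb_dim // D            # each code selects one sub-vector of this size

rng = np.random.RandomState(0)
# codes[i, j] in [0, K): the j-th discrete code of token i (random stand-in).
codes = rng.randint(0, K, size=(vocab_size, D))
# codebooks[j, k]: the sub-vector selected when code j takes value k.
codebooks = rng.randn(D, K, sub_dim).astype(np.float32)

def lookup(token_ids):
    """Rebuild embeddings by concatenating the D selected sub-vectors."""
    token_codes = codes[token_ids]                # (batch, D)
    parts = codebooks[np.arange(D), token_codes]  # (batch, D, sub_dim)
    return parts.reshape(len(token_ids), emb_dim)

# Storage comparison, assuming 32-bit floats and log2(K) bits per code.
full_bits = vocab_size * emb_dim * 32
compressed_bits = vocab_size * D * int(np.log2(K)) + D * K * sub_dim * 32
print("compression ratio: %.1fx" % (full_bits / float(compressed_bits)))
```

With these sizes the per-token cost drops from `emb_dim` floats to D small integers, and the shared codebooks add only a cost that does not grow with the vocabulary.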

Requirements

This code was developed with TensorFlow 1.12.0 and Python 2. If it does not run in your environment, try installing these versions.
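A quick way to notice a version mismatch before launching a long run is to compare the installed TensorFlow version against the one named above. The helper below is a hypothetical sketch and not part of this repository; it assumes that matching the major and minor version is enough, which may not hold for every release.

```python
def major_minor(version):
    """Turn a version string such as '1.12.0' into (1, 12)."""
    return tuple(int(part) for part in version.split(".")[:2])

def is_expected_tf(version, expected="1.12.0"):
    """True when the major and minor versions match the expected release."""
    return major_minor(version) == major_minor(expected)

# Usage, assuming TensorFlow is importable in your environment:
#   import tensorflow as tf
#   if not is_expected_tf(tf.__version__):
#       print("Developed with TensorFlow 1.12.0, found " + tf.__version__)
```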

Run the experiments

Change to the scripts/ subfolder of the task folder you want to run (one of lm, nmt, text_classification).

Run the original full embedding baseline using the following command:

./run_fullembs.sh

Or run the KD code-based method using this command:

./run_kdq.sh
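The steps above can also be driven from Python, for example to run several tasks in sequence. This is a hedged sketch rather than a script shipped with the repository: `build_run`, `run`, and the `repo_root` parameter are hypothetical names, and the layout `<task>/scripts/` is assumed from the instructions above.

```python
import os
import subprocess

TASKS = ("lm", "nmt", "text_classification")
# Script names as given in this README.
SCRIPTS = {"full": "run_fullembs.sh", "kdq": "run_kdq.sh"}

def build_run(task, method, repo_root="."):
    """Return (working_dir, command) for one experiment without running it."""
    if task not in TASKS:
        raise ValueError("unknown task: %s" % task)
    if method not in SCRIPTS:
        raise ValueError("unknown method: %s" % method)
    workdir = os.path.join(repo_root, task, "scripts")
    return workdir, ["./" + SCRIPTS[method]]

def run(task, method, repo_root="."):
    """Run the chosen script from inside its scripts/ folder."""
    workdir, cmd = build_run(task, method, repo_root)
    return subprocess.call(cmd, cwd=workdir)

# Example: run("lm", "kdq") executes ./run_kdq.sh inside lm/scripts.
```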

For text classification datasets other than ag_news, please download them from this link and put all dataset subfolders in the text_classification/data folder.

Cite

Please cite our paper if you find it helpful in your own work:

@article{dpq2019,
  title={Differentiable Product Quantization for End-to-End Embedding Compression},
  author={Ting Chen and Lala Li and Yizhou Sun},
  journal={CoRR},
  volume={abs/1908.09756},
  year={2019},
}

Acknowledgement

The language model is modified from TensorFlow's PTB tutorial, and the NMT model is modified from tensorflow/nmt. We would like to thank the original creators of these models.

License: MIT License

Languages

Python 95.3%, Shell 4.7%