Introduction

Official PyTorch implementation for Neural Video and Image Compression, including:

Neural Video Codec
- DCVC: Deep Contextual Video Compression, NeurIPS 2021, in this folder.
- DCVC-TCM: Temporal Context Mining for Learned Video Compression, IEEE Transactions on Multimedia and arXiv, in this folder.
- DCVC-HEM: Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression, ACM MM 2022, in this folder.
  - The first end-to-end neural video codec to exceed H.266 (VTM) using the highest compression ratio configuration, in terms of both PSNR and MS-SSIM.
  - The first end-to-end neural video codec to achieve rate adjustment in a single model.
- DCVC-DC: Neural Video Compression with Diverse Contexts, CVPR 2023, in this folder.
  - The first end-to-end neural video codec to exceed ECM using the highest compression ratio configuration, in terms of PSNR and MS-SSIM for RGB content.
  - The first end-to-end neural video codec to exceed ECM using the highest compression ratio configuration, in terms of PSNR for YUV420 content.

Neural Image Codec
- EVC: Towards Real-Time Neural Image Compression with Mask Decay, ICLR 2023, in this folder.

On the comparison

Please note that different methods may use different configurations to test different models, for example:

- Source video may be different, e.g., cropped or padded to the desired resolution.
- Intra period may be different, e.g., 96, 32, 12, or 10.
- Number of encoded frames may be different.

Therefore, it does not make sense to compare the numbers reported by different methods directly unless you have made sure that they use the same test conditions.

Please find more details on the test conditions.
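
As a rough illustration of the point above, the sketch below records the settings that a reported number depends on and lists any that differ between two evaluations. It is not part of this repository: the `EvalCondition` class, the `mismatched_settings` helper, and all field names and values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EvalCondition:
    """Hypothetical record of the settings a reported number depends on."""
    source_handling: str  # e.g. "original", "cropped", or "padded"
    intra_period: int     # e.g. 96, 32, 12, or 10
    num_frames: int       # number of encoded frames per sequence


def mismatched_settings(a, b):
    """Return the names of the settings that differ between two conditions."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


# Made-up values for two methods; they do not describe any published result.
ours = EvalCondition(source_handling="padded", intra_period=32, num_frames=96)
theirs = EvalCondition(source_handling="cropped", intra_period=12, num_frames=100)

diff = mismatched_settings(ours, theirs)
if diff:
    print("Not directly comparable; differing settings:", ", ".join(diff))
else:
    print("Same test conditions; numbers can be compared directly.")
```

A check like this only covers the settings listed in this section; other differences between evaluations may also matter.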

Acknowledgement

The implementation is based on CompressAI and PyTorchVideoCompression.

Citation

If you find this work useful for your research, please cite:

@article{li2021deep,
  title={Deep Contextual Video Compression},
  author={Li, Jiahao and Li, Bin and Lu, Yan},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

@article{sheng2022temporal,
  title={Temporal context mining for learned video compression},
  author={Sheng, Xihua and Li, Jiahao and Li, Bin and Li, Li and Liu, Dong and Lu, Yan},
  journal={IEEE Transactions on Multimedia},
  year={2022},
  publisher={IEEE}
}

@inproceedings{li2022hybrid,
  title={Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression},
  author={Li, Jiahao and Li, Bin and Lu, Yan},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  year={2022}
}

@inproceedings{li2023neural,
  title={Neural Video Compression with Diverse Contexts},
  author={Li, Jiahao and Li, Bin and Lu, Yan},
  booktitle={{IEEE/CVF} Conference on Computer Vision and Pattern Recognition,
             {CVPR} 2023, Vancouver, Canada, June 18-22, 2023},
  year={2023}
}

@inproceedings{wang2023EVC,
  title={EVC: Towards Real-Time Neural Image Compression with Mask Decay},
  author={Wang, Guo-Hua and Li, Jiahao and Li, Bin and Lu, Yan},
  booktitle={International Conference on Learning Representations},
  year={2023}
}

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft’s Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party’s policies.
