deep-learning image-compression lightweight neural-network pytorch video-compression

🎬 Cool-chic 3.1: Now it does video!

COOL-CHIC

Cool-chic (pronounced /kul ʃik/ as in French 🥖🧀🍷) is a low-complexity neural image and video codec based on overfitting. Image coding performance is on par with VVC for 2000 multiplications per decoded pixel, while video coding performance competes with AVC with as few as 500 multiplications per decoded pixel.

All the documentation is available on the Cool-chic page: https://orange-opensource.github.io/Cool-Chic/

Version history

Feb. 24: version 3.1
- Cool-chic video from Cool-chic video: Learned video coding with 800 parameters, Leguay et al.
- Random access and low-delay video coding, competitive with AVC.
Jan. 24: version 3.0
- Re-implement most of the encoder-side improvements proposed by C3: High-performance and low-complexity neural compression from a single image or video, Kim et al.
- 15% to 20% rate decrease compared to Cool-chic 2
July 23: version 2, from Low-complexity Overfitted Neural Image Codec, Leguay et al.
- Several architecture changes for the decoder: convolution-based synthesis, learnable upsampling module
- Friendlier usage: support for YUV 420 input format in 8-bit and 10-bit, and fixed-point arithmetic for cross-platform entropy (de)coding
March 23: version 1, from COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec, Ladune et al.

Coming up: a fast decoder implementation will be released soon for near real-time CPU decoding 🏎️ 🔥.

Cool-chic 3.1 performance

Cool-chic results are provided for both image and video compression inside the results/ directory, alongside the compressed bitstreams.

Image compression

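Image compression performance is presented on the kodak, clic20-pro-valid and jvet datasets. Percentages are rate changes relative to each anchor, so a negative value means that Cool-chic 3.1 needs less rate than the anchor.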

Dataset	Vs. Cool-chic 2	Vs. Cool-chic 1	Vs. C3, Kim et al.	Vs. HEVC (HM 16.20)	Vs. VVC (VTM 19.1)	Min decoder complexity [MAC / pixel]	Max decoder complexity [MAC / pixel]	Avg decoder complexity [MAC / pixel]
kodak	- 19.4 %	- 29.1 %	- 1.6 %	- 14.6 %	+ 6.6 %	299	2291	1841
clic20-pro-valid	- 16.8 %	/	+ 3.3 %	- 21.4 %	+ 2.3 %	545	2295	1897
jvet	- 23.0 %	/	/	- 13.7 %	+ 25.4 %	300	2295	1680
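
To give a feel for the complexity columns, the short sketch below converts a MAC-per-pixel figure into a total multiply-accumulate count for one decoded image. It assumes the usual 768 x 512 Kodak resolution; the helper name `total_macs` is hypothetical and not part of Cool-chic.

```python
def total_macs(width: int, height: int, mac_per_pixel: float) -> float:
    """Total multiply-accumulate operations needed to decode one image."""
    return width * height * mac_per_pixel


# Assumption: Kodak images are 768 x 512 pixels.
# 1841 MAC / pixel is the average decoder complexity reported for kodak.
kodak_avg = total_macs(768, 512, 1841)
print(f"{kodak_avg / 1e6:.0f} MMAC per Kodak image")  # prints: 724 MMAC per Kodak image
```

At the average operating point, decoding a Kodak image therefore costs roughly 0.7 billion multiply-accumulate operations.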

Video compression

Video compression performance is presented on the first 33 frames (≈ 1 second) from the CLIC24 validation subset, composed of 30 high-resolution videos. We provide results for 2 coding configurations:

Low-delay P: addresses use cases where low latency is mandatory;
Random access: addresses use cases where compression efficiency is paramount, e.g. video streaming.

Dataset	Config	Vs. HEVC (HM 16.20)	Vs. x265 medium	Vs. x264 medium	Min decoder complexity [MAC / pixel]	Max decoder complexity [MAC / pixel]	Avg decoder complexity [MAC / pixel]
clic24-valid-subset	random-access	+ 60.4 %	+ 18.1 %	- 15.5 %	460	460	460
clic24-valid-subset	low-latency	+ 122.0 %	+ 73.8 %	+ 28.9 %	460	460	460
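
As a reading aid for the percentage columns, the sketch below turns a relative rate change into a multiplicative factor on the anchor rate. It assumes the percentages can be read as an average rate change at equal quality; the helper name `rate_ratio` is hypothetical.

```python
def rate_ratio(rate_change_percent: float) -> float:
    """Convert a relative rate change in percent into a factor on the anchor rate."""
    return 1.0 + rate_change_percent / 100.0


# Values copied from the video table, random-access configuration.
vs_anchor = {"HEVC (HM 16.20)": 60.4, "x265 medium": 18.1, "x264 medium": -15.5}
for anchor, percent in vs_anchor.items():
    print(f"{anchor}: Cool-chic rate is {rate_ratio(percent):.3f} x the anchor rate")
```

Under that reading, a value of - 15.5 % against x264 medium corresponds to about 0.845 times the x264 rate, while + 60.4 % against HEVC corresponds to about 1.6 times the HEVC rate.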

Setup

More details are available on the Cool-chic page.

⚠️ Python version

Python version should be at least 3.10!

python3 --version                                          # Should be at least 3.10
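
The same check can be done from inside Python, for instance at the top of a helper script. This is a minimal sketch; the function name `python_is_recent_enough` is hypothetical and not part of Cool-chic.

```python
import sys

MIN_VERSION = (3, 10)  # minimum version stated above


def python_is_recent_enough(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version is at least MIN_VERSION."""
    return tuple(version_info[:2]) >= MIN_VERSION


if python_is_recent_enough():
    print("Python version is recent enough")
else:
    print(f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]} or newer is required")
```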

Necessary packages

python3 -m pip install virtualenv                          # Install virtual env if needed
python3 -m virtualenv venv && source venv/bin/activate     # Create and activate a virtual env named "venv"
(venv) pip install -r requirements.txt                     # Install the required packages

Replicating Cool-chic results

Already encoded files are provided as bitstreams in results/<configuration>/<dataset_name>/.

<configuration> can be image, video-low-latency or video-random-access.
<dataset_name> can be kodak, clic20-pro-valid, clic24-valid-subset or jvet.

For each dataset, a script is provided to decode all the bitstreams.

(venv) python results/decode_one_dataset.py <configuration> <dataset_name>  # Can take a few minutes
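
To decode every provided dataset in one go, the command above can be generated for each pair. The sketch below only builds and prints the commands. The pairing of datasets with configurations is an assumption inferred from the result tables, and `decode_commands` is a hypothetical helper name.

```python
import sys

# Assumed pairing, inferred from the result tables: the three image datasets
# go with "image", the CLIC24 subset with both video configurations.
DATASETS = {
    "image": ["kodak", "clic20-pro-valid", "jvet"],
    "video-low-latency": ["clic24-valid-subset"],
    "video-random-access": ["clic24-valid-subset"],
}


def decode_commands(python_exe: str = sys.executable) -> list[list[str]]:
    """Build one decode_one_dataset.py command per (configuration, dataset) pair."""
    return [
        [python_exe, "results/decode_one_dataset.py", configuration, dataset]
        for configuration, datasets in DATASETS.items()
        for dataset in datasets
    ]


for command in decode_commands():
    print(" ".join(command))
    # To actually run it from the repository root:
    # subprocess.run(command, check=True)
```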

The file results/<configuration>/<dataset_name>/results.tsv provides the results that should be obtained.
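
The reference file can be inspected with the standard library. The sketch below makes no assumption about the column names and simply reports whatever header the file contains; `load_results` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def load_results(tsv_path):
    """Read a tab-separated results file into a list of {column: value} dicts."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


path = Path("results/image/kodak/results.tsv")
if path.exists():
    rows = load_results(path)
    columns = list(rows[0].keys()) if rows else []
    print(f"{len(rows)} rows, columns: {columns}")
else:
    print(f"{path} not found; run this from the repository root")
```

All values are returned as strings, so they need to be converted before any numerical comparison with freshly decoded results.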

Thanks

Special thanks go to:

Robert Bamler for the constriction package, which serves as our entropy coder. More details in Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician's Perspective, Bamler.
Hyunjik Kim, Matthias Bauer, Lucas Theis, Jonathan Richard Schwarz and Emilien Dupont for their great work enhancing Cool-chic: C3: High-performance and low-complexity neural compression from a single image or video, Kim et al.

About

Low-complexity neural image & video codec.

https://orange-opensource.github.io/Cool-Chic/

BSD 3-Clause "New" or "Revised" License
