yasoob/ICSPCodec

ICSP Codec

Introduction

This is an implementation of a video codec written in C/C++.
It is called the ICSP codec because I implemented it while working in the ICSP lab (http://icsp.hanyang.ac.kr).
ICSP Codec is built on basic video compression theory. Its characteristics are as follows.

16x16 macroblock-based processing (each 16x16 macroblock is divided into four 8x8 blocks in each compression process)
Intra Prediction / Inter Prediction
All Intra Prediction Mode
DCT Transformation / Quantization
Entropy Coding
AVX Intrinsics Parallel Processing
Supports only CIF (4:2:0)

ICSP Codec Encoding Process

Intra Prediction

ICSP codec supports three directional prediction modes in intra prediction: Vertical, Horizontal, and DC.

The green squares are the reconstructed reference pixels around the current block. Intra prediction uses the reference pixels corresponding to each direction mode to generate a prediction block. DC mode generates the prediction block from the mean value of the vertical and horizontal reference pixels. If reference pixels do not exist for some reason (first block, first-row blocks, first-column blocks...), they are replaced by the integer value 127. You can set the period of intra prediction frames in a sequence by initializing 'intraPeriod' to an integer value. For example, if you initialize 'intraPeriod' to 10 and a sequence consists of 300 frames, then one frame in every 10 is encoded with intra prediction and the other frames are encoded with inter prediction.
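The three modes and the 127 fallback described above can be sketched roughly as follows. This is a minimal illustration, not the ICSP source: the names `predictIntra`, `IntraMode`, `RefLine` and `Block8` are hypothetical, the 8x8 block size is assumed from the feature list, and the rounding of the DC mean is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; none are taken from the ICSP source.
constexpr int kBlock = 8;         // assumed prediction block size (8x8)
constexpr int kMissingRef = 127;  // fallback value stated in the text

enum class IntraMode { Vertical, Horizontal, DC };
using RefLine = std::array<std::uint8_t, kBlock>;
using Block8 = std::array<RefLine, kBlock>;

// top:  reconstructed row above the block, or nullptr if unavailable.
// left: reconstructed column left of the block, or nullptr if unavailable.
Block8 predictIntra(IntraMode mode, const RefLine* top, const RefLine* left) {
    RefLine t, l;
    t.fill(kMissingRef);
    l.fill(kMissingRef);
    if (top != nullptr) t = *top;
    if (left != nullptr) l = *left;

    // DC value: mean of all 16 reference pixels (rounding is an assumption).
    int sum = 0;
    for (int i = 0; i < kBlock; ++i) sum += t[i] + l[i];
    const auto dc = static_cast<std::uint8_t>((sum + kBlock) / (2 * kBlock));

    Block8 pred{};
    for (int y = 0; y < kBlock; ++y) {
        for (int x = 0; x < kBlock; ++x) {
            switch (mode) {
                case IntraMode::Vertical:   pred[y][x] = t[x]; break;  // copy the pixel above
                case IntraMode::Horizontal: pred[y][x] = l[y]; break;  // copy the pixel to the left
                case IntraMode::DC:         pred[y][x] = dc;   break;  // flat block
            }
        }
    }
    return pred;
}
```

An encoder would typically build all three candidates and keep the one with the smallest error against the current block; how ICSP chooses the mode is not stated here.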

All Intra Prediction Mode

ICSP Codec supports a special mode, All Intra Prediction Mode, which encodes every frame in a sequence using intra prediction. You can run ICSP Codec in this mode by initializing the 'intraPeriod' variable to the ALL_INTRA macro value.
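The frame-type decision implied by the intra period can be sketched like this. The helper `isIntraFrame` and the constant `kAllIntra` are hypothetical: the real value of the ALL_INTRA macro is not given in this document, so 0 is used purely as a stand-in, and placing the intra frame at the start of each period is also an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the ALL_INTRA macro; the real value is not documented here.
constexpr int kAllIntra = 0;

// Returns true when the frame at frameIndex (0-based) should be intra coded.
bool isIntraFrame(int frameIndex, int intraPeriod) {
    assert(frameIndex >= 0 && intraPeriod >= 0);
    if (intraPeriod == kAllIntra) return true;  // All Intra Prediction Mode
    return frameIndex % intraPeriod == 0;       // assumed: first frame of each period
}
```

With a period of 10 and a 300-frame sequence, this marks frames 0, 10, 20, ..., 290 as intra frames, 30 in total.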

Inter Prediction

ICSP Codec supports only backward prediction. That is, inter prediction blocks are generated using only the previous frame (which, of course, is a reconstructed frame).

Basically, inter prediction finds the location of the block in the previous frame (fn-1) that is most similar to the current block in the current frame (fn). In the figure above, the red block is the current block in the current frame. The green block is the one closest to the current block among the neighboring blocks around the red block's position in the previous frame. ICSP Codec uses 'spiral search' as its search method. Spiral search in ICSP codec means searching around the current block position in an outward spiral, one pixel at a time.

I attached the picture above for a conceptual understanding of spiral search; I hope it causes no misunderstanding. ICSP codec moves the search block one pixel at a time and finds the similar block with the minimum SAD (Sum of Absolute Differences) between the current block and the search block.
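A rough sketch of a pixel-by-pixel search with a SAD criterion is shown below. It is an illustration under assumptions, not the ICSP implementation: `Frame`, `MotionVector`, `blockSad` and `spiralSearch` are hypothetical names, the search range is a free parameter, and the spiral is approximated by visiting square rings of offsets from the nearest outward.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical types for illustration; not taken from the ICSP source.
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // row-major luma samples
    std::uint8_t at(int x, int y) const { return pixels[y * width + x]; }
};

struct MotionVector { int dx; int dy; long sad; };

// SAD between the n x n block at (bx, by) in cur and the block at (bx+dx, by+dy) in ref.
long blockSad(const Frame& cur, const Frame& ref, int bx, int by, int dx, int dy, int n) {
    long sad = 0;
    for (int y = 0; y < n; ++y)
        for (int x = 0; x < n; ++x)
            sad += std::abs(int(cur.at(bx + x, by + y)) - int(ref.at(bx + dx + x, by + dy + y)));
    return sad;
}

// Visits offsets ring by ring around (0, 0), nearest ring first, and keeps the minimum SAD.
MotionVector spiralSearch(const Frame& cur, const Frame& ref, int bx, int by, int n, int range) {
    assert(n > 0 && range >= 0);
    assert(bx >= 0 && by >= 0 && bx + n <= cur.width && by + n <= cur.height);
    MotionVector best{0, 0, std::numeric_limits<long>::max()};
    for (int r = 0; r <= range; ++r) {
        for (int dy = -r; dy <= r; ++dy) {
            for (int dx = -r; dx <= r; ++dx) {
                if (std::max(std::abs(dx), std::abs(dy)) != r) continue;  // only the ring at distance r
                const int x0 = bx + dx, y0 = by + dy;
                if (x0 < 0 || y0 < 0 || x0 + n > ref.width || y0 + n > ref.height) continue;
                const long sad = blockSad(cur, ref, bx, by, dx, dy, n);
                if (sad < best.sad) best = {dx, dy, sad};  // strict '<' keeps the nearest candidate on ties
            }
        }
    }
    return best;
}
```

Searching nearest offsets first means that, among equally good matches, the shortest motion vector wins, which tends to cost fewer bits to code.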

Up to this point, this has been a conceptual explanation of inter prediction. Inter prediction consists of motion estimation and motion compensation. I will briefly explain these two processes.

Motion Estimation

Simply put, motion estimation generates a motion vector that represents the displacement between the current block position and the similar block with the smallest SAD.

Motion Compensation

In ICSP Codec, motion compensation builds prediction blocks using the motion vectors and produces differential blocks (current block - prediction block) as a result. The differential blocks are passed to the other encoding/decoding processes such as DCT, quantization, and entropy coding.
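The differential block (current block - prediction block) can be sketched as below. The function name `residualBlock` and the flat row-major frame layout are assumptions made for this example, and the caller is assumed to pass a motion vector that keeps the block inside the reference frame.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper; cur and ref are row-major luma planes with the same width.
// (bx, by) is the block position, (dx, dy) the motion vector, n the block size.
std::vector<std::int16_t> residualBlock(const std::vector<std::uint8_t>& cur,
                                        const std::vector<std::uint8_t>& ref,
                                        int width, int bx, int by, int dx, int dy, int n) {
    assert(width > 0 && n > 0);
    std::vector<std::int16_t> diff(static_cast<std::size_t>(n) * n);
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            const int c = cur[(by + y) * width + (bx + x)];            // current pixel
            const int p = ref[(by + dy + y) * width + (bx + dx + x)];  // motion-compensated prediction
            diff[y * n + x] = static_cast<std::int16_t>(c - p);        // signed difference in [-255, 255]
        }
    }
    return diff;
}
```

The result needs a signed type because the difference of two 8-bit pixels can be negative.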

DCT

Discrete Cosine Transform (DCT) decomposes a signal in the time or spatial domain into frequency components, like the Fourier transform. That is, DCT transforms an image from the spatial domain into the frequency domain. When an image block is transformed by DCT, it is decomposed into a DC (direct current) component with a frequency of 0 and AC (alternating current) components with nonzero frequencies. Since most images are composed of a large number of low-frequency components and few high-frequency components, when DCT is applied to an image, most of the energy is concentrated in the DC component and only a little energy is spread widely across the high-frequency AC components. This phenomenon is called 'energy compaction' and is very widely used in the video compression field.
ICSP codec supports DCT on 8x8 blocks, so each 8x8 image block is decomposed into one DC component and sixty-three AC components.

When an image block is transformed from the spatial domain into the frequency domain through the DCT, most of the energy is concentrated in the DC component and the low-frequency AC components, and the remaining energy is widely distributed over the high-frequency AC components.
The figure above is intended only to explain DCT, so it differs from an exact DCT result. In fact, the image block used for DCT in the ICSP codec is a difference block, and the DCT result is stored as a floating-point type.
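For reference, a direct (unoptimized) 8x8 forward DCT-II in floating point might look like the sketch below. It uses the textbook orthonormal definition; the names `BlockF` and `forwardDct8x8` are hypothetical, and whether ICSP uses exactly this scaling is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using BlockF = std::array<std::array<double, 8>, 8>;  // hypothetical type, indexed [row][column]

// Textbook 2-D DCT-II with orthonormal scaling:
// F(u,v) = 1/4 * C(u) * C(v) * sum_x sum_y f(x,y) * cos((2x+1)u*pi/16) * cos((2y+1)v*pi/16),
// where C(0) = 1/sqrt(2) and C(k) = 1 otherwise.
BlockF forwardDct8x8(const BlockF& in) {
    const double pi = std::acos(-1.0);
    BlockF out{};
    for (int v = 0; v < 8; ++v) {
        for (int u = 0; u < 8; ++u) {
            double sum = 0.0;
            for (int y = 0; y < 8; ++y)
                for (int x = 0; x < 8; ++x)
                    sum += in[y][x] * std::cos((2 * x + 1) * u * pi / 16.0)
                                    * std::cos((2 * y + 1) * v * pi / 16.0);
            const double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            const double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            out[v][u] = 0.25 * cu * cv * sum;  // out[0][0] is the DC component
        }
    }
    return out;
}
```

A flat block shows energy compaction in its simplest form: every sample equal to 10 gives a DC coefficient of 80 and AC coefficients that are zero up to floating-point error.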

Quantization

Quantization is the process of dividing a transformed block by a constant (quantization step, QStep). After quantization, most of the high-frequency AC components are 0 or very small. The more AC components with very small values, the fewer bits are allocated in the entropy coding process, which increases compression efficiency. However, this process directly damages image quality: the larger the QStep, the larger the loss of image quality.
ICSP codec supports independent quantization of the AC and DC components (QStepAC, QStepDC). ICSP codec provides QStep values of 1, 8, and 16.
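Independent DC and AC steps could be applied as in the following sketch. The function name `quantize`, the flat 64-element layout with the DC coefficient at index 0, and rounding to the nearest integer are all assumptions for illustration; the document only states that the block is divided by QStep.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical helper: coeff[0] is the DC coefficient, coeff[1..63] are AC coefficients.
std::array<int, 64> quantize(const std::array<double, 64>& coeff, int qstepDC, int qstepAC) {
    assert(qstepDC > 0 && qstepAC > 0);
    std::array<int, 64> out{};
    for (int i = 0; i < 64; ++i) {
        const int step = (i == 0) ? qstepDC : qstepAC;            // separate step for DC and AC
        out[i] = static_cast<int>(std::lround(coeff[i] / step));  // rounding mode is an assumption
    }
    return out;
}
```

For example, with QStepDC = 8 and QStepAC = 16, a DC value of 80 becomes 10, while an AC value of 7 becomes 0 and is lost on reconstruction, which is the quality loss described above.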

ZigZag Reordering

Zig-zag reordering arranges the elements of a quantized block in a particular order. The order is not simply sequential but diagonal, in a zig-zag pattern. Please refer to the figure below.

With zig-zag reordering, the DC component and low-frequency AC components are placed at the front and high-frequency AC components at the rear. If QStep is large (16 or more), most high-frequency elements are 0 or very small, so many 0 values are placed at the rear.
In the case of the figure above, the result of zig-zag reordering is 61,13,12,0,11,7,0,6,0,...,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0... Most of the values at the rear are 0. In ICSP codec, if all of the AC components are 0, a special process using an AC flag is applied to the bitstream for compression efficiency. This technique will be explained in detail in the Bit Stream Syntax part.
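A zig-zag scan and the all-AC-zero check could be sketched as follows. The scan direction follows the common JPEG convention, which is an assumption about the missing figure, and `zigzagOrder`, `zigzagReorder` and `allAcZero` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Builds the scan order: order[k] is the row-major index (row * 8 + col) of the k-th scanned element.
std::array<int, 64> zigzagOrder() {
    std::array<int, 64> order{};
    int k = 0;
    for (int s = 0; s < 15; ++s) {  // s = row + col selects one anti-diagonal
        const int lo = std::max(0, s - 7), hi = std::min(s, 7);
        if (s % 2 == 1) {
            for (int r = lo; r <= hi; ++r) order[k++] = r * 8 + (s - r);  // odd diagonal: walk down-left
        } else {
            for (int r = hi; r >= lo; --r) order[k++] = r * 8 + (s - r);  // even diagonal: walk up-right
        }
    }
    return order;
}

// Reorders a quantized row-major block into zig-zag order (DC first).
std::array<int, 64> zigzagReorder(const std::array<int, 64>& block) {
    const std::array<int, 64> order = zigzagOrder();
    std::array<int, 64> out{};
    for (int k = 0; k < 64; ++k) out[k] = block[order[k]];
    return out;
}

// True when every AC coefficient (positions 1..63) is zero, the case an AC flag would signal.
bool allAcZero(const std::array<int, 64>& zz) {
    return std::all_of(zz.begin() + 1, zz.end(), [](int v) { return v == 0; });
}
```

When `allAcZero` is true, an encoder only needs to signal that fact plus the DC value instead of coding sixty-three zeros.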

Entropy Coding

In entropy coding, the list of numbers from zig-zag reordering is converted into a bitstream. Each number in the list is converted into a bundle of bits. The number of bits allocated depends on how frequently the number occurs. In the case above, few bits are allocated to 0 because 0 occurs most often. Conversely, many bits are allocated to other numbers such as 63, 13, 12.... The more 0 values the list contains, the smaller the bitstream becomes. ICSP codec uses predetermined code words (bit patterns) for certain ranges of integers; I didn't implement Huffman coding myself. I just convert each number into the predetermined code word for the range to which the number belongs.
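The actual ICSP code word table is not reproduced in this document, so the sketch below swaps in a different, well-known fixed code, signed Exp-Golomb, only to illustrate the idea that 0 gets the shortest code word and larger magnitudes get longer ones. The function name `signedExpGolomb` is hypothetical, and the bits are returned as a string of '0' and '1' for readability.

```cpp
#include <cassert>
#include <string>

// Signed Exp-Golomb code word for v, as a string of '0'/'1' characters.
// This is a stand-in for illustration, not the code table used by ICSP.
std::string signedExpGolomb(int v) {
    // Map signed to unsigned: 0 -> 0, 1 -> 1, -1 -> 2, 2 -> 3, -2 -> 4, ...
    const unsigned code = (v > 0) ? 2u * static_cast<unsigned>(v) - 1u
                                  : 2u * static_cast<unsigned>(-v);
    const unsigned n = code + 1u;
    int bits = 0;  // number of bits after the leading 1 of n
    for (unsigned t = n; t > 1u; t >>= 1) ++bits;
    std::string out(static_cast<std::size_t>(bits), '0');  // prefix of 'bits' zeros
    for (int i = bits; i >= 0; --i) out.push_back(((n >> i) & 1u) ? '1' : '0');  // n in binary
    return out;
}
```

Under this code, 0 costs 1 bit, 1 and -1 cost 3 bits each, and 63 costs 13 bits, so a list dominated by zeros yields a short bitstream.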

Bit Stream Syntax

Stream Header

The header of the ICSP bitstream has a fixed structure, as in the figure below. The size of the header is 14 bytes.

Stream Body

Intra Prediction Frame Body

Inter Prediction Frame Body

Vectorization Mode(SIMD, AVX Intrinsics)

ICSP Codec was originally implemented with per-pixel (scalar) processing. Recently, I converted this per-pixel processing into vectorized processing using AVX intrinsic functions. Since I am not yet familiar with SIMD, there may be some unskillful logic and code. However, this conversion significantly reduced encoding time at similar quality. I will describe the detailed comparison results in the following chapter.

Computational Complexity Comparison

All Intra Prediction Mode

Sequences	Scalar Encoding Time(sec)	Vector Encoding Time(sec)	Vector/Scalar Encoding Time Ratio(%)
akiyo	13.65	12.12	89%
children	16.9	12.91	76%
coastguard	13.98	12.36	88%
container	14.84	11.38	77%
football	5.05	4.04	80%
foreman	17.21	12.42	72%
hall monitor	14.31	12.83	90%
mobile	15.3	12.01	79%
mother_daughter	13.72	12.01	88%
news	14.26	11.41	80%
stefan	16.35	13.06	80%
table	13.62	11.33	83%
Average	x	x	82%
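The percentage column appears to be the vector encoding time expressed as a percentage of the scalar encoding time, so a lower value means a larger speed-up. The rows are consistent with that reading to within about one percentage point. A small check, using a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Vector encoding time as a percentage of scalar encoding time, rounded to the nearest integer.
int timeRatioPercent(double scalarSec, double vectorSec) {
    assert(scalarSec > 0.0);
    return static_cast<int>(std::lround(100.0 * vectorSec / scalarSec));
}
```

For the akiyo row, `timeRatioPercent(13.65, 12.12)` gives 89, matching the table; the corresponding time saving is therefore about 11%.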

Inter/Intra Hybrid Prediction(intra period : 10)

Sequences	Scalar Encoding Time(sec)	Vector Encoding Time(sec)	Vector/Scalar Encoding Time Ratio(%)
akiyo	30.52	13.71	45%
children	25.9	11.27	44%
coastguard	30.71	12.51	41%
container	28.58	13.31	47%
football	9.59	4.4	46%
foreman	29.33	11.94	41%
hall monitor	30.53	12.31	40%
mobile	29.08	12.36	42%
mother_daughter	30.95	12.36	40%
news	26.34	11.43	43%
stefan	29.04	13.09	45%
table	28.51	12.81	45%
Average	x	x	43%

I will upload a detailed comparison of the operation speed of each encoding process.

Folder	description
data	test sequences
source	C++ source code
