pjkundert / ezpwd-reed-solomon

Reed-Solomon & BCH encoding and decoding, in C++, Javascript & Python

Home Page:https://hardconsulting.com/products/13-reed-solomon

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

The c++ example throws exception with 4095 size

Hi-Angel opened this issue · comments

An example from README stops working once capacity increased from 255 to 4095.

Steps to reproduce:

  1. type in test.cpp:
#include <ezpwd/rs>

int main() {
    // Reed Solomon codec w/ 4095 symbols, up to 4091 data, 4 parity symbols
    ezpwd::RS<4095,4091> rs;
    std::vector<uint8_t> data;

    // ... fill data with up to 4091 bytes ...
    data.resize(4091);

    rs.encode( data ); // Adds 4 Reed-Solomon parity symbols (4095-4091 == 4)

    // ... later, after data is possibly corrupted ...
    rs.decode( data ); // Correct errors
}
  1. Compile, and execute:
$ g++ test2.cpp -o a -g3 -O0 -Wall -Wextra -Wsign-conversion -std=c++17 -pedantic -fsanitize=address -isystem/home/constantine/Projects/ezpwd-reed-solomon/c++/
$ ./a
terminate called after throwing an instance of 'std::runtime_error'
  what():  reed-solomon: output data type too small to contain symbols
[1]    3730 abort (core dumped)  ./a

Hi, Konstantin;

The problem you're running into is because of the Reed-Solomon "Symbol" size (in bits). An RS<4095...> codeword consists of symbols that are 12 bits in size. To contain these symbols, ezpwd RS requires vectors of uint16_t data, not uint8_t. Give that a try.

To contain these symbols, ezpwd RS requires vectors of uint16_t data, not uint8_t

Thanks! I tried making use of ezpwd, but now this requirement made me try exploring other libraries as well. To my surprise, I often encounter these magic numbers 255,256. This made me wondering: whether ezpwd's lack of API to just encode an arbitrary-sized buffer being a limitation of the library or the Reed-Solomon codes per se?

IOW, can't the library encode a 4095-bytes (and why not 4096 anyway?) buffer as uint8_t because it's how the library designed, or because it's how does Reed-Solomon codes work?

IOW, can't the library encode a 4095-bytes (and why not 4096 anyway?) buffer as uint8_t because it's how the library designed, or because it's how does Reed-Solomon codes work?

FTR, I asked at work, and turned out it's specific to Reed-Solomon codes per se. The reason for specific numbers like 256 and 4096 being something about irreducible polynomials over a Galua field for e.g. 2⁸=256, because whole byte-sequence from POV of Reed-Solomon being a 256 or 4096 coefficients of the polynomial. I am not sure, but probably e.g. 2⁹ is not an irreducible polynomial.

The reason for 12 bit elements I didn't really understand, but it does exist.

And the reason for the off-by-one, i.e. 255 instead of 256, and 4095 instead of 4096 is because polynomial calculation for the element at zero-th position doesn't have same effect as for the others.