Possible typo in comment
antirez opened this issue
Salvatore Sanfilippo commented
Hi, in the Q2_K structure there is a comment stating that each weight uses 2.5625 bits:
// 2-bit quantization
// weight is represented as x = a * q + b
// 16 blocks of 16 elements each
// Effectively 2.5625 bits per weight
typedef struct {
    uint8_t scales[QK_K/16]; // scales and mins, quantized with 4 bits
    uint8_t qs[QK_K/4];      // quants
    ggml_fp16_t d;           // super-block scale for quantized scales
    ggml_fp16_t dmin;        // super-block scale for quantized mins
} block_q2_K;
But if I do the math, I obtain:
block size = 16 + 64 + 4 = 84 bytes, that is 672 bits
bits per weight = 672/256 = 2.625
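The arithmetic above can be double-checked with a small sketch. It assumes QK_K is 256 (the divisor used in the calculation) and uses a plain 16-bit integer as a stand-in for ggml_fp16_t, so it is an illustration rather than the actual ggml headers; the helper name q2_K_bits_per_weight is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: super-block of 256 weights (16 blocks of 16 elements each). */
#define QK_K 256

/* Assumption: stand-in for the real half-precision type, which is 2 bytes. */
typedef uint16_t ggml_fp16_t;

typedef struct {
    uint8_t scales[QK_K/16]; /* 16 bytes: scales and mins, 4 bits each */
    uint8_t qs[QK_K/4];      /* 64 bytes: 2-bit quants                 */
    ggml_fp16_t d;           /*  2 bytes: super-block scale for scales */
    ggml_fp16_t dmin;        /*  2 bytes: super-block scale for mins   */
} block_q2_K;

/* Compile-time check (C11): 16 + 64 + 2 + 2 = 84 bytes, with no padding. */
_Static_assert(sizeof(block_q2_K) == 84, "unexpected block_q2_K size");

/* Hypothetical helper: total bits in one block divided by weights per block. */
static double q2_K_bits_per_weight(void) {
    return (double)(sizeof(block_q2_K) * 8) / QK_K;
}
```

Under these assumptions the helper returns 672/256 = 2.625, which matches the hand calculation and not the 2.5625 in the comment.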
Cheers
Georgi Gerganov commented
Yup, it's a typo - fixed
Salvatore Sanfilippo commented
Thanks!