jcrobak / parquet-python

python implementation of the parquet columnar file format.

Issue reading byte array with precision 10 and scale 2

hbanger opened this issue

I have the following parquet schema:

field4: BINARY UNCOMPRESSED DO:0 FPO:170 SZ:58/58/1.00 VC:1 ENC:PLAIN,BIT_PACKED ST:[min: 32505002.09, max: 32505002.09, num_nulls: 0]

json:

{"field4":"32505002.09"}

However, if I try to read it I get the following value:

325050020.90

I have more examples:

parquet -> 62753276.08
parquet-python -> 627532760.80

parquet -> 57768428.82
parquet-python -> 577684288.20

parquet -> 32505002.09
parquet-python -> 325050020.90

Is this expected behavior?

Thanks!

This seems like a bug, probably an off-by-one error. Do you happen to have a sample file to reproduce the issue?
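
For context, every value reported above is exactly 10x the expected one, which is what you would see if the unscaled integer were divided by 10^(scale-1) instead of 10^scale. Below is a minimal sketch of how a DECIMAL stored in a BINARY column is decoded according to the Parquet format (big-endian two's complement unscaled integer, shifted by the scale). The `decode_decimal` helper is hypothetical and is not parquet-python's actual code; the 5-byte input is built by hand for illustration and is not taken from the sample file.

```python
from decimal import Decimal


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a Parquet DECIMAL stored as BINARY (hypothetical helper)."""
    # The bytes hold the unscaled value as a big-endian two's complement integer.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# Hand-built input: unscaled value 3250500209 needs 5 bytes as a signed integer.
raw = (3250500209).to_bytes(5, byteorder="big", signed=True)

print(decode_decimal(raw, 2))  # correct scale  -> 32505002.09
print(decode_decimal(raw, 1))  # scale off by 1 -> 325050020.9
```

With the correct scale of 2 the result matches the `min`/`max` statistics in the schema dump; with a scale of 1 it reproduces the 10x value from the report.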

Hi, thank you for the update. I'm attaching the sample file.
sample.parquet.zip