Parquet is missing rows
mariussoutier opened this issue
I have a Parquet file that should have 30,000+ rows, but SELECT COUNT(*) FROM {} returns 7000. Another one with more than 40,000 rows returns exactly 8000. Converting the same data to JSON works fine.
Thanks for the report! Can you share a parquet file that has this issue?
Unfortunately no, it's business-related. But nothing special, 30 or so columns with mostly UTF8 and two INT32 types.
One of the columns does contain very large values, but other than that, normal stuff.
@mariussoutier multiprocessio/datastation#278 should fix it!
Closed in #82, now available in dsq 0.21.0.