multiprocessio / dsq

Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Regression between v0.19.0 and v0.20.0 around processing arrays in JSONL files?

fritzgrabo opened this issue

Hi! Thanks for the new version and the new SQLite Writer feature. I'm really looking forward to the speed-up! 🎉

I gave v0.20.0 a spin and noticed what seems to be a regression around processing arrays in JSONL files. It's entirely possible that what I saw is just a symptom of another, underlying issue, but hopefully it provides a good lead.

Here's a minimal test case to replicate.

(1) Processing JSONL files with arrays in them works in v0.19.0

$ dsq --version
dsq 0.19.0

$ cat ~/test.jsonl
{"foo":[]}

$ dsq ~/test.jsonl "select count(1) from {}"
[{"count(1)":1}]

(2) It no longer works in v0.20.0

$ git log -1 --oneline
ba33348 Bump version in readme

$ ./dsq --version
dsq latest

$ ./dsq ~/test.jsonl "select count(1) from {}"
sql: converting argument $1 type: unsupported type []interface {}, a slice of interface

(3) Turning off the new SQLite Writer feature fixes the issue in v0.20.0

$ ./dsq --no-sqlite-writer ~/test.jsonl "select count(1) from {}"
[{"count(1)":1}]

(4) Parsing JSON files (vs. JSONL) with arrays in them still works in v0.20.0, even with the new SQLite Writer feature turned on

$ cat ~/test.json
[{"foo":[]}]

$ ./dsq ~/test.json "select count(1) from {}"
[{"count(1)":1}]

Hope this helps. Thank you!

Good point, thanks for the report!

Ooph, I think I'm going to have to remove everything except CSV/TSV and RegexpNewLines from the SQLiteWriter path as a quick fix, because the SQLiteWriter doesn't do nested object expansion like the existing JSON writer does.

It will take me a bit longer to add nested object expansion to the SQLiteWriter, but it shouldn't be too hard. Basically, when it sees a map[string]any it will expand all the nested fields and collect a list of new columns to add with ALTER TABLE before INSERTing.
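
For readers following along, here is a rough sketch of what that expansion could look like. It is not dsq's actual code. The names flatten and newColumns, the table name t, and the dotted column naming (for example bar.baz) are all assumptions made for illustration. Encoding arrays as JSON strings is also an assumption, chosen only because database/sql cannot bind a []interface{} argument, which is the error in the report above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// flatten (hypothetical helper) expands nested maps into dotted column
// names, so {"a": {"b": 1}} becomes {"a.b": 1}. Slices are JSON-encoded
// into strings so the value is something database/sql can bind.
func flatten(prefix string, in map[string]any, out map[string]any) error {
	for k, v := range in {
		name := k
		if prefix != "" {
			name = prefix + "." + k
		}
		switch t := v.(type) {
		case map[string]any:
			if err := flatten(name, t, out); err != nil {
				return err
			}
		case []any:
			b, err := json.Marshal(t)
			if err != nil {
				return err
			}
			out[name] = string(b)
		default:
			out[name] = v
		}
	}
	return nil
}

// newColumns (hypothetical helper) returns, in sorted order, the columns of
// row that are not yet in known, and records them in known.
func newColumns(known map[string]bool, row map[string]any) []string {
	var added []string
	for name := range row {
		if !known[name] {
			known[name] = true
			added = append(added, name)
		}
	}
	sort.Strings(added)
	return added
}

func main() {
	known := map[string]bool{}
	lines := []string{`{"foo":[]}`, `{"foo":[1],"bar":{"baz":2}}`}
	for _, line := range lines {
		var row map[string]any
		if err := json.Unmarshal([]byte(line), &row); err != nil {
			panic(err)
		}
		flat := map[string]any{}
		if err := flatten("", row, flat); err != nil {
			panic(err)
		}
		for _, col := range newColumns(known, flat) {
			// A real writer would execute this statement before the INSERT.
			fmt.Printf("ALTER TABLE t ADD COLUMN %q\n", col)
		}
		fmt.Println(flat)
	}
}
```

Tracking the known columns in a set means each row only triggers ALTER TABLE for fields that have not been seen before, so later rows with the same shape go straight to the INSERT.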

Fixed in release 0.20.1 🎉