multiprocessio / dsq

Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Building with -buildmode=pie exposes crash in parquet test

eatonphil opened this issue

See #15 for the original report.

This crash shows up when you run go build -buildmode=pie && ./scripts/test.py. It does not happen without -buildmode=pie.

panic: runtime error: index out of range [576457833716731764] with length 117670

goroutine 1 [running]:
github.com/goccy/go-json/internal/encoder.CompileToGetCodeSet(0xc000f70f90?, 0x55b1294306cc?)
      github.com/goccy/go-json@v0.9.4/internal/encoder/compiler_norace.go:11 +0x1df
github.com/goccy/go-json.encode(0xc001161ba0, {0xc0009540c0, 0xc00112a750})
      github.com/goccy/go-json@v0.9.4/encode.go:224 +0xd0
github.com/goccy/go-json.marshal({0xc0009540c0, 0xc00112a750}, {0x0, 0x0, 0x1?})
      github.com/goccy/go-json@v0.9.4/encode.go:148 +0xba
github.com/goccy/go-json.MarshalWithOption(...)
      github.com/goccy/go-json@v0.9.4/json.go:186
github.com/goccy/go-json.Marshal({0xc0009540c0?, 0xc00112a750?})
      github.com/goccy/go-json@v0.9.4/json.go:171 +0x2a
github.com/multiprocessio/go-json.(*StreamEncoder).EncodeRow(0xc000958060, {0xc0009540c0?, 0xc00112a750})
      github.com/multiprocessio/go-json@v0.0.0-20220308002443-61d497dd7b9e/encoder.go:57 +0x1dd
github.com/multiprocessio/datastation/runner.transformParquet.func1(0x0?)
      github.com/multiprocessio/datastation/runner@v0.0.0-20220311183454-aba843b46842/file.go:121 +0xc6
github.com/multiprocessio/datastation/runner.withJSONArrayOutWriter({0x55b12b25b338?, 0xc000011218}, 0xc000f71288)
      github.com/multiprocessio/datastation/runner@v0.0.0-20220311183454-aba843b46842/json.go:36 +0xf6
github.com/multiprocessio/datastation/runner.withJSONArrayOutWriterFile(...)
      github.com/multiprocessio/datastation/runner@v0.0.0-20220311183454-aba843b46842/json.go:51
github.com/multiprocessio/datastation/runner.transformParquet({0x55b12b26a2c0?, 0xc000c35788?}, {0x55b12b25b338, 0xc000011218})
      github.com/multiprocessio/datastation/runner@v0.0.0-20220311183454-aba843b46842/file.go:106 +0xd8
github.com/multiprocessio/datastation/runner.transformParquetFile({0x7ffddb498a31?, 0x1b?}, {0x55b12b25b338, 0xc000011218})
      github.com/multiprocessio/datastation/runner@v0.0.0-20220311183454-aba843b46842/file.go:143 +0xec
github.com/multiprocessio/datastation/runner.TransformFile({0x7ffddb498a31, 0x1b}, {{0x0?, 0x1ff?}, {0x0?, 0xc000aff440?}}, {0x55b12b25b338, 0xc000011218})
      github.com/multiprocessio/datastation/runner@v0.0.0-20220311183454-aba843b46842/file.go:594 +0x1e5
main.evalFileInto({0x7ffddb498a31, 0x1b}, 0x0?)
      github.com/multiprocessio/dsq/main.go:47 +0xc5
main._main()
      github.com/multiprocessio/dsq/main.go:241 +0xaec
main.main()
      github.com/multiprocessio/dsq/main.go:381 +0x19

This first showed up when I switched from 0.5.0 to 0.6.0, and it is present in all later versions. The commits between 0.5.0 and 0.6.0 make me think the issue is probably further upstream in github.com/goccy/go-json, as that is a new dependency added to the go.mod file.

Yeah, the panic is in that library. But while I can reproduce it inside dsq and DataStation, I cannot reproduce it using the library alone (even though it's hard to see how it could be purely a datastation/dsq bug). So I can't file a good issue against goccy/go-json yet.

I can't spend much more time debugging this at the moment. If Arch users mention this panic I'm going to suggest they download an official binary instead of the Arch one since -buildmode=pie is not supported at the moment (although I'd like to support it in the future).

I'd welcome any fixes/suggestions anyone has to fix this here or in goccy/go-json.

Also posted it on Reddit to see if anything about this sounded familiar to others: https://www.reddit.com/r/golang/comments/th2p4y/runtime_panic_when_accessing_global_variable_only/.

Hey, I'm in luck: one of the contributors figured it out. See goccy/go-json#350.

However, looking through the go-json issues, I noticed a few other panic reports. So I'm also going to add a fallback to the standard library for when go-json panics. That will make problems like this one harder to notice, since dsq hides logs by default, and even if it didn't, you likely wouldn't look at the logs when the program succeeds.

But the most important thing is not crashing on valid input.

Alright 0.8.1 is out now with the fix.