overflowing uint64 being parsed as float64
karlmcguire opened this issue
I'm using the latest commit, 3d975b7. Here's how to reproduce the issue:
package main

import (
	"fmt"

	"github.com/minio/simdjson-go"
)

func main() {
	data := []byte(`{
		"number": 27670116110564327426
	}`)
	parsed, err := simdjson.Parse(data, nil)
	if err != nil {
		panic(err)
	}
	iter := parsed.Iter()
	iter.Advance()
	tmp := &simdjson.Iter{}
	obj := &simdjson.Object{}
	_, tmp, err = iter.Root(tmp)
	if err != nil {
		panic(err)
	}
	obj, err = tmp.Object(obj)
	if err != nil {
		panic(err)
	}
	// convert to map[string]interface{}
	m, err := obj.Map(nil)
	if err != nil {
		panic(err)
	}
	// prints: 2.7670116110564327e+19
	//
	// should overflow uint64
	fmt.Println(m["number"])
}
Interestingly, I found an article from 2018 where Dgraph was making the same mistake.
@karlmcguire That is by design. Numbers are untyped in JSON, so we find the closest working format.
If we rejected this input, we would be rejecting perfectly valid JSON such as your example, e.g. https://play.golang.org/p/3x0KdffDwTY
If you expect certain fields to only have integers you must add that validation to your code.
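A minimal sketch of what such validation could look like in calling code. The helper name asUint64 is hypothetical and not part of simdjson-go, and the sketch assumes decoded numbers arrive as uint64, int64, or float64:

```go
package main

import (
	"fmt"
	"math"
)

// asUint64 is a hypothetical helper (not part of simdjson-go). It returns the
// decoded value as a uint64 only when the value is a non-negative whole number
// below 2^64; anything else is reported as an error.
func asUint64(v interface{}) (uint64, error) {
	switch n := v.(type) {
	case uint64:
		return n, nil
	case int64:
		if n < 0 {
			return 0, fmt.Errorf("negative value %d", n)
		}
		return uint64(n), nil
	case float64:
		// 1<<64 is exactly representable as a float64, so the bound is precise.
		// NaN fails the Trunc comparison and infinities fail the range check.
		if n < 0 || n >= 1<<64 || n != math.Trunc(n) {
			return 0, fmt.Errorf("%v is not a valid uint64", n)
		}
		return uint64(n), nil
	default:
		return 0, fmt.Errorf("unexpected type %T", v)
	}
}

func main() {
	// The value from the report, as it looks after the float64 fallback.
	_, err := asUint64(2.7670116110564327e+19)
	fmt.Println(err)
}
```

Note the limit of this approach: once a number has become a float64, the check cannot tell whether the original text used integer notation, and whole numbers above 2^53 may already have been rounded.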
I see. I guess what I need is dec.UseNumber() from encoding/json, but for simdjson-go.
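For reference, here is roughly how that looks with encoding/json alone. This is a sketch using only the standard library; the helper name parseNumberField is made up for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parseNumberField is an illustrative helper: it decodes doc with UseNumber
// enabled and parses the named field as a uint64.
func parseNumberField(doc, field string) (uint64, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber() // numbers are kept as json.Number, i.e. the original text

	var m map[string]interface{}
	if err := dec.Decode(&m); err != nil {
		return 0, err
	}
	num, ok := m[field].(json.Number)
	if !ok {
		return 0, fmt.Errorf("field %q is not a number", field)
	}
	// ParseUint returns a range error when the literal does not fit in 64 bits.
	return strconv.ParseUint(num.String(), 10, 64)
}

func main() {
	_, err := parseNumberField(`{"number": 27670116110564327426}`, "number")
	fmt.Println(err)
}
```

Because json.Number keeps the literal as a string, the overflow surfaces as an error from strconv.ParseUint instead of being silently absorbed into a float64.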
Technically the tape format could be modified to include a reference to the original ASCII value in the input. That would be the only reasonable way I see to provide access to that data. But that is a rather large change.
@karlmcguire Thinking about it, we do have extra bits available on the tape for float values. We can't provide the original number, but we can provide a flag "this is a float converted from integer notation, because it overflowed".
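To illustrate the idea only: the sketch below is not simdjson-go's tape layout or API. The number struct, the flag constant, and parseNumber are all hypothetical stand-ins, built on strconv, that show how a parser could record "this float came from an integer literal that overflowed":

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// floatOverflowedInteger is a hypothetical flag bit, not simdjson-go's
// actual layout.
const floatOverflowedInteger uint64 = 1 << 0

// number is a hypothetical stand-in for a numeric tape entry.
type number struct {
	isFloat bool
	flags   uint64
	i       int64
	u       uint64
	f       float64
}

// parseNumber tries integer parsing first and falls back to float64. If the
// literal was in integer notation but out of range, the flag is set.
func parseNumber(lit string) (number, error) {
	u, uerr := strconv.ParseUint(lit, 10, 64)
	if uerr == nil {
		return number{u: u}, nil
	}
	i, ierr := strconv.ParseInt(lit, 10, 64)
	if ierr == nil {
		return number{i: i}, nil
	}
	f, ferr := strconv.ParseFloat(lit, 64)
	if ferr != nil {
		return number{}, ferr
	}
	n := number{isFloat: true, f: f}
	// A range error (as opposed to a syntax error) means the text was a valid
	// integer literal that did not fit in 64 bits.
	if errors.Is(uerr, strconv.ErrRange) || errors.Is(ierr, strconv.ErrRange) {
		n.flags |= floatOverflowedInteger
	}
	return n, nil
}

func main() {
	n, err := parseNumber("27670116110564327426")
	if err != nil {
		panic(err)
	}
	fmt.Println(n.f, n.flags&floatOverflowedInteger != 0)
}
```

A caller that requires integers could then reject any float carrying the flag, while ordinary float literals such as 1.5 pass through unflagged.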
@karlmcguire See #31 as a proposal.
That's exactly what I need. Thank you!