quinnj / JSON3.jl

being careful about coercions of numeric types

ExpandingMan opened this issue

Some of the type coercions that happen when reading numeric types are a bit too aggressive:

julia> s = JSON3.write(typemax(UInt64))
"18446744073709551615"

julia> JSON3.read(s)
1.8446744073709552e19

julia> reinterpret(UInt64, ans)
0x43f0000000000000

It seems to me that this should either throw an error, or try to find a numeric type that the integer fits into (perhaps just default to BigInt although that would be rather inefficient).

I'm not exactly sure what you're expecting here, but I'll walk you through what JSON3 is doing:

  • you write out typemax(UInt64); it is always written as a plain number literal (integer or float) because the JSON spec has only one number type
  • JSON3.read(s) first tries to read the value as an Int64, which fails (it overflows in this case), so it falls back to Float64; 1.8446744073709552e19 is the closest Float64 to the parsed number, even though it isn't exactly the input number
  • calling reinterpret(UInt64, ans) just reinterprets the raw bits as a different type, which I don't think is what you want here; i.e. you're taking the Float64's raw sign, exponent, and mantissa bits and reading them directly as a UInt64 value
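
The three steps above can be reproduced roughly with Base Julia alone. This is only a sketch that mimics the described fallback using `tryparse` and `parse`; it is not JSON3's actual parsing code, and the variable names are made up for illustration.

```julia
# Base Julia only; no JSON3 calls are used here.
s = "18446744073709551615"            # the text that JSON3.write produced above

# Step 1: an Int64 parse fails because the value exceeds typemax(Int64).
as_int = tryparse(Int64, s)           # nothing

# Step 2: a Float64 parse succeeds, but rounds to the nearest representable
# value, which is 2^64 (one more than the input).
as_float = parse(Float64, s)

# Step 3: reinterpret does not convert the number; it exposes the IEEE 754
# bit pattern of the Float64 (biased exponent 0x43f, mantissa all zeros).
bits = reinterpret(UInt64, as_float)

# Asking for the target type up front keeps the value exact.
exact = parse(UInt64, s)
```

Here `as_float == 2.0^64` and `bits == 0x43f0000000000000`, which matches the REPL output in the original report, while `exact == typemax(UInt64)`.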

This is a time when using the "typed" JSON3 API can really come in handy. In your original example, you can do JSON3.read(s, UInt64) and it "just works" as expected.

For more complicated JSON, I find it helpful when working with data to define a "row" type, like:

struct Row
    id::UInt64
    name::String
    address::String
end

Then, if my data is an array of rows, I can just do JSON3.read(s, Vector{Row}). Another big advantage is that reading the JSON this way will be somewhere around 50x-1000x faster. Go get that perf!
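
A minimal sketch of the typed read with the `Row` type might look like the following. It assumes a JSON3 release that relies on the StructTypes.jl package for registering custom structs (older releases may differ), and the sample JSON string is invented purely for illustration.

```julia
using JSON3, StructTypes

struct Row
    id::UInt64
    name::String
    address::String
end

# Assumption: with StructTypes.jl-based JSON3 versions, a plain immutable
# struct is registered like this so JSON3 knows how to construct it.
StructTypes.StructType(::Type{Row}) = StructTypes.Struct()

# Made-up sample data; the id is typemax(UInt64), which does not fit in Int64.
json = """[{"id": 18446744073709551615, "name": "a", "address": "b"}]"""

# The field types drive parsing, so `id` is read directly as a UInt64
# instead of going through the untyped Int64-then-Float64 fallback.
rows = JSON3.read(json, Vector{Row})
```

Because the target type is known for every field, `rows[1].id` should compare equal to `typemax(UInt64)` rather than to a rounded Float64.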

Yeah, I'm only starting to realize today the ridiculousness of the JSON standard and that it does not even distinguish between ints and floats. What I was hoping for was that this would fail when it realizes it cannot exactly represent the integer given. I'm no longer sure that would be entirely self-consistent.

Feel free to close if there's definitely nothing to be done here. Thanks for your help, as always!

Actually, I do think there's a bit of a problem here. There can be some rather unexpected behavior:

julia> s = JSON3.write(typemax(Int))
"9223372036854775807"

julia> JSON3.read(s)
9.223372036854776e18

julia> JSON3.read(s, Int)
9223372036854775807

I do think it's reasonable to expect this to give an exact result. If you convert the float to an Int implicitly somewhere without realizing it, this can lead to some pretty strange and unexpected behavior.
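
To make the hazard concrete, here is a small Base Julia sketch (no JSON3 involved; variable names are illustrative) of what happens once an integer has silently passed through Float64:

```julia
# typemax(Int) is 2^63 - 1, which Float64 cannot represent; it rounds up to 2^63.
n = typemax(Int)
f = Float64(n)

# Converting back does not recover n: 2^63 is out of range for Int64,
# so the checked conversion throws instead of returning a value.
err = try
    Int(f)
catch e
    e
end

# For smaller values the damage is silent: 2^53 + 1 rounds to 2^53,
# and the round trip returns a different integer without any error.
m = 2^53 + 1
g = Float64(m)
roundtrip = Int(g)
```

In this sketch `err` is an `InexactError`, while `roundtrip` equals `2^53` rather than `m`. The second case is the kind of quiet off-by-one an implicit conversion can introduce.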

Hmmm, that does seem unexpected; I'll take a look at that one

Ok, the egregious cases are now fixed in #125. I also added some more details on what exactly JSON3.jl supports number-parsing-wise in #124.