valyala / fastjson

Fast JSON parser and validator for Go. No custom structs, no code generation, no reflection


Parsing an IP address gives an invalid object model

alexus1024 opened this issue · comments

With the following test:

package fastjson_test

import (
	"testing"

	"github.com/stretchr/testify/require"
	"github.com/valyala/fastjson"
)

func TestFastjsonBug(t *testing.T) {
	str := "192.168.10.1"
	v, err := fastjson.Parse(str)
	require.NoError(t, err) // EXPECT an error here

	vStr := v.MarshalTo(nil)
	err = fastjson.ValidateBytes(vStr)
	require.NoError(t, err) // FAIL: unexpected tail: ".10.1"
}

I expected to get an error during parsing.
Instead, Parse returns a Value with an invalid JSON representation.

Confirmed by looking at the code.
https://github.com/valyala/fastjson/blob/master/parser.go#L427

parseRawNumber allows multiple instances of '.', 'e', 'E', '-', and similar characters within a single number.
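For context, RFC 8259 permits at most one fraction and one exponent per number. A minimal sketch of a strict check (an illustration of the grammar, not fastjson's actual code):

```go
package main

import (
	"fmt"
	"regexp"
)

// jsonNumber matches exactly one RFC 8259 number token: an optional
// minus, an integer part without leading zeros, an optional fraction,
// and an optional exponent.
var jsonNumber = regexp.MustCompile(
	`^-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?$`)

func main() {
	fmt.Println(jsonNumber.MatchString("192.168"))      // one dot: valid number
	fmt.Println(jsonNumber.MatchString("192.168.10.1")) // two dots: invalid
}
```

Under this grammar "192.168.10.1" is rejected, while a lenient scanner that merely consumes number-like characters accepts the first "192.168" and leaves the ".10.1" tail behind.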

@alexus1024 If your JSON requires validation, you should validate it before parsing. I assume these operations were kept separate to maximize parser efficiency while still providing a way to enforce correctness.

I added support for fastjson.Parse and fastjson.Validate to this test suite:
https://github.com/nst/JSONTestSuite

fastjson.Validate returns the correct result on every test.
fastjson.Parse fails to return an error in 52 cases.

SHOULD_HAVE_FAILED      n_number_++.json
SHOULD_HAVE_FAILED      n_number_+1.json
SHOULD_HAVE_FAILED      n_number_+Inf.json
SHOULD_HAVE_FAILED      n_number_-01.json
SHOULD_HAVE_FAILED      n_number_-1.0..json
SHOULD_HAVE_FAILED      n_number_-2..json
SHOULD_HAVE_FAILED      n_number_-NaN.json
SHOULD_HAVE_FAILED      n_number_.-1.json
SHOULD_HAVE_FAILED      n_number_.2e-3.json
SHOULD_HAVE_FAILED      n_number_0.1.2.json
SHOULD_HAVE_FAILED      n_number_0.3e+.json
SHOULD_HAVE_FAILED      n_number_0.3e.json
SHOULD_HAVE_FAILED      n_number_0.e1.json
SHOULD_HAVE_FAILED      n_number_0e+.json
SHOULD_HAVE_FAILED      n_number_0e.json
SHOULD_HAVE_FAILED      n_number_0_capital_E+.json
SHOULD_HAVE_FAILED      n_number_0_capital_E.json
SHOULD_HAVE_FAILED      n_number_1.0e+.json
SHOULD_HAVE_FAILED      n_number_1.0e-.json
SHOULD_HAVE_FAILED      n_number_1.0e.json
SHOULD_HAVE_FAILED      n_number_1eE2.json
SHOULD_HAVE_FAILED      n_number_2.e+3.json
SHOULD_HAVE_FAILED      n_number_2.e-3.json
SHOULD_HAVE_FAILED      n_number_2.e3.json
SHOULD_HAVE_FAILED      n_number_9.e+.json
SHOULD_HAVE_FAILED      n_number_expression.json
SHOULD_HAVE_FAILED      n_number_Inf.json
SHOULD_HAVE_FAILED      n_number_invalid+-.json
SHOULD_HAVE_FAILED      n_number_NaN.json
SHOULD_HAVE_FAILED      n_number_neg_int_starting_with_zero.json
SHOULD_HAVE_FAILED      n_number_neg_real_without_int_part.json
SHOULD_HAVE_FAILED      n_number_real_without_fractional_part.json
SHOULD_HAVE_FAILED      n_number_starting_with_dot.json
SHOULD_HAVE_FAILED      n_number_with_leading_zero.json
SHOULD_HAVE_FAILED      n_string_1_surrogate_then_escape_u.json
SHOULD_HAVE_FAILED      n_string_1_surrogate_then_escape_u1.json
SHOULD_HAVE_FAILED      n_string_1_surrogate_then_escape_u1x.json
SHOULD_HAVE_FAILED      n_string_backslash_00.json
SHOULD_HAVE_FAILED      n_string_escaped_ctrl_char_tab.json
SHOULD_HAVE_FAILED      n_string_escaped_emoji.json
SHOULD_HAVE_FAILED      n_string_escape_x.json
SHOULD_HAVE_FAILED      n_string_incomplete_escaped_character.json
SHOULD_HAVE_FAILED      n_string_incomplete_surrogate.json
SHOULD_HAVE_FAILED      n_string_incomplete_surrogate_escape_invalid.json
SHOULD_HAVE_FAILED      n_string_invalid-utf-8-in-escape.json
SHOULD_HAVE_FAILED      n_string_invalid_backslash_esc.json
SHOULD_HAVE_FAILED      n_string_invalid_unicode_escape.json
SHOULD_HAVE_FAILED      n_string_invalid_utf8_after_escape.json
SHOULD_HAVE_FAILED      n_string_unescaped_ctrl_char.json
SHOULD_HAVE_FAILED      n_string_unescaped_newline.json
SHOULD_HAVE_FAILED      n_string_unescaped_tab.json
SHOULD_HAVE_FAILED      n_string_unicode_CapitalU.json

Working as expected.
Recommend closing this issue.

After further testing, I think the best solution is a validating parser.

Running Validate and Parse separately is inefficient, since the input is scanned twice.
A better approach is to combine the validator and parser into a new, slightly less efficient parser that validates while parsing.

BenchmarkParse/medium/stdjson-map-12                      140197              7976 ns/op         292.02 MB/s       10091 B/op        206 allocs/op
BenchmarkParse/medium/stdjson-struct-12                   122487              8953 ns/op         260.15 MB/s        8646 B/op        239 allocs/op
BenchmarkParse/medium/stdjson-empty-struct-12             330585              3721 ns/op         625.92 MB/s         296 B/op          5 allocs/op
BenchmarkParse/medium/fastjson-12                        1810522               659.9 ns/op      3529.50 MB/s           0 B/op          0 allocs/op
BenchmarkParse/medium/fastjson-get-12                    1838865               666.7 ns/op      3493.11 MB/s           0 B/op          0 allocs/op
BenchmarkParse/medium/fastjson-validate-parser-12        1723939               690.8 ns/op      3371.66 MB/s           0 B/op          0 allocs/op
BenchmarkParse/medium/fastjson-validate-and-parse-12      982446              1267 ns/op        1838.35 MB/s           0 B/op          0 allocs/op

This new ValidateParser produces test results identical to the validator's, except that it also enforces a maximum depth.

Putting together a PR now.