OWASP / json-sanitizer

Given JSON-like content, the JSON Sanitizer converts it to valid JSON.

JsonSanitizer removes more than expected because of strict decimal format

Doenut opened this issue

I've added the JsonSanitizer to my project. I am trying to sanitize a file name that contains '.', '-', digits and text. The sanitizer chops off far more than I intended; for instance, 2.0-237.0 becomes 2.0.

Is there any way to fix this, or to make the sanitizer handle such input the way I expect? (I assume the sanitizer "thinks" it's a decimal number of some sort.)

Thank you!

Do you have a full sample input?
(Sent in reply to Doenut's message of Jun 30, 2015, 4:27 PM.)

These are the inputs paired with their sanitized version:
"2015-07-02 16:18" turns into "2015 "
"1.32-420.0" turns into "1.32"
"420.0.bla6.x86_64" turns into "420.0"
"bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext" turns into ""bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext""

To clarify: the quotes surrounding every input and output above only mark it as a string; they are not part of the actual input or output.

I tried this Java code

  @Test
  public final void testDoenutTestcases() {
    assertSanitized("2015 ", "2015-07-02 16:18");
    assertSanitized("\"2015-07-02 16:18\"", "\"2015-07-02 16:18");
    assertSanitized("1.32", "1.32-420.0");
    assertSanitized("\"1.32-420.0\"", "\"1.32-420.0\"");
    assertSanitized("420.0", "420.0.bla6.x86_64");
    assertSanitized("\"420.0.bla6.x86_64\"", "\"420.0.bla6.x86_64\"");
    assertSanitized(
        "\"bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext\"",
        "bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext");
  }

and it passes.

The tests whose input is quoted preserve the string contents, and the
ones whose input looks like a number followed by junk keep the number
and drop the junk.

The JSON sanitizer takes JSON-like content and returns JSON.

The unquoted string 2015-07-02 16:18 is not really JSON-like
content. It's a date. If you want the JSON string corresponding to
that date, then you should probably use JSON.stringify instead of
trying to sanitize non-JSON content.
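
JSON.stringify is a JavaScript function, and the Java standard library has no direct equivalent. For Java callers, the same idea can be sketched as a small helper that wraps a raw string in quotes and escapes it, so that the result is a JSON string literal rather than something number-like. The class name JsonStringQuoter and its quote method are hypothetical names chosen for this illustration; they are not part of json-sanitizer. In practice, a JSON library's own string encoder would likely be a better choice.

```java
/**
 * Minimal sketch (hypothetical helper, not part of json-sanitizer):
 * encodes a raw Java string as a JSON string literal.
 */
final class JsonStringQuoter {
  private JsonStringQuoter() {}

  /** Returns the input wrapped in double quotes with JSON escaping applied. */
  static String quote(String raw) {
    StringBuilder sb = new StringBuilder(raw.length() + 2);
    sb.append('"');
    for (int i = 0; i < raw.length(); i++) {
      char c = raw.charAt(i);
      switch (c) {
        case '"':  sb.append("\\\""); break;  // escape embedded quotes
        case '\\': sb.append("\\\\"); break;  // escape backslashes
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        default:
          if (c < 0x20) {
            // Remaining control characters must be escaped in JSON;
            // emit them as a six-character hexadecimal escape.
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
      }
    }
    sb.append('"');
    return sb.toString();
  }

  public static void main(String[] args) {
    // The raw file-name fragments from this issue, emitted as JSON strings.
    System.out.println(quote("2015-07-02 16:18"));
    System.out.println(quote("1.32-420.0"));
    System.out.println(quote("420.0.bla6.x86_64"));
  }
}
```

The quoted results have the same shape as the quoted inputs in the test above, which the sanitizer returned with their contents intact.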

(Sent in reply to Doenut's message of 2015-07-02 9:42 GMT-04:00.)