JsonSanitizer removes more than expected because of strict decimal format
Doenut opened this issue
I've added the JsonSanitizer to my project. I am trying to parse a file name that contains various '.' and '-' characters, numbers, and text. The sanitizer chops off far more than I intended; for instance, 2.0-237.0 becomes 2.0.
Is there any way to fix this, or to make the sanitizer handle such input the way I expect? (I assume the sanitizer "thinks" it's a decimal number of some sort.)
Thank you!
Do you have a full sample input?
These are the inputs paired with their sanitized version:
"2015-07-02 16:18" turns into "2015 "
"1.32-420.0" turns into "1.32"
"420.0.bla6.x86_64" turns into "420.0"
"bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext" turns into ""bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext""
For clarification: note that the "" surrounding every input and output only signify that it is a string and it is not part of the actual input or output.
I tried this Java code:

    @Test
    public final void testDoenutTestcases() {
      assertSanitized("2015 ", "2015-07-02 16:18");
      assertSanitized("\"2015-07-02 16:18\"", "\"2015-07-02 16:18");
      assertSanitized("1.32", "1.32-420.0");
      assertSanitized("\"1.32-420.0\"", "\"1.32-420.0\"");
      assertSanitized("420.0", "420.0.bla6.x86_64");
      assertSanitized("\"420.0.bla6.x86_64\"", "\"420.0.bla6.x86_64\"");
      assertSanitized(
          "\"bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext\"",
          "bla-blabla-blablabla-1.32-420.0.bla6.x86_64.ext");
    }

and it passes.
The tests that have quotes preserve the string contents, and the ones
that look like numbers followed by junk drop the junk.
The JSON sanitizer takes JSON-like content and returns JSON.
The string, 2015-07-02 16:18, without quotes is not really JSON-like
content. It's a date. If you want the JSON string corresponding to
that date, then you should probably use JSON.stringify instead of
trying to sanitize non-JSON content.
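
For Java callers, the equivalent of JSON.stringify for a single string is to encode the raw value as a JSON string literal before it reaches any JSON consumer. Below is a minimal sketch of that idea. The class name `JsonStrings` and the method `quote` are hypothetical helpers written for illustration; they are not part of the JSON sanitizer, and in practice a JSON library's own string encoder would likely be the better choice.

```java
// Hypothetical helper, not part of the JSON sanitizer API.
final class JsonStrings {
    private JsonStrings() {}

    /** Returns s as a JSON string literal, including the surrounding quotes. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                case '\b': sb.append("\\b"); break;
                case '\f': sb.append("\\f"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be \\u-escaped in JSON.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        sb.append('"');
        return sb.toString();
    }
}
```

With this sketch, `JsonStrings.quote("1.32-420.0")` returns the text `"1.32-420.0"` with the quotes included, which matches the quoted inputs in the test above that pass through the sanitizer with their contents intact.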