Some more issues about `to_chars`

Question

jk-jeon opened this issue 8 months ago

It seems that std::to_chars returns errc::value_too_large rather than errc::result_out_of_range. Is this simply a mistake or do you have a reason for using errc::result_out_of_range instead? In the code I edited I used errc::result_out_of_range for consistency.
The to_chars overloads taking only buffer & value, taking the format param as well, and taking the precision param as well should all be separate overloads. The first and the second should be separate because the first should behave differently from the second with fmt == chars_format::general: it needs to select whichever representation is shortest, which is not always equal to chars_format::general's output. The second and the third should be separate because, by the spec, a negative precision should be treated as if precision == 6: the spec of std::printf says that a negative precision is ignored, which means it falls back to the default, which is 6. I guess this is quite stupid, but well, it seems that's how the spec is written anyway.

EDIT: Ah, forgot to mention. When fixed format is chosen for the shortest representation, the output of Dragonbox may not be the correctly rounded one, because there can be trailing zeros in the integer part. (See fmtlib/fmt#3649 for a related discussion in libfmt.)

Peter Dimov · Answer 1 · Tue Feb 20 2024 13:26:37 GMT+0800 (China Standard Time)

> It seems that std::to_chars returns errc::value_too_large rather than errc::result_out_of_range. Is this simply a mistake or do you have a reason for using errc::result_out_of_range instead?

That's how std::to_chars is specified, and it's consistent with POSIX error handling. ERANGE is when a result can't fit in the range of the numeric output; EOVERFLOW is when there's a buffer overflow.
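
A minimal sketch of the buffer-too-small case under discussion. The helper `format_into` is hypothetical, and the integer overload is used only because it is the most widely available one; the floating-point overloads of `std::to_chars` are specified to report the same error code when the output does not fit.

```cpp
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical helper (not part of any library): writes `value` into a
// buffer of `capacity` bytes and returns the error code from std::to_chars.
// On success, `out` receives the characters that were written.
std::errc format_into(int value, std::size_t capacity, std::string& out) {
    std::string buf(capacity, '\0');
    auto [ptr, ec] = std::to_chars(buf.data(), buf.data() + buf.size(), value);
    if (ec == std::errc{}) {
        out.assign(buf.data(), ptr);
    }
    return ec;
}
```

Calling `format_into(123456, 3, s)` should yield `std::errc::value_too_large`, since six digits cannot fit in three bytes, while a 16-byte buffer succeeds.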

Junekey Jeon · Answer 2 · Tue Feb 20 2024 13:54:27 GMT+0800 (China Standard Time)

@pdimov I mean, boost::charconv::to_chars returns errc::result_out_of_range while std::to_chars seems to be supposed to return errc::value_too_large. Is there a reason for this divergence?

Peter Dimov · Answer 3 · Tue Feb 20 2024 14:05:02 GMT+0800 (China Standard Time)

That should be a bug, then.

Matt Borland · Answer 4 · Tue Feb 20 2024 15:50:18 GMT+0800 (China Standard Time)

> Is there a reason for this divergence?

No; it will be fixed by the linked PR.

Matt Borland · Answer 5 · Tue Feb 20 2024 16:09:13 GMT+0800 (China Standard Time)

> The to_chars overloads taking only buffer & value, taking the format param as well, and taking the precision param as well should all be separate overloads. The first and the second should be separate because the first should behave differently from the second with fmt == chars_format::general: it needs to select whichever representation is shortest, which is not always equal to chars_format::general's output.

I believe the difference is in using 1e-4 as the crossover point between fixed and scientific, as in your last issue, right? Since the goal is the absolute minimum number of characters? https://godbolt.org/z/bnscGEbMn
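
A small sketch for comparing the two overloads side by side. The helper names `plain` and `general` are hypothetical, and the code assumes a standard library that implements the floating-point overloads of `std::to_chars` (for example recent GCC or MSVC).

```cpp
#include <charconv>
#include <string>

// Hypothetical helper: the overload without a format argument, which is
// specified to pick the shortest round-trip representation.
std::string plain(double x) {
    char buf[64];
    auto r = std::to_chars(buf, buf + sizeof buf, x);
    return std::string(buf, r.ptr);
}

// Hypothetical helper: the overload with chars_format::general, which
// follows the %g rule for choosing between fixed and scientific.
std::string general(double x) {
    char buf[64];
    auto r = std::to_chars(buf, buf + sizeof buf, x, std::chars_format::general);
    return std::string(buf, r.ptr);
}
```

On an implementation that follows the standard's wording, `plain(1e-4)` should give `1e-04` (the exponent is written with at least two digits) while `general(1e-4)` gives `0.0001`, which is one character longer.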

> The second and the third should be separate because, by the spec, a negative precision should be treated as if precision == 6: the spec of std::printf says that a negative precision is ignored, which means it falls back to the default, which is 6. I guess this is quite stupid, but well, it seems that's how the spec is written anyway.

That does seem to be the case: https://godbolt.org/z/6xPMKeTcM
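
The printf rule that this behavior is derived from can be checked directly. The sketch below uses `std::snprintf` rather than `to_chars`, and the helper names are hypothetical; it only illustrates that a negative precision passed through `.*` is taken as if the precision were omitted.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats x in fixed notation with a runtime precision.
std::string fixed_with_precision(double x, int precision) {
    char buf[512];
    int n = std::snprintf(buf, sizeof buf, "%.*f", precision, x);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof buf) return {};
    return std::string(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: same conversion with the precision omitted,
// so the default precision of 6 applies.
std::string fixed_default(double x) {
    char buf[512];
    int n = std::snprintf(buf, sizeof buf, "%f", x);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof buf) return {};
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Here `fixed_with_precision(3.14159265358979, -1)` and `fixed_default(3.14159265358979)` both produce `3.141593`.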

Junekey Jeon · Answer 6 · Wed Feb 21 2024 17:00:22 GMT+0800 (China Standard Time)

> I believe the difference is in using 1e-4 as the crossover point between fixed and scientific, as in your last issue, right? Since the goal is the absolute minimum number of characters? https://godbolt.org/z/bnscGEbMn

Yeah, 1e-4 should be printed as 1e-4 because it's shorter than 0.0001, and similarly 1e-3 is preferred over 0.001. But those are the "easy" cases. The real headaches are cases like the ones described in the fmt issue I linked (such as 123456792.0f, in case it was not clear).
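
To make the 123456792.0f case concrete: floats near 1.2e8 are spaced 8 apart, so 123456792 is exactly representable, and its shortest round-trip digits are 1.2345679e8. Padding those digits with a zero in fixed notation gives 123456790, which still parses back to the same float but is not the decimal string closest to the exact value. The sketch below checks only the round-trip part, using a hypothetical helper `same_float`.

```cpp
#include <cstdlib>

// Hypothetical helper: true if both decimal strings parse to the same float.
bool same_float(const char* a, const char* b) {
    return std::strtof(a, nullptr) == std::strtof(b, nullptr);
}
```

`same_float("123456790", "123456792")` holds, so both strings are nine-character fixed outputs that round-trip; only the second one equals the stored value exactly.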