Consider improving error handling and `StatsError` type
FreezyLemon opened this issue · comments
`enum StatsError` is supposed to be an "Enumeration of possible errors thrown within the `statrs` library." This indicates that it should not try to be a generic error type for any kind of statistics calculations, but instead only concern itself with the errors produced in `statrs`.
With that basic assumption, there are currently some inconsistencies and outdated practices though (API guidelines for reference):
- Add test that `StatsError` implements both `Sync` and `Send` (this is more of a formality, but is trivial to implement and good future-proofing)
- `Error::description` is deprecated and should not be implemented
- There are a few unused variants (`ArgNotNegative`, `ArgIntervalExclMin`, etc.), these are leftovers from older versions and should be removed because they're no longer needed by `statrs`.
- There's at least one case of a function returning a `Result<T>` that cannot return an error: `Empirical::new`. There's no real reason for this infallible API to return a `Result<T, E>`. There might be others.
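To make the first two points concrete, here is a minimal sketch of what the `Send`/`Sync` test could look like. The `StatsError` below is a cut-down, hypothetical stand-in (two variants only), not the crate's actual definition, and `assert_send_sync` is a helper name I made up for illustration.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical, reduced stand-in for the crate's error enum.
#[derive(Debug, Clone, PartialEq)]
pub enum StatsError {
    BadParams,
    ArgMustBePositive(&'static str),
}

impl fmt::Display for StatsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StatsError::BadParams => write!(f, "bad distribution parameters"),
            StatsError::ArgMustBePositive(name) => {
                write!(f, "argument {} must be positive", name)
            }
        }
    }
}

// No `description` override: the deprecated method keeps its default,
// and `Display` carries the message instead.
impl Error for StatsError {}

// Hypothetical helper: only compiles when `T` is both `Send` and `Sync`.
pub fn assert_send_sync<T: Send + Sync>() {}

#[test]
fn stats_error_is_send_and_sync() {
    // The check happens at compile time; the call body does nothing.
    assert_send_sync::<StatsError>();
}
```

If a later change adds a non-`Send` field (an `Rc`, for example), this test stops compiling, which is the future-proofing mentioned above.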
I realize that most of these are breaking changes, but seeing that the crate is pre-1.0, I don't think there's a big problem doing this.
Other things that could be improved:
`StatsError` is big: 40 bytes on a Linux x64 target. This is because there are variants which contain 2x `&str` (2 x 16 = 32 bytes plus discriminant and padding). Is it really necessary to have strings in the error type? The implementation could be replaced mostly 1:1 with some sort of `ArgName` enum, but there might be an even better solution that does not need this argument name wrapping at all.
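A rough way to check the size argument, assuming two made-up enums: `StrError` mimics variants holding two string slices, and `CompactError` swaps them for a small `ArgName` enum. All names and variants here are illustrative assumptions, not the crate's real ones, and exact byte counts depend on the target, so the sketch only compares the two.

```rust
use std::mem::size_of;

// Hypothetical shape of the current error: some variants carry two `&str`.
#[allow(dead_code)]
pub enum StrError {
    BadParams,
    ArgGt(&'static str, &'static str),
    ArgLt(&'static str, &'static str),
}

// Hypothetical replacement for the argument-name strings.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArgName {
    Mean,
    StdDev,
    Shape,
    Rate,
}

// Same variants, but argument names are a fieldless enum instead of strings.
#[allow(dead_code)]
pub enum CompactError {
    BadParams,
    ArgGt(ArgName, ArgName),
    ArgLt(ArgName, ArgName),
}

/// Returns (size of the string-based error, size of the compact error) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<StrError>(), size_of::<CompactError>())
}
```

Calling `sizes()` on the target you care about shows how much the two `&str` fields cost compared to the enum-based variant.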
All `new` functions seem to just return `StatsError::BadParams` whenever the params are outside of the defined/allowed range. Is there a good reason for these to be so vague when compared to the more specific errors returned by other functions? After all, the more specific errors already exist, why not return more exact error information? There might even be value in providing multiple error types, to have errors that are more specific to the exact API being used.
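As a sketch of the "multiple error types" idea, a constructor could report which parameter was rejected. Everything below is an assumption for illustration: `NormalError`, its variants, and this simplified `Normal` are hypothetical and do not mirror the crate's current API or its exact validation rules.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical per-distribution error type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NormalError {
    MeanInvalid,
    StdDevInvalid,
}

impl fmt::Display for NormalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            NormalError::MeanInvalid => write!(f, "mean must not be NaN"),
            NormalError::StdDevInvalid => {
                write!(f, "standard deviation must be positive and not NaN")
            }
        }
    }
}

impl Error for NormalError {}

// Simplified stand-in for a distribution struct.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Normal {
    mean: f64,
    std_dev: f64,
}

impl Normal {
    /// Validates each parameter separately so the caller learns which one failed.
    pub fn new(mean: f64, std_dev: f64) -> Result<Self, NormalError> {
        if mean.is_nan() {
            return Err(NormalError::MeanInvalid);
        }
        if std_dev.is_nan() || std_dev <= 0.0 {
            return Err(NormalError::StdDevInvalid);
        }
        Ok(Normal { mean, std_dev })
    }

    pub fn mean(&self) -> f64 {
        self.mean
    }

    pub fn std_dev(&self) -> f64 {
        self.std_dev
    }
}
```

With this shape, `Normal::new(0.0, -1.0)` yields `Err(NormalError::StdDevInvalid)` rather than a generic "bad params" error.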
I do see good reason for all of these and I'd be open to making changes for all but the infallible `new` not returning a result.
All other public structs defined in the `distribution` module have a `new` method implemented that returns `Result`, so it does provide consistency. Perhaps if `Empirical` were in a different module or its `new` had a different name?
> All other public structs defined in the `distribution` module have a `new` method implemented that returns `Result`, so it does provide consistency. Perhaps if `Empirical` were in a different module or its `new` had a different name?
I see your point about consistency. I would personally value the expressiveness ("this call cannot fail") over the consistency ("all constructors return a Result and might need error handling"), but it doesn't matter much tbh.
Hmm, I'm not sure about renaming the `new` function, it's a widespread naming convention in the Rust ecosystem and what most users would expect. Maybe a similar name like `new_<something>` so people can quickly find it in their IDEs.
Perhaps an `impl Default` over having `Empirical::new`?
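For reference, a minimal sketch of how the two options can coexist, using a hypothetical, heavily simplified `Empirical` (a plain `Vec<f64>` of samples, which is an assumption and not the crate's real internals): `Default` is derived, and an infallible `new` just forwards to it.

```rust
// Hypothetical, simplified stand-in for the empirical distribution.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Empirical {
    samples: Vec<f64>,
}

impl Empirical {
    /// Infallible constructor: returns `Self`, not `Result<Self, _>`,
    /// because there are no parameters that could be invalid.
    pub fn new() -> Self {
        Self::default()
    }

    /// Records one observation.
    pub fn add(&mut self, x: f64) {
        self.samples.push(x);
    }

    pub fn len(&self) -> usize {
        self.samples.len()
    }

    pub fn is_empty(&self) -> bool {
        self.samples.is_empty()
    }
}
```

Callers can then write either `Empirical::new()` or `Empirical::default()` without any error handling, and the signature itself documents that the call cannot fail.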
Regardless, the overall discussion you bring up on the error type is valid. I'd merge an effort that does any of
- scaling it down
- making it adhere closer to API guidelines
- returning `StatsError::BadParams` less where possible.