Creating a PIT class instance crashes if the ensemble produces a NaN value

drewoldag opened this issue a year ago

Bug report
To generate the PIT samples when instantiating the PIT class, we evaluate the CDF of each distribution in the input ensemble at the given truth value.

Sometimes the CDF of a distribution is NaN at the true value (which may indicate a poorly performing model). These NaNs cause a runtime error at the end of initialization, when the output ensemble is created.
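
A minimal sketch of how a single NaN can contaminate everything computed downstream, assuming the PIT samples are summarized with np.quantile (the call mentioned later in this report). The names `pit_samps` and `quants` are hypothetical stand-ins for illustration, not the actual class attributes, and the values are made up.

```python
import numpy as np

# Hypothetical PIT samples; one CDF evaluation came back as NaN.
pit_samps = np.array([0.12, 0.47, np.nan, 0.81, 0.93])

# Hypothetical quantile grid used to summarize the samples.
quants = np.linspace(0.0, 1.0, 5)

# np.quantile propagates NaN: one bad sample makes every quantile NaN.
plain = np.quantile(pit_samps, quants)
print(plain)
```

Anything built from `plain` would then receive only NaN values, which is consistent with a failure showing up late in initialization rather than where the NaN first appears.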

There are two initial options under consideration:

1. Replace NaN values in the self._pit_samps array with 0.
2. Filter out NaN values in the self._pit_samps array.

Option 1 can be accomplished with a mask. Option 2 can be taken care of by using np.nanquantile(...) instead of np.quantile(...). np.nanquantile doesn't do anything magical under the hood; it simply removes the NaNs. (Technically, it replaces the NaN values with elements from the end of the input array and returns an array that is shorter by n elements, where n is the number of NaNs.)
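
The two options can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `pit_samps` and `quants` are hypothetical stand-ins for self._pit_samps and the quantile grid, and the sample values are made up.

```python
import numpy as np

# Hypothetical PIT samples containing one NaN.
pit_samps = np.array([0.12, 0.47, np.nan, 0.81, 0.93])
quants = np.linspace(0.0, 1.0, 5)

# Boolean mask marking the NaN entries.
nan_mask = np.isnan(pit_samps)

# Option 1: replace NaNs with 0 using the mask, then take quantiles.
replaced = np.where(nan_mask, 0.0, pit_samps)
q_replaced = np.quantile(replaced, quants)

# Option 2: ignore NaNs by calling np.nanquantile on the raw samples.
q_filtered = np.nanquantile(pit_samps, quants)

# Equivalent explicit form of option 2: drop the NaNs, then np.quantile.
q_explicit = np.quantile(pit_samps[~nan_mask], quants)

print(q_replaced)
print(q_filtered)
```

The two options are not interchangeable. Option 1 keeps the sample count but adds artificial mass at 0, which shifts the low quantiles. Option 2 leaves the remaining values untouched but computes the quantiles from fewer samples.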

Before submitting
Please check the following:

I have described the situation in which the bug arose, including what code was executed, information about my environment, and any applicable data others will need to reproduce the problem.
I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.