Improve the fit of the Beta distribution: Use the new `loc` and `scale`
npatki opened this issue
Neha Patki commented
Problem Description
In the beta univariate fit function, we perform the following steps:
1. Estimate the `loc` and `scale` parameters
2. Call the scipy `fit` function using the `loc` and `scale` as starting guesses
3. After the fit is complete, get the values for `a` and `b`
The issue is that step 3 also returns new `loc` and `scale` parameters. The ones we input are just starting guesses. When we use the same `loc` and `scale` as step 1, they are out of sync with the `a` and `b` parameters.
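To make the mismatch concrete, here is a minimal sketch (not the library's code) that runs the same steps on synthetic data and prints the starting guesses next to the values the fit returns. The sample, seed, and variable names (`loc_guess`, `scale_guess`, `loc_fit`, `scale_fit`) are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import beta

# Synthetic sample; the true parameters here are arbitrary illustration values.
rng = np.random.default_rng(0)
X = beta.rvs(2.0, 5.0, loc=1.0, scale=3.0, size=1000, random_state=rng)

# Step 1: rough estimates taken from the data range.
loc_guess = np.min(X)
scale_guess = np.max(X) - loc_guess

# Step 2: loc= and scale= are only starting points for the optimizer.
a, b, loc_fit, scale_fit = beta.fit(X, loc=loc_guess, scale=scale_guess)

# Step 3: the fit returns four values, not just a and b.
print(f"starting guesses: loc={loc_guess:.4f}, scale={scale_guess:.4f}")
print(f"returned by fit:  loc={loc_fit:.4f}, scale={scale_fit:.4f}")
print(f"shape parameters: a={a:.4f}, b={b:.4f}")
```

If the two printed `loc`/`scale` pairs differ, then storing the guesses alongside the fitted `a` and `b` describes a different distribution from the one scipy actually fit.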
Expected behavior
Stop using the initial guesses for `loc` and `scale`. Update them when setting `a` and `b`.
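i.e. change line 30:

    def _fit(self, X):
        # Initial guesses only; scipy refines them during the fit.
        loc = np.min(X)
        scale = np.max(X) - loc
        # Keep all four fitted values so loc/scale stay in sync with a/b.
        a, b, loc, scale = beta.fit(X, loc=loc, scale=scale)
        self._params = {
            'loc': loc,
            'scale': scale,
            'a': a,
            'b': b
        }
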
Additional context
We verified this change by comparing our fit distribution to scipy. Scipy's fit is better because it's actually updating the `loc` and `scale` parameters.
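One possible way to run a comparison of this kind is sketched below. It is not necessarily the check that was actually performed: it uses the Kolmogorov-Smirnov statistic from `scipy.stats.kstest` as an assumed measure of fit quality, on synthetic data with arbitrary parameters. A smaller statistic means the fitted CDF is closer to the empirical CDF of the sample.

```python
import numpy as np
from scipy.stats import beta, kstest

# Synthetic sample; the true parameters are arbitrary illustration values.
rng = np.random.default_rng(1)
X = beta.rvs(2.0, 5.0, loc=1.0, scale=3.0, size=1000, random_state=rng)

loc_guess = np.min(X)
scale_guess = np.max(X) - loc_guess
a, b, loc_fit, scale_fit = beta.fit(X, loc=loc_guess, scale=scale_guess)

# Behavior described in the problem: fitted a, b paired with the initial guesses.
ks_mixed = kstest(X, beta(a, b, loc=loc_guess, scale=scale_guess).cdf)

# Proposed behavior: all four values taken from the same fit.
ks_synced = kstest(X, beta(a, b, loc=loc_fit, scale=scale_fit).cdf)

print("KS statistic, a/b with initial guesses:", ks_mixed.statistic)
print("KS statistic, all four fitted values:  ", ks_synced.statistic)
```

The exact numbers depend on the sample and the scipy version, so treat the output as a diagnostic rather than a fixed expected result.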