Using transform_regression with method "poly" and order 0 returns 3rd order fit
thomascamminady opened this issue · comments
I initially filed this bug for altair. But since filing it there, I tried to find out where the issue stems from, and I think it might relate to vega. Original issue: altair-viz/altair#3012
I wrote:
I want to use transform_regression to fit a constant through my data. My understanding is that I should use method="poly" and order=0, since a 0th order polynomial is a constant. However, when using order=0 it seems like transform_regression defaults back to order=3, ...
I wonder whether the code below might be responsible for what I observe: is it possible that _.order || 3 is evaluated as 3 if _.order = 0? I don't know JavaScript and just tried to figure out whether this is possible, because 0 gets converted to false and then false || 3 gets evaluated as 3. But it is entirely possible that I am wrong.
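As far as plain JavaScript semantics go, the suspicion holds: 0 is falsy, so the || operator discards it in favor of the right-hand operand. A minimal sketch, using hypothetical parameter objects as stand-ins for the transform's _ object (they are not taken from the Vega source):

```javascript
// Hypothetical stand-ins for the transform parameters (`_` in the Vega source).
const zeroOrder = { order: 0 };
const noOrder = {};
const quadratic = { order: 2 };

// `a || b` yields b whenever a is falsy; 0, undefined, null, NaN and '' are all falsy.
const fromZero = zeroOrder.order || 3;    // 0 is falsy, so the default wins
const fromMissing = noOrder.order || 3;   // undefined is falsy, so the default is used
const fromTwo = quadratic.order || 3;     // 2 is truthy, so it is kept

console.log(fromZero, fromMissing, fromTwo); // prints "3 3 2"
```

Only the first case is a problem: an explicit order of 0 is treated the same as a missing order.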
const source = pulse.materialize(pulse.SOURCE).source,
groups = partition(source, _.groupby),
names = (_.groupby || []).map(accessorName),
method = _.method || 'linear',
order = _.order || 3,
dof = degreesOfFreedom(method, order),
as = _.as || [accessorName(_.x), accessorName(_.y)],
fit = Methods[method],
values = [];
found here:
Would something like the following snippet be a fix?
order = (_.order != null) ? _.order : 3;
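To check that a null test like the one above keeps an explicit 0, here is a small sketch. The helper names resolveOrder and resolveOrderNullish are hypothetical and only illustrate the idea; this is not necessarily the code that ended up in Vega.

```javascript
// Hypothetical helper mirroring the proposed check: fall back to 3 only
// when the order is null or undefined.
function resolveOrder(params) {
  return params.order != null ? params.order : 3;
}

// Same behavior written with the nullish coalescing operator (ES2020).
function resolveOrderNullish(params) {
  return params.order ?? 3;
}

console.log(resolveOrder({ order: 0 }), resolveOrder({}), resolveOrder({ order: null })); // prints "0 3 3"
console.log(resolveOrderNullish({ order: 0 }), resolveOrderNullish({})); // prints "0 3"
```

The loose comparison != null matches both null and undefined, which is why a single test is enough.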
Fixed in #3717, but I don't know whether order 0 actually works, since there is no case for 0 here:
Thanks!
I guess there could be something like
if (order === 0) return constant(data, x, y);
added, with the constant implementation looking something like this:
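import rSquared from './r-squared';

export default function(data, x, y) {
  // x and y are accessor functions, as in linear.js; average the valid y values.
  // Simplified validity check (assumption): skip null, undefined and NaN.
  let sum = 0, n = 0;
  for (const d of data) {
    const v = y(d);
    if (v != null && !Number.isNaN(+v)) { sum += +v; ++n; }
  }
  const meanY = n ? sum / n : 0,
        coef = [meanY], // a 0th-order polynomial has a single coefficient
        predict = () => meanY;
  return {
    coef: coef,
    predict: predict,
    rSquared: rSquared(data, x, y, meanY, predict)
  };
}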
This is similar to linear (https://github.com/vega/vega/blob/main/packages/vega-statistics/src/regression/linear.js).
I don't know if what I wrote is valid JavaScript, but from a mathematical point of view, the constant fit would just return the mean of the y values.
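The mathematical claim can be checked without any Vega code: among all constants, the mean of the y values gives the smallest sum of squared errors. A standalone sketch on plain arrays follows; constantFit and sse are hypothetical names used for illustration and are not part of vega-statistics.

```javascript
// Hypothetical helper: least-squares fit of a 0th-order polynomial (a constant).
function constantFit(ys) {
  const meanY = ys.reduce((sum, v) => sum + v, 0) / ys.length;
  return { coef: [meanY], predict: () => meanY };
}

// Sum of squared errors of the constant c against the data.
function sse(ys, c) {
  return ys.reduce((sum, v) => sum + (v - c) ** 2, 0);
}

const ys = [1, 2, 3, 6];
const fit = constantFit(ys);

console.log(fit.coef[0]);                         // prints 3
console.log(sse(ys, fit.predict()));              // prints 14
console.log(sse(ys, 2.5), sse(ys, 3.5));          // prints "15 15"
```

Moving the constant away from the mean in either direction increases the error, which is what a least-squares fit of order 0 should minimize.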
It would be something like #3718 but I need some help to finish it.
The issue will be auto-closed when the pull request is merged.