vega / vega

A visualization grammar.

Home Page:https://vega.github.io/vega

Using transform_regression with method "poly" and order 0 returns 3rd order fit

thomascamminady opened this issue · comments

commented

I initially filed this bug for altair. But since filing it there, I tried to find out where the issue stems from and I think it might relate to vega. Original issue: altair-viz/altair#3012

I wrote:

I want to use transform_regression to fit a constant through my data. My understanding is that I should use method="poly" and order=0, since a 0th order polynomial is a constant. However, when using order=0 it seems like transform_regression defaults back to order=3, ...

I wonder whether the code below might be responsible for what I observe: Is it possible that _.order || 3 is evaluated as 3 if _.order=0? I don't know JavaScript and just tried to figure out if this is possible because 0 gets converted to false and then false||3 gets evaluated as 3. But it is entirely possible that I am wrong.
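The suspicion above can be checked in isolation. This is a minimal sketch, assuming plain objects standing in for the transform's parameter object `_`; the names `params` and `empty` are made up for illustration and are not part of Vega.

```javascript
// `params` is a hypothetical stand-in for the parameter object `_`.
const params = { order: 0 };

// 0 is falsy, so the || operator skips it and yields the right-hand side.
const withOr = params.order || 3;

// A parameter that was never set is undefined, which is also falsy.
const empty = {};
const unset = empty.order || 3;

console.log(withOr); // 3
console.log(unset);  // 3
```

Both cases print 3, so with `||` an explicit order of 0 cannot be told apart from a missing order.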

const source = pulse.materialize(pulse.SOURCE).source,
      groups = partition(source, _.groupby),
      names = (_.groupby || []).map(accessorName),
      method = _.method || 'linear',
      order = _.order || 3,
      dof = degreesOfFreedom(method, order),
      as = _.as || [accessorName(_.x), accessorName(_.y)],
      fit = Methods[method],
      values = [];

found here:

order = _.order || 3,

Would something like the following snippet be a fix?

order = (_.order != null) ? _.order : 3;
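To see how the proposed check behaves for the relevant inputs, here is a small sketch. `resolveOrder` is a hypothetical helper, not a Vega function; it only isolates the expression above so it can be run on its own.

```javascript
// Hypothetical helper (not part of Vega): picks the polynomial order,
// falling back to 3 only when no order was supplied.
function resolveOrder(params) {
  // `!= null` is false for both null and undefined, and true for 0.
  return (params.order != null) ? params.order : 3;
}

console.log(resolveOrder({ order: 0 }));    // 0
console.log(resolveOrder({ order: 2 }));    // 2
console.log(resolveOrder({}));              // 3
console.log(resolveOrder({ order: null })); // 3
```

In engines that support the nullish coalescing operator (ES2020), `params.order ?? 3` gives the same results.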

Fixed in #3717 but I don't know whether order 0 actually works since there is no case for 0 here:

// use more efficient methods for lower orders
.

commented

Thanks!
I guess there could be something like

 if (order === 0) return constant(data, x, y);

added with the constant implementation looking something like this:

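import {visitPoints} from './points';
import rSquared from './r-squared';

export default function(data, x, y) {
  // Running mean of the y values.
  // Assumes visitPoints(data, x, y, callback) behaves as it does in linear.js.
  let meanY = 0, n = 0;
  visitPoints(data, x, y, (dx, dy) => {
    meanY += (dy - meanY) / ++n;
  });

  // A 0th-order polynomial has a single coefficient: the constant itself.
  const coef = [meanY],
        predict = () => meanY;

  return {
    coef: coef,
    predict: predict,
    rSquared: rSquared(data, x, y, meanY, predict)
  };
}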

This is similar to linear (https://github.com/vega/vega/blob/main/packages/vega-statistics/src/regression/linear.js)

I don't know if what I wrote is valid JavaScript, but from a mathematical point of view, the constant fit would just return the mean of the y values.
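For reference, a short derivation of why the least-squares constant is the mean. It assumes ordinary least squares over $n$ points $(x_i, y_i)$ and writes the fitted constant as $c$; these symbols are introduced here for illustration only.

```latex
S(c) = \sum_{i=1}^{n} (y_i - c)^2,
\qquad
\frac{dS}{dc} = -2 \sum_{i=1}^{n} (y_i - c) = 0
\;\Longrightarrow\;
c = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}.
```

Since the prediction then equals $\bar{y}$ everywhere, the residual sum of squares equals the total sum of squares, so $R^2 = 1 - \mathrm{SSE}/\mathrm{SST} = 0$ whenever the $y$ values are not all identical.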

It would be something like #3718 but I need some help to finish it.

commented

Thanks, #3718 looks great!

@domoritz Shall I close this issue here?

The issue will be auto-closed when the pull request is merged.