josdejong / typed-function

Runtime type-checking for JavaScript functions

Type conversion and `any`

Hypercubed opened this issue · comments

Hello again,

I've found what I think is another inconsistency between v1 and v0...

Given the following:

const typed = require("typed-function")

function MyNumber(value) {
    this.value = value;
    this.isMyNumber = true;
}

typed.addType({
  name: 'MyNumber',
  test: function (x) {
    return x && x.isMyNumber === true;
  }
});

typed.addConversion({
  from: 'number',
  to: 'MyNumber',
  convert: function (x) {
     return new MyNumber(x);
  }
});

const fn1 = typed({
  'MyNumber, string': function (a, b) {
    return 'a is a MyNumber, b is a string';
  },
  'any, string': function (a, b) {
    return 'a is a any, b is a string';
  }
});

console.log(fn1(3, 'str'));

In v0 this will print "a is a MyNumber, b is a string" while v1 prints "a is a any, b is a string". It appears v0 will perform conversion before matching any while v1 matches any first.

Is this new desired behavior or a bug?

I think the output is as I would expect:

console.log(fn1(3, 'str'));               // 'a is a any, b is a string'
console.log(fn1(new MyNumber(3), 'str')); // 'a is a MyNumber, b is a string'

I'm not sure, though, why this worked differently in v0, if that's indeed the case.

Oh wait, you're right: because of the conversion from number to MyNumber, both cases should indeed return 'a is a MyNumber, b is a string'. Looks like a bug then, thanks for bringing this up.

Ah, this has bit me while working on "resolve" for josdejong/mathjs#2485, so I started looking into it a little bit. The mechanism for the current behavior (tagged here as a bug) is clear in compareSignatures, which is used to sort all of the possible signatures for a function, including ones that take advantage of conversions. Namely, the first thing compareSignatures does is check whether only one of the signatures has any conversions at all, and if so, it orders the one with conversions later. So then all signatures with any conversions come at the end after all the signatures that don't, and 'any' matching anything is not considered a conversion.
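To make the mechanism described above concrete, here is a minimal toy model of that first comparison step. It is not the library's actual compareSignatures code: the `kind` tags, the `hasConversion` helper, and the signature objects are hypothetical names made up for illustration, and only the "conversions sort last" rule is modelled.

```javascript
// Toy model, not typed-function internals. Each parameter is tagged with how
// it would match: 'exact' (a declared type), 'conversion' (reached through an
// added conversion), or 'any' (the catch-all matcher).
function hasConversion(signature) {
  return signature.params.some((p) => p.kind === 'conversion');
}

// Signatures with at least one conversion sort after those with none.
// Note that 'any' does not count as a conversion here.
function compareConversionsLast(a, b) {
  return Number(hasConversion(a)) - Number(hasConversion(b));
}

const candidates = [
  { name: 'MyNumber, string', params: [{ kind: 'exact' }, { kind: 'exact' }] },
  // Hypothetical entry standing for the signature generated by number -> MyNumber.
  { name: 'number -> MyNumber, string', params: [{ kind: 'conversion' }, { kind: 'exact' }] },
  { name: 'any, string', params: [{ kind: 'any' }, { kind: 'exact' }] },
];

// Array.prototype.sort is stable, so ties keep their original order.
const ordered = [...candidates].sort(compareConversionsLast).map((s) => s.name);
console.log(ordered);
// [ 'MyNumber, string', 'any, string', 'number -> MyNumber, string' ]
```

With this ordering, a call like fn1(3, 'str') fails the 'MyNumber, string' entry, then hits 'any, string' before the converted signature is ever tried, which matches the v1 output reported in this issue.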

So that's how the current behavior occurs. But the question is, what would be the non-buggy behavior? Is it specified anywhere? It's a little tricky to fix this bug without clarity on exactly what should happen.

For example, we could say that an arglist is "class 3" if it has any "any" arguments in it (including a ... without a type), "class 2" if it has a type conversion in it but no "any" arguments, and "class 1" if it has no conversions and no "any" arguments. And then we could change the first step of compareSignatures to order by class: class 1 before class 2 before class 3. That might well fix this instance of the bug. But is it the desired behavior for typed-function's dispatch behavior for a collection of signatures?
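A rough sketch of that class-based ordering, again as a toy model rather than a patch to compareSignatures. The `kind` tags, `signatureClass`, and the example signature objects are hypothetical, and rest parameters are left out for brevity.

```javascript
// Toy model of the proposed classes:
//   class 3: at least one 'any' parameter
//   class 2: at least one conversion, but no 'any'
//   class 1: neither conversions nor 'any'
function signatureClass(params) {
  if (params.some((p) => p.kind === 'any')) return 3;
  if (params.some((p) => p.kind === 'conversion')) return 2;
  return 1;
}

// Lower class sorts first: class 1 before class 2 before class 3.
function compareByClass(a, b) {
  return signatureClass(a.params) - signatureClass(b.params);
}

const unsorted = [
  { name: 'any, string', params: [{ kind: 'any' }, { kind: 'exact' }] },
  // Hypothetical entry standing for the signature generated by number -> MyNumber.
  { name: 'number -> MyNumber, string', params: [{ kind: 'conversion' }, { kind: 'exact' }] },
  { name: 'MyNumber, string', params: [{ kind: 'exact' }, { kind: 'exact' }] },
];

const rankedByClass = [...unsorted].sort(compareByClass).map((s) => s.name);
console.log(rankedByClass);
// [ 'MyNumber, string', 'number -> MyNumber, string', 'any, string' ]
```

Under this ordering the converted signature is tried before 'any, string', so the example from the top of this issue would print "a is a MyNumber, b is a string" again.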

Any thoughts or reference on this would be very helpful. Thanks!

Finding the "best match" involves some heuristics. We can discuss what strategy is best in most cases, but I think there will always be cases that feel odd.

You summarize the behavior quite well. It's indeed determined by compareSignatures and compareParams:

  1. try to find an exact match (including any)
  2. try to find a match with conversions. Maximize the number of arguments from the start that do not need conversion
  3. try to find matches with any and rest parameters

There is a test that my former self wrote that shows the intended behavior of any vs conversions:

it('should not apply conversions when having an any type argument', function() {
  var fn = typed({
    'number': function (a) {
      return 'number';
    },
    'any': function (a) {
      return 'any';
    }
  });
  assert.equal(fn(2), 'number');
  assert.equal(fn(true), 'any');
  assert.equal(fn('foo'), 'any');
  assert.equal(fn('{}'), 'any');
});

We can argue whether we would like to change the order of preference from exact, any, conversion to exact, conversion, any or not. If there are no strong preferences/arguments for one or the other I suggest we just keep it as it is and close this issue.

It may help to collect some real world examples, probably from mathjs, to see what behavior makes most sense in practice.

I see, so the label of this issue as a "bug" is at direct odds with the "should" statement in that test you wrote. I had just assumed you wanted a change in behavior here because (a) clearly the OP of this issue had a different intuition, (b) apparently a previous version worked a different way, (c) you labeled it as a bug, (d) my intuition is certainly that "any" arguments are last-ditch fallbacks and that typed-function should prefer to call some signature that the client actually typed (recall from the design of typed-function that "any" is not a type, but rather a type constructor of no arguments that accepts anything), and perhaps most importantly, (e) if a client wants to prefer the fallback to conversion, it's trivial to add a type 'Entity' with test function '()=>true' and then typed-function will indeed prefer to call a signature with that rather than doing a conversion, but without making a change such as called for in this issue there isn't any reasonable way to get the behavior "please convert if possible to get one of the signatures listed, otherwise call this fallback". For example, if you wanted to install a specialized "onMismatch" handler for a specific function, as I in fact suggested in the new onMismatch docs, there is no way to do it without making a change to the current behavior, whereas if we do make this change, the current behavior is easily recoverable if desired.

As far as mathjs examples go, here's one to consider. Suppose someone creates a Sequence type, basically it's like a one-dimensional array but rather than giving the elements explicitly, it specifies a rule that generates the entries. (This is not hypothetical, it's what we are planning to do in Numberscope, which is the project that I came to this from.) At least initially to support arithmetic operations on Sequences, say the client defines a conversion from Sequence to either Array or Matrix (doesn't matter which). Then if A is a one-dimensional Array and S is a sequence, with the current behavior A - S will get handled by the 'Array, any' option in the definition of subtract, which is the one that handles subtracting a scalar from a collection, which will clearly produce unexpected/erroneous behavior. Surely the implementer would expect A - S to be handled by converting S to an Array or Matrix and then elementwise operation to be performed. Of course "subtract" is just one example from the arithmetic operations.

Basically the moral of the story is that with mathjs and typed-function as it stands, implementing a new collection class requires a very large number of patches, adding new signatures to just about every arithmetic function in the system; whereas if conversions were preferred to any, then at least those arithmetic operations where it sufficed to just allow the new collection to morph into a standard Array or Matrix would come along "for free."

So anyhow, that's why I advocate considering this a bug, and changing to some acceptable behavior in which conversions are at least sometimes preferred to "any". That said, I am not certain that my "class 1", "class 2", "class 3" suggestion above is ideal, so I was hoping you would have an already specified behavior we should just implement. Now I am guessing not; so let me know if you want (A) to close this, (B) me to make a detailed proposal of behavior consistent with the OP's example, perhaps in the form of a PR, or (C) some other course of action....

OK, have posted a proposed fix. The sorting in the end doesn't quite start with the class 1, class 2, class 3 approach I suggested at first. I tried to make the sort as clear as possible by listing the "metrics" that will be used to judge two signatures in priority order. So basically it blends the distinction of those three classes with an additional desire to avoid the use of rest parameters when possible, and the desire that when there is a less preferred match (a conversion, or worse, an 'any'), to minimize the number of such less good matches. The idea behind minimizing the number was to reduce the cases where the position of the parameters make a difference. In other words, 'exact, conversion, conversion' will now lose to 'conversion, exact, exact' whereas before it would win just because in the first position, 'exact' is better than 'conversion'.
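The difference between the positional comparison and the count-based one can be illustrated with a small sketch. This is a simplified assumption-laden model, not the code in the proposed fix: it ignores rest parameters and any other metrics, and `RANK`, `comparePositional`, and `compareByCounts` are hypothetical names.

```javascript
// Toy model: a signature is just a list of match kinds.
const RANK = { exact: 0, conversion: 1, any: 2 };

// "Before": the first position where the two signatures differ decides.
function comparePositional(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (RANK[a[i]] !== RANK[b[i]]) return RANK[a[i]] - RANK[b[i]];
  }
  return a.length - b.length;
}

function countKind(kinds, kind) {
  return kinds.filter((k) => k === kind).length;
}

// "After" (simplified): fewer 'any' matches wins, then fewer conversions.
function compareByCounts(a, b) {
  return (
    countKind(a, 'any') - countKind(b, 'any') ||
    countKind(a, 'conversion') - countKind(b, 'conversion')
  );
}

const x = ['exact', 'conversion', 'conversion'];
const y = ['conversion', 'exact', 'exact'];

console.log(comparePositional(x, y) < 0); // true: x is preferred positionally
console.log(compareByCounts(x, y) > 0);   // true: y is preferred by counts
```

A negative result means the first argument sorts earlier and is therefore preferred, so the two comparators disagree on exactly the pair mentioned above.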

In light of this working better in mathjs at allowing the Matrix type to be listed after its specializations (see the discussion in #134), I am definitely a proponent of adopting something along these lines, whether it's the current proposal or a variation you might prefer.

Thanks for clearly formulating arguments on why it makes sense to go for the order exact, conversion, any. It makes sense, and I guess I had the same feeling right after it was reported and I marked it as a bug. I'm still confused as to why I implemented it like this in the first place in v1 of typed-function. I can't come up with a reason, except maybe that an "explicitly defined signature" should have preference over "fuzzy, auto-generated signatures". But I'm totally with you in that we should consider this behavior a bug and should change it.

Returned to preferring conversions over the use of 'any' signatures in v3.

Great to see this very old issue resolved 🎉