There is no way to distinguish a non-existing symbol from one bound to VMnull

Question

vrurg opened this issue 3 years ago

The following code throws in Raku:

use nqp;
my $*INVISIBLE := nqp::null();
say $*INVISIBLE; # Dynamic variable $*INVISIBLE not found

because with the nqp::getlexdyn op there is no way to distinguish whether VMnull is returned due to a missing symbol or because the symbol is bound to VMnull. The problem is common to all ops in the nqp::getlex* family. It means that, in the current situation, one would need to manually reproduce the internal behavior of a particular nqp::getlex* op to find out what exactly caused the negative outcome.
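To illustrate the ambiguity outside the VM, here is a minimal Raku sketch that models scopes as plain hashes. The names Sentinel, @scopes, lookup and exists-in-chain are hypothetical: they stand in for VMnull, a chain of lexpads, a getlex*-style op, and an existence check of the kind discussed in this thread. None of them are real Rakudo or NQP APIs.

```raku
# Hypothetical model only: Sentinel plays the role of VMnull.
my class Sentinel { }

# Two "lexpads", innermost first; one symbol is deliberately bound to the sentinel.
my @scopes = { '$*A' => 1 }, { '$*INVISIBLE' => Sentinel };

# Models a getlex*-style lookup: returns the sentinel when nothing is found.
sub lookup(Str $name) {
    for @scopes -> %scope {
        return %scope{$name} if %scope{$name}:exists;
    }
    Sentinel
}

# Models an existence check: asks about the key, not about the value.
sub exists-in-chain(Str $name) {
    so @scopes.first({ $_{$name}:exists })
}

say lookup('$*INVISIBLE') === lookup('$*NOPE');   # both cases yield the sentinel
say exists-in-chain('$*INVISIBLE');               # the key is present
say exists-in-chain('$*NOPE');                    # the key is absent
```

In this model the value-returning lookup cannot tell the two symbols apart, while the key-based check can.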

Vadim Belman · Answer 1 · Wed Aug 25 2021 10:54:32 GMT+0800 (China Standard Time)

Apparently, the preferable fix would be to add an nqp::existslex* (or nqp::lex*exists) family of ops. This would slightly complicate the symbol lookup code produced by the compilers, but I consider it a worthwhile addition.

niner · Answer 2 · Wed Aug 25 2021 15:29:21 GMT+0800 (China Standard Time)

Where does this problem pop up? VMNull is an internal thing that is not available to normal Raku users. So in which situations do we bind a VMNull to a dynamic variable and still treat it as declared?

Elizabeth Mattijsen · Answer 3 · Wed Aug 25 2021 16:26:46 GMT+0800 (China Standard Time)

Also, nqp::null is what is being used extensively for checking whether there's a key in a hash, as a non-existent key returns nqp::null for its value.

So to me, having something be nqp::null just means something is not there.

So this feels like a non-problem to me: nqp::null indicates the absence of a value, just like Nil does in Raku?

Darren Duncan · Answer 4 · Wed Aug 25 2021 16:34:01 GMT+0800 (China Standard Time)

For any collection type that is supposed to be able to have elements of any type at all, it seems a very poor design decision to have a special value to indicate no value, which is logically a contradiction in terms. It sounds like Raku is behaving here analogously to using the Perl code defined $myhash{foo} to see if the key foo exists, when Raku should also have an analogue of the Perl operation exists $myhash{foo}, so that we can distinguish the non-existence of a key from an existing key whose corresponding value is undefined.

Elizabeth Mattijsen · Answer 5 · Wed Aug 25 2021 17:00:00 GMT+0800 (China Standard Time)

@duncand Raku has :exists, .EXISTS-KEY and nqp::existskey.
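A minimal sketch of the Raku-level checks mentioned here, using an ordinary hash (the variable %h and its keys are only for illustration). The key's existence is queried directly, independently of whether its value is defined; nqp::existskey is the low-level counterpart and is not shown.

```raku
my %h = foo => Any;            # key present, value undefined

say %h<foo>.defined;           # False: the value is undefined
say %h<foo>:exists;            # True:  the key exists
say %h<bar>:exists;            # False: no such key
say %h.EXISTS-KEY('foo');      # True:  method form of the same check
```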

@duncand This is about the very low level implementation in Rakudo. Nothing to do with design. I suggest you familiarize yourself with Raku features before seemingly providing judgment.

Jonathan Worthington · Answer 6 · Wed Aug 25 2021 17:53:21 GMT+0800 (China Standard Time)

Even NQP doesn't default undefined variables to nqp::null (they are NQPMu instead), so even standard NQP use won't run into this. One has to bind the nqp::null, and I think that's a case of DIHWIDT. Certainly I'm not inclined to add any "does it exist" variants into MoarVM for this.

Vadim Belman · Answer 7 · Wed Aug 25 2021 21:48:48 GMT+0800 (China Standard Time)

I tripped over this problem while working on rakudo/rakudo#4495. One of the problems with pseudos which always annoyed me is that .WHO<sym>:exists is not always true even if sym is returned by .WHO.keys. It was caused by inconsistent rules used internally, and I think it's done the right way now. But then I started writing a test to make sure that all symbols from .keys are visible on .WHO – and failed tremendously on $*DISPATCHER, which is iterable as a symbol but not accessible. It is iterable because pseudos walk over symbol tables manually and use nqp::iterate on each one. It is even accessible on MY, because MY is PRECISE_SCOPE and can use nqp::existskey directly on the pseudo's $!store (which is bound to a lexpad).

But for chained pseudos like OUTERS, or DYNAMIC there is no quick, VM backed, way to make sure a symbol really exists. This will clearly turn into user confusion (with new issue tickets opened) and possible source of future bugs.

There is a workaround which would use manual iteration over contexts to find the requested symbol for EXISTS-KEY, AT-KEY, and BIND-KEY. But this would apparently carry a performance penalty. And still would not help with questions like "why I see $*DISPATCHER but Raku throws when I try to read from it?".

So, generally speaking, I agree with @duncand in the point that using a special value to indicate absence of something doesn't look good to me in this particular case.

Elizabeth Mattijsen · Answer 8 · Wed Aug 25 2021 22:01:35 GMT+0800 (China Standard Time)

Isn't the DIHWIDT caused by the fact that $*DISPATCHER has been set to nqp::null, rather than a Mu or NQPMu? Wouldn't setting that to e.g. NQPMu fix this?

Jonathan Worthington · Answer 9 · Wed Aug 25 2021 22:02:54 GMT+0800 (China Standard Time)

And still would not help with questions like "why I see $*DISPATCHER but Raku throws when I try to read from it?"

$*DISPATCHER is gone in new-disp

Vadim Belman · Answer 10 · Wed Aug 25 2021 22:14:24 GMT+0800 (China Standard Time)

$*DISPATCHER might not be the only symbol of the kind.

Anyway, of course I can drop the idea of testing for roundtrip behavior of pseudos because it would be unreliable. But it would remain a mystery to me why it is a problem to add a set of ops which are basically identical to the existing ones except for a different return value and its meaning.

Elizabeth Mattijsen · Answer 11 · Wed Aug 25 2021 22:16:45 GMT+0800 (China Standard Time)

I think the idea of testing is good.

If nqp::isnull($*FOO) is true, then for all practical purposes, that dynamic variable does not exist. What is the problem with that?

Vadim Belman · Answer 12 · Wed Aug 25 2021 22:22:33 GMT+0800 (China Standard Time)

First of all, you can't do $*FOO if it's VMnull. The problem is:

for DYNAMIC::.keys -> $sym {
    ok DYNAMIC::{$sym}:exists, "$sym exists"; # not ok on $*DISPATCHER
}

And, apparently, DYNAMIC::{$sym} would explode because it's an X::NoSuchSymbol failure. Correspondingly, I can't write a clear test of whether all symbols returned by .keys are actually accessible via the pseudo.

Darren Duncan · Answer 13 · Thu Aug 26 2021 03:35:48 GMT+0800 (China Standard Time)

@duncand Raku has :exists, .EXISTS-KEY and nqp::existskey.

@duncand This is about the very low level implementation in Rakudo. Nothing to do with design. I suggest you familiarize yourself with Raku features before seemingly providing judgment.

@lizmat I'm talking about implementation design. While I'm not an expert or implementer in Raku, I have been involved with it to a degree since 2005 and feel I understand enough to give some feedback. In this case I'm going on @vrurg's own comments and the leading discussion as a basis for how things seem to work, and I'm not second-guessing whether what they said is true. I'm taking "there is no way to distinguish whether VMnull is returned due to a missing symbol, or the symbol is bound to VMnull" at face value. I was also speaking somewhat abstractly, the "collection type" being the collection of symbols in this case.

However, I apologize that my exact choice of words, "very poor design decision", looks harsher than it should have, and I could have used different words. Thank you all for your continuing effort to make Raku good.

Elizabeth Mattijsen · Answer 14 · Thu Aug 26 2021 04:00:58 GMT+0800 (China Standard Time)

@duncand no worries. It felt like you were talking about the HLL issue of being able to tell whether something exists or not. For that, Raku does have a solution that doesn't depend on a special value (well, at least as visible from Raku / NQP; I'm pretty sure Perl hashes actually use a special value internally to indicate that a key has been removed, to avoid churn).

@vrurg: maybe DYNAMIC::.keys should not produce $*DISPATCHER if it is nqp::null ?

Vadim Belman · Answer 15 · Thu Aug 26 2021 04:52:31 GMT+0800 (China Standard Time)

maybe DYNAMIC::.keys should not produce $*DISPATCHER if it is nqp::null ?

@lizmat I'm considering this. Not only $*DISPATCHER, but any other. After all, a purpose for the PR is to unify pseudo behaviors with compiler's approach to symbol visibility. Since the compiler considers it invisible – so be it. For anyone wishing to introspect symbol tables deeper than that, nqp::ctx* will always be at their disposal.

Vadim Belman · Answer 16 · Thu Aug 26 2021 06:06:32 GMT+0800 (China Standard Time)

maybe DYNAMIC::.keys should not produce $*DISPATCHER if it is nqp::null ?

@lizmat I'm considering this.

No, it's not an option. Things are totally different about lexicals. These are known at compile time. Therefore the following is working:

use nqp;
my $nullish := nqp::null();
sub foo {
    say $nullish; # (Mu)
}
foo

But for pseudos it is crucial to know their symbols at run time. Therefore, replacing say $nullish with say OUTERS::<$nullish> results in an X::NoSuchSymbol failure.

The argument of nqp::null not being valid in Raku land is not compelling to me because it is neither formally nor technically prohibited. Therefore, binding to VMnull is possible, though perhaps not pretty. In some cases, it might be useful if done with care.

BTW, there is a catch in the situation. Because PRECISE_SCOPE pseudos work with only one lexpad and can use nqp::existskey on it, they do see all local symbols with EXISTS-KEY. But they can neither iterate over VMnull ones nor read them. Apparently, I can replace existskey with atkey and check for null, but ...

In the end, I currently have only the following options if the decision not to add the exists* family of ops is final:

1. Skip all VMnull-bound symbols and keep pseudos behaving differently from the compiler
2. Skip none of the existing symbols and consider AT-KEY failing on some of them a feature
3. Implement AT-KEY and EXISTS-KEY in terms of manual iteration over contexts, effectively duplicating VM functionality, at the cost of lower performance

niner · Answer 17 · Thu Aug 26 2021 14:36:11 GMT+0800 (China Standard Time)

The argument of nqp::null not being valid in Raku land is not compelling to me because it is neither formally nor technically prohibited. Therefore, binding to VMnull is possible, though perhaps not pretty. In some cases, it might be useful if done with care.

But you can get a VMNull only through nqp::null. And nqp is an implementation detail and its use has been strongly discouraged since the very beginning. There are lots of ways you can break Rakudo and get completely bogus results by using nqp. Why should we treat this one case specially?

Vadim Belman · Answer 18 · Fri Aug 27 2021 04:35:18 GMT+0800 (China Standard Time)

There are lots of ways you can break Rakudo and get completely bogus results by using nqp.

I'm not talking about breaking, but rather about having a faster way to do correct symbol introspection. It is currently possible to collect all available symbols by iterating over lexpads manually. But when it comes to finding out whether a symbol exists, nqp::getlex* does it faster than NQP/Raku iteration could manage. Except that nqp::getlex* may do it incorrectly.

Why should we treat this one case specially?

Perhaps to have it done the right way? We all know that returning a value to signal the absence of an entry is no good if entries can carry that value. The reason I do not agree is that "strongly discouraged" doesn't mean "disallowed". I mean, as soon as my $var := nqp::null() is an error – I'd agree that exists is redundant. But not for now. Even though I clearly understand that this is an edge of edges case.

Anyway, I'm giving up and reconsidering my approach. However, I'm still puzzled as to why you are so reluctant to add these ops. What's the reasoning behind this opposition? Because so far it all sounds just like "it is good as it is", which is unconvincing. Would it affect optimizations? Would it bring extra complexity? Would it... what? My curiosity craves food. :)