Adding suppressions to accept type-defensive code?

Question

Olivier-Boudeville opened this issue a year ago

Hi,

Thanks for this interesting tool. Sometimes it is useful to report invalid types at runtime for better context (e.g. if the calling, user code may be faulty / not properly type-checked), like in this simplistic example:

-spec get_uniform_value( pos_integer() ) -> pos_integer().

get_uniform_value( N ) when is_integer( N ) ->
	rand:uniform( N );

get_uniform_value( N ) ->
	throw( { not_integer, N } ).

Gradualizer correctly reports that the second clause cannot be reached. This is true in theory, but in practice such clauses can be useful, as they report better runtime information than the default one (no matching clause). How can I tell Gradualizer not to be offended by such error-management code?

To avoid losing the typing information that was specified, I could use:

-type expected( _T ) :: any().

-spec get_uniform_value( expected(pos_integer()) ) -> pos_integer().

but then tools would not have access to the intended meaning/typing anymore.
On the other hand, explicit suppressions would be cumbersome to sprinkle over such defensive code.
Would there be better options?
Thanks for any advice to manage such issues properly!

Radek Szymczyszyn · Answer 1 · Sat Feb 04 2023 20:16:03 GMT+0800 (China Standard Time)

Hi Olivier!

Thanks for your interest in Gradualizer :)

> How can I tell Gradualizer not to be offended by such error-management code?

Currently, there are two available options that use any().

The first one is what you propose - using any() wrapped in a type carrying some extra information to the human reader, but not to the tools anymore.

The second one would be using any() in the spec, but asserting (downcasting) immediately upon entry to the function:

-include_lib("gradualizer/include/gradualizer.hrl"). % defines ?assert_type

-spec get_uniform_value( any() ) -> pos_integer().

get_uniform_value( N ) when is_integer( N ) andalso N > 0 ->
	N = ?assert_type(N, pos_integer()),
	rand:uniform( N );

get_uniform_value( N ) ->
	throw( { not_a_pos_integer, N } ).

This way Gradualizer can use the asserted type when checking the rest of any clause that contains the assertion. The assert macro is Gradualizer-specific, so for any other tool the type would remain any().

There's also another possibility that could be enabled, but given the still-experimental status and limited use of the tool, it's not clear whether there's a need for it. Exhaustiveness checking is gated in the type checker with a feature flag. Currently the flag is not exposed in any interface, but it could be controlled from the CLI or using a -gradualizer(...) attribute within a file. This would apply to the whole Gradualizer invocation or to the specific file, respectively, and would enable/disable exhaustiveness checking in general. Do you think such a feature would be useful?

Olivier Boudeville · Answer 2 · Sat Feb 04 2023 21:53:40 GMT+0800 (China Standard Time)

Hi, many thanks for your answer!

I suppose a feature flag could help, yet developers may be a bit reluctant to add tool-specific attributes in their sources.

As for adding assertions, this would involve a lot of typing and generate some unfortunate visual noise. The sources would be less readable, which is a problem.

Wrapping with any() would destroy useful information, it would be a bit of a pity.

Actually the best solution would be to obtain natively error messages as informative as { not_a_pos_integer, xxx } - but, at least currently, this is not the case:


1> rand:uniform(xxx). 
** exception error: no function clause matching rand:uniform_s(xxx,
                                                               {#{bits => 58,jump => #Fun<rand.3.34006561>,
                                                                  next => #Fun<rand.0.34006561>,type => exsss,
                                                                  uniform => #Fun<rand.1.34006561>,
                                                                  uniform_n => #Fun<rand.2.34006561>},
                                                                [90045905750092168|234655894772600349]}) (rand.erl, line 346)
     in function  rand:uniform/1 (rand.erl, line 319)

So maybe this should be tackled at the VM/ERTS level, where complete error reports like the one above would still be issued by default, while leaving the possibility to the developer to define their own "fancy", function-specific error handler that would plug, ideally at no real cost, to the basic error reporting system (a bit like what could be done with log handlers) and override it when matching.

I have dim memories of a feature a bit along these lines, but could not find pointers to it.

Radek Szymczyszyn · Answer 3 · Sun Feb 05 2023 00:02:48 GMT+0800 (China Standard Time)

> I suppose a feature flag could help, yet developers may be a bit reluctant to add tool-specific attributes in their sources.

That's why one could set it both in the file itself as well as externally to the source, on the command-line or in rebar.config. I think tool-specific attributes are already an accepted solution - both Xref and Dialyzer use them.

> As for adding assertions [...]

Indeed. While it is quite a distant possibility, with time Gradualizer might be able to infer more and more without assertions. However, assertions will always be needed on downcasts, as the programmer has to state which type to downcast to from the dynamic type, which is the case here.
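
As a minimal sketch of how the visual noise could be limited without the Gradualizer-specific macro, the downcast from any() can be confined to one small helper. The module name defensive_demo and the helper check_pos_integer/1 are hypothetical, introduced only for illustration. Because the helper itself accepts any(), its error clause is reachable as far as the types are concerned, so it should not be reported as unreachable, while the public spec keeps the intended type. This has not been verified against Gradualizer.

```erlang
-module(defensive_demo).
-export([get_uniform_value/1]).

%% Hypothetical helper: accepts any term, returns it unchanged if it is
%% a positive integer, and throws a descriptive term otherwise.
-spec check_pos_integer(any()) -> pos_integer().
check_pos_integer(N) when is_integer(N), N > 0 ->
    N;
check_pos_integer(Other) ->
    throw({not_a_pos_integer, Other}).

%% The public spec still documents the intended argument type.
-spec get_uniform_value(pos_integer()) -> pos_integer().
get_uniform_value(N) ->
    rand:uniform(check_pos_integer(N)).
```

The trade-off is one extra local call per checked argument, in exchange for a single place where the runtime check and the error term are defined.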

> So maybe this should be tackled at the VM/ERTS level [...]

AFAICT, there were some improvements in error reporting in ERTS / OTP in the last 2-3 versions already (for example printing of non-matching args). I vaguely remember talking with @garazdawi on whether there are plans to provide more specific errors than, for example, just badarg, hinting at why an arg is actually bad, but that would be a huge effort and a backward incompatibility so it's very unlikely to happen.
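
One existing mechanism in this area, which may or may not be the feature alluded to above, is the extended error information introduced in OTP 24 (EEP 54): erlang:error/3 accepts an error_info option, and the shell then calls an exported format_error/2 in the named module to obtain per-argument explanations. The sketch below assumes OTP 24 or later; the module name uniform_demo and the message text are made up for illustration. It only concerns runtime reporting, so Gradualizer would presumably still report the second clause as unreachable given this spec.

```erlang
-module(uniform_demo).
-export([get_uniform_value/1, format_error/2]).

-spec get_uniform_value(pos_integer()) -> pos_integer().
get_uniform_value(N) when is_integer(N), N > 0 ->
    rand:uniform(N);
get_uniform_value(N) ->
    %% Raise badarg with the actual arguments and a pointer to the
    %% module whose format_error/2 can explain the failure.
    erlang:error(badarg, [N], [{error_info, #{module => ?MODULE}}]).

%% Called when the exception is formatted (e.g. by the shell).
%% Keys of the returned map are argument positions.
format_error(_Reason, _StackTrace) ->
    #{1 => <<"not a positive integer">>}.
```

Calling uniform_demo:get_uniform_value(xxx) in an OTP 24+ shell is then expected to show the custom message alongside the first argument, in addition to the usual badarg report.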

Olivier Boudeville · Answer 4 · Mon Feb 13 2023 00:10:56 GMT+0800 (China Standard Time)

Thanks for your kind answer; I read https://github.com/josefs/Gradualizer/wiki/Type-annotations, which clarified much, but I then had two questions; maybe you will be able to enlighten me:

1. Knowing that I prefer/need type checkers to be applied to BEAM files (rather than to the sources; reason: I am using quite a lot of parse transforms), if the assert_type macro has no runtime overhead (being inlined), I would expect the call to the ':::' operator not even to be in the final AST, and would thus be invisible to Gradualizer?
2. With the second option that you shared (with N = ?assert_type(N, pos_integer())), I do not see why Gradualizer would not still complain about the second clause? I suppose that in the case where we still want such an error clause to exist, the guard of the first clause is still needed - and thus what would be the point of the assert_type, as it would add no new information to the type checker?

Thanks for any information that you may share!

Radek Szymczyszyn · Answer 5 · Tue Mar 07 2023 22:53:20 GMT+0800 (China Standard Time)

> [...] if the assert_type macro has no runtime overhead (being inlined), I would expect the call to the ':::' operator not even to be in the final AST, and would thus be invisible to Gradualizer?

I think you're right that the call to ::/2 or :::/2 should not be in the final AST. The question is which final AST we have in mind. The AST Gradualizer operates on is the AST of Erlang, the surface language we use. In the compiler pipeline we have at least two more representations: Core Erlang and then Kernel Erlang. Each step from one to the next is associated with a number of transformations and optimisations. We do have a -compile({inline, ['::'/2, ':::'/2]}) present in the code, so I hope that at some point these are completely eliminated, but not as early as the AST Gradualizer deals with. In the worst case of the compiler not inlining the assertions (which I don't think ever happens), they come at the cost of a single, no-op, local function call.

> with the second option that you shared (with N = ?assert_type(N, pos_integer())), I do not see why Gradualizer would not still complain about the second clause?

Gradualizer does not typecheck non-local returns or control flow such as exceptions or message passing. Another way to look at it is that throw() returns none() (also known as no_return()), the uninhabited type, which is a subtype of any other type.
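
The none() point can be illustrated with a small sketch. The module no_return_demo and the functions fail/1 and safe_div/2 are hypothetical names. A function that always throws can be given the return type no_return(), and a call to it can stand in a clause body whatever the declared return type of the enclosing function is, since the uninhabited type is a subtype of every type.

```erlang
-module(no_return_demo).
-export([safe_div/2]).

%% Never returns normally: its result type is the uninhabited type.
-spec fail(term()) -> no_return().
fail(Reason) ->
    throw({invalid_argument, Reason}).

%% The first clause "returns" no_return(), which is compatible with
%% the declared integer() result because it is a subtype of any type.
-spec safe_div(integer(), integer()) -> integer().
safe_div(_A, 0) ->
    fail(division_by_zero);
safe_div(A, B) ->
    A div B.
```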

If we wanted Gradualizer to be able to typecheck exceptions (throws) or message passing, we would have to use much more complex typechecking machinery to do so. Maybe one day, but not in the near future 😆

TBH, I'm not sure if message passing could be typechecked without adjusting the top-level Erlang syntax or idioms. I think that's the reason why languages with static type systems usually use the concept of channels - a channel is something you create and pass around in your code and thanks to that the type information is propagated where it's needed for typechecking. Erlang send/! and receive are way more like goto <label>, i.e. non-local jumps where it's hard to match the start and destination when analysing statically.

Olivier Boudeville · Answer 6 · Wed Mar 08 2023 05:59:45 GMT+0800 (China Standard Time)

Hi,
Thanks for your answer. I was thinking that in the BEAMs I would build first, the '::'/':::' calls would have already disappeared, and thus that Gradualizer would have no chance of detecting them later, when these BEAMs would be submitted to it. I have not verified that yet, though.

Thanks for your explanations regarding the second clause.

Yes, typed channels could be an option, but, in the context of Erlang, being able to send/receive any term seems consistent in my opinion with the rest of the language, even though checking the message exchange patterns would then be a real difficulty.

Currently, when time permits, I am checking a not-so-small codebase with Dialyzer first. Once this pass is completed, I will do the same with Gradualizer (it will take some time); its alternative approach and perhaps clearer reports will help much. I will try to share any issues I encounter.
Feel free to close this issue, and thanks for this much-appreciated tool!

Radek Szymczyszyn · Answer 7 · Wed Mar 08 2023 16:44:18 GMT+0800 (China Standard Time)

Thanks for your interest, @Olivier-Boudeville! And good luck with your typechecking endeavour :D Please share any problems with Gradualizer you run into!