domluna / JuliaFormatter.jl

An opinionated code formatter for Julia. Plot twist - the opinion is your own.

Home Page:https://domluna.github.io/JuliaFormatter.jl/dev/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Make default nesting in SciML style more readable

ranocha opened this issue · comments

After #729, the default nesting in the SciML style of functions with many arguments became hard to read, e.g.,

function alg_cache(alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
    ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
    dt, reltol, p, calck,
    ::Val{true}) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
    reduced_rate_prototype = rate_prototype.x[2]
    tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))
    k1 = zero(rate_prototype)
    k2 = zero(reduced_rate_prototype)
    k3 = zero(reduced_rate_prototype)
    k4 = zero(reduced_rate_prototype)
    k5 = zero(reduced_rate_prototype)
    k = zero(rate_prototype)
    utilde = zero(u)
    atmp = similar(u, uEltypeNoUnits)
    recursivefill!(atmp, false)
    tmp = zero(u)
    FineRKN4Cache(u, uprev, k1, k2, k3, k4, k5, k, utilde, tmp, atmp, tab)
end

It is hard to spot where the function arguments end and the function body begins. Is there a way to let the formatter make this more obvious?

function alg_cache(alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
    ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
    dt, reltol, p, calck,
    ::Val{true}) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}

    reduced_rate_prototype = rate_prototype.x[2]
    tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))
    k1 = zero(rate_prototype)
    k2 = zero(reduced_rate_prototype)
    k3 = zero(reduced_rate_prototype)
    k4 = zero(reduced_rate_prototype)
    k5 = zero(reduced_rate_prototype)
    k = zero(rate_prototype)
    utilde = zero(u)
    atmp = similar(u, uEltypeNoUnits)
    recursivefill!(atmp, false)
    tmp = zero(u)
    FineRKN4Cache(u, uprev, k1, k2, k3, k4, k5, k, utilde, tmp, atmp, tab)
end

I'd just put a space in there. @YingboMa @issaacs ? I don't think we'd go back to the old rule of the argument indentation and want to look for something simple like this.

The new line is definitely necessary

New line makes sense to me.

I prefer to revert #729

I prefer to revert #729

I can understand that 😬 If that's not an option, what about considering to at least add additional indentation, e.g., something like

function alg_cache(alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
        ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
        dt, reltol, p, calck,
        ::Val{true}) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
    reduced_rate_prototype = rate_prototype.x[2]
    tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))
    k1 = zero(rate_prototype)
    k2 = zero(reduced_rate_prototype)
    k3 = zero(reduced_rate_prototype)
    k4 = zero(reduced_rate_prototype)
    k5 = zero(reduced_rate_prototype)
    k = zero(rate_prototype)
    utilde = zero(u)
    atmp = similar(u, uEltypeNoUnits)
    recursivefill!(atmp, false)
    tmp = zero(u)
    FineRKN4Cache(u, uprev, k1, k2, k3, k4, k5, k, utilde, tmp, atmp, tab)
end

Personally I prefer the double indent suggested by sloede. Doesn't result in cramped function arguments when you've got a long function name but provides visual distinction between arguments and body.

also an option to both double indent and newline when the signature spans multiple lines, which would be my first choice.

Perhaps I put up #729 a bit too quickly and it should've been discussed more. But based on the Slack thread and examples like SciML/RuntimeGeneratedFunctions.jl@d41842e it seemed like most agreed that aligning on opening parentheses is not a good rule to always apply. So IMO we shouldn't revert.

The suggestion to indent extra from #741 (comment) could alleviate the problem here, though personally I prefer what black does here, putting the ) on a new unindented line. That also leaves space for return type annotations or where statements. And with multi-line arguments I'd also put the first argument on a new line, like so:

function alg_cache(
    alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
    ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
    dt, reltol, p, calck,
    ::Val{true}
) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
    reduced_rate_prototype = rate_prototype.x[2]
    tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))
    k1 = zero(rate_prototype)
    k2 = zero(reduced_rate_prototype)
    k3 = zero(reduced_rate_prototype)
    k4 = zero(reduced_rate_prototype)
    k5 = zero(reduced_rate_prototype)
    k = zero(rate_prototype)
    utilde = zero(u)
    atmp = similar(u, uEltypeNoUnits)
    recursivefill!(atmp, false)
    tmp = zero(u)
    FineRKN4Cache(u, uprev, k1, k2, k3, k4, k5, k, utilde, tmp, atmp, tab)
end

I mentioned this on the Slack thread and others mentioned that BlueStyle and others also do some of this, so it probably is more common than the extra indent.

That style has been discussed before I believe. Among other issues it breaks folding of functions and alignment of the first non-argument line in some editors (for example VSCode).

@isaacsas if you can still find this discussion it'd be interesting to read. I tried in VS code and see:

  1. Folding: there are indeed 2 folds, one for the arguments and one for the body. Not sure how that can be fixed, but for Python it is fixed.
  2. If I press enter after function f( it does exactly what I want, arguments indented and closing ) unindented
  3. To start the function body an extra tab is indeed needed since auto-indent doesn't do that for the first line. This is also not a problem for Python.

So they are valid point but IMO nothing too serious + fixable.

I don't recall where the discussion occurred unfortunately. Either a previous issue here or an older Slack.

But generally I wouldn't be in favor of any format change that results in non-indented lines between function and end. I think it is likely to lead to issues with editors, and personally find it harder to visually delineate functions. If we wanted to double indent function arguments and single indent the ) where that would be fine with me. (Though I still prefer the original suggestion to just add a newline over having the possibility of lines with just a dangling ).)

I think the format suggested in #741 (comment) looks great. There's a clear separation between the function header and the body, and the distinction between the function arguments and the type information is also immediately obvious.

a bit late to the discussion here but I'm happy to help/accept any PR you folks decide on.

As @isaacsas has mentioned, it would be worth working out a solution to the editor problem. For example, the following behavior in vs-code is a nightmare

Screencast.from.2023-06-27.08.10.52.webm

This appears not only when pressing enter after having written a function signature, it also appears when pasting code into the top of the function body, or moving code there from further down the function body.

I guess most styles are just a matter of taste and I'm usually happy about the consistency introduced by them even if not every design choice is my personal favourite.

But I'm very glad that #729 was merged and would be a bit annoyed if it would be reverted - I think the previous nesting behaviour caused really unreasonable indentation. An example (which provoked this comment here) is SciML/MuladdMacro.jl#42 (comment).

In many other projects I use BlueStyle so I think something like #741 (comment) would be a good alternative. IMO if there are problems with editors, they should be fixed in these editors but not dictate the style choice (also because similar approaches are already used by other styles). I also tend to think that more lines are preferable to overly long lines.

It always bothered me that vscode aligns the ending parenthesis to function. I always wished it would align to the body of the function, as follows:

function alg_cache(
        alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
        ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
        dt, reltol, p, calck,
        ::Val{true}
    ) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
    reduced_rate_prototype = rate_prototype.x[2]
    tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))
    k1 = zero(rate_prototype)
    k2 = zero(reduced_rate_prototype)
    k3 = zero(reduced_rate_prototype)
    k4 = zero(reduced_rate_prototype)
    k5 = zero(reduced_rate_prototype)
    k = zero(rate_prototype)
    utilde = zero(u)
    atmp = similar(u, uEltypeNoUnits)
    recursivefill!(atmp, false)
    tmp = zero(u)
    FineRKN4Cache(u, uprev, k1, k2, k3, k4, k5, k, utilde, tmp, atmp, tab)
end

Given that this is stalled out, and could be discussed endlessly since it is a very subjective issue, perhaps we should just go with @sloede's suggestion of double indenting, maybe with a trailing extra newline as @ChrisRackauckas suggested? Double indenting seemed to have no real opposition, so perhaps it is a good compromise from the options in this discussion?

(I bring this up in the hope of getting some resolution, as I've been avoiding reformatting several libraries in anticipation of another change, but would like to stop seeing the resulting test failure indicators...)

Does the double indentation style cause problems in vscode? @baggepinnen

Code folding at least doesn't seem to have any issue with it.

I tried to reproduce @baggepinnen's video here.

The same issues are there, but now things get indented one too far instead of too little. Also noticed another issue that BlueStyle/#741 (comment) doesn't have: at the end I paste the formatted code and the whole function body loses one indent, which IMO is worse than having function argument code folding.

Though I agree with with #741 (comment) not to let editor issues dictate style choice.

I find VSCode annoyingly loses indentation information quite frequently, whether it be with the current or previous style. I haven't tried to understand the conditions where it occurs though. Usually hitting undo restores the indentation without removing the pasted code so it can at least be fixed quickly.

Perhaps we can just go with Chris' original suggestion of adding a space then? That also didn't seem to have any strong arguments against it, and shouldn't change any pasting / folding behavior from the current style.

We should try to finalize this. Because:

https://github.com/SciML/SciMLBase.jl/pull/370/files/7a24ffd822aa2c7a13115d8f5d8e34e9eea37286..7cf85c43aced812d700445537c221758936e217f#diff-ac94ce23506dc1295d335638baf95d7fb3ec8739d4e76676f3e943bd1ef9ec68R214-R221

it looks like the default nesting is incorrect and is preferring to put everything on new lines after #729, instead of filling lines, which is something everyone agreed before was unsatisfactory. So now we are in a position of never wanting to run the formatter because it newlines everything, and so I would like to probably get that reverted ASAP and switch to whatever the final solution is here.

Okay this is getting crazy, we've effectively been ignoring the formatter since June because of #729. Do we just revert it or add the space? I think those are the two on the table. Let's make a decision this week. Vote yes for revert, <3 for space, and down if you like it as is. Rocket emoji for #741 (comment) . Afterwards we keep the style to whatever is chosen and start using the formatter again.

But I think no matter what, can we get help @domluna figuring out what disabled the line filling and started putting every argument on a new line?

why is #741 (comment) off the table? it sounds like that's a bug in VSCode extension not a fundamental problem with the format

Do you or someone else want to commit to fixing the VS Code extension by the end of next week? It's been a few months now and so I think we need to just get serious about a choice. If no one is stepping up to do it then we should just assume it's not going to be done. Going a whole year with having formatter failures is not an option, and telling everyone developing SciML to stop using VS Code is also not a reasonable option. Thus the 3 options there are the options that can reasonably be implemented with the available resources that I know of unless someone is working on the VS Code extension fix but hasn't shared WIP PRs or anything.

Moreover, other options that were discussed did not have arguments against them, unlike #741, and some had more apparent support in votes (#741 (comment)). Chris' suggestion of adding a space is a simple fix that can be made immediately, and was one that no one really voiced a strong opinion against.

Oh I missed #741 (comment), that's a reasonable one too.

Edited the voting to add that one in.

If #741 (comment) is an option, why isn't #741 (comment)? In #741 (comment) I showed that both options have the same VSCode issues, the former even a bit more.

you or someone else want to commit to fixing the VS Code extension by the end of next week

Just to understand, what is the most important issue here that needs to be fixed? The separate folding of args and body, or the manual indent needed when starting the body. The extra enter is just as much work as an extra tab, and the latter can go away if fixed in VSCode.

I would just like to point out that whatever is the final decision here, it might be a good idea to make that change optional too (similar to #730). This would make the formatter more flexible and make people happy who don't agree with the decision.

I think #753 (comment) might be relevant here as well - the key issue being the options set by sciml are overridden by the defaults of another style when delegated to it. not sure this is intentional, but apart from indentation, which is the core of a style, we probably want all the options to stay consitent.

I had a look at julia-vscode and it looks like indentation is currently done by regex, which won't be able to tell if ) needs an indent or not.

Comparing to Python it looks like #741 (comment) is more like PEP8 and #741 (comment) more like Black.

I still prefer "Black" over "PEP8", though voted for "PEP8" since "Black" is not an option. If I could tweak the "PEP8" proposal ever so slightly, with multi-line args I'd put the ) at a separate line to be able to find it (and return type annotations, type parameters) more easily.

 function alg_cache(alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
         ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
-        dt, reltol, p, calck,
-        ::Val{true}) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
+        dt, reltol, p, calck, ::Val{true}
+        ) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
     reduced_rate_prototype = rate_prototype.x[2]
     tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))

Another thing I'd tweak for multi-line args would be a newline after (, to make it easier to distinguish the function and argument names.

-function alg_cache(alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
+function alg_cache(
+        alg::FineRKN4, u, rate_prototype, ::Type{uEltypeNoUnits},
         ::Type{uBottomEltypeNoUnits}, ::Type{tTypeNoUnits}, uprev, uprev2, f, t,
         dt, reltol, p, calck, ::Val{true}
         ) where {uEltypeNoUnits, uBottomEltypeNoUnits, tTypeNoUnits}
     reduced_rate_prototype = rate_prototype.x[2]
     tab = FineRKN4ConstantCache(constvalue(uBottomEltypeNoUnits), constvalue(tTypeNoUnits))

Looks like rocket emoji is the call. Thanks @sloede for the suggestion, you get a cookie. @domluna can we get some help implementing this and getting this closed?

I think #753 (comment) might be relevant here as well - the key issue being the options set by sciml are overridden by the defaults of another style when delegated to it. not sure this is intentional, but apart from indentation, which is the core of a style, we probably want all the options to stay consitent.

#753 (comment) is definitely relevant because what the maintainers are mostly having issue with is the fact that somewhere along the line with some other changes, short_to_long_function_def=false is no longer always respected, and we don't want to merge any PR with formatter changes that has the short_to_long_function_def=true because it's non-reversible, i.e. if a function is changed to the big multi-line form then the formatter never brings it back to the denser form, so the best option right now is to just tell people to ignore the formatter check until that's handled. If we can get help on these two issues in some timely fashion then I think we'll be in a good spot again.

right now #754 seems to fix #753 by detecting if the function being transformed from short to long form is within a macro.

the argument is respected btw, I misspoke about that. It's just that SciML overrides a function and calls applies this transform automatically.

https://github.com/domluna/JuliaFormatter.jl/blob/master/src/styles/sciml/nest.jl#L39-L44

it's not checking if short_to_long_function_def is set.

short_to_long_function_def should not be applied within a macro anyway since it can change the semantics.

@domluna can we get some help closing this? We still have some chaos.

@ChrisRackauckas what exactly needs to still be done?

@isaacsas can comment. I think the only thing missing is the extra indent after the function, i.e. matching #741 (comment) ?

- function get_num_majumps(smaj::SpatialMassActionJump{A, B, S, U, V}) where
+ function get_num_majumps(smaj::SpatialMassActionJump{
+     A,
+     B,
+     S,
+     U,
+     V,
+ }) where
+     {A <: AbstractVector, B <: AbstractMatrix, S, U, V}
+     length(smaj.uniform_rates) + size(smaj.spatial_rates, 1)
end

This isn't the style we voted for above, #741 (comment), and the expansion of this to multiple lines is excessively wasteful of the (more!) limited vertical space one has on typical screens...

I'll implement the double indent after a function/macro definition today. sorry i haven't gotten around to this yet.

Thanks!

What about this issue that it takes the type arguments and puts each one on a separate line as in my example above? I thought SciMLStyle is supposed to not do that?

can you post the complete code with options? it shouldn't do that since join_lines_based_on_source is set to true. https://github.com/domluna/JuliaFormatter.jl/blob/master/src/styles/sciml/pretty.jl#L28

@ArnoStrouwen can you post the info on how you reformatted JumpProcesses in SciML/JumpProcesses.jl#352 to help @domluna understand why the formatter was breaking up type arguments to one per line (along with what version of JuliaFormatter you used)? Thanks!

I was using the latest release of JuliaFormatter.
The only option I might have changed from default SciML style was activating the formatting of docstrings and .md files in /docs. I was not aware of the join_lines_based_on_source option before.

#761 would this work for the function/macro indent?

- function get_num_majumps(smaj::SpatialMassActionJump{A, B, S, U, V}) where
+ function get_num_majumps(smaj::SpatialMassActionJump{
+     A,
+     B,
+     S,
+     U,
+     V,
+ }) where
+     {A <: AbstractVector, B <: AbstractMatrix, S, U, V}
+     length(smaj.uniform_rates) + size(smaj.spatial_rates, 1)
end

This isn't the style we voted for above, #741 (comment), and the expansion of this to multiple lines is excessively wasteful of the (more!) limited vertical space one has on typical screens...

ok so this is occurring because in the pretty.jl file we are using YASStyle but then DefaultStyle for nesting.

julia> format_text(s, YASStyle()) |> print
function get_num_majumps(smaj::SpatialMassActionJump{Nothing,B,S,U,V}) where
         {B,S,U,V}
    return size(smaj.spatial_rates, 1)
end
julia> format_text(s, DefaultStyle()) |> print
function get_num_majumps(smaj::SpatialMassActionJump{Nothing,B,S,U,V}) where {B,S,U,V}
    size(smaj.spatial_rates, 1)
end
julia> format_text(s, SciMLStyle()) |> print
function get_num_majumps(
    smaj::SpatialMassActionJump{Nothing, B, S, U, V},
) where
    {B, S, U, V}
        size(smaj.spatial_rates, 1)
end
julia> # ^ with default style for pretty and nest

julia> # ^ with default style for pretty and nest

julia> format_text(s, SciMLStyle()) |> print
function get_num_majumps(smaj::SpatialMassActionJump{
    Nothing,
    B,
    S,
    U,
    V,
}) where
    {B, S, U, V}
        size(smaj.spatial_rates, 1)
end

it may have been an oversight to mix and match using a different style for each stage but you will likely get odd behavior that way since the intention is for the style to be used throughout the entire cycle.

1b33121 fixes it but breaks some tests

function get_num_majumps(
    smaj::SpatialMassActionJump{Nothing, B, S, U, V},
) where
    {B, S, U, V}
        size(smaj.spatial_rates, 1)
end

would be the output from it

tests pass 9edea60

Thanks! It seems like that is the wrong style though based on your example above. The following is what I would've expected from the style we voted for, #741 (comment).

That style specifically avoided any non-indented text between function and end, and indented the function args by two levels. Given it is within 80 characters, I also would have expected the minimal change, i.e. something like

function get_num_majumps(
        smaj::SpatialMassActionJump{Nothing, B, S, U, V}) where {B, S, U, V}
    size(smaj.spatial_rates, 1)
end

is it the function arguments or function body that we want to indent by 2x?

In [84]: s = "function get_num_majumps(smaj::SpatialMassActionJump{Nothing, B, S, U, V}) where {B, S, U, V}"

In [85]: len(s)
Out[85]: 93

that's why it's nesting. the original source code is not that long but SciML has options on to add whitespaces in type parameters which increases the length past the allowed margin

The function arguments and type info are double indent, the function body is single indent (4 characters).

I don’t know about your white space comment, but the code I put above is only 77 characters including the white space. That shouldn’t be more than the SciML col size is it?

By code I mean what I would’ve expected the reformatted code to be.

The function arguments and type info are double indent, the function body is single indent (4 characters).

I don’t know about your white space comment, but the code I put above is only 77 characters including the white space. That shouldn’t be more than the SciML col size is it?

ok I'll change it to the arguments, got confused

I think we can call this done. Thanks!