aviatesk / JET.jl

An experimental code analyzer for Julia. No need for additional type annotations.

Home Page:https://aviatesk.github.io/JET.jl/dev/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Weird error involving potential `Missing` in union split

gdalle opened this issue · comments

Hi there,
I'm currently trying to get Graphs.jl JET-approved, and I only have one error to fix.

   ═════ 2 possible errors found ═════
  ┌ ==(g::Graphs.SimpleGraphs.SimpleDiGraph, h::Graphs.SimpleGraphs.SimpleDiGraph) @ Graphs.SimpleGraphs /home/runner/work/Graphs.jl/Graphs.jl/src/SimpleGraphs/simpledigraph.jl:411
  │ non-boolean `Missing` found in boolean context (1/2 union split): goto %18 if not (Graphs.SimpleGraphs.fadj(g::Graphs.SimpleGraphs.SimpleDiGraph)::Array{Vector{T}, 1} where T<:Integer Graphs.SimpleGraphs.:(==) Graphs.SimpleGraphs.fadj(h::Graphs.SimpleGraphs.SimpleDiGraph)::Array{Vector{T}, 1} where T<:Integer)::Union{Missing, Bool}
  └────────────────────
  ┌ ==(g::Graphs.SimpleGraphs.SimpleDiGraph, h::Graphs.SimpleGraphs.SimpleDiGraph) @ Graphs.SimpleGraphs /home/runner/work/Graphs.jl/Graphs.jl/src/SimpleGraphs/simpledigraph.jl:411
  │ non-boolean `Missing` found in boolean context (1/2 union split): goto %18 if not (Graphs.SimpleGraphs.badj(g::Graphs.SimpleGraphs.SimpleDiGraph)::Array{Vector{T}, 1} where T<:Integer Graphs.SimpleGraphs.:(==) Graphs.SimpleGraphs.badj(h::Graphs.SimpleGraphs.SimpleDiGraph)::Array{Vector{T}, 1} where T<:Integer)::Union{Missing, Bool}
  └────────────────────

Basically I'm comparing two adjacency lists with fadj(g) == fadj(h) and JET is afrait that the result might be Missing, but I really don't know why.
See the CI log: https://github.com/JuliaGraphs/Graphs.jl/actions/runs/5344089300/jobs/9688032589?pr=249

More precisely, this is the guilty function:

function ==(g::SimpleDiGraph{T1}, h::SimpleDiGraph{T2}) where {T1,T2}
    if vertices(g) == vertices(h) && ne(g) == ne(h)
        if fadj(g) == fadj(h) && badj(g) == badj(h)  # fails JET
        # if all(fadj(g) .== fadj(h)) && all(badj(g) .== badj(h))  # passes JET
            return true
        else
            return false
        end
    else
        return false
    end
end

The results of fadj(g) and badj(g) are adjacency lists, ie. Vector{Vector{<:Integer}}. So it seems comparing these objects directly fails, but I just found that if I compare the nested vectors individually it passes.

I'm currently trying to get Graphs.jl JET-approved

Tangent here, but what exactly does that mean? report_package? If Graphs has a precompile workload it would be interesting to compare the MethodAnalysis-based approach in https://aviatesk.github.io/JET.jl/dev/tutorial/#Analyze-packages-using-a-representative-workload and see if it detects any other issues. report_package uses abstract inference, and the analysis stops at any runtime dispatch (https://aviatesk.github.io/JET.jl/stable/#Limitations), so there's a chance you'd discover different things if you're using concrete MethodInstance signatures.

Tangent here, but what exactly does that mean? report_package?

Yes, or in this case test_package

If Graphs has a precompile workload it would be interesting to compare the MethodAnalysis-based approach

It doesn't yet have one, but I think that might be interesting to put in place as soon as GraphsBase.jl emerges.

report_package uses abstract inference, and the analysis stops at any runtime dispatch

Indeed, and I seriously doubt all the code in Graphs.jl is inferrable. But JET already caught quite a few mistakes, so abstract interpretation is already better than nothing!

abstract interpretation is already better than nothing

Indeed! I don't think there's a single solution to all these problems, but the main realistic possibilities are being filled out.

This is not an issue with JET. Here is a minimal example:

julia> f() = Vector{Any}([1,2,3])
f (generic function with 1 method)

julia> function foo()
           if f() == f()
               1    
           else
               2
           end
       end

foo (generic function with 1 method)

julia> using JET

julia> @report_call foo()
═════ 1 possible error found ═════
┌ foo() @ Main ./REPL[2]:2
│ non-boolean `Missing` found in boolean context (1/2 union split): goto %6 if not (f()::Vector{Any} == f()::Vector{Any})::Union{Missing, Bool}

If brief, the issue is that ==(::Vector, ::Vector), when the element type of the vectors are unknown, is inferred to be Union{Bool, Missing}. This inference is reasonable given the definition of == on vectors. You then take this output and put it into an if-statement, which will fail if the == returns missing.

The solution is to make fadj and badj well-inferred.

Perhaps you are testing digraphs with abstract types? In that case, it's hard to avoid potential issues.

julia> report_call(==, (SimpleDiGraph{Int}, SimpleDiGraph{Int}))
No errors detected

julia> report_call(==, (SimpleDiGraph{Integer}, SimpleDiGraph{Integer}))
═════ 1 possible error found ═════
┌ ==(g::SimpleDiGraph{Integer}, h::SimpleDiGraph{Integer}) @ Graphs.SimpleGraphs /home/jakni/.julia/packages/Graphs/7SMZs/src/SimpleGraphs/simpledigraph.jl:383
│ non-boolean `Missing` found in boolean context (1/2 union split): goto %17 if not (Graphs.SimpleGraphs.fadj(g::SimpleDiGraph{Integer})::Vector{Vector{Integer}} Graphs.SimpleGraphs.:(==) Graphs.SimpleGraphs.fadj(h::SimpleDiGraph{Integer})::Vector{Vector{Integer}})::Union{Missing, Bool}

Following PR #543, JET ignores the possibility of a poorly-inferred x == y call returning missing during the report_package analysis. Refer to issue #542 for reasons justifying this behavior. Essentially, report_package often relies on poor input argument type information at the beginning of analysis, leading to noisy error reports for function definitions like:

struct MyToken end
ismytoken(x) = x == MyToken()

This error is arguably just noise when the target package does not handle missing. report_package is designed as an entry point for easy analysis, even at the cost of accuracy, so it is not sound from the beginning. Hence, it might be beneficial to simply ignore such noise. A package that handles missing can disable this behavior if necessary.

However, in interactive entry points like report_call, where concrete input argument types are available, this behavior should be turned off. This is because, if the code, when given specific input argument types, results in a Union{Bool,Missing} possibility, it likely signifies an inferrability issue or the code really needs to handle missing.

Oops, the original bug is fixed by JET 0.8.2 on Julia 1.9, but a new similar one appeared with Julia 1.10

https://github.com/JuliaGraphs/Graphs.jl/actions/runs/5377914717/jobs/9757016398

I don't think the fault lies with bad inference in Graphs.jl, Cthulhu shows no issue.

Full function for reference:

function cycle_basis(g::AbstractSimpleGraph, root=nothing)
    T = eltype(g)
    cycles = Vector{Vector{T}}()

    nv(g) == 0 && return cycles

    gnodes = Set(vertices(g))
    r::T = (root === nothing) ? pop!(gnodes) : T(root)
    while true
        stack = [r]
        pred = Dict(r => r)
        keys_pred = Set(r)
        used = Dict(r => T[])
        keys_used = Set(r)
        while !isempty(stack)
            z = pop!(stack)
            zused = used[z]
            for nbr in neighbors(g, z)
                if !in(nbr, keys_used)
                    pred[nbr] = z
                    push!(keys_pred, nbr)
                    push!(stack, nbr)
                    used[nbr] = [z]
                    push!(keys_used, nbr)
                elseif nbr == z
                    push!(cycles, [z])
                elseif !in(nbr, zused)  # JET errors here
                    pn = used[nbr]
                    cycle = [nbr, z]
                    p = pred[z]
                    while !in(p, pn)  # JET errors here
                        push!(cycle, p)
                        p = pred[p]
                    end
                    push!(cycle, p)
                    push!(cycles, cycle)
                    push!(used[nbr], z)
                end
            end
        end
        setdiff!(gnodes, keys_pred)
        isempty(gnodes) && break
        r = pop!(gnodes)
    end
    return cycles
end

Oops, the original bug is fixed by JET 0.8.2 on Julia 1.9, but a new similar one appeared with Julia 1.10

https://github.com/JuliaGraphs/Graphs.jl/actions/runs/5377914717/jobs/9757016398

It should be fixed by JET 0.8.3, which is waiting for JuliaRegistries/General#86248 to be merged.

Sorry for causing lots of releases 🤣
And thank you for being so reactive, it is extremely appreciated