Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing Pi node after type assert (or, using the non type asserted SSA value leads to worse inference) #38274

Open
KristofferC opened this issue Nov 2, 2020 · 5 comments
Labels
compiler:inference Type inference performance Must go faster

Comments

@KristofferC
Copy link
Sponsor Member

KristofferC commented Nov 2, 2020

Not sure if the title is 100% accurate. Anyway, here are two similar versions of a function:

function f(x::Dict{String, Any})
    for (k, v) in x
        v::String
        return Base.UUID(v)
    end
    return nothing
end

function f2(x::Dict{String, Any})
    for (k, v) in x
        return Base.UUID(v::String)
    end
    return nothing
end

However, the first one fails to infer while the second one does not:

julia> Core.Compiler.return_type(f, Tuple{Dict{String, Any}})
Any

julia> Core.Compiler.return_type(f2, Tuple{Dict{String, Any}})
Union{Nothing, Base.UUID}

Looking at the typed code, we can see why:

julia> code_warntype(f, Tuple{Dict{String, Any}})
...
         (v = Core.getfield(%11, 1))
│         Core.getfield(%6, 2)
│         Core.typeassert(v, Main.String)
│   %15 = Base.UUID::Core.Const(Base.UUID)
│   %16 = (%15)(v)::Any

So julia here decided to pick the local variable v that was inferred to Any in the call to UUID instead of the one that is type asserted (%14). In the inferred version:

julia> code_warntype(f2, Tuple{Dict{String, Any}})
    %14 = Base.UUID::Core.Const(Base.UUID)
│   %15 = Core.typeassert(v, Main.String)::String%16 = (%14)(%15)::Base.UUID

Here it is using %15 which is the one where the type is known.

Perhaps the optimizer can be taught how to pick the better one of the two possible SSA values.

@KristofferC KristofferC added performance Must go faster compiler:inference Type inference labels Nov 2, 2020
@martinholters
Copy link
Member

Also compare to

function f3(x::Dict{String, Any})
    for (k, v) in x
        v = v::String
        return Base.UUID(v)
    end
    return nothing
end

where

julia> code_warntype(f3, Tuple{Dict{String, Any}})
Variables
  v::Any
│         (v = Core.getfield(%11, 1))
│         Core.getfield(%6, 2)
│         (v = Core.typeassert(v, Main.String))
│   %15 = Base.UUID::Core.Const(Base.UUID)
│   %16 = (%15)(v::String)::Base.UUID

With the assignment, v::String is propagated. Would it be invalid in general to do so without the assignment? Or is that just a shortcoming of the current implementation in inference?

@vchuravy
Copy link
Sponsor Member

vchuravy commented Nov 2, 2020

Compare to:

function f4(x::Dict{String, Any})
    for (k, v) in x
        (v isa String) || error("typeassert")
        return Base.UUID(v)
    end
    return nothing
end

Which inserts a PiNode after the error branch.

24 ─ %58 = (%54 isa Main.String)::Bool
└───       goto #26 if not %58
25 ─       nothing::Nothing
│    %61 = Base.UUID::Core.Compiler.Const(Base.UUID, false)
│    %62 = π (%54, String)
│    %63 = invoke %61(%62::String)::Base.UUID
└───       return %63
 function g(x)
         x::String
         return x
       end

 function h(x)
         (x isa String) || error("typeassert")
         return x
       end

julia> code_typed(g, Tuple{Any})
1-element Array{Any,1}:
 CodeInfo(
1 ─     Core.typeassert(x, Main.String)::String
└──     return x
) => Any

julia> code_typed(h, Tuple{Any})
1-element Array{Any,1}:
 CodeInfo(
1 ─ %1 = (x isa Main.String)::Bool
└──      goto #3 if not %1
2 ─      nothing::Nothing
│   %4 = π (x, String)
└──      return %4
3 ─      invoke Main.error("typeassert"::String)::Union{}
└──      $(Expr(:unreachable))::Union{}
) => String

@martinholters
Copy link
Member

This seems to do the trick:

--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -1390,6 +1390,17 @@ function typeinf_local(interp::AbstractInterpreter, frame::InferenceState)
                     else
                         frame.src.ssavaluetypes[pc] = t
                     end
+                    if hd === :call
+                        # give special treatment to typeasserts on variables as in
+                        stmt::Expr
+                        # which constrain the type of the variable in subsequent code
+                        if length(stmt.args) == 3
+                            f = abstract_eval_value(interp, stmt.args[1], changes, frame)
+                            if isa(f, Const) && f.val === Core.typeassert && isa(stmt.args[2], Slot)
+                                changes = StateUpdate(stmt.args[2], VarState(t, false), changes)
+                            end
+                        end
+                    end
                 end
                 if frame.cur_hand !== nothing && isa(changes, StateUpdate)
                     # propagate new type info to exception handler

It feels like a bit of hack though. We don't do this "if v is not of type T in this expression it would throw, so can assume v::T in the following" kind of analysis elsewhere (except for some rather special treatment of conditionals maybe). We could e.g. also think about f(v) restricting the type of v to the possible argument types of f. Meanwhile, I'd probably prefer not doing above change and instead recommending to use the v = v::T idiom instead.

@vchuravy
Copy link
Sponsor Member

vchuravy commented Nov 5, 2020

A different alternative would be to define typeassert as: typeassert(x,T) = (x isa T || error(""); return x), then after inlining the behaviour would fall out of the abstract interpretation rules for control flow, but such a definition has probably unwanted consequences for inference performance and we don't want typeassert to be "user" extendable.

Another option would be to expand typeassert during lowering into the isa form.

@vchuravy
Copy link
Sponsor Member

So I think effectivly the change I would like to see is>

a::Int

becomes

a = a::Int

and

b = a::Int

becomes

a = a::Int
b = a

If we only change the the first we still lose some information and make two similar expressions quite distinct. This is hard to implement on the abstract interpretation level since we currently have the assumption of each statement leading to one state update.

So probably the place to implement is during lowering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:inference Type inference performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants