-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement opaque closures #37849
Implement opaque closures #37849
Conversation
As a mildly big change, I’m hoping to get a chunk of time next week to review, amidst any v1.6 branching questions and bugs. Can you make sure |
The intent of this PR is to get the semantics right and add the minimum possible stub code in all the right places to make the various cases I want work. I'm sure that there are lots of missing edge cases (for example I know I haven't done anything yet with opaque closures that take a vararg), but I think it's in a good enough state to get onto master and continue iterating there. It's not really in the state where I would recommend people immediately jump on it. I think the primary thing to focus on in the review would be the semantics of the new builtin, the lattice element, etc. Everything else is just a matter of fixing some corner cases, but if we get the semantics wrong, we're obviously in trouble. |
# only propagate information we know we can store | ||
# and is valid inter-procedurally | ||
rt = widenconst(rt) | ||
end | ||
if tchanged(rt, frame.bestguess) | ||
# new (wider) return type for frame | ||
frame.bestguess = tmerge(frame.bestguess, rt) | ||
frame.bestguess = frame.bestguess === NOT_FOUND ? rt : tmerge(frame.bestguess, rt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be reasonable to add this into tmerge
such that: tmerge(::NotFound, x) = x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tmerge is very hot and this isn't technically part of the lattice, but rather a sentinel value only used in a few places. It used to be in tmerge, but @vtjnash moved it out a few months ago.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -112,7 +121,7 @@ mutable struct InferenceState | |||
ssavalue_uses, throw_blocks, | |||
Vector{Tuple{InferenceState,LineNum}}(), # cycle_backedges | |||
Vector{InferenceState}(), # callers_in_cycle | |||
#=parent=#nothing, | |||
#=parent=#nothing, #= has_opaque_closures =# false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the style change from annoating on right (old line 113 and 114) to annotating inline?
#=parent=#nothing, #= has_opaque_closures =# false, | |
nothing, # parent | |
false, # has_opaque_closures |
@@ -242,6 +251,7 @@ end | |||
|
|||
# temporarily accumulate our edges to later add as backedges in the callee | |||
function add_backedge!(li::MethodInstance, caller::InferenceState) | |||
caller.linfo !== nothing || return # don't add backedges to opauqe closures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caller.linfo !== nothing || return # don't add backedges to opauqe closures | |
caller.linfo !== nothing || return # don't add backedges to opaque closures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a TODO XXX? I'd assume that OpaqueClosures do have edges (ie. can call functions), so we need to be able to propagate those back to their caller
caller.linfo !== nothing || return # don't add backedges to opauqe closures | ||
isa(caller.linfo.def, Method) || return # don't add backedges to toplevel exprs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it work adding documentairy helper functions
is_toplevel(::InferenceState)
is_opaque_closure(::InferenceState)
@@ -34,7 +34,7 @@ struct InliningState{S <: Union{EdgeTracker, Nothing}, T <: Union{InferenceCache | |||
end | |||
|
|||
mutable struct OptimizationState | |||
linfo::MethodInstance | |||
linfo::Union{MethodInstance, Nothing} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it add clarity to define a new singleton to use a signal something is a OpaqueClosure,
rather than using Nothing
for this?
Or is using nothing
a sitatution that in theory could occur more widely than just for OpaqueClosures?
Or from another direction, should there be a OpaqueClosureOptimizationState
that doens't have this field?
I am guessing not but am interested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that might be cleaner, but honestly this whole thing needs a refactor, but that's for another PR.
@@ -311,6 +311,13 @@ function statement_cost(ex::Expr, line::Int, src::CodeInfo, sptypes::Vector{Any} | |||
ftyp = argextype(farg, src, sptypes, slottypes) | |||
end | |||
end | |||
# Give calls to OpaqueClosures zero cost. The plan is for these to be a single |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like everything below here should be a seperate function.
(Or rather a collection of seperate methods).
Is the reason it isn't to do with compilation costs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this could be refactored.
base/compiler/ssair/inlining.jl
Outdated
@@ -533,6 +550,12 @@ function batch_inline!(todo::Vector{Pair{Int, Any}}, ir::IRCode, linetable::Vect | |||
for ((old_idx, idx), stmt) in compact | |||
if old_idx == inline_idx | |||
argexprs = copy(stmt.args) | |||
if isa(item, InliningTodo) && item.mi === nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this again makes me this we should use a different marker than nothing
to indicate that this is a opaque closure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are MethodInstances for opaque closures at this stage, inlining just doesn't support that yet.
base/compiler/ssair/inlining.jl
Outdated
@@ -1165,6 +1223,10 @@ function assemble_inline_todo!(ir::IRCode, state::InliningState) | |||
ir.stmts[idx][:inst] = quoted(calltype.val) | |||
continue | |||
end | |||
# Refuse to inline OpaqueClosures we can't see otherwise, to preserve the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be either:
# Refuse to inline OpaqueClosures we can't see otherwise, to preserve the | |
# Otherwise: refuse to inline OpaqueClosures we can't see, to preserve the |
or:
# Refuse to inline OpaqueClosures we can't see otherwise, to preserve the | |
# Refuse to inline OpaqueClosures we can't see, to preserve the |
Or am i not understanding what is going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems to have gotten lost.
base/compiler/tfuncs.jl
Outdated
argt, argt_exact = instanceof_tfunc(arg) | ||
lbt, lb_exact = instanceof_tfunc(lb) | ||
if !lb_exact | ||
lbt = Union{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lbt = Union{} | |
lbt = Union{} |
base/compiler/typeinfer.jl
Outdated
# If we don't need to track specializations, just infer this here | ||
# right now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# If we don't need to track specializations, just infer this here | |
# right now. | |
# If we don't need to track specializations, just infer this here right now. |
Julia's line limit is 92 character so this fits on one line
jl_perm_symsvec(4, "env", "ci", "fptr1", "fptr"), | ||
jl_svec(4, jl_any_type, jl_code_info_type, pointer_void, pointer_void), 0, 0, 4)->name->wrapper; | ||
jl_opaque_closure_typename = ((jl_datatype_t*)jl_unwrap_unionall((jl_value_t*)jl_opaque_closure_type))->name; | ||
jl_compute_field_offsets((jl_datatype_t*)jl_unwrap_unionall((jl_value_t*)jl_opaque_closure_type)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be defined in boot.jl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the moment yes, but I'm anticipating giving these special layout to make the fields truly invisible and let LLVM resize them for the capture list, which wouldn't be available from boot.jl.
src/julia.h
Outdated
@@ -351,6 +351,19 @@ struct _jl_method_instance_t { | |||
uint8_t inInference; // flags to tell if inference is running on this object | |||
}; | |||
|
|||
// YACK - Yet another kind of closure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// YACK - Yet another kind of closure. | |
// OpaqueClosure |
union { | ||
jl_value_t *code; | ||
jl_code_info_t *source; | ||
jl_method_t *method; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When are each of these cases used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code
is generic. CodeInfo
is used by the frontend and in the optimized representation if the argument types are uniquely determined during inference. The optimizer inserts a Method
for this if it needs to potentially specialize the opaque closure on the argument types passed later.
A thing @vtjnash and I discussed in the multithreading call was that |
Ah, okay. I thought you were trying to merge this into v1.6. Makes more sense if we're actually giving this some time on master to finish and mature. I'm hoping we can branch that off ASAP, so that we can start merging more feature work and getting experience with great new advancements like this. |
2af782d
to
3574fda
Compare
caeafb5
to
2f7510b
Compare
@Keno I've been trying to use this, and I'm getting some errors. Can you confirm that this transformation should work?
When I try to run |
fee7fe3
to
ca5a8ea
Compare
There is frontend support now, so you'll probably want to try that:
That said, performance here may not yet reflect the performance characteristics of the final feature. |
Thanks for making the frontend work! This is now working. It currently is a little slower than a regular closure largely due to extra allocations. |
base/compiler/ssair/driver.jl
Outdated
@@ -34,15 +34,33 @@ function normalize(@nospecialize(stmt), meta::Vector{Any}) | |||
return stmt | |||
end | |||
|
|||
function convert_to_ircode(ci::CodeInfo, code::Vector{Any}, coverage::Bool, nargs::Int, sv::OptimizationState) | |||
function add_opaque_closure_argtypes!(argtypes, t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a few methods still need type annotations in the Core.Compiler module. I don't want to mark each case.
Aside: also seems like this particular one might be unused? but github might just be hiding the right file diff
base/compiler/ssair/driver.jl
Outdated
end | ||
|
||
|
||
function convert_to_ircode(ci::CodeInfo, code::Vector{Any}, coverage::Bool, nargs::Int, sv::OptimizationState, slottypes=sv.slottypes, stmtinfo=sv.stmt_info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: line length (github is acting unwilling to show me the whole line)
base/compiler/ssair/inlining.jl
Outdated
@@ -65,6 +67,9 @@ end | |||
function ssa_inlining_pass!(ir::IRCode, linetable::Vector{LineInfoNode}, state::InliningState, propagate_inbounds::Bool) | |||
# Go through the function, performing simple ininlingin (e.g. replacing call by constants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Go through the function, performing simple ininlingin (e.g. replacing call by constants | |
# Go through the function, performing simple inlining (e.g. replacing call by constants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inlining?
@noinline function (y::Core.OpaqueClosure{A, R})(args...) where {A,R} | ||
typeassert(args, A) | ||
ccall(y.fptr1, Any, (Any, Ptr{Any}, Int), y, Any[args...], length(args))::R | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code probably should probably now be moved to C (where jl_apply_generic
is) as the invoke handler for objects of this type, where we can better deal with race-conditions and updating the pointer after reoptimization and calling convention changes and such (and hopefully optimizations too). I think that'll actually even make the typeassert
disappear into the dynamic dispatch caching operation on the previous line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump: this should be in C
base/opaque_closure.jl
Outdated
end | ||
|
||
|
||
# @opaque macro goes here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# @opaque macro goes here | |
""" | |
@opaque macro docstring goes here | |
""" |
@@ -1015,6 +1017,34 @@ function abstract_call_known(interp::AbstractInterpreter, @nospecialize(f), | |||
return abstract_call_gf_by_type(interp, f, argtypes, atype, sv, max_methods) | |||
end | |||
|
|||
function abstract_call_opaque_closure(interp::AbstractInterpreter, clos::PartialOpaque, argtypes::Vector{Any}, sv::InferenceState) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function abstract_call_opaque_closure(interp::AbstractInterpreter, clos::PartialOpaque, argtypes::Vector{Any}, sv::InferenceState) | |
function abstract_call_opaque_closure(interp::AbstractInterpreter, closure::PartialOpaque, argtypes::Vector{Any}, sv::InferenceState) |
|
||
argt = unwrap_unionall(clos.t).parameters[1] | ||
|
||
argtypes = Any[argt.parameters...] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all subtypes of Tuple have .parameters
(maybe use fieldtype
?)
function finish_rettype!(@nospecialize(rt), mod::Module, interp::AbstractInterpreter) | ||
if isa(rt, PartialOpaque) | ||
finish_opaque_closure!(rt, mod, interp) | ||
elseif isa(rt, PartialStruct) | ||
for field in rt.fields | ||
if isa(field, PartialStruct) || isa(field, PartialOpaque) | ||
finish_rettype!(field, mod, interp) | ||
end | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function finish_rettype!(@nospecialize(rt), mod::Module, interp::AbstractInterpreter) | |
if isa(rt, PartialOpaque) | |
finish_opaque_closure!(rt, mod, interp) | |
elseif isa(rt, PartialStruct) | |
for field in rt.fields | |
if isa(field, PartialStruct) || isa(field, PartialOpaque) | |
finish_rettype!(field, mod, interp) | |
end | |
end | |
end | |
end |
seems dead?
# Infer with the most general types possible, so that optimization has | ||
# access to the result. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implication in finish
is that inference is already done, since otherwise we might still have an unresolved cycle that must be converged first before we can annotate the correct (sufficiently wide) types. Thus, this is not a point in the graph where it is valid to look to look at IPO information.
I just built this branch and ran some tests to see what opaque closures can do: I see some allocations hurting the performance for small functions, but it is all right, what bothers me more is that LLVM can't optimalize the opaque functions further and discarding the allocations for the arrays in return statements. As @Keno said in his video, we could go hard on opaque function optimizations, and trade little more compilation time for little more optimized code in the case of opaque closures. I believe there needs to be a way to get better performance, probably this is all about generating MLIR instead of LLVM-IR? Sorry if this question is a weak connected to the issue. |
This method was throwing an error when called after the compaction had processes the last instruction in the IR (with reverse affinity). This currently can't, because we never use this method after compacting a terminal instruction (inlining uses it to split BBs), but can after the new code in #37849, so fix the bug now. Forward port from #37849.
This method was throwing an error when called after the compaction had processes the last instruction in the IR (with reverse affinity). This currently can't, because we never use this method after compacting a terminal instruction (inlining uses it to split BBs), but can after the new code in #37849, so fix the bug now. Forward port from #37849.
This is #37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken). I believe I have addressed the review in the original PR relevant to the files extracted here.
This is #37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken). I believe I have addressed the review in the original PR relevant to the files extracted here.
This is #37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken). I believe I have addressed the review in the original PR relevant to the files extracted here.
This is #37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken). I believe I have addressed the review in the original PR relevant to the files extracted here.
This is #37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken).
May have taken 6 months, but almost every thing that was in here is on master now. |
@Keno in a related vein, any eta for a public/beta version of diffractor.jl? i'd love to try it out |
A few weeks, once it's actually in a tryable form. |
This method was throwing an error when called after the compaction had processes the last instruction in the IR (with reverse affinity). This currently can't, because we never use this method after compacting a terminal instruction (inlining uses it to split BBs), but can after the new code in JuliaLang#37849, so fix the bug now. Forward port from JuliaLang#37849.
…ang#39020) Forward port from JuliaLang#37849 to reduce the number of unrelated changes in that PR. The argument list here was simply getting unwieldy. The new `hasfirst` argument isn't yet used in this PR, but will be used in JuliaLang#37849.
This is JuliaLang#37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken).
This is JuliaLang#37849 modulo inference/optimization support. This is self-contained and the functionality tests pass, though the tests testing for various optimizations of course don't (they're marked as test_broken).
This is the end result of the design process in #31253.
Overview
This PR implements a new kind of closure, called an
opaque closure
.It is designed to be complimenatry to the existing closure mechanism
and makes some different trade offs. The motivation for this mechanism
comes primarily from closure-based AD tools, but I'm expecting it will
find other use cases as well. From the end user perspective, opaque
closures basically behave like regular closures, expect that they
are introduced by adding the
@opaque
macro (not part of this PR,but will be added after). In front of the existing closure. In
particular, all scoping, capture, etc. rules are identical. For
such user written closures, the primary difference is in the
performance characteristics. In particular:
on the argument and return types of the closure, but not on the
closure identity. (This also means that the opaque closure will not
be eligible for inlining into the higher order function, unless the
inliner can see both the definition and the call site).
opaque closure (e.g. dropping unused captures, or reducing
Box
edvalues back to value captures).
The
opaque
part of the naming comes from the notion that semantically,nothing is supposed to inspect either the code or the capture environment
of the opaque closure, since the optimizer is allowed to choose any value
for these that preserves the behavior of calling the opaque closure itself.
Motivation
Optimization across closure boundaries
Consider the following situation (type annotations are inference
results, not type asserts)
now, the traditional closure mechanism will lower this to:
the problem with this is apparent: Even though (after inference),
we know that
a
is unused in the closure (and thus would beable to delete the expensive call were it not for the capture),
we may not delete it, simply because we need to satisfy the full
capture list of the closure. Ideally, we would like to have a mechanism
where the optimizer may modify the capture list of a closure in
response to information it discovers.
Closures from Casette transforms
Compiler passes like Zygote would like to generate new closures
from untyped IR (i.e. after the frontend runs) (and in the future
potentially typed IR also). We currently do not have a great mechanism
to support this. This provides a very straightforward implementation
of this feature, as opaque closures may be inserted at any point during
the compilation process (unlike types, which may only be inserted
by the frontend).
Mechanism
The primary concept introduced by this PR is the
OpaqueClosure{A<:Tuple, R}
type, constructed, by the new
Core._opaque_closure
builtin, withthe following signature:
Captures are available to the CodeInfo as
getfield
from Slot 1(referenced by position).
Examples
I think the easiest way to understand opaque closures is look through
a few examples. These make use of the
@opaque
macro which isn'timplemented yet, but makes understanding the semantics easier.
Some of these examples, in currently available syntax can be seen
in test/opaque_closure.jl