Refactor cache logic for easy replacement #35831
Conversation
895f9f8 to 33aa49a
This builds on top of #35831, letting inference emit a custom message whenever it gives up on inferring something. These messages are intended to be displayed by external tools, either to debug what inference is doing (e.g. for Cthulhu) or, if an external compiler needs to disallow certain kinds of missing information (e.g. no dynamic dispatch on GPUs), to improve diagnostics. This is mostly a proof of concept; I think these messages/hooks need to be a bit richer for a full solution, though this can already be useful if we hook up something like Cthulhu. As a proof of concept, I hacked up a 10-line function that reads these messages. It works something like the following:

```
function bar()
    sin = eval(:sin)
    sin(1)
end
foo() = bar()
```

```
julia> Compiler3.analyze_static(Tuple{typeof(foo)})
In function: bar()
ERROR: The called function was unknown
1| function bar()
2|     sin = eval(:sin)
=>     sin(1)
4| end
[1] In foo()
```

The reason this needs to sit on top of #35831 is that it needs to run with a completely fresh cache, which #35831 provides the primitives for. Otherwise, while you could still get the annotations out, that would only work the first time something is inferred anywhere in the system. With a fresh cache, everything is analyzed again, and any messages like these that are opted in to can be collected.
This is the next step in the line of work started by #33955, though a lot of enabling work towards this was previously done by Jameson in his codegen-norecursion branch. The basic thrust here is to allow external packages to manage their own cache of compiled code that may have been generated using entirely different inference or compiler options. The GPU compilers are one such example, but there are several others, including generating code using offload compilers, such as XLA or compilers for secure computation. A lot of this is just moving code around to make it clear exactly which parts of the code are accessing the internal code cache (which is now its own type to make it obvious when it's being accessed), as well as providing clear extension points for custom cache implementations. The second part is to refactor CodeInstance construction to separate construction and insertion into the internal cache (so it can be inserted into an external cache instead if desired). The last part of the change is to give cgparams another hook that lets the caller replace the cache lookup to be used by codegen.
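To make the extension point concrete, here is a minimal sketch of what a custom cache along these lines could look like. The names (`ExternalCodeCache`, the symbol stand-ins for method and code instances) are hypothetical illustrations, not the PR's actual types; the point is the shape of the interface, with `getindex`/`setindex!` as the explicit lookup and insertion verbs:

```julia
# Hypothetical external code cache keyed by method instance. Real caches would
# map MethodInstance => CodeInstance; symbols stand in for both here.
struct ExternalCodeCache
    dict::IdDict{Any,Any}
end
ExternalCodeCache() = ExternalCodeCache(IdDict{Any,Any}())

# Lookup reads like ordinary collection access, so every cache access is
# explicit at the call site.
Base.getindex(cache::ExternalCodeCache, mi) = cache.dict[mi]
Base.haskey(cache::ExternalCodeCache, mi) = haskey(cache.dict, mi)
Base.get(cache::ExternalCodeCache, mi, default) = get(cache.dict, mi, default)

# Insertion is a separate step from construction, mirroring the PR's split
# between building a CodeInstance and deciding which cache it lands in.
Base.setindex!(cache::ExternalCodeCache, ci, mi) = (cache.dict[mi] = ci; cache)

cache = ExternalCodeCache()
cache[:mi_for_foo] = :compiled_foo
@assert cache[:mi_for_foo] === :compiled_foo
```

A GPU compiler could keep one such cache per set of compiler options, leaving the native cache untouched.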
```c
codeinst = (jl_code_instance_t*)ci;
*src_out = (jl_code_info_t*)codeinst->inferred;
jl_method_t *def = codeinst->def->def.method;
if ((jl_value_t*)*src_out == jl_nothing)
```
What about `rettype_const` methods, those don't have a `ci.inferred`?
> What about `rettype_const` methods, those don't have a `ci.inferred`?

Same as before, they won't get an LLVM function emitted for them.
> Also, what happens if the JIT deletes an inferred code instance,

Fair point, that should probably be turned off.
> Same as before, they won't get an LLVM function emitted for them.

We currently do, though. If you call `jl_create_native` for `kernel() = return`, you get the following un-optimized IR:
```llvm
define internal void @julia_kernel_39() !dbg !5 {
top:
  %0 = call %jl_value_t*** @julia.ptls_states()
  %1 = bitcast %jl_value_t*** %0 to %jl_value_t addrspace(10)**
  %2 = getelementptr inbounds %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %1, i64 4
  %3 = bitcast %jl_value_t addrspace(10)** %2 to i64**
  %4 = load i64*, i64** %3, !tbaa !7, !invariant.load !4
  ret void, !dbg !11
}
```
It's only when calling this function from another that the return type gets folded.
Doing the same with the proposed PR and a `lookup` cgparam yields "Refusing to automatically run type inference with custom cache lookup".
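For anyone following along, the unoptimized IR above can be reproduced from the REPL without calling `jl_create_native` directly (hedged: the exact IR varies by Julia version and platform):

```julia
using InteractiveUtils  # stdlib; provides code_llvm

kernel() = return

# With optimize=false we see the raw body codegen emits for the kernel itself,
# before the constant-return folding that only happens at call sites inside
# another function.
code_llvm(stdout, kernel, Tuple{}; optimize=false)
```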
Ah, that just seems like a bug then. We should just allow that case.
Ok, I see what's happening here. In the regular case it actually does call type inference then, which will then give it a CodeInfo, even though we never cache those because they're not actually supposed to be used. I would say for now, let's just filter these in the AbstractInterpreter implementation that calls into here, and maybe in the future we can have it synthesize an appropriate body without forcing it to re-do type inference.
Co-authored-by: Tim Besard <[email protected]>
base/compiler/typeinfer.jl
Outdated
```julia
rettype_const = (result.src::Const).val
const_flags = 0x3
function CodeInstance(result::InferenceResult, min_valid::UInt, max_valid::UInt,
                      may_compress=true, always_cache_tree=false)
```
"always_cache" sounds like a bit of a misnomer, since there's always going to be intermediate work that we discarded because it'd be invalid to call `cache_result` on it.
`allow_discard_tree=true`?
I'll just rename this to `allow_discard_tree` and then get this merged so I can put up some of the follow-on work to this.
```diff
@@ -470,12 +488,12 @@ function typeinf_edge(interp::AbstractInterpreter, method::Method, @nospecialize
     end
     if frame === false
         # completely new
-        mi.inInference = true
+        lock_mi_inference(interp, mi)
```
I don't know that this is a lock in the proper sense. It's a bootstrapping device, but it's also merely an advisory hint.
It's not a lock in the atomic sense, but I do think it's a semantic lock.
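To illustrate the "semantic lock" reading, here is a sketch of the hook shape under stated assumptions: the type and function names mirror the PR, but `FakeMethodInstance` and the interpreter structs are stand-ins, not the real compiler types. The native interpreter sets the advisory `inInference` flag so concurrent inference of the same method instance can back off, while an interpreter with a private cache can make the hook a no-op, since it never races on the shared cache:

```julia
abstract type AbstractInterp end
struct NativeInterp <: AbstractInterp end    # uses the shared internal cache
struct ExternalInterp <: AbstractInterp end  # owns a private cache

# Stand-in for a MethodInstance; only the advisory flag matters here.
mutable struct FakeMethodInstance
    inInference::Bool
end

# The default behavior: set/clear the advisory flag.
lock_mi_inference(::NativeInterp, mi::FakeMethodInstance) = (mi.inInference = true; nothing)
unlock_mi_inference(::NativeInterp, mi::FakeMethodInstance) = (mi.inInference = false; nothing)

# External interpreters opt out of the advisory flag entirely.
lock_mi_inference(::ExternalInterp, mi::FakeMethodInstance) = nothing
unlock_mi_inference(::ExternalInterp, mi::FakeMethodInstance) = nothing

mi = FakeMethodInstance(false)
lock_mi_inference(NativeInterp(), mi)
@assert mi.inInference
```

Nothing enforces mutual exclusion, which is why "advisory hint" is also a fair description; the hook just makes the policy per-interpreter.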
```c
// The root propagation here doesn't have to be literal, but callers should
// ensure that the return value outlives the MethodInstance
typedef jl_value_t *(*jl_codeinstance_lookup_t)(jl_method_instance_t *mi JL_PROPAGATES_ROOT,
                                                size_t min_world, size_t max_world);
```
I'd be interested in knowing if we can break this assumption (remove the `JL_PROPAGATES_ROOT` attribute). It is required that the result be rooted (because it could hold a couple of the gc roots for the generated code result) as long as it could appear on the call-stack, but perhaps that's okay?

> outlives the MethodInstance

These are typically required to be permanently allocated (so that we can track their edges), so this would imply they cannot be freed?
> These are typically required to be permanently allocated

Yes, I just wanted to annotate this from a lifetime perspective here, since we don't want to require that the `mi` literally roots the result value here.
Co-authored-by: Jameson Nash <[email protected]>
As of #35831, we've had the ability to cache CodeInstances in an external cache and to replace that cache during jl_generate_native (e.g. for GPU compilation). This extends the same capability to CodeInstances as they are added to the execution engine.
The primary motivation here is to clean up the notion of "looking up a method in the method table" into a single object that can be passed around. At the moment, it's all a bit murky, with fairly large state objects being passed around everywhere and implicit accesses to the global environment. In my AD use case, I need to be a bit careful to make sure the various inference and optimization steps are looking things up in the correct tables/caches, so being very explicit about where things need to be looked up is quite helpful. In particular, I would like to clean up the optimizer to not require the big `OptimizationState`, which is currently a bit of a mix of things that go into IRCode and information needed for method lookup/edge tracking. That isn't part of this PR, but will build on top of it.

More generally, with a bunch of the recent compiler work, I've been trying to define more crisp boundaries between the various components of the system, giving them clearer interfaces, and at least a little bit of documentation. The compiler is a very powerful bit of technology, but I think people have been avoiding it because the code looks a bit scary. I'm hoping some of these cleanups will make it easier for people to understand what's going on.

Here in particular, I'm using `findall(sig, table)` as the verb for method lookup. The idea being that people familiar with the `findall(predicate, collection)` idiom from regular julia will have a good intuitive understanding of what's happening (a collection is searched for a predicate, an array of matches is returned, etc.). Of course, it's not a perfect fit, but I think these kinds of mental aids can be helpful in making it easier for people to read compiler code (similar to how #35831 used `getindex` as the verb for cache lookup). While I was at it, I also cleaned up the use of out-parameters, which leaked through too much of the underlying C API, and replaced them with a proper struct of results.
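The `findall(sig, table)` reading can be illustrated with a toy table. Note this is an analogy under stated assumptions, not the compiler's actual types (`ToyMethodTable` and the subtype-based match rule are simplifications made for illustration):

```julia
# A toy method table: each entry pairs a signature type with a method name.
struct ToyMethodTable
    entries::Vector{Pair{Type,Symbol}}
end

# An entry "matches" when the queried signature is a subtype of the entry's
# signature; real method matching is considerably more involved.
Base.findall(@nospecialize(sig::Type), table::ToyMethodTable) =
    [name for (s, name) in table.entries if sig <: s]

table = ToyMethodTable([Tuple{Integer} => :f_integer, Tuple{Number} => :f_number])
matches = findall(Tuple{Int}, table)
@assert matches == [:f_integer, :f_number]  # Tuple{Int} is a subtype of both
```

The call site reads exactly like the familiar `findall(predicate, collection)`: a collection is searched, and an array of matches comes back.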
cc @maleadt @jpsamaroo