Refactor cache logic for easy replacement #35831
Conversation
895f9f8 to 33aa49a
This builds on top of #35831, letting inference emit a custom message whenever it gives up on inferring something. These messages are intended to be displayed by external tools, either to debug what inference is doing (e.g. for Cthulhu) or, if an external compiler needs to disallow certain kinds of missing information (e.g. no dynamic dispatch on GPUs), to improve diagnostics. This is mostly a proof of concept; I think these messages/hooks need to be a bit richer for a full solution, though this can already be useful if we hook up something like Cthulhu. As a proof of concept, I hacked up a 10-line function that reads these messages. It works something like the following:

```
function bar()
    sin = eval(:sin)
    sin(1)
end
foo() = bar()
```

```
julia> Compiler3.analyze_static(Tuple{typeof(foo)})
In function: bar()
ERROR: The called function was unknown
1| function bar()
2|     sin = eval(:sin)
=>     sin(1)
4| end
[1] In foo()
```

The reason this needs to sit on top of #35831 is that it needs to run with a completely fresh cache, which #35831 provides the primitives for. Otherwise, while you could still get the annotations out, that would only work the first time something is inferred anywhere in the system. With a fresh cache, everything is analyzed again, and any messages like these that are opted in to can be collected.
This is the next step in the line of work started by #33955, though a lot of enabling work towards this was previously done by Jameson in his codegen-norecursion branch. The basic thrust here is to allow external packages to manage their own cache of compiled code that may have been generated using entirely different inference or compiler options. The GPU compilers are one such example, but there are several others, including generating code using offload compilers, such as XLA or compilers for secure computation. A lot of this is just moving code around to make it clear exactly which parts of the code are accessing the internal code cache (which is now its own type to make it obvious when it's being accessed), as well as providing clear extension points for custom cache implementations. The second part is to refactor CodeInstance construction to separate construction and insertion into the internal cache (so it can be inserted into an external cache instead if desired). The last part of the change is to give cgparams another hook that lets the caller replace the cache lookup to be used by codegen.
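To make the extension point concrete, here is a minimal sketch of what a custom cache along these lines could look like. The names (`ExternalCodeCache`, the symbol stand-ins for method and code instances) are hypothetical illustrations, not the PR's actual types; the point is the shape of the interface, with `getindex`/`setindex!` as the explicit lookup and insertion verbs:

```julia
# Hypothetical external code cache keyed by method instance. Real caches would
# map MethodInstance => CodeInstance; symbols stand in for both here.
struct ExternalCodeCache
    dict::IdDict{Any,Any}
end
ExternalCodeCache() = ExternalCodeCache(IdDict{Any,Any}())

# Lookup reads like ordinary collection access, so every cache access is
# explicit at the call site.
Base.getindex(cache::ExternalCodeCache, mi) = cache.dict[mi]
Base.haskey(cache::ExternalCodeCache, mi) = haskey(cache.dict, mi)
Base.get(cache::ExternalCodeCache, mi, default) = get(cache.dict, mi, default)

# Insertion is a separate step from construction, mirroring the PR's split
# between building a CodeInstance and deciding which cache it lands in.
Base.setindex!(cache::ExternalCodeCache, ci, mi) = (cache.dict[mi] = ci; cache)

cache = ExternalCodeCache()
cache[:mi_for_foo] = :compiled_foo
@assert cache[:mi_for_foo] === :compiled_foo
```

A GPU compiler could keep one such cache per set of compiler options, leaving the native cache untouched.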
```c
codeinst = (jl_code_instance_t*)ci;
*src_out = (jl_code_info_t*)codeinst->inferred;
jl_method_t *def = codeinst->def->def.method;
if ((jl_value_t*)*src_out == jl_nothing)
```
What about `rettype_const` methods, those don't have a `ci.inferred`?
> What about `rettype_const` methods, those don't have a `ci.inferred`?

Same as before, they won't get an LLVM function emitted for them.
> Also, what happens if the JIT deletes an inferred code instance,

Fair point, that should probably be turned off.
> Same as before, they won't get an LLVM function emitted for them.

We currently do, though. If you call `jl_create_native` for `kernel() = return`, you get the following un-optimized IR:
```llvm
define internal void @julia_kernel_39() !dbg !5 {
top:
  %0 = call %jl_value_t*** @julia.ptls_states()
  %1 = bitcast %jl_value_t*** %0 to %jl_value_t addrspace(10)**
  %2 = getelementptr inbounds %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %1, i64 4
  %3 = bitcast %jl_value_t addrspace(10)** %2 to i64**
  %4 = load i64*, i64** %3, !tbaa !7, !invariant.load !4
  ret void, !dbg !11
}
```
It's only when calling this function from another that the return type gets folded.
Doing the same with the proposed PR and a `lookup` cgparam yields "Refusing to automatically run type inference with custom cache lookup".
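For anyone following along, the unoptimized IR above can be reproduced from the REPL without calling `jl_create_native` directly (hedged: the exact IR varies by Julia version and platform):

```julia
using InteractiveUtils  # stdlib; provides code_llvm

kernel() = return

# With optimize=false we see the raw body codegen emits for the kernel itself,
# before the constant-return folding that only happens at call sites inside
# another function.
code_llvm(stdout, kernel, Tuple{}; optimize=false)
```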
Ah, that just seems like a bug then. We should just allow that case.
Ok, I see what's happening here. In the regular case it actually does call type inference then, which will then give it a CodeInfo, even though we never cache those because they're not actually supposed to be used. I would say for now, let's just filter these in the AbstractInterpreter implementation that calls into here, and maybe in the future we can have it synthesize an appropriate body without forcing it to re-do type inference.
Co-authored-by: Tim Besard <[email protected]>
base/compiler/typeinfer.jl
Outdated
```julia
rettype_const = (result.src::Const).val
const_flags = 0x3
function CodeInstance(result::InferenceResult, min_valid::UInt, max_valid::UInt,
                      may_compress=true, always_cache_tree=false)
```
"always_cache" sounds like a bit of a misnomer, since there's always going to be intermediate work that we discarded because it'd be invalid to call `cache_result` on it.
`allow_discard_tree=true`?
I'll just rename this to `allow_discard_tree` and then get this merged so I can put up some of the follow-on work to this.
```diff
@@ -470,12 +488,12 @@ function typeinf_edge(interp::AbstractInterpreter, method::Method, @nospecialize
     end
     if frame === false
         # completely new
-        mi.inInference = true
+        lock_mi_inference(interp, mi)
```
I don't know that this is a lock in the proper sense. It's a bootstrapping device, but it's also merely an advisory hint.
It's not a lock in the atomic sense, but I do think it's a semantic lock.
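To illustrate the "semantic lock" reading, here is a sketch of the hook shape under stated assumptions: the type and function names mirror the PR, but `FakeMethodInstance` and the interpreter structs are stand-ins, not the real compiler types. The native interpreter sets the advisory `inInference` flag so concurrent inference of the same method instance can back off, while an interpreter with a private cache can make the hook a no-op, since it never races on the shared cache:

```julia
abstract type AbstractInterp end
struct NativeInterp <: AbstractInterp end    # uses the shared internal cache
struct ExternalInterp <: AbstractInterp end  # owns a private cache

# Stand-in for a MethodInstance; only the advisory flag matters here.
mutable struct FakeMethodInstance
    inInference::Bool
end

# The default behavior: set/clear the advisory flag.
lock_mi_inference(::NativeInterp, mi::FakeMethodInstance) = (mi.inInference = true; nothing)
unlock_mi_inference(::NativeInterp, mi::FakeMethodInstance) = (mi.inInference = false; nothing)

# External interpreters opt out of the advisory flag entirely.
lock_mi_inference(::ExternalInterp, mi::FakeMethodInstance) = nothing
unlock_mi_inference(::ExternalInterp, mi::FakeMethodInstance) = nothing

mi = FakeMethodInstance(false)
lock_mi_inference(NativeInterp(), mi)
@assert mi.inInference
```

Nothing enforces mutual exclusion, which is why "advisory hint" is also a fair description; the hook just makes the policy per-interpreter.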
```c
// The root propagation here doesn't have to be literal, but callers should
// ensure that the return value outlives the MethodInstance
typedef jl_value_t *(*jl_codeinstance_lookup_t)(jl_method_instance_t *mi JL_PROPAGATES_ROOT,
                                                size_t min_world, size_t max_world);
```
I'd be interested in knowing if we can break this assumption (remove the `JL_PROPAGATES_ROOT` attribute). It is required that the result be rooted (because it could hold a couple of the gc roots for the generated code result) as long as it could appear on the call-stack, but perhaps that's okay?

> outlives the MethodInstance

These are typically required to be permanently allocated (so that we can track their edges), so this would imply they cannot be freed?
> These are typically required to be permanently allocated

Yes, I just wanted to annotate this from a lifetime perspective here, since we don't want to require that the `mi` literally roots the result value here.
Co-authored-by: Jameson Nash <[email protected]>
As of #35831, we've had the ability to cache CodeInstances in an external cache and to replace that cache during jl_generate_native (e.g. for GPU compilation). This extends the same capability to CodeInstances as they are added to the execution engine.
The primary motivation here is to clean up the notion of "looking up a method in the method table" into a single object that can be passed around. At the moment, it's all a bit murky, with fairly large state objects being passed around everywhere and implicit accesses to the global environment. In my AD use case, I need to be a bit careful to make sure the various inference and optimization steps are looking things up in the correct tables/caches, so being very explicit about where things need to be looked up is quite helpful. In particular, I would like to clean up the optimizer to not require the big `OptimizationState`, which is currently a bit of a mix of things that go into IRCode and information needed for method lookup/edge tracking. That isn't part of this PR, but will build on top of it.

More generally, with a bunch of the recent compiler work, I've been trying to define more crisp boundaries between the various components of the system, giving them clearer interfaces, and at least a little bit of documentation. The compiler is a very powerful bit of technology, but I think people have been avoiding it because the code looks a bit scary. I'm hoping some of these cleanups will make it easier for people to understand what's going on.

Here in particular, I'm using `findall(sig, table)` as the verb for method lookup. The idea being that people familiar with the `findall(predicate, collection)` idiom from regular julia will have a good intuitive understanding of what's happening (a collection is searched for a predicate, an array of matches is returned, etc.). Of course, it's not a perfect fit, but I think these kinds of mental aids can be helpful in making it easier for people to read compiler code (similar to how #35831 used `getindex` as the verb for cache lookup). While I was at it, I also cleaned up the use of out-parameters, which leaked through too much of the underlying C API, and replaced them with a proper struct of results.
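The `findall(sig, table)` reading can be illustrated with a toy table. Note this is an analogy under stated assumptions, not the compiler's actual types (`ToyMethodTable` and the subtype-based match rule are simplifications made for illustration):

```julia
# A toy method table: each entry pairs a signature type with a method name.
struct ToyMethodTable
    entries::Vector{Pair{Type,Symbol}}
end

# An entry "matches" when the queried signature is a subtype of the entry's
# signature; real method matching is considerably more involved.
Base.findall(@nospecialize(sig::Type), table::ToyMethodTable) =
    [name for (s, name) in table.entries if sig <: s]

table = ToyMethodTable([Tuple{Integer} => :f_integer, Tuple{Number} => :f_number])
matches = findall(Tuple{Int}, table)
@assert matches == [:f_integer, :f_number]  # Tuple{Int} is a subtype of both
```

The call site reads exactly like the familiar `findall(predicate, collection)`: a collection is searched, and an array of matches comes back.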
cc @maleadt @jpsamaroo