Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow PrecompileTools to see MI's inferred by foreign abstract interpreters #54069

Merged
merged 1 commit into from
Apr 15, 2024

Conversation

vchuravy
Copy link
Sponsor Member

Partially reverts #49391

PrecompileTools uses the timing infrastructure to snoop on the inference process.
The reason for #49391 was that this could lead to accidental pollution of the caches
with foreign results (timholy/SnoopCompile.jl#338)

After #52233 and especially #53336 we now filter results by cache owner and
don't try to cache foreign code using the native pipeline.

Motivated by JuliaGPU/GPUCompiler.jl#567 which demonstrated
that a foreign code instance would not be cached without PrecompileTools.

@vchuravy vchuravy added domain:gpu Affects running Julia on a GPU backport 1.11 Change should be backported to release-1.11 labels Apr 13, 2024
@vchuravy
Copy link
Sponsor Member Author

I locally ported this to v1.11 and modified CUDA.jl:

diff --git a/Project.toml b/Project.toml
index db564395a..47217fe27 100644
--- a/Project.toml
+++ b/Project.toml
@@ -23,6 +23,7 @@ Libdl = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
 NVTX = "5da4648a-3479-48b8-97b9-01cb529c0a1f"
+PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
 Preferences = "21216c6a-2e73-6563-6e65-726566657250"
 PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
 Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
@@ -65,6 +66,7 @@ Libdl = "1"
 LinearAlgebra = "1"
 Logging = "1"
 NVTX = "0.3.2"
+PrecompileTools = "1.2.1"
 Preferences = "1"
 PrettyTables = "2"
 Printf = "1"
diff --git a/src/precompile.jl b/src/precompile.jl
index dc1b2ce2e..ddbe26dd2 100644
--- a/src/precompile.jl
+++ b/src/precompile.jl
@@ -14,3 +14,15 @@ precompile(run_and_collect, (Cmd,))
 precompile(cudaconvert, (Function,))
 precompile(Core.kwfunc(cudacall), (NamedTuple{(:threads, :blocks), Tuple{Int64, Int64}},typeof(cudacall),CuFunction,Type{Tuple{}}))
 precompile(Core.kwfunc(launch), (NamedTuple{(:threads, :blocks), Tuple{Int64, Int64}},typeof(launch),CuFunction))
+
+using PrecompileTools: @setup_workload, @compile_workload
+@setup_workload let
+    @compile_workload begin
+        target = PTXCompilerTarget(; cap=v"7.5")
+        params = CUDACompilerParams(; cap=v"7.5", ptx=v"7.5")
+        config = CompilerConfig(target, params)
+        mi = GPUCompiler.methodinstance(typeof(identity), Tuple{Nothing})
+        job = CompilerJob(mi, config)
+        GPUCompiler.code_native(devnull, job)
+    end
+end

This works correctly now.

@vchuravy vchuravy merged commit c0611e8 into master Apr 15, 2024
9 checks passed
@vchuravy vchuravy deleted the vc/typeinf branch April 15, 2024 17:22
@aviatesk
Copy link
Sponsor Member

Does that mean external absints using the integrated (internal) cache is now able to persist its codeinst caches across sessions?

@vchuravy
Copy link
Sponsor Member Author

vchuravy commented Apr 16, 2024

They were already able to do so (after #52233), but the integration with PrecompileTools didn't work.

E.g. this test https://github.com/JuliaGPU/GPUCompiler.jl/blob/9f90e9d8c7b1049932083847cffcd442699c81c0/test/native_tests.jl#L594 was not working yet.

KristofferC pushed a commit that referenced this pull request Apr 17, 2024
…reters (#54069)

Partially reverts #49391

PrecompileTools uses the timing infrastructure to snoop on the inference
process.
The reason for #49391 was that this could lead to accidental pollution
of the caches
with foreign results
(timholy/SnoopCompile.jl#338)

After #52233 and especially #53336 we now filter results by cache owner
and
don't try to cache foreign code using the native pipeline.

Motivated by JuliaGPU/GPUCompiler.jl#567 which
demonstrated
that a foreign code instance would not be cached without
PrecompileTools.

(cherry picked from commit c0611e8)
@KristofferC KristofferC mentioned this pull request Apr 17, 2024
59 tasks
KristofferC added a commit that referenced this pull request May 28, 2024
Backported PRs:
- [x] #53665 <!-- use afoldl instead of tail recursion for tuples -->
- [x] #53976 <!-- LinearAlgebra: LazyString in interpolated error
messages -->
- [x] #54005 <!-- make `view(::Memory, ::Colon)` produce a Vector -->
- [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` -->
- [x] #54069 <!-- Allow PrecompileTools to see MI's inferred by foreign
abstract interpreters -->
- [x] #53750 <!-- inference correctness: fields and globals can revert
to undef -->
- [x] #53984 <!-- Profile: fix heap snapshot is valid char check -->
- [x] #54102 <!-- Explicitly compute stride in unaliascopy for SubArray
-->
- [x] #54070 <!-- Fix integer overflow in `skip(s::IOBuffer,
typemax(Int64))` -->
- [x] #54013 <!-- Support case-changes to Annotated{String,Char}s -->
- [x] #53941 <!-- Fix writing of AnnotatedChars to AnnotatedIOBuffer -->
- [x] #54137 <!-- Fix typo in docs for `partialsortperm` -->
- [x] #54129 <!-- use correct size when creating output data from an
IOBuffer -->
- [x] #54153 <!-- Fixup IdSet docstring -->
- [x] #54143 <!-- Fix `make install` from tarballs -->
- [x] #54151 <!-- LinearAlgebra: Correct zero element in
`_generic_matvecmul!` for block adj/trans -->
- [x] #54213 <!-- Add `public` statement to `Base.GC` -->
- [x] #54222 <!-- Utilize correct tbaa when emitting stores of unions.
-->
- [x] #54233 <!-- set MAX_OS_WRITE on unix -->
- [x] #54255 <!-- fix `_checked_mul_dims` in the presence of 0s and
overflow. -->
- [x] #54259 <!-- Fix typo in `readuntil` -->
- [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large
array -->
- [x] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing
imaginary part on diagonal -->
- [x] #54248 <!-- ensure package callbacks are invoked when no valid
precompile file exists for an "auto loaded" stdlib -->
- [x] #54308 <!-- Implement eval-able AnnotatedString 2-arg show -->
- [x] #54302 <!-- Specialised substring equality for annotated strs -->
- [x] #54243 <!-- prevent `package_callbacks` to run multiple time for a
single package -->
- [x] #54350 <!-- add a precompile signature to Artifacts code that is
used by JLLs -->
- [x] #54331 <!-- correctly track freed bytes in
jl_genericmemory_to_string -->
- [x] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [x] #54335 <!-- When accessing the data pointer for an array, first
decay it to a Derived Pointer -->
- [x] #54239 <!-- Make sure `fieldcount` constant-folds for `Tuple{...}`
-->
- [x] #54288
- [x] #54067
- [x] #53715 <!-- Add read/write specialisation for IOContext{AnnIO} -->
- [x] #54289 <!-- Rework annotation ordering/optimisations -->
- [x] #53815 <!-- create phantom task for GC threads -->
- [x] #54130 <!-- inference: handle `LimitedAccuracy` in
`handle_global_assignment!` -->
- [x] #54428 <!-- Move ConsoleLogging.jl into Base -->
- [x] #54332 <!-- Revert "add unsetindex support to more copyto methods
(#51760)" -->
- [x] #53826 <!-- Make all command-line options documented in all
related files -->
- [x] #54465 <!-- typeintersect: conservative typevar subtitution during
`finish_unionall` -->
- [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path
of type instantiation -->
- [x] #54499 <!-- make `@doc x` work without REPL loaded -->
- [x] #54210 <!-- attach finalizer in `mmap` to the correct object -->
- [x] #54359 <!-- Pkg REPL: cache `pkg_mode` lookup -->

Non-merged PRs with backport label:
- [ ] #54471 <!-- Actually setup jit targets when compiling
packageimages instead of targeting only one -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #54323 <!-- inference: fix too conservative effects for recursive
cycles -->
- [ ] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [ ] #54191 <!-- make `AbstractPipe` public -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53882 <!-- Warn about cycles in extension precompilation -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC removed the backport 1.11 Change should be backported to release-1.11 label May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:gpu Affects running Julia on a GPU
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants