Disable @nospecialize for GPU codegen #63
Can you intercept the […]? JuliaDebug/ASTInterpreter2.jl#37 pulls tricks like that for the purpose of optimizing a runtime interpreter. Not sure if that's a useful model, however.
Interesting, I was instead waiting for something like the TPU hacks from […]. So no, we don't do anything like that right now, but go straight from the method instance to its LLVM IR: https://github.com/JuliaGPU/CUDAnative.jl/blob/9b524d5e1da8e59715af5e37f303cdffc450f17b/src/compiler/irgen.jl#L124-L135

Is it possible to take and rewrite […]?
We need the inferred codeinfo objects to be able to rewrite the invokes, right? https://github.com/vchuravy/Cthulhu.jl/blob/8e59d4d4d4278024adcd1348ac1baa5d8554e79c/src/Cthulhu.jl#L167-L180 is another way to directly interact with inference.
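To make that concrete, here is a minimal sketch (not from the issue itself) of grabbing the inferred `CodeInfo` for a call signature and listing the `:invoke` statements that inference has resolved. `foo` is just a placeholder, and whether any `:invoke`s survive depends on inlining decisions:

```julia
# Placeholder function; any call signature works here.
foo(x) = sort(x)

# code_typed returns (CodeInfo => return type) pairs for the matching methods.
ci, rettype = first(code_typed(foo, Tuple{Vector{Int}}))

# Scan the statement stream for :invoke expressions, whose first argument
# is the MethodInstance that inference resolved the call to.
for (i, stmt) in enumerate(ci.code)
    if stmt isa Expr && stmt.head === :invoke
        println("statement $i invokes ", stmt.args[1])
    end
end
```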
If you can't/shouldn't do it "in place," you can always copy the MethodInstance and then tweak whatever's in its […].
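In that spirit, a rough sketch of the copy-then-tweak idea at the `CodeInfo` level; the `MethodInstance` fields themselves are internal and version-dependent, so this is an assumption-laden illustration rather than the actual mechanism:

```julia
foo(x) = sort(x)  # placeholder, as above

ci, _ = first(code_typed(foo, Tuple{Vector{Int}}))
ci2 = copy(ci)  # work on a copy instead of mutating the cached object

# Example rewrite: demote each :invoke back to a plain :call by dropping
# the leading MethodInstance argument, leaving the callee and arguments.
for (i, stmt) in enumerate(ci2.code)
    if stmt isa Expr && stmt.head === :invoke
        ci2.code[i] = Expr(:call, stmt.args[2:end]...)
    end
end
```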
Now that we have a GPU runtime library that can allocate and box, I tried to get rid of this hack: https://github.com/JuliaGPU/CUDAnative.jl/blob/53368d48b6405ee962e54d4c3b9f90e3eb623310/src/compiler/irgen.jl#L229-L236
Turns out we still need it, as the argument to `throw` often is the value returned by the `BoundsError` constructor, which has a `@nospecialize`, resulting in a `jl_invoke`. LLVM obviously can't remove this function by itself, so we end up with GPU-incompatible code. Just try `cu([1]) .+ cu([2])` with `--check-bounds=yes` (and the above hack disabled, of course).

We could try and redefine `BoundsError` or `throw_boundserror` as it occurs for GPU code, e.g. along the lines of the sketch below.
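The code block from the original issue isn't preserved here; as a stand-in, a hedged sketch of what such a device-side override could look like, assuming CUDAnative's `CuDeviceArray` and `@cuprintf`, and trapping instead of boxing a `BoundsError`:

```julia
using CUDAnative

# Device-side override: dispatch on CuDeviceArray so the generic,
# @nospecialize'd throw_boundserror path in Base is never reached on the GPU.
@inline function Base.throw_boundserror(A::CuDeviceArray, I)
    @cuprintf("ERROR: out-of-bounds array access\n")
    # Abort the kernel without allocating/boxing a BoundsError.
    ccall("llvm.trap", llvmcall, Cvoid, ())
end
```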
But then we still miss cases thrown from the broadcasting code (and I couldn't find a way to dispatch on `throw_boundserror(Broadcasted{...CuDeviceArray...}`). It would be best to just get rid of `@nospecialize` for GPU code altogether.

Not sure whether that would need to happen at the inference/optimizer/codegen level, though. Monkey-patching `MethodInstance`s from within CUDAnative's `emit_function` hook didn't seem to work.

@Keno, did you do something similar for XLA, since you worked on more exhaustive inference there?
Are those interfaces public already?