-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use of array wrappers & unions regress load time #453
Labels
Comments
Actually, let's leave this open as I'd want to go back to the wrapper-based approach one day. |
Has anyone checked for invalidation? That's the first step. |
I didn't, I'm not sure those tools existed or I was aware of them back then. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Previously we were using a
CuArray
type that could represent a view, reshape, reinterpret, etc. For the sake of simplicity, I switched to a simplerCuArray
type while reusingBase.SubArray
,Base.ReshapeArray
, etc. That requires use of type unions to, e.g., represent all dense or stridedCuArray
s:CUDA.jl/src/array.jl
Lines 146 to 164 in 75f7d30
These definitions are almost identical to how Base defines
StridedArray
. However, using them significantly regresses load time. For example, #450 adds them to a bunch ofLinearAlgebra.mul!
methods which badly affects time ofusing CUDA
: +25%, https://speed.juliagpu.org/timeline/#/?exe=4&ben=latency/import&env=1&revs=50&base=3+96&equid=off&quarts=on&extr=onIn a similar vein, Adapt.jl defines a union that captures all array instances that can be used on the GPU (i.e. not necessarily dense or strided, but an
Adjoint
orPermuteDimsArray
): https://github.com/JuliaGPU/Adapt.jl/blob/11d96a531cb70359e88ed2ad0d0a13a85727a204/src/wrappers.jl#L73-L92Using these unions makes load time go crazy, e.g. with
mul!(::CuArray, ::AnyCuArray...)
(whereAnyCuArray
uses theAdapt.WrappedArray
union) it goes from 5 to 25s.I can understand how the large union from Adapt.jl is needlessly taxing on inference, and I guess we may need something like an
AbstractWrappedArray
here (JuliaLang/julia#31563). However, withStridedCuArray
I had not expected these regressions, as Base uses similar patterns. Am I doing anything especially bad here? I'd like to start usingStridedCuArray
much more, in order to cover APIs that takestride
inputs (which there are quite some).The text was updated successfully, but these errors were encountered: