Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUBLASLt wrapper for cublasLtMatmulDescSetAttribute can have device buffers as input #2337

Closed
avik-pal opened this issue Apr 23, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@avik-pal
Copy link

avik-pal commented Apr 23, 2024

Describe the bug

Pointer input to cublasLtMatmulDescSetAttribute depends on the exact Attribute being set.

To reproduce

The Minimal Working Example (MWE) for this bug:

function cublaslt_matmul(x::CuArray{T, 2}, w::CuArray{T, 2}, b::CuVector{T}) where {T}
    y = similar(x, T, size(w, 1), size(x, 2))

    operationDesc = Ref{CUBLAS.cublasLtMatmulDesc_t}()
    CUBLAS.cublasLtMatmulDescCreate(operationDesc, CUBLAS.CUBLAS_COMPUTE_32F, CUDA.R_32F)

    opNoTranspose = CUBLAS.CUBLAS_OP_N

    epilogueBias = CUBLAS.CUBLASLT_EPILOGUE_BIAS

    CUBLAS.cublasLtMatmulDescSetAttribute(
        operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_TRANSA,
        Ref{CUBLAS.cublasOperation_t}(opNoTranspose), sizeof(opNoTranspose))
    CUBLAS.cublasLtMatmulDescSetAttribute(
        operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_TRANSB,
        Ref{CUBLAS.cublasOperation_t}(opNoTranspose), sizeof(opNoTranspose))

    CUBLAS.cublasLtMatmulDescSetAttribute(
        operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_EPILOGUE,
        Ref{CUBLAS.cublasLtEpilogue_t}(epilogueBias), sizeof(epilogueBias))
    CUBLAS.cublasLtMatmulDescSetAttribute(
        operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_BIAS_POINTER,
        b, sizeof(b))

    Adesc = Ref{CUBLAS.cublasLtMatrixLayout_t}()
    Bdesc = Ref{CUBLAS.cublasLtMatrixLayout_t}()
    Cdesc = Ref{CUBLAS.cublasLtMatrixLayout_t}()

    CUBLAS.cublasLtMatrixLayoutCreate(
        Adesc, CUDA.R_32F, size(w, 1), size(w, 2), stride(w, 2))
    CUBLAS.cublasLtMatrixLayoutCreate(
        Bdesc, CUDA.R_32F, size(x, 1), size(x, 2), stride(x, 2))
    CUBLAS.cublasLtMatrixLayoutCreate(
        Cdesc, CUDA.R_32F, size(y, 1), size(y, 2), stride(y, 2))

    preference = Ref{CUBLAS.cublasLtMatmulPreference_t}()
    CUBLAS.cublasLtMatmulPreferenceCreate(preference)

    lthandle = Ref{CUBLAS.cublasLtHandle_t}()
    CUBLAS.cublasLtCreate(lthandle)

    heuristic = Ref{CUBLAS.cublasLtMatmulHeuristicResult_t}()
    returnedResults = Ref{Cint}(0)
    CUBLAS.cublasLtMatmulAlgoGetHeuristic(
        lthandle[], operationDesc[], Adesc[], Bdesc[], Cdesc[],
        Cdesc[], preference[], 1, heuristic, returnedResults)

    if returnedResults[] == 0
        error("No cuBLASLt algorithm")
        return
    end

    CUBLAS.cublasLtMatmul(lthandle[], operationDesc[], Ref{Cfloat}(1.0), w, Adesc[],
        x, Bdesc[], Ref{Cfloat}(0.0), y, Cdesc[], y, Cdesc[],
        Ref(heuristic[].algo), CU_NULL, 0, CUDA.stream())

    return y
end

This is a very hardcoded example only valid for T = Float32. Removing the part for BIAS specific part, the matrix multiply code does work correctly.

Additional context

As discussed on slack the datatype for the buffer should probably be PtrOrCuPtr. However, I tried manually patching it as

# There might be a tiny bug in the call in CUDA using this to verify
function cublasLtMatmulDescSetAttribute22(matmulDesc, attr, buf, sizeInBytes)
    CUBLAS.initialize_context()
    xx = CUBLAS.@gcsafe_ccall CUBLAS.libcublas.cublasLtMatmulDescSetAttribute(
        matmulDesc::CUBLAS.cublasLtMatmulDesc_t,
        attr::CUBLAS.cublasLtMatmulDescAttributes_t,
        buf::PtrOrCuPtr{Cvoid}, sizeInBytes::Csize_t)::CUBLAS.cublasStatus_t
    @show xx
end
    CUBLAS.cublasLtMatmulDescSetAttribute(
        operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_EPILOGUE,
        Ref{CUBLAS.cublasLtEpilogue_t}(epilogueBias), sizeof(epilogueBias))
    cublasLtMatmulDescSetAttribute22(
        operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_BIAS_POINTER,
        b, sizeof(b))

But this still gives an invalid usage status code.

@maleadt
Copy link
Member

maleadt commented Apr 24, 2024

Fixed in b58c0f8

But this still gives an invalid usage status code.

You have to pass the size of a pointer, not the size of the array:

ptr_ref = Ref{CuPtr{Cvoid}}(pointer(b))
CUBLAS.cublasLtMatmulDescSetAttribute(
    operationDesc[], CUBLAS.CUBLASLT_MATMUL_DESC_BIAS_POINTER,
    ptr_ref, sizeof(ptr_ref))

@maleadt maleadt closed this as completed Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants