Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broken Base.unsafe_convert method returns garbage pointers #53

Closed
vtjnash opened this issue Oct 21, 2016 · 9 comments
Closed

broken Base.unsafe_convert method returns garbage pointers #53

vtjnash opened this issue Oct 21, 2016 · 9 comments

Comments

@vtjnash
Copy link

vtjnash commented Oct 21, 2016

data_pointer_from_objref returns an invalid pointer when called on SMatrix. This function should thrown an error in this case (the case where the object can be stack allocated), but that error checking hasn't been implemented yet. That means this function is invalid:

@inline function Base.unsafe_convert{N,M,T}(::Type{Ptr{T}}, m::SMatrix{N,M,T})
    Base.unsafe_convert(Ptr{T}, Base.data_pointer_from_objref(m))
end

The correct way to implement this is to use Ref:

Base.cconvert{T}(::Type{Ptr{T}}, m::SMatrix) = Ref(m)
Base.unsafe_convert{N,M,T}(::Type{Ptr{T}}, m::Ref{SMatrix{N,M,T}}) =
    Ptr{T}(Base.unsafe_convert(Ptr{SMatrix{N,M,T}}, m))
@andyferris
Copy link
Member

andyferris commented Oct 24, 2016

Thanks Jameson for the input.

I must admit I find it slightly difficult to navigate what is possible and what is not (my approach has been to play at the REPL and see what works and what doesn't).

The purpose of this function was to make it easy to pass a pointer to a stack allocated SMatrix to a BLAS or LAPACK function (though currently this is mostly used for MMatrix). I think that I came to understand recently that I can pass a pointer to a stack-allocated immutable to a C function using the special & syntax in ccall (assuming the C function doesn't mutate the data). Is that correct? Is there some clever way to use cconvert and unsafe_convert to avoid allocating the Ref in the current implementation of Julia (where the C function expects Ptr{Float64} or similar)?

(remember, allocating and garbage collecting the Ref can be the most expensive step for small matrices)

To give a better idea of what I'm attempting, take the command

julia> eig(@SMatrix eye(2,2))
([1.0,1.0],
[-1.0 -0.0; -0.0 1.0])

Here the SMatrix makes it way through to Base.LAPACK.geevx!() with a ccall on this line. Currently it gets its way into LAPACK correctly and the output is returned as MMatrix and MVector (because of similar). Furthermore, it also currently returns the correct output...

@vtjnash
Copy link
Author

vtjnash commented Oct 24, 2016

I think that I came to understand recently that I can pass a pointer to a stack-allocated immutable to a C function using the special & syntax in ccall (assuming the C function doesn't mutate the data). Is that correct?

Yes (assumption not needed)

Is there some clever way to use cconvert and unsafe_convert to avoid allocating the Ref in the current implementation of Julia (where the C function expects Ptr{Float64} or similar)?

No.

(remember, allocating and garbage collecting the Ref can be the most expensive step for small matrices)

They both allocate. The only different between using Ref and using pointer_from_objref, is that the latter garbage collects the object before the ccall while the former defers garbage collection until after the ccall. Usually that difference isn't enough time for the object to get overwritten, so the ccall will appear to work.

@c42f
Copy link
Member

c42f commented Oct 25, 2016

@andyferris I suspect the main problem with a lot of these subtle issues is that you're still thinking about immutables as stack allocated C types, with similar lifetime rules. But this is quite far from the real language semantics. Julia semantics says nothing about this (as far as I know), simply that an immutable is an object whose fields may not be rebound; pretty much everything else is implementation details.

For example:

  • Where is an immutable allocated? Anywhere the compiler sees fit. It may be allocated on the heap (currently when it contains pointers). It may be allocated on the stack. The allocation may be removed entirely and the object fields put in registers instead. In principle there could be two copies of it, or partial copies if parts are copied from the stack to registers.
  • What is the lifetime of an immutable; will it last past my ccall? This is pretty much undefined, as copies are allowed without changing the semantics of the program.
  • Can I access a modified immutable value after a ccall? No, the language knows that you "can't" mutate them, if you happen to be able to, it's just dirty tricks with the current implementation ;-)

I suspect the compiler should feel quite free to transform this

function foo()
    m = eye(SMatrix{2,2,Int})
    dirty_tricks!(m) # Attempt to modify m in place
    m + m
end

into this

function foo()
    m = eye(SMatrix{2,2,Int})
    m2 = m # Copies allowed, it's immutable after all.
    dirty_tricks!(m) # Woo, the implementation didn't stop me
    m2 + m2 # Uh oh
end

I could be wrong about some of the details above, but if you break the semantics of the language, anything can happen, and frequently does when you don't expect it :-) Eg, you wouldn't try to use a reference to a C++ class after it goes out of scope. It might "work" simply because the associated memory hasn't be reused for anything else yet (eg, the stack slots recycled for another variable newly in scope in the same function, or the heap memory reallocated). But it'll fail randomly after some refactoring.

@andyferris
Copy link
Member

They both allocate. The only different between using Ref and using pointer_from_objref, is that the latter garbage collects the object before the ccall while the former defers garbage collection until after the ccall. Usually that difference isn't enough time for the object to get overwritten, so the ccall will appear to work.

Right, thanks @vtjnash for clearing that up - I think I understand better now! Does the heap allocation occur because of the (Any,) input type parameter to the ccall here? I've never used an abstract type in a ccall before. (At first I wondered if maybe it returned an (expired) stack frame pointer from jl_value_ptr , but that didn't seem to make sense.)

@c42f I do agree with all of that... and the attempt here was never to mutate them - just to pass immutable inputs to Fortran functions at high speed. But things like you mention here

Where is an immutable allocated? Anywhere the compiler sees fit. It may be allocated on the heap (currently when it contains pointers). It may be allocated on the stack. The allocation may be removed entirely and the object fields put in registers instead. In principle there could be two copies of it, or partial copies if parts are copied from the stack to registers.

probably make it impossible to use this approach.

I think what I should do is just use the & in direct ccalls when necessary (and make a Ref wherever they get mutated), and give up on trying to allow Base.LAPACK and Base.BLAS functions to work. It's a bit of a shame, since I've been trying to support generic programming as much as possible, but oh well.

@andyferris
Copy link
Member

Does the heap allocation occur because of the (Any,) input type parameter to the ccall here?

Oh wait, nevermind, I guess they are boxed because of the ::ANY...

@c42f
Copy link
Member

c42f commented Oct 25, 2016

just use the & in direct ccall

Hmm, this is deprecated in favor of Ref ( http:https://docs.julialang.org/en/release-0.5/manual/calling-c-and-fortran-code/#special-reference-syntax-for-ccall-deprecated ), but in julia-0.5 the & seems to generate much nicer assembly, provided you don't want to read from the Ref afterward.

@andyferris
Copy link
Member

but in julia-0.5 the & seems to generate much nicer assembly,

Yes, I learnt & is non-allocating (it gives a direct stack pointer) from a conversation between Jeff and Jameson somewhere. I presume this was originally done to make a performant ccall, especially with regards to Fortran functions. I suppose that future inlining of mutables to the stack will make Ref generate similar assembly, and that is when I'm guessing that & will be removed entirely.

@c42f
Copy link
Member

c42f commented Oct 25, 2016

Right, there seems to be a fair bit of specialized code in ccall.cpp around julia_to_native and the addressOf variable.

@andyferris
Copy link
Member

Thank you very much Jameson for the contribution (I pushed this one to master).

oschulz pushed a commit to oschulz/StaticArrays.jl that referenced this issue Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants