Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Round-trip reintrepretation of all bits types #47116

Merged
merged 25 commits into from
Jun 29, 2023

Conversation

BioTurboNick
Copy link
Contributor

Successor to #32660 and rebasing of #45723 onto master.

This PR allows any bits type to be safely reinterpreted as any other bits type containing the same number of valid bytes, regardless of how either type is padded.

Example:

using BenchmarkTools

struct Foo
    x::Int32
    y::Int64
    z::NTuple{2, UInt8}
    a::Float32
end
# layout: xxxx____xxxxxxxxxx__xxxx

struct Foo1
    x::Int64
    y::Int32
    z::NTuple{2, Int8}
    a::Float32
end
# layout: xxxxxxxxxxxxxx__xxxx____

a = Foo(43, -56, (0x08, 0xf5), 5.6f0)
sizeof(a) # 24

b = reinterpret(Foo1, a) # Foo1(-240518168533, -1, (8, -11), 5.6f0)
sizeof(b) # 24
@btime reinterpret(Foo1, $a) # 754.630 ns (0 allocations: 0 bytes)

c = reinterpret(Foo, b) # Foo(43, -56, (0x08, 0xf5), 5.6f0)
@btime reinterpret(Foo, $b) # 754.183 ns (0 allocations: 0 bytes)

a === c # true

base/reinterpretarray.jl Outdated Show resolved Hide resolved
@Seelengrab
Copy link
Contributor

Seelengrab commented Oct 12, 2022

As mentioned in the original PR, this at least needs docs to warn about the possibility of creating potentially illegal instances this way. I don't think this has been addressed in general yet, has it?

Since the behavior of "ignore padding and just fill "valid" bits" behavior is kind of unusual, that should probably be documented as well.

@BioTurboNick

This comment was marked as outdated.

@BioTurboNick
Copy link
Contributor Author

If people feel it's necessary to narrow what conversions are allowed though, that's valid.

I approached this from a "well, why not allow this?" perspective. But like, if we wanted to require same-size-primitives or same-type-primitives, and then special case UInts as universally convertible, I could see that allowing most reinterpretations that could be desired normally.

base/reinterpretarray.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
@Seelengrab
Copy link
Contributor

Also, the doc change mentioned above also isn't in the PR yet, as far as I can tell - important not to forget, I think :)

@BioTurboNick
Copy link
Contributor Author

BioTurboNick commented Jun 23, 2023

Just added docstring and I think the test suite passed (yes - only failures are general CI issues).

Two items to consider:

  • Change representation of padding to "padding start offset + padding length" instead of "next valid byte offset + previous padding length"; the latter might be more intuitive and wouldn't require a special case for end padding. This function probably isn't considered API and a cursory look didn't show usage of this function in JuliaHub?
  • Occurs to me it could be more efficient to make the padding function recursive and then just flatten the paddings while copying chunks, rather than making the copyto/from functions recursive. (Then again, this overhead may only be during compilation for each type?)

base/exports.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
base/reinterpretarray.jl Outdated Show resolved Hide resolved
@vtjnash
Copy link
Sponsor Member

vtjnash commented Jun 28, 2023

Those items to consider can be done as followup after merging this

@vtjnash vtjnash merged commit cf34aa2 into JuliaLang:master Jun 29, 2023
1 check passed
@Liozou
Copy link
Member

Liozou commented Jun 29, 2023

When building julia, there is now the following:

WARNING: Method definition reinterpret(Type{Out}, In) where {Out, In} in module Base at essentials.jl:560 overwritten at reinterpretarray.jl:823.

See for instance https://buildkite.com/julialang/julia-master/builds/25396#01890455-aaaf-4e11-a538-619a97d6db6c/1343-2975 in CI.

@BioTurboNick
Copy link
Contributor Author

Hmm...

So the bitcast of primitive types needs to be available via reintrepret through essentials.jl.

julia/base/essentials.jl

Lines 545 to 560 in e4600c5

"""
reinterpret(type, x)
Change the type-interpretation of the binary data in the primitive value `x`
to that of the primitive type `type`.
The size of `type` has to be the same as that of the type of `x`.
For example, `reinterpret(Float32, UInt32(7))` interprets the 4 bytes corresponding to `UInt32(7)` as a
[`Float32`](@ref).
# Examples
```jldoctest
julia> reinterpret(Float32, UInt32(7))
1.0f-44
```
"""
reinterpret(::Type{T}, x) where {T} = bitcast(T, x)

Maybe it could be replaced with:

@inline function reinterpret(::Type{Out}, x::In) where {Out, In}
    isbitstype(Out) || throw(ArgumentError("Target type for `reinterpret` must be isbits"))
    isbitstype(In) || throw(ArgumentError("Source type for `reinterpret` must be isbits"))
    if isprimitivetype(Out) && isprimitivetype(In)
        outsize = sizeof(Out)
        insize = sizeof(In)
        outsize == insize ||
            throw(ArgumentError("Sizes of types $Out and $In do not match; got $outsize \
                and $insize, respectively."))
        return bitcast(Out, x)
    end
    return _reinterpret(Out, x)
end

And the reinterpret function in this PR renamed to _reinterpret? And the docstrings merged.

@Seelengrab
Copy link
Contributor

bitcast already checks the sizes but the error message is not as nice as the one in this PR. If possible, I think changing its error message and using

@inline function reinterpret(::Type{Out}, x::In) where {Out, In}
    isbitstype(Out) || throw(ArgumentError("Target type for `reinterpret` must be isbits"))
    isbitstype(In) || throw(ArgumentError("Source type for `reinterpret` must be isbits"))
    if isprimitivetype(Out) && isprimitivetype(In)
        return bitcast(Out, x)
    end
    return _reinterpret(Out, x)
end

would be better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:types and dispatch Types, subtyping and method dispatch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants