Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: syntax: add noinline/inline support overrides for structs #51495

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vtjnash
Copy link
Sponsor Member

@vtjnash vtjnash commented Sep 28, 2023

I have wanted to this sometimes, to be able to control whether a particularly large struct is stored by reference or value. This was somewhat of a quick job, so it feels like a bit of a hacky implementation, but maybe that is okay. Long term, I would probably also like @noinline to work if applied to a field (but that is hard) or when syntactically written in the struct instead of outside of it. But this seemed to at least get the feature working.

An example preview:

julia> @noinline struct NoinlineSingleton end

julia> @inline struct NormalSingleton end

julia> @inline struct Impossible; x::Impossible; end
ERROR: Cannot apply @inline to this struct which is self-referential or mutable
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

julia> sizeof(Tuple{NoinlineSingleton}) # observe this is allocated as a pointer to a singleton
8

julia> sizeof(Tuple{NormalSingleton}) # observe this is allocated inline as zero bytes
0

An example:

```julia
julia> @noinline struct NoinlineSingleton end

julia> @inline struct NormalSingleton end

julia> @inline struct Impossible; x::Impossible; end
ERROR: Cannot apply @inline to this struct which is self-referential
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

julia> sizeof(Tuple{NoinlineSingleton})
8

julia> sizeof(Tuple{NormalSingleton})
0
```
@vtjnash vtjnash added the compiler:lowering Syntax lowering (compiler front end, 2nd stage) label Sep 28, 2023
@MasonProtter
Copy link
Contributor

This looks nice to have! However, I think it'd typically be more useful to have something like this in the field position rather than a universal decorator for a given type -- that is I want to be able to write

struct Foo
    @inline x::Bar
end

to have Bar be inline in Foo. The reason I want that is that writing e.g.

struct Foo{T, U, V, W}
     x::Tuple{T, U, V, W}
end

is not always the same as having

struct Foo{T, U, V, W}
    a::T
    b::U
    c::V
    d::W
end

and I have no control over the definition of Tuple, so I can't mark it as @inline (even if it were a regular julia struct)

@KristofferC
Copy link
Sponsor Member

Can we use something different than @inline? Having the same macro mean completely different thing in different contexts feels confusing to me.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Oct 2, 2023

What else would you call it though? In lisp languages (like Julia), code is simply data, and both are decisions of whether to copy data inline into a parent object or not. Should we be more precise like @allocateinline (to match the pre-existing function name for reflection of this property, which is Base.allocatedinline())? It currently does potentially make the patch much harder though, since we need to adapt lowering, and not hack it on later like this though. I realize that reasoning probably shouldn't drive our API choices though.

@MasonProtter you are welcome to implement that, but I specifically mentioned that is harder and a long-term idea, but not one for this PR

@@ -8,6 +8,7 @@ New language features
difference between `public` and `export` is that `public` names do not become
available when `using` a package/module. ([#50105])
* `ScopedValue` implement dynamic scope with inheritance across tasks ([#50958]).
* A struct definition can be marked with `@inline` or `@noinline` to allow/prohibit it from being treated as inline-alloc eligable.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* A struct definition can be marked with `@inline` or `@noinline` to allow/prohibit it from being treated as inline-alloc eligable.
* A struct definition can be marked with `@inline` or `@noinline` to allow/prohibit it from being treated as inline-alloc eligible

@KristofferC
Copy link
Sponsor Member

Yes, I think something like @allocateinline would be better.

@topolarity
Copy link
Member

topolarity commented Jul 4, 2024

I'm interested in a feature like this to be able to do atomic operations on inner structs like:

struct Foo
    @atomic data::Bar
    meta::Baz
end

If Bar is a non-mutable struct, then it often ends up inline allocated in Foo which makes the atomic operations very slow.

I think in an "ideal world", the atomic field would be non-inline-allocated by default (preferring word size atomics), and then I could annotate the field with @inlinefield (or @allocateinline or w/e) to force it to be inlined if I really want that behavior.

However, I don't really want to decide that behavior globally like this PR does now. I think it doesn't make any more sense to declare Bar as globally inline-allocated than it does to declare Bar as globally atomic - the decision ought to be on the user not the definer of the datatype (although setting a default seems fine)

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Jul 5, 2024

which makes the atomic operations very slow.

Citation needed? It only has one extra atomic operation (on the same cache line) compared to the pointer case but elides an allocation, so it would be expected to be faster to inline in nearly all cases where it is possible

@topolarity
Copy link
Member

Citation needed? It only has one extra atomic operation (on the same cache line) compared to the pointer case but elides an allocation, so it would be expected to be faster to inline in nearly all cases where it is possible

Sure, this example is about ~113x slower when inline-allocated:

julia> struct Inner
           a::UInt
           b::UInt
           c::UInt
           d::UInt
           e::UInt
       end

julia> mutable struct Foo
           @atomic inner::Inner
           const y::UInt
       end

julia> function swapalot(o, a, b)
           for i = 1:1_000_000
               inner = @atomic :acquire o.inner
               if inner.a == 1
                   @atomic :release o.inner = a
               elseif inner.a == 0
                   @atomic :release o.inner = b
               end
           end
       end

julia> a = Inner(0,0,0,0,0); b = Inner(1,0,0,0,0); o = Foo(a, 0)

julia> using BenchmarkTools

julia> @btime swapalot(o, a, b)
  22.156 ms (0 allocations: 0 bytes)

Changing the code to mutable struct Inner speeds it up dramatically:

julia> @btime swapalot(o, a, b)
  196.052 μs (0 allocations: 0 bytes)

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Jul 5, 2024

Sure, if you are comparing 2 unrelated operations, that may be the case. But it you compare similar things, then the difference is much reduced:

julia> mutable struct Foo
           @atomic inner::Base.RefValue{Inner}
           const y::UInt
       end

julia> function swapalot(o, a, b)
           for i = 1:1_000_000
               inner = (@atomic :acquire o.inner)[]
               if inner.a == 1
                   @atomic :release o.inner = Ref(a)
               elseif inner.a == 0
                   @atomic :release o.inner = Ref(b)
               end
           end
       end

julia> a = Inner(0,0,0,0,0); b = Inner(1,0,0,0,0); o = Foo(Ref(a), 0);

julia> @btime swapalot(o, a, b)
  11.001 ms (1000000 allocations: 45.78 MiB)
# vs 33.200 ms (0 allocations: 0 bytes) on this machine

As far as I know, the remaining difference could be solved by using a better design for the lock itself, since right now it is a very generic jl_mutex_unlock_nogc and is thus may be somewhat slow for this particular exact use case

@topolarity topolarity mentioned this pull request Jul 13, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants