Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize: revise inlining costs #51599

Merged
merged 2 commits into from
Oct 6, 2023
Merged

Conversation

vtjnash
Copy link
Sponsor Member

@vtjnash vtjnash commented Oct 5, 2023

Add a bonus for Intrinsics called with mostly constant arguments. We know that simple expressions like x*1 + 0 will get optimized later by LLVM, and also likely fold into other expressions, so try to reflect that in the cost estimated earlier. Additionally rebalance some of the other costs to more accurately reflect what they take in assembly.

Add a bonus for Intrinsics called with mostly constant arguments. We
know that simple expressions like `x*1 + 0` will get optimized later by
LLVM, and also likely fold into other expressions, so try to reflect
that in the cost estimated earlier. Additionally rebalance some of the
other costs to more accurately reflect what they take in assembly.
@vtjnash vtjnash added performance Must go faster compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) needs nanosoldier run This PR should have benchmarks run on it labels Oct 5, 2023
Copy link
Sponsor Member

@aviatesk aviatesk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

(f === Intrinsics.cglobal || f === Intrinsics.llvmcall) # these hold malformed IR, so argextype will crash on them
return cost
end
aty2 = widenconditional(argextype(ex.args[2], src, sptypes))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

widenconditional should be unnecessary at this phase.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure if some external package might try to request inlining costs before we called this on everything, and I wanted to still give a consistent answer then. I think it is no-cost to be here?

@aviatesk
Copy link
Sponsor Member

aviatesk commented Oct 6, 2023

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@vtjnash vtjnash removed the needs nanosoldier run This PR should have benchmarks run on it label Oct 6, 2023
@vtjnash vtjnash merged commit 0ab032a into master Oct 6, 2023
7 of 8 checks passed
@vtjnash vtjnash deleted the jn/inlining-cost-update-consts branch October 6, 2023 13:24
@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Oct 6, 2023

The benchmark change here is whether we inline the call or dispatch the signature to call

broadcast(::typeof(+), ::Int64, ::NTuple{10, Float64}, ::Int64, ::Vararg{Any})

The penalty for making that choice is higher than it should be, and poorly reflected in the way we compute the inlining cost heuristics, but that is not relevant to this PR.

@maleadt
Copy link
Member

maleadt commented Oct 26, 2023

This change broke GPUCompiler's always_inline mode. MWE:

@inline @generated function sink(i::T) where {T}
    llvmcall_str = """%slot = alloca i64
                     store volatile i64 %0, i64* %slot
                     %value = load volatile i64, i64* %slot
                     ret i64 %value"""
    return :(Base.llvmcall($llvmcall_str, T, Tuple{T}, i))
end

@eval f(x) = $(foldl((e, _) -> :($sink($e) + $sink(x)), 1:100; init=:x))
g() = f(10)

include("newinterp.jl")
@newinterp AlwaysInline
Core.Compiler.OptimizationParams(::AlwaysInline) =
    Core.Compiler.OptimizationParams(; inline_cost_threshold=typemax(Int))
Base.code_ircode(g, Tuple{}; interp=AlwaysInline())

@vtjnash Are we doing something fishy here, or why was this affected by this PR?

@aviatesk
Copy link
Sponsor Member

Currently OptimizationParams(; inline_cost_threshold=typemax(Int)) doesn't allow for "always inline", given we cap the inlining cost at typemax(UInt16) (, which is very confusing). We're gonna need #48257 to make this work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants