Improve codegen, accuracy of inlining cost for unknown intrinsics #44349

Merged: 3 commits, Oct 28, 2023

ianatol (Member) commented Feb 25, 2022

This PR assigns a more accurate inlining cost to calls of non-constant intrinsics, which were previously costed at 1000, and correspondingly improves their codegen a bit by using a runtime lookup table. This is my first time really touching our C-level codegen logic, so there may be some flaws there.

Credit for this idea and help along the way: @Keno
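
For anyone who wants to reproduce the kind of call this PR is about, here is a minimal sketch. The PR does not show the definition of `test_ukint`, so the one below is an assumption that is merely consistent with the call shown in the example, and `trunc_int` is picked only because it is an intrinsic that accepts a type and a value. The inlining-cost display appears to come from Cthulhu.jl's `@descend`; the sketch sticks to Base and uses `code_typed` instead.

```julia
# Assumed definition: the PR only shows the call `(f)(Main.Int32, 5)`.
test_ukint(f) = f(Int32, 5)

# Illustrative choice of intrinsic; any intrinsic taking (Type, value) would do.
f = Core.Intrinsics.trunc_int
@assert f isa Core.IntrinsicFunction

# All intrinsics share one type, so the argument type alone does not tell
# the compiler which intrinsic is being called.
@assert typeof(Core.Intrinsics.neg_int) === typeof(Core.Intrinsics.trunc_int)

# The call still works at runtime; it is just not resolved at compile time.
@assert test_ukint(f) === Int32(5)

# Inspect the typed code for the non-constant case.
println(first(code_typed(test_ukint, (Core.IntrinsicFunction,))))
```

Because `f` is only known to be a `Core.IntrinsicFunction`, the compiler has to emit a generic call for it, which is the situation whose cost and codegen this PR changes.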

An example:

Before:

test_ukint(f) in Main at REPL[3]:1
│ ─ %-1  = invoke test_ukint(::Core.IntrinsicFunction)::Any
2 11000 %1 = (f)(Main.Int32, 5)::Any                                     │
  └──    0      return %1                                                   │

After:

test_ukint(f) in Main at REPL[1]:1
│ ─ %-1  = invoke test_ukint(::Core.IntrinsicFunction)::Any
2 120 %1 = (f)(Main.Int32, 5)::Any                                       │
  └──  0      return %1                                                     │

@ViralBShah ViralBShah added the compiler:codegen Generation of LLVM IR and native code label Feb 27, 2022
@vtjnash vtjnash added the status:merge me PR is reviewed. Merge when all tests are passing label Oct 27, 2023
vtjnash (Sponsor Member) commented Oct 27, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

Not really a giant improvement, but it does cut the time for this call from 35 ns to 17 ns, with a similar improvement if it is declared as just Builtin:

using BenchmarkTools  # provides @bprofile
z::Core.IntrinsicFunction = Core.Intrinsics.neg_int
@bprofile z(1)
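
A rough, dependency-free way to time the same kind of call is sketched below. It is not the measurement used above: `call_z` and `time_calls` are hypothetical helper names, the typed-global syntax assumes Julia 1.8 or later, and a plain `@elapsed` loop includes loop and dispatch overhead, so the absolute numbers will not match the 35 ns and 17 ns figures.

```julia
# Typed global, as in the snippet above: the binding is known to hold some
# intrinsic, but not which one.
z::Core.IntrinsicFunction = Core.Intrinsics.neg_int

# Hypothetical wrapper so the call through `z` is compiled inside a function.
call_z(x) = z(x)
@assert call_z(1) === -1

# Hypothetical helper: average wall-clock time per call in nanoseconds,
# plus the accumulated result so the calls cannot be discarded.
function time_calls(n)
    s = 0
    t = @elapsed for _ in 1:n
        s += call_z(1)
    end
    return t / n * 1e9, s
end

time_calls(10)                          # warm up (compile) first
ns_per_call, total = time_calls(1_000_000)
println("approx. ns per call: ", ns_per_call, " (checksum ", total, ")")
```

Comparing the printed value on builds before and after the change should show the direction of the effect, though a dedicated benchmarking package gives more reliable numbers.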

nanosoldier (Collaborator) commented

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

vtjnash (Sponsor Member) commented Oct 27, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier (Collaborator) commented

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@IanButterworth IanButterworth merged commit 4975c02 into JuliaLang:master Oct 28, 2023
9 checks passed
@giordano giordano removed the status:merge me PR is reviewed. Merge when all tests are passing label Oct 30, 2023