Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inference: continue const-prop' when concrete-eval returns non-inlineable #50618

Merged
merged 1 commit into from
Aug 5, 2023

Conversation

aviatesk
Copy link
Sponsor Member

This commit fixes the regression in the following example:

julia> stritr() = iterate(("1", '2'), 1);

julia> @time stritr();
  0.000001 seconds (2 allocations: 64 bytes) # on master
  0.000000 seconds                           # on 1.9

The problem is that currently we don't inline result of concrete-eval when its result is not inlineable, although const-prop' or semi-concrete eval may be able to optimize and inline the method body. To improve the situation, this commit simply allows abstract_call_method_with_const_args to continue to semi-concrete eval and const-prop' when concrete-eval returned non-inlineable result.

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@aviatesk aviatesk requested a review from Keno July 21, 2023 08:29
@KristofferC KristofferC added the backport 1.10 Change should be backported to the 1.10 release label Jul 21, 2023
@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

KristofferC added a commit that referenced this pull request Jul 24, 2023
Backported PRs:
- [x] #50411 <!-- Fix weird dispatch of * with zero arguments -->
- [x] #50202 <!-- Remove dynamic dispatch from _wait/wait2 -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50026 <!-- Store heapsnapshot files in tempdir() instead of
current directory -->
- [x] #50402 <!-- Add CPU feature helper function -->
- [x] #50387 <!-- update newpages pointer after actually sweeping pages
-->
- [x] #50424 <!-- avoid potential type-instability in _replace_(str,
...) -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some
cases -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50466 <!-- relax assertion involving pg->nold to reflect that it
may be a bit in… -->
- [x] #50490 <!-- Fix compat annotation for italic printstyled -->
- [x] #50488 <!-- fix typo in `Base.isassigned` with `Tridiagonal` -->
- [x] #50476 <!-- Profile: Add specifying dir for `take_heap_snapshot`
and handling if current dir is unwritable -->
- [x] #50461 <!-- fix typo in the --gcthreads argument description -->
- [x] #50528 <!-- ssair: Correctly handle stmt insertion at end of basic
block -->
- [x] #50533 <!-- ensure internal_obj_base_ptr checks whether objects
past freelist pointer are in freelist -->
- [x] #49322 <!-- improve cat design / performance -->
- [x] #50540 <!-- gc: remove over-eager assertion -->
- [x] #50542 <!-- gf: remove unnecessary assert cycle==depth -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to
vararg -->
- [x] #50058 <!-- Add unwrapping mechanism for triangular mul and solves
-->
- [x] #50551 <!-- typeintersect: also record chained `innervars` -->
- [x] #50552 <!-- read(io, Char): fix read with too many leading ones
-->
- [x] #50541 <!-- precompile: ensure globals are not accidentally
created where disallowed -->
- [x] #50576 <!-- use atomic compare exchange when setting the GC
mark-bit -->
- [x] #50578 <!-- gf: make method overwrite/delete an error during
precompile -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50597 <!-- Fix memory corruption if task is launched inside
finalizer -->
- [x] #50591 <!-- build: fix various makefile bugs -->
- [x] #50599 <!-- faster invalid object lookup in conservative gc -->
- [x] #50634 <!-- 🤖 [master] Bump the SparseArrays stdlib from b4b0e72
to 99c99b4 -->
- [x] #50639 <!-- Backport LLVM patches to fix various issues. -->
- [x] #50546 <!-- Revert storage of method instance in LineInfoNode -->
- [x] #50631 <!-- Shift DCE pass to optimize imaging mode code better
-->
- [x] #50525 <!-- only check that values are finite in `generic_lufact`
when `check=true` -->
- [x] #50587 <!-- isassigned for ranges with BigInt indices -->
- [x] #50144 <!-- Page based heap size heuristics -->


Need manual backport:
- [ ] #50595 <!-- Rename ENV variable `JULIA_USE_NEW_PARSER` ->
`JULIA_USE_FLISP_PARSER` -->



Non-merged PRs with backport label:
- [ ] #50637 <!-- Remove SparseArrays legacy code -->
- [ ] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [ ] #50598 <!-- only limit types in stack traces in the REPL -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [ ] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [ ] #50172 <!-- print feature flags used for matching pkgimage -->
@oscardssmith
Copy link
Member

why can't we just widenconst the result?

@aviatesk
Copy link
Sponsor Member Author

Concrete evaluation only returns a result value, not optimized body of the method. When we can't inline the concrete-eval result, it is more efficient if we can inline optimized method body.

@aviatesk aviatesk force-pushed the avi/continue-const-prop branch 2 times, most recently from 2646e3b to 6441803 Compare July 31, 2023 16:25
@vtjnash vtjnash added the status:merge me PR is reviewed. Merge when all tests are passing label Jul 31, 2023
@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 31, 2023

(note that the commit message may need some corrections when merging, since it is attempting to ping several users, which is bad for a commit message)

@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Aug 1, 2023

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@aviatesk aviatesk added status:DO NOT MERGE Do not merge this PR! and removed status:merge me PR is reviewed. Merge when all tests are passing labels Aug 2, 2023
@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Aug 2, 2023

This turns out to be still WIP 'cause I found the following kind of cases are not well optimized by this PR:

julia> Base.@assume_effects :foldable function continue_const_prop(i, j)
           chars = map(Char, i:j)
           String(chars)
       end
continue_const_prop (generic function with 1 method)

julia> code_typed() do
           length(continue_const_prop(1, 5))
       end
1-element Vector{Any}:
 CodeInfo(
1%1 = invoke Base._collect($(QuoteNode(1:5))::UnitRange{Int64}, $(QuoteNode(Base.Generator{UnitRange{Int64}, Type{Char}}(Char, 1:5)))::Base.Generator{UnitRange{Int64}, Type{Char}}, $(QuoteNode(Base.EltypeUnknown()))::Base.EltypeUnknown, $(QuoteNode(Base.HasShape{1}()))::Base.HasShape{1})::Vector{Char}
│        invoke Main.String(%1::Vector{Char})::String
└──      return 5
) => Int64

This is because now continue_const_prop is inlined and then the inlined body isn't DCE-eligible. Previously this example is fully optimized as all concrete-evaluated calls are folded into constant or noinlined.

@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Aug 2, 2023

This PR requires #50764, #50765 and #50767 to be completed.

@aviatesk aviatesk force-pushed the avi/continue-const-prop branch 2 times, most recently from 52bf956 to 7e341f0 Compare August 3, 2023 13:13
@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Aug 3, 2023

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@vtjnash
Copy link
Sponsor Member

vtjnash commented Aug 3, 2023

SGTM

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

…able

This commit fixes the regression in the following example:
```julia
julia> stritr() = iterate(("1", '2'), 1);

julia> @time stritr();
  0.000001 seconds (2 allocations: 64 bytes) # on master
  0.000000 seconds                           # on 1.9
```

The problem is that currently we don't inline result of concrete-eval
when its result is not inlineable, although const-prop' or semi-concrete
eval may be able to optimize and inline the method body.
To improve the situation, this commit simply allows
`abstract_call_method_with_const_args` to continue to const-prop' when
concrete-eval returned non-inlineable result.
@aviatesk aviatesk removed the status:DO NOT MERGE Do not merge this PR! label Aug 4, 2023
@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Aug 4, 2023

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk aviatesk merged commit 117ef2e into master Aug 5, 2023
7 of 8 checks passed
@aviatesk aviatesk deleted the avi/continue-const-prop branch August 5, 2023 03:15
KristofferC added a commit that referenced this pull request Aug 16, 2023
Backported PRs:
- [x] #50637 <!-- Remove SparseArrays legacy code -->
- [x] #50665 <!-- print `@time` msg into print buffer -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #50670 <!-- Make reinterpret specialize fully. -->
- [x] #50666 <!-- include `--pkgimage=no` caches for stdlibs -->
- [x] #50765 
- [x] #50764
- [x] #50768
- [x] #50767
- [x] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [x] #50689 <!-- Attach `tanpi` docstring to method -->
- [x] #50671 <!-- Fix rdiv of complex lhs by real factorizations -->
- [x] #50598 <!-- only limit types in stack traces in the REPL -->
- [x] #50766 <!-- Don't partition alwaysinline functions -->
- [x] #50771 <!-- re-allow non-string values in ENV `get!` -->
- [x] #50682 <!-- Add fallback if we have make a weird GC decision. -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50172 <!-- print feature flags used for matching pkgimage -->
- [x] #50844 <!-- Bump OpenBLAS binaries to use the new GEMM
multithreading threshold -->
- [x] #50826 <!-- Update dependency builds -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50655 <!-- fix hashing regression. -->
- [x] #50779 <!-- Minor refactor to image generation -->
- [x] #50791 <!-- Make symbols internal in jl_create_native, and only
externalize them when partitioning -->
- [x] #50724 <!-- Merge opaque closure modules with the rest of the
workqueue -->
- [x] #50738 <!-- Add alignment to constant globals -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:

Contains multiple commits, manual intervention needed:

Non-merged PRs with backport label:
- [ ] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50809 <!-- Limit type-printing in MethodError -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
@KristofferC KristofferC removed the backport 1.10 Change should be backported to the 1.10 release label Aug 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants