Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize findall(f, ::AbstractArray{Bool}) #42202

Merged
merged 9 commits into from
Jun 1, 2022

Conversation

jakobnissen
Copy link
Contributor

@jakobnissen jakobnissen commented Sep 10, 2021

Fix #42187

The code is slightly more complicated than before, but faster than before the regression.
Also added more tests to cover some cases that was not covered before.

Benchmark (updated 2022-01-08):

using BenchmarkTools
A = rand(Bool, 2^16);

@btime findall(A);
@btime findall(!, A);
@btime findall(i -> i > 1, A);

1.7.0:
  267.358 μs (2 allocations: 255.86 KiB)
  59.157 μs (7 allocations: 268.67 KiB)
  7.508 μs (6 allocations: 12.44 KiB)

This PR:
  36.038 μs (2 allocations: 254.48 KiB)
  45.537 μs (5 allocations: 257.72 KiB)
  524.746 ns (2 allocations: 80 bytes)

@KristofferC
Copy link
Sponsor Member

Out of curiosity, do you know what caused the regression in the first place?

@jakobnissen
Copy link
Contributor Author

Appears to be #37177, which sped up findall by instantiating a vector of bool (or rather, a bitvector) firsts and working on that. Of course, that is needless when the input is already a vector of bool.

@jakobnissen
Copy link
Contributor Author

I don't understand the whitespace CI failure. There is no trailing whitespace. Running git rebase --whitespace=fix HEAD~1 doesn't seem to change anything (according to git diff findallbool origin/findallbool). In the CI failure there is a message: # We're changing some of the values so the let trick isn't applicable. Is there a problem with using let in Base?

base/array.jl Outdated Show resolved Hide resolved
base/array.jl Outdated Show resolved Hide resolved
@KristofferC
Copy link
Sponsor Member

I don't understand the whitespace CI failure.

You need to rebase on master to get the fix.

@jakobnissen jakobnissen force-pushed the findallbool branch 2 times, most recently from 8340cc9 to 2aa6627 Compare September 12, 2021 12:19
@jakobnissen
Copy link
Contributor Author

Okay, should be good now.

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, thanks! I can't believe I missed that obvious case at #37177. :-/

base/array.jl Outdated Show resolved Hide resolved
base/array.jl Outdated Show resolved Hide resolved
base/array.jl Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Sep 22, 2021

Codecov Report

Merging #42202 (5bf1ca5) into master (f2c627e) will increase coverage by 0.11%.
The diff coverage is n/a.

❗ Current head 5bf1ca5 differs from pull request most recent head 61a725c. Consider uploading reports for the commit 61a725c to get more accurate results

@@            Coverage Diff             @@
##           master   #42202      +/-   ##
==========================================
+ Coverage   89.16%   89.28%   +0.11%     
==========================================
  Files         343      343              
  Lines       78671    79375     +704     
==========================================
+ Hits        70146    70867     +721     
+ Misses       8525     8508      -17     
Impacted Files Coverage Δ
base/compiler/methodtable.jl 59.09% <0.00%> (-34.53%) ⬇️
base/compiler/types.jl 66.66% <0.00%> (-23.04%) ⬇️
stdlib/REPL/src/TerminalMenus/RadioMenu.jl 79.31% <0.00%> (-20.69%) ⬇️
base/compiler/compiler.jl 25.00% <0.00%> (-18.75%) ⬇️
base/tuple.jl 73.66% <0.00%> (-17.25%) ⬇️
base/timing.jl 74.40% <0.00%> (-15.99%) ⬇️
stdlib/LinearAlgebra/src/blas.jl 88.70% <0.00%> (-9.09%) ⬇️
base/rounding.jl 93.18% <0.00%> (-6.82%) ⬇️
base/gcutils.jl 90.90% <0.00%> (-5.97%) ⬇️
base/lock.jl 92.74% <0.00%> (-5.62%) ⬇️
... and 236 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1eaa372...61a725c. Read the comment docs.

@jakobnissen
Copy link
Contributor Author

What this PR needs now is a decision on two details:

  • Currently, this PR evaluates f(true) and f(false) up front for speed. If f is a slow function and the input array contains zero or one element, this will make the PR slower. In most cases, it will be faster though. Also, its correctness assumes f is side effect-free. Are these two things acceptable?
  • Currently, this PR uses @inbounds, assuming the input type has a well-defined AbstractArray interface (e.g. length is not implemented wrong). Is this OK?

If both points are fine, this can be merged.

@nalimilan nalimilan added the status:triage This should be discussed on a triage call label Oct 5, 2021
base/array.jl Outdated Show resolved Hide resolved
base/array.jl Outdated Show resolved Hide resolved
base/array.jl Outdated Show resolved Hide resolved
base/array.jl Outdated Show resolved Hide resolved
jakobnissen and others added 7 commits December 26, 2021 21:08
* Take shortcuts if f(::Bool) always returns true or false
* Avoid branching in main loop to please branch predictor
* Switch to indexing-agnostic code
* Fix regression mentioned in JuliaLang#42187
Co-authored-by: Milan Bouchet-Valat <[email protected]>
@oscardssmith oscardssmith removed the status:triage This should be discussed on a triage call label Jan 6, 2022
@oscardssmith oscardssmith added the status:forget me not PRs that one wants to make sure aren't forgotten label Jan 6, 2022
@aviatesk
Copy link
Sponsor Member

@nanosoldier runbenchmarks(!"scalar", vs=":master")
@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@aviatesk
Copy link
Sponsor Member

@nanosoldier runtests(["AeroAcoustics", "CellMLToolkit", "DiffEqFinancial", "FundamentalsNumericalComputation", "ImageBase", "InspectDR", "LongwaveModePropagator", "NonconvexMMA", "Tapestree", "UKCarbonIntensityData", "ClusteringGA", "DistributedArrays", "ElasticArrays", "FHIRClient", "GPUArrays", "GaussianMixtureAlignment", "HorseML", "OptimizationAlgorithms", "PointwiseKDEs", "QuadEig", "Rfam"], vs = ":master")

@aviatesk
Copy link
Sponsor Member

Any remaining discussion on this? The nanosolidier benchmark/pkgeval look good to me.

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@aviatesk
Copy link
Sponsor Member

@nanosoldier runtests(["DistributedArrays", "PointwiseKDEs"], vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@aviatesk
Copy link
Sponsor Member

The failure within PointwiseKDEs seems real.

@jakobnissen
Copy link
Contributor Author

I can look at it tomorrow if necessary :)

@aviatesk
Copy link
Sponsor Member

Yes please, I didn't follow the details.

@jakobnissen
Copy link
Contributor Author

jakobnissen commented May 21, 2022

I think the failure of PointwiseKDEs is unrelated:

  • I don't think the test suite even calls findall. Modifying findall using Revise to print to the terminal does not result in any printing when running the test suite (but does print when calling findall manually)
  • When running the test suite on master without this PR also fails sporadically (about one in every three runs).

@nalimilan
Copy link
Member

We don't usually run all packages' testsuites before merging PRs anyway. I don't see any reason why this one would be breaking.

@aviatesk aviatesk merged commit 4c4c94f into JuliaLang:master Jun 1, 2022
@jakobnissen jakobnissen deleted the findallbool branch June 1, 2022 13:59
@DilumAluthge DilumAluthge removed the status:forget me not PRs that one wants to make sure aren't forgotten label Sep 6, 2022
JeffBezanson added a commit that referenced this pull request Aug 24, 2023
KristofferC pushed a commit that referenced this pull request Aug 25, 2023
This reverts commit 4c4c94f. (Except
keeps the tests and adds a new one)

fixes #46425
KristofferC pushed a commit that referenced this pull request Sep 15, 2023
This reverts commit 4c4c94f. (Except
keeps the tests and adds a new one)

fixes #46425

(cherry picked from commit defe187)
KristofferC added a commit that referenced this pull request Oct 2, 2023
Backported PRs:
- [x] #48625 <!-- add replace(io, str, patterns...) -->
- [x] #48387 <!-- Improve documentation of sort-related functions -->
- [x] #48363 <!-- Revise sort.md and docstrings in sort.jl -->
- [x] #48977 <!-- Update SparseArrays.jl stdlib for SuiteSparse 7 -->
- [x] #50719 <!-- fix `CyclePadding(::DataType)` -->
- [x] #50694 <!-- inference: permit recursive type traits -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [x] #50802 <!-- Makes IntrusiveLinkedListSynchronized mutable to avoid
allocation on poptask -->
- [x] #50858 <!-- Add a `threadpool` parameter to `Channel` constructor
-->
- [x] #50874 <!-- Restrict COFF to a single thread when symbol count is
high -->
- [x] #50822 <!-- Add default method for setmodifiers! -->
- [x] #50730 <!-- Fix integer overflow in `isapprox` -->
- [x] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [x] #50809 <!-- Limit type-printing in MethodError -->
- [x] #50915 <!-- Add note the `Task` about sticky bit -->
- [x] #50929 <!-- when widening tuple types in tmerge, only widen the
complex parts -->
- [x] #50928 <!-- Bump JuliaSyntax to 0.4.6 -->
- [x] #50959 <!-- Update libssh2 patches -->
- [x] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [x] #48542 <!-- Add docs on task-specific buffering using
multithreading -->
- [x] #50912 <!-- Separate foreign threads into a :foreign threadpool
-->
- [x] #51010 <!-- Add ORIGIN to SuiteSparse rpath on Linux/FreeBSD -->
- [x] #50753 <!-- cat: remove unused promote_eltype methods that confuse
inference -->
- [x] #51027 <!-- Implement realloc accounting correctly -->
- [x] #51019 <!-- fix a case of potentially use of undefined variable
when handling error in distributed message processing -->
- [x] #51039 <!-- Revert "Optimize findall(f, ::AbstractArray{Bool})
(#42202)" -->
- [x] #51036 <!-- add missing invoke edge for nospecialize targets -->
- [x] #51042 <!-- inference: fix return_type_tfunc modeling of concrete
functions -->
- [x] #51114 <!-- Workaround upstream FreeBSD issue #272992 -->
- [x] #50892 <!-- Add `JL_DLLIMPORT` to `small_typeof` declaration -->
- [x] #51154 <!-- broadcast: use recursion rather than ntuple to map
over a tuple -->
- [x] #51153 <!-- fix debug typo in "add missing invoke edge for
nospecialize targets (#51036)" -->
- [x] #51222 <!-- Check again if the tty is open inside the IO lock -->
- [x] #51236 <!-- Add lock around uv_unref during init -->
- [x] #51243 <!-- GMP: Gracefully handle more overflows. -->
- [x] #51254 <!-- Ryu: make sure adding zeros does not overwrite
trailing dot -->
- [x] #51175 <!-- shorten stale_age for cachefile lock -->
- [x] #51300 <!-- fix method definition error for bad vararg -->
- [x] #51307 <!-- fix force-throw ctrl-C on Windows -->
- [x] #51303 <!-- ensure revising structs is safe -->
- [x] #51393 
- [x] #51403 

Need manual backport:
- [x] #51009 <!-- fix #50562, regression in `in` of tuple of Symbols -->
- [x] #51053 <!-- Bump Statistics stdlib -->
- [x] #51013 <!-- fix #50709, type bound error due to free typevar in
sparam env -->
- [x] #51305 <!-- reduce test time for rounding and floatfuncs -->

Contains multiple commits, manual intervention needed:
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [ ] #51092 <!-- inference: fix bad effects for recursion -->
- [x] #51247 <!-- Check if malloc has succeeded before incrementing gc
stats -->
- [x] #51294 <!-- use LibGit2_jll for LibGit2 library -->

Non-merged PRs with backport label:
- [ ] #51132 <!-- Handle `AbstractQ` in concatenation -->
- [x] #51029 <!-- testdefs: make sure that if a test set changes the
active project, they change it back when they're done -->
- [ ] #50919 <!-- Code loading: do the "skipping mtime check for stdlib"
check regardless of the value of `ispath(f)` -->
- [ ] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions -->
- [x] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
nalimilan pushed a commit that referenced this pull request Nov 5, 2023
This reverts commit 4c4c94f. (Except
keeps the tests and adds a new one)

fixes #46425

(cherry picked from commit defe187)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance regression in findall on Bool array with predicate
10 participants