Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ignore: Avoid contention on num_pending #2642

Closed

Conversation

tavianator
Copy link
Contributor

Previously, every worker would increment the shared num_pending count on
every new work item, and decrement it after finishing them, leading to
lots of contention. Now, we only track the number of workers actively
running, so there is no contention except when workers go to sleep or
wake up.

Previously, every worker would increment the shared num_pending count on
every new work item, and decrement it after finishing them, leading to
lots of contention.  Now, we only track the number of workers actively
running, so there is no contention except when workers go to sleep or
wake up.
@tavianator
Copy link
Contributor Author

tavianator commented Oct 30, 2023

Quick benchmark results (updated):

tavianator@tachyon $ hyperfine -w2 fd-{before,after}" -j24 -u '^$' ~/code/bfs/bench/corpus/chromium"
Benchmark 1: fd-before -j24 -u '^$' ~/code/bfs/bench/corpus/chromium
  Time (mean ± σ):     297.6 ms ±  10.9 ms    [User: 3453.3 ms, System: 3379.4 ms]
  Range (min … max):   279.9 ms … 313.3 ms    10 runs
 
Benchmark 2: fd-after -j24 -u '^$' ~/code/bfs/bench/corpus/chromium
  Time (mean ± σ):     231.6 ms ±   6.9 ms    [User: 1568.7 ms, System: 3660.1 ms]
  Range (min … max):   224.6 ms … 248.9 ms    12 runs
 
Summary
  fd-after -j24 -u '^$' ~/code/bfs/bench/corpus/chromium ran
    1.29 ± 0.06 times faster than fd-before -j24 -u '^$' ~/code/bfs/bench/corpus/chromium
tavianator@tachyon $ hyperfine -w2 fd-{before,after}" -j48 -u '^$' ~/code/bfs/bench/corpus/chromium"
Benchmark 1: fd-before -j48 -u '^$' ~/code/bfs/bench/corpus/chromium
  Time (mean ± σ):     226.1 ms ±  17.9 ms    [User: 4597.6 ms, System: 4339.5 ms]
  Range (min … max):   210.0 ms … 260.0 ms    13 runs
 
Benchmark 2: fd-after -j48 -u '^$' ~/code/bfs/bench/corpus/chromium
  Time (mean ± σ):     199.6 ms ±  17.7 ms    [User: 2232.0 ms, System: 5361.9 ms]
  Range (min … max):   185.8 ms … 238.4 ms    15 runs
 
Summary
  fd-after -j48 -u '^$' ~/code/bfs/bench/corpus/chromium ran
    1.13 ± 0.13 times faster than fd-before -j48 -u '^$' ~/code/bfs/bench/corpus/chromium

Copy link
Owner

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll need to think on this one. I've had problems with termination in the past and this changes the logic around that.

crates/ignore/src/walk.rs Outdated Show resolved Hide resolved
crates/ignore/src/walk.rs Outdated Show resolved Hide resolved
This also switches the memory orderings of the dec/inc to acquire
/release, and adds a comment explaining the correctness of the
termination condition based on the region between the access pair.
@tavianator
Copy link
Contributor Author

Thanks! I'll need to think on this one. I've had problems with termination in the past and this changes the logic around that.

Understandable! I am reasonably sure of the correctness now that I've strengthened the memory orderings to acquire/release. But it's not a formal proof, and I haven't tried too hard to break it either.

@BurntSushi
Copy link
Owner

Oh, also, what about the relaxed loads? It looks like they should be fine to me here, but I wonder if Acquire/Release are necessary. I think that's probably why I used SeqCst originally (because I didn't think through what was actually required).

@tavianator
Copy link
Contributor Author

Oh, also, what about the relaxed loads? It looks like they should be fine to me here, but I wonder if Acquire/Release are necessary. I think that's probably why I used SeqCst originally (because I didn't think through what was actually required).

I think it was still correct with relaxed accesses due to other orderings imposed by the deque implementation, but it's easier to argue correctness with acquire/release and they're not performance-critical any more so I switched them.

Copy link
Owner

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've tried this out:

  1. I think I buy that this is a correct change.
  2. I have actually tried it on a number of corpora and everything looks OK.
  3. I haven't been able to observe any reliable performance difference. Nevertheless, I buy the argument here.

Thank you!!!

@BurntSushi BurntSushi added the rollup A PR that has been merged with many others in a rollup. label Nov 21, 2023
BurntSushi pushed a commit that referenced this pull request Nov 21, 2023
Previously, every worker would increment the shared num_pending count on
every new work item, and decrement it after finishing them, leading to
lots of contention.  Now, we only track the number of workers actively
running, so there is no contention except when workers go to sleep or
wake up.

Closes #2642
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Nov 28, 2023
14.0.2 (2023-11-27)
===================
This is a patch release with a few small bug fixes.

Bug fixes:

* [BUG #2654](BurntSushi/ripgrep#2654):
  Fix `deb` release sha256 sum file.
* [BUG #2658](BurntSushi/ripgrep#2658):
  Fix partial regression in the behavior of `--null-data --line-regexp`.
* [BUG #2659](BurntSushi/ripgrep#2659):
  Fix Fish shell completions.
* [BUG #2662](BurntSushi/ripgrep#2662):
  Fix typo in documentation for `-i/--ignore-case`.


14.0.1 (2023-11-26)
===================
This a patch release meant to fix `cargo install ripgrep` on Windows.

Bug fixes:

* [BUG #2653](BurntSushi/ripgrep#2653):
  Include `pkg/windows/Manifest.xml` in crate package.


14.0.0 (2023-11-26)
===================
ripgrep 14 is a new major version release of ripgrep that has some new
features, performance improvements and a lot of bug fixes.

The headlining feature in this release is hyperlink support. In this release,
they are an opt-in feature but may change to an opt-out feature in the future.
To enable them, try passing `--hyperlink-format default`. If you use [VS Code],
then try passing `--hyperlink-format vscode`. Please [report your experience
with hyperlinks][report-hyperlinks], positive or negative.

[VS Code]: https://code.visualstudio.com/
[report-hyperlinks]: BurntSushi/ripgrep#2611

Another headlining development in this release is that it contains a rewrite
of its regex engine. You generally shouldn't notice any changes, except for
some searches may get faster. You can read more about the [regex engine rewrite
on my blog][regex-internals]. Please [report your performance improvements or
regressions that you notice][report-perf].

[report-perf]: BurntSushi/ripgrep#2652

Finally, ripgrep switched the library it uses for argument parsing. Users
should not notice a difference in most cases (error messages have changed
somewhat), but flag overrides should generally be more consistent. For example,
things like `--no-ignore --ignore-vcs` work as one would expect (disables all
filtering related to ignore rules except for rules found in version control
systems such as `git`).

[regex-internals]: https://blog.burntsushi.net/regex-internals/

**BREAKING CHANGES**:

* `rg -C1 -A2` used to be equivalent to `rg -A2`, but now it is equivalent to
  `rg -B1 -A2`. That is, `-A` and `-B` no longer completely override `-C`.
  Instead, they only partially override `-C`.

Build process changes:

* ripgrep's shell completions and man page are now created by running ripgrep
with a new `--generate` flag. For example, `rg --generate man` will write a
man page in `roff` format on stdout. The release archives have not changed.
* The optional build dependency on `asciidoc` or `asciidoctor` has been
dropped. Previously, it was used to produce ripgrep's man page. ripgrep now
owns this process itself by writing `roff` directly.

Performance improvements:

* [PERF #1746](BurntSushi/ripgrep#1746):
  Make some cases with inner literals faster.
* [PERF #1760](BurntSushi/ripgrep#1760):
  Make most searches with `\b` look-arounds (among others) much faster.
* [PERF #2591](BurntSushi/ripgrep#2591):
  Parallel directory traversal now uses work stealing for faster searches.
* [PERF #2642](BurntSushi/ripgrep#2642):
  Parallel directory traversal has some contention reduced.

Feature enhancements:

* Added or improved file type filtering for Ada, DITA, Elixir, Fuchsia, Gentoo,
  Gradle, GraphQL, Markdown, Prolog, Raku, TypeScript, USD, V
* [FEATURE #665](BurntSushi/ripgrep#665):
  Add a new `--hyperlink-format` flag that turns file paths into hyperlinks.
* [FEATURE #1709](BurntSushi/ripgrep#1709):
  Improve documentation of ripgrep's behavior when stdout is a tty.
* [FEATURE #1737](BurntSushi/ripgrep#1737):
  Provide binaries for Apple silicon.
* [FEATURE #1790](BurntSushi/ripgrep#1790):
  Add new `--stop-on-nonmatch` flag.
* [FEATURE #1814](BurntSushi/ripgrep#1814):
  Flags are now categorized in `-h/--help` output and ripgrep's man page.
* [FEATURE #1838](BurntSushi/ripgrep#1838):
  An error is shown when searching for NUL bytes with binary detection enabled.
* [FEATURE #2195](BurntSushi/ripgrep#2195):
  When `extra-verbose` mode is enabled in zsh, show extra file type info.
* [FEATURE #2298](BurntSushi/ripgrep#2298):
  Add instructions for installing ripgrep using `cargo binstall`.
* [FEATURE #2409](BurntSushi/ripgrep#2409):
  Added installation instructions for `winget`.
* [FEATURE #2425](BurntSushi/ripgrep#2425):
  Shell completions (and man page) can be created via `rg --generate`.
* [FEATURE #2524](BurntSushi/ripgrep#2524):
  The `--debug` flag now indicates whether stdin or `./` is being searched.
* [FEATURE #2643](BurntSushi/ripgrep#2643):
  Make `-d` a short flag for `--max-depth`.
* [FEATURE #2645](BurntSushi/ripgrep#2645):
  The `--version` output will now also contain PCRE2 availability information.

Bug fixes:

* [BUG #884](BurntSushi/ripgrep#884):
  Don't error when `-v/--invert-match` is used multiple times.
* [BUG #1275](BurntSushi/ripgrep#1275):
  Fix bug with `\b` assertion in the regex engine.
* [BUG #1376](BurntSushi/ripgrep#1376):
  Using `--no-ignore --ignore-vcs` now works as one would expect.
* [BUG #1622](BurntSushi/ripgrep#1622):
  Add note about error messages to `-z/--search-zip` documentation.
* [BUG #1648](BurntSushi/ripgrep#1648):
  Fix bug where sometimes short flags with values, e.g., `-M 900`, would fail.
* [BUG #1701](BurntSushi/ripgrep#1701):
  Fix bug where some flags could not be repeated.
* [BUG #1757](BurntSushi/ripgrep#1757):
  Fix bug when searching a sub-directory didn't have ignores applied correctly.
* [BUG #1891](BurntSushi/ripgrep#1891):
  Fix bug when using `-w` with a regex that can match the empty string.
* [BUG #1911](BurntSushi/ripgrep#1911):
  Disable mmap searching in all non-64-bit environments.
* [BUG #1966](BurntSushi/ripgrep#1966):
  Fix bug where ripgrep can panic when printing to stderr.
* [BUG #2046](BurntSushi/ripgrep#2046):
  Clarify that `--pre` can accept any kind of path in the documentation.
* [BUG #2108](BurntSushi/ripgrep#2108):
  Improve docs for `-r/--replace` syntax.
* [BUG #2198](BurntSushi/ripgrep#2198):
  Fix bug where `--no-ignore-dot` would not ignore `.rgignore`.
* [BUG #2201](BurntSushi/ripgrep#2201):
  Improve docs for `-r/--replace` flag.
* [BUG #2288](BurntSushi/ripgrep#2288):
  `-A` and `-B` now only each partially override `-C`.
* [BUG #2236](BurntSushi/ripgrep#2236):
  Fix gitignore parsing bug where a trailing `\/` resulted in an error.
* [BUG #2243](BurntSushi/ripgrep#2243):
  Fix `--sort` flag for values other than `path`.
* [BUG #2246](BurntSushi/ripgrep#2246):
  Add note in `--debug` logs when binary files are ignored.
* [BUG #2337](BurntSushi/ripgrep#2337):
  Improve docs to mention that `--stats` is always implied by `--json`.
* [BUG #2381](BurntSushi/ripgrep#2381):
  Make `-p/--pretty` override flags like `--no-line-number`.
* [BUG #2392](BurntSushi/ripgrep#2392):
  Improve global git config parsing of the `excludesFile` field.
* [BUG #2418](BurntSushi/ripgrep#2418):
  Clarify sorting semantics of `--sort=path`.
* [BUG #2458](BurntSushi/ripgrep#2458):
  Make `--trim` run before `-M/--max-columns` takes effect.
* [BUG #2479](BurntSushi/ripgrep#2479):
  Add documentation about `.ignore`/`.rgignore` files in parent directories.
* [BUG #2480](BurntSushi/ripgrep#2480):
  Fix bug when using inline regex flags with `-e/--regexp`.
* [BUG #2505](BurntSushi/ripgrep#2505):
  Improve docs for `--vimgrep` by mentioning footguns and some work-arounds.
* [BUG #2519](BurntSushi/ripgrep#2519):
  Fix incorrect default value in documentation for `--field-match-separator`.
* [BUG #2523](BurntSushi/ripgrep#2523):
  Make executable searching take `.com` into account on Windows.
* [BUG #2574](BurntSushi/ripgrep#2574):
  Fix bug in `-w/--word-regexp` that would result in incorrect match offsets.
* [BUG #2623](BurntSushi/ripgrep#2623):
  Fix a number of bugs with the `-w/--word-regexp` flag.
* [BUG #2636](BurntSushi/ripgrep#2636):
  Strip release binaries for macOS.
@tavianator tavianator deleted the num-pending-contention branch December 19, 2023 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rollup A PR that has been merged with many others in a rollup.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants