Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

benchmark time on very small corpora #33

Closed
BurntSushi opened this issue Sep 23, 2016 · 3 comments
Closed

benchmark time on very small corpora #33

BurntSushi opened this issue Sep 23, 2016 · 3 comments
Labels
enhancement An enhancement to the functionality of the software.

Comments

@BurntSushi
Copy link
Owner

An end user reports that rg isn't as fast on ag on very small repositories. While it seems trivial, if this is because of startup time, then it's worth investigating and fixing.

@BurntSushi BurntSushi added the enhancement An enhancement to the functionality of the software. label Sep 23, 2016
@BurntSushi
Copy link
Owner Author

#77 is another report of grep beating rg on a single file because of startup time.

@BurntSushi
Copy link
Owner Author

Someone ran micobenchmarks on infinitesimally sized input. grep is twice as fast.

@BurntSushi
Copy link
Owner Author

OK, I think I can call this one done. I think a variety of small improvements have mostly fixed this:

  • Switch from Docopt to clap (probably the biggest contributor).
  • Permit ignore handling/parsing to reuse previous results when applicable.
  • Use a parallel recursive directory iterator.

In particular, on the previously linked microbenchmark:

$ python -m timeit -n 1 -r 10 -v -s 'import os' 'os.system("cat /etc/group | grep 1000 > /dev/null")'
raw times: 0.0115 0.0104 0.0103 0.0104 0.00902 0.00929 0.00842 0.00919 0.00792 0.00861
1 loops, best of 10: 7.92 msec per loop
$ python -m timeit -n 1 -r 10 -v -s 'import os' 'os.system("cat /etc/group | rg 1000 > /dev/null")'
raw times: 0.0103 0.0103 0.00946 0.00881 0.00871 0.00809 0.00802 0.00764 0.00722 0.00714
1 loops, best of 10: 7.14 msec per loop

Compare this with ripgrep 0.2.6:

python -m timeit -n 1 -r 10 -v -s 'import os' 'os.system("cat /etc/group | rg-0.2.6 1000 > /dev/null")'
raw times: 0.0261 0.0251 0.0257 0.0217 0.0234 0.0239 0.0237 0.0235 0.022 0.0245
1 loops, best of 10: 21.7 msec per loop

I've also done some testing on small repos. This repo qualifies as quite small. There is a ton of variance. The following were ran in succession:

[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305

real    0m0.033s
user    0m0.257s
sys     0m0.007s
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305

real    0m0.025s
user    0m0.157s
sys     0m0.023s
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305

real    0m0.013s
user    0m0.017s
sys     0m0.030s
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305

real    0m0.040s
user    0m0.270s
sys     0m0.007s

ag seems to have slightly less variance:

[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306

real    0m0.029s
user    0m0.007s
sys     0m0.010s
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306

real    0m0.022s
user    0m0.007s
sys     0m0.010s
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306

real    0m0.023s
user    0m0.020s
sys     0m0.003s
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306

real    0m0.016s
user    0m0.013s
sys     0m0.003s

But, if we fix both programs to use a single thread, then variance becomes almost non-existent:

[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305

real    0m0.011s
user    0m0.003s
sys     0m0.007s
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305

real    0m0.011s
user    0m0.007s
sys     0m0.007s
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305

real    0m0.011s
user    0m0.000s
sys     0m0.010s
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305

real    0m0.010s
user    0m0.003s
sys     0m0.003s

and ag:

[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306

real    0m0.018s
user    0m0.010s
sys     0m0.000s
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306

real    0m0.018s
user    0m0.010s
sys     0m0.000s
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306

real    0m0.018s
user    0m0.010s
sys     0m0.003s
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306

real    0m0.021s
user    0m0.007s
sys     0m0.003s

In other words, on small corpora, ripgrep (and, to a lesser extent, ag) seem to be quite susceptible to the overhead of creating threads, which also seems to introduce large variance.

I'm not sure there's anything we can really do about it. The difference in this particular example, at least, is nominal. From a user's perspective, we don't really care about the difference between 10ms and 30ms. However, the place where this can hurt us is if folks use ripgrep like it was grep, for example, in an xargs pipeline:

$ time find ./ -name '*.[ch]' -print0 | xargs -0 -P8 grep PM_RESUME | wc -l
10

real    0m0.213s
user    0m0.423s
sys     0m0.330s
$ time find ./ -name '*.[ch]' -print0 | xargs -0 -P8 rg PM_RESUME | wc -l
10

real    0m0.402s
user    0m4.750s
sys     0m0.460s

The overhead of ripgrep launching threads ends up making it twice as slow. Of course, in this case, ripgrep launching threads doesn't really make much sense since xargs is doing it for us. But this isn't something that a user will obviously know. Of course, forcing ripgrep to use a single thread brings it back down to grep speeds:

time find ./ -name '*.[ch]' -print0 | xargs -0 -P8 rg -j1 PM_RESUME | wc -l
10

real    0m0.206s
user    0m0.507s
sys     0m0.487s

The other case to consider is using xargs without -P and having ripgrep handle parallelism:

time find ./ -name '*.[ch]' -print0 | xargs -0 rg PM_RESUME | wc -l
10

real    0m0.463s
user    0m1.297s
sys     0m0.597s

While worse than using xargs -P, compare with grep:

time find ./ -name '*.[ch]' -print0 | xargs -0 grep PM_RESUME | wc -l
10

real    0m0.655s
user    0m0.477s
sys     0m0.347s

I'm not sure there's really too much we can do here. An end user can use ripgrep in an xargs pipeline optimally, but it definitely requires a bit of intuition on the user's behalf to realize that ripgrep is already multithreaded where as grep is not.

(This is yet another example proving that benchmarking these tools is ridiculously hard.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement to the functionality of the software.
Projects
None yet
Development

No branches or pull requests

1 participant