using '^' (or '\A') with multi-line search doesn't always produce expected result #1878

Closed
BurntSushi opened this issue May 29, 2021 · 0 comments

Initially found on HN: https://news.ycombinator.com/item?id=27324265

Specifically, while this output is correct (since ^ is set to be in regex multi-line mode always):

$ printf 'a\nbaz\nabc\n' | rg -U '^b'
baz

It should be the case that using (?-m)^b or \Ab would not print baz as a match. But that's not the case here:

$ printf 'a\nbaz\nabc\n' | rg -U '(?-m)^b'
baz
$ printf 'a\nbaz\nabc\n' | rg -U '\Ab'
baz
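
For reference, here is a minimal sketch of the anchor semantics I'd expect, written with Python's `re` module purely as an illustration. ripgrep uses Rust's regex engine, not Python's, and Python has no global `(?-m)` flag, so leaving out `re.MULTILINE` stands in for it here. The variable names are made up for this example.

```python
import re

haystack = "a\nbaz\nabc\n"

# Multi-line mode: '^' may match right after a line terminator.
multi_line = re.search(r"^b", haystack, re.MULTILINE)

# Multi-line mode off (the analogue of '(?-m)^b'): '^' matches only at offset 0.
single_line = re.search(r"^b", haystack)

# '\A' matches only at offset 0, regardless of multi-line mode.
start_of_text = re.search(r"\Ab", haystack, re.MULTILINE)

print(multi_line is not None, single_line is not None, start_of_text is not None)
```

Only the first search finds `b`, at the start of the second line; the other two see `a` at offset 0 and fail.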

The issue here is that ripgrep isn't memory mapping the input in this case. When that happens, ripgrep tries to be "smart" and avoids reading the entire contents onto the heap if it knows the pattern can't match through a line terminator. But here we can't make that assumption, since anchors can match line terminators as look-around.
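
To make the failure mode concrete, here is a rough sketch in Python of why searching one line at a time changes what `\A` means. This is not ripgrep's actual code; `search_line_by_line` and `search_whole_buffer` are hypothetical helpers that only model the two strategies, and the whole-buffer helper assumes each match stays within a single line.

```python
import re

def search_line_by_line(pattern, text):
    """Model of the 'smart' path: each line is searched as its own haystack."""
    rx = re.compile(pattern)
    return [line for line in text.splitlines() if rx.search(line)]

def search_whole_buffer(pattern, text):
    """Model of the correct path: search the full text, then report matching lines."""
    rx = re.compile(pattern)
    matched = []
    for m in rx.finditer(text):
        # Expand the match offset to the boundaries of its enclosing line.
        start = text.rfind("\n", 0, m.start()) + 1
        end = text.find("\n", m.start())
        if end == -1:
            end = len(text)
        matched.append(text[start:end])
    return matched

text = "a\nbaz\nabc\n"
print(search_line_by_line(r"\Ab", text))  # every line start looks like the start of text
print(search_whole_buffer(r"\Ab", text))  # only offset 0 of the input counts
```

In the line-by-line model, every line begins its own haystack, so `\A` succeeds on `baz` even though that line is not at the start of the input. Searching the whole buffer keeps the anchor tied to offset 0.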
