--only-matching only prints first match of matched line N times #451

Closed
s3rb31 opened this issue Apr 17, 2017 · 4 comments

s3rb31 commented Apr 17, 2017

Take this file as test data. If I do the following with good old grep:

grep -Po "\/get\/v\/\d*" fgi.txt

I get the following result:

/get/v/19670
/get/v/19637
/get/v/19632
/get/v/19600
/get/v/19598
/get/v/19574
/get/v/19523
/get/v/19521
/get/v/19463
/get/v/19457
/get/v/19433
/get/v/19425
/get/v/19392
/get/v/19390
/get/v/19363
/get/v/19358
/get/v/19337
/get/v/19317
/get/v/19295
/get/v/19278
/get/v/19243
/get/v/19241
/get/v/19208
/get/v/19186
/get/v/19128
/get/v/19126

But if I try to achieve the same result with rg:

rg -o "/get/v/\d*" fgi.txt -N

/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670
/get/v/19670

I think that behaviour is really odd and cannot be intended. If I am not mistaken, this also violates the documentation (manpage):

-o, --only-matching
    Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

But it does not print all the parts. It just prints the first match N times, where N is actually the correct number of matched parts.

I hope this can get fixed. I may create a PR myself if it is not too complicated and someone can point me in the right direction.

Greetings

@bmalehorn
Contributor

Here's a much simpler example:

$ cat example.txt
1 2 3

$ grep -Po "[0-9]+" example.txt
1
2
3

$ rg -o "[0-9]+" example.txt -N
1
1
1

@s3rb31 if you want to take a stab at it, I believe the bug is here: https://github.com/BurntSushi/ripgrep/blob/master/src/printer.rs#L311 Instead of running the regex again, it should use the original match data to get the start & end.
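
To make the suggested fix concrete, here is a minimal sketch of the two approaches. It is not ripgrep's printer code: `find_digit_runs`, `only_matching_buggy` and `only_matching_fixed` are hypothetical names, and a hand-written digit scanner stands in for the regex engine so that the block only needs the standard library.

```rust
/// Stand-in for a regex search: byte ranges of every run of ASCII digits,
/// roughly what `[0-9]+` would match. Hypothetical helper, not ripgrep code.
fn find_digit_runs(line: &str) -> Vec<(usize, usize)> {
    let bytes = line.as_bytes();
    let mut spans = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i].is_ascii_digit() {
            let start = i;
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            spans.push((start, i));
        } else {
            i += 1;
        }
    }
    spans
}

/// Buggy pattern: for every match, search the line again from the start,
/// so the first match is emitted once per match.
fn only_matching_buggy(line: &str) -> Vec<String> {
    let spans = find_digit_runs(line);
    spans
        .iter()
        .map(|_| {
            // The span of the current match is ignored here.
            let (s, e) = find_digit_runs(line)[0];
            line[s..e].to_string()
        })
        .collect()
}

/// Fixed pattern: reuse the start and end offsets of each original match.
fn only_matching_fixed(line: &str) -> Vec<String> {
    find_digit_runs(line)
        .into_iter()
        .map(|(s, e)| line[s..e].to_string())
        .collect()
}

fn main() {
    let line = "1 2 3";
    println!("{:?}", only_matching_buggy(line)); // ["1", "1", "1"]
    println!("{:?}", only_matching_fixed(line)); // ["1", "2", "3"]
}
```

The buggy variant still emits the right number of rows because it iterates over the real matches, which lines up with the "first match N times" symptom in the report.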

Contributor

kpp commented Apr 20, 2017

The column is correct, but the matched text is not:

~/ripgrep$ cat tests/digits.txt 
1 2 3
~/ripgrep$ cargo run -- "\d" tests/digits.txt -o -n --column
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/rg '\d' tests/digits.txt -o -n --column`
1:1:1
1:3:1
1:5:1
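
The output above suggests the column already comes from each match's own position while the text does not. As a rough sketch of what the expected rows look like when both come from the same span: `format_matches` is a hypothetical helper (not a ripgrep function), the spans are written out by hand instead of coming from a regex engine, and the column is assumed to be the 1-based byte offset of the match start.

```rust
/// Formats one row per match as `line:column:text`. Both the column and the
/// text are derived from the same (start, end) byte span of the match.
/// Hypothetical helper for illustration only.
fn format_matches(line_no: usize, line: &str, spans: &[(usize, usize)]) -> Vec<String> {
    spans
        .iter()
        // Assumption: column = 1-based byte offset of the match start.
        .map(|&(start, end)| format!("{}:{}:{}", line_no, start + 1, &line[start..end]))
        .collect()
}

fn main() {
    let line = "1 2 3";
    // Spans a `\d` search would report for this line, listed by hand.
    let spans = [(0, 1), (2, 3), (4, 5)];
    for row in format_matches(1, line, &spans) {
        println!("{}", row);
    }
    // Prints:
    // 1:1:1
    // 1:3:2
    // 1:5:3
}
```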

Contributor

kpp commented Apr 20, 2017

Sorry for the bug 😭

kpp added a commit to kpp/ripgrep that referenced this issue Apr 20, 2017

s3rb31 commented Apr 20, 2017

Oh, this was really quick. Thanks for the fix!

I compiled it and it has worked well and reliably so far. I will report back if I encounter more bugs.

Thanks again!
