Ignoring files based on their size #369

Closed
mensfeld opened this issue Feb 18, 2017 · 7 comments
Labels
enhancement: An enhancement to the functionality of the software.
help wanted: Others are encouraged to work on this issue.

Comments

@mensfeld

Hey, is there an option in ripgrep to exclude files based on their size? I have a lot of text files larger than 10MB that I don't want to grep, and I would love to be able to exclude them all. Thank you.

@BurntSushi
Owner

No. Can you use some other attribute of the text files? You can specify paths to ignore at the command line with -g '!dir/to/ignore' or -g '!*.txt'.

@BurntSushi added the question label ("An issue that is lacking clarity on one or more points.") on Feb 18, 2017

@mensfeld
Author

Hello @BurntSushi. Thank you for such a fast reaction. Unfortunately I can't. I use rg as one step in a "chain" where I get some locations and need to search based on them. If it is not possible, I will just wrap it in a shell script with a list of dirs/files to ignore.

@BurntSushi
Owner

It is indeed not possible right now. It would have to be explicitly added to ripgrep as a new flag. Which I'd be OK with, since it should be relatively easy to add. It will have a performance cost, but that seems fine since the end user would have to specifically ask for it anyway.

@BurntSushi added the enhancement and help wanted labels and removed the question label on Feb 18, 2017

@mensfeld
Author

@BurntSushi - yes, it would have a performance impact, but only when needed, which shouldn't be that bad. For now I will just find those files on my own and exclude them. Thank you.
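
As an illustration of the workaround described here (finding the large files separately and excluding them), the following is a minimal sketch in Rust using only the standard library. The function names `oversized_files` and `exclude_args` are hypothetical. It assumes rg is run from `root`, so that the relative paths line up with the `-g '!...'` form mentioned earlier in the thread; it does not escape glob metacharacters, and how rg anchors such globs is not verified here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect regular files under `root` larger than `limit` bytes.
fn oversized_files(root: &Path, limit: u64) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::metadata` does not follow symlinks, so symlinked
            // directories are not descended into.
            let meta = entry.metadata()?;
            if meta.is_dir() {
                pending.push(entry.path());
            } else if meta.is_file() && meta.len() > limit {
                found.push(entry.path());
            }
        }
    }
    Ok(found)
}

/// Turn each path into a `-g`, `!relative/path` argument pair.
/// Assumption: rg is invoked with `root` as its working directory.
fn exclude_args(root: &Path, paths: &[PathBuf]) -> Vec<String> {
    let mut args = Vec::new();
    for path in paths {
        let rel = path.strip_prefix(root).unwrap_or(path.as_path());
        args.push("-g".to_string());
        args.push(format!("!{}", rel.display()));
    }
    args
}
```

The resulting argument list could then be appended to the rg invocation in the wrapper. Note that this walks the tree once before rg walks it again, which is the kind of overhead a built-in flag would avoid.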

@tiehuis
Contributor

tiehuis commented Feb 23, 2017

I'll probably end up looking at implementing this over the next few days. To outline the intended approach, from a quick glance, before beginning:

  • Add a new size matcher into Ignore and co in the ignore crate (in file dir.rs)
  • Add the size check into matched (in file dir.rs)
  • Add the top-level options into all the required locations
  • Add some tests

There probably should be some nice intuitive values expected for the command line option, too (e.g. --limit-size=2K --limit-size=1G).
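
A minimal sketch of how such human-readable values might be parsed, in Rust with only the standard library. The function name `parse_size` is hypothetical, and the choice of binary multiples (K = 1024 bytes, M = 1024 K, G = 1024 M) is an assumption made for illustration, not something settled in this thread.

```rust
/// Parse a size such as "512", "2K", "10M" or "1G" into a byte count.
/// Returns `None` for empty input, an unknown suffix, a missing number,
/// or a value that overflows `u64`.
/// Assumption: suffixes are binary multiples (K = 2^10, M = 2^20, G = 2^30).
fn parse_size(input: &str) -> Option<u64> {
    let s = input.trim();
    // The suffixes are single-byte ASCII, so slicing off one byte is safe.
    let (digits, shift) = match s.chars().last()? {
        'K' | 'k' => (&s[..s.len() - 1], 10),
        'M' | 'm' => (&s[..s.len() - 1], 20),
        'G' | 'g' => (&s[..s.len() - 1], 30),
        _ => (s, 0),
    };
    let n: u64 = digits.parse().ok()?;
    n.checked_mul(1u64 << shift)
}
```

With these assumptions, `parse_size("2K")` yields `Some(2048)` and `parse_size("2X")` yields `None`. Returning `Option` keeps the error reporting up to the caller, which would presumably be the command line argument handling.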

@BurntSushi
Owner

> Add a new size matcher into Ignore and co in the ignore crate (in file dir.rs)

I think the ignore crate is probably the right place, but it feels like the filter should be implemented in walk.rs, not dir.rs. The stuff in dir.rs is pretty tightly coupled to glob matching, whereas the stuff in walk.rs is generalized to recursive directory walking. Note that you will have to implement the check on both the single-threaded and parallel walkers.
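
To make the shape of such a check concrete, here is a minimal sketch in Rust using only the standard library. It is not the ignore crate's code; `exceeds_limit` is a hypothetical helper. Keeping the check in one standalone function is one way both a single-threaded and a parallel walker could share it, and returning early when no limit is set keeps the extra metadata lookup opt-in.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Report whether a walker should skip `path` because of its size.
/// Directories are never skipped, so the walk can still descend into them.
fn exceeds_limit(path: &Path, max_filesize: Option<u64>) -> io::Result<bool> {
    // No limit requested: return without touching the filesystem at all.
    let max = match max_filesize {
        Some(max) => max,
        None => return Ok(false),
    };
    // `fs::metadata` follows symlinks, so a link is judged by its target.
    let meta = fs::metadata(path)?;
    Ok(meta.is_file() && meta.len() > max)
}
```

A walker would call this once per candidate entry and drop the entry when it returns `Ok(true)`. A file exactly at the limit is kept in this sketch, since only files strictly larger than the limit are skipped.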

> There probably should be some nice intuitive values expected for the command line option, too (e.g. --limit-size=2K --limit-size=1G).

That seems like nice extra credit. Feel free to punt on that if you like. :-)

tiehuis added a commit to tiehuis/ripgrep that referenced this issue Feb 28, 2017
The --max-filesize option allows filtering files which are larger than
the specified limit. This is potentially useful if one is attempting to
search a number of large files without common file-types/suffixes.

See BurntSushi#369.
BurntSushi pushed a commit that referenced this issue Mar 8, 2017
The --max-filesize option allows filtering files which are larger than
the specified limit. This is potentially useful if one is attempting to
search a number of large files without common file-types/suffixes.

See #369.
@BurntSushi
Owner

Fixed in #385
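
For a tool that calls rg as one step in a chain, as in the original request, the new option could be passed along when building the command. The following is a minimal sketch in Rust; `rg_with_size_limit` is a hypothetical helper, and the `10M` form in the usage note assumes the flag accepts a size suffix, as ripgrep's documentation for `--max-filesize` describes.

```rust
use std::process::Command;

/// Build (but do not run) an rg invocation that skips files larger than `limit`.
/// Assumption: `pattern` does not start with `-`, so it is not mistaken for a flag.
fn rg_with_size_limit(pattern: &str, path: &str, limit: &str) -> Command {
    let mut cmd = Command::new("rg");
    cmd.arg("--max-filesize").arg(limit).arg(pattern).arg(path);
    cmd
}
```

Calling `rg_with_size_limit("TODO", ".", "10M")` and then `.status()` or `.output()` on the result would run the search, provided rg is installed and on the `PATH`.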
