"file length overflows usize" #922

Closed
K-arti-k opened this issue May 19, 2018 · 8 comments · Fixed by #1017
Labels
libripgrep An issue related to modularizing ripgrep into libraries. question An issue that is lacking clarity on one or more points.

Comments

@K-arti-k

#### What version of ripgrep are you using?

ripgrep 0.7.1 -AVX -SIMD.

#### What operating system are you using ripgrep on?

Operating System: 64-bit
Windows 7: Cygwin Terminal

#### Describe your question, feature request, or bug.

When I search certain files, I get the error "file length overflows usize".

#### If this is a bug, what are the steps to reproduce the behavior?

$ ./rg -i -w "Gurkiran" "zoosk.txt"
file length overflows usize

@BurntSushi
Owner

BurntSushi commented May 19, 2018

Sorry, but this isn't reproducible. Please provide enough information (like the file you're searching) in order to reproduce the problem. The issue template you filled out explicitly requests this:

> If possible, please include both your search patterns and the corpus on which
> you are searching. Unless the bug is very obvious, then it is unlikely that it
> will be fixed if the ripgrep maintainers cannot reproduce it.
>
> If the corpus is too big and you cannot decrease its size, file the bug anyway
> and the ripgrep maintainers will help figure out next steps.

Please also state how you installed ripgrep. Please also try the latest version, which is 0.8.1.

@BurntSushi BurntSushi added the question An issue that is lacking clarity on one or more points. label May 19, 2018
@peter-bertok

This is indirectly caused by this issue in memmap-rs: danburkert/memmap-rs#67

Someone already submitted a patch for memmap-rs back in April 2018, but fixing this for the general case will take some time...

@danburkert
Contributor

Something doesn't add up here. Ripgrep isn't using file offsets, so the usize/u64 offset API issue in memmap shouldn't be in play here. That error message only occurs when you attempt to memory map a file whose length exceeds usize::MAX. That shouldn't be possible on a 64-bit OS.
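
To illustrate the length check described above, here is a minimal sketch. It is not the memmap crate's actual code: `checked_map_len` and its `pointer_max` parameter are hypothetical names, and the pointer width is passed in explicitly so that the 32-bit case can be simulated on any machine. Only the error text is taken from the report.

```rust
/// Error text as it appears in the report above.
const OVERFLOW_MSG: &str = "file length overflows usize";

/// Hypothetical helper (not part of memmap): checks whether a file of
/// `file_len` bytes could be mapped in one piece on a target whose largest
/// pointer-sized value is `pointer_max`.
fn checked_map_len(file_len: u64, pointer_max: u64) -> Result<u64, &'static str> {
    if file_len > pointer_max {
        // The whole file cannot be addressed through a single mapping.
        Err(OVERFLOW_MSG)
    } else {
        Ok(file_len)
    }
}
```

With `pointer_max` set to `u32::MAX as u64`, a 5 GiB length is rejected. With `u64::MAX`, the same length is accepted, which is why the error should not appear with a 64-bit executable.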

@BurntSushi
Owner

@danburkert Yeah, something is definitely fishy with this bug report. Either the info is wrong, or Windows is running in 32-bit mode, or maybe they even downloaded the 32-bit executable? I don't know.

@BatmanAoD

BatmanAoD commented Jul 5, 2018

Per @peter-bertok's comment in users.rust-lang, this is indeed reproducible with the 32 bit executable, so I expect @K-arti-k did indeed encounter this issue with the 32 bit executable.

I feel that searching through a multi-GB file in 32-bit mode (especially on a 64-bit OS!) is not a particularly compelling use case for ripgrep. That said, Peter Bertok is presumably correct that this could eventually be fixed in a future version of memmap-rs.

@BurntSushi
Owner

BurntSushi commented Jul 5, 2018

@BatmanAoD Note that this bug should be fixed on master. ripgrep just won't use memory maps and will use standard read calls instead. (Also, I'm not sure that a fix to this belongs in the memmap crate.)
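
The fallback described above can be sketched roughly as follows. This is an assumption-laden illustration, not ripgrep's implementation: `search` and `count_matching_lines` are hypothetical names, the memory map attempt is modelled as a plain `Result<&[u8], String>`, and the "search" is just a count of lines containing a substring.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Counts the lines that contain `needle`, using ordinary buffered reads.
/// Assumes the input is valid UTF-8, which a real search tool cannot assume.
fn count_matching_lines<R: Read>(reader: R, needle: &str) -> io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(reader).lines() {
        if line?.contains(needle) {
            count += 1;
        }
    }
    Ok(count)
}

/// Hypothetical driver: `mapped` stands in for the outcome of a memory map
/// attempt. If mapping failed, the error is ignored and the search falls
/// back to reading from `fallback` instead.
fn search<R: Read>(
    mapped: Result<&[u8], String>,
    fallback: R,
    needle: &str,
) -> io::Result<usize> {
    match mapped {
        // A byte slice implements `Read`, so both paths share one routine.
        Ok(bytes) => count_matching_lines(bytes, needle),
        Err(_) => count_matching_lines(fallback, needle),
    }
}
```

The point of the design is that a failed mapping changes only how the bytes are obtained, not the result of the search.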

@danburkert
Contributor

> That said, Peter Bertok is presumably correct that this could eventually be fixed in a future version of memmap-rs.

No, this is not correct. This ripgrep issue has no relation to the u64/usize offset API issue, which is the only memmap bug which has been brought forward. It's fundamentally not possible to create a memory map with a length which exceeds usize::MAX.

@BatmanAoD

@danburkert In the linked discussion (which unfortunately is long and contentious, and has a broad range of topics that are mostly unrelated), @peter-bertok claims that 32-bit systems can create memory maps for arbitrarily sized files, using "views" that fit into the actual available memory space (which is really isize::MAX rather than usize::MAX for some or possibly all 32-bit Windows OSes).
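
As a rough illustration of the "views" idea, the sketch below only computes which `(offset, length)` windows would cover a large file. It does not map anything, and `view_ranges` and `max_view` are hypothetical names rather than part of memmap or ripgrep. Note that the offsets are `u64`, since they can exceed what a 32-bit `usize` holds.

```rust
/// Hypothetical helper: splits a file of `file_len` bytes into consecutive
/// `(offset, length)` windows, each at most `max_view` bytes long.
fn view_ranges(file_len: u64, max_view: u64) -> Vec<(u64, u64)> {
    assert!(max_view > 0, "a view must cover at least one byte");
    let mut ranges = Vec::new();
    let mut offset = 0u64;
    while offset < file_len {
        // The last window may be shorter than `max_view`.
        let len = max_view.min(file_len - offset);
        ranges.push((offset, len));
        offset += len;
    }
    ranges
}
```

For example, `view_ranges(10, 4)` yields `[(0, 4), (4, 4), (8, 2)]`. A real search over such windows would also have to handle matches that straddle a window boundary, which this sketch ignores.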

@BurntSushi BurntSushi added the libripgrep An issue related to modularizing ripgrep into libraries. label Jul 22, 2018
BurntSushi added a commit that referenced this issue Aug 19, 2018
This commit updates the CHANGELOG to reflect all the work done to make
libripgrep a reality.

* Closes #162 (libripgrep)
* Closes #176 (multiline search)
* Closes #188 (opt-in PCRE2 support)
* Closes #244 (JSON output)
* Closes #416 (Windows CRLF support)
* Closes #917 (trim prefix whitespace)
* Closes #993 (add --null-data flag)
* Closes #997 (--passthru works with --replace)

* Fixes #2 (memory maps and context handling work)
* Fixes #200 (ripgrep stops when pipe is closed)
* Fixes #389 (more intuitive `-w/--word-regexp`)
* Fixes #643 (detection of stdin on Windows is better)
* Fixes #441, Fixes #690, Fixes #980 (empty matching lines are weird)
* Fixes #764 (coalesce color escapes)
* Fixes #922 (memory maps failing is no big deal)
* Fixes #937 (color escapes no longer used for empty matches)
* Fixes #940 (--passthru does not impact exit status)
* Fixes #1013 (show runtime CPU features in --version output)
BurntSushi added a commit that referenced this issue Aug 20, 2018