Panic when searching Ruby language interpreter source code #1052

Closed
jackc opened this issue Sep 13, 2018 · 2 comments · Fixed by #1065
Labels
bug A bug.

jackc commented Sep 13, 2018

What version of ripgrep are you using?

jack@happy:~/Downloads/ruby-2.5.1$ rg --version
ripgrep 0.10.0
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

cargo install ripgrep

What operating system are you using ripgrep on?

Ubuntu 18.04.1

jack@happy:~/Downloads/ruby-2.5.1$ uname -a
Linux happy 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

If this is a bug, what are the steps to reproduce the behavior?

Download the Ruby source code and search in its directory.

jack@happy:/tmp$ curl --silent https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz > ruby-2.5.1.tar.gz
jack@happy:/tmp$ tar xf ruby-2.5.1.tar.gz 
jack@happy:/tmp$ cd ruby-2.5.1/
jack@happy:/tmp/ruby-2.5.1$ rg foobarbazquz
thread '<unnamed>' panicked at 'index out of bounds: the len is 1 but the index is 1', /home/jack/.cargo/registry/src/github.com-1ecc6299db9ec823/encoding_rs-0.8.6/src/handles.rs:309:21
note: Run with `RUST_BACKTRACE=1` for a backtrace.
^C

The process then hangs.

If this is a bug, what is the actual behavior?

Command: rg --debug foobarbazquz

Output: https://gist.github.com/jackc/06e3cd8ce8ae238e6762249564cc1a76

If this is a bug, what is the expected behavior?

ripgrep should not panic.

@BurntSushi
Owner

Interesting! It looks like the panic is coming from inside encoding_rs, but it could still be ripgrep's fault. I'll need to dig into this and come up with a smaller reproduction. Thanks for reporting this!

BurntSushi added the bug label Sep 13, 2018
BurntSushi added a commit to BurntSushi/encoding_rs_io that referenced this issue Sep 25, 2018
This works around what *appears* to be a bug in encoding_rs where the
UTF-16 decoder will panic seemingly because the output buffer is too
small, even though we use a buffer of size 4 which should be sufficient
for transcoding to UTF-8.

A bug was filed upstream:
hsivonen/encoding_rs#34

This bug was originally found in ripgrep:
BurntSushi/ripgrep#1052
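
The "buffer of size 4 which should be sufficient" reasoning in the commit message above rests on the fact that a single Unicode scalar value never needs more than 4 bytes in UTF-8. The following is a minimal standard-library sketch of that fact only. The helper names `utf8_len` and `max_utf8_len` are hypothetical, and nothing here reflects the actual output-buffer requirements of encoding_rs or the code in encoding_rs_io.

```rust
/// Number of UTF-8 bytes needed to encode one Unicode scalar value.
/// (Hypothetical helper for illustration only.)
fn utf8_len(c: char) -> usize {
    // A 4-byte scratch buffer; `encode_utf8` panics if the buffer is too small.
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).len()
}

/// Largest UTF-8 encoding length over every Unicode scalar value.
fn max_utf8_len() -> usize {
    (0..=0x10FFFFu32)
        .filter_map(char::from_u32) // skips the surrogate range
        .map(utf8_len)
        .max()
        .unwrap_or(0)
}

fn main() {
    // One sample character for each possible encoded length.
    for c in "Aé€𝄞".chars() {
        println!("U+{:04X} needs {} byte(s)", c as u32, utf8_len(c));
    }
    println!("maximum: {} bytes", max_utf8_len());
}
```

This shows only that 4 bytes can hold any one encoded character. Whether a decoder accepts an output buffer that small is a separate question, and that is the one the upstream bug report asks.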
BurntSushi added a commit that referenced this issue Sep 25, 2018
This update includes a work-around for a presumed bug in encoding_rs
that causes a panic:
hsivonen/encoding_rs#34

Specifically, to reproduce this in ripgrep, one can run the following:

    $ curl -LO https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
    $ tar xf ruby-2.5.1.tar.gz
    $ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg
    thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1'

Fixes #1052

BurntSushi commented Sep 25, 2018

A smaller reproduction of this bug:

$ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg

It turns out that this svg file starts with a UTF-16LE BOM, but does not actually appear to be UTF-16. In any case, this trips over a corner case in the streaming transcoder that in turn causes a panic. The fault is either in my implementation of the streaming transcoder (by not upholding some precondition of the transcoder) or in the implementation of the encoding handling itself. Either way, I filed a bug to get to the bottom of it: hsivonen/encoding_rs#34

For now, I patched this via a workaround in the streaming transcoder, although it's not clear that the root cause has been fixed. PR #1065 brings the workaround into ripgrep.
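
To illustrate the kind of input described above, here is a rough standard-library sketch of BOM sniffing followed by a lossy UTF-16LE decode. It is not the code used by ripgrep, encoding_rs_io, or encoding_rs. The names `Bom`, `sniff_bom` and `decode_utf16le_lossy` are hypothetical, and the sample bytes are a made-up stand-in for the real SVG file, assuming only that it begins with `FF FE` followed by non-UTF-16 data.

```rust
/// Encodings that can be identified from a leading byte order mark.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq, Eq)]
enum Bom {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Returns the BOM found at the start of `bytes`, if any, and its length.
fn sniff_bom(bytes: &[u8]) -> Option<(Bom, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some((Bom::Utf8, 3))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some((Bom::Utf16Le, 2))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some((Bom::Utf16Be, 2))
    } else {
        None
    }
}

/// Decodes `bytes` as UTF-16LE, replacing invalid sequences with U+FFFD.
/// A trailing odd byte is also replaced with U+FFFD.
fn decode_utf16le_lossy(bytes: &[u8]) -> String {
    let units = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    let mut out: String = char::decode_utf16(units)
        .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect();
    if bytes.len() % 2 == 1 {
        out.push(char::REPLACEMENT_CHARACTER);
    }
    out
}

fn main() {
    // Made-up sample: a UTF-16LE BOM followed by plain ASCII text.
    let data = b"\xFF\xFE<svg>";
    let (bom, bom_len) = sniff_bom(data).expect("sample starts with a BOM");
    println!("detected: {:?}", bom);

    // Trusting the BOM turns the ASCII payload into unrelated characters.
    let text = decode_utf16le_lossy(&data[bom_len..]);
    println!("decoded {} chars: {:?}", text.chars().count(), text);
}
```

A decoder that trusts the BOM pairs up bytes that were never meant to be code units and may be left with a dangling odd byte at the end, so this kind of input tends to exercise edge cases in a streaming transcoder.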

Thanks so much for reporting this!
