Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise an error on regex with backreference #268

Closed
anntzer opened this issue Dec 5, 2016 · 5 comments
Closed

Raise an error on regex with backreference #268

anntzer opened this issue Dec 5, 2016 · 5 comments
Labels
question An issue that is lacking clarity on one or more points.

Comments

@anntzer
Copy link
Contributor

anntzer commented Dec 5, 2016

I understand why rg's design fundamentally prevents support for backreferences, but it'd be nice if rg at least printed an error (warning?) message when a regex with a backreference is passed.

@BurntSushi
Copy link
Owner

Doesn't it print an error today?

@anntzer
Copy link
Contributor Author

anntzer commented Dec 5, 2016

rg '(foo) \1' prints no error as of 0.3.1 (Arch Linux AUR).

@BurntSushi
Copy link
Owner

Ah right. Unfortunately, \1 is actually an octal escape sequence, so I don't think that can change at the regex level. We could adopt a heuristic that emits a warning message from within ripgrep, but if someone intended to type an octal escape (which seems quite rare), then the warning would be wrong.

@BurntSushi BurntSushi added the question An issue that is lacking clarity on one or more points. label Dec 5, 2016
@anntzer
Copy link
Contributor Author

anntzer commented Dec 5, 2016

My guess is that more people are going to get burnt from trying a backreference, but I don't feel strongly about it either.

@BurntSushi
Copy link
Owner

In light of microsoft/vscode#27802, I think I should re-open this. I've been thinking a lot about this as I rewrite the regex parser, and I regret that the regex engine supports octal escapes at all. It seems like such an incredibly niche feature that it would be much better to just not support them at all in favor of emitting a better failure mode when someone tries to use backreferences.

With that said, ripgrep could certainly decide not to support octal escapes, even if the underlying regex engine does. The current parser doesn't support this kind of introspection, but my rewrite will. Once that's ready for primetime, I'll disable octal escapes and add a better error message inside ripgrep only.

@BurntSushi BurntSushi reopened this Jun 1, 2017
BurntSushi added a commit that referenced this issue Mar 14, 2018
This update brings with it many bug fixes:

  * Better error messages are printed overall. We also include
    explicit call out for unsupported features like backreferences
    and look-around.
  * Regexes like `\s*{` no longer emit incomprehensible errors.
  * Unicode escape sequences, such as `\u{..}` are now supported.

For the most part, this upgrade was done in a straight-forward way. We
resist the urge to refactor the `grep` crate, in anticipation of it
being rewritten anyway.

Note that we removed the `--fixed-strings` suggestion whenever a regex
syntax error occurs. In practice, I've found that it results in a lot of
false positives, and I believe that its use is not as paramount now that
regex parse errors are much more readable.

Closes #268, Closes #395, Closes #702, Closes #853
BurntSushi added a commit that referenced this issue Mar 14, 2018
This update brings with it many bug fixes:

  * Better error messages are printed overall. We also include
    explicit call out for unsupported features like backreferences
    and look-around.
  * Regexes like `\s*{` no longer emit incomprehensible errors.
  * Unicode escape sequences, such as `\u{..}` are now supported.

For the most part, this upgrade was done in a straight-forward way. We
resist the urge to refactor the `grep` crate, in anticipation of it
being rewritten anyway.

Note that we removed the `--fixed-strings` suggestion whenever a regex
syntax error occurs. In practice, I've found that it results in a lot of
false positives, and I believe that its use is not as paramount now that
regex parse errors are much more readable.

Closes #268, Closes #395, Closes #702, Closes #853
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question An issue that is lacking clarity on one or more points.
Projects
None yet
Development

No branches or pull requests

2 participants