add flag to enable "hybrid" regex mode #1155

Closed
BurntSushi opened this issue Jan 8, 2019 · 0 comments
Labels
enhancement An enhancement to the functionality of the software.

BurntSushi commented Jan 8, 2019

In some cases, it would be nice for ripgrep to use Rust's regex engine by default whenever possible, and then fall back to PCRE2 when "advanced" regex features are used. In terms of user level documentation, I might suggest this specification:

--auto-hybrid-regex
    When this flag is used, ripgrep will dynamically choose between supported
    regex engines depending on the features used in a pattern. When ripgrep
    chooses a regex engine, it applies that choice for every regex provided to
    ripgrep (e.g., via multiple `-e/--regexp` or `-f/--file` flags).

    As an example of how this flag might behave, ripgrep will attempt to use its
    default finite automata based regex engine whenever the pattern can be
    successfully compiled with that regex engine. If PCRE2 is enabled and if the
    pattern given could not be compiled with the default regex engine, then PCRE2
    will be automatically used for searching. If PCRE2 isn't available, then this flag
    has no effect because there is only one regex engine to choose from.

    In the future, ripgrep may adjust its heuristics for how it decides which regex
    engine to use. In general, the heuristics will be limited to a static analysis of
    the patterns, and not to any specific runtime behavior observed while searching
    files.

    The primary downside of using this flag is that it may not always be obvious
    which regex engine ripgrep uses, and thus, the match semantics or performance
    profile of ripgrep may subtly and unexpectedly change. However, in many cases,
    all regex engines will agree on what constitutes a match and it can be nice to
    transparently support more advanced regex features like look-around and
    backreferences without explicitly needing to enable them.

    This flag can be disabled with `--no-auto-hybrid-regex`.

I had initially thought about adding this when I added PCRE2, since it's a somewhat logical addition. However, I held off because I wanted a real use case first. One such use case is here: microsoft/vscode#64606

In terms of implementation, we should add debug logs indicating which regex engine is being used and any compilation errors that are otherwise suppressed in normal output.
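
To make the proposed fallback concrete, here is a rough sketch of the selection logic described above. It is not ripgrep's code: `Engine`, `compile_default`, `compile_pcre2`, and `choose_engine` are hypothetical names, and the two `compile_*` functions are toy stand-ins that use substring checks rather than real regex compilation. The sketch only illustrates three points: the choice applies to every pattern at once, PCRE2 is tried only when the default engine rejects a pattern, and both errors are kept when both engines fail.

```rust
/// Engine selected for the whole search (illustrative, not ripgrep's types).
#[derive(Debug, PartialEq)]
enum Engine {
    Default,
    Pcre2,
}

/// Toy stand-in for the default engine's compiler (assumption: a real engine
/// parses the pattern; this only does substring checks). It rejects
/// look-around and backreferences.
fn compile_default(pattern: &str) -> Result<(), String> {
    let lookaround = ["(?=", "(?!", "(?<=", "(?<!"];
    if lookaround.iter().any(|tok| pattern.contains(tok)) {
        return Err(format!("default engine: look-around in {:?}", pattern));
    }
    // A backslash followed by a digit 1-9 is treated as a backreference.
    let has_backref = pattern
        .as_bytes()
        .windows(2)
        .any(|w| w[0] == b'\\' && (b'1'..=b'9').contains(&w[1]));
    if has_backref {
        return Err(format!("default engine: backreference in {:?}", pattern));
    }
    Ok(())
}

/// Toy stand-in for PCRE2 compilation (assumption): it only rejects
/// patterns whose parentheses are unbalanced.
fn compile_pcre2(pattern: &str) -> Result<(), String> {
    let open = pattern.matches('(').count();
    let close = pattern.matches(')').count();
    if open != close {
        return Err(format!("PCRE2: unbalanced parentheses in {:?}", pattern));
    }
    Ok(())
}

/// Pick one engine for all patterns. On failure, return every error that
/// was collected so none of them is silently dropped.
fn choose_engine(patterns: &[&str], pcre2_available: bool) -> Result<Engine, Vec<String>> {
    // Prefer the default engine if it can compile every pattern.
    let default_err = match patterns.iter().try_for_each(|p| compile_default(p)) {
        Ok(()) => return Ok(Engine::Default),
        Err(e) => e,
    };
    // With only one engine available, there is nothing to fall back to.
    if !pcre2_available {
        return Err(vec![default_err]);
    }
    // Otherwise fall back to PCRE2 for all patterns.
    match patterns.iter().try_for_each(|p| compile_pcre2(p)) {
        Ok(()) => Ok(Engine::Pcre2),
        Err(pcre2_err) => Err(vec![default_err, pcre2_err]),
    }
}
```

With these stand-ins, `choose_engine(&["foo", "(a)\\1"], true)` selects `Engine::Pcre2` for both patterns, even though `foo` alone would compile with the default engine. The debug logging mentioned above would naturally sit at the two points where an engine is selected and where `default_err` is captured.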

cc @roblourens

@BurntSushi BurntSushi added the enhancement An enhancement to the functionality of the software. label Jan 24, 2019
BurntSushi added a commit that referenced this issue Apr 14, 2019
This flag, when set, will automatically dispatch to PCRE2 if the given
regex cannot be compiled by Rust's regex engine. If both engines fail to
compile the regex, then both errors are surfaced.

Closes #1155