--encoding auto #1103

Closed
roblourens opened this issue Nov 7, 2018 · 2 comments
Labels
doc An issue with or an improvement to documentation.

Comments

@roblourens
Contributor

It's not really clear how the automatic encoding detection is supposed to work - which encodings should it be able to detect? Do you have any test cases that I can look at, or can you point to where the code is? It doesn't appear that encoding_rs is responsible for this, as best I can tell?

If I have a better idea of how it should work, I can file a better issue (or not file one).

@BurntSushi added the doc label (An issue with or an improvement to documentation.) on Nov 7, 2018
@BurntSushi
Owner

Yeah, the man page is pretty light on details here, but I think the guide explains it a bit better: https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md#file-encoding That said, the guide never mentions the auto value; it just talks about what ripgrep does by "default."

In summary, by default, ripgrep looks for a BOM. If it sees a UTF-16 BOM, then it does UTF-16 to UTF-8 transcoding automatically and searches the UTF-8. I don't think anything else is done.
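
To make that summary concrete, here is a rough Python sketch of the BOM check as described above. It is an illustration only, not ripgrep's actual code (which is written in Rust); the function names `sniff_utf16_bom` and `to_searchable_utf8` are made up for this example.

```python
import codecs


def sniff_utf16_bom(data: bytes):
    """Return the UTF-16 variant named by a leading BOM, or None if there is none."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None


def to_searchable_utf8(data: bytes) -> bytes:
    """Transcode BOM-marked UTF-16 to UTF-8; pass everything else through untouched."""
    encoding = sniff_utf16_bom(data)
    if encoding is None:
        # No UTF-16 BOM: no guessing is attempted, the bytes are searched as-is.
        return data
    # Drop the 2-byte BOM, decode the rest, and re-encode as UTF-8.
    return data[2:].decode(encoding, errors="replace").encode("utf-8")
```

The point of the sketch is how little "detection" there is: without a BOM, nothing about the content is inspected.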

Folks have filed issues in the past about being more aggressive in encoding detection, but I'd rather not dive into those waters if possible. It is possible for one to use --pre and probably --pre-glob to implement one's own detection & transcoding though.
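
As a rough idea of what such a `--pre` command could look like: ripgrep runs the command with the file path as its argument and searches whatever the command writes to stdout. The sketch below is a hypothetical preprocessor, not something shipped with ripgrep. The script name `to-utf8.py`, the `CANDIDATES` list, and the try-each-encoding heuristic are all assumptions chosen for illustration; real detection would likely need something smarter.

```python
#!/usr/bin/env python3
import sys

# Hypothetical candidate encodings, tried in order. Adjust for your own files.
CANDIDATES = ("utf-8", "cp1252")


def transcode(data: bytes, candidates=CANDIDATES) -> bytes:
    """Return data re-encoded as UTF-8 using the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return data.decode(encoding).encode("utf-8")
        except UnicodeDecodeError:
            continue
    # Nothing matched: hand the raw bytes back unchanged.
    return data


def main(argv):
    # The file to preprocess is passed as the first argument.
    with open(argv[1], "rb") as handle:
        sys.stdout.buffer.write(transcode(handle.read()))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv)
```

Assuming the script is saved as `to-utf8.py` and made executable, it could be wired up with something like `rg --pre ./to-utf8.py --pre-glob '*.txt' pattern`, so that only files matching the glob pay the cost of spawning the preprocessor.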

@roblourens
Contributor Author

Got it, yeah I think the man page implies that it might do more than it currently does.


2 participants