feat(commands): unescape yank-join separator #11012
dd0d9b8 to b5ca706
In testing the user input I noticed that |
373a091 to d2de7dc
This commit introduces a `str` module and an `unescape` function to `helix-stdx`, which processes escape sequences in strings and converts them into their corresponding literal characters. The function handles a variety of escape sequences, including:

- `\n` for newlines
- `\t` for tabs
- `\u{...}` for Unicode characters

The function does not unescape sequences like `\\` to `\`, leaving them as they are. This opinionated behavior ensures that only certain escape sequences are processed, and it is designed around user input rather than general input.

Because it is designed around user input, bad input is handled conservatively: if the string cannot be processed as expected, the function returns the original input.

Examples:

- Converting escaped newlines: `unescape("hello\\nworld")` results in `"hello\nworld"`.
- Converting escaped tabs: `unescape("hello\\tworld")` results in `"hello\tworld"`.
- Converting Unicode escape sequences: `unescape("hello\\u{1f929}world")` results in `"hello🤩world"`.
- Handling an invalid Unicode escape sequence: `unescape("hello\\u{999999999}world")` results in the original `"hello\\u{999999999}world"`.

The implementation also includes tests, but makes no guarantees for edge cases.
This commit enhances the `yank-join` command by incorporating the `unescape` function to process the separator provided by the user. This improvement ensures that any escape sequences in the separator are correctly interpreted, aligning with user expectations.

Previously, the `yank-join` command joined the current selection with the separator as-is. With this update, escape sequences in the separator such as:

- `\\n` for newlines
- `\\t` for tabs
- `\\u{...}` for Unicode characters

are unescaped to their corresponding literal characters before joining the selection.
d2de7dc to b58fe56
@pascalkuthe and I discussed this a bit and we're thinking that shellwords needs larger changes. Kakoune has different strategies for how it parses - In any case I'm wary of adding a |
I would agree with the assessment. I'd be willing to devote some time to these changes. After a quick look at current usage, we might be able to get away with storing the raw I've not had the pleasure of using Kakoune, so I'm not familiar with its specifics, but from what I gather, in helix, we would only need two states, not three? If you have any guidance as to what needs to be covered I can see what I can put together. |
Introduces a `str` module and an `unescape` function to `helix-stdx`, which processes escape sequences in strings and converts them into their corresponding literal characters. The function handles a variety of escape sequences, including:

- `\n` for newlines
- `\t` for tabs
- `\u{...}` for Unicode characters

The function does not unescape sequences like `\\` to `\`, leaving them as they are. This opinionated behavior ensures that only certain escape sequences are processed, and it is designed around user input rather than general input.

Because it is designed around user input, bad input is handled conservatively: if the string cannot be processed as expected, the function returns the original input.

Examples:

- `unescape("hello\\nworld")` results in `"hello\nworld"`.
- `unescape("hello\\tworld")` results in `"hello\tworld"`.
- `unescape("hello\\u{1f929}world")` results in `"hello🤩world"`.
- `unescape("hello\\u{999999999}world")` results in the original `"hello\\u{999999999}world"`.

The implementation also includes tests, but no guarantees for edge cases.
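
The behavior described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the name `unescape_sketch`, the `Cow` return type, and the choice to handle only `\n`, `\t`, and `\u{...}` are assumptions made for the example. Any other backslash sequence, including `\\`, is copied through unchanged, and a malformed or out-of-range `\u{...}` makes the function return the original input.

```rust
use std::borrow::Cow;

/// Hypothetical sketch of the described behavior (not the PR's implementation).
/// Returns the input unchanged when a `\u{...}` sequence cannot be decoded.
fn unescape_sketch(input: &str) -> Cow<'_, str> {
    // Nothing to do when the input contains no backslash at all.
    if !input.contains('\\') {
        return Cow::Borrowed(input);
    }

    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();

    while let Some(ch) = chars.next() {
        if ch != '\\' {
            out.push(ch);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('u') => {
                // Expect `{`, then hex digits, then `}`.
                if chars.next() != Some('{') {
                    return Cow::Borrowed(input);
                }
                let mut hex = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(c) if c.is_ascii_hexdigit() => hex.push(c),
                        // Unterminated sequence or a non-hex character.
                        _ => return Cow::Borrowed(input),
                    }
                }
                // Reject empty, overflowing, or non-scalar code points.
                match u32::from_str_radix(&hex, 16).ok().and_then(char::from_u32) {
                    Some(decoded) => out.push(decoded),
                    None => return Cow::Borrowed(input),
                }
            }
            // Unknown sequences (including `\\`) are left as they are.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // A trailing lone backslash is kept.
            None => out.push('\\'),
        }
    }

    Cow::Owned(out)
}
```

Under these assumptions, `unescape_sketch("hello\\tworld")` yields a string containing a literal tab, while `unescape_sketch("hello\\u{999999999}world")` hands back the input untouched because the value does not fit a valid Unicode scalar.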
Previously, the `yank-join` command joined the current selection with the separator as-is. With this update, escape sequences in the separator are unescaped to their corresponding literal characters before joining the selection, aligning with user expectations:

Newline:
![separator_unescape_newline](https://private-user-images.githubusercontent.com/12489689/341956257-dcc917ee-6a08-4711-9062-d32c4cc2edd0.gif?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA3Njg1ODMsIm5iZiI6MTcyMDc2ODI4MywicGF0aCI6Ii8xMjQ4OTY4OS8zNDE5NTYyNTctZGNjOTE3ZWUtNmEwOC00NzExLTkwNjItZDMyYzRjYzJlZGQwLmdpZj9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzEyVDA3MTEyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTNhYmMyZGEzMzNlZjJhYTU3OTQxMjUxMTQwMGZiNjRjODQ1NTIxOTU0NTA2MmE1YjcxMDAyOTk3MzM2NGE2YWYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.DD_zOAwVuOOzZdNVV9EFUHiPMI2dhP6rVUnBfK3GY0U)
Tab:
![separator_unescape_tab](https://private-user-images.githubusercontent.com/12489689/341956261-94f9a594-d2aa-4171-b0dd-e41f46a931a6.gif?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA3Njg1ODMsIm5iZiI6MTcyMDc2ODI4MywicGF0aCI6Ii8xMjQ4OTY4OS8zNDE5NTYyNjEtOTRmOWE1OTQtZDJhYS00MTcxLWIwZGQtZTQxZjQ2YTkzMWE2LmdpZj9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzEyVDA3MTEyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTc3NzI3MmY1Zjc1NzY2ZDZiMTY0YTNhMWVhYWQ0NWI3Njg2Y2U4MTRhY2I2MWQzZmE2ODEzZWQwMTY3MjZmZjUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.2ScGOgEvpVlQfgygQ9iA7VylD9G7RDFaR9q9yb3s1s0)
Unicode:
![separator_unescape_unicode](https://private-user-images.githubusercontent.com/12489689/341956269-cdf251f7-6a56-4efa-a42b-5de3f8efc9a8.gif?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA3Njg1ODMsIm5iZiI6MTcyMDc2ODI4MywicGF0aCI6Ii8xMjQ4OTY4OS8zNDE5NTYyNjktY2RmMjUxZjctNmE1Ni00ZWZhLWE0MmItNWRlM2Y4ZWZjOWE4LmdpZj9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzEyVDA3MTEyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTRhODA3YzA5OTZiMTM1NDZjNDQxYjQ0ODgxODIyOTdjMGU3MTg2ZWU0Nzc5NTJlMGY1NGQwY2EzODA1MWM3ZjAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Yw3qbBAglHwDKfpQ9rjzT61rm2Cz6aknpc5FXAyQYe0)
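
The join step shown in the recordings can be approximated in a few lines. This is only a sketch under stated assumptions: `yank_join_sketch` and `unescape_minimal` are hypothetical names, the selections are modeled as plain string slices rather than Helix's selection types, and the stand-in unescaper handles just `\n` and `\t` to keep the example short.

```rust
/// Simplified stand-in for the PR's `unescape` (assumption: only `\n` and `\t`).
fn unescape_minimal(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(ch) = chars.next() {
        if ch != '\\' {
            out.push(ch);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            // Any other sequence is copied through unchanged.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // A trailing lone backslash is kept.
            None => out.push('\\'),
        }
    }
    out
}

/// Hypothetical helper: unescape the user-supplied separator first,
/// then join the selected fragments with the result.
fn yank_join_sketch(selections: &[&str], raw_separator: &str) -> String {
    let separator = unescape_minimal(raw_separator);
    selections.join(separator.as_str())
}
```

With this sketch, joining `["a", "b", "c"]` with the typed separator `\n` produces three lines, whereas joining without the unescape step would keep the two characters `\` and `n` between the fragments.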
Closes: #10993