Honor whitespace escape sequences in yank-join #10993

Open
bitcrshr opened this issue Jun 19, 2024 · 4 comments · May be fixed by #11012
Labels
C-enhancement Category: Improvements

Comments

@bitcrshr

Say you have the following file:
image

And you select just the initial asdf bits:
image

Then, you enter :yank-join ,\n. I think the expected behavior would be this:
image

But instead, you get this:
image

This isn't a very big deal, as it's pretty easy to just go ahead with the default newline separator and add things as you wish, but it could be pretty convenient to just do it all at once.

I dug into the code a bit, and I think that the ShellWords::From<&str> implementation is to blame here. I suspect there's quite good reason for it, but I'm wondering if there might be a reasonable workaround.

I would be happy to give a shot at a PR for this, but I'm not quite sure what the implications might be for not escaping the newlines (or whatever the workaround might be) so some guidance may be needed.

Thanks a bunch :)

@bitcrshr bitcrshr added the C-enhancement Category: Improvements label Jun 19, 2024
@RoloEdits
Contributor

RoloEdits commented Jun 20, 2024

What operating system are you on?

When trying to reproduce the behavior on Windows, I get asdf,\nasdf (with a literal backslash and n), not asdf,nasdf.
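
To make the three outcomes in this thread concrete, here is a minimal sketch of the register contents for two `asdf` selections. `join_fragments` and `outcomes` are hypothetical names used only for illustration, not Helix functions, and the third case assumes the original screenshot showed `asdf,nasdf`, as the comparison above implies.

```rust
/// Stand-in for the join step of `:yank-join`: fragments glued together
/// with the separator exactly as it is passed in. (Hypothetical helper.)
fn join_fragments(fragments: &[&str], separator: &str) -> String {
    fragments.join(separator)
}

/// Register contents for two `asdf` selections under each reading of `,\n`.
/// Returns (expected, seen on Windows, seen in the original report).
fn outcomes() -> (String, String, String) {
    let fragments = ["asdf", "asdf"];
    (
        // `\n` unescaped to a real line feed: two lines in the register.
        join_fragments(&fragments, ",\n"),
        // Backslash and `n` kept as two literal characters: one line.
        join_fragments(&fragments, ",\\n"),
        // Backslash dropped, only `n` survives: one line.
        join_fragments(&fragments, ",n"),
    )
}
```

Only the first variant yields two lines; the other two stay on a single line and differ by one backslash character.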

I also grepped for ShellWords and nothing came up. What module would this be in? Ah, found it. Shellwords, not ShellWords.

@RoloEdits
Contributor

Made a pretty naive implementation to unescape the given sequences. Seems to work fine? I added a few other common patterns that might crop up, for light testing. Not sure if something like this already exists as a helper function somewhere, so I just hacked this one together.

fn yank_joined_impl(editor: &mut Editor, separator: &str, register: char) {
    let (view, doc) = current!(editor);
    let text = doc.text().slice(..);

    let selection = doc.selection(view.id);
    let selections = selection.len();
    let joined = selection
        .fragments(text)
        .fold(String::new(), |mut acc, fragment| {
            if !acc.is_empty() {
-                acc.push_str(separator);
+                acc.push_str(&escape(separator));
            }
            acc.push_str(&fragment);
            acc
        });

    match editor.registers.write(register, vec![joined]) {
        Ok(_) => editor.set_status(format!(
            "joined and yanked {selections} selection{} to register {register}",
            if selections == 1 { "" } else { "s" }
        )),
        Err(err) => editor.set_error(err.to_string()),
    }
}
fn escape(separator: &str) -> Cow<'_, str> {
    enum State {
        Normal,
        Escape,
    }

    let mut escaped = String::new();
    let mut state = State::Normal;
    let mut is_escaped = false;

    for (idx, ch) in separator.char_indices() {
        match state {
            State::Normal => match ch {
                '\\' => {
                    if !is_escaped {
                        // PERF: As not every separator will be escaped, we use `String::new` as that has no initial
                        // allocation. If an escape is found, then we reserve capacity that's the len of the separator,
                        // as a rough estimate of the size of the new escaped string.
                        escaped.reserve(separator.len());
                        if idx > 0 {
                            // First time finding an escape, so all prior chars can be added to the new escaped version
                            // if it's not the very first char found.
                            escaped.push_str(&separator[0..idx]);
                        }
                    }
                    state = State::Escape;
                    is_escaped = true;
                }
                _ => {
                    if is_escaped {
                        escaped.push(ch);
                    }
                }
            },
            State::Escape => match ch {
                'n' => {
                    escaped.push('\n');
                    state = State::Normal;
                }
                't' => {
                    escaped.push('\t');
                    state = State::Normal;
                }
                'r' => {
                    escaped.push('\r');
                    state = State::Normal;
                }
                '\\' => {
                    escaped.push('\\');
                    state = State::Normal;
                }
                _ => {
                    escaped.push('\\');
                    escaped.push(ch);
                    state = State::Normal;
                }
            },
        }
    }

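    // A trailing lone backslash leaves the state in `Escape`; emit it literally instead of dropping it.
    if let State::Escape = state {
        escaped.push('\\');
    }

    if is_escaped {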
        escaped.into()
    } else {
        separator.into()
    }
}

@bitcrshr
Author

Ah, interesting. I work with a remote setup: Helix runs in tmux on Ubuntu, and I SSH in from macOS with Alacritty. Thanks for the code, I'll give it a shot and see what I can come up with!

@RoloEdits
Contributor

This kind of escaping can extend further than just newlines. Currently you cannot paste Unicode, like 🤩, into the yank-join command buffer. But with a way to unescape the literal \u{1f929}, you could join with this emoji even if you can't paste it in. Tabs would be another one, since you can't type a tab in the command buffer. You could even add spaces with \u{20}, something that currently has no effect because it gets ignored.
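
As a rough illustration of that idea, the sketch below unescapes `\n`, `\t`, `\r`, `\\` and `\u{HEX}` in a separator. The name `unescape` and the choice to keep unknown or malformed sequences verbatim are assumptions made for this example; it is not the code from the linked pull request.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces `\n`, `\t`, `\r`, `\\` and `\u{HEX}` with
/// the characters they name. Unknown or malformed sequences are kept as-is.
fn unescape(input: &str) -> Cow<'_, str> {
    // Fast path: nothing to unescape, so borrow the input without allocating.
    if !input.contains('\\') {
        return Cow::Borrowed(input);
    }

    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(ch) = chars.next() {
        if ch != '\\' {
            out.push(ch);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('r') => out.push('\r'),
            Some('\\') => out.push('\\'),
            Some('u') => {
                // Try to read `{HEX}` right after `\u`.
                let parsed = chars
                    .as_str()
                    .strip_prefix('{')
                    .and_then(|rest| rest.split_once('}'))
                    .and_then(|(hex, tail)| {
                        let code = u32::from_str_radix(hex, 16).ok()?;
                        Some((char::from_u32(code)?, tail))
                    });
                match parsed {
                    Some((c, tail)) => {
                        out.push(c);
                        // Resume scanning after the closing brace.
                        chars = tail.chars();
                    }
                    // Not a valid code point escape: keep `\u` literally.
                    None => out.push_str("\\u"),
                }
            }
            // Unknown escape: keep both characters.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // Trailing lone backslash: keep it.
            None => out.push('\\'),
        }
    }
    Cow::Owned(out)
}
```

With this sketch, `unescape(r"\u{20}")` gives a single space and `unescape(r"\u{1f929}")` gives the emoji, while a separator without any backslash is returned borrowed.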

If you don't mind, I'll try opening a PR that provides a way to unescape these things in general, and then use it to unescape the separator.

@RoloEdits RoloEdits linked a pull request Jun 22, 2024 that will close this issue