Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement last match capture group registers #8088

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

doy
Copy link
Contributor

@doy doy commented Aug 28, 2023

fixes #2167

Comment on lines +389 to +465
#[derive(Debug, Clone)]
pub struct Captures {
groups: Vec<Option<String>>,
}

impl Captures {
pub fn from_regex(text: RopeSlice<'_>, cap: &regex::Captures<'_>, start_byte: usize) -> Self {
Self {
groups: cap
.iter()
.map(|mat| {
mat.map(|mat| {
let start = text.byte_to_char(start_byte + mat.start());
let end = text.byte_to_char(start_byte + mat.end());
Range::new(start, end).fragment(text).to_string()
})
})
.collect(),
}
}
}

#[derive(Debug, Clone)]
pub struct SelectionRange {
inner: Range,
captures: Option<Captures>,
}

impl SelectionRange {
pub fn range(&self) -> Range {
self.inner
}

pub fn with_range(mut self, range: Range) -> Self {
self.inner = range;
self
}

pub fn capture(&self, group: usize) -> Option<String> {
self.captures
.as_ref()
.and_then(|captures| captures.groups.get(group).cloned())
.flatten()
}

pub fn with_captures(mut self, captures: Captures) -> Self {
self.captures = Some(captures);
self
}
}

impl Eq for SelectionRange {}

impl PartialEq for SelectionRange {
fn eq(&self, other: &Self) -> bool {
self.inner == other.inner
}
}

impl From<Range> for SelectionRange {
fn from(full: Range) -> Self {
Self {
inner: full,
captures: None,
}
}
}

impl From<&Range> for SelectionRange {
fn from(full: &Range) -> Self {
Self {
inner: *full,
captures: None,
}
}
}

Copy link
Member

@pascalkuthe pascalkuthe Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that captuers should be added to helix core/the selection. Instead I think we should simply write the content of each capture group (as a string) into a register greedily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i tried this first, and it didn't work well at all. the problem is that the thing you're typically trying to do is something similar to vim's s/foo/bar/gc where you want to make a substitution based on the match, but not necessarily all of the substitutions. this maps very clearly onto multiple cursors, but if you just greedily write the capture group contents as the match happens, things like removing individual selections breaks everything (the captures get shifted around, etc).

for example, given this file:

foo
bar
baz

if you use %s with a regex of ..(.), you'll get three selections, one for each line. "& will hold foo, bar, and baz as separate selections, and "1 will hold o, r, and z as separate selections. if you then toggle to the second selection (on bar) and remove it, using something like "&p will paste foo after foo, but will paste bar after baz, which isn't what i would want at all.

i think for this to work as a coherent feature at all, the capture groups need to stick to the selections they were captured from in some way.

additionally, just writing the contents to registers and leaving it at that makes the behavior of multiple views of the same file open at once fairly nonsensical for similar reasons, although i don't think that would come up in practice as often.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also matches kakoune's behavior too, for what it's worth - removing one of a group of selections doesn't change what the "1 register will contain for the remaining selections

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I am gonna need to look at how kakoune handles this. I do kind of see what you mean but I stil dont think that something like this belongs into helix core. Did you refer to kakounse source code? Do they really store that in their core selection type?

Also it's not entirely clear to me how capture groups are actually associated to ranges? So if you do 3Cxs<regex><ret>gg3Cp what would happen? You do a jump so the selection is basically replaced. Wouldn't that mean the regex capture is "gone". So " 1p does nothing? Or does return the value of the primary selection?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't look at the kakoune source when writing this, but looking at it now it does seem fairly similar https://github.com/mawww/kakoune/blob/master/src/selection.hh#L64

and yeah, in your example, the gg collapses it down to a single selection, so the captures from the other selections are gone, even if you increase the number of selections later. it does keep the values from one of the selections (rather than clearing them entirely) because i'm not sure how to differentiate between this case and just like pressing n to clear a selection (which would ideally keep the captures from the remaining selections the same). i'm open to suggestions on how to better handle the selection merging - i tried to pick options that made sense, but it's not always obvious what the right answer should be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to diverge from kakoune here. Even kakounes own creator has called the current implementation not ideal. It adds a lot of overhead and means that these registers would persist across undo checkpoints and jumplist.... Really not ideal.

I think the capture groups should just be normal registers. We can solve these usecases you described bymodify these register in a few select commands (ideally we would not actually change the register content but would just have a "map" of indecies that point at the original string. I think realistically we would only really need to update that register in the regex selection commands K s (and their various variants like c-s), C/a-c and maybe ( and ). The extra code would still be annoying but I its still better than modifying the core selection type. We likely only want to do this "update of ranges" as long as the current number of selection matches the regist length. That way we avoid permanently updating those registers (beffer performance) and also allow the uscase I described above (gg5C"1p)

@pascalkuthe pascalkuthe added C-enhancement Category: Improvements S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 10, 2023
the goal is to propagate this metadata when the selections remain
basically "the same" for some fuzzy definition of same, including when
individual ranges in the selection are added/deleted/etc.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: Improvements S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integer registers for regex matches
2 participants