`take_while_m_n` is invalid for multi-byte UTF-8 characters #1630
ackxolotl added a commit to ackxolotl/nom that referenced this issue on Feb 22, 2023
ackxolotl added a commit to ackxolotl/nom that referenced this issue on Mar 6, 2023
ackxolotl added a commit to ackxolotl/nom that referenced this issue on Mar 7, 2023
Geal added a commit that referenced this issue on Mar 15, 2023:
fixes #1630
* test slice_index for strings with multibyte chars
* fix take_while_m_n for multibyte UTF-8 chars
* reintroduce Input::iter_indices
Co-authored-by: Simon Ellmann <[email protected]>
`take_while_m_n` uses `Input::position` to find the first non-matching character. So it is looking up the byte position of the first non-matching character but then treats it as the `Item` count, comparing it with `m` and `n` and then using `slice_index` to convert it to a byte position. This should cause it to hit `m` and `n` sooner than it should and split at the wrong position.

I see that #913 was previously reported, but 7483885 and #1097 just added the "`Item` count to byte position" conversion and didn't address that it was using a byte position.

The reason the test from #913 is working is the else-clause: when there are more valid elements than `n` (4 is greater than 1), it caps the value at `n` (1), which makes the `Item` count it passes to `slice_index` valid.

To show this failing, we need to change `m`, `n`, and the input slightly. This fails, with the left-hand side reporting `Ok(("", "😃!"))`. The only reason it doesn't crash is that `slice_index`, when exhausted, returns `self.len()`.

I believe this problem predates #1612.